
Triangles on M15–H4: rules filter, the LLM decides

Purely rule-based pattern detection drowns in false positives: on M15 the algorithm hallucinates a “triangle” every half hour. Running every candle through a model is the opposite extreme: expensive and slow. The setup that actually works turned out to be a hybrid, and this post breaks it down with numbers.

Architecture

A four-stage pipeline: a candle stream from the Binance WebSocket, a rule-based prefilter, a candidate queue, and a judge model whose verdict goes out to Telegram. The prefilter is deliberately dumb — its job is not to be right, but to be cheap:

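def is_candidate(highs, lows, min_touches=3):
    """Cheap prefilter: True if the bars look like a converging triangle."""
    upper = fit_trendline(highs)   # resistance, fitted through the highs
    lower = fit_trendline(lows)    # support, fitted through the lows
    # Falling resistance above rising support: the two lines converge.
    converging = upper.slope < 0 < lower.slope
    # Touches are counted across both lines combined.
    touches = (count_touches(highs, upper)
               + count_touches(lows, lower))
    return converging and touches >= min_touches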
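
The post does not show `fit_trendline` or `count_touches`, so here is a minimal sketch of what they could look like. Everything in it is an assumption made for illustration: the `Trendline` type, the ordinary least-squares fit over bar indices, and the relative tolerance `tol` that decides what counts as a touch. A real detector would more likely fit through swing pivots than through every bar.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Trendline:
    """Hypothetical line type: price = slope * bar_index + intercept."""
    slope: float
    intercept: float

    def at(self, x: float) -> float:
        return self.slope * x + self.intercept


def fit_trendline(prices: Sequence[float]) -> Trendline:
    """Ordinary least-squares line through prices at bar indices 0..n-1."""
    n = len(prices)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(prices) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (p - mean_y) for i, p in enumerate(prices))
    slope = sxy / sxx
    return Trendline(slope, mean_y - slope * mean_x)


def count_touches(prices: Sequence[float], line: Trendline,
                  tol: float = 0.002) -> int:
    """Count bars whose price is within a relative tolerance of the line.

    tol=0.002 (0.2% of the line value) is an assumed default.
    """
    return sum(
        1 for i, p in enumerate(prices)
        if abs(p - line.at(i)) <= tol * abs(line.at(i))
    )
```

With these two in place the prefilter above runs as written. The tolerance is the knob that matters most: the looser it is, the more candidates reach the queue.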

Everything that clears the threshold lands in a queue along with its context: candles from three timeframes, volumes, distance to the apex.
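
To make that context concrete, here is one way a queued candidate could be shaped. This is a sketch under assumptions, not the actual schema: the `Candle` and `Candidate` names, their fields, and measuring the distance to the apex in bars (rather than in price) are all illustrative choices.

```python
import queue
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candle:
    """Hypothetical OHLCV bar; volume travels with the candle."""
    open_time: int
    open: float
    high: float
    low: float
    close: float
    volume: float


@dataclass
class Candidate:
    """Hypothetical queue item handed to the judge."""
    symbol: str
    timeframe: str                     # timeframe the prefilter fired on
    candles: Dict[str, List[Candle]]   # bars keyed by timeframe
    bars_to_apex: float                # distance to the apex, in bars


def bars_to_apex(upper_slope: float, upper_intercept: float,
                 lower_slope: float, lower_intercept: float,
                 last_index: int) -> float:
    """Bars left until the two trendlines intersect.

    Solves upper_slope*x + upper_intercept == lower_slope*x + lower_intercept
    for x and subtracts the index of the latest bar.
    """
    if upper_slope == lower_slope:
        raise ValueError("parallel lines never meet")
    x_cross = (lower_intercept - upper_intercept) / (upper_slope - lower_slope)
    return x_cross - last_index


# Bounded so a burst of candidates cannot grow memory without limit
# (the size is an arbitrary placeholder).
candidate_queue: "queue.Queue[Candidate]" = queue.Queue(maxsize=1000)
```

For example, resistance `110 - 2x` and support `90 + 2x` cross at bar 5, so with the latest bar at index 3 the apex is 2 bars away.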

Why an LLM at all

Rules say “maybe” cheaply. The model says “yes” expensively.

The judge receives the candidate in full and returns a structured verdict: whether the pattern is valid, a confidence score, and a rejection reason. Things that take a page of conditions to express as rules — “the lines converge, but volume isn’t falling, so this is not a triangle” — the model picks up from context. On my data it cuts the post-prefilter noise roughly in half again.
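
A structured verdict is only useful if it is validated before anything goes out to Telegram. The sketch below assumes the judge replies with a JSON object; the field names `valid`, `confidence` and `rejection_reason`, and the 0 to 1 confidence range, are assumptions for illustration rather than the actual schema.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Verdict:
    """Hypothetical shape of the judge's reply."""
    valid: bool
    confidence: float                  # assumed to be in [0, 1]
    rejection_reason: Optional[str]    # set only when valid is False


def parse_verdict(raw: str) -> Verdict:
    """Parse and validate the judge's JSON reply; raise ValueError if malformed."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("verdict must be a JSON object")
    valid = data.get("valid")
    if not isinstance(valid, bool):
        raise ValueError("'valid' must be a boolean")
    try:
        confidence = float(data.get("confidence"))
    except (TypeError, ValueError):
        raise ValueError("'confidence' must be a number") from None
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("'confidence' must be between 0 and 1")
    reason = data.get("rejection_reason")
    if not valid and not reason:
        raise ValueError("a rejection must carry a reason")
    # Drop any reason attached to an approval so downstream code can rely on it.
    return Verdict(valid, confidence, reason if not valid else None)
```

Treating a malformed reply as an error rather than as a silent "no" keeps model failures visible instead of letting them pass as rejections.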

The economics

One verdict costs about 1.5k input tokens and 200 output tokens. The prefilter passes 5–10 candidates per hour across all pairs, which works out to 120–240 verdicts a day, or roughly 180k–360k input tokens and 24k–48k output tokens, so the judge’s bill comes to cents per day. The key principle: pay for inference only where rules are objectively weaker.
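
The arithmetic is easy to check for any price list. The token counts and candidate rates below come from this post; the per-million-token prices are placeholders I made up for the example, not a quote from any provider, so plug in your own.

```python
INPUT_TOKENS_PER_VERDICT = 1_500   # from the post
OUTPUT_TOKENS_PER_VERDICT = 200    # from the post


def daily_cost_usd(candidates_per_hour: float,
                   usd_per_m_input: float,
                   usd_per_m_output: float) -> float:
    """Daily judge bill, assuming every candidate gets exactly one verdict."""
    verdicts = candidates_per_hour * 24
    input_cost = verdicts * INPUT_TOKENS_PER_VERDICT / 1e6 * usd_per_m_input
    output_cost = verdicts * OUTPUT_TOKENS_PER_VERDICT / 1e6 * usd_per_m_output
    return input_cost + output_cost


# Placeholder prices: $1 per million input tokens, $5 per million output tokens.
for rate in (5, 10):
    print(f"{rate} candidates/hour: ${daily_cost_usd(rate, 1.0, 5.0):.2f} per day")
```

At those placeholder prices the bill lands at $0.30 to $0.60 a day, with output tokens making up a large share of it despite their small count.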

Next up: backtesting verdicts against actual outcomes, measuring not the model’s accuracy but the PnL of the decisions it approved and killed.
