Week 2: First Trades, First Lessons
The bot stopped watching and started acting. Eight paper trades executed, an overtrading problem caught mid-week, and a cluster detection fix shipped before the damage compounded.
Week 2 at a Glance
- Trades Taken: 8
- Net P&L (Paper): +$14.20
- Win Rate: 62%
- Uptime: 100%
Signal Breakdown
- Executed: 8
- Queued / Rejected: 19
- Hold Decisions: 380+
What Changed This Week
Week 2 was a different animal. Signals finally started clearing the confidence thresholds we tightened after Week 1's false starts: slowly, then all at once. By Tuesday morning, mean reversion had its first auto-execute signal since launch: ETH/USDT, ranging market, fear reading at 0.64. The LLM rated it 87% confidence. It executed.
That first trade closed at +1.3%. Small, but the pipeline worked end-to-end: signal, LLM review, critic approval, execution, Telegram alert, dashboard update. No human in the loop. No drama.
Trades That Worked
Mean Reversion — ETH/USDT
Three ETH mean reversion trades executed in ranging conditions. All profitable:
- Long at -0.9% deviation — Closed at +1.3%. Regime: ranging. Confidence: 87%. LLM reasoning: "Extreme fear at 0.64 provides contrarian edge with sufficient liquidity."
- Long at +2.1% from mean — Closed for $3.23 profit. Bollinger Band width confirmed low volatility regime. Mean reversion confluence from 3 of 4 technical indicators.
- RSI bounce off 32 — Partial close taken at RSI 48. $2.10 profit. Position reduced but not fully closed per risk rules.
Polymarket — Correct Side
Two prediction market positions resolved in our favor. The bot identified mispriced probabilities where market odds diverged from on-chain and news data by more than our 8% threshold. Both contracts resolved correctly. Combined: +$8.87 paper profit.
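The divergence test described here can be sketched in a few lines. This is only an illustration: it assumes the 8% threshold means eight absolute percentage points between the bot's own probability estimate and the market-implied one, and the function names are hypothetical.

```python
def edge(model_prob: float, market_prob: float) -> float:
    """Signed gap between our estimate and the market-implied probability."""
    return model_prob - market_prob


def is_mispriced(model_prob: float, market_prob: float,
                 threshold: float = 0.08) -> bool:
    """True when the absolute gap exceeds the threshold.

    Assumption: the threshold is in absolute probability points (0.08 = 8 points),
    not a relative difference.
    """
    return abs(edge(model_prob, market_prob)) > threshold
```

Under that reading, a contract priced at 50% that the bot estimates at 60% qualifies, while one it estimates at 55% does not.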
The Problem: Overtrading
By Wednesday, a pattern emerged that we didn't like: the bot was taking three correlated mean reversion setups within a 90-minute window. All three were on ETH. All three had high individual confidence scores. All three were essentially the same trade.
This is the clustering problem. When one strategy generates conviction, it can fire multiple overlapping signals on the same underlying asset. Individually each signal looks clean. Together, they're one oversized, undiversified bet disguised as three separate trades.
Two of those three trades closed at small losses when ETH broke out of its range unexpectedly. The clustering amplified the damage.
Incident: Correlated Entry Clustering
Date: Day 11 (March 27)
What happened: Three ETH long positions opened within 90 minutes. All mean reversion. All valid in isolation. Combined exposure: 15% of paper capital.
Impact: -$8.40 when ETH broke range to the downside. Without clustering, max exposure would have been capped at 5%.
Root cause: No cross-signal correlation check before execution.
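The exposure arithmetic behind the incident is easy to reproduce. The sketch below sums position sizes per asset; the capital figure and position sizes are made-up round numbers chosen so each position is 5% of capital, since the report does not state the actual paper balance.

```python
from collections import defaultdict


def exposure_by_asset(positions, capital):
    """Return each asset's total notional as a fraction of capital."""
    totals = defaultdict(float)
    for asset, notional in positions:
        totals[asset] += notional
    return {asset: total / capital for asset, total in totals.items()}


# Hypothetical numbers: three ETH longs, each sized at 5% of capital.
capital = 1_000.0
positions = [("ETH", 50.0), ("ETH", 50.0), ("ETH", 50.0)]
exposure = exposure_by_asset(positions, capital)  # ETH ends up at 15% of capital
```

Each position passes a 5% per-trade limit on its own; only the per-asset total reveals the 15% concentration.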
The Fix: Cluster Detection
We shipped cluster detection before end of day Thursday. The logic is straightforward:
- Before executing any signal, check open positions and signals queued in the last 60 minutes.
- If two or more signals share the same underlying asset (or have correlation > 0.7), every signal after the first is blocked until the first resolves.
- Max simultaneous exposure per asset: 5% of paper capital. Hard cap.
- Cooldown period: 45 minutes after any closed trade on an asset before new signals on that asset can execute.
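The four rules above could be combined into a single pre-execution gate along these lines. This is a minimal sketch, not the bot's actual code: the `Signal` type, the `should_block` function, and the shape of the correlation lookup are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

LOOKBACK = timedelta(minutes=60)   # window for recently queued signals
COOLDOWN = timedelta(minutes=45)   # wait after a closed trade on an asset
MAX_ASSET_EXPOSURE = 0.05          # hard cap: 5% of paper capital per asset
CORR_LIMIT = 0.7


@dataclass
class Signal:                      # hypothetical record for a signal or position
    asset: str
    size_pct: float                # size as a fraction of paper capital
    created_at: datetime


def correlated(a, b, corr):
    """Same asset, or pairwise correlation above the limit.

    `corr` is an assumed lookup keyed by frozenset({asset_a, asset_b}).
    """
    if a == b:
        return True
    return corr.get(frozenset((a, b)), 0.0) > CORR_LIMIT


def should_block(candidate, open_positions, queued, last_close, corr, now):
    """Return the reason a candidate is blocked, or None if it may execute."""
    closed_at = last_close.get(candidate.asset)
    if closed_at is not None and now - closed_at < COOLDOWN:
        return "cooldown"
    recent = [s for s in queued if now - s.created_at <= LOOKBACK]
    for other in list(open_positions) + recent:
        if correlated(candidate.asset, other.asset, corr):
            return "cluster"
    held = sum(p.size_pct for p in open_positions if p.asset == candidate.asset)
    if held + candidate.size_pct > MAX_ASSET_EXPOSURE:
        return "exposure_cap"
    return None
```

Returning a reason string rather than a bare boolean makes it straightforward to log why a signal was rejected, which matters when reviewing blocked trades after the fact.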
Friday and the weekend were clean. No clustering. Two additional trades executed on separate assets (SOL/USDT and a BTC Bollinger Bounce). Both small, both profitable.
What Broke (Infrastructure)
Issue: LLM provider timeout during a high-load period on Day 9. The primary Claude API call took 18 seconds — above our 15-second threshold — and the bot fell back to Qwen3.5-plus automatically.
Result: The fallback worked. Decision quality was similar. The three-tier fallback (Claude → Qwen → Ollama local) held up under real conditions.
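A timeout-driven fallback chain of this kind might look roughly like the following. It is a sketch under assumptions: the providers are stand-in functions rather than real API clients, and `call_with_fallback` is a hypothetical name. Only the ordering idea and the 15-second threshold come from the report.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def call_with_fallback(prompt, providers, timeout_s=15.0):
    """Try each (name, callable) in order; return (name, reply) from the first
    provider that answers within the timeout."""
    errors = []
    for name, call in providers:
        pool = ThreadPoolExecutor(max_workers=1)
        future = pool.submit(call, prompt)
        try:
            return name, future.result(timeout=timeout_s)
        except FutureTimeout:
            errors.append(f"{name}: no reply within {timeout_s}s")
        except Exception as exc:  # provider raised; move on to the next tier
            errors.append(f"{name}: {exc!r}")
        finally:
            # Do not wait for a stuck call; its thread finishes in the background.
            pool.shutdown(wait=False)
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in providers for illustration only (not real API clients).
def slow_primary(prompt):
    time.sleep(0.3)  # simulates a call that overruns the timeout
    return "primary reply"


def quick_secondary(prompt):
    return "secondary reply"


used, reply = call_with_fallback(
    "Rate this setup",
    [("primary", slow_primary), ("secondary", quick_secondary)],
    timeout_s=0.1,  # shortened for the demo; the report's threshold is 15 seconds
)
```

One caveat with this thread-based approach: a timed-out call is abandoned, not cancelled, so a real client would also want its own request-level timeout.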
Issue: Prompt sanitization gap — the LLM received a Polymarket contract title that included bracketed probability estimates, which it used to anchor its own assessment. This is a form of prompt contamination.
Fix: Strip all numerical probability data from the raw contract title before passing to the LLM. The model now forms its own estimate before seeing market prices.
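One way to strip that kind of anchor from a title is a pair of regular expressions. This is a hedged sketch: the patterns, the `sanitize_title` name, and the example titles are illustrative assumptions, and real contract titles would likely need a broader set of patterns.

```python
import re

# Bracketed or parenthesised chunks containing a percentage or a 0.x decimal.
_BRACKETED = re.compile(
    r"\s*[\[(][^\])]*?(?:\d+(?:\.\d+)?\s*%|0\.\d+)[^\])]*[\])]"
)
# Percentages that appear outside brackets.
_BARE_PCT = re.compile(r"\s*\b\d+(?:\.\d+)?\s*%")


def sanitize_title(title: str) -> str:
    """Remove probability-like numbers from a contract title before it reaches the LLM."""
    cleaned = _BRACKETED.sub("", title)
    cleaned = _BARE_PCT.sub("", cleaned)
    return re.sub(r"\s{2,}", " ", cleaned).strip()
```

For a hypothetical title such as "Will ETH close above $4,000 by June? [Yes 62%]", this returns the question without the bracketed estimate while leaving the price level intact.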
What We Learned
1. High confidence per signal does not mean low portfolio risk. Three 87%-confidence trades on the same asset are not a diversified portfolio. Correlation is the variable the individual-signal confidence score doesn't capture.
2. Mean reversion works in ranging markets, and only in ranging markets. The regime detector is the most important component in the system. Getting the regime classification right is more valuable than improving any individual strategy.
3. Fallbacks should be tested under load, not just in isolation. The provider timeout on Day 9 was unplanned, but it was the best possible way to validate the fallback chain. It worked. Now we know.
Week 3 Focus
- Validate cluster detection across a full week of live conditions
- Add inter-strategy correlation matrix — quantify how much each strategy pair overlaps in terms of underlying assets and entry conditions
- Improve regime detection precision: current classifier uses ADX + BB width. Testing volume profile as a third input.
- Begin tracking equity curve properly — cumulative P&L chart, max drawdown, Sharpe approximation over rolling 7-day window
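The equity-curve metrics in the last item can be sketched with the standard library alone. The definitions below are assumptions, since the report does not specify them: drawdown is measured in dollars from the running peak, and the Sharpe approximation is the mean of daily P&L divided by its sample standard deviation, unannualized and with a zero risk-free rate.

```python
import math
from statistics import mean, stdev


def equity_curve(pnl, start=0.0):
    """Cumulative P&L after each period."""
    curve, total = [], start
    for p in pnl:
        total += p
        curve.append(total)
    return curve


def max_drawdown(curve):
    """Largest drop from a running peak, in the same units as the curve."""
    peak, worst = curve[0], 0.0
    for value in curve:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


def rolling_sharpe(daily_pnl, window=7):
    """Mean / sample stdev of daily P&L over each full window (assumed definition)."""
    out = []
    for end in range(window, len(daily_pnl) + 1):
        chunk = daily_pnl[end - window:end]
        sd = stdev(chunk)
        out.append(mean(chunk) / sd if sd > 0 else math.nan)
    return out
```

With only a week or two of data, a 7-day rolling window yields very few points, so early readings of this ratio should be treated as noisy.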
The Bottom Line
Net P&L for the week: +$14.20 paper profit. That's not a number worth celebrating — the real win is identifying and patching the clustering problem before it had a chance to compound over 90 days.
A trading system that catches its own bugs in week 2 instead of week 8 is a trading system that might actually survive.
Week 3 report publishes next Monday. Wins, losses, and whatever else breaks — you'll see all of it.