Why Our Critic Agent Vetoes Bad Trades
Most trading bots optimize for entries. We optimize for not losing. Here's the adversarial AI layer that challenges every trade before it executes — and the risk gates that protect capital first.
The Problem: Strategy Optimization ≠ Risk Management
There's a fundamental asymmetry in how most trading bots are built. Teams spend months optimizing entry signals — RSI thresholds, MACD crossovers, LLM sentiment analysis — but treat risk management as an afterthought. A few hard-coded stop-losses, a max position size, maybe a daily drawdown circuit breaker if they're sophisticated.
Then they're confused when their bot blows up during a regime shift or gets caught in a stop hunt.
Here's the reality: a mediocre strategy with excellent risk management will outperform an excellent strategy with mediocre risk management. Every time.
At Tacavar, we built our system around this principle. The centerpiece of our risk architecture is what we call the critic agent — an adversarial AI layer that reviews every trade decision before it executes and has the power to veto, adjust, or escalate questionable trades.
Target metrics: 15-25% veto rate, >10% drawdown reduction vs. no critic, <5% escalation rate. If the critic isn't vetoing trades, it's not doing its job.
How the Critic Agent Works
The critic fires after our Trader agent makes a decision but before anything executes or goes to Telegram. It's a separate LLM call — adversarial by design — that receives the full context of the proposed trade and returns one of five verdicts:
- Trade proceeds through normal routing
- Proceed but flag for post-trade watch (ADX shift, drawdown acceleration)
- Suggests smaller size, tighter stop, or different entry (max 2 adjustment cycles)
- Blocks trade entirely, logs reasoning, suggests alternative
- Uncertain: sends to human with 5-minute timeout, then auto-veto
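As a rough illustration, the five verdicts and the two-cycle adjustment limit could be modelled like this. The names `agree`, `adjust`, `veto`, and `escalate` appear in this post; `WATCH`, the `resolve` helper, and the fallback behaviour are assumptions made for the sketch, not our production code.

```python
from enum import Enum


class Verdict(Enum):
    AGREE = "agree"        # trade proceeds through normal routing
    WATCH = "watch"        # proceed, flag for post-trade watch (name is assumed)
    ADJUST = "adjust"      # smaller size, tighter stop, or different entry
    VETO = "veto"          # block the trade and log the reasoning
    ESCALATE = "escalate"  # send to a human; auto-veto after the timeout


MAX_ADJUST_CYCLES = 2  # from the text: at most 2 adjustment cycles


def resolve(verdicts):
    """Walk a sequence of critic verdicts and return the final outcome.

    Assumption: a third ADJUST in a row is treated as a VETO, and an
    exhausted sequence falls back to ESCALATE. The post specifies neither.
    """
    adjustments = 0
    for verdict in verdicts:
        if verdict is not Verdict.ADJUST:
            return verdict
        adjustments += 1
        if adjustments > MAX_ADJUST_CYCLES:
            return Verdict.VETO
    return Verdict.ESCALATE
```

Here `resolve([Verdict.ADJUST, Verdict.AGREE])` ends in `AGREE`, while three consecutive `ADJUST` verdicts end in `VETO`.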
Critically: the critic doesn't just rubber-stamp decisions. Its system prompt explicitly instructs it to challenge the trade. If the Trader cites RSI < 40 as a buy signal, the critic checks whether that's regime-appropriate. If signals conflict, it biases toward caution.
What the Critic Sees
Every review includes the full decision package:
- Proposed action — buy/sell/adjust_risk, position size, confidence score, entry quality score
- Market snapshot — price, ATR, RSI, EMA50/200, ADX, volume vs. average, regime classification
- Portfolio state — current exposure, correlation matrix, VaR, open positions, account balance
- Recent performance — consecutive losses, drawdown %, Sharpe/Sortino ratios, loss streak history
- External signals — Polymarket probabilities, social sentiment, Fear & Greed Index, any contradictory researcher signals
The critic can't make decisions in a vacuum. It sees everything that matters — and it uses that context to find flaws in the Trader's reasoning.
The Veto Conditions That Matter
We don't use binary hard rules for most vetoes. Instead, we use probabilistic conditions that scale with market context. Here are the key veto triggers in our production system:
1. Trend Disagreement (Probabilistic)
If price is below EMA50, ADX > 25 (strong trend), and the negative directional indicator (DI-) is above the positive one (DI+), the critic vetoes with 90% probability. This is a strong downtrend, where buying dips is statistically dangerous.
If ADX is between 20-25 (weaker trend), veto probability drops to 50%. The critic might suggest a smaller size instead of a full veto.
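A minimal sketch of the probabilistic trend check, using the thresholds above. The function names are hypothetical, and two details are assumptions: that the 20-25 band requires the same downtrend conditions, and that no trend-based veto applies below ADX 20.

```python
import random


def trend_veto_probability(price, ema50, adx, di_plus, di_minus):
    """Return the veto probability for a proposed buy."""
    downtrend = price < ema50 and di_minus > di_plus
    if not downtrend:
        return 0.0
    if adx > 25:
        return 0.9   # strong downtrend
    if 20 <= adx <= 25:
        return 0.5   # weaker trend; a size reduction may be suggested instead
    return 0.0       # assumption: no trend veto below ADX 20


def should_veto(probability, rng=random):
    """Draw once against the veto probability. Pass a seeded
    random.Random instance as rng for reproducible behaviour."""
    return rng.random() < probability
```

For example, `trend_veto_probability(95, 100, 28, 15, 30)` gives 0.9, and the same inputs with ADX at 22 give 0.5.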
2. Dynamic Cooldown After Exit
After exiting a position, there's a cooldown period before re-entry is allowed. This isn't fixed — it scales with ATR:
    cooldown_min = max(15, min(60, int(30 * atr_ratio)))
    # High vol = 60min cooldown
    # Low vol = 15min minimum
High volatility = longer cooldown (up to 60 minutes). Low volatility = shorter cooldown (15 minutes minimum). This prevents the bot from whipsawing in choppy conditions.
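To make the clamp concrete, here is the same formula wrapped in a function with a re-entry check. This sketch assumes `atr_ratio` is the current ATR divided by a longer-run baseline ATR, which the formula implies but does not spell out; `reentry_allowed` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def cooldown_minutes(atr_ratio):
    """Cooldown in minutes: 30 * atr_ratio, clamped to the 15-60 range."""
    return max(15, min(60, int(30 * atr_ratio)))


def reentry_allowed(exit_time, now, atr_ratio):
    """True once the volatility-scaled cooldown has elapsed since the exit."""
    return now - exit_time >= timedelta(minutes=cooldown_minutes(atr_ratio))


exit_time = datetime(2025, 1, 1, 12, 0)
# atr_ratio 0.3 -> 15 min floor, 1.0 -> 30 min, 2.5 -> 60 min cap
print(cooldown_minutes(0.3), cooldown_minutes(1.0), cooldown_minutes(2.5))
print(reentry_allowed(exit_time, exit_time + timedelta(minutes=20), 1.0))
```

With an `atr_ratio` of 1.0, a re-entry attempt 20 minutes after the exit is still inside the 30-minute cooldown.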
3. Entry Quality + Volume Spike
If entry quality score is below 70 and volume is >2× the 20-period average, the critic vetoes. This is a classic fakeout pattern — sudden volume spike on a low-quality entry is often a stop hunt or liquidity grab.
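The fakeout rule reduces to two comparisons. In this sketch, `is_fakeout` and its parameters are hypothetical names, and returning `False` when there is no volume history is an assumption.

```python
def is_fakeout(entry_quality, current_volume, recent_volumes, lookback=20):
    """Flag a low-quality entry that arrives on a volume spike.

    recent_volumes holds per-period volumes, oldest first; only the
    last `lookback` periods are averaged.
    """
    window = recent_volumes[-lookback:]
    if not window:
        return False  # assumption: no history means no spike signal
    average = sum(window) / len(window)
    return entry_quality < 70 and current_volume > 2 * average
```

An entry scored 65 on 2,500 units of volume, against a 20-period average of 1,000, would be vetoed; the same volume with a score of 75 would not.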
4. Drawdown Tiers
Drawdown vetoes are tiered:
- >5% drawdown: Hard veto — no new positions allowed
- 3-5% drawdown: Position size halved automatically
- 1.5-3% drawdown: Routes to Opus (stronger model) for review
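The tiers map naturally onto a small lookup function. This is a sketch: `DrawdownAction` and `drawdown_action` are hypothetical names, and the handling of the exact boundaries (3% falls in the halving tier, 5% is not yet a hard veto) is an assumption, since the tiers above leave it open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrawdownAction:
    allow_new_positions: bool
    size_multiplier: float
    route_to_opus: bool


def drawdown_action(drawdown_pct):
    """Map current drawdown (in percent) to the tiered response."""
    if drawdown_pct > 5.0:
        # Hard veto: nothing is reviewed, so no model routing is needed.
        return DrawdownAction(False, 0.0, False)
    if drawdown_pct >= 3.0:
        # Size halved; still above 1.5%, so the stronger model reviews it.
        return DrawdownAction(True, 0.5, True)
    if drawdown_pct > 1.5:
        return DrawdownAction(True, 1.0, True)
    return DrawdownAction(True, 1.0, False)
```

At 4% drawdown, for instance, a proposed trade is still allowed but at half size.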
5. Consecutive Losses
After 2 consecutive losses, the critic routes reviews to Opus instead of the faster Sonnet model. After 4 consecutive losses, it escalates to human review and no trades execute automatically. Loss streaks often indicate regime mismatch, not bad luck.
6. Portfolio Exposure Cap
Any trade that would push crypto exposure above 10% of total portfolio is vetoed. This is a hard cap — no exceptions. Diversification isn't optional.
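A sketch of the cap check, evaluated on the exposure the trade would produce, not the current exposure. The function and parameter names are hypothetical, and the sketch assumes all values are in the same currency and that the purchase is funded from cash inside the portfolio, so total portfolio value is unchanged by the trade.

```python
CRYPTO_EXPOSURE_CAP = 0.10  # 10% of total portfolio, per the hard cap above


def exceeds_exposure_cap(current_crypto_value, trade_value, portfolio_value,
                         cap=CRYPTO_EXPOSURE_CAP):
    """True if executing the trade would push crypto exposure above the cap."""
    if portfolio_value <= 0:
        return True  # assumption: an unknown or empty portfolio fails closed
    projected = (current_crypto_value + trade_value) / portfolio_value
    return projected > cap
```

With $8,000 of crypto in a $100,000 portfolio, a $3,000 buy would take exposure to 11% and be vetoed, while a $500 buy would pass.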
7. Open Interest Squeeze Risk
We calculate an OI squeeze score (0-100) based on:
- Current OI vs. 7-day average
- Funding rate extremes
- Long/short ratio imbalance
- Taker buy/sell pressure
If squeeze score ≥ 80, the critic vetoes new long positions. This is a crowded trade — the risk of a long squeeze (forced liquidations cascading) is too high.
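To show how four inputs can combine into a 0-100 score, here is one possible shape. Everything numeric in it is an assumption for illustration: the equal 25-point weights, the saturation points, and the parameter names are not our production values. Only the four inputs and the threshold of 80 come from the description above.

```python
def clamp01(x):
    return max(0.0, min(1.0, x))


def squeeze_score(oi, oi_7d_avg, funding_rate, long_short_ratio, taker_buy_ratio):
    """Combine four crowding signals into a 0-100 score (illustrative weights)."""
    oi_part = clamp01((oi / oi_7d_avg - 1.0) / 0.5)        # +50% vs 7-day avg saturates
    funding_part = clamp01(funding_rate / 0.001)           # 0.1% per interval saturates
    ratio_part = clamp01((long_short_ratio - 1.0) / 2.0)   # 3:1 longs saturates
    taker_part = clamp01((taker_buy_ratio - 0.5) / 0.25)   # 75% taker buys saturates
    return 25.0 * (oi_part + funding_part + ratio_part + taker_part)


def veto_new_long(score):
    """Veto threshold from the text: squeeze score of 80 or higher."""
    return score >= 80
```

Under these assumed weights, a market with every input at its saturation point scores 100, and a perfectly balanced one scores 0.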
8. Cluster Detection
If more than 2 capital actions are proposed within a 10-minute window, the critic forces review on all of them. This prevents herding behavior — multiple similar trades fired in quick succession during volatile conditions.
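Cluster detection is a sliding-window count. A minimal sketch, with a hypothetical `ClusterDetector` class and timestamps given in seconds:

```python
from collections import deque


class ClusterDetector:
    """Track proposed capital actions inside a rolling time window."""

    def __init__(self, window_seconds=600, max_actions=2):
        self.window_seconds = window_seconds
        self.max_actions = max_actions
        self._times = deque()

    def record(self, timestamp):
        """Record an action; return True if review should be forced."""
        self._times.append(timestamp)
        # Drop actions that have aged out of the window.
        while timestamp - self._times[0] > self.window_seconds:
            self._times.popleft()
        return len(self._times) > self.max_actions
```

Three proposals at 0, 100, and 200 seconds trip the detector on the third call. When that happens, the caller is responsible for sending every action still in the window to review, not just the latest one.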
Example: How a Veto Plays Out
Trader proposes: buy ETH-USD, 5% size, confidence 0.82
Critic response: VETO — ADX=28 downtrend, price below EMA50, consecutive losses = 2. Alternative: Consider hold until ADX < 25 or price reclaims EMA50.
Result: Trade blocked. Logged to decisions.jsonl with full reasoning. Telegram notification sent with veto reason and alternative.
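Appending a verdict to a JSON Lines file takes only a few lines. In this sketch the record fields and the `log_decision` helper are assumptions; only the `decisions.jsonl` file name comes from our setup.

```python
import json
from datetime import datetime, timezone


def log_decision(path, proposal, verdict, reasoning, alternative=None):
    """Append one critic decision as a single JSON line and return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "proposal": proposal,
        "verdict": verdict,
        "reasoning": reasoning,
        "alternative": alternative,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

The veto above could be logged as `log_decision("decisions.jsonl", {"action": "buy", "symbol": "ETH-USD", "size_pct": 5, "confidence": 0.82}, "veto", "ADX=28 downtrend, price below EMA50, consecutive losses = 2", "Hold until ADX < 25 or price reclaims EMA50")`.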
Model Routing: When We Use Opus vs. Sonnet
Running the critic through a top-tier LLM for every decision would be prohibitively expensive. We use intelligent model routing:
Default: Claude Sonnet — fast and cheap, suitable for routine decisions in normal conditions.
Escalate to Opus when any of:
- Consecutive losses ≥ 2
- Drawdown > 1.5%
- Trader confidence < 0.75
- Researcher signals contradict Trader
Opus is slower and more expensive, but it's better at nuanced reasoning in edge cases. We only pay for that capability when the situation warrants it.
Cost monitor: If monthly critic spend exceeds $50, we log a warning and route non-high-risk decisions to Sonnet regardless of loss count. Risk management can't bankrupt the operation.
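The routing rules and the cost monitor fit in one function. A sketch under stated assumptions: `choose_model` and the `high_risk` flag are hypothetical, since this post does not define what counts as a high-risk decision, and the caller is expected to supply it.

```python
import logging

logger = logging.getLogger("critic.routing")
MONTHLY_BUDGET_USD = 50.0


def choose_model(consecutive_losses, drawdown_pct, confidence,
                 signals_contradict, monthly_spend_usd=0.0, high_risk=False):
    """Return "opus" or "sonnet" for the next critic review."""
    wants_opus = (
        consecutive_losses >= 2
        or drawdown_pct > 1.5
        or confidence < 0.75
        or signals_contradict
    )
    if monthly_spend_usd > MONTHLY_BUDGET_USD:
        logger.warning("Critic spend $%.2f exceeds monthly budget", monthly_spend_usd)
        if not high_risk:
            return "sonnet"  # budget exceeded: only high-risk reviews keep Opus
    return "opus" if wants_opus else "sonnet"
```

With spend under budget, two consecutive losses route to Opus. Once spend passes $50, the same decision falls back to Sonnet unless it is flagged high-risk.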
Latency Requirements and Error Handling
The critic can't add meaningful latency. Target: <200ms for agree/adjust verdicts. We run the critic call asynchronously while preparing the confidence routing.
Timeout is 3 seconds. If the critic doesn't respond in time, we default to escalate — better to slow down and get human review than to blindly proceed.
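One way to enforce the 3-second limit is `asyncio.wait_for`, with the escalate verdict as the fallback. This is a sketch: `review_with_timeout`, the dict-shaped verdict, and the stub critics are assumptions standing in for the real LLM call.

```python
import asyncio


async def review_with_timeout(critic_call, timeout_s=3.0):
    """Await the critic; on timeout, default to an escalate verdict."""
    try:
        return await asyncio.wait_for(critic_call(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return {"verdict": "escalate", "reason": "Critic timeout"}


async def fast_critic():
    # Stub for a critic that answers immediately.
    return {"verdict": "agree", "reason": "ok"}


async def slow_critic():
    # Stub for a critic that takes too long to answer.
    await asyncio.sleep(0.2)
    return {"verdict": "agree", "reason": "ok"}


if __name__ == "__main__":
    print(asyncio.run(review_with_timeout(fast_critic)))
    print(asyncio.run(review_with_timeout(slow_critic, timeout_s=0.01)))
```

`asyncio.wait_for` cancels the pending call when the timeout fires, so a late critic response cannot overwrite the escalation.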
Error handling is explicit:
    import json

    try:
        verdict = run_critic_review(decision, context)
    except json.JSONDecodeError:
        # Critic returned malformed JSON
        verdict = escalate("Invalid JSON from critic")
    except TimeoutError:
        verdict = escalate("Critic timeout")
    except Exception as e:
        # Any other failure: log it, then escalate
        log_error(f"Critic failed: {e}")
        verdict = escalate("Critic error")
If the critic fails for any reason, we escalate. Never silently proceed without risk review.
Why This Architecture Matters
The critic agent isn't just a risk control — it's a philosophical statement about how to build AI trading systems.
Most teams build bots that optimize for being right. They want the highest win rate, the sharpest backtest, the most impressive P&L chart. The problem: being right 60% of the time still leaves you exposed to ruin on the 40% you're wrong.
We optimize for not being wrong in dangerous ways. The critic doesn't care about win rate. It cares about:
- Is this trade appropriate for the current regime?
- Are we ignoring contradictory signals?
- Would this position expose us to unacceptable drawdown?
- Are we repeating a pattern that's lost money before?
This is adversarial by design. The critic's job is to find flaws, not confirmations. It's the difference between a rubber-stamp compliance team and a skeptical risk committee at a systematic trading firm.
The Results So Far
We're in week 3 of our 90-day paper trading challenge. The critic has vetoed approximately 18% of proposed trades — right in our target range. Common veto reasons:
- ADX downtrend + Trader trying to buy dips (40% of vetoes)
- Consecutive losses triggering escalation (25% of vetoes)
- OI squeeze risk on crowded longs (15% of vetoes)
- Cluster detection — too many actions in short window (12% of vetoes)
- Entry quality + volume spike fakeout pattern (8% of vetoes)
We'll publish full critic performance data at the end of the 90-day challenge — veto accuracy, false positive rate, and estimated P&L impact from blocked trades. Transparency is part of the commitment.
See the Critic Agent in Action
Every critic verdict, every veto, every escalation — logged publicly as part of our 90-Day Paper Trading Challenge. No selective reporting. No spin.