Lesson Learned: Claude Opus 4.5 Optimization (Dec 11, 2025)
- ID: ll_011
- Date: December 11, 2025
- Severity: OPTIMIZATION
- Category: AI/ML, Cost Optimization, Model Selection
- Impact: 40-60% reduction in LLM costs while maintaining quality
Executive Summary
Implemented effort-based model selection and smart council patterns for Claude Opus 4.5, enabling significant cost reduction while maintaining decision quality for critical trades.
The Opportunity
Before Optimization
| Metric | Value |
|---|---|
| Model Used | Full council for ALL trades |
| Cost per Trade | ~$0.08-0.10 |
| Time per Analysis | 15-30 seconds |
| Council Models | Always 3+ models |
After Optimization
| Metric | Value |
|---|---|
| Routine Trades | Single model (Gemini Flash/Sonnet) |
| Standard Trades | 2-model consensus |
| Critical Trades | Full 3-stage council |
| Estimated Savings | 40-60% |
Implementation
1. EffortLevel Enum
```python
from enum import Enum

class EffortLevel(Enum):
    LOW = "low"        # Haiku/Flash, 500 tokens, temp 0.3
    MEDIUM = "medium"  # Sonnet, 2000 tokens, temp 0.5
    HIGH = "high"      # Opus, 4096 tokens, temp 0.7
```
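The verification tests below also reference an `EffortConfig.for_level()` helper. A minimal sketch consistent with those tests might look like this (`EffortLevel` is re-declared and the `_PRESETS` table is an assumption, so the snippet runs standalone; the real class lives in `src/agent_framework/agent_sdk_config.py`):

```python
from dataclasses import dataclass
from enum import Enum


class EffortLevel(Enum):  # re-declared so this sketch is self-contained
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# (max_tokens, temperature) per level, matching the comments above
_PRESETS = {
    EffortLevel.LOW: (500, 0.3),
    EffortLevel.MEDIUM: (2000, 0.5),
    EffortLevel.HIGH: (4096, 0.7),
}


@dataclass(frozen=True)
class EffortConfig:
    max_tokens: int
    temperature: float

    @classmethod
    def for_level(cls, level: EffortLevel) -> "EffortConfig":
        max_tokens, temperature = _PRESETS[level]
        return cls(max_tokens=max_tokens, temperature=temperature)
```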
2. Agent-Specific Effort
```python
agent_effort_levels = {
    "execution_agent": EffortLevel.LOW,   # Simple order execution
    "signal_agent": EffortLevel.MEDIUM,   # Standard analysis
    "risk_agent": EffortLevel.MEDIUM,     # Risk calculations
    "research_agent": EffortLevel.HIGH,   # Deep research
    "rl_agent": EffortLevel.HIGH,         # RL reasoning
    "meta_agent": EffortLevel.HIGH,       # Coordination
}
```
3. Confidence-Based Escalation
```python
def should_escalate_model(confidence: float, current_effort: EffortLevel) -> bool:
    """Escalate if confidence < 0.7 and not already at HIGH."""
    if current_effort == EffortLevel.HIGH:
        return False
    return confidence < 0.7
```
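One way the escalation check could be wired into an analysis loop is sketched below. This is illustrative, not the production code: `EffortLevel` is re-declared so the snippet runs standalone, and `run_analysis` is a hypothetical stand-in for the real model call.

```python
from enum import Enum


class EffortLevel(Enum):  # re-declared so this sketch is self-contained
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


_NEXT = {EffortLevel.LOW: EffortLevel.MEDIUM, EffortLevel.MEDIUM: EffortLevel.HIGH}

CONFIDENCE_THRESHOLD = 0.7


def analyze_with_escalation(signal, run_analysis, effort=EffortLevel.LOW):
    """Start at a cheap effort level; re-run one level up while the model
    reports low confidence (mirrors should_escalate_model above).

    run_analysis(signal, effort) -> (decision, confidence) is a
    hypothetical stand-in for the actual model call.
    """
    decision, confidence = run_analysis(signal, effort)
    while confidence < CONFIDENCE_THRESHOLD and effort in _NEXT:
        effort = _NEXT[effort]
        decision, confidence = run_analysis(signal, effort)
    return decision, confidence, effort
```

A LOW-effort run that comes back uncertain is retried at MEDIUM, then HIGH; a confident first answer never pays for the bigger model.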
4. Smart Council Modes
| Mode | Use Case | Cost |
|---|---|---|
| `routine` | Standard signals, routine checks | Single model |
| `standard` | Important trades, unclear signals | 2-model consensus |
| `critical` | Large positions, risky conditions | Full 3-stage council |
Key Decisions
When to Use Each Mode
ROUTINE (80% of trades):
- Standard technical signals (MACD, RSI)
- Small position sizes (< $50)
- High historical win rate patterns
- Routine rebalancing
STANDARD (15% of trades):
- Mid-size positions ($50-200)
- Mixed signals (some bullish, some bearish)
- New symbols not in training data
- Market regime uncertainty
CRITICAL (5% of trades):
- Large positions (> $200)
- High volatility conditions
- Major news events
- Position exits (lock in gains/cut losses)
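The routing rules above can be condensed into a single dispatch function. The function name and thresholds here are illustrative (taken from the criteria listed above), not the production API:

```python
def select_council_mode(position_usd: float,
                        high_volatility: bool = False,
                        major_news: bool = False,
                        is_exit: bool = False,
                        mixed_signals: bool = False) -> str:
    """Map trade attributes to a council mode per the rules above.

    Thresholds ($50 / $200) mirror the documented position-size bands.
    """
    # CRITICAL: large size, risky conditions, or locking in an exit
    if position_usd > 200 or high_volatility or major_news or is_exit:
        return "critical"
    # STANDARD: mid-size positions or ambiguous signals
    if position_usd >= 50 or mixed_signals:
        return "standard"
    # ROUTINE: everything else (small, well-understood trades)
    return "routine"
```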
Verification Tests
Test 1: Effort Configuration Works
```python
def test_effort_config_levels():
    """Verify effort configs are correctly defined."""
    from src.agent_framework.agent_sdk_config import EffortConfig, EffortLevel

    low = EffortConfig.for_level(EffortLevel.LOW)
    assert low.max_tokens == 500
    assert low.temperature == 0.3

    high = EffortConfig.for_level(EffortLevel.HIGH)
    assert high.max_tokens == 4096
```
Test 2: Model Selection by Effort
```python
def test_model_selection_by_effort():
    """Verify correct model selection per effort level."""
    from src.agent_framework.agent_sdk_config import get_agent_sdk_config, EffortLevel

    config = get_agent_sdk_config()
    assert "haiku" in config.get_model_for_effort(EffortLevel.LOW)
    assert "sonnet" in config.get_model_for_effort(EffortLevel.MEDIUM)
    assert "opus" in config.get_model_for_effort(EffortLevel.HIGH)
```
Test 3: Confidence Escalation
```python
def test_confidence_escalation_triggers():
    """Verify escalation triggers at confidence threshold."""
    from src.agent_framework.agent_sdk_config import get_agent_sdk_config, EffortLevel

    config = get_agent_sdk_config()
    # Low confidence should escalate
    assert config.should_escalate_model(0.5, EffortLevel.LOW)
    assert config.should_escalate_model(0.5, EffortLevel.MEDIUM)
    # High confidence should not escalate
    assert not config.should_escalate_model(0.9, EffortLevel.LOW)
    # HIGH effort never escalates
    assert not config.should_escalate_model(0.3, EffortLevel.HIGH)
```
Metrics to Track
| Metric | Target | Alert Threshold |
|---|---|---|
| Cost per trade (routine) | < $0.02 | > $0.05 |
| Cost per trade (critical) | < $0.10 | > $0.15 |
| Win rate (routine) | > 55% | < 50% |
| Win rate (critical) | > 65% | < 55% |
| Escalation rate | < 20% | > 40% |
| Council usage | < 10% | > 25% |
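A lightweight way to act on the alert thresholds above is a table of predicates checked against observed metrics. The metric names and fraction-based values (0.50 rather than 50%) are assumptions for this sketch:

```python
# Alert predicates mirroring the table above (rates expressed as fractions)
ALERTS = {
    "cost_routine": lambda v: v > 0.05,      # $ per routine trade
    "cost_critical": lambda v: v > 0.15,     # $ per critical trade
    "win_rate_routine": lambda v: v < 0.50,
    "win_rate_critical": lambda v: v < 0.55,
    "escalation_rate": lambda v: v > 0.40,
    "council_usage": lambda v: v > 0.25,
}


def check_alerts(metrics: dict) -> list:
    """Return the names of tracked metrics breaching their alert threshold."""
    return [name for name, breached in ALERTS.items()
            if name in metrics and breached(metrics[name])]
```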
Key Principles
1. Start Low, Escalate on Uncertainty
“Don’t pay for Opus when Haiku will do.”
2. Full Council Only for Critical Decisions
“Multi-model consensus is expensive. Use it wisely.”
3. Confidence-Driven Model Selection
“Let the model tell you when it needs help.”
4. Trade Importance Determines Analysis Depth
“A $10 trade doesn’t need a $0.10 analysis.”
Cost Analysis
Opus 4.5 Pricing (67% cheaper than Opus 3)
| Model | Input (per 1M) | Output (per 1M) |
|---|---|---|
| Opus 4.5 | $5 | $25 |
| Sonnet 4.5 | $3 | $15 |
| Haiku 3.5 | $0.25 | $1.25 |
Expected Monthly Savings
- Before: ~$50/month (full council every trade)
- After: ~$20/month (smart routing)
- Savings: ~$30/month (60%)
Integration with RAG
This lesson is indexed for:
- Model Selection Queries: “which model should I use for X”
- Cost Optimization Queries: “how to reduce LLM costs”
- Agent Configuration Queries: “how to configure agent effort levels”
Related Files
- `src/agent_framework/agent_sdk_config.py` - EffortLevel, EffortConfig
- `src/core/multi_llm_analysis.py` - smart_council(), quick_analysis()
- `tests/test_opus_optimization.py` - Verification tests
Tags
#opus-4.5 #cost-optimization #model-selection #effort-level #smart-council #llm #rag #ml
Change Log
- 2025-12-11: Initial implementation
- 2025-12-11: Added to RAG lessons learned