Lesson Learned: Claude Opus 4.5 Optimization (Dec 11, 2025)

ID: ll_011 Date: December 11, 2025 Severity: OPTIMIZATION Category: AI/ML, Cost Optimization, Model Selection Impact: 40-60% reduction in LLM costs while maintaining quality

Executive Summary

Implemented effort-based model selection and smart council patterns for Claude Opus 4.5, enabling significant cost reduction while maintaining decision quality for critical trades.

The Opportunity

Before Optimization

Metric	Value
Model Used	Full council for ALL trades
Cost per Trade	~$0.08-0.10
Time per Analysis	15-30 seconds
Council Models	Always 3+ models

After Optimization

Metric	Value
Routine Trades	Single model (Gemini Flash/Sonnet)
Standard Trades	2-model consensus
Critical Trades	Full 3-stage council
Estimated Savings	40-60%

Implementation

1. EffortLevel Enum

class EffortLevel(Enum):
    LOW = "low"      # Haiku/Flash, 500 tokens, temp 0.3
    MEDIUM = "medium"  # Sonnet, 2000 tokens, temp 0.5
    HIGH = "high"    # Opus, 4096 tokens, temp 0.7

2. Agent-Specific Effort

agent_effort_levels = {
    "execution_agent": EffortLevel.LOW,   # Simple order execution
    "signal_agent": EffortLevel.MEDIUM,   # Standard analysis
    "risk_agent": EffortLevel.MEDIUM,     # Risk calculations
    "research_agent": EffortLevel.HIGH,   # Deep research
    "rl_agent": EffortLevel.HIGH,         # RL reasoning
    "meta_agent": EffortLevel.HIGH,       # Coordination
}

3. Confidence-Based Escalation

def should_escalate_model(confidence: float, current_effort: EffortLevel) -> bool:
    """Escalate if confidence < 0.7 and not already at HIGH"""
    if current_effort == EffortLevel.HIGH:
        return False
    return confidence < 0.7

4. Smart Council Modes

Mode	Use Case	Cost
`routine`	Standard signals, routine checks	Single model
`standard`	Important trades, unclear signals	2-model consensus
`critical`	Large positions, risky conditions	Full 3-stage council

Key Decisions

When to Use Each Mode

ROUTINE (80% of trades):

Standard technical signals (MACD, RSI)
Small position sizes (< $50)
High historical win rate patterns
Routine rebalancing

STANDARD (15% of trades):

Mid-size positions ($50-200)
Mixed signals (some bullish, some bearish)
New symbols not in training data
Market regime uncertainty

CRITICAL (5% of trades):

Large positions (> $200)
High volatility conditions
Major news events
Position exits (lock in gains/cut losses)

Verification Tests

Test 1: Effort Configuration Works

def test_effort_config_levels():
    """Verify effort configs are correctly defined."""
    from src.agent_framework.agent_sdk_config import EffortConfig, EffortLevel

    low = EffortConfig.for_level(EffortLevel.LOW)
    assert low.max_tokens == 500
    assert low.temperature == 0.3

    high = EffortConfig.for_level(EffortLevel.HIGH)
    assert high.max_tokens == 4096

Test 2: Model Selection by Effort

def test_model_selection_by_effort():
    """Verify correct model selection per effort level."""
    from src.agent_framework.agent_sdk_config import get_agent_sdk_config, EffortLevel

    config = get_agent_sdk_config()

    assert "haiku" in config.get_model_for_effort(EffortLevel.LOW)
    assert "sonnet" in config.get_model_for_effort(EffortLevel.MEDIUM)
    assert "opus" in config.get_model_for_effort(EffortLevel.HIGH)

Test 3: Confidence Escalation

def test_confidence_escalation_triggers():
    """Verify escalation triggers at confidence threshold."""
    from src.agent_framework.agent_sdk_config import get_agent_sdk_config, EffortLevel

    config = get_agent_sdk_config()

    # Low confidence should escalate
    assert config.should_escalate_model(0.5, EffortLevel.LOW) == True
    assert config.should_escalate_model(0.5, EffortLevel.MEDIUM) == True

    # High confidence should not escalate
    assert config.should_escalate_model(0.9, EffortLevel.LOW) == False

    # HIGH effort never escalates
    assert config.should_escalate_model(0.3, EffortLevel.HIGH) == False

Metrics to Track

Metric	Target	Alert Threshold
Cost per trade (routine)	< $0.02	> $0.05
Cost per trade (critical)	< $0.10	> $0.15
Win rate (routine)	> 55%	< 50%
Win rate (critical)	> 65%	< 55%
Escalation rate	< 20%	> 40%
Council usage	< 10%	> 25%

Key Principles

1. Start Low, Escalate on Uncertainty

“Don’t pay for Opus when Haiku will do.”

2. Full Council Only for Critical Decisions

“Multi-model consensus is expensive. Use it wisely.”

3. Confidence-Driven Model Selection

“Let the model tell you when it needs help.”

4. Trade Importance Determines Analysis Depth

“A $10 trade doesn’t need a $0.10 analysis.”

Cost Analysis

Opus 4.5 Pricing (67% cheaper than Opus 3)

| Model | Input (per 1M) | Output (per 1M) | |——-|—————-|—————–| | Opus 4.5 | $5 | $25 | | Sonnet 4.5 | $3 | $15 | | Haiku 3.5 | $0.25 | $1.25 |

Expected Monthly Savings

Before: ~$50/month (full council every trade)
After: ~$20/month (smart routing)
Savings: ~$30/month (60%)

Integration with RAG

This lesson is indexed for:

Model Selection Queries: “which model should I use for X”
Cost Optimization Queries: “how to reduce LLM costs”
Agent Configuration Queries: “how to configure agent effort levels”

src/agent_framework/agent_sdk_config.py - EffortLevel, EffortConfig
src/core/multi_llm_analysis.py - smart_council(), quick_analysis()
tests/test_opus_optimization.py - Verification tests

Change Log

2025-12-11: Initial implementation
2025-12-11: Added to RAG lessons learned