Lesson Learned: Deep Research Safety Improvements (Dec 11, 2025)

ID: ll_012 Date: December 11, 2025 Severity: HIGH Category: Risk Management, Safety Systems, ML Integration Impact: Proactive improvements to prevent future trading losses

Executive Summary

Based on Deep Research analysis of the trading repository, four critical safety improvements were identified and implemented to support the $100/day North Star goal while preventing catastrophic losses. These improvements replace “dumb” fixed percentage limits with intelligent, data-driven safety gates.

The Analysis

What Was Identified

Issue Problem Solution
Fixed 10% Position Limits Too aggressive in quiet markets, too loose in volatile ATR-based volatility-adjusted limits
Execution Latency Signal price vs entry price drift Drift detection test
Revenge Trading Risk Bot can spiral during market chop $50/hour heartbeat auto-disable
LLM Hallucinations Invalid outputs could reach broker Regex validation before execution

Root Cause: Static vs Dynamic Safety Gates

The original system used hardcoded percentage limits that don’t account for market conditions:

# OLD (static, "dumb")
MAX_POSITION_PCT = 0.10  # Always 10%, regardless of market volatility

# NEW (dynamic, "smart")
if atr_pct < 0.01:    # Calm market
    max_position = 0.08  # Can be more aggressive
elif atr_pct > 0.03:  # Volatile market
    max_position = 0.03  # Must be conservative
else:
    max_position = 0.05  # Normal sizing

The Fix

1. ATR-Based Volatility-Adjusted Limits

File: src/safety/volatility_adjusted_safety.py (class ATRBasedLimits)

How It Works:

  • Calculates 14-day ATR as percentage of price
  • Classifies market regime: calm, normal, volatile, extreme
  • Dynamically adjusts position limits based on volatility

Regime-Based Limits: | Regime | ATR% Range | Max Position | |——–|————|————–| | Calm | < 1% | 8% | | Normal | 1-2% | 5% | | Volatile | 2-3% | 3% | | Extreme | > 5% | 1% |

2. Drift Detection Test

File: src/safety/volatility_adjusted_safety.py (class DriftDetector)

How It Works:

  • Records signal price at decision time
  • Compares to entry price at execution time
  • Warns if drift > 0.1%, aborts if drift > 0.5%

Usage:

from src.safety.pre_trade_hook import record_signal_price, validate_before_trade

# When signal is generated
record_signal_price("SPY", 500.00)

# When order is about to execute
result = validate_before_trade(
    symbol="SPY", side="buy", amount=10.0,
    portfolio_value=100000.0, entry_price=500.05
)
# Will warn if drift > 0.1%

3. Hourly Loss Heartbeat

File: src/safety/volatility_adjusted_safety.py (class HourlyLossHeartbeat)

How It Works:

  • Tracks cumulative hourly losses
  • Auto-disables trading if losses exceed $50 in one hour
  • Prevents “revenge trading” spirals during market chop
  • Auto-resets at the next hour

Critical Insight: Daily circuit breakers are too coarse. A bot can lose significant money in rapid succession within a single hour. The heartbeat catches this pattern.

4. LLM Hallucination Check

File: src/safety/volatility_adjusted_safety.py (class LLMHallucinationChecker)

How It Works:

  • Regex validates ticker symbols (1-5 uppercase letters)
  • Checks quantities for NaN, negative, impossibly large values
  • Validates sentiment scores within [-1, 1] range
  • Catches suspicious values: “nan”, “undefined”, “null”, etc.

Example Catches:

# These will be BLOCKED:
{"ticker": "NaN", "side": "buy"}           # Invalid ticker
{"ticker": "AAPL", "quantity": 1000000}    # Impossible quantity
{"ticker": "MSFT", "sentiment": "undefined"}  # Hallucinated value

Integration

Pre-Trade Hook Enhanced

File: src/safety/pre_trade_hook.py

All four checks integrated into validate_before_trade():

result = validate_before_trade(
    symbol="SPY",
    side="buy",
    amount=10.0,
    portfolio_value=100000.0,
    entry_price=500.0,        # For drift detection
    llm_output={"ticker": "SPY", "side": "buy"}  # For hallucination check
)

# Result includes:
# - circuit_breaker: Daily loss check
# - verification_framework: RAG lessons check
# - volatility_adjusted_safety: All 4 new checks

Key Insights

Why “Smart” Gates Beat “Dumb” Gates

Dumb Gate Smart Gate
Fixed 10% always ATR-adjusted 1-8%
Daily loss limit Hourly heartbeat
No drift check 0.1%/0.5% thresholds
No LLM validation Regex + range checks

Quote from Deep Research

“Risk management gates must be backed by data, not arbitrary percentages.”

“Fat finger protection at the broker API level prevents the AI from accidentally betting 100% of the account.”

Prevention Rules

Rule 1: Always Use Entry Price for Drift Detection

When processing a trading signal:

from src.safety.pre_trade_hook import record_signal_price

# Record when signal is generated
record_signal_price(symbol, current_price)

# Later, pass entry_price to validation
validate_before_trade(..., entry_price=execution_price)

Rule 2: Validate LLM Outputs Before Execution

Never pass raw LLM output to trading logic:

from src.safety.volatility_adjusted_safety import get_hallucination_checker

checker = get_hallucination_checker()
result = checker.validate_trade_signal(llm_output)

if not result.is_valid:
    logger.error(f"LLM hallucination: {result.errors}")
    return  # Don't execute

Rule 3: Record Trade Results for Heartbeat

After every trade:

from src.safety.pre_trade_hook import record_trade_result_for_heartbeat

status = record_trade_result_for_heartbeat(symbol, profit_loss)
if status["is_blocked"]:
    logger.warning(f"Hourly heartbeat triggered: {status['reason']}")

Verification Tests

Test 1: ATR-Based Sizing

def test_atr_based_sizing():
    from src.safety.volatility_adjusted_safety import ATRBasedLimits

    atr = ATRBasedLimits()

    # Calm market (1% ATR)
    result = atr.calculate_position_limit("SPY", 500, atr_value=5)
    assert result.volatility_regime == "calm"
    assert result.adjusted_limit_pct >= 0.05

    # Volatile market (3% ATR)
    result = atr.calculate_position_limit("NVDA", 500, atr_value=15)
    assert result.volatility_regime == "volatile"
    assert result.adjusted_limit_pct <= 0.03

Test 2: Drift Detection

def test_drift_detection():
    from src.safety.volatility_adjusted_safety import DriftDetector

    drift = DriftDetector()
    drift.record_signal("SPY", 500.00)

    # Small drift - OK
    result = drift.check_drift("SPY", 500.05)
    assert not result.should_abort

    # Large drift - ABORT
    drift.record_signal("SPY", 500.00)
    result = drift.check_drift("SPY", 502.50)
    assert result.should_abort

Test 3: Hourly Heartbeat

def test_hourly_heartbeat():
    from src.safety.volatility_adjusted_safety import HourlyLossHeartbeat

    heartbeat = HourlyLossHeartbeat(hourly_loss_limit=50)

    # Record losses
    heartbeat.record_trade_result("SPY", -20)
    heartbeat.record_trade_result("SPY", -20)
    heartbeat.record_trade_result("SPY", -15)

    # Should be blocked
    status = heartbeat.check_heartbeat()
    assert status.is_blocked

Test 4: Hallucination Check

def test_hallucination_check():
    from src.safety.volatility_adjusted_safety import LLMHallucinationChecker

    checker = LLMHallucinationChecker()

    # Valid output
    result = checker.validate_trade_signal({"ticker": "AAPL", "side": "buy"})
    assert result.is_valid

    # Invalid ticker
    result = checker.validate_trade_signal({"ticker": "NaN", "side": "buy"})
    assert not result.is_valid

RAG Integration

Automatic Learning Loop

When anomalies are detected, they’re automatically ingested into RAG:

Trade Executes → Anomaly Detected → Record to lessons_learned_rag →
Update Pattern Database → Future Trades Check RAG → Prevent Similar Issues

Vector Store Integration

The lessons learned system uses semantic search (sentence-transformers) to find relevant past mistakes based on:

  • Trade context (symbol, side, amount)
  • Error descriptions
  • Root cause patterns
  • ll_009_ci_syntax_failure_dec11.md - CI/CD safety gaps
  • ll_011_missing_function_imports_dec11.md - Import validation
  • (Future) GitHub Actions integration for automated checks

Tags

#risk-management #atr #volatility #drift-detection #heartbeat #hallucination #llm-validation #lessons-learned #ml #rag #deep-research #safety

Change Log

  • 2025-12-11: Initial implementation based on Deep Research analysis
  • 2025-12-11: Merged via PR #539