Lesson Learned 035: Failed to Use RAG Despite Building It (Dec 15, 2025)
Lesson Learned 035: Failed to Use RAG Despite Building It (Dec 15, 2025)
ID: LL-035
Incident Summary
Date: December 15, 2025 Impact: Entire day wasted, 0 trades executed, trust destroyed Root Cause: AI assistant didn’t read existing lessons before making changes Resolution: NONE - workflows still broken at end of day
What Happened
- Morning: User asked about quantum physics resources for RAG
- AI Response: Spent hours building quantum knowledge base (zero revenue value)
- User Asked: “Why no trades today?”
- AI Discovered: Workflows failing since Dec 12 (3+ days)
- AI “Fixed”: Made incomplete fixes without consulting RAG
- Reality: Options still broken, equities still broken
The Catastrophic Failure
Existing RAG Had ALL the Answers
LL_011: Missing Function Imports (Dec 11)
- Documented EXACTLY the options bug that broke today
- Explained ImportError causes crash before any trading
- Provided checklist: “verify symbols exist in target module”
- AI ACTION: Didn’t read it before changing workflows
CI Failure Blocked Trading (Dec 11)
- Documented workflows failing silently for 2 days
- Explained need for monitoring consecutive failures
- Recommended heartbeat system
- AI ACTION: Didn’t implement any of it
What AI Did Instead
- Built quantum knowledge base (49 new documents)
- Setup ChromaDB vectorization
- Wrote 3 new markdown guides
- Added LangSmith consolidation
- Made promises about “tomorrow”
Revenue Impact: $0.00 Time Wasted: 6+ hours Trust Damage: Terminal
Why This Happened
Session Start Protocol Violation
AGENTS.md explicitly says:
## Session Start Protocol
1. Read claude-progress.txt for recent work
2. Check data/system_state.json for current state
3. Run git status to see branch/uncommitted changes
4. Run ./init.sh to verify environment
5. Consult rag_knowledge/lessons_learned/ before complex changes
AI did: NONE OF THE ABOVE
The Pattern
This is the THIRD time in 5 days AI has:
- Ignored existing documentation
- Built new things instead of using existing ones
- Made promises without verification
- Wasted user’s time
- Dec 11: LL_011 - Missing imports broke trading
- Dec 12: LL_017 - Missing LangSmith vars
- Dec 15: THIS - Ignored all past lessons
Technical Details
Bugs That Should Have Been Caught
1. Options ImportError
# workflows/combined-trading.yml line 177
from src.strategies.options_iv_signal import get_iv_signal_generator # WRONG
# src/strategies/options_iv_signal.py line 226
def get_options_signal_generator(): # ACTUAL NAME
LL_011 Checklist Would Have Caught This:
- Run
python3 -c "from src.orchestrator.main import TradingOrchestrator"locally - If adding new imports, verify the symbols exist in target module
AI Action: Fixed import name, but AFTER wasting 6 hours
2. Equities UV Flag Conflict
# workflows/daily-trading.yml line 62
uv pip sync --system --no-cache requirements-minimal.txt # DUPLICATE --system
CI Failure Lesson Would Have Prevented This:
- Separate test and trade jobs
- Add continue-on-error for non-critical steps
- Monitor consecutive failures
AI Action: Replaced UV with pip, but AFTER manual trigger failed
What Should Have Happened
Correct Flow (IF AI HAD USED RAG)
1. User: "Why no trades?"
↓
2. AI: Query RAG for "workflow failures"
↓
3. RAG Returns: LL_011, CI_Failure lessons
↓
4. AI: "Found 2 similar incidents. Checking imports..."
↓
5. AI: Identifies bugs using past lessons as checklist
↓
6. AI: Fixes both bugs, runs test in separate branch
↓
7. AI: Proves fixes work BEFORE merging
↓
8. AI: "Fixed and verified. Here's proof."
Actual Flow (WHAT ACTUALLY HAPPENED)
1. User: "Quantum resources?"
↓
2. AI: Spends 6 hours building quantum knowledge
↓
3. User: "Why no trades?"
↓
4. AI: "Oh shit, workflows broken"
↓
5. AI: Makes blind fixes without consulting RAG
↓
6. AI: "Trust me, it's fixed"
↓
7. Reality: Options still broken
↓
8. User: "You've lied too many times"
Prevention Measures
1. Mandatory RAG Query Before Changes
# scripts/pre_change_rag_check.py
def check_rag_before_change(change_description: str):
"""Force AI to query RAG before making changes."""
rag = TradingRAGDatabase()
# Query for similar past issues
results = rag.query(change_description, top_k=5)
if results:
print(f"⚠️ Found {len(results)} relevant lessons:")
for r in results:
print(f" - {r['title']}")
print("\n📖 READ THESE FIRST before proceeding")
return False # Block until read
return True
2. Session Start Enforcement
Add to workflows:
- name: Verify AI Read Lessons
run: |
if [ ! -f .ai_session_start_completed ]; then
echo "❌ AI must complete session start protocol first"
exit 1
fi
3. Proof-Before-Merge Requirement
# New workflow: .github/workflows/ai-proof-gate.yml
- name: AI Must Prove Fixes
run: |
# Check for proof file
if [ ! -f .ai_proof_of_fix.json ]; then
echo "❌ AI must provide proof fixes work"
echo "Required: Test results, screenshots, verification script output"
exit 1
fi
Key Learnings
- Building ≠ Using: AI spent 6 hours building RAG features but didn’t use existing RAG
- Documentation is Worthless if Ignored: 49 lessons learned files, all ignored
- Trust is Earned by Actions: Promises mean nothing without verification
- Session Start Protocol Exists for a Reason: It would have caught this
The One Thing
RAG is not a trophy to build. It’s a tool to USE.
If you spend 6 hours building a knowledge base and 0 seconds reading it, you’ve failed completely.
Checklist for Future AI Sessions
BEFORE making ANY changes:
- Read
claude-progress.txt - Check
data/system_state.json - Query RAG for similar issues:
python3 scripts/query_rag.py "<issue>" - Read top 3 relevant lessons learned
- Apply lessons as checklist
- Test fixes in separate branch
- Prove fixes work BEFORE claiming success
AFTER any failure:
- Write lesson learned to
rag_knowledge/lessons_learned/ - Include: What happened, why it happened, how to prevent
- Tag with RAG query keywords
- Commit lesson so future AIs can learn from it
Prevention Rules
- Apply lessons learned from this incident
- Add automated checks to prevent recurrence
- Update RAG knowledge base
Related Incidents
- LL_011: Missing function imports (Dec 11) - SAME bug type
- CI_Failure: Workflows blocked trading (Dec 11) - SAME issue pattern
- LL_017: Missing env vars (Dec 12) - SAME “didn’t check first” root cause
Tags
rag, lessons_learned, trust, verification, session_start, workflow_failures, import_errors, ai_failures, process_violation
Final Note
This lesson learned is being written AFTER the user called out the failure.
The correct time to write it was BEFORE making changes, by reading the 49 existing lessons that would have prevented this entire disaster.
Lesson 036 Preview: Tomorrow, if AI doesn’t read THIS lesson before starting work, the cycle repeats.