AI Trading System Tech Stack
Last updated: January 21, 2026
Our autonomous AI trading system combines LLM reasoning, local retrieval (RAG), and a checkpointed gate pipeline to execute options trades. This page documents the technical architecture behind our iron condor strategy, which has an 86% win rate.
System Architecture Overview
Architecture diagram (flowchart, partially recovered):
- Data inputs: Alpaca (Broker), FRED API (Treasury Yields), Market News (Sentiment)
- AI Layer: Claude Opus 4.5 (Critical Decisions), OpenRouter Gateway (DeepSeek, Mistral, Kimi), LanceDB RAG (Lessons + Trades), Gemini 2.0 Flash (Retrieval)
- Core Trading System: Trading Orchestrator, Gate Pipeline (Momentum, Sentiment, Risk), Trade Executor, MCP Servers (Protocol Layer)
- Output Layer: RAG Webhook (Cloud Run), GitHub Pages Blog, Dev.to Articles
- Inbound flows: Alpaca -> Trading Orchestrator; FRED API -> Trading Orchestrator; Market News -> OpenRouter Gateway
- Decision flows: Trading Orchestrator -> Gate Pipeline; Gate Pipeline -> Claude Opus 4.5, OpenRouter Gateway, LanceDB RAG; LanceDB RAG -> Gemini 2.0 Flash
- Execution flows: Gate Pipeline -> Trade Executor; Trade Executor -> Alpaca
- Output flows: Trading Orchestrator -> MCP Servers; MCP Servers -> RAG Webhook; Trading Orchestrator -> GitHub Pages Blog; Trading Orchestrator -> Dev.to Articles
AI/ML Technologies
1. Claude (Anthropic SDK)
ACTIVE Primary LLM for Critical Decisions
Claude Opus 4.5 is our primary reasoning engine, used for all trade-critical decisions where accuracy matters more than cost.
Model tiers by task criticality (diagram, partially recovered):
- First tier (task and model labels not recovered): $0.25/1M tokens
- Medium Tasks -> Claude Sonnet, $3/1M tokens
- Trade Decisions -> Claude Opus 4.5, $15/1M tokens
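The tiering idea can be sketched in a few lines of Python. This is a minimal illustration, not the project's routing code: the `ModelTier` class, the tier keys, and both function names are hypothetical, and only the two tiers whose labels are legible on this page are included. Prices are the flat per-1M-token figures quoted in the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    label: str              # display name taken from the tier diagram
    usd_per_million: float  # per-1M-token price quoted on this page


# Hypothetical tier table; only the tiers with legible labels are listed.
TIERS = {
    "medium": ModelTier("Claude Sonnet", 3.00),
    "critical": ModelTier("Claude Opus 4.5", 15.00),
}


def pick_tier(is_trade_decision: bool) -> ModelTier:
    """Send trade decisions to the critical tier, everything else to medium."""
    return TIERS["critical" if is_trade_decision else "medium"]


def estimate_cost_usd(tier: ModelTier, tokens: int) -> float:
    """Estimated spend for `tokens` tokens at the tier's flat per-1M price."""
    return tier.usd_per_million * tokens / 1_000_000
```

Under these assumptions, a single 4,096-token trade decision on the critical tier costs roughly six cents, which is why only trade-critical calls are routed there.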
Key Integration Points:
- src/agents/base_agent.py - All agents inherit Claude reasoning
- src/utils/self_healing.py - Autonomous error recovery
- src/orchestrator/gates.py - Trade gate decisions
Why Claude for Trading:
- Highest reasoning accuracy for financial decisions
- Strong instruction following (critical for risk rules)
- Low hallucination rate on numerical data
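from anthropic import Anthropic


class BaseAgent:
    def __init__(self, name: str, model: str = "claude-opus-4-5-20251101"):
        self.name = name
        self.client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
        self.model = model

    def reason_with_llm(self, prompt: str) -> dict:
        response = self.client.messages.create(
            model=self.model,
            max_tokens=4096,
            messages=[{"role": "user", "content": prompt}],
        )
        # Return a plain dict so the result matches the annotated return type
        text = "".join(b.text for b in response.content if b.type == "text")
        return {"text": text, "stop_reason": response.stop_reason}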
2. OpenRouter (Multi-LLM Gateway)
ACTIVE Cost-Optimized Inference
OpenRouter provides access to multiple LLMs through a single API, enabling us to route tasks to the most cost-effective model.
Model routing (diagram, partially recovered):
- Models: DeepSeek ($0.14/$0.28 per 1M), Mistral Medium 3 ($0.40/$2.00 per 1M), Kimi K2 ($0.39/$1.90 per 1M, #1 Trading Benchmark)
- Task routing: Sentiment Analysis -> DeepSeek; Market Research -> Mistral Medium 3; Trade Signals -> Kimi K2
- API -> Models
Model Selection (from StockBench benchmarks):
| Model | Cost (In/Out) | Trading Sortino | Use Case |
|---|---|---|---|
| DeepSeek | $0.14/$0.28 | 0.021 | Sentiment, News |
| Mistral Medium 3 | $0.40/$2.00 | - | Research, Analysis |
| Kimi K2 | $0.39/$1.90 | 0.042 | Trade Signals |
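A rough sketch of how the table above could drive routing and cost estimates. The `ROUTES` mapping and both function names are hypothetical. Only the DeepSeek slug appears on this page; the Mistral and Kimi slugs are assumptions and should be checked against the OpenRouter model list before use.

```python
# Prices are USD per 1M input/output tokens, copied from the table above.
# The Mistral and Kimi model slugs are assumptions, not taken from this page.
ROUTES = {
    "sentiment": {"model": "deepseek/deepseek-chat", "in": 0.14, "out": 0.28},
    "research": {"model": "mistralai/mistral-medium-3", "in": 0.40, "out": 2.00},
    "trade_signal": {"model": "moonshotai/kimi-k2", "in": 0.39, "out": 1.90},
}


def route(task: str) -> str:
    """Return the model slug for a task type; unknown tasks raise KeyError."""
    return ROUTES[task]["model"]


def estimate_cost_usd(task: str, input_tokens: int, output_tokens: int) -> float:
    """Cost = (input_tokens * in_price + output_tokens * out_price) / 1e6."""
    prices = ROUTES[task]
    return (input_tokens * prices["in"] + output_tokens * prices["out"]) / 1_000_000
```

Keeping input and output prices separate matters here because output tokens cost two to five times more than input tokens for every model in the table.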
MCP Server Integration:
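# mcp/servers/openrouter/sentiment.py
# `openrouter` and `parse_sentiment` are assumed to be project helpers defined elsewhere.
class SentimentAnalyzer:
    def analyze(self, news: list[str]) -> float:
        # Combine the headlines into one prompt (wording here is illustrative)
        news_prompt = "Rate the market sentiment of these headlines:\n" + "\n".join(news)
        # Routes to DeepSeek for cost efficiency
        response = openrouter.chat(
            model="deepseek/deepseek-chat",
            messages=[{"role": "user", "content": news_prompt}],
        )
        return parse_sentiment(response)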
3. LanceDB RAG (Local Retrieval)
ACTIVE Local Semantic Search
Our RAG system stores all trade history and lessons learned locally, enabling the system to learn from past mistakes and successes without cloud dependencies.
Architecture Decisions:
- Local Embeddings: Sentence-transformers for consistent offline indexing
- Document-Aware Chunking: Preserves section structure for better recall
- Hybrid Retrieval: Semantic search with keyword fallback
- Top-K: Returns 5 most relevant sections per query
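The hybrid retrieval and top-K decisions above can be sketched without any external libraries. This is an illustration under stated assumptions, not the code in `document_aware_rag.py`: embeddings are passed in as plain lists of floats, and the `min_similarity` threshold, the function names, and the keyword scoring rule are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def hybrid_search(query, query_vec, sections, top_k=5, min_similarity=0.3):
    """sections is a list of (text, embedding) pairs.

    Rank by cosine similarity. If no section reaches min_similarity,
    fall back to keyword overlap. Returns up to top_k section texts.
    """
    scored = [(cosine(query_vec, vec), text) for text, vec in sections]
    if not scored or max(score for score, _ in scored) < min_similarity:
        scored = [(keyword_score(query, text), text) for text, _ in sections]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for score, text in scored[:top_k] if score > 0]
```

The fallback only triggers when the semantic scores are uniformly weak, so a healthy embedding index keeps full control of ranking and keyword matching acts as a safety net.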
Key Files:
- src/memory/document_aware_rag.py - Document-aware indexing + search
- src/rag/lessons_learned_rag.py - LanceDB-first retrieval with fallback
- scripts/reindex_rag.py - Build semantic index
- scripts/vectorize_rag_knowledge.py - Keyword index
from src.memory.document_aware_rag import get_document_aware_rag

rag = get_document_aware_rag()
rag.ensure_index()
results = rag.search("iron condor management", limit=3)
for r in results:
    print(r.title, r.section_title)
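The `title` and `section_title` fields in the results above suggest one chunk per document section. A minimal sketch of that kind of document-aware chunking follows, assuming Markdown-style `#` headings. The `Chunk` class and `chunk_by_section` function are hypothetical; the actual chunker is not shown on this page.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    title: str          # document title (first level-1 heading)
    section_title: str  # heading of the section this chunk came from
    text: str           # section body


def chunk_by_section(markdown: str) -> list[Chunk]:
    """Split a Markdown document into one chunk per heading."""
    title = ""
    section = ""
    body: list[str] = []
    chunks: list[Chunk] = []

    def flush() -> None:
        # Emit the accumulated body as a chunk, skipping empty sections
        text = "\n".join(body).strip()
        if text:
            chunks.append(Chunk(title, section, text))
        body.clear()

    for line in markdown.splitlines():
        if line.startswith("#"):
            flush()
            heading = line.lstrip("#").strip()
            if line.startswith("# ") and not title:
                title = heading
            section = heading
        else:
            body.append(line)
    flush()
    return chunks
```

Splitting on headings rather than fixed character counts keeps each lesson intact, so a retrieved chunk can be read on its own.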
4. MCP (Model Context Protocol)
ACTIVE Protocol Layer for Tool Integration
MCP provides a standardized way for AI agents to interact with external tools and data sources.
MCP architecture (diagram, partially recovered):
- Servers: Alpaca server (Orders, Market), OpenRouter Server (Sentiment, Stocks), Trade Server (Execution)
- External APIs: Alpaca API, OpenRouter API
- Flows: Agents -> MCP client -> Servers; Alpaca server -> Alpaca API; OpenRouter Server -> OpenRouter API
Server Implementations:
- mcp/servers/alpaca/ - Market data, order execution
- mcp/servers/openrouter/ - Sentiment, stock analysis, IPO research
- mcp/servers/trade_agent.py - High-level trade coordination
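To make the protocol layer concrete: MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with a tool `name` and `arguments`. The sketch below builds such a request and dispatches it with a toy in-process handler. The `analyze_sentiment` tool, the `TOOLS` registry, and both function names are hypothetical and are not the project's actual servers.

```python
import json
from typing import Any, Callable

# Hypothetical tool registry: tool name -> Python callable.
TOOLS: dict[str, Callable[..., Any]] = {
    "analyze_sentiment": lambda headlines: {"count": len(headlines)},
}


def make_tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request in a JSON-RPC 2.0 envelope."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def handle(request: dict) -> dict:
    """Toy server-side dispatch: look up the tool, run it, wrap the result."""
    params = request["params"]
    tool = TOOLS.get(params["name"])
    if tool is None:
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "error": {"code": -32602, "message": f"Unknown tool: {params['name']}"},
        }
    result = tool(**params["arguments"])
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {
            "content": [{"type": "text", "text": json.dumps(result)}],
            "isError": False,
        },
    }
```

Because every server speaks the same envelope, an agent can call the Alpaca, OpenRouter, or trade server without server-specific client code.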
5. LangGraph (Pipeline Checkpointing)
ACTIVE Fault-Tolerant Execution
LangGraph patterns enable checkpoint-based recovery for our trade execution pipeline.
Checkpointed gate pipeline (diagram, partially recovered):
- Pipeline: first gate (label not recovered) ✓ Checkpoint -> Sentiment Gate ✓ Checkpoint -> Risk Gate ✓ Checkpoint -> Execute Trade
- Failure recovery: Gate Failure -> Load Last Checkpoint -> Retry from Checkpoint
- Example failure path: Sentiment Gate -> Gate Failure (dotted edge)
Implementation:
from dataclasses import dataclass


@dataclass
class PipelineCheckpoint:
    thread_id: str  # e.g., "trade:SPY:2026-01-21T14:30:00"
    checkpoint_id: str
    gate_index: int
    gate_name: str
    context_json: str  # Serialized state
Data Flow Architecture
Trade Execution Flow
Blog Generation Flow
Infrastructure
Cloud Services
| Service | Provider | Purpose |
|---|---|---|
| RAG Corpus | LanceDB (local) | Vector search, embeddings |
| Webhook | Google Cloud Run | RAG Webhook integration |
| CI/CD | GitHub Actions | Automated testing, deployment |
| Blog Hosting | GitHub Pages | Static site hosting |
| Broker | Alpaca | Paper/Live trading |
Cost Optimization
Budget Controls:
- BATS framework routes 80% of queries to cost-effective models
- RAG reduces repeated LLM calls via cached knowledge
- Batch processing during off-peak hours
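One possible reading of "cached knowledge" in the budget controls above is an answer cache that lets repeated questions skip the LLM entirely. The sketch below shows that reading only; the `AnswerCache` class and the normalization rule are hypothetical, and the page does not describe how the real system caches.

```python
import hashlib
from typing import Callable


class AnswerCache:
    """Cache answers by normalized query so repeated questions skip the LLM."""

    def __init__(self, ask_llm: Callable[[str], str]):
        self._ask_llm = ask_llm
        self._answers: dict[str, str] = {}
        self.llm_calls = 0  # number of times the underlying LLM was invoked

    @staticmethod
    def _key(query: str) -> str:
        # Lowercase and collapse whitespace so trivially different queries match
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def ask(self, query: str) -> str:
        key = self._key(query)
        if key not in self._answers:
            self.llm_calls += 1
            self._answers[key] = self._ask_llm(query)
        return self._answers[key]
```

Exact-match caching like this only helps with literally repeated queries; matching paraphrases would need the semantic search described in the RAG section.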
Technology Status
| Technology | Status | Notes |
|---|---|---|
| Claude (Anthropic) | ACTIVE | Primary reasoning engine |
| OpenRouter | ACTIVE | Multi-LLM gateway |
| LanceDB RAG | ACTIVE | Local semantic search |
| MCP Protocol | ACTIVE | Tool integration layer |
| LangGraph | ACTIVE | Pipeline checkpointing |
| Gemini 2.0 Flash | ACTIVE | RAG retrieval |
| LangSmith | DEPRECATED | Removed Jan 2026, replaced by RAG |
| LangChain | DEPRECATED | Migrated to direct SDK |
How Tech Stack Affects Trading
1. Decision Quality
- Claude Opus 4.5 provides highest reasoning accuracy for trade decisions
- RAG enables learning from 200+ documented mistakes
- Result: 86% win rate on iron condors
2. Cost Efficiency
- OpenRouter routing reduces LLM costs by 70%
- BATS framework matches task complexity to model cost
- Result: <$50/month AI costs for full system
3. Reliability
- LangGraph checkpoints enable recovery from failures
- MCP protocol standardizes tool interactions
- Result: Zero trade execution failures in 90 days
4. Continuous Learning
- LanceDB RAG captures every lesson automatically
- Blog sync shares learnings publicly
- Result: System improves with every trade
This tech stack documentation is auto-updated. View source at GitHub.