Decision
Use a direct retrieval pipeline built from FastAPI, pgvector, explicit prompts, and small provider abstractions.
Why
- the system behavior stays easy to inspect
- retrieval bugs are easier to diagnose
- the code path from query to embedding to retrieval to generation stays short
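The short code path described above can be sketched roughly as follows. This is a minimal illustration, not the platform's actual code: the names `Embedder`, `Retriever`, `Generator`, `Pipeline`, `build_prompt`, the `chunks` table, and the keyword-count embedder are all assumptions made for the example. The FastAPI route and the real pgvector connection are left out so the sketch runs with the standard library alone; an in-memory retriever stands in for the database.

```python
import math
from dataclasses import dataclass
from typing import Protocol, Sequence


# Small provider abstractions (hypothetical names): one method each.
class Embedder(Protocol):
    def embed(self, text: str) -> list[float]: ...


class Retriever(Protocol):
    def search(self, vector: Sequence[float], k: int) -> list[str]: ...


class Generator(Protocol):
    def generate(self, prompt: str) -> str: ...


# What a pgvector-backed Retriever might run. `<=>` is pgvector's
# cosine-distance operator; the table and column names are assumptions.
PGVECTOR_SEARCH_SQL = (
    "SELECT content FROM chunks ORDER BY embedding <=> %s::vector LIMIT %s"
)

# The prompt is an explicit, visible template rather than framework-generated.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
)


def build_prompt(question: str, chunks: Sequence[str]) -> str:
    return PROMPT_TEMPLATE.format(context="\n---\n".join(chunks), question=question)


@dataclass
class Pipeline:
    embedder: Embedder
    retriever: Retriever
    generator: Generator
    top_k: int = 4

    def answer(self, question: str) -> dict:
        # query -> embedding -> retrieval -> generation, with nothing in between
        vector = self.embedder.embed(question)
        chunks = self.retriever.search(vector, self.top_k)
        prompt = build_prompt(question, chunks)
        # Return the intermediate values so retrieval bugs are easy to inspect.
        return {"answer": self.generator.generate(prompt), "chunks": chunks, "prompt": prompt}


# In-memory stand-ins so the sketch runs without a database or model provider.
class KeywordEmbedder:
    def embed(self, text: str) -> list[float]:
        lowered = text.lower()
        return [float(lowered.count(word)) for word in ("vector", "prompt", "api")]


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(a, b)) / norm


class InMemoryRetriever:
    def __init__(self, embedder: Embedder, documents: Sequence[str]) -> None:
        self.rows = [(embedder.embed(doc), doc) for doc in documents]

    def search(self, vector: Sequence[float], k: int) -> list[str]:
        ranked = sorted(self.rows, key=lambda row: cosine_distance(vector, row[0]))
        return [doc for _, doc in ranked[:k]]


class StubGenerator:
    def generate(self, prompt: str) -> str:
        return f"stub answer from a {len(prompt)}-character prompt"


DOCUMENTS = [
    "pgvector stores vector embeddings in Postgres.",
    "Explicit prompt templates keep generation inspectable.",
    "FastAPI exposes the query endpoint.",
]

embedder = KeywordEmbedder()
pipeline = Pipeline(embedder, InMemoryRetriever(embedder, DOCUMENTS), StubGenerator(), top_k=1)
result = pipeline.answer("How does vector search work?")
print(result["chunks"])
```

Because `answer` returns the retrieved chunks and the final prompt alongside the generated text, a wrong answer can be traced to a specific stage without stepping through framework internals.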
What is deferred
- agents
- memory
- orchestration frameworks
- multi-stage retrieval pipelines
These additions may become useful later, but none of them is needed to make this platform credible today.
