Retrieval in retrieval.py currently chains LLM tool selection, storage calls, and LLM synthesis. This is quality-friendly but latency-heavy.

Acceptance criteria:
- Add a /search fast path that returns ranked profile/summary/temporal/snippet/code hits without synthesis.
- Make LLM answer generation optional via answer=true.
- Cache profile catalogs and retrieval plans.
- Track p50/p95/p99 latency per retrieval mode.

Bounty: $5 in API credits
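
One way the criteria could fit together is sketched below. It is only an illustration, not the code in retrieval.py: `search`, `plan_for`, `synthesize`, `latency_percentiles`, and the in-memory `_HITS` list are all hypothetical names, and the storage layer and LLM call are replaced by placeholders. The sketch covers plan caching but leaves profile catalog caching out.

```python
import time
from collections import defaultdict
from functools import lru_cache
from statistics import quantiles

# Hypothetical in-memory stand-in for the real storage layer.
_HITS = [
    {"kind": "profile", "text": "Alice prefers Rust", "score": 0.91},
    {"kind": "snippet", "text": "fn main() {}", "score": 0.42},
    {"kind": "summary", "text": "Weekly sync notes", "score": 0.77},
]

# Latency samples in milliseconds, keyed by retrieval mode ("fast" or "answer").
_latencies_ms = defaultdict(list)


@lru_cache(maxsize=1024)
def plan_for(query: str) -> tuple:
    """Cached retrieval plan: which hit kinds to fetch (placeholder rule)."""
    return ("profile", "summary", "temporal", "snippet", "code")


def synthesize(query: str, hits: list) -> str:
    """Placeholder for the LLM synthesis step (assumption, no real LLM call)."""
    return f"{len(hits)} hits for {query!r}"


def search(query: str, answer: bool = False) -> dict:
    """Return ranked hits; run synthesis only when answer=True."""
    mode = "answer" if answer else "fast"
    start = time.perf_counter()
    kinds = plan_for(query)
    hits = sorted(
        (h for h in _HITS if h["kind"] in kinds),
        key=lambda h: h["score"],
        reverse=True,
    )
    result = {"hits": hits}
    if answer:
        result["answer"] = synthesize(query, hits)
    _latencies_ms[mode].append((time.perf_counter() - start) * 1000)
    return result


def latency_percentiles(mode: str) -> dict:
    """p50/p95/p99 for one mode; empty until at least two samples exist."""
    samples = _latencies_ms[mode]
    if len(samples) < 2:
        return {}
    cuts = quantiles(samples, n=100, method="inclusive")  # 99 cut points
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

Keeping latency samples per mode makes it possible to compare the fast path against the synthesis path directly. A production version would likely bound the sample list or use a histogram instead of an unbounded list.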