Render Developer Q&A Assistant showcasing observable AI with Pydantic Agents, Pydantic Embedder, Logfire, and Render
Intelligent question-answering system that demonstrates real-world AI observability patterns. This example project shows how to build, instrument, and monitor a multi-stage LLM pipeline with full cost tracking, quality evaluation, and performance monitoring.
- What This App Does
- What This Demonstrates
- Architecture
- Quick Start
- Deploy to Render
- Example Metrics
- Documentation
- Contributing
- License
This is an AI-powered Q&A assistant for Render documentation. Users can ask questions about Render's platform, and the app provides accurate, well-researched answers backed by the official documentation.
- Ask a question - "How do I deploy a Node.js app on Render?" or "What database plans are available?"
- Watch the pipeline - See real-time progress through 8 stages (embedding → retrieval → generation → verification)
- Get accurate answers - Receive detailed responses with sources from Render docs
- Quality guaranteed - Every answer is verified for accuracy and rated by dual AI evaluators
- Hybrid search - Combines semantic understanding with keyword matching for better retrieval
- Multi-stage verification - Extracts claims, verifies against docs, checks technical accuracy
- Iterative refinement - Automatically regenerates low-quality answers with feedback
- Cost tracking - See exactly how much each question costs to answer
- Real-time streaming - Progressive response updates via Server-Sent Events (sketched below)
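
The SSE piece is plain FastAPI. Here is a minimal sketch of how progressive stage updates can be streamed; the `/ask` path and the event payload shape are illustrative assumptions, not this app's actual contract:

```python
# Minimal sketch: streaming pipeline progress over SSE with FastAPI.
# The /ask path and the event payload shape are assumptions for illustration.
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

STAGES = ["embedding", "retrieval", "generation", "verification"]

@app.get("/ask")
async def ask(question: str) -> StreamingResponse:
    async def event_stream():
        for i, stage in enumerate(STAGES, start=1):
            # The real pipeline does each stage's work here; we only emit progress.
            payload = {"stage": i, "name": stage, "question": question}
            yield f"data: {json.dumps(payload)}\n\n"  # one SSE frame per event
        yield 'data: {"done": true}\n\n'

    return StreamingResponse(event_stream(), media_type="text/event-stream")
```

Typical questions users ask: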
"How do I set up PostgreSQL on Render?"
"What's the difference between Web Services and Static Sites?"
"How much does a Starter plan cost?"
"Can I use custom domains with Render?"
"How do I configure environment variables?"
The app answers questions about deployment, databases, pricing, configuration, networking, and all other Render platform features based on ~10,000 documentation chunks.
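
To make the hybrid search concrete: the retrieval stage blends a pgvector similarity score with a keyword score over those chunks. A rough sketch of the idea, with assumed table/column names (`doc_chunks`, `embedding`, `content`), illustrative weights, and Postgres `ts_rank` standing in for BM25-style scoring:

```python
# Rough sketch of hybrid retrieval: pgvector cosine similarity blended with
# Postgres full-text rank (standing in for BM25). Schema names are assumed.
import asyncpg

HYBRID_SQL = """
SELECT content,
       0.7 * (1 - (embedding <=> $1::vector))
     + 0.3 * ts_rank(to_tsvector('english', content),
                     plainto_tsquery('english', $2)) AS hybrid_score
FROM doc_chunks
ORDER BY hybrid_score DESC
LIMIT $3;
"""

async def hybrid_search(pool: asyncpg.Pool, query_embedding: list[float],
                        question: str, top_k: int = 5):
    # The 0.7 / 0.3 weights are illustrative, not the app's tuned values.
    async with pool.acquire() as conn:
        return await conn.fetch(HYBRID_SQL, str(query_embedding), question, top_k)
```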
- LLM Traces - Complete visibility into every AI call (OpenAI + Anthropic auto-instrumented)
- HTTP Tracing - FastAPI auto-instrumentation for request/response tracking
- Database Monitoring - AsyncPG auto-instrumentation for query performance
- Cost Tracking - Per-stage and per-execution cost attribution with custom metrics
- Multi-Model Evals - Dual-rater quality assessment (OpenAI + Anthropic)
- Session Tracking - End-to-end user journey with distributed tracing
- Custom Metrics - Business-specific metrics (cost, quality, iterations); see the sketch after this list
- SQL Queries - Custom analytics on AI performance
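
Most of that wiring lives in `backend/observability.py`. A condensed sketch of the general pattern, with illustrative metric and span names rather than this repo's exact identifiers:

```python
# Condensed sketch of the observability wiring: Logfire auto-instrumentation
# plus custom business metrics. Names here are illustrative, not the repo's.
import logfire

logfire.configure()               # picks up LOGFIRE_TOKEN from the environment
logfire.instrument_pydantic_ai()  # traces every Agent / Embedder call
logfire.instrument_asyncpg()      # query-level database spans
logfire.instrument_httpx()        # outbound HTTP to the LLM providers
# logfire.instrument_fastapi(app) is called once the FastAPI app exists.

# Business metrics: per-question cost and quality.
answer_cost = logfire.metric_counter(
    "qa.answer_cost_usd", unit="usd", description="LLM spend per answer")
quality_score = logfire.metric_histogram(
    "qa.quality_score", description="Dual-rater quality, 0-100")

def record_question(cost_usd: float, quality: float) -> None:
    # Record metrics inside a span so they correlate with the trace.
    with logfire.span("qa.pipeline", cost_usd=cost_usd, quality=quality):
        answer_cost.add(cost_usd)
        quality_score.record(quality)
```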
This project is built end-to-end on the Pydantic ecosystem:
- Pydantic AI Agents - every pipeline stage (generation, claims extraction, accuracy check, dual-rater evaluation) is a `pydantic_ai.Agent` with a typed `output_type` (sketched after this list). Multi-provider orchestration (Claude + GPT) runs through `OpenAIProvider`/`AnthropicProvider` in a single pipeline. See `backend/pipeline/`.
- Pydantic Embedder - `pydantic_ai.Embedder` with `OpenAIEmbeddingModel` powers question embedding (`embed_query`) and batch claim embedding (`embed_documents`) for verification. Auto-instrumented by `logfire.instrument_pydantic_ai()`. See `backend/pipeline/embeddings.py` and `backend/pipeline/verification.py`.
- Pydantic Models - claims, accuracy scores, eval dimensions, and pipeline state are parsed directly into Pydantic models (e.g. `ClaimsOutput`, `EvaluationOutput`). `pydantic-settings` manages config in `backend/config.py`.
- Pydantic GenAI Prices - model pricing is loaded dynamically from the `pydantic/genai-prices` registry, then combined with per-agent token counts from `result.usage()` to produce per-stage cost attribution. See `backend/prices.py`.
- Logfire - distributed traces, custom metrics, dual-model evals, and cost attribution. Auto-instruments FastAPI, AsyncPG, HTTPX, and Pydantic AI. See `backend/observability.py`.
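
For readers new to Pydantic AI, here's roughly what one of those typed stages looks like, in the style of the claims-extraction agent. The model string, instructions, and output fields are illustrative; the real agents live in `backend/pipeline/`:

```python
# Illustrative sketch of a typed pipeline stage, in the style of the
# claims-extraction agent. Model string and field names are assumptions.
from pydantic import BaseModel
from pydantic_ai import Agent

class ClaimsOutput(BaseModel):
    claims: list[str]          # atomic factual claims found in the answer

claims_agent = Agent(
    "openai:gpt-4o-mini",      # any supported provider:model string
    output_type=ClaimsOutput,  # responses are validated into this model
    instructions="Extract every discrete factual claim from the answer.",
)

async def extract_claims(answer: str) -> ClaimsOutput:
    result = await claims_agent.run(answer)
    print(result.usage())      # token counts feed per-stage cost attribution
    return result.output       # already a validated ClaimsOutput
```

On the deployment side, the project leans on Render: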
- Zero-Config Deployment - Push to deploy with render.yaml
- PostgreSQL with pgvector + full-text - Managed hybrid search database
- Web Service + Static Site - Full-stack deployment
- Environment Management - Secure secrets handling
- Auto-Scaling - Handle variable AI workloads
```
┌───────────────────────────────────────────────────────────────┐
│                 Frontend (React + TypeScript)                 │
│                Deployed as: Render Static Site                │
│  - Question input UI                                          │
│  - Real-time progress via SSE                                 │
│  - Answer display with metrics                                │
└───────────────────────────────────────────────────────────────┘
                                │ HTTPS
┌───────────────────────────────────────────────────────────────┐
│         Backend API (FastAPI + Pydantic AI + Logfire)         │
│         Deployed as: Render Web Service (Python 3.13)         │
│                                                               │
│  8-Stage Pipeline:                                            │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │ [1] Question Embedding     (OpenAI)                     │  │
│  │ [2] RAG Document Retrieval (pgvector + BM25)            │  │
│  │ [3] Answer Generation      (Claude Sonnet 4.5)          │  │
│  │ [4] Claims Extraction      (GPT-5-mini)                 │  │
│  │ [5] Claims Verification    (RAG again)                  │  │
│  │ [6] Technical Accuracy     (Claude Sonnet 4)            │  │
│  │ [7] Quality Rating         (OpenAI + Anthropic)         │  │
│  │ [8] Quality Gate           (Pass or Iterate)            │  │
│  └─────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────┘
            │                                    │
┌──────────────────────┐          ┌───────────────────────────┐
│  PostgreSQL          │          │  Logfire                  │
│  (Render Managed)    │          │  (Pydantic)               │
│  - pgvector ext      │          │  - Distributed traces     │
│  - RAG embeddings    │          │  - Cost attribution       │
│  - Full-text search  │          │  - Quality metrics        │
└──────────────────────┘          │  - Custom dashboards      │
                                  └───────────────────────────┘
```
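
Stage 8 is the interesting control flow: if verification fails or the dual-rater score falls below the quality threshold, the pipeline loops back to generation with the raters' feedback attached, up to a maximum number of iterations. A simplified, self-contained sketch of that loop; the helper bodies are stubs for stages 3-7, and the real limits come from `QUALITY_THRESHOLD` and `MAX_ITERATIONS`:

```python
# Simplified, self-contained sketch of the stage-8 gate. The three helpers
# are stubs standing in for stages 3-7; the real ones call Pydantic AI agents.

QUALITY_THRESHOLD = 85  # the real value comes from the QUALITY_THRESHOLD env var
MAX_ITERATIONS = 3      # the real value comes from the MAX_ITERATIONS env var

async def generate_answer(question: str, feedback: str | None) -> str:
    return f"Draft answer to: {question}"   # stage 3 (Claude) in the real app

async def verify_claims(answer: str) -> bool:
    return True                             # stages 4-6 in the real app

async def rate_quality(answer: str) -> tuple[int, str]:
    return 90, "clear and well-sourced"     # stage 7 (dual raters)

async def answer_with_quality_gate(question: str) -> str:
    feedback: str | None = None
    answer = ""
    for _ in range(MAX_ITERATIONS):
        answer = await generate_answer(question, feedback)
        verified = await verify_claims(answer)
        score, feedback = await rate_quality(answer)
        if verified and score >= QUALITY_THRESHOLD:
            return answer                   # gate passed: return the answer
        # Gate failed: loop back to generation with rater feedback attached.
    return answer                           # best effort after max iterations
```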
Project structure:

```
render-qa-assistant/
├── backend/
│   ├── main.py              # FastAPI application entry
│   ├── requirements.txt     # Legacy pip dependencies (reference only)
│   ├── api/
│   │   └── logs.py          # Logfire logs API endpoint
│   ├── pipeline/            # 8-stage pipeline implementation
│   ├── models.py            # Pydantic models
│   ├── database.py          # PostgreSQL + pgvector
│   ├── observability.py     # Logfire configuration
│   └── config.py            # Settings management
├── frontend/
│   ├── src/                 # React + TypeScript UI
│   ├── package.json
│   └── vite.config.ts
├── data/
│   ├── embeddings/          # Pre-embedded documentation
│   └── scripts/             # Data ingestion scripts
├── docs/
│   ├── PIPELINE.md          # Detailed pipeline guide
│   ├── OBSERVABILITY.md     # Logfire instrumentation guide
│   ├── CONFIGURATION.md     # Configuration reference
│   └── HYBRID_SEARCH.md     # Hybrid search deep-dive
├── pyproject.toml           # Python dependencies (uv)
├── uv.lock                  # Locked dependency versions
├── .python-version          # Pins Python to 3.13
├── render.yaml              # Infrastructure as code
├── .env.example             # Environment variables template
└── README.md                # This file
```
- uv (manages Python 3.13 automatically)
- Node.js 18+
- PostgreSQL 16+ (with pgvector extension)
- OpenAI API key
- Anthropic API key
- Logfire account - sign in at logfire.pydantic.dev, create a project (US region), then:
  - Settings → Write Tokens → create a token → `LOGFIRE_TOKEN` in `.env`
  - Settings → Read Tokens → create a token → `LOGFIRE_READ_TOKEN` in `.env`
  - View traces in the Live panel under your project
```bash
# 1. Install everything (uv installs Python 3.13 automatically)
make install

# 2. Set up .env file (copy from example and fill in your keys)
cp .env.example .env

# 3. Start database
make db-start

# 4. Load documentation (this step might take a while!)
make ingest

# 5. Run backend (in one terminal)
make run-backend

# 6. Run frontend (in another terminal)
make run-frontend
```

`make ingest` runs the full pipeline: bulk doc embeddings, plus the curated "special pages" that are explicitly injected into RAG context (pricing, AI agent template, autoscaling, Node.js). To re-load just one of those after editing its script, use the per-target shortcuts:
```bash
make add-pricing      # render.com/pricing tables
make add-ai-agent     # render.com/templates/self-orchestrating-agents-python
make add-autoscaling  # render.com/docs/scaling
make add-nodejs       # render.com/docs/deploy-node-express-app
```

Prefer to set things up manually without Make? The same steps:

```bash
# 1. Install Python dependencies (uv reads .python-version → 3.13)
uv sync --group dev
# 2. Configure environment
cp .env.example .env
# Edit .env with your API keys
# 3. Start PostgreSQL with Docker
docker-compose up -d
# 4. Generate and load documentation
uv run python data/scripts/generate_embeddings.py
uv run python data/scripts/ingest_docs.py
# 5. Run backend (from project root)
uv run uvicorn backend.main:app --reload --port 8000
# 6. Run frontend (separate terminal)
cd frontend && npm install && npm run dev
```

Access locally:
- Frontend: http://localhost:3000
- API docs: http://localhost:8000/docs
- Logfire: https://logfire.pydantic.dev
Before clicking the deploy button, sign in at logfire.pydantic.dev, create a project (US region), and generate two tokens:
- Preferences → Write Tokens → create token → save as `LOGFIRE_TOKEN`
- Preferences → Read Tokens → create token → save as `LOGFIRE_READ_TOKEN`
You'll paste both into the Render Dashboard in step 3.
Render reads `render.yaml` and provisions:
- PostgreSQL database with pgvector (`pydantic-agents-db`)
- Backend API web service (`pydantic-agents-api`, FastAPI + Pydantic AI + Logfire)
- Frontend static site (`pydantic-agents-frontend`, Next.js)
You'll be prompted only for these four secrets:
| Variable | Source |
|---|---|
| `OPENAI_API_KEY` | platform.openai.com |
| `ANTHROPIC_API_KEY` | console.anthropic.com |
| `LOGFIRE_TOKEN` | Logfire write token from step 1 |
| `LOGFIRE_READ_TOKEN` | Logfire read token from step 1 |
Auto-filled by Render (no action needed): `DATABASE_URL` (injected from the database service), `QUALITY_THRESHOLD`, `ACCURACY_THRESHOLD`, `MAX_ITERATIONS`, `MAX_TOKENS`, `RAG_TOP_K`, `SIMILARITY_THRESHOLD`, `VERIFICATION_THRESHOLD`, `ENABLE_CACHING`, `LOG_LEVEL`.
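
Those variables are read into a `pydantic-settings` class in `backend/config.py`. A trimmed sketch of the pattern; the field list and defaults shown here are illustrative:

```python
# Trimmed sketch of env-driven config via pydantic-settings. The field list
# and defaults are illustrative; backend/config.py is authoritative.
from pydantic_settings import BaseSettings

class Settings(BaseSettings):
    database_url: str            # injected by Render from the database service
    openai_api_key: str
    anthropic_api_key: str
    logfire_token: str
    quality_threshold: int = 85  # stage-8 gate
    max_iterations: int = 3      # refinement cap
    rag_top_k: int = 8           # retrieval depth
    similarity_threshold: float = 0.7

settings = Settings()  # each field is read from the matching env var
```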
After the backend deploys, copy its public URL (`https://pydantic-agents-api-XXXX.onrender.com`) and set it as the `NEXT_PUBLIC_API_URL` env var on the frontend service. Trigger a redeploy of the frontend so the new value takes effect.
- Backend: `https://pydantic-agents-api-XXXX.onrender.com`
- Frontend: `https://pydantic-agents-frontend-XXXX.onrender.com`
Doc ingestion runs automatically as a `preDeployCommand` on every backend deploy. The bulk corpus is loaded once via `ingest_docs.py --skip-if-exists`; the curated special pages (`add_pricing_page.py`, `add_ai_agent_template_page.py`, `add_autoscaling_page.py`, `add_nodejs_page.py`) re-run on every deploy so canonical answers stay in sync with the latest source pages.
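
For the curious, a `--skip-if-exists` guard boils down to a cheap row-count check before the expensive bulk embed. A hypothetical sketch (the `doc_chunks` table name is assumed; the real script's internals may differ):

```python
# Hypothetical sketch of the --skip-if-exists guard: skip the expensive bulk
# embed when the chunk table is already populated. Table name is assumed.
import asyncio
import os
import sys

import asyncpg

async def main() -> None:
    conn = await asyncpg.connect(os.environ["DATABASE_URL"])
    try:
        if "--skip-if-exists" in sys.argv:
            count = await conn.fetchval("SELECT count(*) FROM doc_chunks")
            if count:
                print(f"Corpus already loaded ({count} chunks); skipping.")
                return
        ...  # bulk embedding + inserts happen here in the real script
    finally:
        await conn.close()

asyncio.run(main())
```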
Typical cost breakdown for a single question:

```
┌────────────────────────────────┬──────────┬──────────┐
│ Stage                          │ Cost     │ % Total  │
├────────────────────────────────┼──────────┼──────────┤
│ Question Embedding             │ $0.0002  │ 2%       │
│ RAG Retrieval                  │ $0.0001  │ 1%       │
│ Answer Generation (Claude)     │ $0.0450  │ 56%      │
│ Claims Extraction (GPT)        │ $0.0080  │ 10%      │
│ Claims Verification (RAG)      │ $0.0015  │ 2%       │
│ Accuracy Check (Claude)        │ $0.0180  │ 22%      │
│ Quality Rating (Dual)          │ $0.0070  │ 9%       │
├────────────────────────────────┼──────────┼──────────┤
│ TOTAL (first iteration)        │ $0.0798  │ 100%     │
└────────────────────────────────┴──────────┴──────────┘
```
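
Rows like these come from multiplying each agent's token counts (via `result.usage()`) by registry prices. A rough sketch of the arithmetic, with `PRICES` standing in for data loaded from the `pydantic/genai-prices` registry; the real lookup lives in `backend/prices.py`, and the numbers below are made up:

```python
# Rough sketch of per-stage cost arithmetic. PRICES stands in for data loaded
# from the pydantic/genai-prices registry; these numbers are made up.
from dataclasses import dataclass

PRICES = {  # USD per 1M tokens: (input, output) -- illustrative values only
    "claude-sonnet-4-5": (3.00, 15.00),
    "gpt-5-mini": (0.25, 2.00),
}

@dataclass
class StageCost:
    stage: str
    cost_usd: float

def stage_cost(stage: str, model: str,
               input_tokens: int, output_tokens: int) -> StageCost:
    input_price, output_price = PRICES[model]
    cost = input_tokens / 1e6 * input_price + output_tokens / 1e6 * output_price
    return StageCost(stage, round(cost, 4))

# Token counts come from each agent run's result.usage():
print(stage_cost("answer_generation", "claude-sonnet-4-5", 6_000, 1_800))
```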
- Average Response Time: 4.2 seconds (first iteration)
- P95 Response Time: 8.7 seconds
- Iteration Rate: 12% of questions require refinement
- Success Rate: 95% accuracy (validated by dual evaluators)
- Average Quality Score: 89/100
  - OpenAI Average: 87/100
  - Anthropic Average: 91/100
- Inter-rater Agreement: 77% (within 10 points)
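
Aggregates like these can be computed with custom SQL against Logfire using the read token. A hedged sketch via Logfire's Query API; the span name and `cost_usd` attribute are assumptions carried over from the instrumentation sketch earlier, and `backend/api/logs.py` holds the app's real integration:

```python
# Hedged sketch: pulling average answer cost out of Logfire with SQL via its
# Query API. The span name and cost_usd attribute are assumptions.
import os

import httpx

SQL = """
SELECT avg(CAST(attributes->>'cost_usd' AS FLOAT)) AS avg_cost
FROM records
WHERE span_name = 'qa.pipeline'
"""

def average_answer_cost() -> dict:
    resp = httpx.get(
        "https://logfire-api.pydantic.dev/v1/query",
        params={"sql": SQL},
        headers={
            "Authorization": f"Bearer {os.environ['LOGFIRE_READ_TOKEN']}",
            "Accept": "application/json",  # ask for JSON rows rather than Arrow
        },
    )
    resp.raise_for_status()
    return resp.json()
```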
- docs/PIPELINE.md - Detailed breakdown of the 8-stage pipeline
- docs/OBSERVABILITY.md - Comprehensive Logfire instrumentation guide
- docs/CONFIGURATION.md - All configuration options and tuning
- docs/HYBRID_SEARCH.md - Technical deep-dive on hybrid search
- Logfire Documentation: https://docs.pydantic.dev/logfire/
- Pydantic AI Documentation: https://ai.pydantic.dev/
- Render Documentation: https://docs.render.com/
This is a demo project, but improvements are welcome!
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
MIT License - see LICENSE file for details
Built to showcase:
- Logfire by Pydantic - AI observability platform
- Render - Modern cloud platform
- Pydantic AI - Type-safe AI agent framework
- OpenAI & Anthropic - LLM providers
Ready to build observable AI? Fork this repo and deploy to Render to get started!