Pydantic Agents

Render Developer Q&A Assistant showcasing observable AI with Pydantic Agents, Pydantic Embedder, Logfire, and Render

Deploy to Render

Intelligent question-answering system that demonstrates real-world AI observability patterns. This example project shows how to build, instrument, and monitor a multi-stage LLM pipeline with full cost tracking, quality evaluation, and performance monitoring.


What This App Does

This is an AI-powered Q&A assistant for Render documentation. Users can ask questions about Render's platform, and the app provides accurate, well-researched answers backed by the official documentation.

User Experience

  1. Ask a question - "How do I deploy a Node.js app on Render?" or "What database plans are available?"
  2. Watch the pipeline - See real-time progress through 8 stages (embedding → retrieval → generation → verification)
  3. Get accurate answers - Receive detailed responses with sources from Render docs
  4. Quality guaranteed - Every answer is verified for accuracy and rated by dual AI evaluators

Key Features

  • Hybrid search - Combines semantic understanding with keyword matching for better retrieval
  • Multi-stage verification - Extracts claims, verifies against docs, checks technical accuracy
  • Iterative refinement - Automatically regenerates low-quality answers with feedback
  • Cost tracking - See exactly how much each question costs to answer
  • Real-time streaming - Progressive response updates via Server-Sent Events

Example Questions

"How do I set up PostgreSQL on Render?"
"What's the difference between Web Services and Static Sites?"
"How much does a Starter plan cost?"
"Can I use custom domains with Render?"
"How do I configure environment variables?"

The app answers questions about deployment, databases, pricing, configuration, networking, and all other Render platform features based on ~10,000 documentation chunks.


What This Demonstrates

Logfire Features

  • LLM Traces - Complete visibility into every AI call (OpenAI + Anthropic auto-instrumented)
  • HTTP Tracing - FastAPI auto-instrumentation for request/response tracking
  • Database Monitoring - AsyncPG auto-instrumentation for query performance
  • Cost Tracking - Per-stage and per-execution cost attribution with custom metrics
  • Multi-Model Evals - Dual-rater quality assessment (OpenAI + Anthropic)
  • Session Tracking - End-to-end user journey with distributed tracing
  • Custom Metrics - Business-specific metrics (cost, quality, iterations)
  • SQL Queries - Custom analytics on AI performance

Pydantic Stack

This project is built end-to-end on the Pydantic ecosystem:

  • Pydantic AI Agents β€” every pipeline stage (generation, claims extraction, accuracy check, dual-rater evaluation) is a pydantic_ai.Agent with a typed output_type. Multi-provider orchestration (Claude + GPT) runs through OpenAIProvider / AnthropicProvider in a single pipeline. See backend/pipeline/.
  • Pydantic Embedder β€” pydantic_ai.Embedder with OpenAIEmbeddingModel powers question embedding (embed_query) and batch claim embedding (embed_documents) for verification. Auto-instrumented by logfire.instrument_pydantic_ai(). See backend/pipeline/embeddings.py and backend/pipeline/verification.py.
  • Pydantic Models β€” Claims, accuracy scores, eval dimensions, and pipeline state are parsed directly into Pydantic models (e.g. ClaimsOutput, EvaluationOutput). pydantic-settings manages config in backend/config.py.
  • Pydantic GenAI Prices β€” model pricing is loaded dynamically from the pydantic/genai-prices registry, then combined with per-agent token counts from result.usage() to produce per-stage cost attribution. See backend/prices.py.
  • Logfire β€” distributed traces, custom metrics, dual-model evals, and cost attribution. Auto-instruments FastAPI, AsyncPG, HTTPX, and Pydantic AI. See backend/observability.py.

Render Capabilities

  • Zero-Config Deployment - Push to deploy with render.yaml
  • PostgreSQL with pgvector + full-text - Managed hybrid search database
  • Web Service + Static Site - Full-stack deployment
  • Environment Management - Secure secrets handling
  • Auto-Scaling - Handle variable AI workloads

Architecture

┌─────────────────────────────────────────────────────────────┐
│  Frontend (React + TypeScript)                              │
│  Deployed as: Render Static Site                            │
│  - Question input UI                                        │
│  - Real-time progress via SSE                               │
│  - Answer display with metrics                              │
└─────────────────────────────────────────────────────────────┘
                            ↓ HTTPS
┌─────────────────────────────────────────────────────────────┐
│  Backend API (FastAPI + Pydantic AI + Logfire)              │
│  Deployed as: Render Web Service (Python 3.13)              │
│                                                             │
│  8-Stage Pipeline:                                          │
│  ┌────────────────────────────────────────────────────┐     │
│  │ [1] Question Embedding      (OpenAI)               │     │
│  │ [2] RAG Document Retrieval  (pgvector + BM25)      │     │
│  │ [3] Answer Generation       (Claude Sonnet 4.5)    │     │
│  │ [4] Claims Extraction       (GPT-5-mini)           │     │
│  │ [5] Claims Verification     (RAG again)            │     │
│  │ [6] Technical Accuracy      (Claude Sonnet 4)      │     │
│  │ [7] Quality Rating          (OpenAI + Anthropic)   │     │
│  │ [8] Quality Gate            (Pass or Iterate)      │     │
│  └────────────────────────────────────────────────────┘     │
└─────────────────────────────────────────────────────────────┘
            ↓                                    ↓
┌──────────────────────┐           ┌───────────────────────────┐
│  PostgreSQL          │           │  Logfire                  │
│  (Render Managed)    │           │  (Pydantic)               │
│  - pgvector ext      │           │  - Distributed traces     │
│  - RAG embeddings    │           │  - Cost attribution       │
│  - Full-text search  │           │  - Quality metrics        │
└──────────────────────┘           │  - Custom dashboards      │
                                   └───────────────────────────┘
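
Stage 2's hybrid retrieval blends pgvector similarity with Postgres full-text rank. A sketch of the idea, with assumed table and column names (doc_chunks, embedding, tsv) and illustrative 0.7/0.3 weights; see backend/database.py and docs/HYBRID_SEARCH.md for the real query:

import asyncpg

HYBRID_SQL = """
SELECT id, content,
       1 - (embedding <=> $1::vector)               AS semantic_score,
       ts_rank(tsv, plainto_tsquery('english', $2)) AS keyword_score
FROM doc_chunks
ORDER BY 0.7 * (1 - (embedding <=> $1::vector))
       + 0.3 * ts_rank(tsv, plainto_tsquery('english', $2)) DESC
LIMIT $3;
"""

async def retrieve(pool: asyncpg.Pool, query_embedding: list[float],
                   question: str, top_k: int = 5) -> list[asyncpg.Record]:
    # pgvector accepts the '[x, y, ...]' text form, so a plain str() works.
    async with pool.acquire() as conn:
        return await conn.fetch(HYBRID_SQL, str(query_embedding), question, top_k)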

Project Structure

render-qa-assistant/
├── backend/
│   ├── main.py                    # FastAPI application entry
│   ├── requirements.txt           # Legacy pip dependencies (reference only)
│   ├── api/
│   │   └── logs.py                # Logfire logs API endpoint
│   ├── pipeline/                  # 8-stage pipeline implementation
│   ├── models.py                  # Pydantic models
│   ├── database.py                # PostgreSQL + pgvector
│   ├── observability.py           # Logfire configuration
│   └── config.py                  # Settings management
├── frontend/
│   ├── src/                       # React + TypeScript UI
│   ├── package.json
│   └── vite.config.ts
├── data/
│   ├── embeddings/                # Pre-embedded documentation
│   └── scripts/                   # Data ingestion scripts
├── docs/
│   ├── PIPELINE.md                # Detailed pipeline guide
│   ├── OBSERVABILITY.md           # Logfire instrumentation guide
│   ├── CONFIGURATION.md           # Configuration reference
│   └── HYBRID_SEARCH.md           # Hybrid search deep-dive
├── pyproject.toml                 # Python dependencies (uv)
├── uv.lock                        # Locked dependency versions
├── .python-version                # Pins Python to 3.13
├── render.yaml                    # Infrastructure as code
├── .env.example                   # Environment variables template
└── README.md                      # This file

Quick Start

Prerequisites

  • uv (manages Python 3.13 automatically)
  • Node.js 18+
  • PostgreSQL 16+ (with pgvector extension)
  • OpenAI API key
  • Anthropic API key
  • Logfire account β€” sign in at logfire.pydantic.dev, create a project (US region), then:
    1. Settings β†’ Write Tokens β†’ create a token β†’ LOGFIRE_TOKEN in .env
    2. Settings β†’ Read Tokens β†’ create a token β†’ LOGFIRE_READ_TOKEN in .env
    3. View traces in the Live panel under your project

Local Development (with Make)

# 1. Install everything (uv installs Python 3.13 automatically)
make install

# 2. Set up .env file (copy from example and fill in your keys)
cp .env.example .env

# 3. Start database
make db-start

# 4. Load documentation (this step might take a while!)
make ingest

# 5. Run backend (in one terminal)
make run-backend

# 6. Run frontend (in another terminal)
make run-frontend

make ingest runs the full pipeline: the bulk doc embeddings, plus the curated "special pages" that are explicitly injected into the RAG context (pricing, AI agent template, autoscaling, Node.js). To reload just one of those after editing its script, use the per-target shortcuts:

make add-pricing      # render.com/pricing tables
make add-ai-agent     # render.com/templates/self-orchestrating-agents-python
make add-autoscaling  # render.com/docs/scaling
make add-nodejs       # render.com/docs/deploy-node-express-app

Manual Setup

# 1. Install Python dependencies (uv reads .python-version β†’ 3.13)
uv sync --group dev

# 2. Configure environment
cp .env.example .env
# Edit .env with your API keys

# 3. Start PostgreSQL with Docker
docker-compose up -d

# 4. Generate and load documentation
uv run python data/scripts/generate_embeddings.py
uv run python data/scripts/ingest_docs.py

# 5. Run backend (from project root)
uv run uvicorn backend.main:app --reload --port 8000

# 6. Run frontend (separate terminal)
cd frontend && npm install && npm run dev

Access locally:

  • Frontend: http://localhost:5173 (Vite dev server default)
  • Backend API: http://localhost:8000

Deploy to Render

1. Set up a Logfire account.

Before clicking the deploy button, sign in at logfire.pydantic.dev, create a project (US region), and generate two tokens:

  • Preferences β†’ Write Tokens β†’ create token β†’ save as LOGFIRE_TOKEN
  • Preferences β†’ Read Tokens β†’ create token β†’ save as LOGFIRE_READ_TOKEN

You'll paste both into the Render Dashboard in step 3.

2. One-click deploy

Deploy to Render

Render reads render.yaml and provisions:

  • PostgreSQL database with pgvector (pydantic-agents-db)
  • Backend API web service (pydantic-agents-api, FastAPI + Pydantic AI + Logfire)
  • Frontend static site (pydantic-agents-frontend, Next.js)

3. Fill in environment variables

You'll be prompted only for these four secrets:

Variable             Source
OPENAI_API_KEY       platform.openai.com
ANTHROPIC_API_KEY    console.anthropic.com
LOGFIRE_TOKEN        Logfire write token from step 1
LOGFIRE_READ_TOKEN   Logfire read token from step 1

Auto-filled by Render (no action needed): DATABASE_URL (injected from the database service), QUALITY_THRESHOLD, ACCURACY_THRESHOLD, MAX_ITERATIONS, MAX_TOKENS, RAG_TOP_K, SIMILARITY_THRESHOLD, VERIFICATION_THRESHOLD, ENABLE_CACHING, LOG_LEVEL.
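
Those settings flow through pydantic-settings in backend/config.py. A minimal sketch: field names mirror the env vars above, but the defaults here are illustrative.

from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    database_url: str               # injected by Render from the database service
    openai_api_key: str
    anthropic_api_key: str
    logfire_token: str
    quality_threshold: float = 80.0
    max_iterations: int = 2
    rag_top_k: int = 5

settings = Settings()               # .env locally, Render env vars in production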

4. Wire the frontend to the backend

After the backend deploys, copy its public URL (https://pydantic-agents-api-XXXX.onrender.com) and set it as the NEXT_PUBLIC_API_URL env var on the frontend service. Trigger a redeploy of the frontend so the new value takes effect.

5. Done

  • Backend: https://pydantic-agents-api-XXXX.onrender.com
  • Frontend: https://pydantic-agents-frontend-XXXX.onrender.com

Doc ingestion runs automatically as a preDeployCommand on every backend deploy. The bulk corpus is loaded once via ingest_docs.py --skip-if-exists; the curated special pages (add_pricing_page.py, add_ai_agent_template_page.py, add_autoscaling_page.py, add_nodejs_page.py) re-run on every deploy so canonical answers stay in sync with the latest source pages.
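
The --skip-if-exists guard boils down to an existence check before the bulk insert. A sketch with an assumed doc_chunks table; see data/scripts/ingest_docs.py for the real logic.

import argparse
import asyncio
import os

import asyncpg

async def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--skip-if-exists", action="store_true")
    args = parser.parse_args()

    conn = await asyncpg.connect(os.environ["DATABASE_URL"])
    try:
        count = await conn.fetchval("SELECT count(*) FROM doc_chunks")
        if args.skip_if_exists and count:
            print(f"corpus already loaded ({count} chunks); skipping")
            return
        ...  # bulk-insert the pre-embedded documentation here
    finally:
        await conn.close()

if __name__ == "__main__":
    asyncio.run(main())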


Example Metrics

Cost Breakdown (per question)

┌────────────────────────────────┬──────────┬──────────┐
│ Stage                          │ Cost     │ % Total  │
├────────────────────────────────┼──────────┼──────────┤
│ Question Embedding             │ $0.0002  │   <1%    │
│ RAG Retrieval                  │ $0.0001  │   <1%    │
│ Answer Generation (Claude)     │ $0.0450  │   56%    │
│ Claims Extraction (GPT)        │ $0.0080  │   10%    │
│ Claims Verification (RAG)      │ $0.0015  │    2%    │
│ Accuracy Check (Claude)        │ $0.0180  │   23%    │
│ Quality Rating (Dual)          │ $0.0070  │    9%    │
├────────────────────────────────┼──────────┼──────────┤
│ TOTAL (first iteration)        │ $0.0798  │  100%    │
└────────────────────────────────┴──────────┴──────────┘
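
Per-stage costs come from combining each agent's result.usage() token counts with the genai-prices registry (backend/prices.py). A sketch assuming the package's calc_price / Usage interface, with made-up token counts:

from genai_prices import Usage, calc_price

# Token counts would come from result.usage() on the generation agent.
generation = calc_price(
    Usage(input_tokens=6_000, output_tokens=900),
    model_ref="claude-sonnet-4-5",
    provider_id="anthropic",
)
print(f"answer generation: ${generation.total_price:.4f}")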

Performance Metrics

  • Average Response Time: 4.2 seconds (first iteration)
  • P95 Response Time: 8.7 seconds
  • Iteration Rate: 12% of questions require refinement
  • Success Rate: 95% accuracy (validated by dual evaluators)

Quality Scores

  • Average Quality Score: 89/100
  • OpenAI Average: 87/100
  • Anthropic Average: 91/100
  • Inter-rater Agreement: 77% (within 10 points)

Documentation

Core Guides

  • docs/PIPELINE.md - Detailed pipeline guide
  • docs/OBSERVABILITY.md - Logfire instrumentation guide
  • docs/CONFIGURATION.md - Configuration reference
  • docs/HYBRID_SEARCH.md - Hybrid search deep-dive

External Resources

  • Pydantic AI docs (ai.pydantic.dev)
  • Logfire docs (logfire.pydantic.dev/docs)
  • Render docs (render.com/docs)


Contributing

This is a demo project, but improvements are welcome!

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

License

MIT License - see LICENSE file for details


Acknowledgments

Built to showcase:

  • Logfire by Pydantic - AI observability platform
  • Render - Modern cloud platform
  • Pydantic AI - Type-safe AI agent framework
  • OpenAI & Anthropic - LLM providers

Ready to build observable AI? Fork this repo and deploy to Render to get started!
