This document provides a technical deep dive into how pgsqlite works internally.
┌─────────────────┐ PostgreSQL ┌──────────────────┐
│ │ Wire Protocol │ │
│ PostgreSQL │ ◄─────────────────► │ pgsqlite │
│ Client │ (v3) │ │
│ │ │ ┌──────────────┐ │
└─────────────────┘ │ │ │ │
│ │ SQLite │ │
│ │ Database │ │
│ │ │ │
│ └──────────────┘ │
└──────────────────┘
Implements PostgreSQL Wire Protocol v3:
- Message Codec: Encodes/decodes PostgreSQL protocol messages
- Authentication: Handles client authentication (currently accepts all)
- SSL Negotiation: Manages SSL/TLS handshake for secure connections
Maintains client connection state:
- Session State: Tracks prepared statements, parameters, transaction status
- Connection Pooling: Manages SQLite connections efficiently
- Database Handler: Thread-safe wrapper around SQLite connection
Client Query → Parse → Analyze → Translate → Execute → Format → Response
- Extracts query type (SELECT, INSERT, etc.)
- Identifies prepared statement placeholders
- Validates basic SQL syntax
- Resolves table and column references
- Infers data types from schema
- Builds query metadata
- Converts PostgreSQL-specific syntax to SQLite
- Handles type casting (e.g.,
::int4) - Rewrites unsupported features
- Runs query against SQLite
- Manages transactions
- Handles prepared statements
- Converts SQLite values to PostgreSQL format
- Applies type-specific formatting
- Builds protocol response messages
Maps PostgreSQL's 100+ types to SQLite's 5 storage classes:
PostgreSQL SQLite Storage Metadata
---------- -------------- --------
SMALLINT → INTEGER type_oid: 21
INTEGER → INTEGER type_oid: 23
BIGINT → INTEGER type_oid: 20
REAL → REAL type_oid: 700
VARCHAR(n) → TEXT type_oid: 1043
TIMESTAMP → INTEGER type_oid: 1114
DECIMAL → TEXT type_oid: 1700
- Inbound: PostgreSQL format → SQLite storage
- Outbound: SQLite storage → PostgreSQL format
- Special handling: BOOLEAN (0/1 ↔ t/f), DECIMAL (rust_decimal)
Handles PostgreSQL-specific features:
-- PostgreSQL
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email VARCHAR(255)
);
-- Translated to SQLite
CREATE TABLE users (
id INTEGER PRIMARY KEY AUTOINCREMENT,
email TEXT
);-- PostgreSQL
INSERT INTO users (email) VALUES ('test@example.com') RETURNING id;
-- Simulated in SQLite
INSERT INTO users (email) VALUES ('test@example.com');
SELECT id FROM users WHERE rowid = last_insert_rowid();┌─────────────────┐
│ Result Cache │ Complete query results (100 entries, 60s TTL)
├─────────────────┤
│ Query Cache │ Parsed query plans (1000 entries, 600s TTL)
├─────────────────┤
│ Statement Pool │ Prepared statements (100 statements, LRU)
├─────────────────┤
│ Schema Cache │ Table metadata (bulk loaded, 300s TTL)
└─────────────────┘
For simple queries, bypasses full parsing:
- Pattern matching for INSERT/UPDATE/DELETE
- Direct SQLite execution
- Minimal overhead (1.0-1.5x vs raw SQLite)
- Memory-mapped values for large data
- Reusable buffer pool
- Direct socket writes
1. Client sends: SELECT * FROM users WHERE id = 1
2. Protocol decode: Parse message type, extract query
3. Query analysis:
- Identify SELECT query
- Find table "users"
- Check schema cache
4. Translation: (minimal for simple SELECT)
5. Execution:
- Check result cache
- If miss, execute on SQLite
- Fetch results
6. Value conversion:
- Map SQLite values to PostgreSQL types
- Format timestamps, booleans, etc.
7. Protocol encode:
- RowDescription message
- DataRow messages
- CommandComplete message
8. Cache results for future queries
pgsqlite maintains metadata in special tables:
Stores PostgreSQL type information for each column:
CREATE TABLE __pgsqlite_schema (
table_name TEXT,
column_name TEXT,
data_type TEXT,
type_oid INTEGER,
PRIMARY KEY (table_name, column_name)
);Tracks ENUM type definitions:
CREATE TABLE __pgsqlite_enums (
enum_name TEXT PRIMARY KEY,
enum_values TEXT -- JSON array
);- BEGIN/COMMIT/ROLLBACK: Direct pass-through to SQLite
- Savepoints: Supported via SQLite's savepoint mechanism
- Isolation: SQLite's default (SERIALIZABLE)
TcpListener → Accept → Spawn Task → Handle Client
↓
Session Loop → Process MessagesSame as TCP but using Unix domain sockets for lower latency.
Errors are translated to PostgreSQL error codes:
| Condition | PostgreSQL Code | Example |
|---|---|---|
| Syntax Error | 42601 | Invalid SQL |
| Constraint Violation | 23505 | Unique constraint |
| Type Mismatch | 42804 | Type casting error |
| No Such Table | 42P01 | Table not found |
- Authentication: Currently accepts all connections (implement as needed)
- SSL/TLS: Full support for encrypted connections
- SQL Injection: Prepared statements prevent injection
- File Access: Limited to configured database file
- Multi-Database Support: Route to different SQLite files
- Read Replicas: Multiple SQLite connections for reads
- Write-Ahead Logging: Better concurrency with WAL mode
- Plugin System: Extensible function support
- Clustering: Distributed SQLite with consensus
When extending pgsqlite:
- Maintain Type Safety: Always preserve PostgreSQL type semantics
- Cache Aggressively: But invalidate correctly
- Fail Gracefully: Return PostgreSQL-compatible errors
- Test Thoroughly: Both unit and integration tests
- Document Changes: Update this architecture doc