feat(logosdb): add LogosDB embedded vector database engine #297

Open
jose-compu wants to merge 2 commits into qdrant:master from jose-compu:feat/logosdb-integration

Conversation

@jose-compu

Summary

Adds LogosDB as a new engine — a fast, embedded HNSW vector store written in C/C++ with Python bindings, backed by memory-mapped binary storage and hnswlib.

  • LogosDBConfigurator — creates/recreates the local DB directory; writes a .meta.json sidecar with dim, distance, and max_elements for use by the upload and search workers.
  • LogosDBUploader — batch inserts via put_batch; encodes each record's integer ID as the text field for recall retrieval.
  • LogosDBSearcher — wraps db.search(), decodes hit.text back to integer ID.
  • Experiment preset logosdb-m16-ef200 targeting glove-25-angular.
  • logosdb>=0.9.0 added to pyproject.toml.
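
The sidecar and ID round trip described above can be sketched without touching LogosDB itself. This is a minimal illustration, not the code in this PR: the helper names (`write_meta`, `read_meta`, `encode_id`, `decode_id`) are hypothetical, and the JSON layout is assumed to be a flat object holding `dim`, `distance`, and `max_elements`.

```python
import json
import tempfile
from pathlib import Path

META_FILENAME = ".meta.json"  # sidecar name taken from the PR description


def write_meta(db_dir: Path, dim: int, distance: str, max_elements: int) -> Path:
    """Persist collection settings so the upload and search phases can reuse them."""
    db_dir.mkdir(parents=True, exist_ok=True)
    meta_path = db_dir / META_FILENAME
    # Assumed layout: a flat JSON object with the three fields named in the PR.
    meta = {"dim": dim, "distance": distance, "max_elements": max_elements}
    meta_path.write_text(json.dumps(meta))
    return meta_path


def read_meta(db_dir: Path) -> dict:
    """Load the settings written by the configure phase."""
    return json.loads((db_dir / META_FILENAME).read_text())


def encode_id(record_id: int) -> str:
    """Store the integer ID in the record's text field."""
    return str(record_id)


def decode_id(text: str) -> int:
    """Recover the integer ID from a search hit's text field."""
    return int(text)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_dir = Path(tmp) / "logosdb_vdb_bench"
        write_meta(db_dir, dim=25, distance="cosine", max_elements=2_000_000)
        print(read_meta(db_dir))
    print(decode_id(encode_id(42)))
```

Keeping the ID as decimal text makes the decode step a plain `int()` call, which is all the recall computation needs.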

Design notes

  • LogosDB is embedded (no server, single-process). The DB directory path is passed via connection_params.path.
  • max_elements (HNSW pre-allocation) defaults to 2,000,000 and is configurable via collection_params.
  • Distance mapping: cosine → DIST_COSINE, dot → DIST_IP, l2 → DIST_L2.
  • Upload and search workers open the same DB directory. This is safe for serial search (parallel=1); multi-process search is not recommended with this engine.
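
As a rough sketch of how the distance mapping and the `max_elements` default could be resolved from benchmark parameters: the function name `resolve_collection_settings` is hypothetical, and plain strings stand in for the `DIST_*` constants so the example runs without `logosdb` installed.

```python
from typing import Optional

# Constant names come from the PR description; the string values are
# placeholders for whatever the logosdb module actually exports (assumption).
DISTANCE_MAPPING = {
    "cosine": "DIST_COSINE",
    "dot": "DIST_IP",
    "l2": "DIST_L2",
}

DEFAULT_MAX_ELEMENTS = 2_000_000  # HNSW pre-allocation default from the PR


def resolve_collection_settings(distance: str,
                                collection_params: Optional[dict] = None) -> dict:
    """Translate benchmark-level settings into engine-level ones."""
    params = collection_params or {}
    try:
        metric = DISTANCE_MAPPING[distance]
    except KeyError:
        raise ValueError(
            f"Unsupported distance {distance!r}; "
            f"expected one of {sorted(DISTANCE_MAPPING)}"
        ) from None
    max_elements = int(params.get("max_elements", DEFAULT_MAX_ELEMENTS))
    if max_elements <= 0:
        raise ValueError("max_elements must be positive")
    return {"metric": metric, "max_elements": max_elements}


if __name__ == "__main__":
    print(resolve_collection_settings("cosine"))
    print(resolve_collection_settings("l2", {"max_elements": 500_000}))
```

Failing early on an unknown distance keeps a misconfigured experiment from silently building an index with the wrong metric.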

Benchmark result

Dataset: glove-25-angular (1.18M vectors, dim=25, cosine) — Apple M-series, serial search:

| Metric | Value |
| --- | --- |
| Upload time | 345 s |
| Mean precision@10 | 0.939 |
| RPS | 9,401 |
| Mean latency | 0.092 ms |
| p95 latency | 0.117 ms |
| p99 latency | 0.146 ms |
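
A quick sanity check on how these figures relate, using only the numbers reported above. It assumes RPS is measured as total queries divided by wall-clock time with a single serial client, which is an assumption about the harness rather than something stated in this PR.

```python
# Figures copied from the benchmark result above.
NUM_VECTORS = 1_180_000   # "1.18M vectors", rounded as reported
UPLOAD_TIME_S = 345
RPS = 9401
MEAN_LATENCY_MS = 0.092

# Upload throughput in vectors per second.
upload_throughput = NUM_VECTORS / UPLOAD_TIME_S

# With serial search, each request occupies 1/RPS seconds of wall-clock time.
time_per_request_ms = 1000 / RPS

# Whatever is not spent inside the search call is harness overhead
# (assumption: no other source of idle time between requests).
overhead_ms = time_per_request_ms - MEAN_LATENCY_MS

print(f"upload throughput: {upload_throughput:.0f} vectors/s")
print(f"wall-clock per request: {time_per_request_ms:.3f} ms")
print(f"implied per-request overhead: {overhead_ms:.3f} ms")
```

Under that assumption the upload rate works out to roughly 3,400 vectors per second, and the gap between about 0.106 ms per request and the 0.092 ms mean latency would be time spent outside the search call.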

Test plan

  • pip install logosdb
  • python run.py --engines "logosdb-m16-ef200" --datasets "glove-25-angular"
  • Verify results written to results/logosdb-m16-ef200-glove-25-angular-*.json
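
The last step can be checked with a small script. This is a sketch only: `find_result_files` and `load_results` are hypothetical helpers, and nothing is assumed about the contents of the result files beyond their being valid JSON.

```python
import glob
import json
import os


def find_result_files(results_dir: str = "results",
                      engine: str = "logosdb-m16-ef200",
                      dataset: str = "glove-25-angular") -> list:
    """Return result files matching the pattern from the test plan, sorted by name."""
    pattern = os.path.join(glob.escape(results_dir), f"{engine}-{dataset}-*.json")
    return sorted(glob.glob(pattern))


def load_results(paths: list) -> dict:
    """Parse every result file; raises if any file is not valid JSON."""
    loaded = {}
    for path in paths:
        with open(path) as fh:
            loaded[path] = json.load(fh)
    return loaded


if __name__ == "__main__":
    files = find_result_files()
    if not files:
        print("no result files found; did the benchmark run complete?")
    for path, payload in load_results(files).items():
        print(path, "->", type(payload).__name__)
```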

jose-compu and others added 2 commits May 17, 2026 15:54
- Implement LogosDB engine: LogosDBConfigurator, LogosDBUploader, LogosDBSearcher
- DB path configured via connection_params.path (default /tmp/logosdb_vdb_bench)
- max_elements configurable via collection_params (default 2,000,000)
- Distance mapping: cosine → DIST_COSINE, dot → DIST_IP, l2 → DIST_L2
- Metadata sidecar (.meta.json) bridges configure → upload → search phases
- Add experiment preset logosdb-m16-ef200 (glove-25-angular baseline)
- Add logosdb>=0.9.0 to pyproject.toml dependencies

Benchmark result (glove-25-angular, 1.18M vectors, dim=25, cosine):
  upload_time=345s  rps=9401  mean_precision=0.939  p99=0.15ms

Co-authored-by: Cursor <cursoragent@cursor.com>
@jose-compu
Author

can you please review @fabriziobonavita @ankane @agourlay @timvisee thx
