turbovec
RyanCodrai
Rust vector index with TurboQuant compression (ICLR 2026) - SIMD kernels, online ingest.
What is turbovec?
turbovec implements Google Research's TurboQuant algorithm (ICLR 2026) in Rust, with Python bindings and hand-written SIMD kernels (NEON, AVX-512). The project claims to compress a 10M-document corpus from 31GB to 4GB and to search faster than FAISS in 4-bit configurations. It supports online ingest with no training phase and integrates with LangChain, LlamaIndex, Haystack and Agno.
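As a rough sanity check on the 31GB-to-4GB figure, the arithmetic can be sketched as below. The embedding dimension of 768 is an assumption chosen for illustration (the source does not state it), the helper index_size_gb is hypothetical and not part of turbovec, and the calculation counts only the raw vector payload, ignoring any index metadata.

```python
def index_size_gb(num_vectors: int, dim: int, bits_per_dim: float) -> float:
    """Raw vector payload in decimal gigabytes, ignoring index metadata."""
    total_bits = num_vectors * dim * bits_per_dim
    return total_bits / 8 / 1e9


NUM_DOCS = 10_000_000  # corpus size quoted in the text
DIM = 768              # ASSUMPTION: embedding dimension, not stated in the source

float32_gb = index_size_gb(NUM_DOCS, DIM, 32)  # uncompressed float32 embeddings
four_bit_gb = index_size_gb(NUM_DOCS, DIM, 4)  # 4 bits per dimension

print(f"float32: {float32_gb:.2f} GB")
print(f"4-bit:   {four_bit_gb:.2f} GB")
print(f"ratio:   {float32_gb / four_bit_gb:.0f}x")
```

Under these assumptions the payload shrinks from about 30.7GB to about 3.8GB, an 8x reduction, which is in line with the project's rounded figures. A different embedding dimension scales both numbers but leaves the ratio unchanged.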
Pros & Cons
Pros
- Grounded in a peer-reviewed ICLR 2026 paper
- SIMD-optimized Rust core with ergonomic Python bindings
- No training phase - online ingest suits dynamic collections
Cons
- Single developer - no visible team or org backing
- Beta maturity and a young repo - production reliability unproven at scale
- Compression-vs-recall trade-offs not independently benchmarked
License
MIT (OSI-open)
When it is interesting
Fast semantic search over large corpora (10M+) with storage budgets too tight for full float32 embeddings.
When it is too early
Use cases that need maximum recall at any storage cost, or that require a commercially backed vector DB with an SLA.
Commercial alternative & related
- Commercial counterpart: Pinecone / Zilliz Cloud
This repo was featured in the 2026-07 edition of the Open-Source AI Radar.
langextract
Python library from Google for LLM-powered structured extraction with source grounding.
LEANN
StarTrail-org
RAG on everything - graph-based vector index claiming 97% storage savings for private on-device search.
chandra
datalab-to
High-accuracy document digitization (OCR/layout) with code and an open model.