I’ve been working on a project called Valori, a Python-native vector database I built from the ground up — not by reinventing every algorithm, but by wiring together efficient, well-known indexing and search techniques into a cohesive, hackable framework.
The idea came from my frustration with existing vector DBs that were either too heavy for experimentation or too opaque to modify. I wanted something simple, modular, and extensible — so I built it.
What it does:
Lets you store, index, and search high-dimensional vectors
Supports multiple indices (Flat, HNSW, IVF, LSH, Annoy)
Has memory, disk, and hybrid storage backends
Includes a full document processing pipeline (parsing, cleaning, chunking, embedding)
Offers quantization, persistence, and plugin-based extensibility
All written in Python, integrated with NumPy, and production-tested with logging and monitoring built in.
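For anyone who wants the core "store, index, search" idea in concrete terms, here is a minimal sketch of a brute-force (Flat) index in NumPy. This is not Valori's actual API: the `FlatIndex` class and its `add`/`search` methods are hypothetical names used only for illustration, and cosine similarity is an assumed choice of metric.

```python
import numpy as np


class FlatIndex:
    """Hypothetical brute-force index (illustration only, not Valori's API)."""

    def __init__(self, dim: int):
        self.dim = dim
        self.vectors = np.empty((0, dim), dtype=np.float32)

    def add(self, vectors) -> None:
        vectors = np.asarray(vectors, dtype=np.float32).reshape(-1, self.dim)
        # Normalize rows so a plain dot product equals cosine similarity.
        norms = np.linalg.norm(vectors, axis=1, keepdims=True)
        self.vectors = np.vstack([self.vectors, vectors / np.maximum(norms, 1e-12)])

    def search(self, query, k: int = 5):
        query = np.asarray(query, dtype=np.float32)
        query = query / max(float(np.linalg.norm(query)), 1e-12)
        # Score every stored vector against the query (exact, O(n * dim)).
        scores = self.vectors @ query
        k = min(k, len(scores))
        top = np.argsort(-scores)[:k]  # indices of the k highest scores
        return top, scores[top]


# Usage: three stored vectors, query closest to the first and third.
index = FlatIndex(dim=3)
index.add([[1, 0, 0], [0, 1, 0], [0.9, 0.1, 0]])
ids, scores = index.search([1, 0, 0], k=2)
print(ids, scores)
```

A Flat index like this is exact but scans every vector; approximate structures such as HNSW or IVF trade some recall for speed on larger collections.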
Install:
pip install valori
GitHub: https://github.com/varshith-Git/valori
PyPI: https://pypi.org/project/valori
I’d love to hear your thoughts:
What’s missing for you in current vector DBs?
If you’ve built LLM or RAG systems, what do you wish a lightweight, pure Python DB like this handled better?
Would you prefer tighter integrations (LangChain, Haystack, etc.) or a more “build-it-yourself” style?
Feedback, criticism, or collaboration ideas are all welcome. - Varshith (varshith.gudur17@gmail.com)
how much was this vibe coded? looks cool but it's too much for me to digest.
where did you get the original mental model to begin building it?
It’s definitely dense, but not as wild as it looks. The mental model was: take the core building blocks from FAISS and Milvus, make them composable in Python, and expose everything clearly.
The “vibe” part came from trying to make it feel like a system that could run in production, not just a toy. So yeah, it’s a little heavy, but it earned the vibe honestly.
What’s the advantage of this being in Python?
The point isn’t raw speed; it’s hackability. You can plug in new models or indexing layers in minutes without dropping to C++.
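To make the "plug in a new indexing layer" point concrete, here is a rough sketch of what a pluggable index could look like in plain Python. Everything here is an assumption for illustration: `Index`, `register`, `create_index`, and `FlatL2Index` are made-up names, not Valori's real plugin interface.

```python
from typing import Protocol, Sequence

Vector = Sequence[float]


class Index(Protocol):
    """Hypothetical interface that any index plugin would satisfy."""

    def add(self, vector: Vector) -> int: ...
    def search(self, query: Vector, k: int) -> list[int]: ...


# Hypothetical registry mapping a name to an index class.
REGISTRY: dict[str, type] = {}


def register(name: str):
    """Class decorator that makes an index available by name."""
    def decorator(cls):
        REGISTRY[name] = cls
        return cls
    return decorator


@register("flat_l2")
class FlatL2Index:
    """Exact search by squared Euclidean distance, in pure Python."""

    def __init__(self) -> None:
        self._vectors: list[list[float]] = []

    def add(self, vector: Vector) -> int:
        self._vectors.append(list(vector))
        return len(self._vectors) - 1  # id of the stored vector

    def search(self, query: Vector, k: int) -> list[int]:
        def sq_dist(v: Vector) -> float:
            return sum((a - b) ** 2 for a, b in zip(v, query))

        # Rank all stored ids by distance to the query, nearest first.
        ranked = sorted(range(len(self._vectors)),
                        key=lambda i: sq_dist(self._vectors[i]))
        return ranked[:k]


def create_index(name: str) -> Index:
    return REGISTRY[name]()
```

Under this sketch, adding a new index type means writing one class with `add` and `search` and decorating it; callers keep using `create_index("some_name")` and never touch compiled code.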
I think the “simple, modular, and extensible” part makes this interesting. And for those goals, it being written in Python is relevant.
Exactly. Python makes the whole stack composable instead of compiled shut. That’s where the fun (and flexibility) lives.
dude you already missed the window.
nothing is better than SQLite as a library, and don't use high performance as the selling point for a Python product
SQLite’s perfect if you’ve got rows and tables. Valori’s for when you’ve got embeddings and chaos.