Index-as-artifact paradigm
Indexes are treated as deliverables: the build phase produces persistent index files plus metadata, and the query phase only loads and executes them, so regressions can be reproduced across environments.
zvec treats vector retrieval as a two-stage engineering system: the offline stage builds persistable ANN indexes from embeddings, the online stage executes budgeted nearest-neighbor queries against explicit recall/latency targets, and throughput work (batching, parallelism, SIMD) lives on the hot path. It fits as an infrastructure component: instead of adopting a full vector database, you embed ANN into search, recommendation recall, or multimodal pipelines, and keep iteration traceable through versioned configs and index artifacts.
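The build/query split described above can be illustrated with a minimal sketch. It is not zvec's API or on-disk format: the function names (`build_index`, `load_index`, `query`), the file names (`index.bin`, `manifest.json`), and the float32 layout are all hypothetical, and an exact brute-force scan stands in for a real ANN structure. The point is only that the build step emits an artifact plus metadata, and the query step loads and verifies it without rebuilding anything.

```python
import hashlib
import json
import struct
import tempfile
from pathlib import Path


def build_index(vectors, out_dir, config):
    """Offline stage: write the vectors plus a metadata manifest to disk."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    dim = len(vectors[0])
    # Hypothetical layout: rows of little-endian float32 values.
    payload = b"".join(struct.pack(f"<{dim}f", *v) for v in vectors)
    (out_dir / "index.bin").write_bytes(payload)
    manifest = {
        "dim": dim,
        "count": len(vectors),
        "config": config,  # the build parameters travel with the artifact
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    (out_dir / "manifest.json").write_text(json.dumps(manifest, sort_keys=True))
    return manifest


def load_index(index_dir):
    """Online stage: load only, and check the payload against its manifest."""
    index_dir = Path(index_dir)
    manifest = json.loads((index_dir / "manifest.json").read_text())
    payload = (index_dir / "index.bin").read_bytes()
    if hashlib.sha256(payload).hexdigest() != manifest["sha256"]:
        raise ValueError("index payload does not match manifest checksum")
    dim = manifest["dim"]
    vectors = [list(row) for row in struct.iter_unpack(f"<{dim}f", payload)]
    return manifest, vectors


def query(vectors, q, topk):
    """Exact top-k by squared L2 distance (stand-in for an ANN search)."""
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(v, q)), i) for i, v in enumerate(vectors)
    )
    return [i for _, i in scored[:topk]]


with tempfile.TemporaryDirectory() as tmp:
    data = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    built = build_index(data, tmp, {"metric": "l2", "version": "v1"})
    loaded, vecs = load_index(tmp)
    result = query(vecs, [0.9, 0.1], topk=2)
    print(result)
```

Because the manifest records the config and a checksum, two environments that load the same artifact can confirm they are querying identical data before comparing results.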
| ✕ Traditional Pain Points | ✓ Innovative Solutions |
|---|---|
| When vector search is glued into app code, index formats, params, and tuning are not reproducible, making regressions hard to diagnose. | zvec makes indexes first-class artifacts: build and query are decoupled, indexes are persistent and versioned, and the online path only loads and executes them, which keeps regressions controllable. |
| Adopting a full vector database can be operationally heavy for lightweight recall use cases. | ANN-first execution puts latency/throughput optimizations in the query layer, with configurable recall–performance tradeoffs for embedded deployment. |
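To make the recall–performance tradeoff in the table concrete, here is a small sketch of a budgeted search. It does not reflect zvec's internals: it uses a toy cluster-probing scheme (similar in spirit to inverted-file indexes), and `nprobe`, `build_clusters`, and `probed_topk` are names chosen for illustration. A larger probe budget scans more candidates, which costs more work and recovers more of the exact top-k.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_topk(vectors, q, k):
    """Ground truth: scan every vector."""
    return [i for _, i in sorted((sq_dist(v, q), i) for i, v in enumerate(vectors))[:k]]


def build_clusters(vectors, centroids):
    """Assign every vector id to its nearest centroid."""
    clusters = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
        clusters[nearest].append(i)
    return clusters


def probed_topk(vectors, centroids, clusters, q, k, nprobe):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(q, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in clusters[c]]
    return [i for _, i in sorted((sq_dist(vectors[i], q), i) for i in candidates)[:k]]


def recall_at_k(approx, exact):
    """Fraction of the exact top-k that the approximate search returned."""
    return len(set(approx) & set(exact)) / len(exact)


random.seed(7)  # fixed seed so runs are repeatable
dim, n, k, n_clusters = 8, 400, 10, 16
vectors = [[random.random() for _ in range(dim)] for _ in range(n)]
queries = [[random.random() for _ in range(dim)] for _ in range(20)]
centroids = random.sample(vectors, n_clusters)  # crude centroids, no k-means
clusters = build_clusters(vectors, centroids)

recall_by_budget = {}
for nprobe in (1, 4, 16):
    recalls = [
        recall_at_k(probed_topk(vectors, centroids, clusters, q, k, nprobe),
                    exact_topk(vectors, q, k))
        for q in queries
    ]
    recall_by_budget[nprobe] = sum(recalls) / len(recalls)
    print(f"nprobe={nprobe:2d} mean recall@{k}={recall_by_budget[nprobe]:.2f}")
```

Probing all 16 clusters is equivalent to an exact scan, so recall reaches 1.0 there; smaller budgets trade recall for fewer distance computations.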

    git clone https://github.com/alibaba/zvec.git && cd zvec
    # Follow repo build commands (e.g., cargo build --release or cmake --build)
    # Example: zvec build-index --input embeddings.bin --output index.zv --config config.yaml
    # Example: zvec query --index index.zv --vector query.bin --topk 10
    # Pin a query set + expected topK, store metrics/outputs for version comparisons

| Core Scenario | Target Audience | Solution | Outcome |
|---|---|---|---|
| Embeddable ANN recall layer for semantic search | search/KB teams | add embedding-based nearest-neighbor recall to an existing retrieval stack | better recall within latency budgets and regression-friendly tuning |
| Vector recall for recommender systems with A/B iteration | recsys/growth teams | build indexes offline and recall candidates online with low latency | versioned recall components with safer A/B tests and rollbacks |
| Local vector retrieval component for multimodal apps | multimodal/content understanding teams | run vector retrieval on-prem or at the edge | clear data boundaries, controlled cost, and throughput scaling with hardware |
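The practice of pinning a query set with its expected top-K and comparing outputs across versions, which underlies the regression-friendly tuning and safer rollbacks listed above, could look roughly like the following. This is an illustrative sketch, not a zvec tool: `check_regression`, `topk_overlap`, the JSON baseline shape, and the `min_overlap` threshold are all assumptions.

```python
import json


def topk_overlap(expected, actual):
    """Fraction of pinned expected ids still present in the new top-k."""
    return len(set(expected) & set(actual)) / len(expected)


def check_regression(pinned, current, min_overlap):
    """Compare current top-k results against a pinned baseline, per query."""
    report = {}
    for query_id, expected in pinned.items():
        overlap = topk_overlap(expected, current.get(query_id, []))
        report[query_id] = {"overlap": overlap, "ok": overlap >= min_overlap}
    return report


# Hypothetical pinned baseline, as it might be stored next to an index version.
pinned = json.loads('{"q1": [3, 7, 9], "q2": [1, 2, 4]}')
# Hypothetical results from a candidate index build.
current = {"q1": [3, 9, 7], "q2": [1, 5, 6]}

report = check_regression(pinned, current, min_overlap=0.66)
failed = sorted(q for q, r in report.items() if not r["ok"])
print(json.dumps(report, sort_keys=True))
print("regressed queries:", failed)
```

Here `q1` passes because only the order changed, while `q2` is flagged because two of its three pinned ids dropped out. Storing such reports per index version gives A/B comparisons and rollbacks a concrete baseline.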