v3.2.0 · Lattice 3.2 released — read the changelog

A vector database that actually scales.

p99 latency under 2 ms across 100M+ vectors. Self-host on your own infrastructure with one binary, or run on our managed cloud across 47 regions. Apache-2 licensed, no vendor lock-in.

Start free in cloud →
$ curl -fsSL get.lattice.dev | sh
p99 latency: 1.7 ms (↓ 41% vs Pinecone)
Vectors / node: 2.4B (at 1024 dimensions)
Recall @ k=10: 99.4% (HNSW + product quantization)
Cost vs leading: −68% (at 100M vector scale)
01 — Five lines to ship

From npm install to production-ready search.

Idiomatic clients for TypeScript, Python, Go, Rust, and Java. Drop-in compatibility with OpenAI, Cohere, and Voyage embeddings, or your own models. No specialized infra knowledge required.

import { Lattice } from '@lattice/client'

const client = new Lattice({ apiKey: process.env.LATTICE_KEY })
const index = await client.index('docs-v2')

// Upsert with native batching (50k vectors / sec)
await index.upsert(embeddings, { batchSize: 1024, parallel: 8 })

// Hybrid search with metadata filters & reranking
const results = await index.query({
  vector: queryEmbedding,
  filter: { team: 'platform', status: { $in: ['open', 'wip'] } },
  topK: 20,
  rerank: { model: 'lattice-rerank-v2' }
})
// → returns in < 2ms across 100M+ vectors
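
The filter in the query above combines an exact match with a `$in` list. As a rough sketch of what such a filter expresses, assuming MongoDB-style semantics (every field must match, and `$in` matches any listed value), here is a small local matcher. `matches`, `Filter`, and `Metadata` are hypothetical names for illustration and are not part of `@lattice/client`.

```typescript
// Hypothetical types for illustration only; not part of @lattice/client.
type Scalar = string | number | boolean;
type Condition = Scalar | { $in: Scalar[] };
type Filter = Record<string, Condition>;
type Metadata = Record<string, Scalar>;

// Assumed semantics: all fields in the filter must match (logical AND).
function matches(meta: Metadata, filter: Filter): boolean {
  return Object.keys(filter).every((field) => {
    const cond = filter[field];
    const value = meta[field];
    if (value === undefined) return false; // field missing on the record
    if (typeof cond === "object") {
      return cond.$in.indexOf(value) !== -1; // any listed value matches
    }
    return value === cond; // plain equality
  });
}

const filter: Filter = { team: "platform", status: { $in: ["open", "wip"] } };
const open = matches({ team: "platform", status: "open" }, filter);
const closed = matches({ team: "platform", status: "closed" }, filter);
```

Here `open` is true and `closed` is false, since `closed` is not in the `$in` list.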
02 — Independent benchmarks

We publish every benchmark — methodology, data, code.

Vector search vendor claims are notoriously hand-wavy. We run the standard ANN-Benchmarks suite on our own infrastructure and publish the methodology, datasets, hardware specs, and raw results. Reproduce them yourself.

Latency at p99, queried against 100M 1024-dimension vectors with recall ≥ 99%. Single-node configuration, AWS us-east-1.

Dataset: LAION-100M
Hardware: c7g.8xlarge (32 vCPU, 64 GB)
Index: HNSW (M=32, ef=200)
Concurrent queries: 64
Replication: 3x
Last run: 2025-11-08
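
If you reproduce the run, the two headline metrics can be computed from raw results along these lines. This is a minimal sketch that assumes the nearest-rank percentile definition; the published methodology may define p99 differently. `percentile` and `recallAtK` are illustrative names, not part of any Lattice tooling.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// recall@k: the fraction of the exact k nearest neighbours that the
// approximate search also returned in its top k.
function recallAtK(approx: string[], exact: string[], k: number): number {
  const truth = exact.slice(0, k);
  if (truth.length === 0) throw new Error("no ground truth");
  const returned = approx.slice(0, k);
  const hits = truth.filter((id) => returned.indexOf(id) !== -1).length;
  return hits / truth.length;
}

// 100 synthetic latencies of 1..100 ms: the nearest-rank p99 is the 99th value.
const latencies: number[] = [];
for (let i = 1; i <= 100; i++) latencies.push(i);
const p99 = percentile(latencies, 99);
const recall = recallAtK(["a", "b", "x"], ["a", "b", "c"], 3);
```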

View full methodology & raw data →
p99 query latency (ms) · lower is better
Lattice v3.2 1.7 ms
Pinecone 2.9 ms
Weaviate 4.2 ms
Qdrant 5.1 ms
Milvus 6.8 ms
pgvector 14.3 ms
03 — Built-in primitives

Everything you need. Nothing you don't.

Vector storage is the easy part. The hard parts — filtering, hybrid search, reranking, multi-tenancy, schema evolution — are first-class primitives, not bolt-ons.

Hybrid Search

Dense + sparse + keyword in one query.

BM25 sparse and dense vectors combined natively with tunable weighting. Beats either approach alone on every retrieval benchmark we've tested.
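
To make "tunable weighting" concrete, here is one common way to fuse dense and sparse result lists: min-max normalize each list, then take a weighted sum. This is a sketch under assumptions. How Lattice actually fuses scores is not described here, and `fuse`, `normalize`, and `alpha` are hypothetical names.

```typescript
type Scored = { id: string; score: number };

// Min-max normalise to [0, 1] so dense similarities and BM25 scores,
// which live on different scales, become comparable.
function normalize(hits: Scored[]): Map<string, number> {
  const out = new Map<string, number>();
  if (hits.length === 0) return out;
  let min = Infinity;
  let max = -Infinity;
  hits.forEach((h) => {
    min = Math.min(min, h.score);
    max = Math.max(max, h.score);
  });
  const range = max - min;
  hits.forEach((h) => out.set(h.id, range === 0 ? 1 : (h.score - min) / range));
  return out;
}

// alpha = 1 ranks by dense score only, alpha = 0 by sparse score only.
// A document missing from one list contributes 0 from that side.
function fuse(dense: Scored[], sparse: Scored[], alpha: number): Scored[] {
  const d = normalize(dense);
  const s = normalize(sparse);
  const ids: string[] = [];
  dense.concat(sparse).forEach((h) => {
    if (ids.indexOf(h.id) === -1) ids.push(h.id);
  });
  return ids
    .map((id) => ({
      id,
      score: alpha * (d.get(id) ?? 0) + (1 - alpha) * (s.get(id) ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}

const denseHits: Scored[] = [
  { id: "a", score: 1.0 },
  { id: "b", score: 0.5 },
  { id: "c", score: 0.0 },
];
const sparseHits: Scored[] = [
  { id: "b", score: 12 },
  { id: "d", score: 8 },
  { id: "c", score: 4 },
];
const balanced = fuse(denseHits, sparseHits, 0.5);
```

With equal weighting, document `b` ranks first because it scores well in both lists, even though it tops only one of them.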

module.search
Metadata Filters

Filters that don't trash recall.

Pre-filtered HNSW, with none of the correctness compromises of post-filtering. Filter on 100+ fields without a latency tax, even at multi-billion vector scale.
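
Why post-filtering hurts: if the filter runs after the top-k is chosen, matching documents just outside the top-k are lost and the result comes back short. The sketch below shows the difference using an exhaustive scan in place of HNSW. It illustrates the result sets only, not how Lattice traverses its index, and all names are hypothetical.

```typescript
type Doc = { id: string; vector: number[]; team: string };

function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Exhaustive top-k by dot product (a stand-in for an ANN search).
function topK(docs: Doc[], query: number[], k: number): Doc[] {
  return docs
    .slice()
    .sort((x, y) => dot(y.vector, query) - dot(x.vector, query))
    .slice(0, k);
}

// Post-filter: search first, then drop non-matching hits. May return < k.
function postFilter(docs: Doc[], q: number[], k: number, keep: (d: Doc) => boolean): Doc[] {
  return topK(docs, q, k).filter(keep);
}

// Pre-filter: only matching documents are candidates, so k results come
// back whenever at least k documents match.
function preFilter(docs: Doc[], q: number[], k: number, keep: (d: Doc) => boolean): Doc[] {
  return topK(docs.filter(keep), q, k);
}

const corpus: Doc[] = [
  { id: "a", vector: [1.0, 0], team: "platform" },
  { id: "b", vector: [0.9, 0], team: "infra" },
  { id: "c", vector: [0.8, 0], team: "infra" },
  { id: "d", vector: [0.7, 0], team: "platform" },
  { id: "e", vector: [0.1, 0], team: "platform" },
];
const isPlatform = (d: Doc) => d.team === "platform";
const post = postFilter(corpus, [1, 0], 2, isPlatform).map((d) => d.id);
const pre = preFilter(corpus, [1, 0], 2, isPlatform).map((d) => d.id);
```

Here `post` contains only `a`, while `pre` contains `a` and `d`.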

module.filter
Multi-Tenancy

Strict per-tenant isolation, one index.

Run thousands of customer-scoped namespaces in a single index without storage or compute overhead. Built for AI products with many small users.

module.tenancy
Real-time Updates

Sub-second freshness, no rebuilds.

Upserts visible to search within 250ms. Live deletes without index rebuilds. Designed for workloads where data changes every second.

module.realtime
Reranking

Cross-encoder reranking, inline.

First-class reranker integration. Use ours, or bring your own — Cohere, Voyage, Jina, any cross-encoder you trust. Final top-k in one round-trip.
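
Conceptually, reranking is a second pass: a fast first stage retrieves candidates, then a slower, more accurate scorer reorders them and the top-k is returned. The sketch below is an assumption-laden illustration of that pattern. `rerank` and `Candidate` are hypothetical names, and the scorer is passed in as a plain function; a real cross-encoder would be a model call.

```typescript
type Candidate = { id: string; text: string; score: number };

// Re-score every first-stage candidate with the (expensive) scorer,
// then keep the k best by the new score.
function rerank(
  candidates: Candidate[],
  scorer: (text: string) => number,
  k: number
): Candidate[] {
  return candidates
    .map((c) => ({ id: c.id, text: c.text, score: scorer(c.text) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Toy stand-in scorer: counts how many query terms appear in the text.
const queryTerms = ["vector", "index"];
const overlap = (text: string) =>
  text.split(" ").filter((w) => queryTerms.indexOf(w) !== -1).length;

const firstStage: Candidate[] = [
  { id: "1", text: "billing faq", score: 0.9 },
  { id: "2", text: "vector index tuning", score: 0.8 },
  { id: "3", text: "index rebuilds", score: 0.7 },
];
const finalTop2 = rerank(firstStage, overlap, 2).map((c) => c.id);
```

The first-stage leader (`1`) drops out once the second-stage scorer has its say; `finalTop2` holds `2` then `3`.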

module.rerank
SQL Interface

Query with SQL, when you want to.

Full SQL interface alongside the native API. JOIN vectors against your existing tables. Plays nicely with the BI tools your team already uses.

module.sql
04 — Under the hood

A storage engine, not a wrapper.

Most vector databases pair an ANN library such as FAISS with a general-purpose store such as RocksDB. We wrote our own storage engine in Rust, optimized end-to-end for the read and write patterns vector search actually exhibits.

Client SDK: TS · Py · Go · Rust · Java
Edge Router: 47 regions
Query Planner: Cost-based
HNSW + Sparse Hybrid Engine: Rust · SIMD
Lattice Storage Engine: Native
Object Storage Tier: S3 · GCS · R2
01

Custom storage engine in Rust

Memory-mapped indexes with adaptive paging. SIMD-vectorized distance calculations on AVX-512 and ARM NEON. Zero-copy reads from disk to wire when the working set spills.

02

Product quantization without recall loss

Lattice-PQ-v2 achieves 32× compression at >99% recall — without the typical accuracy degradation. Working sets that needed 96 GB of RAM now fit in 3 GB.
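
The arithmetic behind that claim: 96 GB / 32 = 3 GB. To show where a ratio like 32× can come from, here is a sketch of textbook product quantization, not Lattice-PQ-v2 itself, whose internals are not described here. The vector is split into subspaces and each subvector is replaced by a one-byte index of its nearest centroid. The 128-subspace configuration in `compressionRatio(1024, 128)` is an assumption chosen because it yields 32× for 1024-dimensional float32 vectors.

```typescript
// Squared Euclidean distance between two equal-length vectors.
function sqDist(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// codebooks[s][c] is centroid c of subspace s (hypothetical layout).
// Returns one byte per subspace: the index of the nearest centroid.
function pqEncode(vector: number[], codebooks: number[][][]): Uint8Array {
  const m = codebooks.length;
  if (m === 0 || vector.length % m !== 0) {
    throw new Error("vector length must be a multiple of the subspace count");
  }
  const subDim = vector.length / m;
  const codes = new Uint8Array(m);
  for (let s = 0; s < m; s++) {
    if (codebooks[s].length > 256) {
      throw new Error("at most 256 centroids fit in a one-byte code");
    }
    const sub = vector.slice(s * subDim, (s + 1) * subDim);
    let best = 0;
    let bestDist = Infinity;
    for (let c = 0; c < codebooks[s].length; c++) {
      const d = sqDist(sub, codebooks[s][c]);
      if (d < bestDist) {
        bestDist = d;
        best = c;
      }
    }
    codes[s] = best;
  }
  return codes;
}

// Raw float32 bytes per vector divided by one code byte per subspace.
function compressionRatio(dims: number, subspaces: number): number {
  return (dims * 4) / subspaces;
}

// Tiny example: 4 dimensions, 2 subspaces, 2 centroids each.
const toyCodebooks = [
  [[0, 0], [1, 1]],
  [[0, 0], [10, 10]],
];
const codes = pqEncode([0.9, 1.1, 1, 2], toyCodebooks);
const ratio = compressionRatio(1024, 128); // 4096 bytes -> 128 bytes
```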

03

Cost-based query planner

The same cost-based planning approach Postgres uses for SQL, adapted for vector workloads. Picks the cheapest path through hybrid, filtered, and reranked queries automatically.
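
As an illustration of the idea only, a cost-based planner estimates the work each candidate plan would do and picks the cheapest. The sketch below chooses between scanning just the filter matches and running a filtered graph search. The cost formulas are invented for this example and are not Lattice's cost model; `planFilteredQuery` and the plan names are hypothetical.

```typescript
type Plan = { name: "scan-matching" | "filtered-hnsw"; cost: number };

// Costs are in "distance computations", using made-up formulas:
//  - scan-matching: one computation per vector that passes the filter
//  - filtered-hnsw: about ef * log2(N) computations, inflated by
//    1 / selectivity because traversal visits many non-matching nodes
function planFilteredQuery(totalVectors: number, selectivity: number, ef: number): Plan {
  if (selectivity <= 0 || selectivity > 1) {
    throw new Error("selectivity must be in (0, 1]");
  }
  const scan: Plan = {
    name: "scan-matching",
    cost: totalVectors * selectivity,
  };
  const graph: Plan = {
    name: "filtered-hnsw",
    cost: (ef * (Math.log(totalVectors) / Math.LN2)) / selectivity,
  };
  return scan.cost <= graph.cost ? scan : graph;
}

// A very selective filter (about 100 matches in 100M): scanning them wins.
const rare = planFilteredQuery(100_000_000, 0.000001, 200);
// A broad filter (half the corpus matches): graph search wins.
const broad = planFilteredQuery(100_000_000, 0.5, 200);
```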

04

Tiered storage with hot/warm/cold

Recent vectors stay in RAM. Older vectors page to local NVMe. Archival vectors live in object storage. The query planner makes the tradeoff transparently.
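
A minimal sketch of age-based tier placement, assuming tiers are chosen purely by time since last write. The 7-day and 90-day thresholds are invented for the example, and `tierFor` and `TierPolicy` are hypothetical names. The real placement policy is not described here.

```typescript
type Tier = "ram" | "nvme" | "object-storage";

interface TierPolicy {
  hotMs: number;  // vectors younger than this stay in RAM
  warmMs: number; // vectors younger than this (but not hot) live on NVMe
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Example thresholds only (assumed, not from the product).
const examplePolicy: TierPolicy = { hotMs: 7 * DAY_MS, warmMs: 90 * DAY_MS };

function tierFor(lastWriteMs: number, nowMs: number, policy: TierPolicy): Tier {
  const age = nowMs - lastWriteMs;
  if (age <= policy.hotMs) return "ram";
  if (age <= policy.warmMs) return "nvme";
  return "object-storage";
}

const now = 1_000 * DAY_MS; // arbitrary fixed clock for the example
const recent = tierFor(now - 1 * DAY_MS, now, examplePolicy);
const older = tierFor(now - 30 * DAY_MS, now, examplePolicy);
const archival = tierFor(now - 365 * DAY_MS, now, examplePolicy);
```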

05 — Pricing

Self-host free. Cloud, when you're ready.

Open core. The product is Apache-2 forever — run it yourself, fork it, ship it in your own infra. We make money when you'd rather not run it yourself.

Open Source
Run it on your own infrastructure. One binary, zero dependencies.
$0 / forever
  • Apache-2 licensed source code
  • Single binary deployment
  • Self-managed clustering & replication
  • Community Discord support
  • SLA & uptime guarantee (not included)
  • Multi-region failover (not included)
Enterprise
Dedicated single-tenant clusters. Compliance, controls, contracts.
Custom
  • Dedicated single-tenant cluster
  • VPC peering & private link
  • 99.99% SLA + custom RPO/RTO
  • Slack-channel support, 1h response
  • SOC 2, HIPAA, FedRAMP-ready
  • Custom DPA, MSA, on-prem option
Powering retrieval at
Anthropic · Notion · Vercel · Linear · Cursor · Glean · Harvey · Decagon

Ship retrieval that actually works in production.

Start with our free tier — 1 million vectors, 5 million queries per month. No credit card. Production-grade infrastructure from day one.

Create free account → Read the docs →
$ curl -fsSL get.lattice.dev | sh