v3.2.0 · Lattice 3.2 released — read the changelog

A vector database
that actually scales.

Sub-2 ms p99 latency across 100M+ vectors. Self-host on your own infrastructure with one binary, or run on our managed cloud across 47 regions. Apache-2 licensed, no vendor lock-in.

Start free in cloud →

curl -fsSL get.lattice.dev | sh

p99 latency

1.7ms

↓ 41% vs Pinecone

Vectors / node

2.4B

at 1024 dimensions

Recall @ k=10

99.4%

HNSW + product quantization

Cost vs leading

−68%

at 100M vector scale

01 — Five lines to ship

From npm install to production-ready search.

Idiomatic clients for every major language. Drop-in compatibility with OpenAI embeddings, Cohere, Voyage, and your own models. No specialized infrastructure knowledge required.

typescript.ts

import { Lattice } from '@lattice/client'

const client = new Lattice({ apiKey: process.env.LATTICE_KEY })
const index = await client.index('docs-v2')

// Upsert with native batching — 50k vectors / sec
await index.upsert(embeddings, { batchSize: 1024, parallel: 8 })

// Vector search with metadata filters & reranking
const results = await index.query({
  vector: queryEmbedding,
  filter: { team: 'platform', status: { $in: ['open', 'wip'] } },
  topK: 20,
  rerank: { model: 'lattice-rerank-v2' }
})

// → returns in < 2ms across 100M+ vectors
    

02 — Independent benchmarks

We publish every benchmark — methodology, data, code.

Vector search vendor claims are notoriously hand-wavy. We run the standard ANN-Benchmarks suite on our own infrastructure and publish the methodology, datasets, hardware specs, and raw results. Reproduce them yourself.

Latency at p99, queried against 100M 1024-dimension vectors with recall ≥ 99%. Single-node configuration, AWS us-east-1.

Dataset: LAION-100M
Hardware: c7g.8xlarge (32 vCPU, 64 GB)
Index: HNSW (M=32, ef=200)
Concurrent queries: 64
Replication: 3x
Last run: 2025-11-08

View full methodology & raw data →

p99 query latency (ms) · lower is better

Lattice v3.2   1.7 ms
Pinecone       2.9 ms
Weaviate       4.2 ms
Qdrant         5.1 ms
Milvus         6.8 ms
pgvector      14.3 ms
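
As a quick check on the headline comparison, the relative reduction can be recomputed from the chart values above. This is a minimal sketch; the function name `reductionPercent` is ours and is not part of any Lattice tooling.

```typescript
// p99 latencies in milliseconds, copied from the chart above.
const latticeP99Ms = 1.7;
const othersP99Ms: Array<[string, number]> = [
  ["Pinecone", 2.9],
  ["Weaviate", 4.2],
  ["Qdrant", 5.1],
  ["Milvus", 6.8],
  ["pgvector", 14.3],
];

// Percentage by which `ours` is lower than `theirs`, rounded to a whole percent.
function reductionPercent(ours: number, theirs: number): number {
  return Math.round(((theirs - ours) / theirs) * 100);
}

for (const [name, ms] of othersP99Ms) {
  console.log(`${name}: ${reductionPercent(latticeP99Ms, ms)}% lower p99`);
}
```

For Pinecone this gives (2.9 − 1.7) / 2.9 ≈ 41%, which matches the figure in the hero stats.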

03 — Built-in primitives

Everything you need. Nothing you don't.

Vector storage is the easy part. The hard parts — filtering, hybrid search, reranking, multi-tenancy, schema evolution — are first-class primitives, not bolt-ons.

Hybrid Search

Dense + sparse + keyword in one query.

BM25 sparse and dense vectors combined natively with tunable weighting. Beats either approach alone on every retrieval benchmark we've tested.

module.search→
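
To make "tunable weighting" concrete, here is a minimal sketch of one common way to blend dense and BM25 scores: normalize each, then take a weighted sum. It illustrates the general technique only and is not Lattice's internal scoring; the `Candidate` shape, `hybridRank`, and the `alpha` parameter are all hypothetical names.

```typescript
// Hypothetical shape: one candidate with a dense-similarity score and a BM25 score.
interface Candidate {
  id: string;
  dense: number;
  bm25: number;
}

// Min-max normalize so both score types land in [0, 1] before mixing.
function normalize(values: number[]): number[] {
  const min = Math.min(...values);
  const max = Math.max(...values);
  if (max === min) return values.map(() => 0);
  return values.map((v) => (v - min) / (max - min));
}

// alpha = 1 ranks by dense score only, alpha = 0 by BM25 only.
function hybridRank(candidates: Candidate[], alpha: number): string[] {
  const dense = normalize(candidates.map((c) => c.dense));
  const sparse = normalize(candidates.map((c) => c.bm25));
  return candidates
    .map((c, i) => ({
      id: c.id,
      score: alpha * (dense[i] ?? 0) + (1 - alpha) * (sparse[i] ?? 0),
    }))
    .sort((a, b) => b.score - a.score)
    .map((r) => r.id);
}

const sample: Candidate[] = [
  { id: "a", dense: 0.9, bm25: 2.0 },
  { id: "b", dense: 0.8, bm25: 9.0 },
  { id: "c", dense: 0.3, bm25: 5.0 },
];
console.log(hybridRank(sample, 0.5));
```

With an even blend, a document that scores well on both signals ("b") outranks one that leads on the dense score alone ("a").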

Metadata Filters

Filters that don't trash recall.

Pre-filtered HNSW: the filter is applied before the search, so matching results are never dropped the way they can be with post-filtering. Filter on 100+ fields without a latency tax, even at multi-billion-vector scale.

module.filter→
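
The difference between pre- and post-filtering is easiest to see on a toy example. The sketch below uses a brute-force list in place of an HNSW graph, so it shows the correctness issue only, not how Lattice implements filtering; `Doc`, `preFilter`, and `postFilter` are hypothetical names.

```typescript
interface Doc {
  id: string;
  score: number; // similarity to the query, higher is better
  team: string;
}

// Post-filtering: take the top k by score first, then drop non-matching rows.
function postFilter(docs: Doc[], k: number, team: string): Doc[] {
  return [...docs]
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .filter((d) => d.team === team);
}

// Pre-filtering: restrict to matching rows first, then take the top k.
function preFilter(docs: Doc[], k: number, team: string): Doc[] {
  return docs
    .filter((d) => d.team === team)
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

const docs: Doc[] = [
  { id: "a", score: 0.95, team: "growth" },
  { id: "b", score: 0.91, team: "growth" },
  { id: "c", score: 0.88, team: "platform" },
  { id: "d", score: 0.8, team: "platform" },
  { id: "e", score: 0.72, team: "platform" },
];

console.log(postFilter(docs, 3, "platform").map((d) => d.id));
console.log(preFilter(docs, 3, "platform").map((d) => d.id));
```

Asked for three "platform" results, post-filtering returns only "c" because two of the top three were filtered out afterwards, while pre-filtering returns "c", "d", and "e".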

Multi-Tenancy

Strict per-tenant isolation, one index.

Run thousands of customer-scoped namespaces in a single index without per-tenant storage or compute overhead. Built for AI products with many small tenants.

module.tenancy→

Real-time Updates

Sub-second freshness, no rebuilds.

Upserts visible to search within 250ms. Live deletes without index rebuilds. Designed for workloads where data changes every second.

module.realtime→

Reranking

Cross-encoder reranking, inline.

First-class reranker integration. Use ours, or bring your own — Cohere, Voyage, Jina, any cross-encoder you trust. Final top-k in one round-trip.

module.rerank→

SQL Interface

Query with SQL, when you want to.

Full SQL interface alongside the native API. JOIN vectors against your existing tables. Plays nicely with the BI tools your team already uses.

module.sql→

04 — Under the hood

A storage engine, not a wrapper.

Most vector databases layer an ANN library such as FAISS on top of a general-purpose store such as RocksDB. We wrote our own storage engine in Rust, optimized end-to-end for the read and write patterns vector search actually exhibits.

Client SDK (TS · Py · Go · Rust · Java)
↓
Edge Router (47 regions)
↓
Query Planner (cost-based)
↓
HNSW + Sparse Hybrid Engine (Rust · SIMD)
↓
Lattice Storage Engine (native)
↓
Object Storage Tier (S3 · GCS · R2)

Custom storage engine in Rust

Memory-mapped indexes with adaptive paging. SIMD-vectorized distance calculations on AVX-512 and ARM NEON. Zero-copy reads from disk to wire when the working set spills.

Product quantization with minimal recall loss

Lattice-PQ-v2 achieves 32× compression at >99% recall — without the typical accuracy degradation. Working sets that needed 96 GB of RAM now fit in 3 GB.
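
The memory figures follow directly from the compression ratio. Here is the arithmetic as a small sketch, assuming vectors are stored as 32-bit floats before compression (the page does not state the raw format) and ignoring codebook overhead; the helper names are ours.

```typescript
const BYTES_PER_FLOAT32 = 4; // assumption: raw vectors are float32

// Raw size of one uncompressed vector, in bytes.
function rawBytesPerVector(dimensions: number): number {
  return dimensions * BYTES_PER_FLOAT32;
}

// Size after compressing by the given ratio (same unit as the input).
function compressedSize(rawSize: number, ratio: number): number {
  return rawSize / ratio;
}

const rawBytes = rawBytesPerVector(1024); // 1024 dims * 4 bytes = 4096 bytes
const pqBytes = compressedSize(rawBytes, 32); // 4096 / 32 = 128 bytes per vector
const workingSetGb = compressedSize(96, 32); // 96 GB / 32 = 3 GB

console.log({ rawBytes, pqBytes, workingSetGb });
```

Under that assumption, a 1024-dimension vector shrinks from 4,096 bytes to 128 bytes, and a 96 GB working set to 3 GB.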

Cost-based query planner

The same cost-based planning approach Postgres uses for SQL, adapted for vector workloads. It picks the cheapest path through hybrid, filtered, and reranked queries automatically.
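
To illustrate what "picks the cheapest path" means, here is a minimal sketch of cost-based plan selection for a filtered query. The plan names and both cost formulas are invented for illustration and say nothing about Lattice's actual cost model.

```typescript
// Hypothetical plan description: a name plus an estimated cost in arbitrary units.
interface Plan {
  name: string;
  estimatedCost: number;
}

// Hypothetical cost model for a filtered vector query.
// selectivity = fraction of vectors that pass the metadata filter (0..1).
function candidatePlans(totalVectors: number, selectivity: number): Plan[] {
  const matching = totalVectors * selectivity;
  return [
    // Scan only the matching vectors and compare each one exactly.
    { name: "filter-then-brute-force", estimatedCost: matching },
    // Walk the ANN graph; the rarer the matches, the more nodes visited per hit.
    {
      name: "filtered-graph-search",
      estimatedCost: (Math.log2(totalVectors) * 200) / Math.max(selectivity, 1e-6),
    },
  ];
}

// Return the plan with the lowest estimated cost (expects a non-empty list).
function cheapest(plans: Plan[]): Plan {
  return plans.reduce((best, p) => (p.estimatedCost < best.estimatedCost ? p : best));
}

console.log(cheapest(candidatePlans(100_000_000, 0.5)).name);
console.log(cheapest(candidatePlans(100_000_000, 0.00001)).name);
```

In this toy model a broad filter favors graph search, while a very selective filter makes an exact scan over the few matching vectors cheaper. A real planner makes the same kind of choice from better estimates.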

Tiered storage with hot/warm/cold

Recent vectors stay in RAM. Older vectors page to local NVMe. Archival vectors live in object storage. The query planner makes the tradeoff transparently.
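
A minimal sketch of the hot/warm/cold idea, assuming tiers are assigned by vector age alone. The thresholds and the `tierFor` function are hypothetical; the page does not state the real cutoffs or whether access frequency also plays a role.

```typescript
type Tier = "ram" | "nvme" | "object-storage";

// Hypothetical thresholds, chosen only for illustration.
const HOT_MAX_AGE_DAYS = 7;
const WARM_MAX_AGE_DAYS = 90;

// Map a vector's age to the tier it would live in under this toy policy.
function tierFor(ageDays: number): Tier {
  if (ageDays <= HOT_MAX_AGE_DAYS) return "ram";
  if (ageDays <= WARM_MAX_AGE_DAYS) return "nvme";
  return "object-storage";
}

console.log(tierFor(1), tierFor(30), tierFor(365));
```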

05 — Pricing

Self-host free. Cloud, when you're ready.

Open core. The product is Apache-2 forever — run it yourself, fork it, ship it in your own infra. We make money when you'd rather not run it yourself.

Open Source

Run it on your own infrastructure. One binary, zero dependencies.

$0 / forever

Apache-2 licensed source code
Single binary deployment
Self-managed clustering & replication
Community Discord support
Not included: SLA & uptime guarantee
Not included: Multi-region failover

Cloud · Standard

Managed Lattice on our infra. Most teams start here.

$0.18 / 1M reads

Serverless, scales to zero
47 regions, < 50ms anywhere
99.9% uptime SLA
Email support, 24h response
SOC 2 Type II & HIPAA
Not included: Dedicated infra & VPC peering
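
For a rough sense of what the per-read price means at a steady query rate, here is a small estimate. It assumes one query is billed as one read and covers read charges only; both are our assumptions, since the card lists no other line items.

```typescript
const USD_PER_MILLION_READS = 0.18; // from the Cloud · Standard card

// Estimated read charges for a constant query rate over `days` days.
// Assumption: one query counts as one billed read.
function monthlyReadCostUsd(queriesPerSecond: number, days = 30): number {
  const reads = queriesPerSecond * 60 * 60 * 24 * days;
  return (reads / 1_000_000) * USD_PER_MILLION_READS;
}

console.log(`$${monthlyReadCostUsd(100).toFixed(2)} per 30 days at 100 QPS`);
```

At a constant 100 queries per second, that is 259.2M reads over 30 days, or about $46.66 in read charges.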

Enterprise

Dedicated single-tenant clusters. Compliance, controls, contracts.

Custom

Dedicated single-tenant cluster
VPC peering & private link
99.99% SLA + custom RPO/RTO
Slack-channel support, 1h response
SOC 2, HIPAA, FedRAMP-ready
Custom DPA, MSA, on-prem option
