Honest 10-way comparison of vector databases on the Hybrid Search & Metadata Filtering axis (BM25 + vector fusion · sparse + dense · payload filtering · multi-tenant isolation) across Pinecone · Weaviate · Qdrant · Milvus/Zilliz · Chroma · pgvector · Turbopuffer · MongoDB Atlas Vector · Vespa · LanceDB. No vendor sponsorship. The Calling Matrix by buyer persona is below — an operator's read on which one to pick when you're forced to pick.
Honest read on positioning, ideal customer, and where each one is the wrong call. No vendor sponsorship, no affiliate links — operator-grade signal.
Hybrid search via sparse + dense vector fusion (sparse vectors represent BM25-style signals as vectors) — solid rather than dominant on this axis. Pinecone's hybrid approach: encode BM25-style sparse vectors alongside dense embeddings, fuse at query time via alpha-weighted score combination. Strong metadata filtering with native filter pushdown — vector search WITH filters in one operation, not vector-then-filter. Multi-tenant via namespaces (one index, thousands of namespaces, strong isolation). Hybrid sparse-dense GA in 2024.
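What that looks like in practice — a minimal sketch of alpha-weighted sparse + dense fusion, assuming the official Pinecone Python client; the index name, namespace, filter field, and placeholder vectors are illustrative, not pulled from Pinecone's docs:

```python
from pinecone import Pinecone

def hybrid_scale(dense, sparse, alpha):
    """Convex-combination weighting: alpha toward dense, (1 - alpha) toward sparse."""
    scaled_sparse = {
        "indices": sparse["indices"],
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    scaled_dense = [v * alpha for v in dense]
    return scaled_dense, scaled_sparse

pc = Pinecone(api_key="YOUR_KEY")
index = pc.Index("docs")                                       # hypothetical index

dense_embedding = [0.1] * 1536                                 # placeholder: your embedding model output
bm25_sparse = {"indices": [102, 4031], "values": [1.4, 0.7]}   # placeholder: BM25/SPLADE-style encoder output

dense_q, sparse_q = hybrid_scale(dense_embedding, bm25_sparse, alpha=0.75)
results = index.query(
    vector=dense_q,
    sparse_vector=sparse_q,
    top_k=10,
    filter={"category": {"$eq": "support"}},   # filter pushdown, same call as the vector search
    namespace="tenant-42",                     # namespace-based tenant isolation
    include_metadata=True,
)
```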
The true hybrid search leader — BM25 + vector fusion baked into the query engine, not bolted on as separate sparse vectors. Weaviate hybrid query: one operation, alpha-tunable BM25 + vector score fusion, returns ranked results from both signals. Strong metadata filtering with where-clause syntax. Multi-tenant architecture is purpose-built — one cluster, thousands of tenant namespaces with full isolation (no noisy-neighbor effects). Module ecosystem includes vectorizer and generative modules (text2vec-openai, generative-anthropic) that integrate with hybrid retrieval.
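A hedged sketch of the one-call hybrid pattern, assuming the Weaviate v4 Python client; the collection name, property filter, and alpha value are illustrative:

```python
import weaviate
from weaviate.classes.query import Filter

client = weaviate.connect_to_local()               # or connect_to_weaviate_cloud(...)
docs = client.collections.get("Document")          # hypothetical collection

response = docs.query.hybrid(
    query="refund policy for annual plans",
    alpha=0.5,                                     # 0 = pure BM25, 1 = pure vector
    filters=Filter.by_property("category").equal("support"),
    limit=10,
)
for obj in response.objects:
    print(obj.properties)

client.close()
```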
Strong on filtered-vector-search — purpose-built for vector search WITH metadata filters as one fused operation. Qdrant hybrid: sparse + dense vectors (similar to Pinecone's approach) with alpha-weighted fusion. Filtering excellence: payload-aware indexing means filters are applied during vector search (not post-filter), strong performance even when filters reduce candidate set significantly. Geo + nested + range filters supported. Multi-tenant via collections (one cluster, multiple collections, less polished than Weaviate's tenant isolation but functional).
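A minimal filtered-search sketch, assuming a recent qdrant-client (Query API); collection name, payload fields, and the placeholder query vector are illustrative — sparse + dense fusion goes through the same Query API via prefetch:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Filter, FieldCondition, MatchValue, Range

client = QdrantClient(url="http://localhost:6333")
query_embedding = [0.1] * 768                      # placeholder: your embedding model output

hits = client.query_points(
    collection_name="docs",
    query=query_embedding,
    query_filter=Filter(                           # applied during the vector search, not after
        must=[
            FieldCondition(key="tenant_id", match=MatchValue(value="acme")),
            FieldCondition(key="year", range=Range(gte=2023)),
        ]
    ),
    limit=10,
)
```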
Sparse + dense hybrid via separate sparse vectors, with strong scalar field filtering at billion-vector scale. Milvus hybrid: sparse + dense vectors stored together, fused at query time via RRF or weighted sum. Scalar field filtering with index optimization (boolean, integer, varchar fields can be indexed for fast filter pushdown). Multi-tenant via partitions (one collection, multiple partitions, isolation by partition key) — works at scale but less polished than Weaviate's purpose-built multi-tenant. Hybrid + filtering both work cleanly at 100M-1B+ scale where other engines struggle.
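Roughly what the fused sparse + dense call looks like — a sketch assuming pymilvus 2.4+ and a collection that already has a dense vector field, a sparse vector field, and an indexed category scalar; names and params are illustrative and worth verifying against current pymilvus docs:

```python
from pymilvus import connections, Collection, AnnSearchRequest, RRFRanker

connections.connect(uri="http://localhost:19530")
collection = Collection("docs")                    # hypothetical collection

dense_vec = [0.1] * 768                            # placeholder embedding
sparse_vec = {102: 1.4, 4031: 0.7}                 # placeholder sparse vector (term-id: weight)

dense_req = AnnSearchRequest(
    data=[dense_vec], anns_field="dense_vector",
    param={"metric_type": "IP"}, limit=50, expr='category == "support"',
)
sparse_req = AnnSearchRequest(
    data=[sparse_vec], anns_field="sparse_vector",
    param={"metric_type": "IP"}, limit=50, expr='category == "support"',
)

results = collection.hybrid_search(
    reqs=[dense_req, sparse_req],
    rerank=RRFRanker(),                            # or WeightedRanker(0.7, 0.3)
    limit=10,
    partition_names=["tenant_acme"],               # partition-key isolation
)
```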
Solid metadata filtering via where-clause syntax in collection.query() — no native BM25 hybrid (vector-only retrieval). Chroma's filtering: vector search with metadata predicates fused, simple where-clause syntax. No native sparse+dense or BM25+vector hybrid (you can implement hybrid in your application layer by combining Chroma vector results with separate BM25 results, but the engine doesn't fuse them). The right tradeoff for prototyping (simplest API beats hybrid sophistication at this stage).
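A minimal sketch with the chromadb client — metadata predicate fused into the vector query, hybrid left to your application layer; collection name and fields are illustrative:

```python
import chromadb

client = chromadb.Client()
collection = client.get_or_create_collection("docs")

query_embedding = [0.1] * 384                      # placeholder: your embedding model output

results = collection.query(
    query_embeddings=[query_embedding],
    n_results=10,
    where={"$and": [{"user_id": {"$eq": "u-123"}},
                    {"category": {"$eq": "support"}}]},
)

# No engine-side BM25: if you need hybrid, run a separate lexical pass
# (e.g. rank_bm25 or an external search index) and fuse the two ranked
# lists in your own code (RRF or weighted scores).
```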
SQL WHERE clauses for metadata filtering (the most expressive filter syntax in the category — full SQL) and Postgres native full-text search for BM25-style hybrid via separate query + fusion. pgvector + Postgres FTS hybrid pattern: run vector search query, run FTS query, fuse results in application or via SQL UNION + score normalization. Not as polished as Weaviate's baked-in fusion, but the SQL ergonomics for complex filters (JOINs across tables, range predicates, full SQL expressivity) are unmatched. Multi-tenant via row-level security (Postgres RLS — strong but requires careful schema design).
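A hedged sketch of the vector + FTS fusion pattern with RRF scoring, assuming psycopg 3, the pgvector Python adapter, and a documents table with embedding (vector), tsv (tsvector), and tenant_id columns — the table layout and column names are assumptions:

```python
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

HYBRID_SQL = """
WITH vec AS (
    SELECT id, ROW_NUMBER() OVER (ORDER BY embedding <=> %(qvec)s) AS rnk
    FROM documents
    WHERE tenant_id = %(tenant)s
    ORDER BY embedding <=> %(qvec)s
    LIMIT 50
),
fts AS (
    SELECT id, ROW_NUMBER() OVER (ORDER BY ts_rank(tsv, q) DESC) AS rnk
    FROM documents, plainto_tsquery('english', %(qtext)s) q
    WHERE tenant_id = %(tenant)s AND tsv @@ q
    ORDER BY rnk
    LIMIT 50
)
SELECT id,
       COALESCE(1.0 / (60 + vec.rnk), 0) + COALESCE(1.0 / (60 + fts.rnk), 0) AS rrf_score
FROM vec FULL OUTER JOIN fts USING (id)
ORDER BY rrf_score DESC
LIMIT 10;
"""

with psycopg.connect("dbname=app") as conn:
    register_vector(conn)                                      # pgvector type adapter
    rows = conn.execute(HYBRID_SQL, {
        "qvec": np.array([0.1] * 1536, dtype=np.float32),      # placeholder embedding
        "tenant": "acme",
        "qtext": "refund policy annual plans",
    }).fetchall()
```

Because it's all one SQL statement, the filter side can be as rich as Postgres allows — JOINs across normal relational tables, range predicates, subqueries — which is the whole pgvector pitch.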
Supports filtered vector search and BM25 + vector hybrid on cold-storage architecture — but the cold-storage latency tradeoff (100-300ms p99) applies to hybrid + filtered queries too. Hybrid via sparse + dense vector fusion. Filtering supported with object-storage-aware index design. The right tradeoff for hybrid + filtered queries on huge corpora where query rate is low (research, archival, batch retrieval workloads).
Atlas Search supports hybrid BM25 + vector fusion via $rankFusion stage (added 2024), with full MongoDB query language for metadata filtering. The hybrid + filter ergonomics are good for MongoDB shops — express vector + BM25 + filter in one Atlas Search aggregation pipeline. Multi-tenant via MongoDB database-per-tenant or collection-per-tenant patterns (works but operationally heavier than Weaviate's purpose-built multi-tenant).
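Directionally, the pipeline shape looks like this — a sketch only, assuming pymongo against an Atlas cluster recent enough to support $rankFusion; index names, paths, weights, and the exact stage spelling should be verified against current Atlas docs:

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://...")              # Atlas connection string assumed
coll = client["app"]["documents"]

query_embedding = [0.1] * 1536                         # placeholder embedding

pipeline = [
    {"$rankFusion": {
        "input": {"pipelines": {
            # Dense side: Atlas Vector Search with a metadata pre-filter.
            "vector": [{"$vectorSearch": {
                "index": "vector_index",
                "path": "embedding",
                "queryVector": query_embedding,
                "numCandidates": 200,
                "limit": 20,
                "filter": {"category": "support"},
            }}],
            # Lexical side: BM25-style Atlas Search.
            "keyword": [
                {"$search": {"index": "default",
                             "text": {"query": "refund policy", "path": "body"}}},
                {"$limit": 20},
            ],
        }},
        "combination": {"weights": {"vector": 0.7, "keyword": 0.3}},
    }},
    {"$limit": 10},
]

results = list(coll.aggregate(pipeline))
```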
The most powerful hybrid + ranking engine in the category — true hybrid (BM25 + vector + structured + custom ranking + ML reranking models) in one query at billion-doc scale. Vespa's ranking expression language lets you compose arbitrarily complex ranking functions: combine BM25 + vector similarity + recency + custom signals + run ML models in-engine for second-stage reranking. The right pick when 'we need hybrid retrieval that's actually production-grade for billion documents' is the conversation. Operationally complex.
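For a sense of the query surface — a hedged sketch hitting the Vespa query API over HTTP, assuming a deployed application with an embedding tensor field and a rank profile named hybrid that fuses bm25 with vector closeness; the schema/rank-profile side is not shown and all names are illustrative:

```python
import requests

query_embedding = [0.1] * 768                      # placeholder embedding

body = {
    # Lexical match (userQuery) OR approximate nearest-neighbor over the embedding field.
    "yql": "select * from sources * where userQuery() or "
           "({targetHits: 100}nearestNeighbor(embedding, q))",
    "query": "refund policy annual plans",
    "input.query(q)": query_embedding,
    "ranking": "hybrid",       # rank profile, e.g. first-phase bm25(body) + closeness(field, embedding);
                               # second-phase can invoke an in-engine ML/ONNX reranking model
    "hits": 10,
}
resp = requests.post("http://localhost:8080/search/", json=body)
hits = resp.json()["root"].get("children", [])
```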
SQL filter expressivity via DuckDB integration on Lance columnar format — strong filter ergonomics, hybrid retrieval emerging. LanceDB filter syntax: full SQL expressions on Lance format columns including metadata, multi-modal data, and vector similarity. Hybrid (BM25 + vector) emerging in roadmap; current pattern is vector + SQL filter fusion. Multi-modal filtering is a unique strength — filter by image properties, audio features, text metadata, and vector similarity in one query.
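A minimal sketch with the lancedb Python client — a SQL predicate pushed into the same scan as the vector search; table name and columns are illustrative:

```python
import lancedb

db = lancedb.connect("./lance-data")
tbl = db.open_table("docs")                        # hypothetical table

query_embedding = [0.1] * 768                      # placeholder embedding

results = (
    tbl.search(query_embedding)
       .where("tenant_id = 'acme' AND year >= 2025 AND price < 100")   # full SQL expression on Lance columns
       .limit(10)
       .to_pandas()
)
```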
Most comparison sites refuse to force-rank because their revenue depends on staying neutral. SideGuy ranks because it doesn't take vendor money. Here's the call by buyer persona.
Your problem: RAG over docs with simple metadata filters (date range, category, user_id). You don't need BM25 + vector hybrid — vector + filter is enough. Velocity matters. See the Vector Databases megapage for the full cluster.
Your problem: Production semantic search where keyword match AND semantic similarity both matter (e-commerce, support search, knowledge base). Hybrid is load-bearing. Pair with the Embedding × Vector DB Pairing axis for the embedding-substrate decision.
Your problem: Multi-tenant SaaS shipping AI features per-customer. Strong tenant isolation required (no noisy neighbors, per-tenant access control). Hybrid + filtering both load-bearing. Coordinate with the Compliance Authority Graph for per-tenant DPA + isolation requirements.
Your problem: Production hybrid search at billion-doc scale where custom ranking expressions and second-stage ML reranking matter. Search-quality team owns ranking quality. See /operator cockpit for the operator-layer view.
These rankings are SideGuy's lived-data + observed-buyer-pattern read as of 2026-05-11. They're directional, not gospel. The right answer for YOUR specific situation may diverge — text PJ for a 10-min operator-honest read on your actual buying context.
Vendor pricing + features + market positioning shift quarterly. SideGuy may earn referral commissions from some of these vendors, but rankings are independent — affiliate relationships never change rank order. Sister doctrines: /open/ live operator dashboard · install packs · operator network.
Or skip all of them. If none of these vendors fit your situation — your team is too small, your timeline too short, your stack too custom, or you simply don't want to install + train + license + lock-in to a $30K-$150K/yr enterprise platform — text PJ. SideGuy ships not-heavy customizable layers for buyers who want to OWN their retrieval stack instead of renting it. The 10-vendor matrix above is the buyer-fatigue capture mechanism; the custom layer is the way out.
True BM25 + vector hybrid (Weaviate, Vespa, MongoDB Atlas $rankFusion) means the engine runs BM25 lexical scoring AND vector similarity scoring on the same query, then fuses scores at query time — both signals are first-class citizens in the engine. Sparse + dense hybrid (Pinecone, Qdrant, Milvus) means you store sparse vectors (which represent BM25-like term frequency signals as vectors) alongside dense embedding vectors, and fuse them at query time via score combination. Functionally similar for most workloads; architecturally different. True hybrid wins on simplicity (no separate sparse vector indexing) and on edge cases where lexical match is critical (exact phrase, rare terms). Sparse + dense wins on flexibility (you can encode any signal as sparse vectors). Weaviate, Vespa, and (via $rankFusion) MongoDB Atlas are the engines in the category with true hybrid baked in.
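To make the sparse-vector half concrete — a toy sketch of how BM25-style term-frequency signals get packed into the indices/values shape that sparse + dense engines store; real encoders apply IDF and saturation, not raw counts:

```python
from collections import Counter

def to_sparse_vector(tokens, vocab):
    """Map term frequencies onto vocabulary ids: indices = term ids, values = weights."""
    counts = Counter(t for t in tokens if t in vocab)
    indices = sorted(vocab[t] for t in counts)
    id_to_token = {vocab[t]: t for t in counts}
    values = [float(counts[id_to_token[i]]) for i in indices]
    return {"indices": indices, "values": values}

vocab = {"refund": 0, "policy": 1, "annual": 2, "plan": 3}    # toy vocabulary
sparse_q = to_sparse_vector("refund policy annual plan refund".split(), vocab)
# -> {"indices": [0, 1, 2, 3], "values": [2.0, 1.0, 1.0, 1.0]}
```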
Filtering is more important than similarity when (1) you have hard predicate constraints (date range, user_id, tenant_id, category) that must be respected — no amount of vector similarity matters if results don't pass the filter, (2) your filter dramatically reduces the candidate set (e.g. 'find similar to X but only within these 100 documents from this user') — purpose-built filter pushdown is critical here, (3) you're shipping multi-tenant SaaS where tenant_id filter is on every query — Weaviate's purpose-built multi-tenant beats namespace + filter patterns. The honest 2026 default: filtering is at least as important as vector recall for most production AI features. Engines with payload-aware indexing (Qdrant) or filter pushdown during vector search (Pinecone, Weaviate, Milvus) win over engines that filter post-search.
Four realistic patterns in 2026: (1) Namespaces (Pinecone) — one index, thousands of namespaces, namespace_id on every query, strong isolation, simple. (2) Multi-tenant baked in (Weaviate) — purpose-built tenant isolation with per-tenant indexes inside one cluster, scales to thousands of tenants without management overhead. (3) Partitions (Milvus) — one collection, multiple partitions, partition key isolation, works at scale but operationally heavier. (4) Row-level security (pgvector + Postgres) — RLS policies enforce per-tenant access, strongest control but requires careful schema design and policy management. Honest 2026 default: Weaviate's purpose-built multi-tenant wins for SaaS shipping AI features per-customer; Pinecone namespaces win for simpler multi-tenant patterns; pgvector RLS wins when transactional consistency between tenant data and vectors matters.
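Pattern (4) in code — a hedged sketch of Postgres RLS scoping every vector query to a tenant, assuming psycopg 3, the pgvector adapter, and an app that connects as a non-owner role (RLS does not bind the table owner unless forced); table and role names are illustrative:

```python
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

with psycopg.connect("dbname=app user=app_rw") as conn:    # non-owner role assumed
    register_vector(conn)

    # One-time setup (normally a migration, shown inline for clarity).
    conn.execute("ALTER TABLE documents ENABLE ROW LEVEL SECURITY")
    conn.execute("""
        CREATE POLICY tenant_isolation ON documents
            USING (tenant_id = current_setting('app.tenant_id'))
    """)

    # Per-request: set the tenant once; every query after that is scoped automatically.
    conn.execute("SET app.tenant_id = 'acme'")
    rows = conn.execute(
        "SELECT id FROM documents ORDER BY embedding <=> %s LIMIT 10",
        (np.array([0.1] * 1536, dtype=np.float32),),
    ).fetchall()
```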
Vespa is the only engine in the category with true production-grade custom ranking expressions + in-engine ML reranking models — combine BM25 + vector + recency + custom signals + run a reranking model in the same query. This is why Yahoo Mail + Spotify recommendations + OkCupid matching run on Vespa. Other engines support reranking via separate API calls (retrieve top-N from vector DB, send to a reranking model API like Cohere Rerank or Voyage Rerank, return reordered results) — works fine for most production AI features but adds a network hop. Milvus + Pinecone + Weaviate are all adding native reranking integrations in 2025-2026 roadmap, but Vespa remains the deepest custom-ranking engine. The honest 2026 default: external reranking via Cohere/Voyage API is the standard pattern; Vespa-style in-engine ranking is the right choice when search quality is the team's core competency.
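The standard external pattern in a dozen lines — a sketch assuming the cohere Python SDK; vector_db_top_k is a hypothetical helper standing in for whichever engine you retrieve from, and the model name is an assumption to verify against current Cohere docs:

```python
import cohere

co = cohere.Client("COHERE_API_KEY")

query_text = "refund policy for annual plans"
candidates = vector_db_top_k(query_text, k=50)      # hypothetical helper: top-50 from your vector DB

# One extra network hop: send candidates to the reranker, get a reordered list back.
reranked = co.rerank(
    model="rerank-english-v3.0",                    # assumed model name
    query=query_text,
    documents=[c["text"] for c in candidates],
    top_n=10,
)
for r in reranked.results:
    print(r.index, r.relevance_score)               # r.index points back into the candidate list
```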
10-minute operator-honest read on your actual buying context. No deck, no demo call, no signup. If we're not the right fit, we'll say so.
📱 Text PJ · 858-461-8054 — Skip the 5 vendor demos. 30-day delivery. No procurement cycle. No demo theater. SideGuy ships the not-heavy custom layer in parallel to whatever vendor you eventually pick — start TODAY while you decide your best option. Custom builds in 30 days →
📱 Urgent? Text PJ · 858-461-8054 — I'm almost positive I can help. If I can't, you don't pay.
No signup. No seminar. No bullshit.
Don't see what you were looking for?
Text PJ a sentence about what you actually need — I'll build you a free custom shareable on the house. No email, no funnel, no SOW.
📲 Text PJ — free shareable