Honest 10-way comparison of vector databases on the Hybrid Search & Metadata Filtering axis (BM25 + vector fusion · sparse + dense · payload filtering · multi-tenant isolation) across Pinecone · Weaviate · Qdrant · Milvus/Zilliz · Chroma · pgvector · Turbopuffer · MongoDB Atlas Vector · Vespa · LanceDB. No vendor sponsorship. The Calling Matrix by buyer persona is below — an operator's read on which one to pick when you're forced to pick.
Honest read on positioning, ideal customer, and where each one is the wrong call. No vendor sponsorship, no affiliate links — operator-grade signal.
Hybrid search via sparse + dense vector fusion (sparse vectors represent BM25-style signals as vectors) — solid rather than dominant on this axis. Pinecone's hybrid approach: encode BM25-style sparse vectors alongside dense embeddings, fuse at query time via alpha-weighted score combination. Strong metadata filtering with native filter pushdown — vector search WITH filters in one operation, not vector-then-filter. Multi-tenant via namespaces (one index, thousands of namespaces, strong isolation). Hybrid sparse-dense GA in 2024.
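What that looks like in practice — a minimal sketch of alpha-weighted sparse + dense fusion, assuming the official Pinecone Python client; the index name, namespace, filter field, and placeholder vectors are illustrative, not pulled from Pinecone's docs:

```python
from pinecone import Pinecone

def hybrid_scale(dense, sparse, alpha):
    """Convex-combination weighting: alpha toward dense, (1 - alpha) toward sparse."""
    scaled_sparse = {
        "indices": sparse["indices"],
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    scaled_dense = [v * alpha for v in dense]
    return scaled_dense, scaled_sparse

pc = Pinecone(api_key="YOUR_KEY")
index = pc.Index("docs")                                       # hypothetical index

dense_embedding = [0.1] * 1536                                 # placeholder: your embedding model output
bm25_sparse = {"indices": [102, 4031], "values": [1.4, 0.7]}   # placeholder: BM25/SPLADE-style encoder output

dense_q, sparse_q = hybrid_scale(dense_embedding, bm25_sparse, alpha=0.75)
results = index.query(
    vector=dense_q,
    sparse_vector=sparse_q,
    top_k=10,
    filter={"category": {"$eq": "support"}},   # filter pushdown, same call as the vector search
    namespace="tenant-42",                     # namespace-based tenant isolation
    include_metadata=True,
)
```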
The true hybrid search leader — BM25 + vector fusion baked into the query engine, not bolted on as separate sparse vectors. Weaviate hybrid query: one operation, alpha-tunable BM25 + vector score fusion, returns ranked results from both signals. Strong metadata filtering with where-clause syntax. Multi-tenant architecture is purpose-built — one cluster, thousands of tenant namespaces with full isolation (no noisy-neighbor effects). Module ecosystem includes vectorizer and generative modules (text2vec-openai, generative-anthropic) that integrate with hybrid retrieval.
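A hedged sketch of the one-call hybrid pattern, assuming the Weaviate v4 Python client; the collection name, property filter, and alpha value are illustrative:

```python
import weaviate
from weaviate.classes.query import Filter

client = weaviate.connect_to_local()               # or connect_to_weaviate_cloud(...)
docs = client.collections.get("Document")          # hypothetical collection

response = docs.query.hybrid(
    query="refund policy for annual plans",
    alpha=0.5,                                     # 0 = pure BM25, 1 = pure vector
    filters=Filter.by_property("category").equal("support"),
    limit=10,
)
for obj in response.objects:
    print(obj.properties)

client.close()
```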
Strong on filtered-vector-search — purpose-built for vector search WITH metadata filters as one fused operation. Qdrant hybrid: sparse + dense vectors (similar to Pinecone's approach) with alpha-weighted fusion. Filtering excellence: payload-aware indexing means filters are applied during vector search (not post-filter), strong performance even when filters reduce candidate set significantly. Geo + nested + range filters supported. Multi-tenant via collections (one cluster, multiple collections, less polished than Weaviate's tenant isolation but functional).
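A minimal filtered-search sketch, assuming a recent qdrant-client (Query API); collection name, payload fields, and the placeholder query vector are illustrative — sparse + dense fusion goes through the same Query API via prefetch:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Filter, FieldCondition, MatchValue, Range

client = QdrantClient(url="http://localhost:6333")
query_embedding = [0.1] * 768                      # placeholder: your embedding model output

hits = client.query_points(
    collection_name="docs",
    query=query_embedding,
    query_filter=Filter(                           # applied during the vector search, not after
        must=[
            FieldCondition(key="tenant_id", match=MatchValue(value="acme")),
            FieldCondition(key="year", range=Range(gte=2023)),
        ]
    ),
    limit=10,
)
```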
Sparse + dense hybrid via separate sparse vectors, with strong scalar field filtering at billion-vector scale. Milvus hybrid: sparse + dense vectors stored together, fused at query time via RRF or weighted sum. Scalar field filtering with index optimization (boolean, integer, varchar fields can be indexed for fast filter pushdown). Multi-tenant via partitions (one collection, multiple partitions, isolation by partition key) — works at scale but less polished than Weaviate's purpose-built multi-tenant. Hybrid + filtering both work cleanly at 100M-1B+ scale where other engines struggle.
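Roughly what the fused sparse + dense call looks like — a sketch assuming pymilvus 2.4+ and a collection that already has a dense vector field, a sparse vector field, and an indexed category scalar; names and params are illustrative and worth verifying against current pymilvus docs:

```python
from pymilvus import connections, Collection, AnnSearchRequest, RRFRanker

connections.connect(uri="http://localhost:19530")
collection = Collection("docs")                    # hypothetical collection

dense_vec = [0.1] * 768                            # placeholder embedding
sparse_vec = {102: 1.4, 4031: 0.7}                 # placeholder sparse vector (term-id: weight)

dense_req = AnnSearchRequest(
    data=[dense_vec], anns_field="dense_vector",
    param={"metric_type": "IP"}, limit=50, expr='category == "support"',
)
sparse_req = AnnSearchRequest(
    data=[sparse_vec], anns_field="sparse_vector",
    param={"metric_type": "IP"}, limit=50, expr='category == "support"',
)

results = collection.hybrid_search(
    reqs=[dense_req, sparse_req],
    rerank=RRFRanker(),                            # or WeightedRanker(0.7, 0.3)
    limit=10,
    partition_names=["tenant_acme"],               # partition-key isolation
)
```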
Solid metadata filtering via where-clause syntax in collection.query() — no native BM25 hybrid (vector-only retrieval). Chroma's filtering: vector search with metadata predicates fused, simple where-clause syntax. No native sparse+dense or BM25+vector hybrid (you can implement hybrid in your application layer by combining Chroma vector results with separate BM25 results, but the engine doesn't fuse them). The right tradeoff for prototyping (simplest API beats hybrid sophistication at this stage).
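A minimal sketch with the chromadb client — metadata predicate fused into the vector query, hybrid left to your application layer; collection name and fields are illustrative:

```python
import chromadb

client = chromadb.Client()
collection = client.get_or_create_collection("docs")

query_embedding = [0.1] * 384                      # placeholder: your embedding model output

results = collection.query(
    query_embeddings=[query_embedding],
    n_results=10,
    where={"$and": [{"user_id": {"$eq": "u-123"}},
                    {"category": {"$eq": "support"}}]},
)

# No engine-side BM25: if you need hybrid, run a separate lexical pass
# (e.g. rank_bm25 or an external search index) and fuse the two ranked
# lists in your own code (RRF or weighted scores).
```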
SQL WHERE clauses for metadata filtering (the most expressive filter syntax in the category — full SQL) and Postgres native full-text search for BM25-style hybrid via separate query + fusion. pgvector + Postgres FTS hybrid pattern: run vector search query, run FTS query, fuse results in application or via SQL UNION + score normalization. Not as polished as Weaviate's baked-in fusion, but the SQL ergonomics for complex filters (JOINs across tables, range predicates, full SQL expressivity) are unmatched. Multi-tenant via row-level security (Postgres RLS — strong but requires careful schema design).
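A hedged sketch of the vector + FTS fusion pattern with RRF scoring, assuming psycopg 3, the pgvector Python adapter, and a documents table with embedding (vector), tsv (tsvector), and tenant_id columns — the table layout and column names are assumptions:

```python
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

HYBRID_SQL = """
WITH vec AS (
    SELECT id, ROW_NUMBER() OVER (ORDER BY embedding <=> %(qvec)s) AS rnk
    FROM documents
    WHERE tenant_id = %(tenant)s
    ORDER BY embedding <=> %(qvec)s
    LIMIT 50
),
fts AS (
    SELECT id, ROW_NUMBER() OVER (ORDER BY ts_rank(tsv, q) DESC) AS rnk
    FROM documents, plainto_tsquery('english', %(qtext)s) q
    WHERE tenant_id = %(tenant)s AND tsv @@ q
    ORDER BY rnk
    LIMIT 50
)
SELECT id,
       COALESCE(1.0 / (60 + vec.rnk), 0) + COALESCE(1.0 / (60 + fts.rnk), 0) AS rrf_score
FROM vec FULL OUTER JOIN fts USING (id)
ORDER BY rrf_score DESC
LIMIT 10;
"""

with psycopg.connect("dbname=app") as conn:
    register_vector(conn)                                      # pgvector type adapter
    rows = conn.execute(HYBRID_SQL, {
        "qvec": np.array([0.1] * 1536, dtype=np.float32),      # placeholder embedding
        "tenant": "acme",
        "qtext": "refund policy annual plans",
    }).fetchall()
```

Because it's all one SQL statement, the filter side can be as rich as Postgres allows — JOINs across normal relational tables, range predicates, subqueries — which is the whole pgvector pitch.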
Supports filtered vector search and BM25 + vector hybrid on cold-storage architecture — but the cold-storage latency tradeoff (100-300ms p99) applies to hybrid + filtered queries too. Hybrid via sparse + dense vector fusion. Filtering supported with object-storage-aware index design. The right tradeoff for hybrid + filtered queries on huge corpora where query rate is low (research, archival, batch retrieval workloads).
Atlas Search supports hybrid BM25 + vector fusion via $rankFusion stage (added 2024), with full MongoDB query language for metadata filtering. The hybrid + filter ergonomics are good for MongoDB shops — express vector + BM25 + filter in one Atlas Search aggregation pipeline. Multi-tenant via MongoDB database-per-tenant or collection-per-tenant patterns (works but operationally heavier than Weaviate's purpose-built multi-tenant).
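Directionally, the pipeline shape looks like this — a sketch only, assuming pymongo against an Atlas cluster recent enough to support $rankFusion; index names, paths, weights, and the exact stage spelling should be verified against current Atlas docs:

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://...")              # Atlas connection string assumed
coll = client["app"]["documents"]

query_embedding = [0.1] * 1536                         # placeholder embedding

pipeline = [
    {"$rankFusion": {
        "input": {"pipelines": {
            # Dense side: Atlas Vector Search with a metadata pre-filter.
            "vector": [{"$vectorSearch": {
                "index": "vector_index",
                "path": "embedding",
                "queryVector": query_embedding,
                "numCandidates": 200,
                "limit": 20,
                "filter": {"category": "support"},
            }}],
            # Lexical side: BM25-style Atlas Search.
            "keyword": [
                {"$search": {"index": "default",
                             "text": {"query": "refund policy", "path": "body"}}},
                {"$limit": 20},
            ],
        }},
        "combination": {"weights": {"vector": 0.7, "keyword": 0.3}},
    }},
    {"$limit": 10},
]

results = list(coll.aggregate(pipeline))
```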
The most powerful hybrid + ranking engine in the category — true hybrid (BM25 + vector + structured + custom ranking + ML reranking models) in one query at billion-doc scale. Vespa's ranking expression language lets you compose arbitrarily complex ranking functions: combine BM25 + vector similarity + recency + custom signals + run ML models in-engine for second-stage reranking. The right pick when 'we need hybrid retrieval that's actually production-grade for billion documents' is the conversation. Operationally complex.
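For a sense of the query surface — a hedged sketch hitting the Vespa query API over HTTP, assuming a deployed application with an embedding tensor field and a rank profile named hybrid that fuses bm25 with vector closeness; the schema/rank-profile side is not shown and all names are illustrative:

```python
import requests

query_embedding = [0.1] * 768                      # placeholder embedding

body = {
    # Lexical match (userQuery) OR approximate nearest-neighbor over the embedding field.
    "yql": "select * from sources * where userQuery() or "
           "({targetHits: 100}nearestNeighbor(embedding, q))",
    "query": "refund policy annual plans",
    "input.query(q)": query_embedding,
    "ranking": "hybrid",       # rank profile, e.g. first-phase bm25(body) + closeness(field, embedding);
                               # second-phase can invoke an in-engine ML/ONNX reranking model
    "hits": 10,
}
resp = requests.post("http://localhost:8080/search/", json=body)
hits = resp.json()["root"].get("children", [])
```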
SQL filter expressivity via DuckDB integration on Lance columnar format — strong filter ergonomics, hybrid retrieval emerging. LanceDB filter syntax: full SQL expressions on Lance format columns including metadata, multi-modal data, and vector similarity. Hybrid (BM25 + vector) emerging in roadmap; current pattern is vector + SQL filter fusion. Multi-modal filtering is a unique strength — filter by image properties, audio features, text metadata, and vector similarity in one query.
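A minimal sketch with the lancedb Python client — a SQL predicate pushed into the same scan as the vector search; table name and columns are illustrative:

```python
import lancedb

db = lancedb.connect("./lance-data")
tbl = db.open_table("docs")                        # hypothetical table

query_embedding = [0.1] * 768                      # placeholder embedding

results = (
    tbl.search(query_embedding)
       .where("tenant_id = 'acme' AND year >= 2025 AND price < 100")   # full SQL expression on Lance columns
       .limit(10)
       .to_pandas()
)
```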
Most comparison sites refuse to force-rank because their revenue depends on staying neutral. SideGuy ranks because it doesn't take vendor money. Here's the call by buyer persona.
Your problem: RAG over docs with simple metadata filters (date range, category, user_id). You don't need BM25 + vector hybrid — vector + filter is enough. Velocity matters. See the Vector Databases megapage for the full cluster.
Your problem: Production semantic search where keyword match AND semantic similarity both matter (e-commerce, support search, knowledge base). Hybrid is load-bearing. Pair with the Embedding × Vector DB Pairing axis for the embedding-substrate decision.
Your problem: Multi-tenant SaaS shipping AI features per-customer. Strong tenant isolation required (no noisy neighbors, per-tenant access control). Hybrid + filtering both load-bearing. Coordinate with the Compliance Authority Graph for per-tenant DPA + isolation requirements.
Your problem: Production hybrid search at billion-doc scale where custom ranking expressions and second-stage ML reranking matter. Search-quality team owns ranking quality. See /operator cockpit for the operator-layer view.
These rankings are SideGuy's lived-data + observed-buyer-pattern read as of 2026-05-11. They're directional, not gospel. The right answer for YOUR specific situation may diverge — text PJ for a 10-min operator-honest read on your actual buying context.
Vendor pricing + features + market positioning shift quarterly. SideGuy may earn referral commissions from some of these vendors, but rankings are independent — affiliate relationships never change rank order. Sister doctrines: /open/ live operator dashboard · install packs · operator network.
Or skip all of them. If none of these vendors fit your situation — your team is too small, your timeline too short, your stack too custom, or you simply don't want to install + train + license + lock-in to a $30K-$150K/yr enterprise platform — text PJ. SideGuy ships not-heavy customizable layers for buyers who want to OWN their retrieval stack instead of renting it. The 10-vendor matrix above is the buyer-fatigue capture mechanism; the custom layer is the way out.
True BM25 + vector hybrid (Weaviate, Vespa, MongoDB Atlas $rankFusion) means the engine runs BM25 lexical scoring AND vector similarity scoring on the same query, then fuses scores at query time — both signals are first-class citizens in the engine. Sparse + dense hybrid (Pinecone, Qdrant, Milvus) means you store sparse vectors (which represent BM25-like term frequency signals as vectors) alongside dense embedding vectors, and fuse them at query time via score combination. Functionally similar for most workloads; architecturally different. True hybrid wins on simplicity (no separate sparse vector indexing) and on edge cases where lexical match is critical (exact phrase, rare terms). Sparse + dense wins on flexibility (you can encode any signal as sparse vectors). Weaviate, Vespa, and (via $rankFusion) MongoDB Atlas are the engines in the category with true hybrid baked in.
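To make the sparse-vector half concrete — a toy sketch of how BM25-style term-frequency signals get packed into the indices/values shape that sparse + dense engines store; real encoders apply IDF and saturation, not raw counts:

```python
from collections import Counter

def to_sparse_vector(tokens, vocab):
    """Map term frequencies onto vocabulary ids: indices = term ids, values = weights."""
    counts = Counter(t for t in tokens if t in vocab)
    indices = sorted(vocab[t] for t in counts)
    id_to_token = {vocab[t]: t for t in counts}
    values = [float(counts[id_to_token[i]]) for i in indices]
    return {"indices": indices, "values": values}

vocab = {"refund": 0, "policy": 1, "annual": 2, "plan": 3}    # toy vocabulary
sparse_q = to_sparse_vector("refund policy annual plan refund".split(), vocab)
# -> {"indices": [0, 1, 2, 3], "values": [2.0, 1.0, 1.0, 1.0]}
```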
Filtering is more important than similarity when (1) you have hard predicate constraints (date range, user_id, tenant_id, category) that must be respected — no amount of vector similarity matters if results don't pass the filter, (2) your filter dramatically reduces the candidate set (e.g. 'find similar to X but only within these 100 documents from this user') — purpose-built filter pushdown is critical here, (3) you're shipping multi-tenant SaaS where tenant_id filter is on every query — Weaviate's purpose-built multi-tenant beats namespace + filter patterns. The honest 2026 default: filtering is at least as important as vector recall for most production AI features. Engines with payload-aware indexing (Qdrant) or filter pushdown during vector search (Pinecone, Weaviate, Milvus) win over engines that filter post-search.
Four realistic patterns in 2026: (1) Namespaces (Pinecone) — one index, thousands of namespaces, namespace_id on every query, strong isolation, simple. (2) Multi-tenant baked in (Weaviate) — purpose-built tenant isolation with per-tenant indexes inside one cluster, scales to thousands of tenants without management overhead. (3) Partitions (Milvus) — one collection, multiple partitions, partition key isolation, works at scale but operationally heavier. (4) Row-level security (pgvector + Postgres) — RLS policies enforce per-tenant access, strongest control but requires careful schema design and policy management. Honest 2026 default: Weaviate's purpose-built multi-tenant wins for SaaS shipping AI features per-customer; Pinecone namespaces win for simpler multi-tenant patterns; pgvector RLS wins when transactional consistency between tenant data and vectors matters.
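Pattern (4) in code — a hedged sketch of Postgres RLS scoping every vector query to a tenant, assuming psycopg 3, the pgvector adapter, and an app that connects as a non-owner role (RLS does not bind the table owner unless forced); table and role names are illustrative:

```python
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

with psycopg.connect("dbname=app user=app_rw") as conn:    # non-owner role assumed
    register_vector(conn)

    # One-time setup (normally a migration, shown inline for clarity).
    conn.execute("ALTER TABLE documents ENABLE ROW LEVEL SECURITY")
    conn.execute("""
        CREATE POLICY tenant_isolation ON documents
            USING (tenant_id = current_setting('app.tenant_id'))
    """)

    # Per-request: set the tenant once; every query after that is scoped automatically.
    conn.execute("SET app.tenant_id = 'acme'")
    rows = conn.execute(
        "SELECT id FROM documents ORDER BY embedding <=> %s LIMIT 10",
        (np.array([0.1] * 1536, dtype=np.float32),),
    ).fetchall()
```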
Vespa is the only engine in the category with true production-grade custom ranking expressions + in-engine ML reranking models — combine BM25 + vector + recency + custom signals + run a reranking model in the same query. This is why Yahoo Mail + Spotify recommendations + OkCupid matching run on Vespa. Other engines support reranking via separate API calls (retrieve top-N from vector DB, send to a reranking model API like Cohere Rerank or Voyage Rerank, return reordered results) — works fine for most production AI features but adds a network hop. Milvus + Pinecone + Weaviate are all adding native reranking integrations in 2025-2026 roadmap, but Vespa remains the deepest custom-ranking engine. The honest 2026 default: external reranking via Cohere/Voyage API is the standard pattern; Vespa-style in-engine ranking is the right choice when search quality is the team's core competency.
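The standard external pattern in a dozen lines — a sketch assuming the cohere Python SDK; vector_db_top_k is a hypothetical helper standing in for whichever engine you retrieve from, and the model name is an assumption to verify against current Cohere docs:

```python
import cohere

co = cohere.Client("COHERE_API_KEY")

query_text = "refund policy for annual plans"
candidates = vector_db_top_k(query_text, k=50)      # hypothetical helper: top-50 from your vector DB

# One extra network hop: send candidates to the reranker, get a reordered list back.
reranked = co.rerank(
    model="rerank-english-v3.0",                    # assumed model name
    query=query_text,
    documents=[c["text"] for c in candidates],
    top_n=10,
)
for r in reranked.results:
    print(r.index, r.relevance_score)               # r.index points back into the candidate list
```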
10-minute operator-honest read on your actual buying context. No deck, no demo call, no signup. If we're not the right fit, we'll say so.
📱 Text PJ · 858-461-8054 — Skip the 5 vendor demos. 30-day delivery. No procurement cycle. No demo theater. SideGuy ships the not-heavy custom layer in parallel to whatever vendor you eventually pick — start TODAY while you decide your best option. Custom builds in 30 days →
📱 Urgent? Text PJ · 858-461-8054 — I'm almost positive I can help. If I can't, you don't pay.
No signup. No seminar. No bullshit.
Don't see what you were looking for?
Text PJ a sentence about what you actually need — I'll build you a free custom shareable on the house. No email, no funnel, no SOW.
📲 Text PJ — free shareable