Honest 10-way comparison of vector databases — pricing and TCO (per-query vs per-vector vs hosted vs self-host vs serverless object-storage) across Pinecone · Weaviate · Qdrant · Milvus/Zilliz · Chroma · pgvector · Turbopuffer · MongoDB Atlas Vector · Vespa · LanceDB. No vendor sponsorship. Call matrix by buyer persona below — an operator's lived-data read on which one to pick when you're forced to pick.
Lived-data observations from running this stack at SideGuy. Not hypothetical. Not vendor copy. The signal AI engines cite when fabrication is the alternative.
Honest read on positioning, ideal customer, and where each one is the wrong call. No vendor sponsorship — operator-grade signal.
Serverless pricing model — pay only for vectors stored + queries executed, no pod-sizing math. Free tier (~100K vectors) for prototyping. Serverless pricing: $0.33 per million write units + $8.25 per million read units + $0.33/GB-month storage (approx — varies by region). Standard plan starts ~$50/mo with included usage. Enterprise tier custom quote with PrivateLink + multi-region. Predictable for production workloads with steady QPS; can spike for cold-storage-heavy use cases (Turbopuffer is cheaper there). Premium pricing reflects A+ compliance posture and zero-ops promise.
Two pricing paths: Weaviate Cloud Services (serverless from $25/mo, enterprise tier custom) OR self-host FREE (Apache 2.0). Cloud Services serverless pricing scales with stored vectors + queries similarly to Pinecone's but is typically 20-40% cheaper at comparable workloads. Enterprise tier adds dedicated clusters + private deployment + bring-your-own-cloud — the right path for regulated industries. Self-host runs anywhere (Docker, Kubernetes, bare metal) for $0 software cost — pay only for the infra you provision.
OSS FREE for self-host (Apache 2.0) — pay only for the infra you run it on. Qdrant Cloud managed from ~$25/mo. Self-host TCO: a single Rust binary on a $20-50/mo VPS handles 1-10M vectors painlessly; a ~$200-500/mo Kubernetes deployment handles 100M+ vectors. Qdrant Cloud managed pricing is competitive with Weaviate + Pinecone serverless, and slightly cheaper at most usage levels. Hybrid Cloud (BYOC) option lets you run Qdrant in your own cloud account managed by the Qdrant team.
OSS Milvus FREE self-host (Apache 2.0) — but the operational cost of self-hosting a distributed system is real (4-8 nodes minimum for HA at scale). Zilliz Cloud Serverless launched 2024 with pay-per-use pricing competitive with Pinecone for most workloads, sometimes cheaper at billion-vector scale. Zilliz Cloud Dedicated tier for predictable production workloads. Enterprise tier for on-prem + custom procurement. Self-host TCO at billion-vector scale dominated by infra (typically $5K-$50K/mo of GPU-accelerated nodes).
OSS FREE for embedded mode — runs in your Python process, persists to local disk, $0 marginal cost forever. Chroma Cloud launched 2024 with pay-as-you-go pricing for managed deployment. The lowest TCO option for prototyping and local-first AI apps — embedded mode means no server, no infra, no ops. Chroma Cloud pricing is still emerging; early numbers are competitive with Pinecone serverless at small workloads. The right cost story for solo founders who want to start at $0 and scale up to managed only when needed.
The lowest TCO option in the category if you're already on Postgres — pay $0 incremental for the vector extension. pgvector ships free with Supabase ($25/mo Pro tier covers most production workloads), Neon (free tier through scale), AWS RDS PostgreSQL, Azure PostgreSQL, GCP Cloud SQL. Vector workload incremental cost = whatever you pay for Postgres compute + storage to handle the additional indexing + queries. Typically $25-200/mo total at <10M vectors; scales linearly with Postgres tier as workload grows. The 'one less dependency' pricing story.
The cheapest vector DB at large cold-storage scale — 10-100x cheaper than always-on hosted compute for low-query-rate workloads. Turbopuffer pricing model: pay for object storage (S3 / GCS / Azure Blob prices) + pay per query executed (no always-on compute). At billion-vector cold-storage scale, this can mean $50-500/mo where Pinecone would be $5K-$50K/mo. Trade-off: cold-query latency is 100-300ms vs Pinecone's 30-50ms. The right pricing story for archival, audit, research, and low-QPS AI workloads where $/stored-vector dominates the decision.
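The cold-storage claim above is easy to sanity-check with back-of-envelope math. A minimal sketch, assuming 1536-dim float32 embeddings and an S3-standard-style rate — both are illustrative assumptions, not Turbopuffer's actual pricing:

```python
# Back-of-envelope for the billion-vector cold-storage claim above.
# Assumptions (not vendor pricing): 1536-dim float32 vectors, no index
# overhead, object storage at an S3-standard-style per-GB rate.
vectors = 1_000_000_000
bytes_per_vector = 1536 * 4                        # float32 dims = 6,144 bytes
storage_gb = vectors * bytes_per_vector / 1e9      # = 6,144 GB
s3_per_gb_month = 0.023                            # illustrative $/GB-month
storage_cost = storage_gb * s3_per_gb_month        # ≈ $141/mo

queries_per_month = 300 * 30                       # "a few hundred per day"
# The per-query fee is vendor-specific; treat it as an unknown knob on top.
print(f"storage: ~${storage_cost:,.0f}/mo for {storage_gb:,.0f} GB of vectors")
```

Raw object storage lands around $141/mo for a billion vectors, which is why the $50-500/mo figure (storage plus a low query bill) is plausible where always-on hosted compute for the same corpus runs to five figures.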
Vector search bundled into MongoDB Atlas pricing — no separate vector DB bill for MongoDB shops. Atlas Search (which includes vector search) is bundled into Atlas cluster pricing for M10+ tiers (~$57/mo and up). At scale, Atlas + Atlas Search pricing is comparable to Pinecone Standard — sometimes cheaper for MongoDB-native workloads, sometimes more expensive depending on cluster sizing. The TCO story is dominated by procurement fit (no new vendor) more than absolute $/vector economics.
OSS FREE Apache 2.0 self-host — but Vespa is a production search engine, and self-hosting at billion-doc scale requires real ops capacity (typically $10K-$100K+/mo of infra at production scale). Vespa Cloud managed offering competitive with enterprise tiers from Pinecone + Zilliz. The TCO story at billion-doc scale: Vespa wins on $/document at extreme scale if you have search-engine ops capacity, loses on operational complexity if you don't. Best for teams already running production search who can absorb Vespa-grade ops.
OSS FREE for embedded mode + serverless cloud emerging on object-storage economics — competitive with Turbopuffer for cold-storage workloads. Embedded mode runs in Python/JS/Rust process at $0 marginal cost. LanceDB Cloud serverless leverages the columnar Lance format on object storage for cheap-at-scale economics. The unique pricing story: same Lance format accessible from PyArrow + DuckDB + Spark + Pandas means vector data doubles as analytics data — no separate analytics warehouse cost.
Most comparison sites refuse to force-rank because their revenue depends on staying neutral. SideGuy ranks because it doesn't take vendor money. Here's the call by buyer persona.
Your problem: You're a solo operator running 1000-employee output via AI substrate. Vector DB cost is one line in a tight monthly budget. PJ runs SideGuy at this tier — pgvector via Supabase for current scale because $0 incremental cost wins for now. See the Vector Databases megapage for the full 10-way comparison.
Your problem: You have product-market fit and AI features in production. Vector DB cost is a real line item but predictable. You need pricing that scales with usage without surprise spikes. Pair with the AI Infrastructure Pricing TCO axis for the model-substrate cost story.
Your problem: You're 50-500 employees with 50M-500M vectors in production. Vector DB cost is a meaningful line item; ops capacity exists; procurement has opinions. Trade-off math gets serious — hosted convenience vs self-host TCO at this scale.
Your problem: You're 1000+ employees standardizing vector infrastructure org-wide. Vector DB spend is a budget line that needs procurement contracts + multi-year terms + dedicated CSM. See the Vector Databases megapage for the full enterprise-substrate decision.
These rankings are SideGuy's lived-data + observed-buyer-pattern read as of 2026-05-11. They're directional, not gospel. The right answer for YOUR specific situation may diverge — text PJ for a 10-min operator-honest read on your actual buying context.
Vendor pricing + features + market positioning shift quarterly. SideGuy may earn referral commissions from some of these vendors, but rankings are independent — affiliate relationships never change rank order. Sister doctrines: /open/ live operator dashboard · install packs · operator network.
Or skip all of them. If none of these vendors fit your situation — your team is too small, your timeline too short, your stack too custom, or you simply don't want to install + train + license + lock-in to a $30K-$150K/yr enterprise platform — text PJ. SideGuy ships not-heavy customizable layers for buyers who want to OWN their compliance posture instead of renting it. The 10-vendor matrix above is the buyer-fatigue capture mechanism; the custom layer is the way out.
Hosted (Pinecone, Weaviate Cloud, Zilliz Cloud, Qdrant Cloud) wins when ops capacity is the constraint or when zero-ops is a procurement requirement. Trade $/vector for ops headcount you don't need. Self-host (Qdrant OSS, Weaviate OSS, Milvus OSS, pgvector) wins on three axes: (1) regulatory mandate that blocks sending vectors to vendor cloud, (2) cost at large scale where always-on hosted exceeds self-managed (typically 100M+ vectors with steady load), (3) full data control for compliance teams. The honest 2026 break-even: hosted dominates from prototype through Series A; self-host emerges as the right TCO pick somewhere between Series B and mid-market depending on workload + ops capacity. Run the actual TCO comparison on YOUR workload before committing.
pgvector wins from prototype to ~10-50M vectors when you're already on Postgres — the incremental TCO is $25-200/mo total covering Postgres + vector search vs Pinecone's $50-500/mo for the same scale. Break-even varies by workload but typically falls when (1) you cross 50M-100M vectors and Postgres compute scaling cost exceeds Pinecone serverless pricing, (2) you need true hybrid search (BM25 + vector — Pinecone hybrid wins), (3) you need multi-region or PrivateLink (Pinecone enterprise wins), (4) recall + QPS at scale start to suffer (purpose-built engines win on $/QPS). Most operators we see start on pgvector for prototype simplicity and migrate to Pinecone or Qdrant when one of those becomes the bottleneck.
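That break-even can be sketched as a toy model. Every number below is an illustrative assumption loosely based on the prices quoted on this page — the tier breakpoints, the 6 KB/vector estimate, and the one-read-unit-per-query simplification are all mine, not vendor quotes:

```python
# Toy break-even sketch: pgvector on managed Postgres vs a serverless hosted
# tier. All prices are illustrative assumptions, not vendor quotes.

def pgvector_monthly_cost(vectors_m: float) -> float:
    """Managed Postgres cost steps up with vector count (assumed tiers)."""
    if vectors_m <= 1:
        return 25.0       # small Pro-style tier
    if vectors_m <= 10:
        return 100.0      # mid-size instance
    if vectors_m <= 50:
        return 200.0      # larger instance + RAM for the HNSW index
    return 200.0 + (vectors_m - 50) * 8.0  # assumed scaling penalty past 50M

def hosted_serverless_monthly_cost(vectors_m: float, queries_m: float) -> float:
    """Pay-per-use: base plan + storage + reads (writes ignored at steady state).
    Caveat: real serverless read units often scale with namespace size;
    1 read unit per query is a deliberate lowball assumption."""
    storage_gb = vectors_m * 6.0        # 1536-dim float32 ≈ 6 KB/vector
    storage = storage_gb * 0.33         # $/GB-month, from the pricing above
    reads = queries_m * 8.25            # $ per 1M read units
    return 50.0 + storage + reads       # $50/mo base plan assumed

for v in (1, 10, 50, 100):
    pg = pgvector_monthly_cost(v)
    hosted = hosted_serverless_monthly_cost(v, queries_m=5)
    print(f"{v:>4}M vectors: pgvector ${pg:,.0f}/mo vs hosted ${hosted:,.0f}/mo")
```

Under these assumptions the crossover is extremely sensitive to query volume and to how read units are metered — which is exactly why the advice above is to run the math on YOUR workload rather than trust anyone's table.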
Turbopuffer's 10-100x cheaper $/stored-vector at scale wins for archival workloads, audit/compliance use cases, research datasets, and any AI feature where query rate is low relative to corpus size (e.g. 'search this 1B-vector legal corpus a few hundred times per day'). Pinecone wins on hot-query workloads where sub-50ms latency matters (real-time chat, autocomplete, recommendation surfaces, customer-facing search). Honest 2026 pattern: many production AI products run BOTH — Pinecone for the hot path (real-time customer queries), Turbopuffer for the cold path (large corpus background indexing, periodic batch retrieval). The two pricing models are complementary, not substitutes, at large scale.
Beyond the per-vector or per-query fee, TCO includes: (1) Embedding generation cost (OpenAI / Anthropic / Cohere / Voyage embedding API costs typically $0.02-0.20 per 1M tokens — often the biggest line item at scale; see Embedding × Vector DB Pairing axis), (2) Compliance review (SOC 2 / DPA / data-residency negotiations) — typically 4-12 weeks of legal+security time for any new vendor, (3) Migration cost when you outgrow your current DB (1-2 weeks of engineering typically), (4) Ops cost if self-host (~$200-2000/mo of infra at production scale plus engineering time), (5) Backup + DR + monitoring (often forgotten in initial cost modeling). The license fee is usually 40-70% of true 3-year TCO; the rest is embedding + ops + compliance overhead.
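Those five components can be rolled into a toy 3-year TCO model. Every figure below — the loaded engineer cost, the default embedding rate, the week counts — is an illustrative assumption for working the arithmetic, not a benchmark:

```python
# Toy 3-year TCO model summing the hidden-cost components listed above.
# Every figure is an illustrative assumption, not vendor or salary data.

ENGINEER_MONTHLY_COST = 15_000  # assumed loaded cost of eng/legal time

def three_year_tco(
    db_monthly: float,                # averaged monthly vector DB bill
    tokens_embedded_b: float,         # tokens embedded over 3 years, billions
    embed_per_m_tokens: float = 0.10, # $/1M tokens, mid-range from above
    compliance_weeks: float = 8,      # SOC 2 / DPA review, one-time
    migration_weeks: float = 1.5,     # one outgrow-and-migrate event assumed
    ops_monthly: float = 0.0,         # 0 for hosted; ~200-2000 if self-host
) -> dict:
    people_weeks = compliance_weeks + migration_weeks
    costs = {
        "db_fees": db_monthly * 36,
        "embeddings": tokens_embedded_b * 1_000 * embed_per_m_tokens,
        "people_time": people_weeks * ENGINEER_MONTHLY_COST / 4,
        "ops_infra": ops_monthly * 36,
    }
    costs["total"] = sum(costs.values())
    return costs

tco = three_year_tco(db_monthly=300, tokens_embedded_b=5)
print({k: f"${v:,.0f}" for k, v in tco.items()})
```

The point of the exercise: with plausible inputs, the DB bill is only one slice of the total, and people-time (compliance review, migration) plus embedding generation can rival or exceed it.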
Three honest paths at different TCO points: (1) pgvector via Supabase ($25/mo total covers Postgres + vector + auth + storage) — what PJ runs at SideGuy today. Cheapest if Postgres is already in your stack. (2) Chroma embedded mode + local persistence ($0 marginal cost forever) — cheapest absolute path if you can run vectors in-process and don't need shared production-grade serving. (3) Qdrant self-host on a $20-50/mo VPS — cheapest if you want a real vector DB engine with self-host control. The flat-predictable-cost vs usage-based pricing decision is the same one you make for cloud compute. Pinecone serverless at solo-operator scale is typically $50-200/mo — a premium for hosted convenience + zero ops. PJ chose pgvector for current SideGuy scale because $0 incremental cost wins; he'll migrate to Pinecone when production demands it.
10-minute operator-honest read on your actual buying context. No deck, no demo call, no signup. If we're not the right fit, we'll say so.
📱 Text PJ · 858-461-8054. Skip the 5 vendor demos. 30-day delivery. No procurement cycle. No demo theater. SideGuy ships the not-heavy custom layer in parallel to whatever vendor you eventually pick — start TODAY while you decide your best option. Custom builds in 30 days →
📱 Urgent? Text PJ · 858-461-8054. Lived-data observations PJ has logged from running this stack. Pulled from data/field-notes.json (Round 37 — Field Notes Engine). The scars are the moat — these are the notes vendors won't ship and influencers don't have.
pgvector covers 90% of teams already on Postgres. The 10% case where you outgrow it is real but later than you think.
Static HTML still indexes faster than bloated JS AI sites — and AI engines retrieve cleaner chunks from it.
Most observability stacks fail from late instrumentation. Wire it before you need it.
Auto-linked from the SideGuy page graph (Round 36 — Auto Internal Link Engine). Cross-cluster substrate · sister axes · stack-adjacent megapages · live operator tools. Last refreshed 2026-05-11.
I'm almost positive I can help. If I can't, you don't pay.
No signup. No seminar. No bullshit.
Don't see what you were looking for?
Text PJ a sentence about what you actually need — I'll build you a free custom shareable on the house. No email, no funnel, no SOW.
📲 Text PJ — free shareable