
Enterprise Vector Databases in Production

Vector databases in production: Pinecone, Weaviate, pgvector. What works, what breaks, and what we actually use for enterprise AI.
15 April 2024 · 8 min read
John Li, Chief Technology Officer
Mak Khan, Chief AI Officer
Everyone building RAG needs a vector database. The question is which one, and the answer depends on things most comparison articles don't cover: operational complexity, enterprise security requirements, and what happens when you need to update a million embeddings at 2am on a Tuesday.

What You Need to Know

  • There is no best vector database. There is a best vector database for your specific constraints: data volume, query patterns, security requirements, operational capacity, and budget.
  • pgvector is the right starting point for most NZ enterprises. It runs on PostgreSQL, which your team already knows. It handles millions of vectors. It avoids adding another managed service.
  • Dedicated vector databases (Pinecone, Weaviate) earn their place at scale. If you're querying billions of vectors with sub-50ms latency requirements, PostgreSQL won't cut it.
  • The migration path matters more than the initial choice. Start with pgvector. If you hit its limits, migrate to a dedicated solution. The embedding layer should be abstracted enough to make this feasible.

What We've Tested

Over the past year we've run production workloads on three vector database options. Not benchmarks. Production. Real enterprise data, real query patterns, real uptime requirements.

pgvector (PostgreSQL Extension)

What it is: A PostgreSQL extension that adds vector similarity search. Your vectors live alongside your relational data in the same database.
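As a concrete picture of what "vector similarity search" computes, here is the cosine similarity that embedding search ultimately ranks by, as a plain-Python sketch (real indexes like HNSW approximate this at scale rather than scanning every vector):

```python
import math

def cosine_similarity(a, b):
    """How aligned two embeddings are: 1.0 means identical direction,
    0.0 means orthogonal (unrelated), -1.0 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction -> 1.0
```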
What works:
  • Zero operational overhead if you already run PostgreSQL. Same backup strategy, same monitoring, same team.
  • Joins between vector results and relational data are trivial. "Find similar documents owned by this department" is one query.
  • HNSW indexing (added in pgvector 0.5) makes query performance competitive for datasets up to a few million vectors.
  • Transactional consistency. Vector updates participate in the same transaction as relational updates. No sync issues.
What breaks:
  • Performance degrades above roughly 5-10 million vectors, depending on dimensionality and hardware. HNSW indexing helps but doesn't eliminate the ceiling.
  • Memory consumption is significant. Each 1536-dimension float32 vector is about 6KB, so a million vectors means roughly 6GB of raw vector data before HNSW graph overhead.
  • No built-in multi-tenancy beyond what PostgreSQL's row-level security provides. Workable, but you're building it yourself.
  • Reindexing after bulk updates is slow. If you need to re-embed a large document corpus, plan for downtime or a shadow index strategy.
When we use it: Most enterprise engagements. The operational simplicity wins. Our clients' teams can manage it without learning a new system.
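A back-of-the-envelope sizing helper for the memory point above. This counts raw vector data only; HNSW graph links add overhead on top, so treat it as a lower bound:

```python
def index_memory_bytes(n_vectors: int, dims: int = 1536, bytes_per_float: int = 4) -> int:
    """Lower bound on pgvector memory: raw float32 vector data only.
    The HNSW index adds graph-link overhead beyond this."""
    return n_vectors * dims * bytes_per_float

# One million 1536-dimension float32 vectors.
print(index_memory_bytes(1_000_000) / 1024**3)  # about 5.7 GiB of raw vectors
```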

Pinecone

What it is: A fully managed vector database service. You send vectors in, you query vectors out. Infrastructure is abstracted.
What works:
  • Genuinely zero ops. No servers to manage, no indexes to tune, no capacity planning.
  • Scales to billions of vectors without performance degradation.
  • Metadata filtering is well-implemented. Combining vector similarity with attribute filters is fast and reliable.
  • Namespaces provide clean multi-tenancy.
What breaks:
  • Data leaves your network. For NZ enterprises with data sovereignty requirements, this is often a deal-breaker. Pinecone's infrastructure is US-based.
  • Cost scales linearly with data volume. At enterprise scale (tens of millions of vectors, multiple namespaces), monthly costs become significant.
  • No relational joins. If you need to combine vector results with structured data, you're making two queries and joining in application code.
  • Vendor lock-in is real. The API is proprietary. Migration means re-indexing everything.
When we use it: Rapid prototyping and engagements where data sovereignty isn't a constraint. The speed of getting to a working system is genuinely impressive.
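To make the "join in application code" point concrete, here is a minimal sketch. `matches` and `documents` are hypothetical stand-ins for a Pinecone query response and rows fetched from your relational store; the shapes are illustrative, not Pinecone's actual client API:

```python
# Stand-ins: vector search results and relational rows, fetched separately.
matches = [
    {"id": "doc-42", "score": 0.91},
    {"id": "doc-7", "score": 0.88},
]
documents = {
    "doc-42": {"title": "Leave Policy", "department": "HR"},
    "doc-7": {"title": "Travel Policy", "department": "Finance"},
}

def join_in_app(matches, documents, department=None):
    """The join pgvector does in SQL, done by hand: attach relational
    attributes to each match, optionally filtering by department."""
    out = []
    for m in matches:
        doc = documents.get(m["id"])
        if doc is None:
            continue  # the two stores can drift out of sync
        if department and doc["department"] != department:
            continue
        out.append({**m, **doc})
    return out

print(join_in_app(matches, documents, department="HR"))
```

Two round trips, plus the sync problem the `continue` hides: this is the operational cost the "no relational joins" bullet is pointing at.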

Weaviate

What it is: An open-source vector database that can be self-hosted or used as a managed service.
What works:
  • Self-hosting option addresses data sovereignty. Your vectors stay on your infrastructure.
  • Built-in vectorisation modules (connect an embedding model directly). Convenient for simpler architectures.
  • GraphQL API is flexible and well-documented.
  • Multi-tenancy is a first-class feature.
What breaks:
  • Operational complexity is higher than Pinecone and significantly higher than pgvector. Running a Weaviate cluster requires Kubernetes experience and ongoing attention.
  • Resource consumption is substantial. Production clusters need more hardware than you'd expect.
  • The ecosystem is newer. Fewer battle-tested patterns, fewer Stack Overflow answers, fewer engineers who've run it in production.
  • Schema migrations can be painful. Changing your data model after initial deployment requires careful planning.
When we use it: Engagements that need dedicated vector database performance with self-hosted sovereignty. The operational cost is justified when the requirements demand it.
Pick the most boring option that meets your requirements. You can't un-complicate your infrastructure.
John Li
Chief Technology Officer

Production Patterns

Embedding Updates

The question nobody asks during evaluation but everyone faces in production: how do you update embeddings when your source documents change?
Incremental updates. When a single document changes, re-embed it and upsert the vectors. This is straightforward on all three platforms. The challenge is knowing when a document has changed, which is a data pipeline problem, not a vector database problem.
Bulk re-embedding. When you change your embedding model (and you will, because better models keep releasing), you need to re-embed everything. With pgvector, this means creating a new index alongside the old one, populating it, then swapping. With Pinecone, you create a new namespace. With Weaviate, a new class. Plan for this from day one.
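The "populate, then swap" pattern above can be sketched with an in-memory stand-in for the vector store. `embed_v2` is a placeholder for whatever new embedding model triggered the re-index; the class names are illustrative, not a real library:

```python
class VectorStore:
    """Toy stand-in for a namespaced vector store."""
    def __init__(self):
        self.namespaces = {}   # namespace -> {doc_id: vector}
        self.live = None       # the namespace queries are routed to

    def upsert(self, namespace, doc_id, vector):
        self.namespaces.setdefault(namespace, {})[doc_id] = vector

    def promote(self, namespace):
        """Atomic cut-over: queries switch to the new index in one step."""
        self.live = namespace

def bulk_reembed(store, corpus, embed, new_namespace):
    # Populate the shadow namespace fully before touching live traffic.
    for doc_id, text in corpus.items():
        store.upsert(new_namespace, doc_id, embed(text))
    store.promote(new_namespace)  # swap only once the shadow index is complete

store = VectorStore()
embed_v2 = lambda text: [float(len(text))]  # placeholder embedding model
bulk_reembed(store, {"a": "hello", "b": "world!"}, embed_v2, "v2")
```

The point is the atomic `promote` step: queries never see a half-built index, whether the namespace is a pgvector shadow table, a Pinecone namespace, or a Weaviate class.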

Hybrid Search

Pure vector similarity search isn't enough for enterprise use. Users search for specific terms (policy numbers, clause references, product codes) that semantic similarity misses.
All three options support hybrid search, but differently:
  • pgvector: Combine with PostgreSQL's full-text search using tsvector. Two indexes, one query.
  • Pinecone: Sparse-dense vectors (alpha feature) or metadata filtering for keyword matching.
  • Weaviate: Built-in BM25 + vector hybrid search. The most integrated implementation of the three.
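Whichever platform produces the two result lists, you still need to merge them. Reciprocal Rank Fusion is one common, score-scale-free way to do that; it isn't specific to any of the three databases, and is sketched here purely for illustration:

```python
def rrf(keyword_ranking, vector_ranking, k=60):
    """Reciprocal Rank Fusion: merge two rankings without comparing their
    raw scores. k=60 is the conventional default smoothing constant."""
    scores = {}
    for ranking in (keyword_ranking, vector_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# A clause reference the keyword index found but semantic search ranked low
# still surfaces near the top of the fused list.
print(rrf(["policy-123", "doc-9"], ["doc-9", "doc-4", "policy-123"]))
```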

Monitoring

What to watch in production:
  • Query latency percentiles (p50, p95, p99). Average latency hides problems.
  • Recall quality. Are the top-k results actually relevant? Measure with a human-evaluated test set.
  • Index size and growth rate. Vector databases consume more memory than you expect.
  • Embedding drift. As your source data changes, the distribution of vectors shifts. Monitor for degradation.
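For the latency point, a nearest-rank percentile over raw samples is enough to see what averages hide; a minimal sketch:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: p in (0, 100] over raw latency samples."""
    ordered = sorted(samples)
    idx = max(0, min(len(ordered) - 1, math.ceil(p / 100 * len(ordered)) - 1))
    return ordered[idx]

latencies_ms = [12, 14, 15, 15, 16, 18, 22, 35, 90, 240]  # example samples
print({p: percentile(latencies_ms, p) for p in (50, 95, 99)})
```

On these samples the mean is 47.7ms while p50 is 16ms and p99 is 240ms: the slow tail your users actually feel is invisible in the average.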

Our Recommendation

Start with pgvector. Seriously. For a New Zealand enterprise building its first AI capabilities, the operational simplicity outweighs the performance ceiling. Your team knows PostgreSQL. Your infrastructure supports PostgreSQL. Your security and sovereignty requirements are met by default.
If you outgrow it, the migration is manageable. Abstract your embedding storage behind a clean interface, and swapping the backend is a focused engineering effort, not a rewrite.
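A sketch of what "abstract your embedding storage behind a clean interface" can look like. The names (`VectorIndex`, `InMemoryIndex`) are illustrative, not a real library; the point is that a `PgVectorIndex` or `PineconeIndex` satisfying the same Protocol could be swapped in without touching callers:

```python
from typing import Protocol, Sequence

class VectorIndex(Protocol):
    """The narrow surface the rest of the application depends on."""
    def upsert(self, doc_id: str, vector: Sequence[float]) -> None: ...
    def query(self, vector: Sequence[float], top_k: int) -> list[str]: ...

class InMemoryIndex:
    """Toy backend: brute-force nearest neighbours by squared distance.
    A pgvector- or Pinecone-backed class would satisfy the same Protocol."""
    def __init__(self):
        self._vectors: dict[str, Sequence[float]] = {}

    def upsert(self, doc_id, vector):
        self._vectors[doc_id] = vector

    def query(self, vector, top_k):
        def dist(item):
            _, v = item
            return sum((a - b) ** 2 for a, b in zip(vector, v))
        ranked = sorted(self._vectors.items(), key=dist)
        return [doc_id for doc_id, _ in ranked[:top_k]]

index: VectorIndex = InMemoryIndex()
index.upsert("a", [0.0, 1.0])
index.upsert("b", [1.0, 0.0])
print(index.query([0.1, 0.9], top_k=1))
```

Callers hold a `VectorIndex`, never a concrete backend, which is what keeps the later migration a focused engineering effort rather than a rewrite.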
Build it properly the first time. Migrate only when you have evidence that you need to.