---
title: Scaling
sort: 120
section-id: operations
keywords: scaling, sharding, read replicas, horizontal scaling, capacity planning, performance
description: Scaling NeuralDB horizontally with sharding, read replicas, and capacity planning
language: en
---
# Scaling
## Read Replicas
```python
# Assumes the NeuralDB client class has already been imported.
primary = NeuralDB("postgresql://neuraldb:pass@primary:5432/mydb")
replica = NeuralDB("postgresql://neuraldb:pass@replica:5432/mydb")

def search(query_vector):
    # Reads go to the replica, which offloads the primary.
    return replica.query("SELECT * FROM docs ORDER BY embedding <=> %s LIMIT 10", [query_vector])

def insert(content, embedding):
    # Writes always go to the primary.
    return primary.execute("INSERT INTO docs (content, embedding) VALUES (%s, %s)", [content, embedding])
```
| Topology | Approx. peak QPS (1536-dim, 10M vectors) |
|---------|-----------------------------------------|
| 1 primary | 8,000 |
| 1 primary + 2 replicas | 24,000 |
| 1 primary + 4 replicas | 48,000 |
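
With more than one replica, reads need to be spread across them. The following is a minimal sketch of round-robin routing, assuming client objects that expose the same `query` and `execute` methods as in the example above. `ReplicaRouter` is a hypothetical helper, not part of NeuralDB.

```python
import itertools


class ReplicaRouter:
    """Send writes to the primary and rotate reads across replicas.

    Hypothetical helper: it only assumes each client has the
    query(sql, params) and execute(sql, params) methods used above.
    """

    def __init__(self, primary, replicas):
        if not replicas:
            raise ValueError("at least one replica is required")
        self.primary = primary
        # itertools.cycle yields the replicas in order, forever.
        self._replicas = itertools.cycle(list(replicas))

    def search(self, query_vector):
        # Each call picks the next replica in the rotation.
        replica = next(self._replicas)
        return replica.query(
            "SELECT * FROM docs ORDER BY embedding <=> %s LIMIT 10",
            [query_vector],
        )

    def insert(self, content, embedding):
        # Writes are never routed to a replica.
        return self.primary.execute(
            "INSERT INTO docs (content, embedding) VALUES (%s, %s)",
            [content, embedding],
        )
```

Round-robin is the simplest policy. It ignores replica load and replication lag, so a read issued right after a write may not see that write yet.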
## Horizontal Sharding
```sql
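-- Initialize the cluster with 8 shards and a replication factor of 2.
SELECT neuraldb_cluster.init_cluster(shards => 8, replication_factor => 2);

-- Rows are distributed across shards by tenant_id.
CREATE TABLE documents (
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    tenant_id UUID NOT NULL,
    content TEXT,
    embedding VECTOR(1536)
) SHARD BY tenant_id;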
```
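
To build intuition for what sharding by `tenant_id` implies, here is an illustrative sketch of hash-based shard assignment. This is an assumption for illustration only: the page does not describe how NeuralDB actually maps keys to shards, and `shard_for` is a hypothetical function.

```python
import hashlib
import uuid

NUM_SHARDS = 8  # same shard count as the init_cluster call above


def shard_for(tenant_id: uuid.UUID, num_shards: int = NUM_SHARDS) -> int:
    """Map a tenant to a shard index in [0, num_shards).

    Illustrative only: a stable hash of the shard key, reduced modulo
    the shard count. NeuralDB's real placement logic may differ.
    """
    digest = hashlib.sha256(tenant_id.bytes).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    tenant = uuid.uuid4()
    print(f"tenant {tenant} -> shard {shard_for(tenant)}")
```

Under a scheme like this, all rows for one tenant land on the same shard, so a query filtered by `tenant_id` touches a single shard, while a query without that filter has to fan out to all of them.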
## Capacity Planning
```
Row data ≈ avg_row_bytes × num_rows × 1.3
Vector data ≈ dimensions × 4 bytes × num_vectors
HNSW graph ≈ vector_data × 1.3 (must fit in vector_buffer)
WAL ≈ daily_writes × retention_days
```
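
The formulas above can be turned into a small calculator. This is a sketch under stated assumptions: `daily_write_bytes` is taken to be the WAL volume generated per day in bytes, and the function and field names are hypothetical.

```python
from dataclasses import asdict, dataclass

GIB = 1024 ** 3
OVERHEAD = 1.3  # multiplier from the row-data and HNSW formulas above


@dataclass(frozen=True)
class CapacityEstimate:
    row_bytes: float
    vector_bytes: float
    hnsw_bytes: float  # should fit in vector_buffer
    wal_bytes: float


def estimate_capacity(avg_row_bytes, num_rows, dimensions, num_vectors,
                      daily_write_bytes, retention_days):
    """Apply the four sizing formulas; all results are in bytes."""
    vector_bytes = dimensions * 4 * num_vectors  # 4 bytes per float32 component
    return CapacityEstimate(
        row_bytes=avg_row_bytes * num_rows * OVERHEAD,
        vector_bytes=vector_bytes,
        hnsw_bytes=vector_bytes * OVERHEAD,
        wal_bytes=daily_write_bytes * retention_days,
    )


if __name__ == "__main__":
    # Example inputs (assumed): 10M rows of ~2 KB, 1536-dim vectors,
    # 5 GiB of WAL per day kept for 7 days.
    est = estimate_capacity(2000, 10_000_000, 1536, 10_000_000, 5 * GIB, 7)
    for name, value in asdict(est).items():
        print(f"{name}: {value / GIB:.1f} GiB")
```

The sketch reports each component separately rather than a single total, because the HNSW figure is the one to compare against `vector_buffer`, while the others are disk-sizing inputs.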
| Resource | Warning | Critical |
|---------|---------|----------|
| Connections | 80% of max | 95% of max |
| Storage | 70% full | 85% full |
| vector_buffer | 80% | 90% |
| Replication lag | 30s | 120s |
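
The thresholds above can be encoded as a simple check for a monitoring script. This is a minimal sketch: the resource keys and the `classify` function are hypothetical, utilization is expressed as a fraction between 0 and 1, and treating a value exactly at a threshold as breaching it is an assumption.

```python
# (warning, critical) pairs taken from the table above.
# Utilization values are fractions; replication lag is in seconds.
THRESHOLDS = {
    "connections": (0.80, 0.95),
    "storage": (0.70, 0.85),
    "vector_buffer": (0.80, 0.90),
    "replication_lag_s": (30, 120),
}


def classify(resource: str, value: float) -> str:
    """Return "ok", "warning", or "critical" for a measured value."""
    warning, critical = THRESHOLDS[resource]
    if value >= critical:  # assumption: thresholds are inclusive
        return "critical"
    if value >= warning:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # Example readings (made up for illustration).
    readings = {"connections": 0.82, "storage": 0.55, "replication_lag_s": 130}
    for resource, value in readings.items():
        print(f"{resource}: {classify(resource, value)}")
```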