diff --git a/neuraldb-docs/pages/install-cloud.md b/neuraldb-docs/pages/install-cloud.md new file mode 100644 index 0000000..9f7a884 --- /dev/null +++ b/neuraldb-docs/pages/install-cloud.md @@ -0,0 +1,130 @@ +--- +title: Cloud Managed +sort: 120 +section-id: installation +keywords: cloud, managed, NeuralDB Cloud, regions, tiers, SaaS +description: Setting up NeuralDB Cloud — the fully managed service with global regions and flexible tiers +language: en +--- + +# Cloud Managed + +NeuralDB Cloud is the fully managed version of NeuralDB. It handles provisioning, patching, backups, monitoring, and scaling — so you can focus on building your application rather than managing database infrastructure. + +## Getting Started + +### 1. Create an Account + +Sign up at [cloud.neuraldb.io](https://cloud.neuraldb.io). You can authenticate with Google, GitHub, or an email address. + +### 2. Create a Cluster + +Click **New Cluster** and configure: + +- **Region**: choose the cloud region closest to your application servers +- **Tier**: select based on your workload requirements (see tier comparison below) +- **Storage**: initial storage allocation (can be scaled later) +- **High Availability**: enable for production workloads + +### 3. Connect + +Once the cluster is provisioned (typically under 3 minutes), your connection string appears in the dashboard: + +``` +postgresql://neuraldb:[password]@[cluster-id].cloud.neuraldb.io:5432/[database]?sslmode=require +``` + +Use this with any PostgreSQL-compatible driver or psql: + +```bash +psql "postgresql://neuraldb:mypassword@abc123.cloud.neuraldb.io:5432/mydb?sslmode=require" +``` + +## Available Regions + +| Region | Cloud Provider | Availability | +|--------|---------------|-------------| +| us-east-1 (N. 
Virginia) | AWS | GA | +| us-west-2 (Oregon) | AWS | GA | +| eu-west-1 (Ireland) | AWS | GA | +| eu-central-1 (Frankfurt) | AWS | GA | +| ap-northeast-1 (Tokyo) | AWS | GA | +| ap-southeast-1 (Singapore) | AWS | GA | +| us-central1 (Iowa) | GCP | Beta | +| europe-west4 (Netherlands) | GCP | Beta | +| eastus (Virginia) | Azure | Beta | + +Multi-region replication is available on Business and Enterprise tiers. + +## Pricing Tiers + +### Starter + +Free tier for development and experimentation. + +| Resource | Limit | +|---------|-------| +| Storage | 5 GB | +| Vector dimensions | Up to 1536 | +| Max connections | 10 | +| PITR | No | +| HA | No | +| SLA | No | + +### Developer + +$29/month. + +| Resource | Limit | +|---------|-------| +| vCPU | 2 dedicated | +| RAM | 8 GB | +| Storage | 100 GB NVMe SSD | +| Connections | 100 | +| PITR | 7 days | +| HA | No | + +### Business + +$199/month. + +| Resource | Limit | +|---------|-------| +| vCPU | 8 dedicated | +| RAM | 32 GB | +| Storage | 500 GB NVMe SSD | +| Connections | 500 | +| PITR | 30 days | +| HA | Yes (1 standby) | +| Read replicas | Up to 3 | +| SLA | 99.95% | + +### Enterprise + +Custom pricing for mission-critical applications. + +## Connecting from Your Application + +### Connection Pooling + +NeuralDB Cloud includes PgBouncer-based connection pooling: + +``` +postgresql://neuraldb:[password]@[cluster-id]-pooler.cloud.neuraldb.io:5432/[database] +``` + +### SSL/TLS + +All connections require TLS. 
Download the cluster CA certificate from the dashboard: + +``` +sslmode=verify-full&sslrootcert=/path/to/ca.pem +``` + +## Branching + +Create instant copy-on-write clones of your production database: + +```bash +neuraldb-cloud branch create staging --from production +``` diff --git a/neuraldb-docs/pages/install-docker.md b/neuraldb-docs/pages/install-docker.md new file mode 100644 index 0000000..7444a30 --- /dev/null +++ b/neuraldb-docs/pages/install-docker.md @@ -0,0 +1,86 @@ +--- +title: Docker Install +sort: 100 +section-id: installation +keywords: Docker, install, docker run, docker-compose, volumes, container +description: Installing NeuralDB using Docker — single container and docker-compose setups +language: en +--- + +# Docker Install + +Docker is the fastest way to run NeuralDB locally or in a single-server deployment. + +## Quick Start + +```bash +docker run -d \ + --name neuraldb \ + -p 5432:5432 \ + -e NEURALDB_PASSWORD=mypassword \ + -e NEURALDB_DB=mydb \ + -v neuraldb_data:/var/lib/neuraldb/data \ + neuraldb/neuraldb:latest +``` + +Connect with psql: + +```bash +psql -h localhost -p 5432 -U neuraldb -d mydb +``` + +## Environment Variables + +| Variable | Default | Description | +|----------|---------|-------------| +| `NEURALDB_PASSWORD` | required | Password for the `neuraldb` superuser | +| `NEURALDB_USER` | `neuraldb` | Superuser username | +| `NEURALDB_DB` | `neuraldb` | Default database name | +| `NEURALDB_PORT` | `5432` | TCP port | +| `NEURALDB_SHARED_BUFFERS` | `256MB` | Row store page cache | +| `NEURALDB_VECTOR_BUFFER` | `512MB` | Vector index memory | + +## docker-compose Setup + +```yaml +version: '3.9' + +services: + neuraldb: + image: neuraldb/neuraldb:1.0 + container_name: neuraldb + restart: unless-stopped + ports: + - "127.0.0.1:5432:5432" + environment: + NEURALDB_PASSWORD: ${NEURALDB_PASSWORD} + NEURALDB_SHARED_BUFFERS: "4GB" + NEURALDB_VECTOR_BUFFER: "8GB" + volumes: + - neuraldb_data:/var/lib/neuraldb/data + healthcheck: + test: 
["CMD-SHELL", "pg_isready -U neuraldb"] + interval: 10s + timeout: 5s + retries: 5 + +volumes: + neuraldb_data: +``` + +```bash +echo "NEURALDB_PASSWORD=$(openssl rand -base64 32)" > .env +docker-compose up -d +``` + +## Upgrading + +```bash +docker pull neuraldb/neuraldb:1.1 +docker stop neuraldb && docker rm neuraldb +docker run -d --name neuraldb \ + -v neuraldb_data:/var/lib/neuraldb/data \ + -e NEURALDB_PASSWORD=mypassword \ + neuraldb/neuraldb:1.1 +docker exec neuraldb neuraldb-migrate +``` diff --git a/neuraldb-docs/pages/install-kubernetes.md b/neuraldb-docs/pages/install-kubernetes.md new file mode 100644 index 0000000..b824213 --- /dev/null +++ b/neuraldb-docs/pages/install-kubernetes.md @@ -0,0 +1,80 @@ +--- +title: Kubernetes +sort: 110 +section-id: installation +keywords: Kubernetes, Helm, StatefulSet, PVC, k8s, cluster, deployment +description: Deploying NeuralDB on Kubernetes using the official Helm chart and StatefulSets +language: en +--- + +# Kubernetes + +The recommended way to run NeuralDB on Kubernetes is via the official Helm chart. 
+ +## Installing the Helm Chart + +```bash +helm repo add neuraldb https://charts.neuraldb.io +helm repo update +kubectl create namespace neuraldb +helm install neuraldb neuraldb/neuraldb \ + --namespace neuraldb \ + --set auth.password=mysecretpassword \ + --set persistence.size=100Gi +``` + +## Chart Configuration + +```yaml +image: + repository: neuraldb/neuraldb + tag: "1.0" + +replicaCount: 1 +readReplicaCount: 2 + +resources: + requests: + cpu: "2" + memory: "8Gi" + limits: + cpu: "8" + memory: "32Gi" + +persistence: + enabled: true + storageClass: "fast-ssd" + size: 500Gi + +vectorBuffer: "16Gi" +sharedBuffers: "8Gi" +maxConnections: 200 + +ha: + enabled: true + replication: + mode: synchronous + +backup: + enabled: true + schedule: "0 2 * * *" + s3: + bucket: my-neuraldb-backups + region: us-east-1 +``` + +## Services + +| Service | Port | Description | +|---------|------|-------------| +| `neuraldb-primary` | 5432 | Primary — reads + writes | +| `neuraldb-replica` | 5432 | Read replicas — reads only | +| `neuraldb-headless` | 5432 | StatefulSet pod discovery | + +## Scaling Read Replicas + +```bash +helm upgrade neuraldb neuraldb/neuraldb \ + --namespace neuraldb \ + --set readReplicaCount=4 +``` diff --git a/neuraldb-docs/pages/install-local.md b/neuraldb-docs/pages/install-local.md new file mode 100644 index 0000000..7938623 --- /dev/null +++ b/neuraldb-docs/pages/install-local.md @@ -0,0 +1,65 @@ +--- +title: Local Development +sort: 130 +section-id: installation +keywords: local, development, binary, homebrew, winget, install, macOS, Linux, Windows +description: Installing NeuralDB locally for development using binaries, Homebrew, or winget +language: en +--- + +# Local Development + +## macOS + +```bash +brew tap neuraldb/tap +brew install neuraldb +brew services start neuraldb +``` + +## Linux + +### Ubuntu / Debian + +```bash +curl -fsSL https://packages.neuraldb.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/neuraldb-keyring.gpg +echo "deb 
[signed-by=/usr/share/keyrings/neuraldb-keyring.gpg] https://packages.neuraldb.io/apt stable main" \ + | sudo tee /etc/apt/sources.list.d/neuraldb.list +sudo apt update && sudo apt install -y neuraldb +sudo systemctl enable --now neuraldb +``` + +### RHEL / Fedora + +```bash +sudo rpm --import https://packages.neuraldb.io/gpg +sudo tee /etc/yum.repos.d/neuraldb.repo <<'EOF' +[neuraldb] +name=NeuralDB Repository +baseurl=https://packages.neuraldb.io/rpm/stable +enabled=1 +gpgcheck=1 +gpgkey=https://packages.neuraldb.io/gpg +EOF +sudo dnf install -y neuraldb +sudo systemctl enable --now neuraldb +``` + +## Windows + +```powershell +winget install NeuralDB.NeuralDB +``` + +## First-Time Setup + +```bash +neuraldb init +neuraldb start +neuraldb-cli +``` + +```sql +ALTER USER neuraldb PASSWORD 'your-new-password'; +CREATE DATABASE myapp; +``` diff --git a/neuraldb-docs/pages/nql-aggregations.md b/neuraldb-docs/pages/nql-aggregations.md new file mode 100644 index 0000000..1d2bc67 --- /dev/null +++ b/neuraldb-docs/pages/nql-aggregations.md @@ -0,0 +1,89 @@ +--- +title: Aggregations +sort: 130 +section-id: query-language +keywords: aggregations, GROUP BY, COUNT, SUM, vectors, AVG, centroid, analytics +description: Aggregating data in NQL including GROUP BY, COUNT, SUM, and vector-specific aggregation functions +language: en +--- + +# Aggregations + +NQL supports the full SQL aggregation toolkit, extended with vector-specific aggregate functions. 
+ +## Standard Aggregations + +```sql +SELECT category, COUNT(*) AS doc_count +FROM documents +GROUP BY category +ORDER BY doc_count DESC; + +SELECT category, AVG(price), MIN(price), MAX(price) +FROM products +WHERE available = true +GROUP BY category; +``` + +## Vector Aggregations + +### `AVG(embedding)` — Centroid + +```sql +SELECT AVG(embedding) AS centroid +FROM documents +WHERE category = 'technology'; +``` + +Find documents closest to the centroid: + +```sql +WITH centroid AS ( + SELECT AVG(embedding) AS c FROM documents WHERE category = 'technology' +) +SELECT id, title, 1 - (embedding <=> centroid.c) AS similarity +FROM documents, centroid +WHERE category = 'technology' +ORDER BY embedding <=> centroid.c +LIMIT 10; +``` + +### `vector_centroid(embedding, weight)` + +```sql +SELECT vector_centroid(embedding, rating) AS weighted_centroid +FROM products WHERE category = 'electronics'; +``` + +## GROUP BY with Vector Search + +```sql +SELECT DISTINCT ON (category) + id, category, title, 1 - (embedding <=> $1) AS similarity +FROM documents +ORDER BY category, embedding <=> $1; +``` + +## Window Functions + +```sql +SELECT id, title, category, + 1 - (embedding <=> $1) AS similarity, + RANK() OVER (PARTITION BY category ORDER BY embedding <=> $1) AS rank_in_category +FROM documents +WHERE 1 - (embedding <=> $1) > 0.5 +ORDER BY category, rank_in_category; +``` + +## Time-Series Semantic Analytics + +```sql +WITH weekly_centroids AS ( + SELECT date_trunc('week', created_at) AS week, AVG(embedding) AS centroid + FROM documents GROUP BY week +) +SELECT w1.week, 1 - (w1.centroid <=> w2.centroid) AS similarity_to_prev_week +FROM weekly_centroids w1 +LEFT JOIN weekly_centroids w2 ON w2.week = w1.week - INTERVAL '1 week' +ORDER BY w1.week; +``` diff --git a/neuraldb-docs/pages/nql-basics.md b/neuraldb-docs/pages/nql-basics.md new file mode 100644 index 0000000..830e91b --- /dev/null +++ b/neuraldb-docs/pages/nql-basics.md @@ -0,0 +1,89 @@ +--- +title: NQL Basics +sort: 
100 +section-id: query-language +keywords: NQL, NeuralDB Query Language, SQL, syntax, basics, queries +description: Introduction to NeuralDB Query Language (NQL) — syntax, data types, and basic operations +language: en +--- + +# NQL Basics + +NQL (NeuralDB Query Language) is a superset of standard SQL. Every valid SQL statement is also valid NQL. NQL adds extensions for vector operations, embedding generation, and semantic search primitives. + +## Connecting + +```bash +psql -h localhost -p 5432 -U neuraldb -d mydb +neuraldb-cli -h localhost +``` + +## Data Types + +### VECTOR(n) + +```sql +CREATE TABLE documents ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + content TEXT NOT NULL, + embedding VECTOR(1536) +); +``` + +### HALFVEC(n) and SPARSEVEC(n) + +```sql +embedding HALFVEC(1536) -- 16-bit, half the memory +bm25_vector SPARSEVEC(30000) -- sparse, non-zero elements only +``` + +## Basic CRUD + +```sql +CREATE TABLE products ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + name TEXT NOT NULL, category TEXT, price DECIMAL(10,2), + stock INTEGER DEFAULT 0, embedding VECTOR(1536), + created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW() +); + +INSERT INTO products (name, category, price, stock, embedding) +VALUES ('Wireless Headphones', 'electronics', 299.99, 150, '[0.023, -0.187, ...]'); + +SELECT id, name, price FROM products WHERE category = 'electronics'; + +UPDATE products SET price = 279.99, embedding = '[...]' WHERE id = $1; + +DELETE FROM products WHERE id = $1; +``` + +## Creating Vector Indexes + +```sql +CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops); +CREATE INDEX ON documents USING hnsw (embedding vector_l2_ops); +CREATE INDEX ON documents USING hnsw (embedding vector_ip_ops); +``` + +## Basic Vector Queries + +```sql +SELECT id, content, 1 - (embedding <=> $1) AS similarity +FROM documents +ORDER BY embedding <=> $1 +LIMIT 10; +``` + +| Operator | Metric | Index ops | +|----------|--------|----------| +| `<=>` | Cosine distance 
| `vector_cosine_ops` | +| `<->` | Euclidean (L2) | `vector_l2_ops` | +| `<#>` | Negative dot product | `vector_ip_ops` | + +## NQL Functions + +```sql +SELECT vector_dims(embedding) FROM documents LIMIT 1; -- returns 1536 +SELECT vector_norm(embedding) FROM documents LIMIT 5; +SELECT cosine_similarity(embedding, $1) AS similarity FROM documents ORDER BY similarity DESC LIMIT 10; +``` diff --git a/neuraldb-docs/pages/nql-hybrid.md b/neuraldb-docs/pages/nql-hybrid.md new file mode 100644 index 0000000..aeb8ca3 --- /dev/null +++ b/neuraldb-docs/pages/nql-hybrid.md @@ -0,0 +1,73 @@ +--- +title: Hybrid Queries +sort: 120 +section-id: query-language +keywords: hybrid queries, vector, relational, filters, combined, semantic search, metadata +description: Combining vector similarity and relational filters in NQL hybrid queries +language: en +--- + +# Hybrid Queries + +Hybrid queries combine vector similarity search with relational filter predicates in a single SQL statement. + +## Basic Hybrid Query + +```sql +SELECT id, name, price, 1 - (embedding <=> $1) AS similarity +FROM products +WHERE category = 'electronics' + AND stock > 0 + AND price < 500 +ORDER BY embedding <=> $1 +LIMIT 10; +``` + +## Query Planner Hints + +```sql +-- Force pre-filter +SELECT /*+ PREFILTER */ id, name, 1 - (embedding <=> $1) AS score +FROM products WHERE category = 'electronics' +ORDER BY score DESC LIMIT 10; + +-- Force post-filter +SELECT /*+ POSTFILTER */ id, name, 1 - (embedding <=> $1) AS score +FROM products WHERE price < 500 +ORDER BY embedding <=> $1 LIMIT 10; +``` + +## Hybrid Full-Text + Vector (BM25) + +```sql +WITH vector_search AS ( + SELECT id, ROW_NUMBER() OVER (ORDER BY embedding <=> $1) AS rank + FROM documents ORDER BY embedding <=> $1 LIMIT 100 +), +fts_search AS ( + SELECT id, ROW_NUMBER() OVER (ORDER BY ts_rank_cd(tsv, query) DESC) AS rank + FROM documents, to_tsquery('english', $2) query + WHERE tsv @@ query ORDER BY ts_rank_cd(tsv, query) DESC LIMIT 100 +), +rrf AS ( + 
SELECT COALESCE(v.id, f.id) AS id, + (COALESCE(1.0/(60+v.rank),0) + COALESCE(1.0/(60+f.rank),0)) AS rrf_score + FROM vector_search v FULL OUTER JOIN fts_search f ON v.id = f.id +) +SELECT d.id, d.content, rrf.rrf_score +FROM rrf JOIN documents d ON d.id = rrf.id +ORDER BY rrf_score DESC LIMIT 10; +``` + +## Composite Scoring + +```sql +SELECT id, name, price, rating, + (0.7 * (1 - (embedding <=> $1)) + + 0.2 * (rating / 5.0) + + 0.1 * (1 - EXTRACT(DAYS FROM NOW() - created_at) / 365.0) + ) AS composite_score +FROM products +WHERE available = true AND price < $2 +ORDER BY composite_score DESC LIMIT 20; +``` diff --git a/neuraldb-docs/pages/nql-transactions.md b/neuraldb-docs/pages/nql-transactions.md new file mode 100644 index 0000000..a5a75b3 --- /dev/null +++ b/neuraldb-docs/pages/nql-transactions.md @@ -0,0 +1,68 @@ +--- +title: Transactions +sort: 140 +section-id: query-language +keywords: transactions, ACID, isolation levels, MVCC, BEGIN, COMMIT, ROLLBACK +description: ACID transactions in NeuralDB — isolation levels, MVCC, savepoints, and advisory locks +language: en +--- + +# Transactions + +NeuralDB provides full ACID transactions with MVCC. Unlike most vector databases, NeuralDB guarantees atomicity across both relational and vector data. 
+ +## Basic Transaction Syntax + +```sql +BEGIN; +INSERT INTO documents (content, embedding) VALUES ($1, $2); +UPDATE document_stats SET total_count = total_count + 1; +COMMIT; +``` + +## Isolation Levels + +```sql +BEGIN; +SET TRANSACTION ISOLATION LEVEL READ COMMITTED; +-- each statement sees only rows committed before it +COMMIT; + +BEGIN ISOLATION LEVEL REPEATABLE READ; +-- reads are stable throughout the transaction +COMMIT; + +BEGIN ISOLATION LEVEL SERIALIZABLE; +-- may raise: ERROR: could not serialize access +COMMIT; +``` + +## Savepoints + +```sql +BEGIN; +INSERT INTO documents (content, embedding) VALUES ($1, $2); +SAVEPOINT after_insert; +UPDATE document_stats SET count = count + 1 WHERE id = $3; +ROLLBACK TO SAVEPOINT after_insert; +UPDATE document_stats SET count = count + 1 WHERE id = $4; +COMMIT; +``` + +## Vector Transactions + +```sql +BEGIN; +INSERT INTO documents (id, content, embedding) VALUES ($1, $2, $3); +-- If ROLLBACK, neither row nor index entry exists +ROLLBACK; +``` + +## Advisory Locks + +```sql +SELECT pg_advisory_lock(42); +SELECT pg_try_advisory_lock(42); -- returns boolean +SELECT pg_advisory_unlock(42); +SELECT pg_advisory_xact_lock(42); -- auto-released at commit/rollback +``` diff --git a/neuraldb-docs/pages/nql-vectors.md b/neuraldb-docs/pages/nql-vectors.md new file mode 100644 index 0000000..d9790a1 --- /dev/null +++ b/neuraldb-docs/pages/nql-vectors.md @@ -0,0 +1,79 @@ +--- +title: Vector Queries +sort: 110 +section-id: query-language +keywords: vector queries, NEAREST, SIMILAR, cosine, dot product, euclidean, ANN +description: Writing vector similarity queries in NQL — NEAREST, SIMILAR, distance operators, and recall tuning +language: en +--- + +# Vector Queries + +## Distance Operators + +```sql +embedding <=> query_vector -- cosine distance +embedding <-> query_vector -- euclidean (L2) +embedding <#> query_vector -- negative dot product +``` + +Always pair `ORDER BY` with `LIMIT` to use the HNSW index: + +```sql +SELECT 
id, content FROM documents +ORDER BY embedding <=> '[0.1, 0.2, ...]' +LIMIT 10; +``` + +## NEAREST Clause + +```sql +SELECT id, content, score +FROM documents +NEAREST TO embedding = '[0.1, 0.2, ...]' USING COSINE +TOP 10; +``` + +## SIMILAR Clause + +```sql +SELECT id, content, score +FROM documents +SIMILAR TO embedding = $1 USING COSINE THRESHOLD 0.75 +LIMIT 100; +``` + +## Recall Tuning + +```sql +SET hnsw.ef_search = 200; -- higher = better recall, slower +``` + +| ef_search | Recall@10 | p50 latency | QPS | +|-----------|-----------|-------------|-----| +| 20 | 89% | 0.7ms | 12,000 | +| 40 | 95% | 1.2ms | 8,400 | +| 80 | 98% | 2.1ms | 4,800 | +| 200 | 99.5% | 4.8ms | 2,100 | + +## Exact Search + +```sql +SET neuraldb.vector_scan = 'exact'; +SELECT * FROM documents ORDER BY embedding <=> $1 LIMIT 10; +RESET neuraldb.vector_scan; +``` + +## Multi-Vector Queries + +```sql +WITH queries AS ( + SELECT UNNEST(ARRAY['[...]'::VECTOR(1536), '[...]'::VECTOR(1536)]) AS qv +), +ranked AS ( + SELECT d.id, d.content, MIN(d.embedding <=> q.qv) AS best_distance + FROM documents d, queries q + GROUP BY d.id, d.content +) +SELECT * FROM ranked ORDER BY best_distance LIMIT 20; +``` diff --git a/neuraldb-docs/pages/ops-backup.md b/neuraldb-docs/pages/ops-backup.md new file mode 100644 index 0000000..f2d4271 --- /dev/null +++ b/neuraldb-docs/pages/ops-backup.md @@ -0,0 +1,70 @@ +--- +title: Backup & Restore +sort: 110 +section-id: operations +keywords: backup, restore, snapshot, WAL archiving, PITR, point-in-time recovery +description: Backup and restore strategies for NeuralDB — snapshots, WAL archiving, and point-in-time recovery +language: en +--- + +# Backup & Restore + +## Physical Snapshot + +```bash +pg_basebackup \ + --host=localhost --port=5432 --username=backup_user \ + --pgdata=/backups/neuraldb/$(date +%Y%m%d) \ + --wal-method=stream --checkpoint=fast --compress=lz4 --progress +``` + +## WAL Archiving + +```ini +wal_level = replica +archive_mode = on +archive_command 
= 'aws s3 cp %p s3://my-backups/neuraldb/wal/%f' +archive_timeout = 60 +``` + +Verify: + +```sql +SELECT last_archived_wal, last_archived_time, archived_count, failed_count +FROM pg_stat_archiver; +``` + +## pgBackRest + +```bash +sudo apt install pgbackrest +# Full backup +sudo -u postgres pgbackrest --stanza=neuraldb backup --type=full +# Differential +sudo -u postgres pgbackrest --stanza=neuraldb backup --type=diff +``` + +Cron schedule: + +```cron +0 1 * * 0 postgres pgbackrest --stanza=neuraldb backup --type=full +0 1 * * 1-6 postgres pgbackrest --stanza=neuraldb backup --type=diff +``` + +## Point-in-Time Recovery + +```bash +systemctl stop neuraldb +pgbackrest --stanza=neuraldb restore \ + --target="2026-05-15 14:30:00+00" \ + --target-action=promote --delta +systemctl start neuraldb +``` + +## Logical Backup + +```bash +pg_dump -h localhost -U neuraldb mydb | lz4 | \ + aws s3 cp - s3://my-backups/neuraldb/logical-$(date +%Y%m%d).sql.lz4 +pg_dump -Fc -h localhost -U neuraldb mydb > mydb-$(date +%Y%m%d).dump +``` diff --git a/neuraldb-docs/pages/ops-migration.md b/neuraldb-docs/pages/ops-migration.md new file mode 100644 index 0000000..0b06a70 --- /dev/null +++ b/neuraldb-docs/pages/ops-migration.md @@ -0,0 +1,67 @@ +--- +title: Migration +sort: 130 +section-id: operations +keywords: migration, import, Postgres, Pinecone, Weaviate, data migration, ETL +description: Migrating data to NeuralDB from PostgreSQL, Pinecone, Weaviate, and other sources +language: en +--- + +# Migration + +## From PostgreSQL + +```bash +pg_dump -h source-host -U source-user -d source-db --format=custom > source-backup.dump +psql -h neuraldb-host -U neuraldb -c "CREATE DATABASE myapp;" +pg_restore -h neuraldb-host -U neuraldb -d myapp --jobs=8 --no-owner source-backup.dump +``` + +Add vector columns post-migration: + +```sql +ALTER TABLE documents ADD COLUMN embedding VECTOR(1536); +CREATE INDEX CONCURRENTLY documents_embedding_idx + ON documents USING hnsw (embedding 
vector_cosine_ops); +``` + +## From PostgreSQL + pgvector + +```bash +pg_dump -h source-host -U source-user -d source-db --format=custom \ + --exclude-extension=vector > pgvector-backup.dump +pg_restore -h neuraldb-host -U neuraldb -d myapp --jobs=8 pgvector-backup.dump +``` + +## From Pinecone + +```python +import pinecone +from neuraldb import NeuralDB, BulkIngestor + +pc = pinecone.Pinecone(api_key=os.environ["PINECONE_API_KEY"]) +index = pc.Index("my-index") +client = NeuralDB(os.environ["NEURALDB_URL"]) + +client.execute(""" + CREATE TABLE IF NOT EXISTS pinecone_migration ( + id TEXT PRIMARY KEY, embedding VECTOR(1536), metadata JSONB, + migrated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW() + ) +""") + +ingestor = BulkIngestor(client, table="pinecone_migration", batch_size=500) +with ingestor as ing: + for ids_batch in paginate_pinecone_ids(index, batch_size=1000): + fetch_response = index.fetch(ids=ids_batch) + for vector_id, vector_data in fetch_response.vectors.items(): + ing.add({"id": vector_id, "embedding": vector_data.values, "metadata": vector_data.metadata or {}}) +``` + +## Verifying Migration + +```sql +SELECT COUNT(*) FROM documents; +SELECT COUNT(*) FROM documents WHERE embedding IS NULL; +SELECT index_name, hnsw_in_memory, estimated_recall FROM neuraldb_stat_vector_indexes; +``` diff --git a/neuraldb-docs/pages/ops-monitoring.md b/neuraldb-docs/pages/ops-monitoring.md new file mode 100644 index 0000000..838c470 --- /dev/null +++ b/neuraldb-docs/pages/ops-monitoring.md @@ -0,0 +1,65 @@ +--- +title: Monitoring +sort: 100 +section-id: operations +keywords: monitoring, Prometheus, Grafana, metrics, alerts, observability, dashboards +description: Monitoring NeuralDB with Prometheus metrics, Grafana dashboards, and alert configuration +language: en +--- + +# Monitoring + +## Prometheus Metrics + +Enable the metrics exporter: + +```ini +metrics.enabled = true +metrics.port = 9187 +metrics.path = /metrics +``` + +Key metrics: + +| Metric | Type | 
Description | +|--------|------|-------------| +| `neuraldb_connections_total` | Gauge | Current connections by state | +| `neuraldb_query_duration_seconds` | Histogram | Query duration percentiles | +| `neuraldb_vector_queries_total` | Counter | Vector similarity queries by index | +| `neuraldb_hnsw_index_size_bytes` | Gauge | In-memory size of HNSW graphs | +| `neuraldb_replication_lag_seconds` | Gauge | Time lag per replica | +| `neuraldb_database_size_bytes` | Gauge | Total database size | + +## Grafana Dashboard + +Import official dashboard ID **18921** from Grafana.com. + +## Alerting Rules + +```yaml +groups: + - name: neuraldb + rules: + - alert: NeuralDBConnectionsHigh + expr: neuraldb_connections_total{state="active"} / neuraldb_connections_max > 0.85 + for: 2m + labels: { severity: warning } + - alert: NeuralDBReplicationLagHigh + expr: neuraldb_replication_lag_seconds > 30 + for: 1m + labels: { severity: warning } + - alert: NeuralDBVectorBufferExhausted + expr: neuraldb_hnsw_index_size_bytes > (neuraldb_vector_buffer_size_bytes * 0.90) + for: 5m + labels: { severity: warning } +``` + +## Built-In Query Statistics + +```sql +SELECT query, calls, round(mean_exec_time::numeric, 2) AS avg_ms +FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 10; + +SELECT sum(blks_hit) * 100.0 / sum(blks_hit + blks_read) AS cache_hit_ratio +FROM pg_stat_database WHERE datname != 'template0'; +``` diff --git a/neuraldb-docs/pages/ops-scaling.md b/neuraldb-docs/pages/ops-scaling.md new file mode 100644 index 0000000..c2d4e06 --- /dev/null +++ b/neuraldb-docs/pages/ops-scaling.md @@ -0,0 +1,58 @@ +--- +title: Scaling +sort: 120 +section-id: operations +keywords: scaling, sharding, read replicas, horizontal scaling, capacity planning, performance +description: Scaling NeuralDB horizontally with sharding, read replicas, and capacity planning +language: en +--- + +# Scaling + +## Read Replicas + +```python +primary = 
NeuralDB("postgresql://neuraldb:pass@primary:5432/mydb") +replica = NeuralDB("postgresql://neuraldb:pass@replica:5432/mydb") + +def search(query_vector): + return replica.query("SELECT * FROM docs ORDER BY embedding <=> %s LIMIT 10", [query_vector]) + +def insert(content, embedding): + return primary.execute("INSERT INTO docs (content, embedding) VALUES (%s, %s)", [content, embedding]) +``` + +| Replicas | Approx peak QPS (1536-dim, 10M vectors) | +|---------|-----------------------------------------| +| 1 primary | 8,000 | +| 1 primary + 2 replicas | 24,000 | +| 1 primary + 4 replicas | 48,000 | + +## Horizontal Sharding + +```sql +SELECT neuraldb_cluster.init_cluster(shards => 8, replication_factor => 2); + +CREATE TABLE documents ( + id UUID NOT NULL DEFAULT gen_random_uuid(), + tenant_id UUID NOT NULL, + content TEXT, + embedding VECTOR(1536) +) SHARD BY tenant_id; +``` + +## Capacity Planning + +``` +Row data ≈ avg_row_bytes × num_rows × 1.3 +Vector data ≈ dimensions × 4 bytes × num_vectors +HNSW graph ≈ vector_data × 1.3 (must fit in vector_buffer) +WAL ≈ daily_writes × retention_days +``` + +| Resource | Warning | Critical | +|---------|---------|----------| +| Connections | 80% of max | 95% of max | +| Storage | 70% full | 85% full | +| vector_buffer | 80% | 90% | +| Replication lag | 30s | 120s | diff --git a/neuraldb-docs/pages/ops-troubleshooting.md b/neuraldb-docs/pages/ops-troubleshooting.md new file mode 100644 index 0000000..7bdc4ce --- /dev/null +++ b/neuraldb-docs/pages/ops-troubleshooting.md @@ -0,0 +1,71 @@ +--- +title: Troubleshooting +sort: 140 +section-id: operations +keywords: troubleshooting, errors, diagnostics, FAQ, common problems, debug +description: Common NeuralDB errors, diagnostic techniques, and frequently asked questions +language: en +--- + +# Troubleshooting + +## Connection Issues + +### `FATAL: password authentication failed` + +```bash +sudo -u neuraldb neuraldb-cli +``` +```sql +ALTER USER neuraldb PASSWORD 'new-password'; 
+``` + +### `could not connect to server: Connection refused` + +```bash +systemctl status neuraldb +ss -tlnp | grep 5432 +journalctl -u neuraldb -n 50 +``` + +### Connection slots exhausted + +```sql +SELECT count(*), state FROM pg_stat_activity GROUP BY state; +SELECT pg_terminate_backend(pid) FROM pg_stat_activity +WHERE state = 'idle' AND state_change < NOW() - INTERVAL '10 minutes'; +``` + +## Vector Query Issues + +### Slow Vector Searches + +```sql +EXPLAIN (ANALYZE, BUFFERS) +SELECT id FROM documents ORDER BY embedding <=> '[...]' LIMIT 10; +``` + +Common causes: missing LIMIT, HNSW graph not in memory, ef_search too low. + +```sql +SELECT * FROM neuraldb_stat_vector_indexes; -- check hnsw_in_memory +SET enable_seqscan = off; -- force index for debugging +``` + +### Low Recall + +```sql +SET hnsw.ef_search = 200; +SET neuraldb.vector_scan = 'exact'; -- compare against exact search +``` + +## FAQ + +**Q: Can I use NeuralDB as a drop-in for PostgreSQL?** +Yes. NeuralDB implements the PostgreSQL wire protocol. + +**Q: What should `vector_buffer` be set to?** +`SELECT SUM(hnsw_graph_size_bytes) FROM neuraldb_stat_vector_indexes` — set `vector_buffer` at least this large. + +**Q: Is NeuralDB compatible with pgvector?** +Yes. All pgvector types (`VECTOR`, `HALFVEC`, `SPARSEVEC`) and operators (`<=>`, `<->`, `<#>`) work without modification. diff --git a/neuraldb-docs/pages/sdk-go.md b/neuraldb-docs/pages/sdk-go.md new file mode 100644 index 0000000..f07c236 --- /dev/null +++ b/neuraldb-docs/pages/sdk-go.md @@ -0,0 +1,85 @@ +--- +title: Go SDK +sort: 120 +section-id: client-sdks +keywords: Go, Golang, SDK, client, connection pool, query builder, pgx +description: The NeuralDB Go SDK — installation, connection pooling, and vector query builder +language: en +--- + +# Go SDK + +Built on `pgx`, the high-performance PostgreSQL driver for Go. + +## Installation + +```bash +go get github.com/neuraldb/neuraldb-go +``` + +Requires Go 1.21+. 
+ +## Connecting + +```go +client, err := neuraldb.Connect(ctx, "postgresql://neuraldb:password@localhost:5432/mydb") +if err != nil { log.Fatal(err) } +defer client.Close(ctx) +``` + +### Connection Pool + +```go +config, _ := pgxpool.ParseConfig(os.Getenv("NEURALDB_URL")) +config.MaxConns = 20 +config.MinConns = 5 +pool, _ := neuraldb.NewPool(ctx, config) +``` + +## Working with Vectors + +```go +v := types.NewVector([]float32{0.023, -0.187, 0.412}) + +func InsertDocument(ctx context.Context, pool *neuraldb.Pool, doc Document) error { + _, err := pool.Exec(ctx, + `INSERT INTO documents (id, content, embedding) VALUES ($1, $2, $3)`, + doc.ID, doc.Content, doc.Embedding, + ) + return err +} + +func SemanticSearch(ctx context.Context, pool *neuraldb.Pool, queryEmbedding []float32, limit int) ([]SearchResult, error) { + qv := types.NewVector(queryEmbedding) + rows, err := pool.Query(ctx, ` + SELECT id, content, 1 - (embedding <=> $1) AS similarity + FROM documents WHERE embedding IS NOT NULL + ORDER BY embedding <=> $1 LIMIT $2 + `, qv, limit) + if err != nil { return nil, err } + defer rows.Close() + var results []SearchResult + for rows.Next() { + var r SearchResult + if err := rows.Scan(&r.ID, &r.Content, &r.Similarity); err != nil { return nil, err } + results = append(results, r) + } + return results, rows.Err() +} +``` + +## Transactions + +```go +pool.BeginTxFunc(ctx, pgx.TxOptions{}, func(tx pgx.Tx) error { + for _, doc := range docs { + _, err := tx.Exec(ctx, + `INSERT INTO documents (content, embedding) VALUES ($1, $2)`, + doc.Content, doc.Embedding, + ) + if err != nil { return err } + } + _, err := tx.Exec(ctx, `UPDATE stats SET doc_count = doc_count + $1`, len(docs)) + return err +}) +``` diff --git a/neuraldb-docs/pages/sdk-javascript.md b/neuraldb-docs/pages/sdk-javascript.md new file mode 100644 index 0000000..b698ebb --- /dev/null +++ b/neuraldb-docs/pages/sdk-javascript.md @@ -0,0 +1,80 @@ +--- +title: JavaScript SDK +sort: 110 +section-id: client-sdks +keywords: JavaScript, TypeScript, 
SDK, Node.js, browser, npm, client +description: The NeuralDB JavaScript/TypeScript SDK for Node.js and browser environments +language: en +--- + +# JavaScript SDK + +## Installation + +```bash +npm install @neuraldb/client +``` + +## Basic Setup + +```typescript +import { NeuralDB } from '@neuraldb/client'; + +const client = new NeuralDB({ + connectionString: process.env.NEURALDB_URL!, + ssl: { rejectUnauthorized: true }, +}); +await client.connect(); +``` + +### Connection Pool + +```typescript +import { NeuralDBPool } from '@neuraldb/client'; + +const pool = new NeuralDBPool({ + connectionString: process.env.NEURALDB_URL!, + max: 20, + idleTimeoutMillis: 30000, +}); +``` + +## Vector Operations + +```typescript +import { toVector } from '@neuraldb/client'; + +await client.query( + 'INSERT INTO documents (content, embedding) VALUES ($1, $2)', + ['My document content', toVector([0.023, -0.187, 0.412])] +); + +async function semanticSearch(query: string, limit = 10) { + const embeddingResponse = await openai.embeddings.create({ + model: 'text-embedding-3-small', input: query, + }); + const queryVector = embeddingResponse.data[0].embedding; + const { rows } = await client.query<{ id: string; content: string; similarity: number }>( + `SELECT id, content, 1 - (embedding <=> $1) AS similarity + FROM documents WHERE embedding IS NOT NULL + ORDER BY embedding <=> $1 LIMIT $2`, + [toVector(queryVector), limit] + ); + return rows; +} +``` + +## High-Level Document API + +```typescript +import { DocumentStore } from '@neuraldb/client'; + +const store = new DocumentStore(client, { + table: 'documents', + embeddingColumn: 'embedding', + embeddingModel: { provider: 'openai', model: 'text-embedding-3-small', apiKey: process.env.OPENAI_API_KEY! 
}, +}); + +await store.add([{ content: 'First document', metadata: { source: 'web' } }]); +const results = await store.search('query text', { limit: 10, filter: { source: 'web' } }); +``` diff --git a/neuraldb-docs/pages/sdk-python.md b/neuraldb-docs/pages/sdk-python.md new file mode 100644 index 0000000..709c2f4 --- /dev/null +++ b/neuraldb-docs/pages/sdk-python.md @@ -0,0 +1,90 @@ +--- +title: Python SDK +sort: 100 +section-id: client-sdks +keywords: Python, SDK, client, connection, CRUD, vector operations, psycopg +description: Installing and using the NeuralDB Python SDK — connection, CRUD, and vector operations +language: en +--- + +# Python SDK + +Built on `psycopg3` with NeuralDB-specific helpers for vector operations and batch ingestion. + +## Installation + +```bash +pip install neuraldb +pip install neuraldb[asyncio] # async support +``` + +## Connecting + +```python +from neuraldb import NeuralDB + +client = NeuralDB("postgresql://neuraldb:password@localhost:5432/mydb") + +# Async +from neuraldb import AsyncNeuralDB +async with AsyncNeuralDB("postgresql://...") as client: + result = await client.query("SELECT 1") + +# Pool +from neuraldb import NeuralDBPool +pool = NeuralDBPool("postgresql://...", min_size=5, max_size=20) +with pool.acquire() as client: + result = client.query("SELECT COUNT(*) FROM documents") +``` + +## CRUD Operations + +```python +from neuraldb import Vector + +client.execute( + "INSERT INTO documents (content, source, embedding) VALUES (%s, %s, %s)", + ("My document content", "web-scraper", Vector([0.023, -0.187, 0.412])) +) + +rows = client.query("SELECT id, content FROM documents WHERE source = %s", ("web-scraper",)) +for row in rows: + print(row["id"], row["content"]) + +client.execute("UPDATE documents SET content = %s, embedding = %s WHERE id = %s", + ("Updated", Vector(new_embedding), doc_id)) +client.execute("DELETE FROM documents WHERE id = %s", (doc_id,)) +``` + +## Vector Search + +```python +results = client.query(""" + 
SELECT id, content, 1 - (embedding <=> %s) AS similarity + FROM documents WHERE embedding IS NOT NULL + ORDER BY embedding <=> %s LIMIT 10 +""", (Vector(query_embedding), Vector(query_embedding))) +``` + +## Transactions + +```python +with client.transaction(): + client.execute("INSERT INTO documents (content, embedding) VALUES (%s, %s)", (content, Vector(embedding))) + client.execute("UPDATE stats SET count = count + 1") +``` + +## Bulk Ingestion + +```python +from neuraldb import BulkIngestor + +ingestor = BulkIngestor(client, table="documents", + columns=["content", "source", "embedding"], batch_size=1000, + embedding_model="openai/text-embedding-3-small", embedding_column="embedding", text_column="content") + +with ingestor as ing: + for doc in docs: + ing.add(doc) +print(f"Ingested {ingestor.total_inserted} documents") +``` diff --git a/neuraldb-docs/pages/sdk-rest.md b/neuraldb-docs/pages/sdk-rest.md new file mode 100644 index 0000000..38109e2 --- /dev/null +++ b/neuraldb-docs/pages/sdk-rest.md @@ -0,0 +1,85 @@ +--- +title: REST API +sort: 130 +section-id: client-sdks +keywords: REST API, HTTP, endpoints, authentication, JSON, API +description: NeuralDB REST API reference — all endpoints, authentication headers, and response formats +language: en +--- + +# REST API + +## Base URL + +``` +https://your-neuraldb-host:8080/api/v1 +``` + +## Authentication + +``` +Authorization: Bearer ndb_live_your_api_key_here +``` + +## Query Endpoint + +```http +POST /api/v1/query +Content-Type: application/json +Authorization: Bearer ndb_live_... 
+
+{
+  "query": "SELECT id, content, 1 - (embedding <=> $1) AS similarity FROM documents ORDER BY embedding <=> $1 LIMIT 5",
+  "params": [[0.023, -0.187, 0.412]],
+  "database": "mydb"
+}
+```
+
+Response:
+
+```json
+{
+  "rows": [{"id": "uuid-1", "content": "First document", "similarity": 0.923}],
+  "rowCount": 1,
+  "executionTimeMs": 3.2
+}
+```
+
+## Document Endpoints
+
+### Insert
+
+```http
+POST /api/v1/collections/my_docs/documents
+
+{"documents": [{"content": "NeuralDB is an AI-native database", "metadata": {"source": "blog"}}],
+ "embedding_model": "openai/text-embedding-3-small"}
+```
+
+### Search
+
+```http
+POST /api/v1/collections/my_docs/search
+
+{"query": "AI-native database", "limit": 10, "min_similarity": 0.7,
+ "filters": {"category": "technology"}, "embedding_model": "openai/text-embedding-3-small"}
+```
+
+## Error Codes
+
+| HTTP Status | Error Code | Description |
+|-------------|-----------|-------------|
+| 400 | `QUERY_ERROR` | Invalid NQL query |
+| 401 | `UNAUTHORIZED` | Missing or invalid API key |
+| 403 | `FORBIDDEN` | Insufficient role permissions |
+| 404 | `NOT_FOUND` | Document or collection not found |
+| 429 | `RATE_LIMITED` | Too many requests |
+| 500 | `INTERNAL_ERROR` | Server error |
+
+## Rate Limits
+
+| Plan | Queries/min | Documents/min |
+|------|------------|---------------|
+| Starter | 30 | 100 |
+| Developer | 300 | 1,000 |
+| Business | 3,000 | 10,000 |
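
The per-plan limits and the `429` / `RATE_LIMITED` error above suggest pacing requests and retrying on the client side. The following is a minimal Python sketch under stated assumptions: `send` is a hypothetical caller-supplied function that performs one HTTP request and returns `(status_code, parsed_body)`; `min_interval_seconds` and `call_with_backoff` are illustrative names, not part of any NeuralDB SDK; and the backoff parameters are arbitrary defaults, not documented guidance.

```python
import random
import time
from typing import Callable, Tuple

# Queries-per-minute limits copied from the Rate Limits table.
QUERY_LIMITS_PER_MIN = {"Starter": 30, "Developer": 300, "Business": 3000}


def min_interval_seconds(plan: str) -> float:
    """Smallest gap between queries that stays within the plan's per-minute limit."""
    return 60.0 / QUERY_LIMITS_PER_MIN[plan]


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns a status other than 429 or attempts run out.

    `send` is a hypothetical function supplied by the caller; it is assumed to
    perform one request and return (status_code, parsed_json_body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = 0, {}
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay * 2**attempt, plus up to 10% jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    # Still rate limited after the final attempt; hand the 429 back to the caller.
    return status, body
```

With the table's numbers, `min_interval_seconds("Starter")` works out to one query every 2 seconds. Passing `sleep` as a parameter keeps the helper testable without real waiting.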