Add neuraldb-docs site files (batch 2: install, nql, ops, sdk pages)

Kristian Benestad 2026-05-20 12:29:55 +07:00
parent 3058dbee3e
commit 941fdf6252
18 changed files with 1430 additions and 0 deletions


@ -0,0 +1,130 @@
---
title: Cloud Managed
sort: 120
section-id: installation
keywords: cloud, managed, NeuralDB Cloud, regions, tiers, SaaS
description: Setting up NeuralDB Cloud — the fully managed service with global regions and flexible tiers
language: en
---
# Cloud Managed
NeuralDB Cloud is the fully managed version of NeuralDB. It handles provisioning, patching, backups, monitoring, and scaling — so you can focus on building your application rather than managing database infrastructure.
## Getting Started
### 1. Create an Account
Sign up at [cloud.neuraldb.io](https://cloud.neuraldb.io). You can authenticate with Google, GitHub, or an email address.
### 2. Create a Cluster
Click **New Cluster** and configure:
- **Region**: choose the cloud region closest to your application servers
- **Tier**: select based on your workload requirements (see tier comparison below)
- **Storage**: initial storage allocation (can be scaled later)
- **High Availability**: enable for production workloads
### 3. Connect
Once the cluster is provisioned (typically under 3 minutes), your connection string appears in the dashboard:
```
postgresql://neuraldb:[password]@[cluster-id].cloud.neuraldb.io:5432/[database]?sslmode=require
```
Use this with any PostgreSQL-compatible driver or psql:
```bash
psql "postgresql://neuraldb:mypassword@abc123.cloud.neuraldb.io:5432/mydb?sslmode=require"
```
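Passwords that contain characters such as `@`, `/`, or `:` have to be percent-encoded before they are placed in the URL. A minimal Python sketch, assuming the URL shape shown above; `build_dsn` is a hypothetical helper name and not part of any NeuralDB SDK:

```python
from urllib.parse import quote


def build_dsn(password: str, cluster_id: str, database: str,
              user: str = "neuraldb", port: int = 5432) -> str:
    """Assemble a connection URL in the shape shown in the dashboard example."""
    # quote(..., safe="") percent-encodes reserved characters such as '@' and '/'.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{cluster_id}.cloud.neuraldb.io:{port}/{quote(database, safe='')}"
        "?sslmode=require"
    )


# Hypothetical credentials, for illustration only.
dsn = build_dsn("p@ss/word", "abc123", "mydb")
```

The resulting string can be passed to `psql` or to any PostgreSQL-compatible driver in place of a hand-written URL.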
## Available Regions
| Region | Cloud Provider | Availability |
|--------|---------------|-------------|
| us-east-1 (N. Virginia) | AWS | GA |
| us-west-2 (Oregon) | AWS | GA |
| eu-west-1 (Ireland) | AWS | GA |
| eu-central-1 (Frankfurt) | AWS | GA |
| ap-northeast-1 (Tokyo) | AWS | GA |
| ap-southeast-1 (Singapore) | AWS | GA |
| us-central1 (Iowa) | GCP | Beta |
| europe-west4 (Netherlands) | GCP | Beta |
| eastus (Virginia) | Azure | Beta |

Multi-region replication is available on Business and Enterprise tiers.
## Pricing Tiers
### Starter
Free tier for development and experimentation.
| Resource | Limit |
|---------|-------|
| Storage | 5 GB |
| Vector dimensions | Up to 1536 |
| Max connections | 10 |
| PITR | No |
| HA | No |
| SLA | No |
### Developer
$29/month.
| Resource | Limit |
|---------|-------|
| vCPU | 2 dedicated |
| RAM | 8 GB |
| Storage | 100 GB NVMe SSD |
| Connections | 100 |
| PITR | 7 days |
| HA | No |
### Business
$199/month.
| Resource | Limit |
|---------|-------|
| vCPU | 8 dedicated |
| RAM | 32 GB |
| Storage | 500 GB NVMe SSD |
| Connections | 500 |
| PITR | 30 days |
| HA | Yes (1 standby) |
| Read replicas | Up to 3 |
| SLA | 99.95% |
### Enterprise
Custom pricing for mission-critical applications.
## Connecting from Your Application
### Connection Pooling
NeuralDB Cloud includes PgBouncer-based connection pooling:
```
postgresql://neuraldb:[password]@[cluster-id]-pooler.cloud.neuraldb.io:5432/[database]
```
### SSL/TLS
All connections require TLS. Download the cluster CA certificate from the dashboard:
```
sslmode=verify-full&sslrootcert=/path/to/ca.pem
```
## Branching
Create instant copy-on-write clones of your production database:
```bash
neuraldb-cloud branch create staging --from production
```


@ -0,0 +1,86 @@
---
title: Docker Install
sort: 100
section-id: installation
keywords: Docker, install, docker run, docker-compose, volumes, container
description: Installing NeuralDB using Docker — single container and docker-compose setups
language: en
---
# Docker Install
Docker is the fastest way to run NeuralDB locally or in a single-server deployment.
## Quick Start
```bash
docker run -d \
--name neuraldb \
-p 5432:5432 \
-e NEURALDB_PASSWORD=mypassword \
-e NEURALDB_DB=mydb \
-v neuraldb_data:/var/lib/neuraldb/data \
neuraldb/neuraldb:latest
```
Connect with psql:
```bash
psql -h localhost -p 5432 -U neuraldb -d mydb
```
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `NEURALDB_PASSWORD` | required | Password for the `neuraldb` superuser |
| `NEURALDB_USER` | `neuraldb` | Superuser username |
| `NEURALDB_DB` | `neuraldb` | Default database name |
| `NEURALDB_PORT` | `5432` | TCP port |
| `NEURALDB_SHARED_BUFFERS` | `256MB` | Row store page cache |
| `NEURALDB_VECTOR_BUFFER` | `512MB` | Vector index memory |
## docker-compose Setup
```yaml
version: '3.9'
services:
  neuraldb:
    image: neuraldb/neuraldb:1.0
    container_name: neuraldb
    restart: unless-stopped
    ports:
      - "127.0.0.1:5432:5432"
    environment:
      NEURALDB_PASSWORD: ${NEURALDB_PASSWORD}
      NEURALDB_SHARED_BUFFERS: "4GB"
      NEURALDB_VECTOR_BUFFER: "8GB"
    volumes:
      - neuraldb_data:/var/lib/neuraldb/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U neuraldb"]
      interval: 10s
      timeout: 5s
      retries: 5
volumes:
  neuraldb_data:
```
```bash
echo "NEURALDB_PASSWORD=$(openssl rand -base64 32)" > .env
docker-compose up -d
```
## Upgrading
```bash
docker pull neuraldb/neuraldb:1.1
docker stop neuraldb && docker rm neuraldb
# Re-publish the port: the new container does not inherit the old one's -p mapping.
docker run -d --name neuraldb \
  -p 5432:5432 \
  -v neuraldb_data:/var/lib/neuraldb/data \
  -e NEURALDB_PASSWORD=mypassword \
  neuraldb/neuraldb:1.1
docker exec neuraldb neuraldb-migrate
```


@ -0,0 +1,80 @@
---
title: Kubernetes
sort: 110
section-id: installation
keywords: Kubernetes, Helm, StatefulSet, PVC, k8s, cluster, deployment
description: Deploying NeuralDB on Kubernetes using the official Helm chart and StatefulSets
language: en
---
# Kubernetes
The recommended way to run NeuralDB on Kubernetes is via the official Helm chart.
## Installing the Helm Chart
```bash
helm repo add neuraldb https://charts.neuraldb.io
helm repo update
kubectl create namespace neuraldb
helm install neuraldb neuraldb/neuraldb \
--namespace neuraldb \
--set auth.password=mysecretpassword \
--set persistence.size=100Gi
```
## Chart Configuration
```yaml
image:
  repository: neuraldb/neuraldb
  tag: "1.0"
replicaCount: 1
readReplicaCount: 2
resources:
  requests:
    cpu: "2"
    memory: "8Gi"
  limits:
    cpu: "8"
    memory: "32Gi"
persistence:
  enabled: true
  storageClass: "fast-ssd"
  size: 500Gi
vectorBuffer: "16Gi"
sharedBuffers: "8Gi"
maxConnections: 200
ha:
  enabled: true
  replication:
    mode: synchronous
backup:
  enabled: true
  schedule: "0 2 * * *"
  s3:
    bucket: my-neuraldb-backups
    region: us-east-1
```
## Services
| Service | Port | Description |
|---------|------|-------------|
| `neuraldb-primary` | 5432 | Primary — reads + writes |
| `neuraldb-replica` | 5432 | Read replicas — reads only |
| `neuraldb-headless` | 5432 | StatefulSet pod discovery |
## Scaling Read Replicas
```bash
helm upgrade neuraldb neuraldb/neuraldb \
--namespace neuraldb \
--set readReplicaCount=4
```


@ -0,0 +1,65 @@
---
title: Local Development
sort: 130
section-id: installation
keywords: local, development, binary, homebrew, winget, install, macOS, Linux, Windows
description: Installing NeuralDB locally for development using binaries, Homebrew, or winget
language: en
---
# Local Development
## macOS
```bash
brew tap neuraldb/tap
brew install neuraldb
brew services start neuraldb
```
## Linux
### Ubuntu / Debian
```bash
curl -fsSL https://packages.neuraldb.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/neuraldb-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/neuraldb-keyring.gpg] https://packages.neuraldb.io/apt stable main" \
| sudo tee /etc/apt/sources.list.d/neuraldb.list
sudo apt update && sudo apt install -y neuraldb
sudo systemctl enable --now neuraldb
```
### RHEL / Fedora
```bash
sudo rpm --import https://packages.neuraldb.io/gpg
sudo tee /etc/yum.repos.d/neuraldb.repo <<'EOF'
[neuraldb]
name=NeuralDB Repository
baseurl=https://packages.neuraldb.io/rpm/stable
enabled=1
gpgcheck=1
gpgkey=https://packages.neuraldb.io/gpg
EOF
sudo dnf install -y neuraldb
sudo systemctl enable --now neuraldb
```
## Windows
```powershell
winget install NeuralDB.NeuralDB
```
## First-Time Setup
```bash
neuraldb init
neuraldb start
neuraldb-cli
```
```sql
ALTER USER neuraldb PASSWORD 'your-new-password';
CREATE DATABASE myapp;
```


@ -0,0 +1,89 @@
---
title: Aggregations
sort: 130
section-id: query-language
keywords: aggregations, GROUP BY, COUNT, SUM, vectors, AVG, centroid, analytics
description: Aggregating data in NQL including GROUP BY, COUNT, SUM, and vector-specific aggregation functions
language: en
---
# Aggregations
NQL supports the full SQL aggregation toolkit, extended with vector-specific aggregate functions.
## Standard Aggregations
```sql
SELECT category, COUNT(*) AS doc_count
FROM documents
GROUP BY category
ORDER BY doc_count DESC;
SELECT category, AVG(price), MIN(price), MAX(price)
FROM products
WHERE available = true
GROUP BY category;
```
## Vector Aggregations
### `AVG(embedding)` — Centroid
```sql
SELECT AVG(embedding) AS centroid
FROM documents
WHERE category = 'technology';
```
Find documents closest to the centroid:
```sql
WITH centroid AS (
SELECT AVG(embedding) AS c FROM documents WHERE category = 'technology'
)
SELECT id, title, 1 - (embedding <=> centroid.c) AS similarity
FROM documents, centroid
WHERE category = 'technology'
ORDER BY embedding <=> centroid.c
LIMIT 10;
```
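The centroid query above can be mirrored client-side for small samples. This is a sketch in plain Python, assuming `AVG` over a vector column is a component-wise mean; the function names are illustrative and not part of the SDK:

```python
import math


def centroid(vectors):
    """Component-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy two-dimensional embeddings keyed by document id.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
c = centroid(list(docs.values()))
# Rank ids by similarity to the centroid, most similar first.
ranked = sorted(docs, key=lambda k: cosine_similarity(docs[k], c), reverse=True)
```

Running the same ranking in SQL keeps the work inside the database, so the client-side version is mainly useful for spot-checking results.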
### `vector_centroid(embedding, weight)`
```sql
SELECT vector_centroid(embedding, rating) AS weighted_centroid
FROM products WHERE category = 'electronics';
```
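The page does not spell out how the weight is applied. A plausible reading is a weighted mean, where each embedding contributes in proportion to its weight. The sketch below illustrates that reading only; it is an assumption, not the documented behaviour of `vector_centroid`:

```python
def weighted_centroid(vectors, weights):
    """Weighted mean: sum(w_i * v_i) / sum(w_i), computed per dimension."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    dims = len(vectors[0])
    return [
        sum(w * v[i] for v, w in zip(vectors, weights)) / total
        for i in range(dims)
    ]


# Two toy embeddings; the first carries three times the weight of the second.
wc = weighted_centroid([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])
```

Under this reading, higher-rated products pull the centroid toward their own embeddings.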
## GROUP BY with Vector Search
```sql
SELECT DISTINCT ON (category)
id, category, title, 1 - (embedding <=> $1) AS similarity
FROM documents
ORDER BY category, embedding <=> $1;
```
## Window Functions
```sql
SELECT id, title, category,
1 - (embedding <=> $1) AS similarity,
RANK() OVER (PARTITION BY category ORDER BY embedding <=> $1) AS rank_in_category
FROM documents
WHERE 1 - (embedding <=> $1) > 0.5
ORDER BY category, rank_in_category;
```
## Time-Series Semantic Analytics
```sql
WITH weekly_centroids AS (
SELECT date_trunc('week', created_at) AS week, AVG(embedding) AS centroid
FROM documents GROUP BY week
)
SELECT w1.week, 1 - (w1.centroid <=> w2.centroid) AS similarity_to_prev_week
FROM weekly_centroids w1
LEFT JOIN weekly_centroids w2 ON w2.week = w1.week - INTERVAL '1 week'
ORDER BY w1.week;
```


@ -0,0 +1,89 @@
---
title: NQL Basics
sort: 100
section-id: query-language
keywords: NQL, NeuralDB Query Language, SQL, syntax, basics, queries
description: Introduction to NeuralDB Query Language (NQL) — syntax, data types, and basic operations
language: en
---
# NQL Basics
NQL (NeuralDB Query Language) is a superset of standard SQL. Every valid SQL statement is also valid NQL. NQL adds extensions for vector operations, embedding generation, and semantic search primitives.
## Connecting
```bash
psql -h localhost -p 5432 -U neuraldb -d mydb
neuraldb-cli -h localhost
```
## Data Types
### VECTOR(n)
```sql
CREATE TABLE documents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
content TEXT NOT NULL,
embedding VECTOR(1536)
);
```
### HALFVEC(n) and SPARSEVEC(n)
```sql
embedding HALFVEC(1536) -- 16-bit, half the memory
bm25_vector SPARSEVEC(30000) -- sparse, non-zero elements only
```
## Basic CRUD
```sql
CREATE TABLE products (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL, category TEXT, price DECIMAL(10,2),
stock INTEGER DEFAULT 0, embedding VECTOR(1536),
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
INSERT INTO products (name, category, price, stock, embedding)
VALUES ('Wireless Headphones', 'electronics', 299.99, 150, '[0.023, -0.187, ...]');
SELECT id, name, price FROM products WHERE category = 'electronics';
UPDATE products SET price = 279.99, embedding = '[...]' WHERE id = $1;
DELETE FROM products WHERE id = $1;
```
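The statements above pass embeddings as bracketed text literals. When building such a literal by hand rather than through an SDK wrapper, a small formatter is enough. A minimal sketch; `to_vector_literal` is a hypothetical helper, not an SDK function:

```python
def to_vector_literal(values):
    """Format numbers as a bracketed text literal such as '[0.023,-0.187,0.412]'."""
    if not values:
        raise ValueError("a vector needs at least one dimension")
    # repr(float(v)) gives the shortest string that round-trips the float.
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


literal = to_vector_literal([0.023, -0.187, 0.412])
```

Pass the result as a bound parameter rather than concatenating it into the SQL text.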
## Creating Vector Indexes
```sql
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
CREATE INDEX ON documents USING hnsw (embedding vector_l2_ops);
CREATE INDEX ON documents USING hnsw (embedding vector_ip_ops);
```
## Basic Vector Queries
```sql
SELECT id, content, 1 - (embedding <=> $1) AS similarity
FROM documents
ORDER BY embedding <=> $1
LIMIT 10;
```
| Operator | Metric | Index ops |
|----------|--------|----------|
| `<=>` | Cosine distance | `vector_cosine_ops` |
| `<->` | Euclidean (L2) | `vector_l2_ops` |
| `<#>` | Negative dot product | `vector_ip_ops` |
## NQL Functions
```sql
SELECT vector_dims(embedding) FROM documents LIMIT 1; -- returns 1536
SELECT vector_norm(embedding) FROM documents LIMIT 5;
SELECT cosine_similarity(embedding, $1) AS similarity FROM documents ORDER BY similarity DESC LIMIT 10;
```


@ -0,0 +1,73 @@
---
title: Hybrid Queries
sort: 120
section-id: query-language
keywords: hybrid queries, vector, relational, filters, combined, semantic search, metadata
description: Combining vector similarity and relational filters in NQL hybrid queries
language: en
---
# Hybrid Queries
Hybrid queries combine vector similarity search with relational filter predicates in a single SQL statement.
## Basic Hybrid Query
```sql
SELECT id, name, price, 1 - (embedding <=> $1) AS similarity
FROM products
WHERE category = 'electronics'
AND stock > 0
AND price < 500
ORDER BY embedding <=> $1
LIMIT 10;
```
## Query Planner Hints
```sql
-- Force pre-filter
SELECT /*+ PREFILTER */ id, name, 1 - (embedding <=> $1) AS score
FROM products WHERE category = 'electronics'
ORDER BY score DESC LIMIT 10;
-- Force post-filter
SELECT /*+ POSTFILTER */ id, name, 1 - (embedding <=> $1) AS score
FROM products WHERE price < 500
ORDER BY embedding <=> $1 LIMIT 10;
```
## Hybrid Full-Text + Vector (BM25)
```sql
WITH vector_search AS (
SELECT id, ROW_NUMBER() OVER (ORDER BY embedding <=> $1) AS rank
FROM documents ORDER BY embedding <=> $1 LIMIT 100
),
fts_search AS (
SELECT id, ROW_NUMBER() OVER (ORDER BY ts_rank_cd(tsv, query) DESC) AS rank
FROM documents, to_tsquery('english', $2) query
WHERE tsv @@ query ORDER BY ts_rank_cd(tsv, query) DESC LIMIT 100
),
rrf AS (
SELECT COALESCE(v.id, f.id) AS id,
(COALESCE(1.0/(60+v.rank),0) + COALESCE(1.0/(60+f.rank),0)) AS rrf_score
FROM vector_search v FULL OUTER JOIN fts_search f ON v.id = f.id
)
SELECT d.id, d.content, rrf.rrf_score
FROM rrf JOIN documents d ON d.id = rrf.id
ORDER BY rrf_score DESC LIMIT 10;
```
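The `rrf` CTE above implements reciprocal rank fusion: each document scores `1 / (60 + rank)` in every list where it appears, and the scores are summed. The same arithmetic in plain Python, as an illustrative sketch with made-up document IDs:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists: score(id) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


vector_hits = ["d1", "d2", "d3"]  # ordered by vector distance
text_hits = ["d3", "d1", "d4"]    # ordered by full-text rank
fused = reciprocal_rank_fusion([vector_hits, text_hits])
```

A document that appears in only one list still receives a score, which matches the `FULL OUTER JOIN` with `COALESCE(..., 0)` in the SQL.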
## Composite Scoring
```sql
SELECT id, name, price, rating,
(0.7 * (1 - (embedding <=> $1))
+ 0.2 * (rating / 5.0)
+ 0.1 * (1 - EXTRACT(DAYS FROM NOW() - created_at) / 365.0)
) AS composite_score
FROM products
WHERE available = true AND price < $2
ORDER BY composite_score DESC LIMIT 20;
```
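To see how the weights interact, the scoring expression can be evaluated outside the database. The sketch below mirrors the SQL term by term; the function name and sample values are illustrative. Note that the recency term is not clamped, so it turns negative for rows older than 365 days:

```python
def composite_score(similarity, rating, age_days):
    """0.7 * similarity + 0.2 * normalised rating + 0.1 * recency, as in the SQL."""
    return (
        0.7 * similarity
        + 0.2 * (rating / 5.0)
        + 0.1 * (1 - age_days / 365.0)
    )


fresh = composite_score(similarity=0.9, rating=4.0, age_days=73)
stale = composite_score(similarity=0.9, rating=4.0, age_days=730)
```

If old rows should not be penalised beyond zero, wrapping the recency term in `GREATEST(0, ...)` in the SQL is one option.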


@ -0,0 +1,68 @@
---
title: Transactions
sort: 140
section-id: query-language
keywords: transactions, ACID, isolation levels, MVCC, BEGIN, COMMIT, ROLLBACK
description: ACID transactions in NeuralDB — isolation levels, MVCC, savepoints, and advisory locks
language: en
---
# Transactions
NeuralDB provides full ACID transactions with MVCC. Unlike most vector databases, NeuralDB guarantees atomicity across both relational and vector data.
## Basic Transaction Syntax
```sql
BEGIN;
INSERT INTO documents (content, embedding) VALUES ($1, $2);
UPDATE document_stats SET total_count = total_count + 1;
COMMIT;
```
## Isolation Levels
```sql
BEGIN;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- each statement sees only rows committed before it
COMMIT;
BEGIN ISOLATION LEVEL REPEATABLE READ;
-- reads are stable throughout the transaction
COMMIT;
BEGIN ISOLATION LEVEL SERIALIZABLE;
-- may raise: ERROR: could not serialize access
COMMIT;
```
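A `SERIALIZABLE` transaction that fails with a serialization error is meant to be retried from the beginning. A minimal retry loop in Python, assuming the driver surfaces the failure (SQLSTATE `40001` in PostgreSQL) as an exception; `SerializationFailure` and `run_with_retry` are hypothetical names used only for this sketch:

```python
import random
import time


class SerializationFailure(Exception):
    """Stand-in for the driver error raised on SQLSTATE 40001 (hypothetical name)."""


def run_with_retry(txn, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Call txn() until it succeeds or max_attempts serialization failures occur."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise
            # Exponential backoff with jitter before re-running the whole transaction.
            sleep(base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.5))
```

The callable passed as `txn` should open, run, and commit the full transaction, so that each retry starts from a fresh snapshot.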
## Savepoints
```sql
BEGIN;
INSERT INTO documents (content, embedding) VALUES ($1, $2);
SAVEPOINT after_insert;
UPDATE document_stats SET count = count + 1 WHERE id = $3;
ROLLBACK TO SAVEPOINT after_insert;
UPDATE document_stats SET count = count + 1 WHERE id = $4;
COMMIT;
```
## Vector Transactions
```sql
BEGIN;
INSERT INTO documents (id, content, embedding) VALUES ($1, $2, $3);
-- If ROLLBACK, neither row nor index entry exists
ROLLBACK;
```
## Advisory Locks
```sql
SELECT pg_advisory_lock(42);
SELECT pg_try_advisory_lock(42); -- returns boolean
SELECT pg_advisory_unlock(42);
SELECT pg_advisory_xact_lock(42); -- auto-released at commit/rollback
```
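Advisory lock functions take an integer key, which gets hard to manage once several jobs need their own locks. One common approach is to derive the key from a readable name. A sketch in Python, assuming the single-argument form accepts a signed 64-bit integer as in PostgreSQL; `advisory_lock_key` is a hypothetical helper:

```python
import hashlib
import struct


def advisory_lock_key(name: str) -> int:
    """Map a lock name to a signed 64-bit integer usable as an advisory lock key."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    # '>q' unpacks the first 8 bytes as a big-endian signed 64-bit integer.
    return struct.unpack(">q", digest[:8])[0]


key = advisory_lock_key("reindex:documents")
```

The key is stable across processes, so every worker that hashes the same name contends for the same lock.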


@ -0,0 +1,79 @@
---
title: Vector Queries
sort: 110
section-id: query-language
keywords: vector queries, NEAREST, SIMILAR, cosine, dot product, euclidean, ANN
description: Writing vector similarity queries in NQL — NEAREST, SIMILAR, distance operators, and recall tuning
language: en
---
# Vector Queries
## Distance Operators
```sql
embedding <=> query_vector -- cosine distance
embedding <-> query_vector -- euclidean (L2)
embedding <#> query_vector -- negative dot product
```
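For intuition, the three metrics can be reproduced in a few lines of plain Python. This is an illustrative sketch of the standard definitions, not NeuralDB's implementation, and the function names are made up for the example:

```python
import math


def cosine_distance(a, b):
    """Counterpart of <=>: 1 minus the cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1 - dot / norms


def l2_distance(a, b):
    """Counterpart of <->: straight-line (Euclidean) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a, b):
    """Counterpart of <#>: the dot product negated, so smaller means more similar."""
    return -sum(x * y for x, y in zip(a, b))


a, b = [1.0, 0.0], [0.0, 1.0]
distances = (cosine_distance(a, b), l2_distance(a, b), negative_inner_product(a, b))
```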
Always pair `ORDER BY` with `LIMIT` to use the HNSW index:
```sql
SELECT id, content FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'
LIMIT 10;
```
## NEAREST Clause
```sql
SELECT id, content, score
FROM documents
NEAREST TO embedding = '[0.1, 0.2, ...]' USING COSINE
TOP 10;
```
## SIMILAR Clause
```sql
SELECT id, content, score
FROM documents
SIMILAR TO embedding = $1 USING COSINE THRESHOLD 0.75
LIMIT 100;
```
## Recall Tuning
```sql
SET hnsw.ef_search = 200; -- higher = better recall, slower
```
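To check what a given `ef_search` value buys you, compare the approximate results with an exact scan of the same query and compute recall@k. A minimal sketch in plain Python, assuming both ID lists have already been fetched; `recall_at_k` is a hypothetical helper:

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the exact top-k that the approximate search also returned."""
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    return len(truth & set(approx_ids[:k])) / len(truth)


exact = list(range(10))                     # IDs from an exact scan
approx = [0, 1, 2, 3, 4, 5, 6, 7, 8, 42]    # IDs from the HNSW index
recall = recall_at_k(approx, exact)
```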
| ef_search | Recall@10 | p50 latency | QPS |
|-----------|-----------|-------------|-----|
| 20 | 89% | 0.7ms | 12,000 |
| 40 | 95% | 1.2ms | 8,400 |
| 80 | 98% | 2.1ms | 4,800 |
| 200 | 99.5% | 4.8ms | 2,100 |
## Exact Search
```sql
SET neuraldb.vector_scan = 'exact';
SELECT * FROM documents ORDER BY embedding <=> $1 LIMIT 10;
RESET neuraldb.vector_scan;
```
## Multi-Vector Queries
```sql
WITH queries AS (
SELECT UNNEST(ARRAY['[...]'::VECTOR(1536), '[...]'::VECTOR(1536)]) AS qv
),
ranked AS (
SELECT d.id, d.content, MIN(d.embedding <=> q.qv) AS best_distance
FROM documents d, queries q
GROUP BY d.id, d.content
)
SELECT * FROM ranked ORDER BY best_distance LIMIT 20;
```


@ -0,0 +1,70 @@
---
title: Backup & Restore
sort: 110
section-id: operations
keywords: backup, restore, snapshot, WAL archiving, PITR, point-in-time recovery
description: Backup and restore strategies for NeuralDB — snapshots, WAL archiving, and point-in-time recovery
language: en
---
# Backup & Restore
## Physical Snapshot
```bash
# Client-side compression requires the tar output format.
pg_basebackup \
  --host=localhost --port=5432 --username=backup_user \
  --pgdata=/backups/neuraldb/$(date +%Y%m%d) \
  --format=tar --wal-method=stream --checkpoint=fast --compress=lz4 --progress
```
## WAL Archiving
```ini
wal_level = replica
archive_mode = on
archive_command = 'aws s3 cp %p s3://my-backups/neuraldb/wal/%f'
archive_timeout = 60
```
Verify:
```sql
SELECT last_archived_wal, last_archived_time, archived_count, failed_count
FROM pg_stat_archiver;
```
## pgBackRest
```bash
sudo apt install pgbackrest
# Full backup
sudo -u postgres pgbackrest --stanza=neuraldb backup --type=full
# Differential
sudo -u postgres pgbackrest --stanza=neuraldb backup --type=diff
```
Cron schedule:
```cron
0 1 * * 0 postgres pgbackrest --stanza=neuraldb backup --type=full
0 1 * * 1-6 postgres pgbackrest --stanza=neuraldb backup --type=diff
```
## Point-in-Time Recovery
```bash
systemctl stop neuraldb
pgbackrest --stanza=neuraldb restore \
  --type=time --target="2026-05-15 14:30:00+00" \
  --target-action=promote --delta
systemctl start neuraldb
```
## Logical Backup
```bash
pg_dump -h localhost -U neuraldb mydb | lz4 | \
aws s3 cp - s3://my-backups/neuraldb/logical-$(date +%Y%m%d).sql.lz4
pg_dump -Fc -h localhost -U neuraldb mydb > mydb-$(date +%Y%m%d).dump
```


@ -0,0 +1,67 @@
---
title: Migration
sort: 130
section-id: operations
keywords: migration, import, Postgres, Pinecone, Weaviate, data migration, ETL
description: Migrating data to NeuralDB from PostgreSQL, Pinecone, Weaviate, and other sources
language: en
---
# Migration
## From PostgreSQL
```bash
pg_dump -h source-host -U source-user -d source-db --format=custom > source-backup.dump
psql -h neuraldb-host -U neuraldb -c "CREATE DATABASE myapp;"
pg_restore -h neuraldb-host -U neuraldb -d myapp --jobs=8 --no-owner source-backup.dump
```
Add vector columns post-migration:
```sql
ALTER TABLE documents ADD COLUMN embedding VECTOR(1536);
CREATE INDEX CONCURRENTLY documents_embedding_idx
ON documents USING hnsw (embedding vector_cosine_ops);
```
## From PostgreSQL + pgvector
```bash
pg_dump -h source-host -U source-user -d source-db --format=custom \
--exclude-extension=vector > pgvector-backup.dump
pg_restore -h neuraldb-host -U neuraldb -d myapp --jobs=8 pgvector-backup.dump
```
## From Pinecone
```python
import os

import pinecone
from neuraldb import NeuralDB, BulkIngestor

pc = pinecone.Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("my-index")
client = NeuralDB(os.environ["NEURALDB_URL"])

# Target table: one row per Pinecone vector, metadata kept as JSONB.
client.execute("""
    CREATE TABLE IF NOT EXISTS pinecone_migration (
        id TEXT PRIMARY KEY, embedding VECTOR(1536), metadata JSONB,
        migrated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
    )
""")

# paginate_pinecone_ids is a helper you supply; it should yield lists of vector IDs.
ingestor = BulkIngestor(client, table="pinecone_migration", batch_size=500)
with ingestor as ing:
    for ids_batch in paginate_pinecone_ids(index, batch_size=1000):
        fetch_response = index.fetch(ids=ids_batch)
        for vector_id, vector_data in fetch_response.vectors.items():
            ing.add({"id": vector_id, "embedding": vector_data.values, "metadata": vector_data.metadata or {}})
```
## Verifying Migration
```sql
SELECT COUNT(*) FROM documents;
SELECT COUNT(*) FROM documents WHERE embedding IS NULL;
SELECT index_name, hnsw_in_memory, estimated_recall FROM neuraldb_stat_vector_indexes;
```


@ -0,0 +1,65 @@
---
title: Monitoring
sort: 100
section-id: operations
keywords: monitoring, Prometheus, Grafana, metrics, alerts, observability, dashboards
description: Monitoring NeuralDB with Prometheus metrics, Grafana dashboards, and alert configuration
language: en
---
# Monitoring
## Prometheus Metrics
Enable the metrics exporter:
```ini
metrics.enabled = true
metrics.port = 9187
metrics.path = /metrics
```
Key metrics:
| Metric | Type | Description |
|--------|------|-------------|
| `neuraldb_connections_total` | Gauge | Current connections by state |
| `neuraldb_query_duration_seconds` | Histogram | Query duration percentiles |
| `neuraldb_vector_queries_total` | Counter | Vector similarity queries by index |
| `neuraldb_hnsw_index_size_bytes` | Gauge | In-memory size of HNSW graphs |
| `neuraldb_replication_lag_seconds` | Gauge | Time lag per replica |
| `neuraldb_database_size_bytes` | Gauge | Total database size |
## Grafana Dashboard
Import official dashboard ID **18921** from Grafana.com.
## Alerting Rules
```yaml
groups:
  - name: neuraldb
    rules:
      - alert: NeuralDBConnectionsHigh
        expr: neuraldb_connections_total{state="active"} / neuraldb_connections_max > 0.85
        for: 2m
        labels: { severity: warning }
      - alert: NeuralDBReplicationLagHigh
        expr: neuraldb_replication_lag_seconds > 30
        for: 1m
        labels: { severity: warning }
      - alert: NeuralDBVectorBufferExhausted
        expr: neuraldb_hnsw_index_size_bytes > (neuraldb_vector_buffer_size_bytes * 0.90)
        for: 5m
        labels: { severity: warning }
```
## Built-In Query Statistics
```sql
SELECT query, calls, round(mean_exec_time::numeric, 2) AS avg_ms
FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 10;
SELECT sum(blks_hit) * 100.0 / sum(blks_hit + blks_read) AS cache_hit_ratio
FROM pg_stat_database WHERE datname != 'template0';
```


@ -0,0 +1,58 @@
---
title: Scaling
sort: 120
section-id: operations
keywords: scaling, sharding, read replicas, horizontal scaling, capacity planning, performance
description: Scaling NeuralDB horizontally with sharding, read replicas, and capacity planning
language: en
---
# Scaling
## Read Replicas
```python
from neuraldb import NeuralDB

# Writes go to the primary; similarity searches go to a replica.
primary = NeuralDB("postgresql://neuraldb:pass@primary:5432/mydb")
replica = NeuralDB("postgresql://neuraldb:pass@replica:5432/mydb")

def search(query_vector):
    return replica.query("SELECT * FROM docs ORDER BY embedding <=> %s LIMIT 10", [query_vector])

def insert(content, embedding):
    return primary.execute("INSERT INTO docs (content, embedding) VALUES (%s, %s)", [content, embedding])
```
| Replicas | Approx peak QPS (1536-dim, 10M vectors) |
|---------|-----------------------------------------|
| 1 primary | 8,000 |
| 1 primary + 2 replicas | 24,000 |
| 1 primary + 4 replicas | 48,000 |
## Horizontal Sharding
```sql
SELECT neuraldb_cluster.init_cluster(shards => 8, replication_factor => 2);
CREATE TABLE documents (
id UUID NOT NULL DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
content TEXT,
embedding VECTOR(1536)
) SHARD BY tenant_id;
```
## Capacity Planning
```
Row data ≈ avg_row_bytes × num_rows × 1.3
Vector data ≈ dimensions × 4 bytes × num_vectors
HNSW graph ≈ vector_data × 1.3 (must fit in vector_buffer)
WAL ≈ daily_writes × retention_days
```
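The formulas above translate directly into a small calculator. The sketch below is illustrative only and the function and parameter names are made up. For 10 million 1536-dimension vectors it gives about 61 GB of vector data and about 80 GB for the HNSW graph, which is the figure `vector_buffer` has to accommodate:

```python
def estimate_capacity(num_rows, avg_row_bytes, num_vectors, dimensions,
                      daily_write_bytes, retention_days):
    """Apply the capacity-planning formulas; every result is in bytes."""
    vector_data = dimensions * 4 * num_vectors  # 4 bytes per float32 component
    return {
        "row_data": avg_row_bytes * num_rows * 1.3,
        "vector_data": vector_data,
        "hnsw_graph": vector_data * 1.3,        # must fit in vector_buffer
        "wal": daily_write_bytes * retention_days,
    }


# Hypothetical workload: 10M rows of ~500 bytes, one 1536-dim vector per row,
# 5 GiB of writes per day, 7 days of WAL retention.
estimate = estimate_capacity(
    num_rows=10_000_000, avg_row_bytes=500,
    num_vectors=10_000_000, dimensions=1536,
    daily_write_bytes=5 * 1024**3, retention_days=7,
)
```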
| Resource | Warning | Critical |
|---------|---------|----------|
| Connections | 80% of max | 95% of max |
| Storage | 70% full | 85% full |
| vector_buffer | 80% | 90% |
| Replication lag | 30s | 120s |


@ -0,0 +1,71 @@
---
title: Troubleshooting
sort: 140
section-id: operations
keywords: troubleshooting, errors, diagnostics, FAQ, common problems, debug
description: Common NeuralDB errors, diagnostic techniques, and frequently asked questions
language: en
---
# Troubleshooting
## Connection Issues
### `FATAL: password authentication failed`
```bash
sudo -u neuraldb neuraldb-cli
```
```sql
ALTER USER neuraldb PASSWORD 'new-password';
```
### `could not connect to server: Connection refused`
```bash
systemctl status neuraldb
ss -tlnp | grep 5432
journalctl -u neuraldb -n 50
```
### Connection slots exhausted
```sql
SELECT count(*), state FROM pg_stat_activity GROUP BY state;
SELECT pg_terminate_backend(pid) FROM pg_stat_activity
WHERE state = 'idle' AND state_change < NOW() - INTERVAL '10 minutes';
```
## Vector Query Issues
### Slow Vector Searches
```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT id FROM documents ORDER BY embedding <=> '[...]' LIMIT 10;
```
Common causes: a missing `LIMIT`, an HNSW graph that is not resident in memory, or `hnsw.ef_search` set higher than the query needs.
```sql
SELECT * FROM neuraldb_stat_vector_indexes; -- check hnsw_in_memory
SET enable_seqscan = off; -- force index for debugging
```
### Low Recall
```sql
SET hnsw.ef_search = 200;
SET neuraldb.vector_scan = 'exact'; -- compare against exact search
```
## FAQ
**Q: Can I use NeuralDB as a drop-in for PostgreSQL?**
Yes. NeuralDB implements the PostgreSQL wire protocol.
**Q: What should `vector_buffer` be set to?**
Run `SELECT SUM(hnsw_graph_size_bytes) FROM neuraldb_stat_vector_indexes;` and set `vector_buffer` to at least the result, so that every HNSW graph fits in memory.
**Q: Is NeuralDB compatible with pgvector?**
Yes. All pgvector types (`VECTOR`, `HALFVEC`, `SPARSEVEC`) and operators (`<=>`, `<->`, `<#>`) work without modification.


@ -0,0 +1,85 @@
---
title: Go SDK
sort: 120
section-id: client-sdks
keywords: Go, Golang, SDK, client, connection pool, query builder, pgx
description: The NeuralDB Go SDK — installation, connection pooling, and vector query builder
language: en
---
# Go SDK
Built on `pgx`, the high-performance PostgreSQL driver for Go.
## Installation
```bash
go get github.com/neuraldb/neuraldb-go
```
Requires Go 1.21+.
## Connecting
```go
client, err := neuraldb.Connect(ctx, "postgresql://neuraldb:password@localhost:5432/mydb")
if err != nil { log.Fatal(err) }
defer client.Close(ctx)
```
### Connection Pool
```go
config, err := pgxpool.ParseConfig(os.Getenv("NEURALDB_URL"))
if err != nil {
	log.Fatal(err)
}
config.MaxConns = 20
config.MinConns = 5
pool, err := neuraldb.NewPool(ctx, config)
if err != nil {
	log.Fatal(err)
}
```
## Working with Vectors
```go
// Wrap a []float32 so it can be passed as a VECTOR parameter (inside a function).
v := types.NewVector([]float32{0.023, -0.187, 0.412})

func InsertDocument(ctx context.Context, pool *neuraldb.Pool, doc Document) error {
	_, err := pool.Exec(ctx,
		`INSERT INTO documents (id, content, embedding) VALUES ($1, $2, $3)`,
		doc.ID, doc.Content, doc.Embedding,
	)
	return err
}

func SemanticSearch(ctx context.Context, pool *neuraldb.Pool, queryEmbedding []float32, limit int) ([]SearchResult, error) {
	qv := types.NewVector(queryEmbedding)
	rows, err := pool.Query(ctx, `
		SELECT id, content, 1 - (embedding <=> $1) AS similarity
		FROM documents WHERE embedding IS NOT NULL
		ORDER BY embedding <=> $1 LIMIT $2
	`, qv, limit)
	if err != nil {
		return nil, err
	}
	defer rows.Close()
	var results []SearchResult
	for rows.Next() {
		var r SearchResult
		// Stop on a scan failure instead of appending a half-filled row.
		if err := rows.Scan(&r.ID, &r.Content, &r.Similarity); err != nil {
			return nil, err
		}
		results = append(results, r)
	}
	return results, rows.Err()
}
```
## Transactions
```go
// The callback's error rolls the transaction back; nil commits it.
err := pool.BeginTxFunc(ctx, pgx.TxOptions{}, func(tx pgx.Tx) error {
	for _, doc := range docs {
		_, err := tx.Exec(ctx,
			`INSERT INTO documents (content, embedding) VALUES ($1, $2)`,
			doc.Content, doc.Embedding,
		)
		if err != nil {
			return err
		}
	}
	_, err := tx.Exec(ctx, `UPDATE stats SET doc_count = doc_count + $1`, len(docs))
	return err
})
if err != nil {
	log.Fatal(err)
}
```


@ -0,0 +1,80 @@
---
title: JavaScript SDK
sort: 110
section-id: client-sdks
keywords: JavaScript, TypeScript, SDK, Node.js, browser, npm, client
description: The NeuralDB JavaScript/TypeScript SDK for Node.js and browser environments
language: en
---
# JavaScript SDK
## Installation
```bash
npm install @neuraldb/client
```
## Basic Setup
```typescript
import { NeuralDB } from '@neuraldb/client';
const client = new NeuralDB({
connectionString: process.env.NEURALDB_URL!,
ssl: { rejectUnauthorized: true },
});
await client.connect();
```
### Connection Pool
```typescript
import { NeuralDBPool } from '@neuraldb/client';
const pool = new NeuralDBPool({
connectionString: process.env.NEURALDB_URL!,
max: 20,
idleTimeoutMillis: 30000,
});
```
## Vector Operations
```typescript
import { toVector } from '@neuraldb/client';
await client.query(
'INSERT INTO documents (content, embedding) VALUES ($1, $2)',
['My document content', toVector([0.023, -0.187, 0.412])]
);
async function semanticSearch(query: string, limit = 10) {
const embeddingResponse = await openai.embeddings.create({
model: 'text-embedding-3-small', input: query,
});
const queryVector = embeddingResponse.data[0].embedding;
const { rows } = await client.query<{ id: string; content: string; similarity: number }>(
`SELECT id, content, 1 - (embedding <=> $1) AS similarity
FROM documents WHERE embedding IS NOT NULL
ORDER BY embedding <=> $1 LIMIT $2`,
[toVector(queryVector), limit]
);
return rows;
}
```
## High-Level Document API
```typescript
import { DocumentStore } from '@neuraldb/client';
const store = new DocumentStore(client, {
table: 'documents',
embeddingColumn: 'embedding',
embeddingModel: { provider: 'openai', model: 'text-embedding-3-small', apiKey: process.env.OPENAI_API_KEY! },
});
await store.add([{ content: 'First document', metadata: { source: 'web' } }]);
const results = await store.search('query text', { limit: 10, filter: { source: 'web' } });
```


@ -0,0 +1,90 @@
---
title: Python SDK
sort: 100
section-id: client-sdks
keywords: Python, SDK, client, connection, CRUD, vector operations, psycopg
description: Installing and using the NeuralDB Python SDK — connection, CRUD, and vector operations
language: en
---
# Python SDK
Built on `psycopg3` with NeuralDB-specific helpers for vector operations and batch ingestion.
## Installation
```bash
pip install neuraldb
pip install "neuraldb[asyncio]"   # async support (quoted so shells such as zsh do not expand the brackets)
```
## Connecting
```python
from neuraldb import NeuralDB
client = NeuralDB("postgresql://neuraldb:password@localhost:5432/mydb")

# Async (async with must run inside a coroutine)
import asyncio
from neuraldb import AsyncNeuralDB

async def main():
    async with AsyncNeuralDB("postgresql://...") as client:
        result = await client.query("SELECT 1")

asyncio.run(main())

# Pool
from neuraldb import NeuralDBPool
pool = NeuralDBPool("postgresql://...", min_size=5, max_size=20)
with pool.acquire() as client:
    result = client.query("SELECT COUNT(*) FROM documents")
```
## CRUD Operations
```python
from neuraldb import Vector
client.execute(
    "INSERT INTO documents (content, source, embedding) VALUES (%s, %s, %s)",
    ("My document content", "web-scraper", Vector([0.023, -0.187, 0.412]))
)
rows = client.query("SELECT id, content FROM documents WHERE source = %s", ("web-scraper",))
for row in rows:
    print(row["id"], row["content"])
client.execute("UPDATE documents SET content = %s, embedding = %s WHERE id = %s",
               ("Updated", Vector(new_embedding), doc_id))
client.execute("DELETE FROM documents WHERE id = %s", (doc_id,))
```
## Vector Search
```python
results = client.query("""
SELECT id, content, 1 - (embedding <=> %s) AS similarity
FROM documents WHERE embedding IS NOT NULL
ORDER BY embedding <=> %s LIMIT 10
""", (Vector(query_embedding), Vector(query_embedding)))
```
## Transactions
```python
with client.transaction():
    client.execute("INSERT INTO documents (content, embedding) VALUES (%s, %s)", (content, Vector(embedding)))
    client.execute("UPDATE stats SET count = count + 1")
```
## Bulk Ingestion
```python
from neuraldb import BulkIngestor
ingestor = BulkIngestor(
    client, table="documents",
    columns=["content", "source", "embedding"], batch_size=1000,
    embedding_model="openai/text-embedding-3-small",
    embedding_column="embedding", text_column="content",
)
with ingestor as ing:
    for doc in docs:
        ing.add(doc)
print(f"Ingested {ingestor.total_inserted} documents")
```


@ -0,0 +1,85 @@
---
title: REST API
sort: 130
section-id: client-sdks
keywords: REST API, HTTP, endpoints, authentication, JSON, API
description: NeuralDB REST API reference — all endpoints, authentication headers, and response formats
language: en
---
# REST API
## Base URL
```
https://your-neuraldb-host:8080/api/v1
```
## Authentication
```
Authorization: Bearer ndb_live_your_api_key_here
```
## Query Endpoint
```http
POST /api/v1/query
Content-Type: application/json
Authorization: Bearer ndb_live_...

{
  "query": "SELECT id, content, 1 - (embedding <=> $1) AS similarity FROM documents ORDER BY embedding <=> $1 LIMIT 5",
  "params": [[0.023, -0.187, 0.412]],
  "database": "mydb"
}
```
Response:
```json
{
"rows": [{"id": "uuid-1", "content": "First document", "similarity": 0.923}],
"rowCount": 1,
"executionTimeMs": 3.2
}
```
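The same request can be assembled with the Python standard library alone. This is a sketch assuming the base URL, header, and body shape shown above; `build_query_request` is a hypothetical helper, and the request is only constructed here, not sent:

```python
import json
import urllib.request


def build_query_request(base_url, api_key, sql, params, database):
    """Build (but do not send) a POST request for the /query endpoint."""
    body = json.dumps(
        {"query": sql, "params": params, "database": database}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


req = build_query_request(
    "https://your-neuraldb-host:8080/api/v1",
    "ndb_live_your_api_key_here",
    "SELECT id FROM documents ORDER BY embedding <=> $1 LIMIT 5",
    [[0.023, -0.187, 0.412]],
    "mydb",
)
```

Passing `req` to `urllib.request.urlopen` sends it; the JSON response can then be read with `json.load`.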
## Document Endpoints
### Insert
```http
POST /api/v1/collections/my_docs/documents

{"documents": [{"content": "NeuralDB is an AI-native database", "metadata": {"source": "blog"}}],
 "embedding_model": "openai/text-embedding-3-small"}
```
### Search
```http
POST /api/v1/collections/my_docs/search

{"query": "AI-native database", "limit": 10, "min_similarity": 0.7,
 "filters": {"category": "technology"}, "embedding_model": "openai/text-embedding-3-small"}
```
## Error Codes
| HTTP Status | Error Code | Description |
|-------------|-----------|-------------|
| 400 | `QUERY_ERROR` | Invalid NQL query |
| 401 | `UNAUTHORIZED` | Missing or invalid API key |
| 403 | `FORBIDDEN` | Insufficient role permissions |
| 404 | `NOT_FOUND` | Document or collection not found |
| 429 | `RATE_LIMITED` | Too many requests |
| 500 | `INTERNAL_ERROR` | Server error |
## Rate Limits
| Plan | Queries/min | Documents/min |
|------|------------|---------------|
| Starter | 30 | 100 |
| Developer | 300 | 1,000 |
| Business | 3,000 | 10,000 |
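
A client can stay under its per-minute quota by spacing requests evenly instead of waiting for `429` responses. The sketch below enforces a minimum gap between calls. The class name is hypothetical, and because the page does not say whether the server counts fixed or sliding windows, treat the spacing as a conservative assumption:

```python
import time


class MinIntervalLimiter:
    """Enforce a minimum gap of 60 / per_minute seconds between calls."""

    def __init__(self, per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = clock()

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self.clock()
        if now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval


# Starter plan: 30 queries per minute, so one query every 2 seconds.
limiter = MinIntervalLimiter(per_minute=30)
```

Call `limiter.wait()` immediately before each request. The `clock` and `sleep` parameters exist so the limiter can be exercised with a fake clock.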