mirror of https://github.com/kbenestad/mdcms.git synced 2026-06-18 15:24:32 +00:00

kbenestad 59efc20dde Updated sample-sites.

2026-05-18 14:30:49 +07:00

5.4 KiB

Raw Permalink Blame History

title	sort	section-id	keywords	description	language
NQL Basics	100	query-language	NQL, NeuralDB Query Language, SQL, syntax, basics, queries	Introduction to NeuralDB Query Language (NQL) — syntax, data types, and basic operations	en

NQL Basics

NQL (NeuralDB Query Language) is a superset of standard SQL. Every valid SQL statement is also valid NQL. NQL adds extensions for vector operations, embedding generation, and semantic search primitives.

If you know SQL, you already know most of NQL. This page covers the NQL-specific additions and the data types introduced for AI workloads.

Connecting

NeuralDB speaks the PostgreSQL wire protocol. Connect with any PostgreSQL client:

# psql
psql -h localhost -p 5432 -U neuraldb -d mydb

# neuraldb-cli (enhanced interactive shell)
neuraldb-cli -h localhost

Connection string format:

postgresql://[user[:password]@][host][:port][/dbname][?param=value...]

Data Types

VECTOR(n)

The core NQL extension. Stores a fixed-length array of 32-bit floats representing a vector embedding:

-- Declare a vector column with 1536 dimensions (OpenAI ada-002 output)
CREATE TABLE documents (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  content TEXT NOT NULL,
  embedding VECTOR(1536)
);

Insert a vector by providing a bracketed float array:

INSERT INTO documents (content, embedding)
VALUES ('Hello, world', '[0.1, 0.2, 0.3, ...]');  -- 1536 values

HALFVEC(n)

A 16-bit float variant of VECTOR. Half the memory, slight precision loss. Useful when vector_buffer is a constraint:

embedding HALFVEC(1536)

SPARSEVEC(n)

Sparse vector representation — stores only non-zero elements. Efficient for high-dimensional but sparse vectors (e.g., BM25 term-frequency vectors):

bm25_vector SPARSEVEC(30000)

Basic CRUD

Creating Tables

CREATE TABLE products (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name TEXT NOT NULL,
  description TEXT,
  category TEXT,
  price DECIMAL(10, 2),
  stock INTEGER DEFAULT 0,
  embedding VECTOR(1536),
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

Inserting Data

INSERT INTO products (name, description, category, price, stock, embedding)
VALUES (
  'Wireless Headphones',
  'Premium noise-cancelling wireless headphones with 30-hour battery',
  'electronics',
  299.99,
  150,
  '[0.023, -0.187, 0.412, ...]'  -- 1536 floats from your embedding model
);

Reading Data

Standard SQL SELECT works as expected:

SELECT id, name, price FROM products WHERE category = 'electronics';
SELECT * FROM products WHERE price BETWEEN 50 AND 300 AND stock > 0;

Updating Data

UPDATE products SET price = 279.99, embedding = '[...]' WHERE id = $1;

When updating the embedding column, the HNSW index is updated atomically.

Deleting Data

DELETE FROM products WHERE id = $1;

Creating Vector Indexes

Without an index, vector similarity queries perform exact linear scans (O(n)). Create an HNSW index for sub-linear performance:

-- Cosine similarity (most common for text embeddings)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Euclidean distance
CREATE INDEX ON documents USING hnsw (embedding vector_l2_ops);

-- Dot product (for recommendation systems)
CREATE INDEX ON documents USING hnsw (embedding vector_ip_ops);

Build an index on an existing large table in parallel:

SET max_parallel_maintenance_workers = 8;
CREATE INDEX CONCURRENTLY ON documents USING hnsw (embedding vector_cosine_ops);

Basic Vector Queries

Find Similar Documents

SELECT id, content, 1 - (embedding <=> $1) AS similarity
FROM documents
ORDER BY embedding <=> $1
LIMIT 10;

The <=> operator is cosine distance (lower = more similar). Subtract from 1 for a similarity score (higher = more similar).

Distance Operators

Operator	Distance metric	Index ops
`<=>`	Cosine distance	`vector_cosine_ops`
`<->`	Euclidean (L2) distance	`vector_l2_ops`
`<#>`	Negative dot product	`vector_ip_ops`

Distance Threshold

-- Only return results with cosine similarity > 0.8
SELECT id, content, 1 - (embedding <=> $1) AS similarity
FROM documents
WHERE 1 - (embedding <=> $1) > 0.8
ORDER BY embedding <=> $1
LIMIT 20;

Note: The WHERE 1 - (embedding <=> $1) > 0.8 condition is evaluated after the ANN search, not before. Use LIMIT generously enough to capture all relevant results before the threshold filter.

NQL Functions

`to_vector(text)`

Convert a string literal to a VECTOR:

SELECT to_vector('[0.1, 0.2, 0.3]')::VECTOR(3);

`vector_dims(v)`

Return the number of dimensions:

SELECT vector_dims(embedding) FROM documents LIMIT 1;
-- Returns: 1536

`vector_norm(v)`

Return the L2 norm of a vector:

SELECT vector_norm(embedding) FROM documents LIMIT 5;

`cosine_similarity(a, b)`, `l2_distance(a, b)`, `dot_product(a, b)`

Named function alternatives to the operators:

SELECT cosine_similarity(embedding, $1) AS similarity
FROM documents
ORDER BY similarity DESC
LIMIT 10;

Transactions

NQL supports full ACID transactions:

BEGIN;

INSERT INTO documents (content, embedding) VALUES ($1, $2);
UPDATE document_stats SET total_count = total_count + 1;

COMMIT;

On error, roll back:

BEGIN;
-- ... operations ...
ROLLBACK;

5.4 KiB Raw Permalink Blame History