mdcms/sample-sites/neuraldb-docs/pages/nql-basics.md
2026-05-18 14:30:49 +07:00

5.4 KiB

title sort section-id keywords description language
NQL Basics 100 query-language NQL, NeuralDB Query Language, SQL, syntax, basics, queries Introduction to NeuralDB Query Language (NQL) — syntax, data types, and basic operations en

NQL Basics

NQL (NeuralDB Query Language) is a superset of standard SQL. Every valid SQL statement is also valid NQL. NQL adds extensions for vector operations, embedding generation, and semantic search primitives.

If you know SQL, you already know most of NQL. This page covers the NQL-specific additions and the data types introduced for AI workloads.

Connecting

NeuralDB speaks the PostgreSQL wire protocol. Connect with any PostgreSQL client:

# psql
psql -h localhost -p 5432 -U neuraldb -d mydb

# neuraldb-cli (enhanced interactive shell)
neuraldb-cli -h localhost

Connection string format:

postgresql://[user[:password]@][host][:port][/dbname][?param=value...]

Data Types

VECTOR(n)

The core NQL extension. Stores a fixed-length array of 32-bit floats representing a vector embedding:

-- Declare a vector column with 1536 dimensions (OpenAI ada-002 output)
CREATE TABLE documents (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  content TEXT NOT NULL,
  embedding VECTOR(1536)
);

Insert a vector by providing a bracketed float array:

INSERT INTO documents (content, embedding)
VALUES ('Hello, world', '[0.1, 0.2, 0.3, ...]');  -- 1536 values

HALFVEC(n)

A 16-bit float variant of VECTOR. Half the memory, slight precision loss. Useful when vector_buffer is a constraint:

embedding HALFVEC(1536)

SPARSEVEC(n)

Sparse vector representation — stores only non-zero elements. Efficient for high-dimensional but sparse vectors (e.g., BM25 term-frequency vectors):

bm25_vector SPARSEVEC(30000)

Basic CRUD

Creating Tables

CREATE TABLE products (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name TEXT NOT NULL,
  description TEXT,
  category TEXT,
  price DECIMAL(10, 2),
  stock INTEGER DEFAULT 0,
  embedding VECTOR(1536),
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

Inserting Data

INSERT INTO products (name, description, category, price, stock, embedding)
VALUES (
  'Wireless Headphones',
  'Premium noise-cancelling wireless headphones with 30-hour battery',
  'electronics',
  299.99,
  150,
  '[0.023, -0.187, 0.412, ...]'  -- 1536 floats from your embedding model
);

Reading Data

Standard SQL SELECT works as expected:

SELECT id, name, price FROM products WHERE category = 'electronics';
SELECT * FROM products WHERE price BETWEEN 50 AND 300 AND stock > 0;

Updating Data

UPDATE products SET price = 279.99, embedding = '[...]' WHERE id = $1;

When updating the embedding column, the HNSW index is updated atomically.

Deleting Data

DELETE FROM products WHERE id = $1;

Creating Vector Indexes

Without an index, vector similarity queries perform exact linear scans (O(n)). Create an HNSW index for sub-linear performance:

-- Cosine similarity (most common for text embeddings)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- Euclidean distance
CREATE INDEX ON documents USING hnsw (embedding vector_l2_ops);

-- Dot product (for recommendation systems)
CREATE INDEX ON documents USING hnsw (embedding vector_ip_ops);

Build an index on an existing large table in parallel:

SET max_parallel_maintenance_workers = 8;
CREATE INDEX CONCURRENTLY ON documents USING hnsw (embedding vector_cosine_ops);

Basic Vector Queries

Find Similar Documents

SELECT id, content, 1 - (embedding <=> $1) AS similarity
FROM documents
ORDER BY embedding <=> $1
LIMIT 10;

The <=> operator is cosine distance (lower = more similar). Subtract from 1 for a similarity score (higher = more similar).

Distance Operators

Operator Distance metric Index ops
<=> Cosine distance vector_cosine_ops
<-> Euclidean (L2) distance vector_l2_ops
<#> Negative dot product vector_ip_ops

Distance Threshold

-- Only return results with cosine similarity > 0.8
SELECT id, content, 1 - (embedding <=> $1) AS similarity
FROM documents
WHERE 1 - (embedding <=> $1) > 0.8
ORDER BY embedding <=> $1
LIMIT 20;

Note: The WHERE 1 - (embedding <=> $1) > 0.8 condition is evaluated after the ANN search, not before. Use LIMIT generously enough to capture all relevant results before the threshold filter.

NQL Functions

to_vector(text)

Convert a string literal to a VECTOR:

SELECT to_vector('[0.1, 0.2, 0.3]')::VECTOR(3);

vector_dims(v)

Return the number of dimensions:

SELECT vector_dims(embedding) FROM documents LIMIT 1;
-- Returns: 1536

vector_norm(v)

Return the L2 norm of a vector:

SELECT vector_norm(embedding) FROM documents LIMIT 5;

cosine_similarity(a, b), l2_distance(a, b), dot_product(a, b)

Named function alternatives to the operators:

SELECT cosine_similarity(embedding, $1) AS similarity
FROM documents
ORDER BY similarity DESC
LIMIT 10;

Transactions

NQL supports full ACID transactions:

BEGIN;

INSERT INTO documents (content, embedding) VALUES ($1, $2);
UPDATE document_stats SET total_count = total_count + 1;

COMMIT;

On error, roll back:

BEGIN;
-- ... operations ...
ROLLBACK;