mirror of https://github.com/kbenestad/mdcms.git synced 2026-06-18 15:24:32 +00:00

kbenestad 941fdf6252 Add neuraldb-docs site files (batch 2: install, nql, ops, sdk pages)

2026-05-20 12:29:55 +07:00

title	sort	section-id	keywords	description	language
Vector Queries	110	query-language	vector queries, NEAREST, SIMILAR, cosine, dot product, euclidean, ANN	Writing vector similarity queries in NQL — NEAREST, SIMILAR, distance operators, and recall tuning	en

Vector Queries

Distance Operators

embedding <=> query_vector  -- cosine distance
embedding <-> query_vector  -- euclidean (L2)
embedding <#> query_vector  -- negative dot product
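The page does not spell out the formulas behind the three operators. As a rough sketch, assuming they follow the usual definitions (cosine distance as 1 minus cosine similarity, plain L2 distance, and the dot product with its sign flipped so that smaller still means closer), the values could be reproduced in Python as below. The function names are illustrative and not part of NQL.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    # Assumed meaning of <=>: 1 - cosine similarity (0 = same direction, 2 = opposite).
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    # Assumed meaning of <->: straight-line (L2) distance.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def negative_dot_product(a, b):
    # Assumed meaning of <#>: dot product negated, so ascending order ranks
    # the largest dot product first.
    return -dot(a, b)

print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
print(negative_dot_product([1.0, 2.0], [3.0, 4.0]))
```

All three return smaller values for closer vectors, which is why a single ascending ORDER BY works with any of the operators.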

Always pair ORDER BY with LIMIT so the query can use the HNSW index:

SELECT id, content FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'
LIMIT 10;

NEAREST Clause

SELECT id, content, score
FROM documents
NEAREST TO embedding = '[0.1, 0.2, ...]' USING COSINE
TOP 10;

SIMILAR Clause

SELECT id, content, score
FROM documents
SIMILAR TO embedding = $1 USING COSINE THRESHOLD 0.75
LIMIT 100;

Recall Tuning

SET hnsw.ef_search = 200;  -- higher = better recall, slower

ef_search	Recall@10	p50 latency	QPS
20	89%	0.7ms	12,000
40	95%	1.2ms	8,400
80	98%	2.1ms	4,800
200	99.5%	4.8ms	2,100
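The Recall@10 column compares approximate results against the true nearest neighbours. A minimal sketch of that measurement, assuming you already hold two ID lists for the same query (one from the HNSW index, one from an exact scan), might look like this. `recall_at_k` is a hypothetical helper, not a NeuralDB function.

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the exact top-k that also appears in the approximate top-k."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    approx_top = set(approx_ids[:k])
    return len(exact_top & approx_top) / len(exact_top)

# Illustrative IDs only: the approximate result misses one true neighbour.
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 42]
print(recall_at_k(approx, exact))  # 0.9
```

Averaging this value over a representative set of queries at each ef_search setting gives figures comparable to the table above.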

Exact Search

SET neuraldb.vector_scan = 'exact';
SELECT * FROM documents ORDER BY embedding <=> $1 LIMIT 10;
RESET neuraldb.vector_scan;
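Conceptually, an exact scan scores every row against the query and keeps the closest few. The sketch below illustrates that idea in Python with cosine distance. It assumes rows arrive as (id, embedding) pairs and says nothing about how NeuralDB implements the scan internally.

```python
import heapq
import math

def cosine_distance(a, b):
    """1 - cosine similarity, assuming the usual definition."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def exact_top_k(rows, query, k=10):
    """Score every (id, embedding) row and return the k nearest as (distance, id)."""
    scored = ((cosine_distance(emb, query), row_id) for row_id, emb in rows)
    return heapq.nsmallest(k, scored)

# Tiny illustrative data set.
rows = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
for distance, row_id in exact_top_k(rows, [1.0, 0.0], k=2):
    print(row_id, round(distance, 4))
```

Because every row is visited, the cost grows linearly with table size, which makes this mode better suited to producing ground truth for recall checks than to serving traffic.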

Multi-Vector Queries

WITH queries AS (
  SELECT UNNEST(ARRAY['[...]'::VECTOR(1536), '[...]'::VECTOR(1536)]) AS qv
),
ranked AS (
  SELECT d.id, d.content, MIN(d.embedding <=> q.qv) AS best_distance
  FROM documents d, queries q
  GROUP BY d.id, d.content
)
SELECT * FROM ranked ORDER BY best_distance LIMIT 20;
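The query above keeps, for each document, the smallest distance to any of the query vectors and then ranks by that value. The same logic can be sketched in Python, assuming cosine distance and an in-memory dict of embeddings. `best_distance_ranking` is a hypothetical name used only for illustration.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, assuming the usual definition."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def best_distance_ranking(documents, query_vectors, limit=20):
    """Rank documents by their minimum distance to any query vector.

    documents: dict mapping id -> embedding.
    Returns up to `limit` (id, best_distance) pairs, closest first.
    """
    ranked = [
        (doc_id, min(cosine_distance(emb, qv) for qv in query_vectors))
        for doc_id, emb in documents.items()
    ]
    ranked.sort(key=lambda pair: pair[1])
    return ranked[:limit]

documents = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [-1.0, 0.0]}
queries = [[1.0, 0.0], [0.0, 1.0]]
print(best_distance_ranking(documents, queries, limit=2))
```

Taking the minimum means a document ranks highly if it matches any one of the query vectors. Swapping `min` for an average would instead favour documents that are close to all of them.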
