First-class vector search built directly into the SQL engine. HNSW indexing, multiple quantisation modes, sparse vectors, hybrid BM25+vector search, and a complete RAG pipeline — all in standard SQL.
Absolute DB implements Hierarchical Navigable Small World (HNSW) indexing natively in the storage engine — no external vector database required. HNSW delivers sub-millisecond approximate nearest-neighbour search at any scale.
| Property | Value |
|---|---|
| Top-10 search latency | < 0.05 ms (in-memory); < 0.1 ms (disk-backed) |
| Maximum dimensions | 4,096 |
| Storage mode | In-memory or disk-backed (automatic) |
| Index build | Online, non-blocking background thread |
| Recall | Configurable via ef_construction and ef_search |
-- Create a vector column and HNSW index
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
title TEXT,
body TEXT,
embedding VECTOR(1536) -- OpenAI text-embedding-3 dimensions
);
-- HNSW index with cosine distance metric
CREATE INDEX idx_docs_embedding
ON documents USING HNSW (embedding)
WITH (metric = 'cosine', m = 16, ef_construction = 200);
-- HNSW index with L2 (Euclidean) distance
CREATE INDEX idx_docs_l2
ON documents USING HNSW (embedding)
WITH (metric = 'l2');
| Parameter | Default | Description |
|---|---|---|
metric | 'cosine' | Distance metric: cosine, l2, inner_product |
m | 16 | Number of bi-directional links per node. Higher = better recall, more memory |
ef_construction | 200 | Candidate set size during index build. Higher = better recall, slower build |
ef_search | 100 | Candidate set size at query time. Higher = better recall, slower search |
Three distance metrics are supported, each with its own SQL operator:
| Metric | Operator | Best For |
|---|---|---|
| Cosine similarity | <=> | Text embeddings, semantic search; scale-invariant |
| L2 (Euclidean) | <-> | Image embeddings, geometric distance |
| Inner product | <#> | Dot-product similarity; requires normalised vectors |
-- Cosine similarity (most common for text)
SELECT id, title, embedding <=> '[0.1, 0.2, ...]'::vector AS dist
FROM documents ORDER BY dist LIMIT 10;
-- L2 (Euclidean) distance
SELECT id, title, embedding <-> '[0.1, 0.2, ...]'::vector AS dist
FROM documents ORDER BY dist LIMIT 10;
-- Inner product (for dot-product models)
SELECT id, title, embedding <#> '[0.1, 0.2, ...]'::vector AS dist
FROM documents ORDER BY dist LIMIT 10;
Vector quantisation reduces memory and storage requirements for large embedding collections while maintaining high recall. Three quantisation modes are available:
PQ compresses a 1,536-dimension vector to just 96 bytes — a 16× compression ratio — by encoding sub-vectors as references to a trained codebook. K-means centroid training happens at index creation time. Excellent recall/compression trade-off for large-scale deployments.
SQ converts FP32 floating-point values to 8-bit integers using per-dimension min/max scaling. Provides 4× compression with minimal recall loss. Faster to compute than PQ.
BQ converts each dimension to a single bit. Provides 32× compression versus FP32 and enables Hamming distance computation, which is extremely fast on modern CPUs. Best for models trained for binary quantisation (e.g., Cohere Embed v3 binary).
-- Product Quantisation (16x compression, best recall/size trade-off)
CREATE INDEX idx_pq ON documents USING HNSW (embedding)
WITH (metric = 'cosine', quantisation = 'pq', pq_segments = 16);
-- Scalar Quantisation (4x compression, near-lossless)
CREATE INDEX idx_sq ON documents USING HNSW (embedding)
WITH (metric = 'cosine', quantisation = 'sq8');
-- Binary Quantisation (32x compression, fastest compute)
CREATE INDEX idx_bq ON documents USING HNSW (embedding)
WITH (metric = 'cosine', quantisation = 'binary');
Matryoshka Representation Learning (MRL) allows embeddings to be truncated to a shorter dimension while still producing meaningful similarity results. Absolute DB supports this natively — store full-dimension embeddings and query at any truncated dimension without re-embedding.
This is compatible with OpenAI text-embedding-3-small and text-embedding-3-large models, which both support Matryoshka truncation.
-- Store full 3072-dim embeddings (text-embedding-3-large)
CREATE TABLE docs (
id BIGSERIAL PRIMARY KEY,
body TEXT,
embedding VECTOR(3072)
);
-- Query with truncated dimensions (faster, smaller index)
-- absdb_truncate() returns the first N dimensions
SELECT id, body,
absdb_truncate(embedding, 256) <=> absdb_truncate(EMBED('text-embedding-3-large', $1), 256)
FROM docs
ORDER BY 3 LIMIT 10;
Sparse vectors represent high-dimensional spaces where most dimensions are zero — ideal for SPLADE, BM25-sparse, and learned sparse retrieval models. Absolute DB stores sparse vectors as sorted (dimension, value) pairs and indexes them via an inverted index for efficient dot-product computation.
| Property | Value |
|---|---|
| Maximum dimensions | 30,000 |
| Maximum non-zeros per vector | 1,000 |
| Supported models | SPLADE, BM25-sparse, learned sparse models |
| Similarity metrics | Cosine, L2 normalised dot product |
-- Create table with sparse vector column
CREATE TABLE sparse_docs (
id BIGSERIAL PRIMARY KEY,
body TEXT,
svec SPARSEVEC(30000)
);
-- Insert a sparse vector (text format: {dim:val, dim:val, ...})
INSERT INTO sparse_docs (body, svec) VALUES
('Raft consensus protocol', '{42:1.3, 107:0.9, 512:2.1, 1024:0.7}');
-- Sparse vector similarity search
SELECT id, body,
svec <=> '{42:1.3, 107:0.9}'::sparsevec AS score
FROM sparse_docs
ORDER BY score DESC
LIMIT 10;
Hybrid search combines keyword relevance (BM25) with semantic similarity (dense vector cosine) using Reciprocal Rank Fusion (RRF). RRF is rank-based, so the two score distributions need not be on the same scale — it simply combines the reciprocal ranks of results from each retrieval system.
This approach outperforms either BM25 or vector search alone on most retrieval benchmarks because it captures both exact-match signals and semantic meaning simultaneously.
-- Hybrid BM25 + vector search with RRF fusion
SELECT id, title, body
FROM documents
WHERE body MATCH 'distributed consensus algorithm'
ORDER BY embedding <=> EMBED('minilm', 'distributed consensus algorithm')
LIMIT 10;
-- Explicit RRF with weight control
SELECT * FROM absdb_hybrid_search(
table => 'documents',
query => 'distributed consensus',
vec => EMBED('minilm', 'distributed consensus'),
bm25_weight => 0.4,
vec_weight => 0.6,
top_k => 10
);
ColBERT (Contextualized Late Interaction over BERT) represents each document and query as a sequence of token-level vectors rather than a single pooled vector. At query time, the MaxSim operator computes the maximum similarity between each query token vector and all document token vectors. This late interaction approach maximises recall, especially for long documents.
-- Store per-token embeddings as a multi-vector
CREATE TABLE colbert_docs (
id BIGSERIAL PRIMARY KEY,
title TEXT,
tokens VECTOR(128)[] -- array of token vectors
);
-- MaxSim search: highest per-token similarity across all tokens
SELECT id, title,
absdb_maxsim(tokens, EMBED_TOKENS('colbert', $1)) AS score
FROM colbert_docs
ORDER BY score DESC
LIMIT 10;
Enable auto-embedding on a table to have Absolute DB automatically generate and store embeddings whenever rows are inserted or updated. The embedding model runs in-database — no application-side embedding pipeline required.
-- Enable auto-embedding: body column → embedding column
ALTER TABLE documents
SET (
auto_embed = true,
embed_column = 'body',
embed_target = 'embedding',
embed_model = 'minilm'
);
-- Now plain INSERTs automatically populate the embedding column
INSERT INTO documents (title, body) VALUES
('Raft Paper', 'Raft is a consensus algorithm designed for understandability...');
-- embedding column is populated automatically
-- Supported built-in models: minilm, e5-small, multilingual-e5
-- External ONNX models: register with absdb_model_register()
Each HNSW index can be scoped to a tenant namespace with separate access controls. This enables multi-tenant vector search where tenants cannot see each other's data, while all tenants share the same server instance.
-- Create tenant-scoped vector index
CREATE INDEX idx_tenant_a_embedding
ON documents USING HNSW (embedding)
WITH (metric = 'cosine', namespace = 'tenant_a');
-- Tenant A's searches are automatically scoped
-- to their namespace via row-level security
SET app.tenant_id = 'tenant_a';
SELECT id, title, embedding <=> $1 AS dist
FROM documents
ORDER BY dist LIMIT 10;
Absolute DB includes a complete Retrieval-Augmented Generation pipeline: chunk text, embed chunks, retrieve by similarity, and re-rank — all in SQL. The built-in absdb_rag_query() function handles the full pipeline.
-- Full RAG pipeline (retrieve + re-rank)
SELECT * FROM absdb_rag_query(
query => 'How does Raft handle leader failure?',
table => 'knowledge_base',
top_k => 5,
rerank => true,
rerank_model => 'cross-encoder'
);
-- Result columns: id, chunk_text, score, rank
-- Manual pipeline: chunk → embed → store
SELECT absdb_chunk_and_embed(
source_table => 'raw_documents',
target_table => 'knowledge_base',
chunk_size => 512,
chunk_overlap => 64,
model => 'minilm'
);
Agent memory store with TTL and consolidation is also available — episodic memory for recent interactions and semantic memory for long-term facts, both backed by HNSW.
| SQL Element | Description |
|---|---|
VECTOR(n) | Dense vector column type, up to 4,096 dimensions |
SPARSEVEC(n) | Sparse vector column type, up to 30,000 dimensions |
USING HNSW | Create HNSW index on a VECTOR column |
<=> | Cosine distance operator |
<-> | L2 (Euclidean) distance operator |
<#> | Negative inner product operator |
EMBED(model, text) | Generate embedding vector from text using named model |
EMBED_TOKENS(model, text) | Generate token-level embedding array (ColBERT) |
absdb_maxsim(vec_array, query_vecs) | ColBERT MaxSim score |
absdb_truncate(vec, dims) | Truncate vector to first N dimensions (Matryoshka) |
absdb_hybrid_search(...) | Hybrid BM25 + vector search with RRF fusion |
absdb_rag_query(...) | Full RAG retrieve-and-rerank pipeline |
~154 KB binary · zero external dependencies · 2,737 tests passing · SQL:2023 100%