Vector Search

HNSW Index

Absolute DB implements Hierarchical Navigable Small World (HNSW) indexing natively in the storage engine — no external vector database required. HNSW delivers sub-millisecond approximate nearest-neighbour search at any scale.

Property	Value
Top-10 search latency	< 0.05 ms (in-memory); < 0.1 ms (disk-backed)
Maximum dimensions	4,096
Storage mode	In-memory or disk-backed (automatic)
Index build	Online, non-blocking background thread
Recall	Configurable via `ef_construction` and `ef_search`

sql — create HNSW index

-- Create a vector column and HNSW index
CREATE TABLE documents (
    id        BIGSERIAL PRIMARY KEY,
    title     TEXT,
    body      TEXT,
    embedding VECTOR(1536)   -- OpenAI text-embedding-3 dimensions
);

-- HNSW index with cosine distance metric
CREATE INDEX idx_docs_embedding
    ON documents USING HNSW (embedding)
    WITH (metric = 'cosine', m = 16, ef_construction = 200);

-- HNSW index with L2 (Euclidean) distance
CREATE INDEX idx_docs_l2
    ON documents USING HNSW (embedding)
    WITH (metric = 'l2');

HNSW Parameters

Parameter	Default	Description
`metric`	`'cosine'`	Distance metric: `cosine`, `l2`, `inner_product`
`m`	`16`	Number of bi-directional links per node. Higher = better recall, more memory
`ef_construction`	`200`	Candidate set size during index build. Higher = better recall, slower build
`ef_search`	`100`	Candidate set size at query time. Higher = better recall, slower search

Distance Metrics

Three distance metrics are supported, each with its own SQL operator:

Metric	Operator	Best For
Cosine similarity	`<=>`	Text embeddings, semantic search; scale-invariant
L2 (Euclidean)	`<->`	Image embeddings, geometric distance
Inner product	`<#>`	Dot-product similarity; requires normalised vectors

sql — distance metric examples

-- Cosine similarity (most common for text)
SELECT id, title, embedding <=> '[0.1, 0.2, ...]'::vector AS dist
FROM documents ORDER BY dist LIMIT 10;

-- L2 (Euclidean) distance
SELECT id, title, embedding <-> '[0.1, 0.2, ...]'::vector AS dist
FROM documents ORDER BY dist LIMIT 10;

-- Inner product (for dot-product models)
SELECT id, title, embedding <#> '[0.1, 0.2, ...]'::vector AS dist
FROM documents ORDER BY dist LIMIT 10;

Quantisation

Vector quantisation reduces memory and storage requirements for large embedding collections while maintaining high recall. Three quantisation modes are available:

Product Quantisation (PQ)

PQ compresses a 1,536-dimension vector to just 96 bytes — a 16× compression ratio — by encoding sub-vectors as references to a trained codebook. K-means centroid training happens at index creation time. Excellent recall/compression trade-off for large-scale deployments.

Scalar Quantisation (SQ)

SQ converts FP32 floating-point values to 8-bit integers using per-dimension min/max scaling. Provides 4× compression with minimal recall loss. Faster to compute than PQ.

Binary Quantisation (BQ)

BQ converts each dimension to a single bit. Provides 32× compression versus FP32 and enables Hamming distance computation, which is extremely fast on modern CPUs. Best for models trained for binary quantisation (e.g., Cohere Embed v3 binary).

sql — quantisation options

-- Product Quantisation (16x compression, best recall/size trade-off)
CREATE INDEX idx_pq ON documents USING HNSW (embedding)
WITH (metric = 'cosine', quantisation = 'pq', pq_segments = 16);

-- Scalar Quantisation (4x compression, near-lossless)
CREATE INDEX idx_sq ON documents USING HNSW (embedding)
WITH (metric = 'cosine', quantisation = 'sq8');

-- Binary Quantisation (32x compression, fastest compute)
CREATE INDEX idx_bq ON documents USING HNSW (embedding)
WITH (metric = 'cosine', quantisation = 'binary');

Matryoshka Adaptive Dimensions

Matryoshka Representation Learning (MRL) allows embeddings to be truncated to a shorter dimension while still producing meaningful similarity results. Absolute DB supports this natively — store full-dimension embeddings and query at any truncated dimension without re-embedding.

This is compatible with OpenAI text-embedding-3-small and text-embedding-3-large models, which both support Matryoshka truncation.

sql — Matryoshka truncation

-- Store full 3072-dim embeddings (text-embedding-3-large)
CREATE TABLE docs (
    id        BIGSERIAL PRIMARY KEY,
    body      TEXT,
    embedding VECTOR(3072)
);

-- Query with truncated dimensions (faster, smaller index)
-- absdb_truncate() returns the first N dimensions
SELECT id, body,
       absdb_truncate(embedding, 256) <=> absdb_truncate(EMBED('text-embedding-3-large', $1), 256)
FROM docs
ORDER BY 3 LIMIT 10;

Sparse Vectors

Sparse vectors represent high-dimensional spaces where most dimensions are zero — ideal for SPLADE, BM25-sparse, and learned sparse retrieval models. Absolute DB stores sparse vectors as sorted (dimension, value) pairs and indexes them via an inverted index for efficient dot-product computation.

Property	Value
Maximum dimensions	30,000
Maximum non-zeros per vector	1,000
Supported models	SPLADE, BM25-sparse, learned sparse models
Similarity metrics	Cosine, L2 normalised dot product

sql — sparse vector search

-- Create table with sparse vector column
CREATE TABLE sparse_docs (
    id       BIGSERIAL PRIMARY KEY,
    body     TEXT,
    svec     SPARSEVEC(30000)
);

-- Insert a sparse vector (text format: {dim:val, dim:val, ...})
INSERT INTO sparse_docs (body, svec) VALUES
('Raft consensus protocol', '{42:1.3, 107:0.9, 512:2.1, 1024:0.7}');

-- Sparse vector similarity search
SELECT id, body,
       svec <=> '{42:1.3, 107:0.9}'::sparsevec AS score
FROM sparse_docs
ORDER BY score DESC
LIMIT 10;

Hybrid Search (BM25 + Vector)

Hybrid search combines keyword relevance (BM25) with semantic similarity (dense vector cosine) using Reciprocal Rank Fusion (RRF). RRF is rank-based, so the two score distributions need not be on the same scale — it simply combines the reciprocal ranks of results from each retrieval system.

This approach outperforms either BM25 or vector search alone on most retrieval benchmarks because it captures both exact-match signals and semantic meaning simultaneously.

sql — hybrid search

-- Hybrid BM25 + vector search with RRF fusion
SELECT id, title, body
FROM documents
WHERE body MATCH 'distributed consensus algorithm'
ORDER BY embedding <=> EMBED('minilm', 'distributed consensus algorithm')
LIMIT 10;

-- Explicit RRF with weight control
SELECT * FROM absdb_hybrid_search(
    table   => 'documents',
    query   => 'distributed consensus',
    vec     => EMBED('minilm', 'distributed consensus'),
    bm25_weight => 0.4,
    vec_weight  => 0.6,
    top_k       => 10
);

Multi-Vector MaxSim (ColBERT)

ColBERT (Contextualized Late Interaction over BERT) represents each document and query as a sequence of token-level vectors rather than a single pooled vector. At query time, the MaxSim operator computes the maximum similarity between each query token vector and all document token vectors. This late interaction approach maximises recall, especially for long documents.

sql — ColBERT MaxSim search

-- Store per-token embeddings as a multi-vector
CREATE TABLE colbert_docs (
    id      BIGSERIAL PRIMARY KEY,
    title   TEXT,
    tokens  VECTOR(128)[]   -- array of token vectors
);

-- MaxSim search: highest per-token similarity across all tokens
SELECT id, title,
       absdb_maxsim(tokens, EMBED_TOKENS('colbert', $1)) AS score
FROM colbert_docs
ORDER BY score DESC
LIMIT 10;

Auto-Embedding on INSERT/UPDATE

Enable auto-embedding on a table to have Absolute DB automatically generate and store embeddings whenever rows are inserted or updated. The embedding model runs in-database — no application-side embedding pipeline required.

sql — auto-embedding configuration

-- Enable auto-embedding: body column → embedding column
ALTER TABLE documents
    SET (
        auto_embed      = true,
        embed_column    = 'body',
        embed_target    = 'embedding',
        embed_model     = 'minilm'
    );

-- Now plain INSERTs automatically populate the embedding column
INSERT INTO documents (title, body) VALUES
    ('Raft Paper', 'Raft is a consensus algorithm designed for understandability...');
-- embedding column is populated automatically

-- Supported built-in models: minilm, e5-small, multilingual-e5
-- External ONNX models: register with absdb_model_register()

Namespace / Collection Isolation

Each HNSW index can be scoped to a tenant namespace with separate access controls. This enables multi-tenant vector search where tenants cannot see each other's data, while all tenants share the same server instance.

sql — namespace isolation

-- Create tenant-scoped vector index
CREATE INDEX idx_tenant_a_embedding
    ON documents USING HNSW (embedding)
    WITH (metric = 'cosine', namespace = 'tenant_a');

-- Tenant A's searches are automatically scoped
-- to their namespace via row-level security
SET app.tenant_id = 'tenant_a';
SELECT id, title, embedding <=> $1 AS dist
FROM documents
ORDER BY dist LIMIT 10;

RAG Pipeline

Absolute DB includes a complete Retrieval-Augmented Generation pipeline: chunk text, embed chunks, retrieve by similarity, and re-rank — all in SQL. The built-in absdb_rag_query() function handles the full pipeline.

sql — RAG query

-- Full RAG pipeline (retrieve + re-rank)
SELECT * FROM absdb_rag_query(
    query      => 'How does Raft handle leader failure?',
    table      => 'knowledge_base',
    top_k      => 5,
    rerank     => true,
    rerank_model => 'cross-encoder'
);

-- Result columns: id, chunk_text, score, rank

-- Manual pipeline: chunk → embed → store
SELECT absdb_chunk_and_embed(
    source_table => 'raw_documents',
    target_table => 'knowledge_base',
    chunk_size   => 512,
    chunk_overlap => 64,
    model        => 'minilm'
);

Agent memory store with TTL and consolidation is also available — episodic memory for recent interactions and semantic memory for long-term facts, both backed by HNSW.

SQL Reference

SQL Element	Description
`VECTOR(n)`	Dense vector column type, up to 4,096 dimensions
`SPARSEVEC(n)`	Sparse vector column type, up to 30,000 dimensions
`USING HNSW`	Create HNSW index on a VECTOR column
`<=>`	Cosine distance operator
`<->`	L2 (Euclidean) distance operator
`<#>`	Negative inner product operator
`EMBED(model, text)`	Generate embedding vector from text using named model
`EMBED_TOKENS(model, text)`	Generate token-level embedding array (ColBERT)
`absdb_maxsim(vec_array, query_vecs)`	ColBERT MaxSim score
`absdb_truncate(vec, dims)`	Truncate vector to first N dimensions (Matryoshka)
`absdb_hybrid_search(...)`	Hybrid BM25 + vector search with RRF fusion
`absdb_rag_query(...)`	Full RAG retrieve-and-rerank pipeline

Contents