Pure dense retrieval misses exact-keyword, code snippet, and proper-noun queries. Postgres has native full-text search (`tsvector`, `tsquery`, `ts_rank`) that implements a BM25-like scoring model. Combining dense ANN scores with BM25 keyword scores using Reciprocal Rank Fusion (RRF) consistently outperforms either retrieval method alone. For the full RRF algorithm and BEIR benchmark comparisons, see hybrid search BM25 + dense tutorial.
The Postgres hybrid query runs both retrievals in a single SQL statement using a CTE and merges scores with RRF:
```sql
WITH dense AS (
-- Dense retrieval: top 50 by cosine similarity
SELECT id, 1 - (embedding <=> $1::vector) AS dense_score,
ROW_NUMBER() OVER (ORDER BY embedding <=> $1::vector) AS dense_rank
FROM docs
ORDER BY embedding <=> $1::vector
LIMIT 50
),
bm25 AS (
-- BM25 retrieval: top 50 by ts_rank
SELECT id,
ts_rank(ts_content, plainto_tsquery('english', $2)) AS bm25_score,
ROW_NUMBER() OVER (
ORDER BY ts_rank(ts_content, plainto_tsquery('english', $2)) DESC
) AS bm25_rank
FROM docs
WHERE ts_content @@ plainto_tsquery('english', $2)
LIMIT 50
),
fused AS (
-- Reciprocal Rank Fusion: score = 1/(60 + rank_dense) + 1/(60 + rank_bm25)
SELECT
COALESCE(d.id, b.id) AS id,
COALESCE(1.0 / (60.0 + d.dense_rank), 0) +
COALESCE(1.0 / (60.0 + b.bm25_rank), 0) AS rrf_score
FROM dense d
FULL OUTER JOIN bm25 b ON d.id = b.id
)
SELECT docs.id, docs.source_id, docs.content, fused.rrf_score
FROM fused
JOIN docs ON docs.id = fused.id
ORDER BY fused.rrf_score DESC
LIMIT $3;
```
```python
HYBRID_SQL = """
WITH dense AS (
SELECT id, 1 - (embedding <=> $1::vector) AS dense_score,
ROW_NUMBER() OVER (ORDER BY embedding <=> $1::vector) AS dense_rank
FROM docs
ORDER BY embedding <=> $1::vector
LIMIT 50
),
bm25 AS (
SELECT id,
ts_rank(ts_content, plainto_tsquery('english', $2)) AS bm25_score,
ROW_NUMBER() OVER (
ORDER BY ts_rank(ts_content, plainto_tsquery('english', $2)) DESC
) AS bm25_rank
FROM docs
WHERE ts_content @@ plainto_tsquery('english', $2)
LIMIT 50
),
fused AS (
SELECT
COALESCE(d.id, b.id) AS id,
COALESCE(1.0 / (60.0 + d.dense_rank), 0) +
COALESCE(1.0 / (60.0 + b.bm25_rank), 0) AS rrf_score
FROM dense d
FULL OUTER JOIN bm25 b ON d.id = b.id
)
SELECT docs.id, docs.source_id, docs.content, fused.rrf_score
FROM fused
JOIN docs ON docs.id = fused.id
ORDER BY fused.rrf_score DESC
LIMIT $3;
"""
async def retrieve_hybrid(
query: str,
conn,
top_k: int = 5,
) -> list[dict]:
"""Hybrid dense + BM25 retrieval via RRF in a single Postgres query."""
q_emb = oai.embeddings.create(model=EMBED_MODEL, input=[query]).data[0].embedding
q_str = str(q_emb)
rows = await conn.fetch(HYBRID_SQL, q_str, query, top_k)
return [dict(row) for row in rows]
```
The `FULL OUTER JOIN` between dense and BM25 results ensures that documents appearing in only one retrieval set are still included in the fusion — a document that ranks 1st in BM25 but doesn't appear in dense top-50 still scores `1/(60+1) = 0.0164`, which often beats documents that rank 40th+ in both. The constant k=60 in the RRF denominator is the standard choice from the original RRF paper (Cormack et al., 2009); it de-emphasizes the very top ranks to smooth score variance. Use `plainto_tsquery` rather than `to_tsquery` for user input — `plainto_tsquery` handles arbitrary text safely, while `to_tsquery` requires AND/OR/NOT operator syntax and will error on natural-language queries.