Create pgvector semantic search backend with query embedding functions for all 3 models.

Purpose: Enable semantic similarity search against the pre-computed embeddings using cosine distance. For retrieval accuracy, query embeddings must use the query-side task type (RETRIEVAL_QUERY for Google, retrieval.query for Jina) rather than the document-side type used during indexing (RETRIEVAL_DOCUMENT, retrieval.passage).

Output: src/search/ module with query embedding and top-K search functions for Google, Jina, and MiniLM models.

<execution_context> @./.claude/get-shit-done/workflows/execute-plan.md @./.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/03-search-implementation/03-CONTEXT.md @.planning/phases/03-search-implementation/03-RESEARCH.md

Existing embedding modules for reference

@src/embeddings/google_embed.py @src/embeddings/jina_embed.py @src/embeddings/minilm_embed.py @src/db.py

Task 1: Create query embedding functions for all 3 models
Files: src/search/__init__.py, src/search/pgvector_search.py

Create src/search/ directory and module.

In src/search/pgvector_search.py, create query embedding functions:

  1. embed_query_google(query_text: str) -> list[float]:

    • Use existing google_embed module pattern (lazy client initialization)
    • CRITICAL: Use task_type='RETRIEVAL_QUERY' (not RETRIEVAL_DOCUMENT)
    • Returns 768-dimensional embedding
  2. embed_query_jina(query_text: str) -> list[float]:

    • Use existing jina_embed module pattern with requests
    • CRITICAL: Use task='retrieval.query' (not retrieval.passage)
    • Returns 1024-dimensional embedding
  3. embed_query_minilm(query_text: str) -> list[float]:

    • Use existing minilm_embed module pattern (lazy model loading)
    • MiniLM doesn't distinguish query vs document, so reuse same encode call
    • Returns 384-dimensional embedding with normalize_embeddings=True
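
The task types and dimensions listed above can be kept in one table so that each embedding function and its verification share a single source of truth. This is only a sketch: `QuerySpec`, `QUERY_SPECS`, and `check_embedding` are hypothetical names, not part of the existing embedding modules.

```python
from typing import NamedTuple, Optional


class QuerySpec(NamedTuple):
    task: Optional[str]  # query-side task type, or None if the model has none
    dim: int             # expected embedding dimension


# Values taken from the three function descriptions above.
QUERY_SPECS = {
    "google": QuerySpec(task="RETRIEVAL_QUERY", dim=768),
    "jina": QuerySpec(task="retrieval.query", dim=1024),
    "minilm": QuerySpec(task=None, dim=384),
}


def check_embedding(model: str, embedding: list[float]) -> list[float]:
    """Return the embedding unchanged, or raise if its dimension is wrong."""
    spec = QUERY_SPECS[model]
    if len(embedding) != spec.dim:
        raise ValueError(
            f"{model}: expected {spec.dim} dimensions, got {len(embedding)}"
        )
    return embedding
```

Each `embed_query_*` function could pass its result through `check_embedding` before returning, so a wrong model or task configuration fails early instead of at insert or search time.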

In src/search/__init__.py, export:

  • embed_query_google, embed_query_jina, embed_query_minilm
  • search_pgvector (from next task)

Follow existing module patterns:

  • Lazy initialization for API clients/models
  • Load .env for API credentials
  • Type hints for all functions
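
One common way to get the lazy initialization described above is to cache the client behind a getter. The sketch below is an assumption about the shape of that pattern, not a copy of the existing modules: `_StubClient` stands in for the real API client or model, and `EXAMPLE_API_KEY` is a hypothetical variable name.

```python
import os
from functools import lru_cache


class _StubClient:
    """Stand-in for a real API client or model; replace with the actual one."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key


@lru_cache(maxsize=1)
def get_client() -> _StubClient:
    """Build the client on first use and reuse the same instance afterwards."""
    # EXAMPLE_API_KEY is hypothetical; the real modules read their own keys,
    # typically after loading .env.
    api_key = os.environ.get("EXAMPLE_API_KEY", "")
    if not api_key:
        raise RuntimeError("EXAMPLE_API_KEY is not set")
    return _StubClient(api_key)
```

Because `lru_cache` does not cache exceptions, a call that fails on a missing key is retried on the next call rather than being stuck in a failed state.
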

Verify:

    from src.search import embed_query_google, embed_query_jina, embed_query_minilm
    # Test each returns correct dimension
    assert len(embed_query_google("test query")) == 768
    assert len(embed_query_jina("test query")) == 1024
    assert len(embed_query_minilm("test query")) == 384

Done: Query embedding functions work for all 3 models with correct dimensions and task types.

Task 2: Create pgvector search function with similarity scores
Files: src/search/pgvector_search.py

Add to src/search/pgvector_search.py:

search_pgvector(conn, query_embedding: list[float], embedding_column: str, k: int = 5) -> list[dict]:

  • Use pgvector cosine distance operator <=>
  • Calculate similarity as 1 - distance (cosine distance to similarity conversion)
  • Query fields: id, supplier_name, description, debit_account, credit_account, cost_center, net_amount, gross_amount
  • Filter: WHERE {embedding_column} IS NOT NULL
  • Order by: {embedding_column} <=> %s::vector
  • Limit to k results
  • Return list of dicts with all fields + similarity score
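
The distance-to-similarity conversion above can be checked by hand with a few small vectors. The following is a pure-Python illustration of the arithmetic, not the code pgvector runs; the function names are chosen for this example.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance, i.e. 1 - cos(a, b), which is what <=> computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def similarity_from_distance(distance: float) -> float:
    """Convert a cosine distance back to a cosine similarity."""
    return 1.0 - distance


same = cosine_distance([1.0, 0.0], [1.0, 0.0])         # identical direction
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])   # unrelated
opposite = cosine_distance([1.0, 0.0], [-1.0, 0.0])    # opposite direction
```

Identical directions give distance 0 (similarity 1), orthogonal vectors give distance 1 (similarity 0), and opposite vectors give distance 2 (similarity -1). In general, then, `1 - distance` lies in [-1, 1]; a [0, 1] check on the top results holds as long as the nearest neighbours are not anti-correlated with the query, which is the expected case.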

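Set the HNSW search parameter explicitly:

  • Execute SET hnsw.ef_search = 40 before search queries (40 is pgvector's default; higher values improve recall at the cost of speed, and an HNSW index scan returns at most ef_search rows, so raise it if k exceeds 40)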

Add convenience function search_all_models(conn, query_text: str, k: int = 5) -> dict:

  • Embeds query with all 3 models
  • Searches with each embedding
  • Returns dict with keys: 'google', 'jina', 'minilm'
  • Each value is list of result dicts
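
A possible shape for this convenience function is sketched below. To keep the example runnable without API keys or a database, the embedding functions and the search function are passed in as keyword arguments. That injection is an assumption made for illustration; the planned signature takes only `conn`, `query_text`, and `k`.

```python
from typing import Any, Callable

Embedder = Callable[[str], list[float]]
SearchFn = Callable[[Any, list[float], str, int], list[dict]]


def search_all_models(
    conn: Any,
    query_text: str,
    k: int = 5,
    *,
    embedders: dict[str, Embedder],
    search: SearchFn,
) -> dict[str, list[dict]]:
    """Embed the query once per model and search the matching column."""
    results: dict[str, list[dict]] = {}
    for model, embed in embedders.items():
        column = f"embedding_{model}"  # e.g. 'embedding_google'
        results[model] = search(conn, embed(query_text), column, k)
    return results
```

In the real module the `embedders` dict would map 'google', 'jina', and 'minilm' to the three `embed_query_*` functions, and `search` would be `search_pgvector`.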

IMPORTANT: Use parameterized queries for embedding (pass as %s::vector), but embedding_column must be interpolated directly (f-string) since column names can't be parameterized. Validate column name against allowed values ['embedding_google', 'embedding_jina', 'embedding_minilm'].
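
The whitelist-then-interpolate rule above might look like the following sketch. It assumes a psycopg-style DB-API connection (`%s` placeholders, cursor usable as a context manager). `TABLE_NAME` is hypothetical because the plan does not name the table, and the helper names `to_vector_literal` and `build_search_sql` are chosen for this example.

```python
from typing import Any

ALLOWED_COLUMNS = ("embedding_google", "embedding_jina", "embedding_minilm")
TABLE_NAME = "invoices"  # hypothetical: the plan does not name the table
FIELDS = (
    "id", "supplier_name", "description", "debit_account",
    "credit_account", "cost_center", "net_amount", "gross_amount",
)


def to_vector_literal(embedding: list[float]) -> str:
    """Format a Python list in pgvector's text form, e.g. '[0.1,2.0]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_search_sql(embedding_column: str) -> str:
    """Return the top-K query; the column is whitelisted before interpolation."""
    if embedding_column not in ALLOWED_COLUMNS:
        raise ValueError(f"Invalid embedding column: {embedding_column!r}")
    return (
        f"SELECT {', '.join(FIELDS)}, "
        f"1 - ({embedding_column} <=> %s::vector) AS similarity "
        f"FROM {TABLE_NAME} "
        f"WHERE {embedding_column} IS NOT NULL "
        f"ORDER BY {embedding_column} <=> %s::vector "
        f"LIMIT %s"
    )


def search_pgvector(conn: Any, query_embedding: list[float],
                    embedding_column: str, k: int = 5) -> list[dict]:
    """Run the search and return one dict per row, including 'similarity'."""
    sql = build_search_sql(embedding_column)  # raises before touching the DB
    vec = to_vector_literal(query_embedding)
    with conn.cursor() as cur:
        cur.execute("SET hnsw.ef_search = 40")
        # The embedding appears twice: once for the score, once for ordering.
        cur.execute(sql, (vec, vec, k))
        columns = [desc[0] for desc in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
```

Only the validated column name and the fixed table and field names are interpolated; the embedding and `k` always travel as query parameters.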

Update src/search/__init__.py exports.

Verify:

    from src.db import get_connection
    from src.search import search_pgvector, embed_query_google

    conn = get_connection()
    emb = embed_query_google("test query")
    results = search_pgvector(conn, emb, 'embedding_google', k=3)

    # Verify results structure
    assert len(results) == 3
    assert 'similarity' in results[0]
    assert 'supplier_name' in results[0]
    assert 'debit_account' in results[0]
    assert 0 <= results[0]['similarity'] <= 1
    conn.close()

Done: pgvector search returns top-K results with similarity scores and all required fields.

  1. Query embeddings use correct task types:

    • Google: RETRIEVAL_QUERY
    • Jina: retrieval.query
    • MiniLM: normalized encoding
  2. Search returns correct structure:

    • All result fields present
    • Similarity scores in [0, 1] range
    • Results ordered by similarity descending
  3. Module imports work:

    from src.search import (
        embed_query_google, embed_query_jina, embed_query_minilm,
        search_pgvector, search_all_models
    )
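
The structural checks in this list could be bundled into one helper that runs against any result list. This is a sketch; `validate_results` and `REQUIRED_FIELDS` are hypothetical names, with the field list taken from the query fields named in Task 2.

```python
REQUIRED_FIELDS = (
    "id", "supplier_name", "description", "debit_account",
    "credit_account", "cost_center", "net_amount", "gross_amount",
    "similarity",
)


def validate_results(results: list[dict], k: int) -> None:
    """Raise AssertionError if the results violate the checks listed above."""
    assert len(results) <= k, "more than k results returned"
    for row in results:
        missing = [f for f in REQUIRED_FIELDS if f not in row]
        assert not missing, f"missing fields: {missing}"
        assert 0 <= row["similarity"] <= 1, "similarity outside [0, 1]"
    scores = [row["similarity"] for row in results]
    assert scores == sorted(scores, reverse=True), "not ordered by similarity"
```

Calling `validate_results(results, k=3)` after each search in the verification steps covers field presence, score range, and ordering in one place.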
    

<success_criteria>

After completion, create `.planning/phases/03-search-implementation/03-01-SUMMARY.md`