Semantic Search Comparison
What This Is
A spike comparing semantic search approaches for invoice line item matching: LLM-based context matching (current Orcha approach) vs pgvector semantic search with multiple embedding models. Uses real GDPDU booking history to benchmark quality, latency, and cost.
Core Value
Determine whether pgvector semantic search can match or exceed the quality of LLM-based matching for assigning GL accounts and cost centers to invoice line items.
Requirements
Validated
(None yet — ship to validate)
Active
Out of Scope
- Production deployment — this is a spike
- Supplier matching improvements — focus is on line item semantic matching
- Multi-tenant support — single dataset comparison
- Real-time embedding generation — pre-compute embeddings during import
Context
Current Orcha implementation:
- Suppliers matched via pg_trgm (fuzzy string similarity >= 0.7)
- Up to 50 historical bookings for matched supplier passed as CSV to LLM
- Gemini 2.5 Flash performs semantic matching to suggest accounts/cost centers
- Reference code: orcha/src/com/getorcha/workers/ingestion/post_process.clj:41-69
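The CSV hand-off described above can be sketched in Python for the spike. This is not the Clojure code in post_process.clj. The function name `bookings_to_csv`, the `FIELDS` list, and the choice of columns are assumptions borrowed from the dataset description; only the cap of 50 bookings comes from the notes above.

```python
import csv
import io

MAX_BOOKINGS = 50  # cap taken from the current-implementation notes

# Assumed column selection; the real prompt may include different fields.
FIELDS = [
    "Line Item Description",
    "Net Amount",
    "Debit Account",
    "Credit Account",
    "Cost Center",
]


def bookings_to_csv(bookings, limit=MAX_BOOKINGS):
    """Render at most `limit` historical bookings as CSV text for an LLM prompt."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=FIELDS,
        extrasaction="ignore",  # silently drop keys that are not in FIELDS
        lineterminator="\n",
    )
    writer.writeheader()
    for booking in bookings[:limit]:
        writer.writerow(booking)  # missing keys are written as empty strings
    return buffer.getvalue()
```

Keeping the prompt builder separate from the supplier lookup would let the spike feed the same booking subset to both the LLM path and the pgvector path.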
Dataset:
- Source: orcha/dump/regnology/historical.csv
- ~6,078 rows of GDPDU-style bookings
- Columns: Supplier Name, Line Item Description, Net Amount, Debit Account, Credit Account, Cost Center
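A minimal loader for this file might look like the following. It assumes the CSV is comma-delimited with a header row that uses exactly the column names listed above, which has not been verified against the actual dump. `load_bookings` and `EXPECTED_COLUMNS` are hypothetical names.

```python
import csv
import io

# Column names as listed in the dataset description (assumed to match the header row).
EXPECTED_COLUMNS = [
    "Supplier Name",
    "Line Item Description",
    "Net Amount",
    "Debit Account",
    "Credit Account",
    "Cost Center",
]


def load_bookings(handle):
    """Read bookings from an open text file and skip rows without a description."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [column for column in EXPECTED_COLUMNS if column not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [
        row
        for row in reader
        if (row["Line Item Description"] or "").strip()
    ]
```

Typical use would be `with open(path, newline="", encoding="utf-8") as f: rows = load_bookings(f)`. If the export turns out to be semicolon-delimited, passing `delimiter=";"` to `csv.DictReader` would be the only change needed.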
Embedding models to compare:
- Google text-multilingual-embedding-002 — production-ready, multilingual
- Jina AI embeddings — retrieval-optimized
- all-MiniLM-L6-v2 — local, free, 384 dimensions
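To make the pgvector side concrete, here is a sketch of how one model's embeddings could be stored and queried. The table name `booking_embeddings_minilm` and its columns are hypothetical. `vector(384)` matches the all-MiniLM-L6-v2 dimension listed above, and the other models would need their own column widths. `<=>` is pgvector's cosine distance operator, and the pure-Python `cosine_distance` helper only illustrates what that operator computes.

```python
import math

# Hypothetical schema: one table per embedding model, since dimensions differ.
CREATE_TABLE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS booking_embeddings_minilm (
    booking_id bigint PRIMARY KEY,
    embedding  vector(384) NOT NULL
);
"""

# Nearest-neighbour lookup by cosine distance (smaller means more similar).
# Parameters use psycopg-style named placeholders; the query vector is
# passed as text such as '[0.1, 0.2, ...]' and cast to vector.
TOP_K_SQL = """
SELECT booking_id, embedding <=> %(query)s::vector AS distance
FROM booking_embeddings_minilm
ORDER BY embedding <=> %(query)s::vector
LIMIT %(k)s;
"""


def cosine_distance(a, b):
    """Return 1 - cosine similarity, the quantity pgvector's <=> operator returns."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / norm
```

Using one table per model keeps the comparison simple, at the cost of repeating the import step for each model.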
Evaluation approach:
- Synthetic variations: modify existing descriptions slightly
- Ground truth: original booking's GL account and cost center
- Metrics: exact match, top-K, LLM-as-judge, human eval, latency, cost
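The exact-match and top-K metrics could be computed with one helper, since exact match is top-K with K = 1. This sketch assumes each prediction is a ranked list of (GL account, cost center) pairs and each ground-truth label is a single such pair. `top_k_accuracy` is a hypothetical name, and the account numbers in the usage note are made up for illustration.

```python
def top_k_accuracy(predictions, truths, k):
    """Fraction of queries whose ground-truth label appears in the first k candidates.

    predictions: list of ranked candidate lists, best candidate first.
    truths: list of ground-truth labels, aligned with predictions.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    if not truths:
        return 0.0
    hits = sum(
        1
        for ranked, truth in zip(predictions, truths)
        if truth in ranked[:k]
    )
    return hits / len(truths)
```

For example, with two queries whose correct label is `("4400", "100")`, where the first ranks it first and the second ranks it second, `top_k_accuracy(..., k=1)` gives 0.5 and `k=2` gives 1.0.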
Constraints
- Tech stack: Python for embeddings/API, Postgres 18 in Docker
- Postgres port: Use non-standard port (e.g., 5433) — standard port 5432 is in use by another project
- Credentials: Use existing Orcha Google API credentials for Gemini and embeddings
- Budget: Jina AI may require an API key; guide the user through obtaining one if needed
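The port constraint can be captured in one place so that no script falls back to 5432 by accident. The environment variable names, the default user, password, and database name below are all assumptions for illustration; only the 5433 default comes from the constraint above.

```python
import os


def postgres_dsn(env=os.environ):
    """Build a connection URI for the spike database, defaulting to port 5433."""
    host = env.get("SPIKE_PG_HOST", "localhost")
    port = int(env.get("SPIKE_PG_PORT", "5433"))  # 5432 is taken by another project
    user = env.get("SPIKE_PG_USER", "postgres")
    password = env.get("SPIKE_PG_PASSWORD", "postgres")
    database = env.get("SPIKE_PG_DB", "semantic_search")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

The Docker container would then need to publish its internal 5432 on host port 5433 for this default to work.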
Key Decisions
| Decision | Rationale | Outcome |
|----------|-----------|---------|
| Postgres 18 in Docker | Isolated environment, easy pgvector setup | Pending |
| Synthetic test queries | Test robustness to description variations | Pending |
| Three embedding models | Compare local vs API, different architectures | Pending |
Last updated: 2026-02-20 after initialization