Semantic Search Comparison

What This Is

A spike comparing semantic search approaches for invoice line item matching: LLM-based context matching (current Orcha approach) vs pgvector semantic search with multiple embedding models. Uses real GDPDU booking history to benchmark quality, latency, and cost.
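A benchmark of this kind needs some harness that scores each approach on the same labelled cases. The following is a minimal sketch of one, measuring top-1 accuracy and median latency. It does not cover cost, which would have to be tracked separately per provider. The `benchmark` function, the toy `keyword_matcher`, and the account numbers in `CASES` are invented for illustration and are not taken from the real dataset.

```python
import statistics
import time


def benchmark(matcher, cases):
    """Run matcher over (query, expected_account) pairs.

    Returns top-1 accuracy and median latency in milliseconds.
    """
    hits = 0
    latencies_ms = []
    for query, expected in cases:
        start = time.perf_counter()
        predicted = matcher(query)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
        hits += predicted == expected
    return {
        "accuracy": hits / len(cases),
        "median_latency_ms": statistics.median(latencies_ms),
    }


# Toy matcher and cases, invented for illustration only.
def keyword_matcher(query):
    return "6835" if "hosting" in query.lower() else "6815"


CASES = [
    ("Cloud hosting fee", "6835"),
    ("Office paper A4", "6815"),
    ("Hosting renewal", "6835"),
    ("Train ticket", "6670"),
]

result = benchmark(keyword_matcher, CASES)
print(result["accuracy"])
```

Any matcher that maps a description to an account can be passed in, so the LLM-based and pgvector-based approaches could be scored through the same function.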

Core Value

Determine whether pgvector semantic search can match or exceed the quality of LLM-based matching for assigning GL accounts and cost centers to invoice line items.
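As a rough illustration of what the pgvector side of this comparison does, the sketch below ranks historical bookings by cosine distance to a query embedding and returns the GL account and cost center of the closest match. The `Booking` type, the toy 3-dimensional vectors, and the account values are all made up for illustration. In Postgres, the same ranking would typically be expressed with pgvector's cosine distance operator, along the lines of `ORDER BY embedding <=> query LIMIT k`.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Booking:
    """One historical booking (hypothetical shape, not the real schema)."""
    description: str
    gl_account: str
    cost_center: str
    embedding: tuple[float, ...]


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity), the metric behind pgvector's <=> operator."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query_embedding, history, k=1):
    """Return the k bookings closest to the query, nearest first."""
    return sorted(history, key=lambda b: cosine_distance(query_embedding, b.embedding))[:k]


# Toy data: made-up vectors standing in for real embeddings.
HISTORY = [
    Booking("Office paper A4", "6815", "100", (0.9, 0.1, 0.0)),
    Booking("Cloud hosting fee", "6835", "200", (0.1, 0.9, 0.1)),
    Booking("Train ticket Berlin", "6670", "300", (0.0, 0.2, 0.9)),
]

best = top_k((0.2, 0.8, 0.1), HISTORY, k=1)[0]
print(best.gl_account, best.cost_center)
```

Taking the single nearest neighbour is the simplest policy. A majority vote over the top few matches is another option the spike could evaluate.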

Requirements

Validated

(None yet — ship to validate)

Active

Out of Scope

Context

Current Orcha implementation:

Dataset:

Embedding models to compare:

  1. Google text-multilingual-embedding-002 — production-ready, multilingual
  2. Jina AI embeddings — retrieval-optimized
  3. all-MiniLM-L6-v2 — local, free, 384 dimensions
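
Comparing three models is easier when they sit behind a shared interface. The sketch below shows one possible shape: an `Embedder` protocol plus a deterministic hashing stand-in, so the pipeline can be exercised without API keys or model downloads. `Embedder`, `HashingEmbedder`, and `check_dimensions` are hypothetical names and not part of any of the three providers' libraries. Real adapters would wrap each provider's client behind the same `embed` method.

```python
import hashlib
import math
from typing import Protocol, Sequence


class Embedder(Protocol):
    """Hypothetical common interface for every model under comparison."""
    name: str
    dimensions: int

    def embed(self, texts: Sequence[str]) -> list[list[float]]: ...


class HashingEmbedder:
    """Stand-in embedder: hashes tokens into a fixed-size, L2-normalised vector.

    It carries no semantic knowledge; it only exists so the pipeline can be
    tested offline before a real model is plugged in.
    """

    def __init__(self, dimensions: int = 384, name: str = "hashing-stub"):
        self.name = name
        self.dimensions = dimensions

    def embed(self, texts):
        vectors = []
        for text in texts:
            vec = [0.0] * self.dimensions
            for token in text.lower().split():
                digest = hashlib.sha256(token.encode("utf-8")).digest()
                index = int.from_bytes(digest[:4], "big") % self.dimensions
                vec[index] += 1.0
            norm = math.sqrt(sum(v * v for v in vec)) or 1.0
            vectors.append([v / norm for v in vec])
        return vectors


def check_dimensions(embedder: Embedder, sample: str = "test") -> bool:
    """Verify that an embedder returns vectors of its declared size."""
    return all(len(v) == embedder.dimensions for v in embedder.embed([sample]))


stub = HashingEmbedder(dimensions=384)  # 384 matches all-MiniLM-L6-v2's output size
print(check_dimensions(stub))
```

Because each model has its own vector size, a check like `check_dimensions` is useful before inserting into a pgvector column, whose dimension is fixed when the column is declared.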

Evaluation approach:

Constraints

Key Decisions

| Decision | Rationale | Outcome |
|----------|-----------|---------|
| Postgres 18 in Docker | Isolated environment, easy pgvector setup | Pending |
| Synthetic test queries | Test robustness to description variations | Pending |
| Three embedding models | Compare local vs API, different architectures | Pending |
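
To make the "synthetic test queries" decision more concrete, here is one way description variants could be generated from a known booking text, so that each variant keeps the original's GL account as its expected answer. The specific perturbations (casing, abbreviation, a dropped character, word reordering) and the `ABBREVIATIONS` map are assumptions for illustration. The spike may settle on a different set.

```python
import random

# Hypothetical abbreviation map; a real one would come from the booking data.
ABBREVIATIONS = {"invoice": "inv.", "monthly": "mthly", "license": "lic."}


def abbreviate(text):
    """Replace known words with their abbreviations."""
    return " ".join(ABBREVIATIONS.get(w.lower(), w) for w in text.split())


def drop_char(text, rng):
    """Simulate a typo by deleting one character at a random position."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text))
    return text[:i] + text[i + 1:]


def reorder(text, rng):
    """Shuffle word order while keeping the same words."""
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)


def make_variants(description, seed=0):
    """Return a dict of named perturbations of one description."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    return {
        "upper": description.upper(),
        "abbreviated": abbreviate(description),
        "typo": drop_char(description, rng),
        "reordered": reorder(description, rng),
    }


variants = make_variants("Monthly software license invoice")
for kind, text in variants.items():
    print(f"{kind}: {text}")
```

Keeping the perturbation name alongside each variant makes it possible to report accuracy per variation type rather than as a single aggregate.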

Last updated: 2026-02-20 after initialization