The current matching algorithm is oriented toward finding a single best match per document type group. When a document arrives and multiple existing documents are genuine matches (e.g., a contract added after 5 invoices from the same supplier already exist), the system should create matches with all of them, not just pick the best one.
Replace the current decide-matches logic with a two-tier approach per type group:
1. Rule-based
2. LLM

Both tiers produce matches simultaneously; they are not mutually exclusive within a type group.
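As a rough illustration of how the two tiers could combine within a type group, here is a minimal sketch. It is not the actual `decide-matches` implementation: the names `rule-tier`, `llm-tier`, and `decide-matches-sketch`, the `:supplier-id` rule, and the `:llm-match?` flag standing in for a real LLM call are all assumptions made for the example. It also assumes that when both tiers return the same candidate, the rule-based entry is kept.

```clojure
;; Hypothetical tiers: each takes the source document and the candidates of one
;; type group and returns a seq of match maps {:candidate id, :tier keyword}.

(defn rule-tier
  "Stand-in rule (assumption): match candidates sharing the source's :supplier-id."
  [source candidates]
  (for [c candidates
        :when (= (:supplier-id source) (:supplier-id c))]
    {:candidate (:id c) :tier :rule-based}))

(defn llm-tier
  "Stand-in for the LLM call (assumption): reads a precomputed :llm-match? flag."
  [_source candidates]
  (for [c candidates
        :when (:llm-match? c)]
    {:candidate (:id c) :tier :llm}))

(defn decide-matches-sketch
  "Runs both tiers for every type group and keeps each distinct matched candidate."
  [source candidates-by-type]
  (into {}
        (for [[doc-type candidates] candidates-by-type]
          (let [matches (concat (rule-tier source candidates)
                                (llm-tier source candidates))]
            ;; Dedupe by candidate id. Rule-based matches come first in the
            ;; concatenation, so they win when both tiers pick the same candidate.
            [doc-type (->> matches
                           (reduce (fn [acc m]
                                     (if (contains? acc (:candidate m))
                                       acc
                                       (assoc acc (:candidate m) m)))
                                   {})
                           vals
                           (sort-by :candidate)
                           vec)]))))

(decide-matches-sketch
  {:id "contract-1" :supplier-id "acme"}
  {:invoice [{:id "inv-1" :supplier-id "acme"}
             {:id "inv-2" :supplier-id "other" :llm-match? true}
             {:id "inv-3" :supplier-id "other"}]})
;; => {:invoice [{:candidate "inv-1", :tier :rule-based}
;;               {:candidate "inv-2", :tier :llm}]}
```

The point of the sketch is that the result per type group is a collection of matches rather than a single winner; how overlaps between tiers should actually be resolved is a design decision this note does not settle.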
Reframe the prompt from "pick the best match" to "for each candidate, determine whether it genuinely belongs to the same business transaction as the source document." Candidates are sent together so the LLM can reason about overlaps and duplicates.
The response schema is unchanged: {matches: [{candidate, confidence, reasoning}]}. A candidate absent from the list is not a match.
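To make the "absent means no match" rule concrete, the following sketch shows one way a caller might interpret a response. The function name `matched-candidates` is hypothetical, and the example assumes the response has already been parsed into a Clojure map with keyword keys, that `candidate` is a document id string, and that `confidence` is a number; none of these details are specified above.

```clojure
(defn matched-candidates
  "Maps every candidate id that was sent to its match entry, or to nil when the
  response omits it. Entries for ids that were never sent are ignored."
  [candidate-ids response]
  (let [by-candidate (into {} (map (juxt :candidate identity)) (:matches response))]
    (into {} (map (fn [id] [id (get by-candidate id)])) candidate-ids)))

(matched-candidates
  ["inv-1" "inv-2" "inv-3"]
  {:matches [{:candidate "inv-1" :confidence 0.9 :reasoning "Same PO number"}
             {:candidate "inv-3" :confidence 0.7 :reasoning "Same supplier and amount"}]})
;; => {"inv-1" {:candidate "inv-1", :confidence 0.9, :reasoning "Same PO number"}
;;     "inv-2" nil
;;     "inv-3" {:candidate "inv-3", :confidence 0.7, :reasoning "Same supplier and amount"}}
```

Iterating over the ids that were sent, rather than over the response, means a candidate the LLM did not mention becomes an explicit non-match, and any id the LLM returned that was not among the candidates is silently dropped.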
| File | Change |
|---|---|
| src/com/getorcha/workers/matching/core.clj | Rewrite decide-matches |
| src/com/getorcha/workers/matching/llm_decision.clj | Update build-match-prompt wording |
| Tests for both namespaces | Update expectations |