Issue: #291 Date: 2026-03-02
The matching system finds related documents but doesn't verify whether their contents agree. An invoice that overcharges by 50% compared to the PO still matches with high confidence. Line items, quantities, and prices are never cross-checked.
After documents are matched into a cluster, a single LLM call compares their contents and produces a structured reconciliation report surfacing price discrepancies, quantity mismatches, unmatched line items, and total inconsistencies.
document_cluster
CREATE TABLE document_cluster (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  reconciliation jsonb,
  reconciled_at timestamptz,
  created_at timestamptz NOT NULL DEFAULT now(),
  updated_at timestamptz NOT NULL DEFAULT now()
);
document.cluster_id becomes a FK to document_cluster(id) ON DELETE SET NULL.
Existing cluster UUIDs are migrated: one document_cluster row per distinct cluster_id currently in use.
{:status "reconciled" ;; or "discrepancies"
 :summary "..." ;; cluster-level human-readable summary
 :issues [{:severity "warning" ;; or "error"
           :category "price-discrepancy"
           :summary "Invoice bills €120/unit but PO specifies €100/unit for Widget A"
           :document-ids ["<uuid-a>" "<uuid-b>"]
           :details [{:field "unit-price"
                      :expected "100.00"
                      :actual "120.00"}]}]}
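To make the shape concrete, here is a minimal sketch of how a consumer might tally issues by severity from a stored report. The helper name `issue-counts`, the second sample issue, and its category are illustrative assumptions rather than part of the design. The sketch also assumes the JSONB has been parsed into a map with keyword keys.

```clojure
(def sample-report
  ;; Sample data following the shape above; the values are made up for illustration.
  {:status "discrepancies"
   :summary "Invoice and PO disagree on unit price."
   :issues [{:severity "error"
             :category "price-discrepancy"
             :summary "Invoice bills €120/unit but PO specifies €100/unit for Widget A"
             :document-ids ["<uuid-a>" "<uuid-b>"]
             :details [{:field "unit-price" :expected "100.00" :actual "120.00"}]}
            {:severity "warning"
             :category "unmatched-line-item" ; hypothetical category name
             :summary "Invoice lists a shipping line that the PO does not"
             :document-ids ["<uuid-a>"]}]})

(defn issue-counts
  "Hypothetical helper: returns a map of severity -> number of issues."
  [report]
  (frequencies (map :severity (:issues report))))
```

Because `:details` is optional, a consumer like this should not rely on it being present on every issue.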
No changes to document_match.
assign-cluster! updated
Creates/merges document_cluster rows instead of bare UUIDs:
INSERT INTO document_cluster, set both docs' cluster_id
format-document-summary made public
Currently private in llm_decision.clj. Made public for reuse by reconciliation.
com.getorcha.workers.matching.reconciliation
reconcile-cluster!:
format-document-summary, including document ID
document_cluster.reconciliation + reconciled_at
In process-document! (worker.clj), between match-document! and set-matching-status! "succeeded":
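(let [;; Cluster membership before matching (nil if the document was unclustered).
      cluster-before (get-cluster-id db doc-id)
      _              (matching/match-document! db search-config llm-config doc)
      ;; Matching may have moved the document into a different cluster.
      cluster-after  (get-cluster-id db doc-id)
      ;; Set semantics deduplicate when the cluster did not change.
      affected       (cond-> #{}
                       cluster-before (conj cluster-before)
                       cluster-after  (conj cluster-after))]
  (doseq [cluster-id affected]
    (reconciliation/reconcile-cluster! db llm-config cluster-id))
  (db.matching/set-matching-status! db doc-id {:status "succeeded"}))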
Handles reingestion naturally: old and new clusters are both reconciled.
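The set-building step can be pulled out as a pure function, which makes the reingestion cases easy to check in isolation. This is a sketch only: the name `affected-clusters` is hypothetical, and plain strings stand in for cluster UUIDs.

```clojure
(defn affected-clusters
  "Returns the set of cluster IDs that need reconciliation after matching.
  Either argument may be nil (the document was, or still is, unclustered)."
  [cluster-before cluster-after]
  (cond-> #{}
    cluster-before (conj cluster-before)
    cluster-after  (conj cluster-after)))

;; Unchanged cluster: reconciled once.
(affected-clusters "c1" "c1")
;; Document moved between clusters: both are reconciled.
(affected-clusters "c1" "c2")
;; Document never clustered: nothing to reconcile.
(affected-clusters nil nil)
```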
On reconciliation failure (after retries): log warning, notify admins, still set matching status to "succeeded". Reconciliation is additive — missing results don't block matching.
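One way to get this non-blocking behaviour is sketched below. `with-retry` is a hypothetical stand-in for the project's with-llm-retry (3 attempts, exponential backoff), whose real signature is not shown in this document. `reconcile-safely!` and the `on-failure` callback are likewise assumptions, standing in for the log-and-notify step.

```clojure
(defn with-retry
  "Hypothetical stand-in for with-llm-retry: calls f up to max-attempts times,
  sleeping base-ms * 2^(attempt-1) between failed attempts.
  Rethrows the last exception when every attempt fails."
  [{:keys [max-attempts base-ms]} f]
  (loop [attempt 1]
    (let [result (try
                   {:ok (f)}
                   (catch Exception e
                     (if (< attempt max-attempts)
                       {:retry e}
                       (throw e))))]
      (if (contains? result :ok)
        (:ok result)
        (do
          ;; Exponential backoff: base-ms, 2*base-ms, 4*base-ms, ...
          (Thread/sleep (long (* base-ms (Math/pow 2 (dec attempt)))))
          (recur (inc attempt)))))))

(defn reconcile-safely!
  "Runs reconcile-fn with retries. On final failure, hands the exception to
  on-failure and returns nil, so the caller can still mark matching as succeeded."
  [reconcile-fn on-failure]
  (try
    ;; Short base delay for illustration; a real value would be larger.
    (with-retry {:max-attempts 3 :base-ms 10} reconcile-fn)
    (catch Exception e
      (on-failure e)
      nil)))
```

The key design point is that the exception never escapes `reconcile-safely!`, so the call to set-matching-status! always runs.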
:matching config)
with-llm-retry pattern (3 attempts, exponential backoff)
System message: role, rules (tolerance, matching strategy, output format).
User message: all documents in the cluster, each with its UUID and formatted summary, followed by instructions to identify price discrepancies, quantity mismatches, unmatched line items, and total inconsistencies.
Response: JSON matching ReconciliationResponse schema.
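The message layout described above could be assembled roughly as follows. This is a sketch under assumptions: the `{:role ... :content ...}` message shape, the function names, and the instruction wording are illustrative, and `:summary` stands for whatever format-document-summary returns for each document.

```clojure
(require '[clojure.string :as str])

(defn user-message
  "Builds the user message text from a seq of {:id ... :summary ...} maps.
  :summary stands for the output of format-document-summary."
  [docs]
  (str "Compare the following documents and report price discrepancies, "
       "quantity mismatches, unmatched line items, and total inconsistencies.\n\n"
       (str/join "\n\n"
                 (for [{:keys [id summary]} docs]
                   (str "Document " id ":\n" summary)))))

(defn reconciliation-messages
  "Returns the system and user messages for one reconciliation call."
  [system-prompt docs]
  [{:role "system" :content system-prompt}
   {:role "user" :content (user-message docs)}])
```

Putting each document's UUID directly above its summary is what lets the model fill in `:document-ids` on each issue.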
(def ReconciliationIssue
  [:map
   [:severity [:enum "warning" "error"]]
   [:category :string]
   [:summary :string]
   [:document-ids [:vector :string]]
   [:details {:optional true}
    [:vector
     [:map
      [:field :string]
      [:expected [:maybe :string]]
      [:actual [:maybe :string]]]]]])
(def ReconciliationResponse
  [:map
   [:status [:enum "reconciled" "discrepancies"]]
   [:summary :string]
   [:issues [:vector ReconciliationIssue]]])
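The schema only requires `:document-ids` to be strings, so it cannot tell whether an ID actually belongs to the cluster being reconciled. The design does not call for it, but a caller could add a check along these lines. This is a sketch: `unknown-document-ids` is a hypothetical helper, and it assumes the response has already been parsed into keyword-keyed maps.

```clojure
(require '[clojure.set :as set])

(defn unknown-document-ids
  "Returns the set of IDs referenced by issues that are not in cluster-doc-ids."
  [response cluster-doc-ids]
  (set/difference (set (mapcat :document-ids (:issues response)))
                  (set cluster-doc-ids)))

;; An empty result means every referenced ID belongs to the cluster.
(unknown-document-ids {:issues [{:document-ids ["a" "b"]}]} ["a" "b"])
```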
Badge next to "Matches" heading:
status = "reconciled"
status = "discrepancies"
Below match cards, inside the matches section. Each issue renders:
Reconciliation data loaded from document_cluster.reconciliation via the document's cluster_id. Part of the matches section — no separate endpoint. The existing matching-complete SSE event refreshes the whole section, so reconciliation results appear when matching finishes.
Single Sonnet call per cluster. Typical cluster of 2-3 documents ≈ 2-4K input tokens ≈ $0.01-0.02 per reconciliation.
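The estimate can be reproduced with simple arithmetic. The sketch below is illustrative only: the per-million-token prices and the 500-token output size are assumptions, not figures from this document, and should be replaced with the current rates for the configured model.

```clojure
(defn estimated-cost-usd
  "Estimates the cost of one call from token counts and per-million-token prices."
  [{:keys [input-tokens output-tokens]}
   {:keys [input-usd-per-mtok output-usd-per-mtok]}]
  (+ (* input-tokens (/ input-usd-per-mtok 1e6))
     (* output-tokens (/ output-usd-per-mtok 1e6))))

;; Assumed prices and output size, for illustration only.
(def assumed-prices {:input-usd-per-mtok 3.0 :output-usd-per-mtok 15.0})

;; 3K input tokens and 500 output tokens: roughly $0.0165 under these assumptions.
(estimated-cost-usd {:input-tokens 3000 :output-tokens 500} assumed-prices)
```

Under these assumed prices the output tokens contribute almost as much as the input, so the length of the report matters as much as the cluster size.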
matching-complete