Pairing-Specific LLM Matching Prompts Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Replace the generic LLM matching prompt with direction-dependent, pair-specific prompts that express the correct matching semantics (cardinality, line-item focus) for each source→candidate combination.

Architecture: Add a pair-prompts lookup map keyed by [source-type candidate-type] vectors. Thread candidate-type through decide-matches → llm-match-decision → build-match-prompt. The candidate grouping in core.clj already exists (line 203) so changes there are minimal — just passing the candidate type down.
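
The lookup idea can be sketched in isolation. This is a minimal illustration, assuming placeholder prompt text and a hypothetical `lookup-prompt` helper (neither is taken from the codebase). Clojure vectors compare by value, so a `[source-type candidate-type]` vector works directly as a map key, and `get` with a default covers pairs that have no dedicated entry.

```clojure
;; Illustrative sketch only: the prompt strings and `lookup-prompt` are
;; hypothetical stand-ins, not the real templates from llm_decision.clj.
(def example-pair-prompts
  {["invoice" "contract"] {:task "Select the single best matching contract."}
   ["contract" "invoice"] {:task "Match all candidates covered by the contract."}})

(def example-fallback
  {:task "For each candidate, decide whether it matches the source."})

(defn lookup-prompt
  "Return the prompt entry for a directed pair, or the generic fallback."
  [source-type candidate-type]
  (get example-pair-prompts [source-type candidate-type] example-fallback))

;; Direction matters: swapping the pair selects a different entry.
(lookup-prompt "invoice" "contract")
(lookup-prompt "contract" "invoice")

;; A pair without a dedicated entry falls back to the generic prompt.
(lookup-prompt "invoice" "goods-received-note")
```

Keying on the directed pair rather than on an unordered set is what lets invoice→contract and contract→invoice carry different cardinality guidance.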

Tech Stack: Clojure, Malli schemas, clojure.test

Design doc: docs/plans/2026-03-02-pairing-specific-matching-prompts-design.md

Task 1: Add pair-prompts map and update `build-match-prompt`

Files:

Modify: src/com/getorcha/workers/matching/llm_decision.clj:148-187
Test: test/com/getorcha/workers/matching/llm_decision_test.clj

Step 1: Write the failing tests

Add tests for the new build-match-prompt signature. The function now takes [source-doc candidate-type candidates]. Test that different pair types produce different system prompts and task instructions.

In test/com/getorcha/workers/matching/llm_decision_test.clj, add a new deftest after the existing build-match-prompt-test:

(deftest build-match-prompt-pair-specific-test
  (let [invoice-doc  #:document{:type "invoice"
                                :structured-data {:invoice-number "INV-001"
                                                  :issuer {:name "ACME"}
                                                  :total 1000}}
        contract-doc #:document{:type "contract"
                                :structured-data {:contract-number "C-001"
                                                  :counterparty {:name "ACME"}}}
        po-doc       #:document{:type "purchase-order"
                                :structured-data {:po-number "PO-001"
                                                  :supplier {:name "ACME"}}}
        grn-doc      #:document{:type "goods-received-note"
                                :structured-data {:grn-number "GRN-001"
                                                  :supplier {:name "ACME"}}}
        candidates-fn (fn [doc] [{:doc doc :score 0.55 :evidence []}])]

    (testing "invoice → contract: exclusive prompt, mentions single best match"
      (let [{:keys [system user]} (llm-decision/build-match-prompt
                                   invoice-doc "contract" (candidates-fn contract-doc))]
        (is (str/includes? system "contract"))
        (is (str/includes? (str/lower-case user) "single best"))))

    ;; The tests below destructure only the key they assert on, so clj-kondo
    ;; does not report unused bindings.
    (testing "contract → invoice: many prompt, mentions multiple matches"
      (let [{:keys [user]} (llm-decision/build-match-prompt
                            contract-doc "invoice" (candidates-fn invoice-doc))]
        (is (str/includes? (str/lower-case user) "all candidate"))))

    (testing "invoice → purchase-order: line-item prompt"
      (let [{:keys [system]} (llm-decision/build-match-prompt
                              invoice-doc "purchase-order" (candidates-fn po-doc))]
        (is (str/includes? (str/lower-case system) "line item"))))

    (testing "purchase-order → invoice: line-item prompt"
      (let [{:keys [system]} (llm-decision/build-match-prompt
                              po-doc "invoice" (candidates-fn invoice-doc))]
        (is (str/includes? (str/lower-case system) "line item"))))

    (testing "purchase-order → contract: exclusive prompt"
      (let [{:keys [user]} (llm-decision/build-match-prompt
                            po-doc "contract" (candidates-fn contract-doc))]
        (is (str/includes? (str/lower-case user) "single best"))))

    (testing "contract → purchase-order: many prompt"
      (let [{:keys [user]} (llm-decision/build-match-prompt
                            contract-doc "purchase-order" (candidates-fn po-doc))]
        (is (str/includes? (str/lower-case user) "all candidate"))))

    (testing "goods-received-note → purchase-order: line-item prompt"
      (let [{:keys [system]} (llm-decision/build-match-prompt
                              grn-doc "purchase-order" (candidates-fn po-doc))]
        (is (str/includes? (str/lower-case system) "line item"))))

    (testing "purchase-order → goods-received-note: line-item prompt"
      (let [{:keys [system]} (llm-decision/build-match-prompt
                              po-doc "goods-received-note" (candidates-fn grn-doc))]
        (is (str/includes? (str/lower-case system) "line item"))))))

Also update the existing build-match-prompt-test to pass the new candidate-type argument. Its test cases use invoice→purchase-order, so pass "purchase-order" as the second arg in the two cases that remain:

;; Line 158: change from
(llm-decision/build-match-prompt source candidates)
;; to
(llm-decision/build-match-prompt source "purchase-order" candidates)

;; Line 180: same change
(llm-decision/build-match-prompt source "purchase-order" candidates)

The third case, at lines 186-197 ("prompt asks for per-candidate evaluation"), checks for "each candidate" in the user prompt. The new pair-specific task text for invoice→PO does not contain that exact phrase. Delete this test, including its build-match-prompt call at line 195, rather than updating it: it asserted generic wording that no longer applies, and the new build-match-prompt-pair-specific-test covers the pair-specific wording.

Step 2: Run tests to verify they fail

Run: clj -X:test:silent :nses '[com.getorcha.workers.matching.llm-decision-test]'
Expected: FAIL, because build-match-prompt doesn't accept candidate-type yet.

Step 3: Add the pair-prompts map and update build-match-prompt

In src/com/getorcha/workers/matching/llm_decision.clj, replace the Prompt Construction section (lines 148-187) with:

;; Prompt Construction
;; -----------------------------------------------------------------------------

(defn ^:private format-candidates
  "Format candidates list for the LLM prompt."
  [candidates]
  (->> candidates
       (map-indexed
        (fn [i {:keys [doc score evidence]}]
          (str "### Candidate " (inc i) "\n"
               (format-document-summary doc)
               "\nPreliminary score: " (format "%.2f" (double score))
               "\nEvidence: " (pr-str (mapv :signal evidence)))))
       (str/join "\n\n---\n\n")))


(def ^:private pair-prompts
  "Direction-dependent prompt templates keyed by [source-type candidate-type].
   Each entry has :system (role/context) and :task (cardinality guidance + output format)."
  {;; Exclusive pairs: source belongs to exactly one candidate
   ["invoice" "contract"]
   {:system "You are a financial document matching assistant.
Your task is to match an invoice to its governing contract.
An invoice is typically covered by exactly one contract.
Focus on: counterparty identity, whether the invoice line items fall within the contract's scope and deliverables, whether amounts are consistent with the contract value, whether the service period falls within the contract dates, and any explicit contract references on the invoice."
    :task "Select the single best matching contract from the candidates below, if any.
If multiple candidates could match, pick the one with the strongest evidence.
If none match confidently, return empty matches."}

   ["purchase-order" "contract"]
   {:system "You are a financial document matching assistant.
Your task is to match a purchase order to its governing contract.
A purchase order is typically issued under exactly one contract.
Focus on: counterparty identity, whether the PO line items fall within the contract's scope and deliverables, whether amounts are consistent with the contract value, and any explicit contract references on the PO."
    :task "Select the single best matching contract from the candidates below, if any.
If multiple candidates could match, pick the one with the strongest evidence.
If none match confidently, return empty matches."}

   ;; Many pairs: source can be associated with multiple candidates
   ["contract" "invoice"]
   {:system "You are a financial document matching assistant.
Your task is to find all invoices that belong to a given contract.
A contract can have many invoices billed against it over its lifetime.
Focus on: counterparty identity, whether invoice amounts and line items are consistent with the contract's scope, and whether invoice dates fall within the contract period."
    :task "Match all candidates that are genuinely covered by the source contract.
Multiple matches are expected and encouraged when justified."}

   ["contract" "purchase-order"]
   {:system "You are a financial document matching assistant.
Your task is to find all purchase orders issued under a given contract.
A contract can have many POs issued against it.
Focus on: counterparty identity, whether PO line items fall within the contract's scope, and any explicit contract references on the POs."
    :task "Match all candidates that are genuinely issued under the source contract.
Multiple matches are expected and encouraged when justified."}

   ;; Line-item pairs: many-to-many matching driven by line item comparison
   ["invoice" "purchase-order"]
   {:system "You are a financial document matching assistant.
Your task is to match an invoice to purchase orders by comparing line items.
An invoice can reference items from multiple POs, and a PO can be partially invoiced across multiple invoices.
Focus on: line item descriptions, quantities, unit prices, and amounts. A partial match (some line items match) is still a valid match. Also consider PO references on the invoice and counterparty identity."
    :task "Match all candidates that share relevant line items or references with the source invoice.
Multiple matches are expected and encouraged when justified. Partial matches count."}

   ["purchase-order" "invoice"]
   {:system "You are a financial document matching assistant.
Your task is to match a purchase order to invoices by comparing line items.
A PO can be partially invoiced across multiple invoices, and an invoice can span items from multiple POs.
Focus on: line item descriptions, quantities, unit prices, and amounts. A partial match (some line items match) is still a valid match. Also consider PO references on the invoices and counterparty identity."
    :task "Match all candidates that share relevant line items or references with the source purchase order.
Multiple matches are expected and encouraged when justified. Partial matches count."}

   ["goods-received-note" "purchase-order"]
   {:system "You are a financial document matching assistant.
Your task is to match a goods received note to purchase orders by comparing line items.
A GRN can confirm delivery of items from multiple POs.
Focus on: line item descriptions, quantities received vs quantities ordered, PO references, and delivery dates. A partial match (some items from one PO) is still a valid match."
    :task "Match all candidates that share relevant line items with the source goods received note.
Multiple matches are expected and encouraged when justified. Partial matches count."}

   ["purchase-order" "goods-received-note"]
   {:system "You are a financial document matching assistant.
Your task is to match a purchase order to goods received notes by comparing line items.
A PO can have multiple deliveries confirmed by separate GRNs.
Focus on: line item descriptions, quantities ordered vs quantities received, PO references on the GRNs, and delivery dates. A partial match (some items delivered) is still a valid match."
    :task "Match all candidates that share relevant line items with the source purchase order.
Multiple matches are expected and encouraged when justified. Partial matches count."}})


(defn build-match-prompt
  "Build LLM prompt for match decision.

   Returns a map with :system and :user keys. The system message provides
   pair-specific role context, the user message contains the source document,
   candidates, and pair-specific task instructions.

   Arguments:
     source-doc      - Source document with `:document/type` and `:document/structured-data`
     candidate-type  - String type of the candidate documents (e.g. \"purchase-order\")
     candidates      - Vector of `{:doc :score :evidence}`"
  [source-doc candidate-type candidates]
  (let [source-type (:document/type source-doc)
        {:keys [system task]} (get pair-prompts
                                   [source-type candidate-type]
                                   {:system "You are a document matching assistant for financial documents.
Determine which candidate document(s) match the source document.
Consider: supplier identity, amounts, dates, reference numbers, and cross-references.
Be conservative - only confirm matches you are confident about."
                                    :task "For each candidate, determine whether it genuinely belongs to the same business transaction as the source document."})]
    {:system system
     :user   (str "## Source Document\n"
                  (format-document-summary source-doc)
                  "\n\n## Candidates\n\n"
                  (format-candidates candidates)
                  "\n\n## Task\n"
                  task
                  "\n\nReturn JSON with only the candidates that match:\n"
                  "{\"matches\": [{\"candidate\": <1-indexed>, \"confidence\": \"high\"|\"medium\"|\"low\", \"reasoning\": \"...\"}]}\n\n"
                  "If none match confidently, return {\"matches\": []}")}))

Step 4: Run tests to verify they pass

Run: clj -X:test:silent :nses '[com.getorcha.workers.matching.llm-decision-test]'
Expected: PASS
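
Beyond the wording assertions, one structural property may be worth guarding: build-match-prompt destructures `:system` and `:task` from each entry, and `str` renders `nil` as an empty string, so an entry missing `:task` would silently produce an empty Task section. The sketch below is an optional check, not part of the plan; `incomplete-entries` and `sample-prompts` are hypothetical names, and it assumes every value is either a string or absent.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: returns the keys of entries whose :system or :task
;; is missing or blank. Assumes values are strings or nil.
(defn incomplete-entries
  [prompts]
  (for [[pair {:keys [system task]}] prompts
        :when (some str/blank? [system task])]
    pair))

;; Sample data with one deliberately incomplete entry (no :task).
(def sample-prompts
  {["invoice" "contract"] {:system "Match an invoice to its contract."
                           :task   "Select the single best match."}
   ["contract" "invoice"] {:system "Find all invoices for a contract."}})

(incomplete-entries sample-prompts)
```

In a real test the same helper could be applied to the private map through its var, for example `@#'llm-decision/pair-prompts`, asserting that the result is empty.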

Step 5: Lint

Run: clj-kondo --lint src/com/getorcha/workers/matching/llm_decision.clj test/com/getorcha/workers/matching/llm_decision_test.clj
Expected: No errors

Step 6: Commit

git add src/com/getorcha/workers/matching/llm_decision.clj test/com/getorcha/workers/matching/llm_decision_test.clj
git commit -m "feat: add pair-specific LLM matching prompts to build-match-prompt"

Task 2: Thread `candidate-type` through `llm-match-decision`

Files:

Modify: src/com/getorcha/workers/matching/llm_decision.clj:255-275
Test: test/com/getorcha/workers/matching/llm_decision_test.clj

Step 1: Update existing tests for the new signature

llm-match-decision changes from [llm-config source-doc candidates] to [llm-config source-doc candidate-type candidates].

In llm-decision-test.clj, update the llm-match-decision-retries-transient-errors-test:

;; Line 244-248: change from
(llm-decision/llm-match-decision
  {:provider :anthropic :api-key "k" :model "m"}
  #:document{:type "invoice" :structured-data {}}
  [{:doc #:document{:type "purchase-order" :structured-data {}} :score 0.6 :evidence []}])
;; to
(llm-decision/llm-match-decision
  {:provider :anthropic :api-key "k" :model "m"}
  #:document{:type "invoice" :structured-data {}}
  "purchase-order"
  [{:doc #:document{:type "purchase-order" :structured-data {}} :score 0.6 :evidence []}])

Apply the same change to all 3 llm-match-decision calls in that deftest (lines 244, 261, 278 approximately).

Step 2: Run tests to verify they fail

Run: clj -X:test:silent :nses '[com.getorcha.workers.matching.llm-decision-test]'
Expected: FAIL (wrong arity)

Step 3: Update llm-match-decision signature

In llm_decision.clj, change lines 255-275:

(defn llm-match-decision
  "Ask LLM to decide which candidates match the source document.
   Retries up to 3 times on transient errors with exponential backoff.
   Throws on non-transient errors or when all retries are exhausted.

   Arguments:
     llm-config      - LLM provider config map (passed to `llm/generate`)
     source-doc      - Source document DB row with `:document/type`, `:document/structured-data`
     candidate-type  - String type of the candidate documents
     candidates      - Vector of `{:doc :score :evidence}`

   Returns `{:matches [{:candidate :confidence :reasoning}]
             :input-tokens N :output-tokens N :model string}`"
  [llm-config source-doc candidate-type candidates]
  (let [{:keys [system user]}                        (build-match-prompt source-doc candidate-type candidates)
        prompt                                       (str system "\n\n" user)
        {:keys [text input-tokens output-tokens model]}
        (with-llm-retry #(llm/generate llm-config prompt) 3)]
    (assoc (parse-llm-response text)
           :input-tokens  input-tokens
           :output-tokens output-tokens
           :model         model)))
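
The docstring's retry behaviour comes from with-llm-retry, which this plan leaves untouched and does not show. For orientation only, here is a minimal sketch of a retry loop with exponential backoff. Everything in it is an assumption: `retry-with-backoff`, `transient-error?`, the `:transient?` ex-data flag, and the reading of "3" as total attempts are illustrative and may not match the real helper.

```clojure
;; Hypothetical sketch, not the project's with-llm-retry.
(defn transient-error?
  "Assumed classification: an exception is transient when its ex-data
   carries {:transient? true}."
  [e]
  (true? (:transient? (ex-data e))))

(defn retry-with-backoff
  "Call thunk `f` up to `max-attempts` times, sleeping base-ms, 2*base-ms, ...
   between attempts. Non-transient errors are rethrown immediately; the last
   transient error is rethrown once attempts are exhausted."
  [f max-attempts base-ms]
  (loop [attempt 1]
    ;; `recur` is not allowed inside `catch`, so the outcome is captured
    ;; as data and the loop decides afterwards.
    (let [result (try
                   {:ok (f)}
                   (catch Exception e
                     (if (and (transient-error? e) (< attempt max-attempts))
                       {:retry e}
                       (throw e))))]
      (if (contains? result :ok)
        (:ok result)
        (do (Thread/sleep (long (* base-ms (Math/pow 2 (dec attempt)))))
            (recur (inc attempt)))))))

;; Usage: a thunk that fails twice with a transient error, then succeeds.
(def calls (atom 0))

(defn flaky []
  (if (< (swap! calls inc) 3)
    (throw (ex-info "rate limited" {:transient? true}))
    :done))

(retry-with-backoff flaky 3 1)
```

Because llm-match-decision only wraps the `llm/generate` call in the retry, threading candidate-type through build-match-prompt does not change retry behaviour.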

Step 4: Run tests to verify they pass

Run: clj -X:test:silent :nses '[com.getorcha.workers.matching.llm-decision-test]'
Expected: PASS

Step 5: Commit

git add src/com/getorcha/workers/matching/llm_decision.clj test/com/getorcha/workers/matching/llm_decision_test.clj
git commit -m "feat: thread candidate-type through llm-match-decision"

Task 3: Thread `candidate-type` through `decide-matches` in core.clj

Files:

Modify: src/com/getorcha/workers/matching/core.clj:72-95
Test: test/com/getorcha/workers/matching/core_test.clj

Step 1: Update existing tests for the new signature

decide-matches changes from [llm-config source-doc candidates] to [llm-config source-doc candidate-type candidates].

In core_test.clj, update all calls to decide-matches. The existing tests use invoice→purchase-order, so add "purchase-order" as the third arg. Here is the full list of call sites:

;; Line 49: (matching/decide-matches nil nil candidates)
;; becomes:  (matching/decide-matches nil nil "purchase-order" candidates)

;; Line 55: (matching/decide-matches nil nil [])
;; becomes:  (matching/decide-matches nil nil "purchase-order" [])

;; Line 61: (matching/decide-matches nil nil candidates)
;; becomes:  (matching/decide-matches nil nil "purchase-order" candidates)

;; Line 70: (matching/decide-matches nil nil candidates)
;; becomes:  (matching/decide-matches nil nil "purchase-order" candidates)

;; Line 82: (matching/decide-matches llm-config nil candidates)
;; becomes:  (matching/decide-matches llm-config nil "purchase-order" candidates)

;; Line 97: (matching/decide-matches llm-config source-doc candidates)
;; becomes:  (matching/decide-matches llm-config source-doc "purchase-order" candidates)

;; Line 103: (matching/decide-matches nil nil candidates)
;; becomes:  (matching/decide-matches nil nil "purchase-order" candidates)

;; Line 126: (matching/decide-matches llm-config source-doc candidates)
;; becomes:  (matching/decide-matches llm-config source-doc "purchase-order" candidates)

;; Line 138: (matching/decide-matches llm-config source-doc candidates)
;; becomes:  (matching/decide-matches llm-config source-doc "purchase-order" candidates)

;; Line 154: (matching/decide-matches llm-config {} candidates)
;; becomes:  (matching/decide-matches llm-config {} "purchase-order" candidates)

Also update the with-redefs stubs of llm-match-decision at lines 95-96, 120-121, 136-137, and 150-151. The mock functions need to accept the new candidate-type arg:

;; Line 95-96: change from
(fn [_cfg _src _cands] llm-result)
;; to
(fn [_cfg _src _candidate-type _cands] llm-result)

;; Line 120-121: change from
(fn [_cfg _src cands] ...)
;; to
(fn [_cfg _src _candidate-type cands] ...)

;; Line 136-137: change from
(fn [_cfg _src _cands] llm-result)
;; to
(fn [_cfg _src _candidate-type _cands] llm-result)

;; Line 150-151: change from
(fn [_ _ cands] ...)
;; to
(fn [_ _ _candidate-type cands] ...)
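
Why the stubs must change can be shown with a small self-contained sketch. The names `fake-llm-match-decision`, `fake-caller`, and `call-with-stub` are hypothetical stand-ins for the real functions. Once the caller passes four arguments, a stub that still takes three throws an ArityException when it is invoked.

```clojure
;; Stand-in for llm-decision/llm-match-decision with the new 4-arg signature.
(defn fake-llm-match-decision
  [_llm-config _source-doc _candidate-type _candidates]
  {:matches []})

;; Stand-in for decide-matches: always passes four arguments.
(defn fake-caller
  []
  (fake-llm-match-decision nil nil "purchase-order" []))

(defn call-with-stub
  "Run fake-caller with `stub` installed; report an arity mismatch as a keyword."
  [stub]
  (try
    (with-redefs [fake-llm-match-decision stub]
      (fake-caller))
    (catch clojure.lang.ArityException _
      :arity-error)))

;; New-arity stub: the call goes through.
(call-with-stub (fn [_cfg _src _candidate-type _cands] :stubbed))

;; Old-arity stub: the four-argument call fails.
(call-with-stub (fn [_cfg _src _cands] :stubbed))
```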

Step 2: Run tests to verify they fail

Run: clj -X:test:silent :nses '[com.getorcha.workers.matching.core-test]'
Expected: FAIL (wrong arity)

Step 3: Update decide-matches in core.clj

Change the signature at line 72 and the LLM call at line 93:

(defn decide-matches
  "Decide which candidates to match based on scores.

   Two-tier approach:
   - All candidates >= high threshold → auto-match (rule-based)
   - Candidates in uncertain zone (low..high) → LLM decides (if config provided)

   Both tiers produce matches simultaneously. Uncertain-zone candidates
   are capped at `max-uncertain-for-llm` (sorted by score descending).

   Returns seq of `{:doc :score :evidence :match-method}`."
  [llm-config source-doc candidate-type candidates]
  (when (seq candidates)
    (let [high-threshold (:high evidence/match-thresholds)
          {high true uncertain false} (group-by #(>= (:score %) high-threshold) candidates)
          rule-matches   (mapv #(assoc % :match-method "rule-based") high)
          llm-matches    (when (and llm-config (seq uncertain))
                           (let [batch (->> uncertain
                                            (sort-by :score >)
                                            (take max-uncertain-for-llm))]
                             (resolve-llm-matches
                              (llm-decision/llm-match-decision llm-config source-doc candidate-type batch)
                              batch)))]
      (into rule-matches llm-matches))))
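
The two-tier split can be tried on its own with toy data. In this sketch the 0.8 threshold and the cap of 2 are assumed values standing in for `evidence/match-thresholds` and `max-uncertain-for-llm`, and `split-tiers` is a hypothetical name used only for illustration.

```clojure
;; Assumed values for illustration; the real ones live in the evidence
;; namespace and core.clj.
(def example-high-threshold 0.8)
(def example-max-uncertain 2)

(defn split-tiers
  "Partition candidates into auto-matched (score >= threshold) and the
   capped, score-sorted batch that would be sent to the LLM."
  [candidates]
  (let [{high true uncertain false}
        (group-by #(>= (:score %) example-high-threshold) candidates)
        batch (->> uncertain
                   (sort-by :score >)
                   (take example-max-uncertain))]
    {:auto-matched (mapv :id high)
     :sent-to-llm  (mapv :id batch)}))

(split-tiers [{:id :a :score 0.92} {:id :b :score 0.55} {:id :c :score 0.71}
              {:id :d :score 0.40} {:id :e :score 0.85}])
;; => {:auto-matched [:a :e], :sent-to-llm [:c :b]}
```

When no candidate reaches the threshold, `high` is nil and `mapv` over nil yields an empty vector, which is why the real function can pass `rule-matches` straight to `into`.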

Then update the call site in match-document! at line 212. Currently:

(let [matches (decide-matches llm-config doc type-candidates)]

Change to:

(let [matches (decide-matches llm-config doc doc-type type-candidates)]

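(doc-type is already bound at line 205, where each entry of the grouped candidates is destructured as [doc-type type-candidates]. It is the type of the candidates in that group, not the type of the source doc.)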
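
For readers without core.clj open, the grouping-and-destructuring pattern referred to here looks roughly like the following. This is a sketch under assumptions: the candidate shape mirrors the test fixtures, `typed-candidates` and `counts-by-type` are hypothetical names, and the real grouping code at lines 203-205 may differ in detail.

```clojure
;; Hypothetical data: each candidate carries its type under
;; [:doc :document/type], as in the test fixtures.
(def typed-candidates
  [{:doc #:document{:type "purchase-order"} :score 0.9}
   {:doc #:document{:type "contract"}       :score 0.6}
   {:doc #:document{:type "purchase-order"} :score 0.5}])

(defn counts-by-type
  "Group candidates by document type and count each group. Every map entry
   is destructured as [doc-type type-candidates], the same shape that lets
   match-document! pass doc-type on as candidate-type."
  [candidates]
  (into {}
        (for [[doc-type type-candidates]
              (group-by #(get-in % [:doc :document/type]) candidates)]
          [doc-type (count type-candidates)])))

(counts-by-type typed-candidates)
;; => {"purchase-order" 2, "contract" 1}
```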

Step 4: Run tests to verify they pass

Run: clj -X:test:silent :nses '[com.getorcha.workers.matching.core-test]'
Expected: PASS

Step 5: Lint both files

Run: clj-kondo --lint src/com/getorcha/workers/matching/core.clj test/com/getorcha/workers/matching/core_test.clj
Expected: No errors

Step 6: Commit

git add src/com/getorcha/workers/matching/core.clj test/com/getorcha/workers/matching/core_test.clj
git commit -m "feat: thread candidate-type through decide-matches to LLM"

Task 4: Run full test suite

Step 1: Run all matching tests

Run: clj -X:test:silent :nses '[com.getorcha.workers.matching.llm-decision-test com.getorcha.workers.matching.core-test]'
Expected: All pass

Step 2: Lint all changed files

Run: clj-kondo --lint src/com/getorcha/workers/matching/llm_decision.clj src/com/getorcha/workers/matching/core.clj test/com/getorcha/workers/matching/llm_decision_test.clj test/com/getorcha/workers/matching/core_test.clj
Expected: No errors
