Note (2026-04-24): After this document was written, legal_entity was renamed to tenant and the old tenant was renamed to organization. Read references to these terms with the pre-rename meaning.

Debug Match Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Create a bb debug:fetch-match-cluster task and a /debug-match skill that together enable debugging matching issues by fetching full cluster data from prod and delegating investigation to a subagent using systematic-debugging.

Architecture: Extract shared utilities from the existing debug_fetch_document.clj into debug_common.clj. Build a new debug_fetch_match_cluster.clj that queries prod for a document's entire cluster (docs, edges, reconciliation, ingestions) and inserts everything locally. The /debug-match skill handles argument parsing, local DB checks, fetching, context gathering, and subagent delegation.

Tech Stack: Babashka (bb tasks), nREPL (prod queries), HoneySQL, PostgreSQL, AWS SSM/S3

Task 1: Extract shared utilities into `scripts/debug_common.clj`

Files:

Create: scripts/debug_common.clj
Modify: scripts/debug_fetch_document.clj

Step 1: Create scripts/debug_common.clj

Extract the following from debug_fetch_document.clj into a new debug-common namespace:

(ns debug-common
  "Shared utilities for debug scripts that fetch data from production."
  (:require [babashka.process :as p]
            [bencode.core :as bencode]
            [cheshire.core :as json]
            [clojure.edn :as edn]
            [clojure.java.io :as io]
            [clojure.string :as str]
            [honey.sql :as sql])
  (:import [java.net Socket]
           [java.io PushbackInputStream]))

(set! *warn-on-reflection* true)

Move these verbatim (no logic changes):

Config constants:

local-db map
prod-profile, prod-region, prod-s3-bucket, prod-instance-name, prod-nrepl-port, local-nrepl-port
local-s3-bucket, local-aws-endpoint, local-aws-region
tmp-dir
dev-seed-legal-entity-id, dev-identity-id

SSM/nREPL functions:

get-instance-id (make public, remove defn-)
start-port-forward! (make public)
wait-for-port (make public)
bytes->str (make public)
nrepl-eval (make public)

S3 functions:

prod-s3-download! (make public)
local-s3-upload! (make public)

Local DB helpers:

unqualify-keys (make public)
cast-special-fields (make public)
document-jsonb-keys, document-enum-keys (make public)
ingestion-jsonb-keys, ingestion-enum-keys (make public)
export-audit-jsonb-keys, export-audit-enum-keys (make public)
document-exists? (make public)
delete-local-document! (make public)
insert-document! (make public)
insert-ingestions! (make public)
insert-export-audits! (make public)

UI helpers:

prompt-yes-no (make public)
format-status (make public)
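
For orientation, the role of `unqualify-keys` among the local DB helpers can be pictured as follows. This is a minimal sketch, assuming the helper simply drops the namespace from each keyword key so that prod rows such as `:document/id` line up with local column names. The real implementation should still be moved verbatim from debug_fetch_document.clj and may differ; the `-sketch` suffix marks the name as hypothetical.

```clojure
(defn unqualify-keys-sketch
  "Strip the namespace from every keyword key: {:document/id 1} -> {:id 1}.
   Assumed behaviour, not the moved implementation."
  [row]
  (into {}
        (map (fn [[k v]] [(if (keyword? k) (keyword (name k)) k) v]))
        row))

(unqualify-keys-sketch {:document/id 1 :document/type "invoice"})
;; => {:id 1, :type "invoice"}
```

Non-keyword keys pass through unchanged in this sketch, which keeps it safe to call on rows that were already unqualified.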

Port-forwarding lifecycle helper — extract the SSM session pattern used in fetch-document! into a reusable function:

(defn with-port-forward
  "Start SSM port forwarding, wait for connection, execute body-fn, then clean up.
   body-fn receives no arguments. Calls System/exit on failure."
  [body-fn]
  (print "Getting EC2 instance ID... ")
  (flush)
  (let [instance-id (get-instance-id)]
    (when-not instance-id
      (println "FAILED")
      (println "Error: No running instance found")
      (System/exit 1))
    (println instance-id)
    (println "Starting SSM port forward...")
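    (let [port-forward-process (start-port-forward! instance-id)]
      ;; System/exit does not run `finally` blocks, so also stop the tunnel
      ;; from a shutdown hook. Destroying an already-dead process is harmless.
      (.addShutdownHook (Runtime/getRuntime)
                        (Thread. ^Runnable (fn [] (p/destroy port-forward-process))))
      (try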
        (print "Waiting for nREPL connection... ")
        (flush)
        (if (wait-for-port local-nrepl-port 30000)
          (println "connected")
          (do
            (println "TIMEOUT")
            (System/exit 1)))
        (body-fn)
        (finally
          (p/destroy port-forward-process))))))
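
The `wait-for-port` call above is one of the helpers moved verbatim, so its body is not shown in this plan. As a rough picture of what such a helper typically does, here is a minimal sketch that polls a local TCP port until it accepts a connection or a timeout elapses. The name `wait-for-port-sketch`, the 200 ms poll interval, and the use of `localhost` are assumptions for illustration, not the existing implementation.

```clojure
(import '[java.net Socket])

(defn wait-for-port-sketch
  "Poll localhost:port until a TCP connection succeeds or timeout-ms elapses.
   Returns true on success, false on timeout. Hypothetical stand-in for
   wait-for-port."
  [port timeout-ms]
  (let [deadline (+ (System/currentTimeMillis) timeout-ms)]
    (loop []
      (let [connected? (try
                         ;; The constructor connects immediately and throws
                         ;; an IOException if nothing is listening yet.
                         (with-open [_ (Socket. "localhost" (int port))]
                           true)
                         (catch java.io.IOException _ false))]
        (cond
          connected?                               true
          (>= (System/currentTimeMillis) deadline) false
          :else (do (Thread/sleep 200) (recur)))))))
```

`with-port-forward` calls the real helper as `(wait-for-port local-nrepl-port 30000)`, a 30 second budget for the SSM tunnel to come up.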

S3 transfer helpers — extract the per-document download/upload pattern:

(defn download-document-files!
  "Download document file and all ingestion artifacts from prod S3."
  [{:keys [document ingestions]}]
  ;; Move the existing logic from debug_fetch_document.clj lines 390-411
  )

(defn upload-to-local-s3!
  "Upload all downloaded files to local S3. Returns number of failures."
  [{:keys [document ingestions]}]
  ;; Move the existing logic from debug_fetch_document.clj lines 414-441
  )

Step 2: Refactor debug_fetch_document.clj to use shared namespace

Replace all moved code with requires from debug-common:

(ns debug-fetch-document
  "Fetch a document from production for local debugging.
   ..."
  (:require [babashka.pods :as pods]
            [debug-common :as common]))

(set! *warn-on-reflection* true)

(pods/load-pod 'org.babashka/postgresql "0.1.2")
(require '[pod.babashka.postgresql :as pg])

Update all function calls to use common/ prefix (e.g., common/nrepl-eval, common/insert-document!, etc.).

The query-prod-document function and the main orchestration in fetch-document! stay in this file since they're specific to single-document fetching.

Refactor fetch-document! to use common/with-port-forward:

(defn- fetch-document!
  [document-id force?]
  (println)
  (println "=== Debug: Fetch Document from Production ===")
  (println)
  ;; existing local-exists check using common/document-exists?, common/delete-local-document!, common/prompt-yes-no
  (common/with-port-forward
   (fn []
     ;; existing query + download + insert + upload logic
     ;; using common/ prefixed functions
     )))

Step 3: Verify bb debug:fetch-document still works

Run: bb debug:fetch-document --help
Expected: Usage message prints correctly (confirms namespace loading works).

Step 4: Commit

git add scripts/debug_common.clj scripts/debug_fetch_document.clj
git commit -m "refactor: extract shared debug utilities into debug_common.clj"

Task 2: Create `bb debug:fetch-match-cluster` script

Files:

Create: scripts/debug_fetch_match_cluster.clj
Modify: bb.edn (add task entry)

Step 1: Create scripts/debug_fetch_match_cluster.clj

(ns debug-fetch-match-cluster
  "Fetch a document's full match cluster from production for local debugging.

   Downloads the document, all other documents in its cluster, match edges,
   cluster reconciliation data, and ingestion history. Inserts everything
   into local DB and uploads files to local S3.

   Usage:
     bb debug:fetch-match-cluster <document-id>"
  (:require [babashka.pods :as pods]
            [cheshire.core :as json]
            [clojure.string :as str]
            [debug-common :as common]
            [honey.sql :as sql])
  (:import [java.util UUID]))

(set! *warn-on-reflection* true)

(pods/load-pod 'org.babashka/postgresql "0.1.2")
(require '[pod.babashka.postgresql :as pg])

Step 2: Write the prod nREPL query function

This is the core query that fetches the full cluster from prod:

(defn- query-prod-cluster
  "Query a document's full match cluster from prod DB via nREPL.
   Returns map with :document, :cluster, :cluster-docs, :match-edges, :ingestions.
   Returns nil if document not found."
  [document-id]
  (common/nrepl-eval
   (format
    "(do
       (require '[com.getorcha.repl :as repl])
       (require '[com.getorcha.db.sql :as db.sql])
       (let [document-id (parse-uuid \"%s\")
             db-pool     (repl/db-pool)
             doc         (db.sql/execute-one!
                           db-pool
                           {:select [:*]
                            :from   [:document]
                            :where  [:= :id document-id]})]
         (when doc
           (let [cluster-id (:document/cluster-id doc)]
             (if cluster-id
               (let [cluster-docs (db.sql/execute!
                                    db-pool
                                    {:select [:*]
                                     :from   [:document]
                                     :where  [:= :cluster-id cluster-id]})
                     cluster      (db.sql/execute-one!
                                    db-pool
                                    {:select [:*]
                                     :from   [:document-cluster]
                                     :where  [:= :id cluster-id]})
                     match-edges  (db.sql/execute!
                                    db-pool
                                    {:select   [:document-match.*]
                                     :from     [:document-match]
                                     :join     [[:document :da] [:= :document-match.document-a-id :da.id]
                                                [:document :db] [:= :document-match.document-b-id :db.id]]
                                     :where    [:and
                                                [:= :da.cluster-id cluster-id]
                                                [:= :db.cluster-id cluster-id]]})
                     doc-ids      (mapv :document/id cluster-docs)
                     ingestions   (when (seq doc-ids)
                                   (group-by :ingestion/document-id
                                             (db.sql/execute!
                                               db-pool
                                               {:select   [:*]
                                                :from     [:ingestion]
                                                :where    [:in :document-id doc-ids]
                                                :order-by [[:created-at :asc]]})))]
                 {:document     doc
                  :cluster      cluster
                  :cluster-docs cluster-docs
                  :match-edges  match-edges
                  :ingestions   ingestions})
               ;; No cluster — just the document and its ingestions
               {:document     doc
                :cluster      nil
                :cluster-docs [doc]
                :match-edges  []
                :ingestions   {(:document/id doc)
                               (db.sql/execute!
                                 db-pool
                                 {:select   [:*]
                                  :from     [:ingestion]
                                  :where    [:= :document-id document-id]
                                  :order-by [[:created-at :asc]]})}})))))"
    document-id)))
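
The `:ingestions` value returned by this query is a map from document id to that document's ingestion rows, built with `group-by`. The sketch below uses made-up rows to show the shape and why the orchestration code later looks rows up with a default of `[]`. The string ids are illustrative only; real ids are UUIDs and real rows carry every `ingestion` column.

```clojure
;; Illustrative rows only; real rows come from the prod `ingestion` table.
(def sample-ingestions
  [{:ingestion/id 1 :ingestion/document-id "doc-a"}
   {:ingestion/id 2 :ingestion/document-id "doc-b"}
   {:ingestion/id 3 :ingestion/document-id "doc-a"}])

(def by-document
  (group-by :ingestion/document-id sample-ingestions))

;; Rows keep their input order within each group, so the
;; created-at ordering from the query survives the grouping.
(mapv :ingestion/id (get by-document "doc-a"))
;; => [1 3]

;; A document without ingestions is absent from the map,
;; so callers need a default.
(get by-document "doc-c" [])
;; => []
```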

Step 3: Write local insert functions for cluster-specific data

;; JSONB columns for document_match
(def ^:private match-jsonb-keys #{:evidence})

;; JSONB columns for document_cluster
(def ^:private cluster-jsonb-keys #{:reconciliation})


(defn- insert-cluster!
  "Insert a document_cluster row into local DB."
  [cluster]
  (let [data (common/cast-special-fields cluster-jsonb-keys
                                         {}
                                         (common/unqualify-keys cluster))]
    (pg/execute! common/local-db
                 (sql/format {:insert-into :document-cluster
                              :values      [data]}))))


(defn- insert-match-edges!
  "Insert all document_match rows into local DB."
  [edges]
  (when (seq edges)
    (let [edge-data (->> edges
                         (map common/unqualify-keys)
                         (map #(common/cast-special-fields match-jsonb-keys {} %)))]
      (pg/execute! common/local-db
                   (sql/format {:insert-into :document-match
                                :values      edge-data})))))


(defn- set-cluster-ids!
  "Set cluster_id on documents in local DB."
  [document-ids cluster-id]
  (pg/execute! common/local-db
               (sql/format {:update :document
                            :set    {:cluster-id cluster-id}
                            :where  [:in :id document-ids]})))

Step 4: Write the main orchestration function

(defn- handle-existing-documents!
  "Check which cluster documents already exist locally and list them.
   Deletes them when force? is set or the user confirms the prompt;
   otherwise exits with status 0. Returns nil."
  [cluster-docs force?]
  (let [existing (filter #(common/document-exists? (:document/id %)) cluster-docs)]
    (when (seq existing)
      (println (format "Found %d document(s) already in local DB:" (count existing)))
      (doseq [doc existing]
        (println (format "  - %s (%s)" (:document/id doc) (:document/file-original-name doc))))
      (if force?
        (do
          (println "Deleting existing documents (--force)...")
          (doseq [doc existing]
            (common/delete-local-document! (:document/id doc))))
        (if (common/prompt-yes-no "Replace all existing documents?")
          (do
            (println "Deleting existing documents...")
            (doseq [doc existing]
              (common/delete-local-document! (:document/id doc))))
          (do
            (println "Aborted.")
            (System/exit 0)))))))


(defn- fetch-match-cluster!
  "Main function to fetch a document's match cluster from production."
  [document-id force?]
  (println)
  (println "=== Debug: Fetch Match Cluster from Production ===")
  (println)

  (common/with-port-forward
   (fn []
     (print "Querying production database for cluster... ")
     (flush)
     (let [data (query-prod-cluster document-id)]
       (when-not data
         (println "NOT FOUND")
         (println "Error: Document not found in production database")
         (System/exit 1))
       (println "done")

       (let [{:keys [document cluster cluster-docs match-edges ingestions]} data]
         ;; Print summary
         (println)
         (println (format "Document: %s (%s)"
                          (:document/id document)
                          (:document/file-original-name document)))
         (if cluster
           (do
             (println (format "Cluster: %s (%d documents, %d match edges)"
                              (:document-cluster/id cluster)
                              (count cluster-docs)
                              (count match-edges)))
             (when (:document-cluster/reconciliation cluster)
               (println "  Reconciliation: present"))
             (println)
             (println "Documents in cluster:")
             (doseq [doc cluster-docs]
               (println (format "  - %s  %s  %s"
                                (:document/id doc)
                                (or (:document/type doc) "unknown")
                                (or (:document/file-original-name doc) "")))))
           (println "No cluster (unmatched document)"))
         (println)

         ;; Handle existing documents
         (handle-existing-documents! cluster-docs force?)

         ;; Download files from prod S3
         (println)
         (println "Downloading from production S3...")
         (doseq [doc cluster-docs]
           (let [doc-ingestions (get ingestions (:document/id doc) [])]
             (common/download-document-files! {:document   doc
                                               :ingestions doc-ingestions})))

         ;; Insert into local DB
         (println)
         (println "Inserting into local database...")

         ;; Insert cluster first (documents reference it)
         (when cluster
           (insert-cluster! cluster)
           (println "  Cluster inserted"))

         ;; Insert documents
         (doseq [doc cluster-docs]
           (common/insert-document! doc common/dev-seed-legal-entity-id)
           (println (format "  Document %s inserted" (:document/id doc))))

         ;; Set cluster_id on documents
         (when cluster
           (set-cluster-ids! (mapv :document/id cluster-docs)
                             (:document-cluster/id cluster))
           (println "  Cluster IDs set"))

         ;; Insert ingestions
         (doseq [[doc-id doc-ingestions] ingestions
                 :when (seq doc-ingestions)]
           (common/insert-ingestions! doc-ingestions)
           (println (format "  %d ingestion(s) for %s" (count doc-ingestions) doc-id)))

         ;; Insert match edges
         (when (seq match-edges)
           (insert-match-edges! match-edges)
           (println (format "  %d match edge(s) inserted" (count match-edges))))

         ;; Upload to local S3
         (println)
         (println "Uploading to local S3...")
         (let [total-failures (atom 0)]
           (doseq [doc cluster-docs]
             (let [doc-ingestions (get ingestions (:document/id doc) [])
                   failures      (common/upload-to-local-s3! {:document   doc
                                                              :ingestions doc-ingestions})]
               (swap! total-failures + failures)))
           (when (pos? @total-failures)
             (println)
             (println (format "ERROR: %d file(s) failed to upload to local S3" @total-failures))
             (println "Check that LocalStack is running: bb dev:status")
             (System/exit 1)))

         ;; Done
         (println)
         (println "=== Done ===")
         (println)
         (println (format "Cluster with %d document(s) and %d match edge(s) fetched successfully."
                          (count cluster-docs) (count match-edges))))))))


(defn -main
  [& args]
  (let [force? (some #{"--force" "-f"} args)
        args   (remove #{"--force" "-f"} args)]
    (when (or (empty? args) (some #{"--help" "-h"} args))
      (println "Usage: bb debug:fetch-match-cluster [OPTIONS] <document-id>")
      (println)
      (println "Fetches a document's full match cluster from production:")
      (println "all documents in the cluster, match edges, reconciliation data,")
      (println "and ingestion history. Inserts into local DB and uploads files to local S3.")
      (println)
      (println "Options:")
      (println "  -f, --force  Replace existing documents without prompting")
      (System/exit (if (empty? args) 1 0)))

    (let [document-id (first args)]
      (when-not (try (parse-uuid document-id) (catch Exception _ nil))
        (println "Error: Invalid document ID (must be a valid UUID)")
        (System/exit 1))
      (fetch-match-cluster! document-id force?))))
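
Because `-main` calls `System/exit` directly, its flag handling is awkward to try out at a REPL. One option, shown here only as a sketch, is to factor the parsing into a pure function that `-main` could call. The name `parse-cli-args` and the shape of the returned map are hypothetical and not part of the plan.

```clojure
(defn parse-cli-args
  "Pure version of the flag handling in -main (hypothetical helper).
   Returns a map describing the flags and the first positional argument."
  [args]
  (let [force-flags #{"--force" "-f"}
        help-flags  #{"--help" "-h"}
        positional  (remove (into force-flags help-flags) args)
        document-id (first positional)]
    {:force?      (boolean (some force-flags args))
     :help?       (boolean (some help-flags args))
     :document-id document-id
     ;; parse-uuid returns nil for strings that are not valid UUIDs
     :valid-id?   (boolean (and document-id (parse-uuid document-id)))}))

(parse-cli-args ["-f" "123e4567-e89b-12d3-a456-426614174000"])
;; => {:force? true, :help? false, :document-id "123e4567-e89b-12d3-a456-426614174000", :valid-id? true}
```

With this split, `-main` would only decide which message to print and which exit code to use.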

Step 5: Add bb task to bb.edn

Add after the existing debug:fetch-document task:

  debug:fetch-match-cluster
  {:doc      "Fetch a document's match cluster from production: bb debug:fetch-match-cluster <document-id>"
   :requires ([debug-fetch-match-cluster])
   :task     (apply debug-fetch-match-cluster/-main *command-line-args*)}

Step 6: Verify script loads

Run: bb debug:fetch-match-cluster --help
Expected: Usage message prints correctly.

Step 7: Commit

git add scripts/debug_fetch_match_cluster.clj bb.edn
git commit -m "feat: add debug:fetch-match-cluster bb task"

Task 3: Create the `/debug-match` skill

Files:

Create: .claude/skills/debug-match/SKILL.md

Step 1: Write the skill file

---
name: debug-match
description: Debug matching errors. Fetches cluster data from prod, then investigates using systematic-debugging.
---

# Debug Match Skill

Investigate matching issues: wrong matches, missing matches, failed matching pipeline, bad reconciliation.

## Arguments

/debug-match <doc-id> [problem description]
/debug-match <doc-id-1> <doc-id-2> [problem description]


- One doc ID: inspect the document's existing match cluster
- Two doc IDs: investigate why two documents didn't match, or why one matched wrong

UUIDs are detected by format (8-4-4-4-12 hex pattern). Everything else is the problem description.

## Step 1: Check Local Database First

For each document ID, check if it exists locally and has cluster data:

```bash
psql -h localhost -U postgres -d orcha -c "SELECT id, type, file_original_name, cluster_id, matching_status, matching_error FROM document WHERE id = '<doc-id>'" -x
```

If the document exists AND has a cluster_id, check if the cluster is fully present:

psql -h localhost -U postgres -d orcha -c "SELECT count(*) FROM document WHERE cluster_id = (SELECT cluster_id FROM document WHERE id = '<doc-id>')"

If the document exists locally with its cluster data, skip to Step 3.

## Step 2: Fetch from Production (if needed)

For each document ID whose cluster isn't locally available, fetch from production:

bb debug:fetch-match-cluster <doc-id>

If you get an authentication error (e.g., "No running instance found" or credentials expired), tell the user to run:

aws sso login --profile orcha-prod

Then retry. For two-doc scenarios, run the command once per doc ID.

## Step 3: Gather Context from Local DB

Query the local database to gather the full picture. Run these queries and collect the results:

Documents in the cluster:

psql -h localhost -U postgres -d orcha -c "SELECT id, type, file_original_name, matching_status, matching_error, matching_attempts, normalized_counterparty, normalized_references, cluster_id, created_at FROM document WHERE cluster_id = (SELECT cluster_id FROM document WHERE id = '<doc-id>') ORDER BY created_at" -x

Match edges:

psql -h localhost -U postgres -d orcha -c "SELECT dm.document_a_id, dm.document_b_id, dm.blended_score, dm.llm_confidence, dm.match_method, dm.evidence, dm.created_at FROM document_match dm JOIN document da ON dm.document_a_id = da.id JOIN document db ON dm.document_b_id = db.id WHERE da.cluster_id = (SELECT cluster_id FROM document WHERE id = '<doc-id>') AND db.cluster_id = da.cluster_id ORDER BY dm.blended_score DESC" -x

Cluster reconciliation:

psql -h localhost -U postgres -d orcha -c "SELECT dc.id, dc.reconciliation, dc.reconciled_at FROM document_cluster dc JOIN document d ON d.cluster_id = dc.id WHERE d.id = '<doc-id>'" -x

For two-doc (false negative) scenarios, also gather data for the second document's cluster (if any), and check the normalized fields on both documents to understand why they didn't match:

psql -h localhost -U postgres -d orcha -c "SELECT id, normalized_counterparty, normalized_references, searchable_text, type FROM document WHERE id IN ('<doc-id-1>', '<doc-id-2>')" -x

## Step 4: Delegate to Subagent

Spawn an orcha-workers subagent with the gathered data. The subagent prompt must include:

- All gathered cluster data (documents, edges with scores/evidence, reconciliation)
- The user's problem description
- The document ID(s) provided
- Whether this is a one-doc (inspect cluster) or two-doc (false negative) scenario
- Instruction to invoke and follow the systematic-debugging skill for root cause investigation
- Instruction to read matching source code as needed (key files: src/com/getorcha/workers/matching/core.clj, candidates.clj, evidence.clj, llm_decision.clj, reconciliation.clj, src/com/getorcha/db/document_matching.clj)

The skill does NOT prescribe what the subagent investigates — it provides context and enforces methodology. The subagent adapts to the specific problem.
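
The argument rule from the skill's Arguments section (UUIDs by 8-4-4-4-12 hex format, everything else is the description) can be made concrete with a small sketch. The skill itself is prose that an agent follows, so no such function exists in the plan; `split-skill-args` and `uuid-pattern` are hypothetical names used only to illustrate the rule.

```clojure
(require '[clojure.string :as str])

(def uuid-pattern
  ;; 8-4-4-4-12 hex groups, case-insensitive
  #"(?i)[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")

(defn split-skill-args
  "Split raw /debug-match input into document ids and a problem description.
   Tokens that fully match the UUID pattern are ids; all remaining tokens
   are joined back into the description. Hypothetical helper."
  [input]
  (let [tokens (str/split (str/trim input) #"\s+")
        uuid?  #(boolean (re-matches uuid-pattern %))]
    {:doc-ids     (filterv uuid? tokens)
     :description (str/join " " (remove uuid? tokens))}))

(split-skill-args "123e4567-e89b-12d3-a456-426614174000 PO matched the wrong invoice")
;; => {:doc-ids ["123e4567-e89b-12d3-a456-426614174000"], :description "PO matched the wrong invoice"}
```

One id in `:doc-ids` corresponds to the inspect-cluster scenario, two ids to the false-negative scenario.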


Step 2: Commit

git add .claude/skills/debug-match/SKILL.md
git commit -m "feat: add debug-match skill"

Task 4: Manual end-to-end test

This cannot be automated (requires prod access). Verify the full flow works:

Step 1: Test `bb debug:fetch-match-cluster` with a known clustered document

Pick a document ID from production that is in a cluster. Run:

```bash
bb debug:fetch-match-cluster <doc-id>
```

Verify:

Cluster summary prints correctly (document count, edge count, reconciliation presence)
All documents are inserted into local DB
Cluster row exists with reconciliation data
Match edges are present
S3 files are downloaded and uploaded

Step 2: Verify local data integrity

psql -h localhost -U postgres -d orcha -c "SELECT id, type, cluster_id FROM document WHERE cluster_id IS NOT NULL ORDER BY cluster_id, created_at"
psql -h localhost -U postgres -d orcha -c "SELECT * FROM document_match LIMIT 5" -x
psql -h localhost -U postgres -d orcha -c "SELECT id, reconciliation IS NOT NULL as has_reconciliation FROM document_cluster"
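
Beyond these spot checks, one property worth confirming is that every match edge points at documents that were actually inserted. The sketch below expresses that check over plain maps. `dangling-edges` is a hypothetical helper; its keys mirror the `document_a_id` and `document_b_id` columns of `document_match` used elsewhere in this plan, and the short string ids are placeholders for UUIDs.

```clojure
(defn dangling-edges
  "Return the edges whose endpoints are not both in document-ids.
   Hypothetical helper for checking fetched data."
  [document-ids edges]
  (let [known (set document-ids)]
    (remove (fn [{:keys [document-a-id document-b-id]}]
              (and (contains? known document-a-id)
                   (contains? known document-b-id)))
            edges)))

(dangling-edges ["a" "b"]
                [{:document-a-id "a" :document-b-id "b"}
                 {:document-a-id "a" :document-b-id "c"}])
;; => ({:document-a-id "a", :document-b-id "c"})
```

An empty result means every edge is fully contained in the fetched cluster, which is what the prod query's two joins on `cluster-id` are meant to guarantee.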

Step 3: Test the skill

Run `/debug-match <doc-id> describe the problem here` and verify:

Local DB check happens first
Context is gathered correctly
Subagent is spawned with proper data and invokes systematic-debugging

Step 4: Test two-doc scenario

Run `/debug-match <doc-id-1> <doc-id-2> these should have matched but didn't` and verify:

Both clusters are fetched (or noted as unmatched)
Normalized fields for both docs are gathered
Subagent receives both-doc context

Step 5: Test bb debug:fetch-document still works

Run: bb debug:fetch-document --help
Verify the refactored script still functions correctly.
