Note (2026-04-24): After this document was written, `legal_entity` was renamed to `tenant` and the old `tenant` was renamed to `organization`. Read references to these terms with the pre-rename meaning.
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Create a bb debug:fetch-match-cluster task and a /debug-match skill that together enable debugging matching issues by fetching full cluster data from prod and delegating investigation to a subagent using systematic-debugging.
Architecture: Extract shared utilities from the existing debug_fetch_document.clj into debug_common.clj. Build a new debug_fetch_match_cluster.clj that queries prod for a document's entire cluster (docs, edges, reconciliation, ingestions) and inserts everything locally. The /debug-match skill handles argument parsing, local DB checks, fetching, context gathering, and subagent delegation.
Tech Stack: Babashka (bb tasks), nREPL (prod queries), HoneySQL, PostgreSQL, AWS SSM/S3
### Task 1: `scripts/debug_common.clj`

**Files:**
- `scripts/debug_common.clj`
- `scripts/debug_fetch_document.clj`

**Step 1: Create `scripts/debug_common.clj`**
Extract the following from debug_fetch_document.clj into a new debug-common namespace:
```clojure
(ns debug-common
  "Shared utilities for debug scripts that fetch data from production."
  (:require [babashka.process :as p]
            [bencode.core :as bencode]
            [cheshire.core :as json]
            [clojure.edn :as edn]
            [clojure.java.io :as io]
            [clojure.string :as str]
            [honey.sql :as sql])
  (:import [java.net Socket]
           [java.io PushbackInputStream]))

(set! *warn-on-reflection* true)
```
Move these verbatim (no logic changes):

Config constants:
- `local-db` map
- `prod-profile`, `prod-region`, `prod-s3-bucket`, `prod-instance-name`, `prod-nrepl-port`, `local-nrepl-port`
- `local-s3-bucket`, `local-aws-endpoint`, `local-aws-region`
- `tmp-dir`
- `dev-seed-legal-entity-id`, `dev-identity-id`

SSM/nREPL functions:
- `get-instance-id` (make public, remove `defn-`)
- `start-port-forward!` (make public)
- `wait-for-port` (make public)
- `bytes->str` (make public)
- `nrepl-eval` (make public)

S3 functions:
- `prod-s3-download!` (make public)
- `local-s3-upload!` (make public)

Local DB helpers:
- `unqualify-keys` (make public)
- `cast-special-fields` (make public)
- `document-jsonb-keys`, `document-enum-keys` (make public)
- `ingestion-jsonb-keys`, `ingestion-enum-keys` (make public)
- `export-audit-jsonb-keys`, `export-audit-enum-keys` (make public)
- `document-exists?` (make public)
- `delete-local-document!` (make public)
- `insert-document!` (make public)
- `insert-ingestions!` (make public)
- `insert-export-audits!` (make public)

UI helpers:
- `prompt-yes-no` (make public)
- `format-status` (make public)

Port-forwarding lifecycle helper. Extract the SSM session pattern used in `fetch-document!` into a reusable function:
```clojure
(defn with-port-forward
  "Start SSM port forwarding, wait for connection, execute body-fn, then clean up.
  body-fn receives no arguments. Calls System/exit on failure."
  [body-fn]
  (print "Getting EC2 instance ID... ")
  (flush)
  (let [instance-id (get-instance-id)]
    (when-not instance-id
      (println "FAILED")
      (println "Error: No running instance found")
      (System/exit 1))
    (println instance-id)
    (println "Starting SSM port forward...")
    (let [port-forward-process (start-port-forward! instance-id)]
      (try
        (print "Waiting for nREPL connection... ")
        (flush)
        (if (wait-for-port local-nrepl-port 30000)
          (println "connected")
          (do
            (println "TIMEOUT")
            ;; System/exit does not run the `finally` below, so clean up explicitly.
            (p/destroy port-forward-process)
            (System/exit 1)))
        (body-fn)
        (finally
          (p/destroy port-forward-process))))))
```
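`wait-for-port` is moved verbatim from `debug_fetch_document.clj`, so its real body is not shown in this plan. As a rough illustration of the behavior `with-port-forward` relies on (poll until the forwarded port accepts a TCP connection, or give up after the timeout), a minimal sketch might look like the following. The name `wait-for-port-sketch`, the `localhost` host, and the 250 ms poll interval are assumptions for illustration, not the existing implementation.

```clojure
(import '[java.net Socket])

(defn wait-for-port-sketch
  "Hypothetical stand-in for common/wait-for-port.
  Returns true once localhost:port accepts a TCP connection,
  or false if timeout-ms elapses first."
  [port timeout-ms]
  (let [deadline (+ (System/currentTimeMillis) timeout-ms)]
    (loop []
      (let [connected? (try
                         ;; The two-argument constructor connects immediately
                         ;; and throws an IOException if nothing is listening.
                         (with-open [sock (Socket. "localhost" (int port))]
                           true)
                         (catch java.io.IOException _ false))]
        (cond
          connected? true
          ;; Assumed poll interval of 250 ms between attempts.
          (< (System/currentTimeMillis) deadline) (do (Thread/sleep 250) (recur))
          :else false)))))
```

Returning a boolean instead of throwing keeps the caller's `if` branch simple, which matches how `with-port-forward` uses the result.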
S3 transfer helpers. Extract the per-document download/upload pattern:

```clojure
(defn download-document-files!
  "Download document file and all ingestion artifacts from prod S3."
  [{:keys [document ingestions]}]
  ;; Move the existing logic from debug_fetch_document.clj lines 390-411
  )

(defn upload-to-local-s3!
  "Upload all downloaded files to local S3. Returns number of failures."
  [{:keys [document ingestions]}]
  ;; Move the existing logic from debug_fetch_document.clj lines 414-441
  )
```
**Step 2: Refactor `debug_fetch_document.clj` to use the shared namespace**

Replace all moved code with requires from `debug-common`:

```clojure
(ns debug-fetch-document
  "Fetch a document from production for local debugging.
  ..."
  (:require [babashka.pods :as pods]
            [debug-common :as common]))

(set! *warn-on-reflection* true)

(pods/load-pod 'org.babashka/postgresql "0.1.2")
(require '[pod.babashka.postgresql :as pg])
```
Update all function calls to use common/ prefix (e.g., common/nrepl-eval, common/insert-document!, etc.).
The query-prod-document function and the main orchestration in fetch-document! stay in this file since they're specific to single-document fetching.
Refactor `fetch-document!` to use `common/with-port-forward`:

```clojure
(defn- fetch-document!
  [document-id force?]
  (println)
  (println "=== Debug: Fetch Document from Production ===")
  (println)
  ;; existing local-exists check using common/document-exists?, common/delete-local-document!, common/prompt-yes-no
  (common/with-port-forward
   (fn []
     ;; existing query + download + insert + upload logic
     ;; using common/ prefixed functions
     )))
```
**Step 3: Verify `bb debug:fetch-document` still works**

Run: `bb debug:fetch-document --help`

Expected: Usage message prints correctly (confirms namespace loading works).

**Step 4: Commit**

```bash
git add scripts/debug_common.clj scripts/debug_fetch_document.clj
git commit -m "refactor: extract shared debug utilities into debug_common.clj"
```
### Task 2: `bb debug:fetch-match-cluster` script

**Files:**
- `scripts/debug_fetch_match_cluster.clj`
- `bb.edn` (add task entry)

**Step 1: Create `scripts/debug_fetch_match_cluster.clj`**
```clojure
(ns debug-fetch-match-cluster
  "Fetch a document's full match cluster from production for local debugging.

  Downloads the document, all other documents in its cluster, match edges,
  cluster reconciliation data, and ingestion history. Inserts everything
  into local DB and uploads files to local S3.

  Usage:
    bb debug:fetch-match-cluster <document-id>"
  (:require [babashka.pods :as pods]
            [cheshire.core :as json]
            [clojure.string :as str]
            [debug-common :as common]
            [honey.sql :as sql])
  (:import [java.util UUID]))

(set! *warn-on-reflection* true)

(pods/load-pod 'org.babashka/postgresql "0.1.2")
(require '[pod.babashka.postgresql :as pg])
```
**Step 2: Write the prod nREPL query function**

This is the core query that fetches the full cluster from prod:

```clojure
(defn- query-prod-cluster
  "Query a document's full match cluster from prod DB via nREPL.
  Returns map with :document, :cluster, :cluster-docs, :match-edges, :ingestions.
  Returns nil if document not found."
  [document-id]
  (common/nrepl-eval
   (format
    "(do
       (require '[com.getorcha.repl :as repl])
       (require '[com.getorcha.db.sql :as db.sql])
       (let [document-id (parse-uuid \"%s\")
             db-pool (repl/db-pool)
             doc (db.sql/execute-one!
                  db-pool
                  {:select [:*]
                   :from [:document]
                   :where [:= :id document-id]})]
         (when doc
           (let [cluster-id (:document/cluster-id doc)]
             (if cluster-id
               (let [cluster-docs (db.sql/execute!
                                   db-pool
                                   {:select [:*]
                                    :from [:document]
                                    :where [:= :cluster-id cluster-id]})
                     cluster (db.sql/execute-one!
                              db-pool
                              {:select [:*]
                               :from [:document-cluster]
                               :where [:= :id cluster-id]})
                     match-edges (db.sql/execute!
                                  db-pool
                                  {:select [:document-match.*]
                                   :from [:document-match]
                                   :join [[:document :da] [:= :document-match.document-a-id :da.id]
                                          [:document :db] [:= :document-match.document-b-id :db.id]]
                                   :where [:and
                                           [:= :da.cluster-id cluster-id]
                                           [:= :db.cluster-id cluster-id]]})
                     doc-ids (mapv :document/id cluster-docs)
                     ingestions (when (seq doc-ids)
                                  (group-by :ingestion/document-id
                                            (db.sql/execute!
                                             db-pool
                                             {:select [:*]
                                              :from [:ingestion]
                                              :where [:in :document-id doc-ids]
                                              :order-by [[:created-at :asc]]})))]
                 {:document doc
                  :cluster cluster
                  :cluster-docs cluster-docs
                  :match-edges match-edges
                  :ingestions ingestions})
               ;; No cluster: just the document and its ingestions
               {:document doc
                :cluster nil
                :cluster-docs [doc]
                :match-edges []
                :ingestions {(:document/id doc)
                             (db.sql/execute!
                              db-pool
                              {:select [:*]
                               :from [:ingestion]
                               :where [:= :document-id document-id]
                               :order-by [[:created-at :asc]]})}})))))"
    document-id)))
```
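The `:ingestions` value returned by this query is a map from document ID to that document's ingestion rows, which is why the orchestration code later looks ingestions up with `(get ingestions doc-id [])`. A small sketch of that shape, using made-up rows (the `:ingestion/status` key, the helper name `ingestions-for`, and all IDs are placeholders, not real schema or data):

```clojure
(def sample-ingestions
  ;; Hypothetical rows, already ordered by created-at as in the prod query.
  [{:ingestion/id 1 :ingestion/document-id "doc-a" :ingestion/status "failed"}
   {:ingestion/id 2 :ingestion/document-id "doc-a" :ingestion/status "completed"}
   {:ingestion/id 3 :ingestion/document-id "doc-b" :ingestion/status "completed"}])

(def by-document
  ;; group-by preserves the input order within each group.
  (group-by :ingestion/document-id sample-ingestions))

(defn ingestions-for
  "Ingestions for doc-id, or an empty vector when the document has none."
  [grouped doc-id]
  (get grouped doc-id []))

(mapv :ingestion/id (ingestions-for by-document "doc-a")) ;; => [1 2]
(ingestions-for by-document "doc-c")                      ;; => []
```

Because `get` on `nil` also returns the default, the same lookup stays safe when the query returns `nil` for `:ingestions`.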
**Step 3: Write local insert functions for cluster-specific data**

```clojure
;; JSONB columns for document_match
(def ^:private match-jsonb-keys #{:evidence})

;; JSONB columns for document_cluster
(def ^:private cluster-jsonb-keys #{:reconciliation})

(defn- insert-cluster!
  "Insert a document_cluster row into local DB."
  [cluster]
  (let [data (-> cluster
                 common/unqualify-keys
                 (->> (common/cast-special-fields cluster-jsonb-keys {})))]
    (pg/execute! common/local-db
                 (sql/format {:insert-into :document-cluster
                              :values [data]}))))

(defn- insert-match-edges!
  "Insert all document_match rows into local DB."
  [edges]
  (when (seq edges)
    (let [edge-data (->> edges
                         (map common/unqualify-keys)
                         (map #(common/cast-special-fields match-jsonb-keys {} %)))]
      (pg/execute! common/local-db
                   (sql/format {:insert-into :document-match
                                :values edge-data})))))

(defn- set-cluster-ids!
  "Set cluster_id on documents in local DB."
  [document-ids cluster-id]
  (pg/execute! common/local-db
               (sql/format {:update :document
                            :set {:cluster-id cluster-id}
                            :where [:in :id document-ids]})))
```
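Rows coming back from prod carry namespace-qualified keys such as `:document-cluster/id`, while the HoneySQL `:values` maps in these insert functions need bare column names. `common/unqualify-keys` is moved verbatim from the existing script and is not reproduced in this plan; the sketch below only illustrates the kind of transformation assumed here. `unqualify-keys-sketch` is a hypothetical name, not the real implementation.

```clojure
(defn unqualify-keys-sketch
  "Hypothetical stand-in for common/unqualify-keys:
  strips the namespace from every keyword key of a row."
  [row]
  (update-keys row (comp keyword name)))

(unqualify-keys-sketch {:document-cluster/id 42
                        :document-cluster/reconciliation {:status "ok"}})
;; => {:id 42, :reconciliation {:status "ok"}}
```

Only top-level keys are rewritten, so nested JSONB payloads such as `:reconciliation` pass through unchanged.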
**Step 4: Write the main orchestration function**

```clojure
(defn- handle-existing-documents!
  "Check for cluster documents that already exist locally and delete them so
  they can be re-inserted. Deletes without prompting when force? is truthy;
  otherwise prompts, and exits the process if the user declines."
  [cluster-docs force?]
  (let [existing (filter #(common/document-exists? (:document/id %)) cluster-docs)]
    (when (seq existing)
      (println (format "Found %d document(s) already in local DB:" (count existing)))
      (doseq [doc existing]
        (println (format " - %s (%s)" (:document/id doc) (:document/file-original-name doc))))
      (if force?
        (do
          (println "Deleting existing documents (--force)...")
          (doseq [doc existing]
            (common/delete-local-document! (:document/id doc))))
        (if (common/prompt-yes-no "Replace all existing documents?")
          (do
            (println "Deleting existing documents...")
            (doseq [doc existing]
              (common/delete-local-document! (:document/id doc))))
          (do
            (println "Aborted.")
            (System/exit 0)))))))
```
```clojure
(defn- fetch-match-cluster!
  "Main function to fetch a document's match cluster from production."
  [document-id force?]
  (println)
  (println "=== Debug: Fetch Match Cluster from Production ===")
  (println)
  (common/with-port-forward
   (fn []
     (print "Querying production database for cluster... ")
     (flush)
     (let [data (query-prod-cluster document-id)]
       (when-not data
         (println "NOT FOUND")
         (println "Error: Document not found in production database")
         (System/exit 1))
       (println "done")
       (let [{:keys [document cluster cluster-docs match-edges ingestions]} data]
         ;; Print summary
         (println)
         (println (format "Document: %s (%s)"
                          (:document/id document)
                          (:document/file-original-name document)))
         (if cluster
           (do
             (println (format "Cluster: %s (%d documents, %d match edges)"
                              (:document-cluster/id cluster)
                              (count cluster-docs)
                              (count match-edges)))
             (when (:document-cluster/reconciliation cluster)
               (println " Reconciliation: present"))
             (println)
             (println "Documents in cluster:")
             (doseq [doc cluster-docs]
               (println (format " - %s %s %s"
                                (:document/id doc)
                                (or (:document/type doc) "unknown")
                                (or (:document/file-original-name doc) "")))))
           (println "No cluster (unmatched document)"))
         (println)
         ;; Handle existing documents
         (handle-existing-documents! cluster-docs force?)
         ;; Download files from prod S3
         (println)
         (println "Downloading from production S3...")
         (doseq [doc cluster-docs]
           (let [doc-ingestions (get ingestions (:document/id doc) [])]
             (common/download-document-files! {:document doc
                                               :ingestions doc-ingestions})))
         ;; Insert into local DB
         (println)
         (println "Inserting into local database...")
         ;; Insert cluster first (documents reference it)
         (when cluster
           (insert-cluster! cluster)
           (println " Cluster inserted"))
         ;; Insert documents
         (doseq [doc cluster-docs]
           (common/insert-document! doc common/dev-seed-legal-entity-id)
           (println (format " Document %s inserted" (:document/id doc))))
         ;; Set cluster_id on documents
         (when cluster
           (set-cluster-ids! (mapv :document/id cluster-docs)
                             (:document-cluster/id cluster))
           (println " Cluster IDs set"))
         ;; Insert ingestions
         (doseq [[doc-id doc-ingestions] ingestions
                 :when (seq doc-ingestions)]
           (common/insert-ingestions! doc-ingestions)
           (println (format " %d ingestion(s) for %s" (count doc-ingestions) doc-id)))
         ;; Insert match edges
         (when (seq match-edges)
           (insert-match-edges! match-edges)
           (println (format " %d match edge(s) inserted" (count match-edges))))
         ;; Upload to local S3
         (println)
         (println "Uploading to local S3...")
         (let [total-failures (atom 0)]
           (doseq [doc cluster-docs]
             (let [doc-ingestions (get ingestions (:document/id doc) [])
                   failures (common/upload-to-local-s3! {:document doc
                                                         :ingestions doc-ingestions})]
               (swap! total-failures + failures)))
           (when (pos? @total-failures)
             (println)
             (println (format "ERROR: %d file(s) failed to upload to local S3" @total-failures))
             (println "Check that LocalStack is running: bb dev:status")
             (System/exit 1)))
         ;; Done
         (println)
         (println "=== Done ===")
         (println)
         (println (format "Cluster with %d document(s) and %d match edge(s) fetched successfully."
                          (count cluster-docs) (count match-edges))))))))
```
```clojure
(defn -main
  [& args]
  (let [force? (some #{"--force" "-f"} args)
        args (remove #{"--force" "-f"} args)]
    (when (or (empty? args) (some #{"--help" "-h"} args))
      (println "Usage: bb debug:fetch-match-cluster [OPTIONS] <document-id>")
      (println)
      (println "Fetches a document's full match cluster from production:")
      (println "all documents in the cluster, match edges, reconciliation data,")
      (println "and ingestion history. Inserts into local DB and uploads files to local S3.")
      (println)
      (println "Options:")
      (println " -f, --force Replace existing documents without prompting")
      (System/exit (if (empty? args) 1 0)))
    (let [document-id (first args)]
      (when-not (try (parse-uuid document-id) (catch Exception _ nil))
        (println "Error: Invalid document ID (must be a valid UUID)")
        (System/exit 1))
      (fetch-match-cluster! document-id force?))))
```
**Step 5: Add bb task to `bb.edn`**

Add after the existing `debug:fetch-document` task:

```clojure
debug:fetch-match-cluster
{:doc "Fetch a document's match cluster from production: bb debug:fetch-match-cluster <document-id>"
 :requires ([debug-fetch-match-cluster])
 :task (apply debug-fetch-match-cluster/-main *command-line-args*)}
```

**Step 6: Verify script loads**

Run: `bb debug:fetch-match-cluster --help`

Expected: Usage message prints correctly.

**Step 7: Commit**

```bash
git add scripts/debug_fetch_match_cluster.clj bb.edn
git commit -m "feat: add debug:fetch-match-cluster bb task"
```
### Task 3: `/debug-match` skill

**Files:**
- `.claude/skills/debug-match/SKILL.md`

**Step 1: Write the skill file**

````markdown
---
name: debug-match
description: Debug matching errors. Fetches cluster data from prod, then investigates using systematic-debugging.
---

# Debug Match Skill

Investigate matching issues: wrong matches, missing matches, failed matching pipeline, bad reconciliation.

## Arguments

```
/debug-match <doc-id> [<doc-id-2>] <problem description>
```

- One doc ID: inspect the document's existing match cluster
- Two doc IDs: investigate why two documents didn't match, or why one matched wrong

UUIDs are detected by format (8-4-4-4-12 hex pattern). Everything else is the problem description.

## Step 1: Check Local Database First

For each document ID, check if it exists locally and has cluster data:

```bash
psql -h localhost -U postgres -d orcha -c "SELECT id, type, file_original_name, cluster_id, matching_status, matching_error FROM document WHERE id = '<doc-id>'" -x
```

If the document exists AND has a `cluster_id`, check if the cluster is fully present:

```bash
psql -h localhost -U postgres -d orcha -c "SELECT count(*) FROM document WHERE cluster_id = (SELECT cluster_id FROM document WHERE id = '<doc-id>')"
```

If the document exists locally with its cluster data, skip to Step 3.

## Step 2: Fetch from Production

For each document ID whose cluster isn't locally available, fetch from production:

```bash
bb debug:fetch-match-cluster <doc-id>
```

If you get an authentication error (e.g., "No running instance found" or credentials expired), tell the user to run:

```bash
aws sso login --profile orcha-prod
```

Then retry. For two-doc scenarios, run the command once per doc ID.

## Step 3: Gather Context

Query the local database to gather the full picture. Run these queries and collect the results:

Documents in the cluster:

```bash
psql -h localhost -U postgres -d orcha -c "SELECT id, type, file_original_name, matching_status, matching_error, matching_attempts, normalized_counterparty, normalized_references, cluster_id, created_at FROM document WHERE cluster_id = (SELECT cluster_id FROM document WHERE id = '<doc-id>') ORDER BY created_at" -x
```

Match edges:

```bash
psql -h localhost -U postgres -d orcha -c "SELECT dm.document_a_id, dm.document_b_id, dm.blended_score, dm.llm_confidence, dm.match_method, dm.evidence, dm.created_at FROM document_match dm JOIN document da ON dm.document_a_id = da.id JOIN document db ON dm.document_b_id = db.id WHERE da.cluster_id = (SELECT cluster_id FROM document WHERE id = '<doc-id>') AND db.cluster_id = da.cluster_id ORDER BY dm.blended_score DESC" -x
```

Cluster reconciliation:

```bash
psql -h localhost -U postgres -d orcha -c "SELECT dc.id, dc.reconciliation, dc.reconciled_at FROM document_cluster dc JOIN document d ON d.cluster_id = dc.id WHERE d.id = '<doc-id>'" -x
```

For two-doc (false negative) scenarios, also gather data for the second document's cluster (if any), and check the normalized fields on both documents to understand why they didn't match:

```bash
psql -h localhost -U postgres -d orcha -c "SELECT id, normalized_counterparty, normalized_references, searchable_text, type FROM document WHERE id IN ('<doc-id-1>', '<doc-id-2>')" -x
```

## Step 4: Delegate to Subagent

Spawn an `orcha-workers` subagent with the gathered data. The subagent prompt must include:

- The `systematic-debugging` skill for root cause investigation
- Relevant source files (`src/com/getorcha/workers/matching/core.clj`, `candidates.clj`, `evidence.clj`, `llm_decision.clj`, `reconciliation.clj`, `src/com/getorcha/db/document_matching.clj`)

The skill does NOT prescribe what the subagent investigates; it provides context and enforces methodology. The subagent adapts to the specific problem.
````
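The argument rule in the skill's Arguments section (UUID-shaped tokens are document IDs, everything else is the problem description) can be sketched in Clojure as follows. The function name `split-debug-args` and the returned map shape are illustrative assumptions; the skill itself describes this parsing in prose and does not ship this code.

```clojure
(require '[clojure.string :as str])

(def uuid-pattern
  ;; 8-4-4-4-12 hex groups, case-insensitive.
  #"(?i)[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")

(defn split-debug-args
  "Hypothetical helper: splits a raw argument string into UUID-shaped
  doc IDs and the remaining free-text description."
  [raw]
  (let [tokens      (str/split (str/trim raw) #"\s+")
        uuid-token? #(boolean (re-matches uuid-pattern %))]
    {:doc-ids     (filterv uuid-token? tokens)
     :description (str/join " " (remove uuid-token? tokens))}))

(split-debug-args
 "11111111-2222-3333-4444-555555555555 invoice matched the wrong PO")
;; => {:doc-ids ["11111111-2222-3333-4444-555555555555"]
;;     :description "invoice matched the wrong PO"}
```

The number of entries in `:doc-ids` then selects the scenario: one ID inspects an existing cluster, two IDs investigate a missing or wrong match.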
**Step 2: Commit**
```bash
git add .claude/skills/debug-match/SKILL.md
git commit -m "feat: add debug-match skill"
```
---
### Task 4: Manual end-to-end test
This cannot be automated (requires prod access). Verify the full flow works:
**Step 1: Test `bb debug:fetch-match-cluster` with a known clustered document**
Pick a document ID from production that is in a cluster. Run:
```bash
bb debug:fetch-match-cluster <doc-id>
```

Verify:

**Step 2: Verify local data integrity**

```bash
psql -h localhost -U postgres -d orcha -c "SELECT id, type, cluster_id FROM document WHERE cluster_id IS NOT NULL ORDER BY cluster_id, created_at"
psql -h localhost -U postgres -d orcha -c "SELECT * FROM document_match LIMIT 5" -x
psql -h localhost -U postgres -d orcha -c "SELECT id, reconciliation IS NOT NULL as has_reconciliation FROM document_cluster"
```

**Step 3: Test the skill**

Run `/debug-match <doc-id> describe the problem here` and verify:

**Step 4: Test two-doc scenario**

Run `/debug-match <doc-id-1> <doc-id-2> these should have matched but didn't` and verify:

**Step 5: Test `bb debug:fetch-document` still works**

Run: `bb debug:fetch-document --help`

Verify the refactored script still functions correctly.