Note (2026-04-24): After this document was written, legal_entity was renamed to tenant and the old tenant was renamed to organization. Read references to these terms with the pre-rename meaning.

Debug Match Skill — Design

Problem

When matching produces incorrect results (false positives, false negatives, failed pipelines, bad reconciliation), debugging requires:

  1. Fetching the document's entire match cluster from production (all documents, edges, reconciliation)
  2. Understanding why matching behaved the way it did

There is no tooling to fetch a full cluster from prod: the existing bb debug:fetch-document fetches only a single document in isolation. There is also no skill to guide the investigation.

Solution

Two pieces:

  1. bb debug:fetch-match-cluster <doc-id>: a bb task that fetches a document's full match cluster from production into the local database
  2. /debug-match skill: orchestrates fetching, gathers context from the local DB, then delegates the investigation to an orcha-workers subagent that uses the systematic-debugging skill

Skill Interface

/debug-match <doc-id> [problem description]
/debug-match <doc-id-1> <doc-id-2> [problem description]

UUIDs are detected by format; everything else is the problem description.
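
The detection rule above could be sketched roughly as follows. This is only an illustration: the names uuid-token? and parse-args, the returned keys :doc-ids and :problem, and the choice to accept the canonical 8-4-4-4-12 hex form are assumptions, not part of the design.

```clojure
(require '[clojure.string :as str])

;; Assumption: doc IDs use the canonical 8-4-4-4-12 hex UUID form, any case.
(def uuid-pattern
  #"(?i)[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")

(defn uuid-token?
  "True when the whole token is a UUID."
  [s]
  (boolean (re-matches uuid-pattern s)))

(defn parse-args
  "Splits the raw skill arguments on whitespace. UUID-shaped tokens become
  doc IDs; every other token is joined back into the problem description
  (nil when no description was given)."
  [raw]
  (let [tokens      (remove str/blank? (str/split (str/trim raw) #"\s+"))
        doc-ids     (filterv uuid-token? tokens)
        description (str/join " " (remove uuid-token? tokens))]
    {:doc-ids doc-ids
     :problem (when-not (str/blank? description) description)}))
```

With this sketch, a call with one UUID followed by free text yields a single-element :doc-ids vector and the free text as :problem; the skill would still need to reject inputs with zero or more than two doc IDs.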

bb debug:fetch-match-cluster

What it fetches from prod (via nREPL)

Given a document ID:

  1. The document row — get its cluster_id
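  2. If cluster_id exists: the document_cluster row, every document in the cluster, and every document_match row (the :cluster, :cluster-docs, and :match-edges keys of the map below)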
  3. If no cluster_id: just the document itself (unmatched)
  4. Ingestions for every document in the cluster

Returns a map:

{:document     {...}                        ;; the requested doc
 :cluster      {...}                        ;; document_cluster row, nil if unmatched
 :cluster-docs [{...} ...]                  ;; all docs in cluster
 :match-edges  [{...} ...]                  ;; all document_match rows
 :ingestions   {<doc-id> [{...} ...] ...}}  ;; ingestions keyed by document id
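
One way to assemble this map is sketched below. It is not the actual task: the query functions are passed in as a hypothetical map q (document-by-id, cluster-by-id, docs-in-cluster, edges-in-cluster, ingestions-for) so the sketch stays independent of how the nREPL calls are made, the :id column name is an assumption, and so is the choice to return the requested document as the only entry of :cluster-docs when it is unmatched.

```clojure
(defn fetch-match-cluster
  "Builds the result map for doc-id from the hypothetical query fns in q."
  [{:keys [document-by-id cluster-by-id docs-in-cluster
           edges-in-cluster ingestions-for]}
   doc-id]
  (let [document (document-by-id doc-id)]
    (when-not document
      (throw (ex-info "Document not found" {:doc-id doc-id})))
    (let [cluster-id   (:cluster_id document)
          ;; Unmatched documents have no cluster: fall back to the doc itself.
          cluster-docs (if cluster-id
                         (vec (docs-in-cluster cluster-id))
                         [document])]
      {:document     document
       :cluster      (when cluster-id (cluster-by-id cluster-id))
       :cluster-docs cluster-docs
       :match-edges  (if cluster-id (vec (edges-in-cluster cluster-id)) [])
       ;; Ingestions keyed by document id, one entry per cluster document.
       :ingestions   (into {}
                           (map (fn [d] [(:id d) (vec (ingestions-for (:id d)))]))
                           cluster-docs)})))
```

Passing the queries in as functions also makes the assembly logic testable against in-memory stubs, without a production connection.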

Local insert

For each document in the cluster:

Additionally:

Conflict handling

If any document already exists locally, prompt to replace (same UX as debug:fetch-document).
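
The conflict check itself might look like the sketch below. The function names, the :id key, and the three outcomes (:insert, :replace, :abort) are illustrative assumptions; in particular, aborting when the user declines is a guess at what the debug:fetch-document UX does, not something this design states.

```clojure
(require '[clojure.set :as set])

(defn conflicting-ids
  "IDs of fetched documents that already exist in the local DB."
  [fetched-docs local-ids]
  (set/intersection (set (map :id fetched-docs)) (set local-ids)))

(defn insert-plan
  "Decides what to do given the user's answer to the replace prompt.
  replace? is only consulted when at least one conflict exists."
  [fetched-docs local-ids replace?]
  (let [conflicts (conflicting-ids fetched-docs local-ids)]
    (cond
      (empty? conflicts) {:action :insert  :replace-ids #{}}
      replace?           {:action :replace :replace-ids conflicts}
      :else              {:action :abort   :replace-ids conflicts})))
```

Computing the conflicts for the whole cluster up front allows a single prompt per fetch rather than one per document.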

Shared utilities — scripts/debug_common.clj

Extract from debug_fetch_document.clj into a shared namespace:

debug_fetch_document.clj is refactored to use debug-common. debug_fetch_match_cluster.clj uses the same shared namespace and adds the cluster, match, and reconciliation insert logic.

Skill Flow

/debug-match <args>
    |
    v
[Parse args: extract doc IDs + problem description]
    |
    v
[For each doc ID: check local DB for document + cluster data]
    |
    v
[If missing: run `bb debug:fetch-match-cluster <doc-id>`]
    |  (handle auth errors: prompt `aws sso login`)
    v
[Query local DB for full context:
  - All documents in cluster(s)
  - All match edges (scores, method, evidence, confidence)
  - Reconciliation data
  - Matching status/error fields
  - Normalized fields (counterparty, references)]
    |
    v
[Spawn orcha-workers subagent with:
  - Gathered cluster data
  - User's problem description
  - Doc IDs and scenario type (one-doc vs two-doc)
  - Instruction to use systematic-debugging skill]

The skill does not prescribe what the subagent investigates. It ensures the right data is local, gathers it, and delegates. The subagent reads the matching source code and follows systematic-debugging to find the root cause.
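
As a rough illustration of the final step in the flow, the hand-off to the subagent could be represented as a plain data map. The function names, the map keys, and the wording of the instruction are assumptions; the design only fixes which pieces of information are passed along.

```clojure
(defn scenario-type
  "One doc ID means a single-document investigation, two means a pair."
  [doc-ids]
  (case (count doc-ids)
    1 :one-doc
    2 :two-doc
    (throw (ex-info "Expected one or two doc IDs" {:doc-ids doc-ids}))))

(defn subagent-brief
  "Bundles what the orcha-workers subagent receives. cluster-data is
  whatever was gathered from the local DB; its shape is not fixed here."
  [{:keys [doc-ids problem]} cluster-data]
  {:doc-ids      doc-ids
   :scenario     (scenario-type doc-ids)
   :problem      problem
   :cluster-data cluster-data
   :instruction  "Use the systematic-debugging skill to find the root cause."})
```

The brief carries data and a methodology pointer only, which matches the intent that the skill does not tell the subagent what to investigate.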

Decisions

| Decision | Choice | Rationale |
|---|---|---|
| Two-doc case handling | Call bb debug:fetch-match-cluster once per doc ID | Keeps the bb task single-purpose; skill handles orchestration |
| Investigation approach | Delegate to orcha-workers subagent | Keeps main context clean; matching issues are varied |
| Subagent methodology | Subagent invokes systematic-debugging skill | Single source of truth; skill can evolve independently |
| Script reuse | Extract shared utilities to debug_common.clj | Both fetch scripts share SSM, S3, and DB insert logic |
| What to fetch from prod | Everything (edges, reconciliation, ingestions) | Matching/reconciliation is LLM-driven, non-deterministic; must have exact prod state |