Pipeline Correction Tracking - Product Specification

Author: Product Team
Status: Draft
Last Updated: 2026-02-04

1. Problem Statement

When an invoice is processed through Orcha's ingestion pipeline, the initial LLM extraction is followed by deterministic validation checks that may flag errors or uncertainties. Post-processors like the UncertainValidationsResolver then re-examine the original PDF with an LLM to correct these issues — fixing math errors, removing misextracted line items, correcting tax IDs.

Currently, these corrections are applied silently to the structured data. The LLM's reasoning and confidence are stored in validation-results, but there is no record of what specific fields were modified, what the original values were, or which processor made the change.

This creates a blind spot:

Debugging is guesswork: When a field has an unexpected value, there's no way to tell if it came from extraction or was modified by a post-processor
No prompt improvement signal: We can't measure how often corrections happen or which fields are corrected most frequently — data that would help improve extraction prompts and reduce the need for corrections
No data lineage: The evolution of structured data through the pipeline is not traceable

2. Goal

Add a lightweight correction log that records every modification made by corrective post-processors, enabling developers to:

Trace data lineage — for any field, know whether it came from extraction or was corrected, and by whom
Identify correction patterns — query which fields are corrected most often, by which processors, to guide extraction prompt improvements
Monitor pipeline health — measure what percentage of ingestions require corrections over time

3. Success Metrics

Metric	Target
Corrective modifications logged	100% — every field change by in-scope processors is recorded
Pre-correction state reconstructable	Any corrected field's original value can be retrieved via a single query
Correction frequency queryable	Simple SQL can answer "top corrected fields in last 30 days"
Pipeline latency impact	No measurable increase in ingestion processing time

4. Core Concepts

4.1 Corrective vs. Enrichment Post-Processors

Not all post-processors modify existing data. Some add entirely new fields (enrichments). Only processors that change values originally set by extraction are in scope.

Processor	Type	What it does
UncertainValidationsResolver	Corrective	Modifies field values, updates line-item fields, removes misextracted line items (see full field inventory below)
TaxComplianceAnalyzer	Corrective	Corrects tax ID value (`issuer.tax-id`) and type classification (`issuer.tax-id-type`) when analysis disagrees with extraction
AccountsMatcher	Enrichment	Adds `debit-account` / `credit-account` to line items
CostCenterMatcher	Enrichment	Adds `cost-center` to line items
AccrualsMatcher	Enrichment	Adds `accrual` to line items
SupplierMatcher	Enrichment	Adds `matched-account-number` / `match-confidence` / `match-reasoning` to issuer

Verified against the codebase: no other pipeline stage mutates extracted fields. The with-validations stage only adds :validation-results, the with-fraud-detection stage only adds :fraud-flags, and no transformations occur between extraction and post-processing or between post-processing and persistence.

4.2 Complete Field Inventory

Every field that can be corrected by a post-processor, verified against the codebase:

UncertainValidationsResolver — field corrections (via allowlist):

Field path	Triggered by check
`subtotal`	`financial-math`
`tax-amount`	`financial-math`
`tax-rate`	`financial-math`
`total`	`financial-math`
`discount`	`financial-math`
`shipping`	`financial-math`
`amount-due`	`financial-math`
`line-items-include-tax`	`financial-math`
`issuer.country`	`issuer-country`
`issuer.iban`	`iban-format`
`issuer.tax-id`	`tax-id-format`
`issuer.tax-id-type`	`tax-id-format`
`issuer.vat-id`	`tax-id-format` (legacy field)
`recipient.country`	`recipient-country`

UncertainValidationsResolver — line-item corrections:

Action	Triggered by check	Details
`line-item-update`	`financial-math`	Any line-item field (amount, quantity, unit-price, tax-rate, description, etc.) updated by index
`line-item-removal`	`financial-math`	Line items removed by index (e.g., section subtotals misextracted as line items)

TaxComplianceAnalyzer — tax ID corrections:

Field path	Condition
`issuer.tax-id`	Tax compliance analysis returns `status: "corrected"`
`issuer.tax-id-type`	Tax compliance analysis returns `status: "corrected"`
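
As an illustration only, the UncertainValidationsResolver field inventory above could be expressed as an allowlist keyed by check name. The field paths and check names are copied from the table. The per-check keying, the name `FIELD_ALLOWLIST`, and the helper `is_correctable` are assumptions made for this sketch and do not describe the actual implementation.

```python
# Hypothetical allowlist: check name -> field paths that check may correct.
# Paths and check names come from the inventory table; the structure is assumed.
FIELD_ALLOWLIST = {
    "financial-math": {
        "subtotal", "tax-amount", "tax-rate", "total", "discount",
        "shipping", "amount-due", "line-items-include-tax",
    },
    "issuer-country": {"issuer.country"},
    "iban-format": {"issuer.iban"},
    "tax-id-format": {"issuer.tax-id", "issuer.tax-id-type", "issuer.vat-id"},
    "recipient-country": {"recipient.country"},
}


def is_correctable(check_name: str, field_path: str) -> bool:
    """Return True if `check_name` is allowed to correct `field_path`."""
    return field_path in FIELD_ALLOWLIST.get(check_name, set())
```

A guard like this would also give the logging code a fixed set of paths to compare before and after a processor runs.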

4.3 Correction Action Types

Each logged correction is typed by the kind of change it represents:

Action Type	Description	Example
`field-update`	A top-level or nested field value was changed	Subtotal corrected from 100.0 to 105.0
`line-item-update`	A field on a specific line item was changed	Line item 0's amount changed from 150.0 to 125.0
`line-item-removal`	A line item was removed entirely	Section subtotal misextracted as line item, removed

4.4 Before/After Capture

Every correction logs the old value (what extraction produced) and the new value (what the post-processor set), along with the LLM's reasoning and confidence score. This enables full reconstruction of the pre-correction state without needing pipeline snapshots.

4.5 Separate Storage

Correction logs are stored in a dedicated table, separate from the structured data itself. This keeps business data clean while enabling cross-ingestion analytics (e.g., "which fields are corrected most often across all tenants?").
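
The before/after capture described in this section can be sketched as a diff over a known set of field paths. This is a minimal sketch that covers only `field-update` entries on nested dictionaries. The function names (`capture_field_updates`, `restore_original`) and the dictionary shape of a log entry are hypothetical, and it assumes each field is corrected at most once per run.

```python
import copy
from typing import Any


def get_path(data: dict, dotted: str) -> Any:
    """Read a nested value using a dotted path such as 'issuer.country'."""
    node: Any = data
    for key in dotted.split("."):
        node = node[key]
    return node


def set_path(data: dict, dotted: str, value: Any) -> None:
    """Write a nested value in place using a dotted path."""
    *parents, leaf = dotted.split(".")
    node = data
    for key in parents:
        node = node[key]
    node[leaf] = value


def capture_field_updates(before: dict, after: dict, paths: list[str]) -> list[dict]:
    """Build one log entry per path whose value differs between the two states."""
    entries = []
    for path in paths:
        old, new = get_path(before, path), get_path(after, path)
        if old != new:
            entries.append({"action": "field-update", "field_path": path,
                            "old_value": old, "new_value": new})
    return entries


def restore_original(corrected: dict, entries: list[dict]) -> dict:
    """Rebuild the pre-correction state by writing each old_value back."""
    restored = copy.deepcopy(corrected)
    for entry in entries:
        set_path(restored, entry["field_path"], entry["old_value"])
    return restored


extracted = {"subtotal": 100.0, "issuer": {"country": "DE"}}
corrected = {"subtotal": 105.0, "issuer": {"country": "DE"}}
log = capture_field_updates(extracted, corrected, ["subtotal", "issuer.country"])
```

Here `log` holds a single entry for `subtotal`, and `restore_original(corrected, log)` returns a structure equal to `extracted`, which is the reconstruction property the log is meant to provide.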

5. Audience

This feature is developer-only. No UI changes are introduced.

Developers: Query the correction log via SQL for debugging and analytics
No admin or end-user access: Corrections are not surfaced in the invoice review UI
Future possibility: A dashboard or UI indicators could be built on top of this data later

6. Use Cases

6.1 Debugging an Unexpected Field Value

Scenario: Invoice #12345 shows subtotal = 105.0, but the PDF clearly states 100.0.
Developer queries: SELECT * FROM ingestion_correction_log
                   WHERE ingestion_id = '<id>' AND field_path = 'subtotal'
Result: Shows that UncertainValidationsResolver changed subtotal from 100.0 to 105.0
        with reasoning "Found shipping fee included in subtotal on PDF"
Action: Developer investigates whether the correction was appropriate or a model error

6.2 Identifying Extraction Improvement Opportunities

Scenario: Developer wants to know if extraction prompts should be improved.
Developer queries: SELECT field_path, action, count(*) as frequency
                   FROM ingestion_correction_log
                   WHERE tenant_id = '<id>' AND created_at > now() - interval '30 days'
                   GROUP BY field_path, action
                   ORDER BY frequency DESC
Result: line-item removals account for 40% of all corrections
Action: Improve extraction prompt to avoid extracting section subtotals as line items
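
One detail to keep in mind with the query above: line-item paths embed the item index (for example `line-items.2`), so grouping by raw `field_path` splits the same kind of correction across many rows. A possible way to fold them together after fetching the rows is sketched below. The helper names `normalize_path` and `top_corrected` are hypothetical, and the sketch assumes that purely numeric path segments are always line-item indexes.

```python
import re
from collections import Counter


def normalize_path(field_path: str) -> str:
    """Replace numeric index segments with '*' so line items group together."""
    return re.sub(r"(?<=\.)\d+(?=\.|$)", "*", field_path)


def top_corrected(rows: list[tuple[str, str]]) -> list[tuple[tuple[str, str], int]]:
    """Count (normalized field_path, action) pairs, most frequent first."""
    return Counter((normalize_path(path), action) for path, action in rows).most_common()


rows = [
    ("line-items.2", "line-item-removal"),
    ("line-items.5", "line-item-removal"),
    ("line-items.0.amount", "line-item-update"),
    ("subtotal", "field-update"),
]
ranking = top_corrected(rows)
```

With the sample rows, the two removals are counted together under `line-items.*` and rank first.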

6.3 Pipeline Health Monitoring

Scenario: Developer wants to measure correction rate over time.
Developer queries: SELECT
                     count(DISTINCT icl.ingestion_id)::float
                       / NULLIF(count(DISTINCT i.id), 0) as correction_rate
                   FROM ingestion i
                   LEFT JOIN ingestion_correction_log icl ON icl.ingestion_id = i.id
                   WHERE i.status = 'completed'
                     AND i.completed_at > now() - interval '30 days'
Result: 12% of ingestions required at least one correction
Action: Track this metric monthly to measure extraction quality improvements

6.4 Tracing a Tax ID Correction

Scenario: A supplier's tax ID type was reclassified from "vat" to "ein".
Developer queries: SELECT * FROM ingestion_correction_log
                   WHERE ingestion_id = '<id>' AND field_path LIKE 'issuer.tax-id%'
Result: Two corrections logged:
        - field_path: 'issuer.tax-id-type', old: "vat", new: "ein"
        - field_path: 'issuer.tax-id', old: "DE12345678", new: "12-3456789"
        Processor: tax-compliance-analyzer
Action: Developer verifies the reclassification was correct

7. Data Model

7.1 Correction Log Table

Table: ingestion_correction_log

Column	Type	Description
id	UUID	Primary key
ingestion_id	UUID (FK)	Which ingestion run produced this correction
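tenant_id	UUID (FK)	Tenant, used for row-level security (RLS) and as the partitioning key if partitioning is introduced later (none is needed initially)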
processor_id	TEXT	Which processor made the correction (e.g., `uncertain-validations-resolver`, `tax-compliance-analyzer`)
action	TEXT	One of: `field-update`, `line-item-update`, `line-item-removal`
field_path	TEXT	Dotted path to the corrected field (e.g., `subtotal`, `issuer.country`, `line-items.0.amount`)
check_name	TEXT (nullable)	Which validation check triggered this correction (e.g., `financial-math`). Null for non-validation corrections (e.g., TaxComplianceAnalyzer).
old_value	JSONB	The value before correction
new_value	JSONB (nullable)	The value after correction. Null for `line-item-removal`.
reasoning	TEXT	LLM's explanation for why the correction was made
confidence	NUMERIC	LLM's confidence score (0.0 - 1.0)
created_at	TIMESTAMPTZ	When the correction was logged

7.2 Indexes

Index	Purpose
`(ingestion_id)`	Look up all corrections for a specific ingestion
`(tenant_id, processor_id)`	Analytics: which processors produce the most corrections per tenant
`(tenant_id, field_path)`	Analytics: which fields are corrected most often per tenant

7.3 Example Rows

Field correction (subtotal):

Column	Value
processor_id	`uncertain-validations-resolver`
action	`field-update`
field_path	`subtotal`
check_name	`financial-math`
old_value	`100.0`
new_value	`105.0`
reasoning	`"Found shipping fee of 5.0 included in subtotal on PDF page 1"`
confidence	`0.85`

Line item removal:

Column	Value
processor_id	`uncertain-validations-resolver`
action	`line-item-removal`
field_path	`line-items.2`
check_name	`financial-math`
old_value	`{"description": "Subtotal Section A", "amount": 500.0, ...}`
new_value	SQL `NULL` (the column is empty; this is not a JSON `null`)
reasoning	`"This is a section subtotal row, not a billable line item"`
confidence	`0.90`

Tax ID type reclassification:

Column	Value
processor_id	`tax-compliance-analyzer`
action	`field-update`
field_path	`issuer.tax-id-type`
check_name	SQL `NULL` (not triggered by a validation check)
old_value	`"vat"`
new_value	`"ein"`
reasoning	`"US-based supplier, EIN format matches, not a VAT ID"`
confidence	`0.92`

8. Edge Cases & Business Rules

Edge Case	Behavior
Ingestion retried after failure (attempt_count > 1)	Delete existing correction logs for the ingestion before re-processing. The log always reflects the final successful run.
Post-processor runs but makes no corrections	No rows inserted. Absence of rows = no corrections applied.
Multiple corrections in a single processor run	Each correction is a separate row. The `ingestion_id` ties them together.
Correction sets a field to null	`new_value` is stored as JSON `null`. Distinct from `line-item-removal` where `new_value` column itself is SQL NULL.
Line item removal changes subsequent item indexes	Log each removed item's index in the line-item array as it stood before any removal or reindexing. Multiple removals in one run are all logged with these original indexes.
Processor crashes mid-correction	If the ingestion fails and retries, correction logs are deleted at the start of the next attempt (see retry behavior above).
Enrichment processor overwrites a field that already had a value	Out of scope for v1. Only corrective processors are tracked. If enrichment processors start overwriting extracted values, they should be added to scope.
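
Two of the rules above are easy to get subtly wrong: logging removals under their original indexes, and keeping JSON `null` distinct from SQL NULL in `new_value`. The following sketch shows one way to handle both. The function names are hypothetical, and Python's `None` stands in for SQL NULL, as it does in most database drivers.

```python
import json
from typing import Any, Optional


def encode_new_value(action: str, value: Any) -> Optional[str]:
    """Return the new_value column content: None (SQL NULL) for removals,
    otherwise JSON text, so a field set to null becomes the string 'null'."""
    if action == "line-item-removal":
        return None
    return json.dumps(value)


def remove_line_items(line_items: list[dict], indexes: set[int]) -> tuple[list[dict], list[dict]]:
    """Drop the given original indexes and log each removal under that index."""
    entries = [
        {"action": "line-item-removal",
         "field_path": f"line-items.{i}",
         "old_value": json.dumps(line_items[i]),
         "new_value": encode_new_value("line-item-removal", None)}
        for i in sorted(indexes)
    ]
    # Filter in a single pass so no index shifts while the log is being built.
    remaining = [item for i, item in enumerate(line_items) if i not in indexes]
    return remaining, entries


items = [
    {"description": "Widget", "amount": 200.0},
    {"description": "Subtotal Section A", "amount": 200.0},
    {"description": "Gadget", "amount": 300.0},
    {"description": "Subtotal Section B", "amount": 300.0},
]
remaining, entries = remove_line_items(items, {1, 3})
```

Removing indexes 1 and 3 in one pass logs `line-items.1` and `line-items.3`. Removing them one after another and logging the index at each step would record the second removal as `line-items.2`, which no longer matches the extracted array.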

9. Data Retention

No auto-cleanup: Correction logs are retained indefinitely. The table is narrow (one row per corrected field) and storage cost is minimal.
Future option: A TTL-based cleanup can be added later if the table grows large. The separate table design makes this straightforward.
No archival: No cold storage or partitioning needed initially.

10. Out of Scope (v1)

UI indicators: No visual markers on corrected fields in the invoice review UI
Enrichment tracking: Post-processors that add new fields (account matching, cost centers, accruals) are not logged
Real-time alerts: No notifications when corrections happen
Dashboard: No built-in analytics dashboard — developers use SQL directly
Correction approval workflow: No human-in-the-loop for reviewing corrections before they're applied
Full pipeline snapshots: No storing of complete structured-data at each pipeline stage
Correction undo: No mechanism to revert a correction to its original value

11. Open Questions

Field path format: Should field paths use dot notation (issuer.country) or vector notation ([:issuer :country])? Dot notation is more SQL-friendly; vector notation is more Clojure-idiomatic. Recommendation: dot notation for queryability.
Confidence threshold logging: Should we also log cases where the LLM analyzed a field but decided NOT to correct it (confidence too low or value confirmed correct)? This data could be useful for understanding false-positive rates. Recommendation: defer to v2.
Cross-processor conflicts: If both UncertainValidationsResolver and TaxComplianceAnalyzer try to correct the same field (e.g., tax-id), who wins? Currently they run in parallel and results are applied sequentially. The last writer wins, but both corrections would be logged. Recommendation: document this behavior and accept it for v1.
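
On the field path question, the two notations are mechanically interchangeable, so choosing dot notation for storage does not prevent code from working with path vectors. The sketch below uses Python lists in place of Clojure keyword vectors. The function names are hypothetical, and it assumes that no key contains a dot and that purely numeric segments are always line-item indexes.

```python
def to_dot_path(segments: list) -> str:
    """['issuer', 'country'] -> 'issuer.country'; integer segments are indexes."""
    return ".".join(str(segment) for segment in segments)


def from_dot_path(path: str) -> list:
    """'line-items.0.amount' -> ['line-items', 0, 'amount']."""
    return [int(segment) if segment.isdigit() else segment
            for segment in path.split(".")]
```

Under those assumptions the conversion round-trips, so the stored `field_path` can be turned back into a path vector whenever the pipeline needs to address the field in the structured data.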