Pending Schema Cleanups
Track schema changes that require backward-compatibility periods before final cleanup.
Active Cleanups
drop-ingestion-structured-data
| Field | Value |
|---|---|
| Original Migration | <timestamp>-add-document-history |
| Cleanup Migration | Pending; gated on production stability of the new migration with no open regressions |
| What Will Be Removed | ingestion.structured_data (JSONB column), replaced by document_history.patch for per-ingestion state and document.structured_data for current materialized state. Stops being written when the new migration drops trg_update_document_from_ingestion; ingestion worker writes transactionally to document_history + document instead. |
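
Once the gate is met, the cleanup migration for this entry could be as small as the sketch below. It assumes PostgreSQL and that no view, index, or trigger still depends on the column. The IF EXISTS guard is an addition for re-runnability, not taken from an existing migration.

```sql
-- Sketch of the pending cleanup migration (not yet written).
-- Assumes trg_update_document_from_ingestion has already been dropped by the
-- add-document-history migration, so nothing writes this column anymore.
ALTER TABLE ingestion
  DROP COLUMN IF EXISTS structured_data;
```

Omitting CASCADE is deliberate here: if something still depends on the column, the statement fails instead of silently dropping the dependent object.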
drop-ingestion-valid-structured-data
| Field | Value |
|---|---|
| Original Migration | <timestamp>-add-document-history |
| Cleanup Migration | Pending; gated on production stability of the new migration with no open regressions |
| What Will Be Removed | ingestion.valid_structured_data (BOOLEAN column), replaced by Malli schema validation in the ingestion worker, recorded implicitly by the presence of a document_history row with change_type='ingestion' (failed validation = no row). |
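
Readers that still need the old boolean can derive it from document_history, as in the sketch below. The columns ingestion.id and document_history.ingestion_id are hypothetical names used for illustration; the real link between the two tables may differ.

```sql
-- Derive the former ingestion.valid_structured_data flag.
-- ASSUMPTION: document_history.ingestion_id references ingestion.id.
SELECT i.id,
       EXISTS (
         SELECT 1
         FROM document_history h
         WHERE h.ingestion_id = i.id
           AND h.change_type = 'ingestion'
       ) AS valid_structured_data   -- false means validation failed (no row written)
FROM ingestion i;
```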
Pending Schema-Level Deprecations
Tables/columns/JSON keys that have stopped being written but are not yet
queued behind a specific cleanup migration. Add to "Active Cleanups" with a
Field/Value table once a cleanup migration is planned.
ap_ingestion_post_process_stat (entire table)
- Replaced by: document_processor_run (unified per-processor run history, any trigger kind).
- Stopped being written: when the diagnostics recompute pipeline shipped and post-process handlers started writing to document_processor_run.
- Gate to drop: backfill verified, no open queries against the old table.
document.matching_status, matching_error, matching_attempts, matching_failed_at
- Replaced by: document_processor_run rows where processor_id = 'matching'. Latest status via DISTINCT ON (processor_id) ... ORDER BY document_version DESC; attempts via COUNT(*).
- Stopped being written: 2026-04-16. All readers use the latest-matching-run LATERAL subquery against document_processor_run (see app.http.documents.shared/lateral-joins). Worker-level skip/fail now writes synthetic document_processor_run rows.
- Gate to drop: production stability window with no regressions on the new latest-matching-run LATERAL path.
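
The replacement reads described above can be sketched as two standalone queries. This is an illustration, not the actual latest-matching-run subquery from app.http.documents.shared/lateral-joins; the document_id and status columns on document_processor_run are assumed names.

```sql
-- Latest matching run per document (stands in for document.matching_status).
-- ASSUMPTION: document_processor_run has document_id and status columns.
SELECT DISTINCT ON (r.document_id, r.processor_id)
       r.document_id,
       r.status,
       r.document_version
FROM document_processor_run r
WHERE r.processor_id = 'matching'
ORDER BY r.document_id, r.processor_id, r.document_version DESC;

-- Attempt count per document (stands in for document.matching_attempts).
SELECT r.document_id,
       COUNT(*) AS matching_attempts
FROM document_processor_run r
WHERE r.processor_id = 'matching'
GROUP BY r.document_id;
```

DISTINCT ON keeps the first row of each group in ORDER BY order, so sorting by document_version DESC yields the most recent run.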
matching_status ENUM type
- Replaced by: processor_run_status ENUM.
- Gate to drop: after the columns above are dropped.
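
The ordering constraint can be made explicit in a single migration, sketched below under the assumption that matching_status is used only by the document columns listed earlier. No such migration exists yet.

```sql
-- Step 1: drop the deprecated matching columns on document.
ALTER TABLE document
  DROP COLUMN IF EXISTS matching_status,
  DROP COLUMN IF EXISTS matching_error,
  DROP COLUMN IF EXISTS matching_attempts,
  DROP COLUMN IF EXISTS matching_failed_at;

-- Step 2: drop the enum type. Without CASCADE this fails if any
-- column still uses the type, which acts as a safety check.
DROP TYPE IF EXISTS matching_status;
```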
document.reconciliation_status
- Replaced by: document.diagnostics.reconciliation.status (materialized) and document_processor_run rows where processor_id = 'reconciliation'.
- Stopped being written: .
- Gate to drop: reconciliation UI reads from document.diagnostics.
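
A reader migrating off document.reconciliation_status might select the materialized value as in the sketch below. It assumes document.diagnostics is a JSON or JSONB column whose nesting mirrors the dotted path above, and that document has an id column.

```sql
-- Read the materialized reconciliation status.
-- ASSUMPTION: diagnostics is JSON/JSONB shaped like {"reconciliation": {"status": ...}}.
SELECT d.id,
       d.diagnostics #>> '{reconciliation,status}' AS reconciliation_status
FROM document d;
```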
structured_data.{validation-results, fraud-flags, tax-issues} + structured_data.line-items[*].vat-validation
- Replaced by: document.diagnostics + document_processor_run.
- Stopped being written: .
- Gate to drop: migration verified, all readers moved to document.diagnostics.
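
One way to see how much legacy data remains before stripping the keys is to count rows that still carry them. The sketch assumes the keys live in document.structured_data and that the column is JSONB; jsonb_path_exists requires PostgreSQL 12 or later.

```sql
-- Count rows that still hold any of the deprecated keys.
-- ASSUMPTION: the keys are stored in document.structured_data (JSONB).
SELECT COUNT(*) AS rows_with_legacy_keys
FROM document
WHERE structured_data ?| ARRAY['validation-results', 'fraud-flags', 'tax-issues']
   OR jsonb_path_exists(structured_data, '$."line-items"[*]."vat-validation"');
```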
Completed Cleanups
add-fk-document-cluster-id
| Field | Value |
|---|---|
| Original Migration | 20260303065512-add-document-cluster-table |
| Cleanup Migration | 20260310190256-add-fk-document-cluster-id |
| Date Completed | 2026-03-10 |
| What Was Added | FK constraint: document.cluster_id REFERENCES document_cluster(id) ON DELETE SET NULL |
remove-ingestion-source-metadata
| Field | Value |
|---|---|
| Original Migration | 20260115124658-document-source-metadata |
| Cleanup Migration | 20260124100134-remove-ingestion-source-metadata |
| Date Completed | 2026-01-24 |
| What Was Removed | ingestion.source_metadata column, trg_copy_source_metadata trigger, copy_source_metadata_to_ingestion() function |
| Code Removed | :source-metadata from ingestion INSERT in queue-for-ingestion! |