Note (2026-04-24): After this document was written, legal_entity was renamed to tenant and the old tenant was renamed to organization. Read references to these terms with the pre-rename meaning.
Date: 2026-04-17
Status: Approved for implementation planning
Per-processor SSE events (diagnostic-run-started:<id> and diagnostic-run-completed:<id>) drive diagnostic section updates in the document detail view. For each IProcessor run, a DB trigger on document_processor_run fires a pg_notify; the SSE listener forwards a fragment swap for the matching section.
This model fails for fast processors. Validations completes in ~42 ms; its started and completed events arrive at the client within the same HTMX swap window. outerHTML swap removes the element that owned the SSE listener before the new element's listener re-registers. The second event is dropped. The section stays on "Recomputing…" indefinitely.
Tax Compliance has an adjacent problem: the SSE re-render path wraps only the tax-issues slice into the renderer, dropping fields that come from structured-data (issuer/recipient country, service category). The section re-renders without its header chips.
Both symptoms are consequences of a model that pushes per-processor granular updates to the UI when the UI only has five coarse-grained diagnostic sections. The work the UI does on each granular event is the same: fetch current DB state and re-render the section.
Replace per-processor events with one atomic diagnostics-recomputed SSE event per recompute cycle, emitted by the application layer. The event's payload is one SSE message containing hx-swap-oob fragments for every diagnostic section, each targeting its stable id. The browser processes all OOB fragments together; no listener churn, no race.
The "recomputing underway" UI state is conveyed by OOB swaps in the edit HTTP response (sections render as :stale since the user's edit bumped document.version past the completed runs). No SSE event is needed for that transition.
Single SSE event: diagnostics-recomputed.
Payload shape (NOTIFY JSON):
{"event/type": "diagnostics-recomputed",
"document/id": "…uuid…",
"legal-entity/id": "…uuid…",
"tenant/id": "…uuid…"}
No section-specific data in the payload. The SSE handler reads authoritative state from the DB and renders every section.
Three application-layer producers fire diagnostics-recomputed via pg_notify on the document_events channel:
1. Recompute orchestrator (workers/diagnostics_recompute/orchestrator.clj). Wraps the whole recompute-all! body in try … (finally …). The finally always fires the event: success, processor failure, or orchestrator exception.
2. Matching worker (workers/ap/processors/matching/worker.clj). Same try/finally around its processor invocation. Matching's completion used to flow through the per-run-row trigger; with that trigger going away, the worker takes over.
3. Reaper (db.run/reap-stuck-running!). After flipping orphaned rows to :failed, groups by document_id and fires diagnostics-recomputed for each affected document so stuck-looking UI converges promptly rather than waiting the 30-minute reap interval without user feedback.

pg_notify → existing document_events LISTEN connection in app/ingestion.clj. Adding :diagnostics-recomputed to the DocumentEvent malli schema lets it flow through the current listener/publisher with no new plumbing. Cross-JVM-safe for when the recompute and SSE endpoints live in different processes.
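The three producer paths above can be sketched as follows. This is a minimal illustration, not the actual implementation: the fired-events atom, the function bodies, and the :document-id row key are assumptions, and the real fire-diagnostics-recomputed! would call pg_notify rather than record into an atom.

```clojure
;; Sketch only: `fired-events`, the function bodies, and the :document-id
;; row key are hypothetical stand-ins for the real code.
(def fired-events (atom []))

(defn fire-diagnostics-recomputed!
  "Stand-in for the real helper: records the event instead of calling pg_notify."
  [document-id]
  (swap! fired-events conj {:event/type  "diagnostics-recomputed"
                            :document/id document-id}))

(defn recompute-all!
  "Runs `run-processors!` and fires the event on the way out, whether the
  body returns normally or throws."
  [document-id run-processors!]
  (try
    (run-processors! document-id)
    (finally
      (fire-diagnostics-recomputed! document-id))))

(defn fire-for-reaped-rows!
  "Given the rows the reaper flipped to :failed, fires one event per
  affected document."
  [reaped-rows]
  (doseq [document-id (distinct (map :document-id reaped-rows))]
    (fire-diagnostics-recomputed! document-id)))

;; Success path: one event.
(recompute-all! "doc-1" (fn [_] :ok))

;; Failure path: the exception still propagates, and the event still fires.
(try
  (recompute-all! "doc-2" (fn [_] (throw (ex-info "processor blew up" {}))))
  (catch clojure.lang.ExceptionInfo _ :propagated))

;; Reaper path: three stuck rows across two documents yield two events.
(fire-for-reaped-rows! [{:document-id "doc-3"}
                        {:document-id "doc-3"}
                        {:document-id "doc-4"}])

(mapv :document/id @fired-events)
;; => ["doc-1" "doc-2" "doc-3" "doc-4"]
```

One caveat with this shape: if the notify call inside finally itself throws, it masks the original exception, so a real helper may want to catch and log its own failures.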
SSE handler in app/http/documents/view/shared.clj:detail-events gets a new case:
:diagnostics-recomputed
{:event "diagnostics-recomputed"
:data (hiccup/html (render-all-diagnostic-sections db-pool document-id le-id-set))}
render-all-diagnostic-sections does one DB read per source (document, latest-runs-per-processor, matches/reconciliation/cluster-peers, supplier-verification), then concatenates section fragments. Each fragment has hx-swap-oob="outerHTML" and its stable id (#diagnostic-section-validations, #diagnostic-section-fraud-detector, #diagnostic-section-tax-compliance-analyzer, #diagnostic-section-reconciliation, #section-matches).
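As a rough sketch of the concatenation step, the fragments can be modeled as hiccup-style vectors, one per stable id, each carrying the OOB attribute. The element tag, the placeholder body text, and the status-by-section input map are assumptions made for illustration; the real renderer reads its state from the DB and delegates to the section components.

```clojure
;; Sketch only: the :section tag, the body text, and the input map are
;; hypothetical. The ids are the stable ids listed above.
(def section-ids
  ["diagnostic-section-validations"
   "diagnostic-section-fraud-detector"
   "diagnostic-section-tax-compliance-analyzer"
   "diagnostic-section-reconciliation"
   "section-matches"])

(defn oob-fragment
  "Hiccup-style vector for one section, marked for an out-of-band outerHTML swap."
  [section-id body]
  [:section {:id section-id :hx-swap-oob "outerHTML"} body])

(defn render-all-sections
  "Returns one OOB fragment per known section, in a fixed order. Sections
  missing from `status-by-section` render as :never-run."
  [status-by-section]
  (mapv (fn [section-id]
          (let [status (get status-by-section section-id :never-run)]
            (oob-fragment section-id (str "status: " (name status)))))
        section-ids))

(def fragments
  (render-all-sections {"diagnostic-section-validations" :current
                        "section-matches"                :failed}))

(count fragments)
;; => 5

(first fragments)
;; => [:section {:id "diagnostic-section-validations", :hx-swap-oob "outerHTML"} "status: current"]
```

Because every section is always emitted, the client never has to reason about which subset changed; htmx swaps each fragment into the element with the matching id.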
The edit HTTP response (app/http/documents/edits.clj) calls render-all-diagnostic-sections and appends the output to the existing fragment. Because the edit has just bumped document.version past every completed run's document_version, the section classifier returns :stale for each one. The browser OOB-swaps all sections to stale without any SSE round-trip.
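The version comparison behind that :stale result might look like the sketch below. The function name, the run map keys, the keyword status values, and the precedence between the branches are all assumptions; the document only states that a completed run whose document_version is behind document.version classifies as :stale.

```clojure
;; Sketch only: `classify-section`, the map keys, and the branch order are
;; hypothetical. The state names come from the testing notes in this document.
(defn classify-section
  "Classifies one section from the current document version and the latest
  run row for its processor (nil if the processor never ran)."
  [document-version latest-run]
  (cond
    (nil? latest-run)                                   :never-run
    (= :running (:status latest-run))                   :recomputing
    (< (:document-version latest-run) document-version) :stale
    (= :failed (:status latest-run))                    :failed
    :else                                               :current))

;; Before the edit: the run matches the document version.
(classify-section 3 {:status :completed :document-version 3})
;; => :current

;; After the edit bumps document.version to 4, the same run is stale.
(classify-section 4 {:status :completed :document-version 3})
;; => :stale

(classify-section 4 {:status :failed :document-version 4})
;; => :failed

(classify-section 4 nil)
;; => :never-run
```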
New:
- render-all-diagnostic-sections in app/http/documents/view/shared.clj: single source of truth for "render every diagnostic section at current DB state, with OOB attributes."
- fire-diagnostics-recomputed! in app/ingestion.clj (or a new app/events.clj if the helper surface grows): wraps pg_notify with a well-shaped payload.
- Migration dropping trigger_processor_run_event and notify_processor_run_event().

Modified:
- app/ingestion.clj: DocumentEvent schema adds the :diagnostics-recomputed variant; DiagnosticRunStartedEvent and DiagnosticRunCompletedEvent are removed.
- app/http/documents/view/shared.clj: detail-events adds the :diagnostics-recomputed case; the :diagnostic-run-started / :diagnostic-run-completed cases and the render-diagnostic-section helper are removed; the dead :matching case is removed.
- app/http/documents/edits.clj: edit response includes the render-all OOB fragment.
- app/ui/components.clj: validation-results-section, fraud-detection-section, tax-compliance-section, reconciliation-section, contract-validation-section drop their :hx-ext, :sse-swap, :hx-swap attributes. The stable :id stays as an OOB target.
- app/http/documents/view/shared.clj:matches-section: drops sse-swap="matching-complete".
- workers/diagnostics_recompute/orchestrator.clj: try/finally fires the event.
- workers/ap/processors/matching/worker.clj: try/finally fires the event.
- db/document_processor_run.clj: reap-stuck-running! returns affected document IDs (or accepts a callback) so the caller can fire events.

Removed:
- DiagnosticRunStartedEvent, DiagnosticRunCompletedEvent schemas.
- render-diagnostic-section helper.
- :diagnostic-run-started, :diagnostic-run-completed, :matching cases in detail-events.
- […] app/ingestion.clj and app/http/documents/view/shared.clj.
- trigger_processor_run_event DB trigger and notify_processor_run_event() function.

- fail-run! records status='failed'; the section classifier returns :failed; the badge renders "Analysis failed." No new code.
- try/finally fires diagnostics-recomputed on the way out. The subsequent render reads whatever ended up in the DB (some sections :current, some :failed, some :stale if they never ran). The UI always converges on a terminal state rather than staying stuck on :stale.
- The reaper fires diagnostics-recomputed per affected document after flipping stale running rows, so orphans don't leave the UI stuck for 30 min.

Unchanged except for what the matching worker emits:
- trg_notify_document_classified trigger → SSE :ingestion event with status=:in-progress → no re-render (handler skips in-progress).
- notify_ingestion_event trigger → SSE :ingestion event with status=:completed → existing status-changed SSE → whole #document-area re-renders. No change.
- The matching worker fires diagnostics-recomputed on completion. The matches section and any other affected sections re-render.

- render-all-diagnostic-sections: seed documents with combinations of run states (:current, :stale, :failed, :recomputing, :never-run) and verify the output contains the right section fragments with OOB attributes.
- app.ingestion-test gains a case coercing a :diagnostics-recomputed NOTIFY payload through DocumentEvent.
- recompute-all! success path: fire-diagnostics-recomputed! called exactly once (via spy or with-redefs).
- recompute-all! failure path: stub one phase-2 processor to throw; fire-diagnostics-recomputed! still called exactly once.
- reap-stuck-running!: seed a stuck row, run the sweep, assert row flipped and fire-diagnostics-recomputed! called once per affected document.

The full NOTIFY → listener → SSE round-trip is not tested: the with-db-rollback fixture discards pg_notify, same as today. We cover the listener coercion and renderer paths with direct input tests instead.
One PR, one migration. Client and server must roll out together — the DB trigger is dropped in the same migration that removes the per-processor event cases from the listener and SSE handler. The diagnostic section components are updated in lockstep to stop relying on sse-swap.
No backward-compat shim. A running server with the old UI loaded won't receive per-processor events after the migration; it will receive the new atomic event, which its old sections don't listen for. Users with a stale tab see "Recomputing…" indefinitely — same failure mode that prompted this work — and need to refresh. Acceptable given the scope of the deploy (local/staging/prod all upgrade at once).
complete-notice! in workers/ap/ingestion.clj). Independent issue; tracked separately.