AP approvals currently execute DATEV export from the app service after the final
approval. The export itself is implemented in com.getorcha.integrations.ap.maesn
and records DATEV-specific progress in ap_datev_export_audit.
We want the app to own user-facing workflow state and output intent, while a worker owns external output execution. This first implementation is intentionally narrow: AP final approval should enqueue a document output job, and an SQS worker should execute the existing DATEV export path. The design uses generic names so AR outputs can use the same mechanism later, but AR policy resolution and multi-target output auditing are out of scope.
document_output_job table as durable workflow/audit state.ap_datev_export_audit.Add document_output_job as the generic record of output intent and worker
execution state.
Columns:
id UUID PRIMARY KEY DEFAULT gen_random_uuid()tenant_id UUID NOT NULL REFERENCES tenant(id) ON DELETE CASCADEdocument_id UUID NOT NULL REFERENCES document(id) ON DELETE CASCADEdocument_domain document_output_domain NOT NULLtrigger document_output_trigger NOT NULLstatus document_output_status NOT NULL DEFAULT 'pending'document_version INTEGERrequested_by UUID REFERENCES identity(id) ON DELETE SET NULLstarted_at TIMESTAMPTZcompleted_at TIMESTAMPTZlast_error TEXTcreated_at TIMESTAMPTZ NOT NULL DEFAULT now()updated_at TIMESTAMPTZ NOT NULL DEFAULT now()Enums:
document_output_domain: initially ap; later ar.document_output_trigger: initially approval_completed and manual; later
scheduled or AR-specific triggers as needed.document_output_status: pending, running, dispatched, failed.The job table must not include a target. A job means "dispatch this document's
outputs." Which systems receive output is a worker/code decision in this phase
and later a tenant policy decision. Auditing individual output targets is useful,
but out of scope for this implementation.
Useful indexes:
(document_id, created_at DESC) for document detail/debugging.(tenant_id, created_at DESC) for operational views.(status, created_at) for future repair tooling.CREATE UNIQUE INDEX idx_document_output_job_document_active ON document_output_job (document_id) WHERE status IN ('pending', 'running').
This rejects a second manual re-export while one is already in flight and also
protects against final-approval dispatch racing another enqueue path. Later,
when AR or scheduled outputs need concurrent dispatch semantics, this can be
relaxed or replaced with a more specific uniqueness key.Add a fifth queue to resources/com/getorcha/config.edn under
:com.getorcha/aws :queues:
:document-outputv1-orcha-global-document-outputv1-orcha-global-document-output-dlqQueue attributes should match the current local queue helper posture unless production infrastructure requires the same setting elsewhere:
VisibilityTimeout = 60 seconds on the queue.heartbeat-extension-seconds before starting work.MessageRetentionPeriod = 604800 seconds, 7 days.maxReceiveCount = 3.Worker Integrant config should mirror the existing SQS consumers:
max-queue-messages = 10wait-time-seconds = 20heartbeat-extension-seconds = 300heartbeat-rate-seconds = 60stale-running-seconds = 600Message body for V1 is the job id as a plain UUID string. This matches the ingestion queue's simple id message style and avoids introducing a message schema before the worker has multiple message kinds. Future message versions can switch to JSON if needed.
The stale-running threshold is intentionally longer than the visibility extension window and several heartbeat periods. It lets a replacement worker recover from a dead worker, while avoiding duplicate export initiation during normal heartbeat jitter.
The approval handler continues to lock approval rows and update the final row in
the existing transaction. When the approval transition makes the document fully
approved, the same transaction inserts a document_output_job with:
document_domain = 'ap'trigger = 'approval_completed'status = 'pending'document_id, tenant_id, document.version, and approver identityAfter the transaction commits, the handler sends an SQS message to the document
output queue with the job-id.
If the transaction fails, no approval or job is committed. If the transaction
commits but SQS send fails, update the job to failed with last_error and
surface that dispatch failure through the document detail UI. Stale pending
jobs should therefore mean the app committed the job and attempted dispatch, but
the process stopped before it could mark send failure; operators can find these
rows directly, and a future sweeper can enqueue stale pending jobs if this proves
common.
The failure update after an SQS send error must be guarded:
UPDATE document_output_job SET status = 'failed', last_error = ?, updated_at = now(), completed_at = now() WHERE id = ? AND status = 'pending'.
If the update affects zero rows, the app must not overwrite the current state;
it should re-read the job and render the current dispatch state. This prevents an
ambiguous send error from marking a job failed after a worker has already
claimed it.
The document detail re-export action should use the same dispatcher path. The
handler validates user access and basic document eligibility, inserts a
document_output_job with:
document_domain = 'ap'trigger = 'manual'status = 'pending'document_id, tenant_id, document.version, and requester identityAfter creating the job, the handler sends the SQS message with job-id. If SQS
send fails, it marks the job failed with last_error and returns a UI state
that reflects the dispatch failure. It should not call
maesn/create-booking-proposal! directly.
Manual re-export must reject or surface "already in flight" when the partial
unique index prevents creating a second pending job for the same document. It
must not enqueue another SQS message for a document with an active pending or
running dispatch job.
Add an SQS worker namespace following existing worker patterns:
job-id.workers/ap/ingestion.clj.document_output_job.updated_at while the job
remains running. The stale-running claim uses this timestamp as the worker
liveness marker.UPDATE document_output_job SET status = 'running', started_at = COALESCE(started_at, now()), updated_at = now() WHERE id = ? AND (status = 'pending' OR (status = 'running' AND updated_at < now() - interval '600 seconds')) RETURNING *.dispatched: delete the message.running: leave the message to be retried after visibility timeout,
without deleting it. It must not silently convert a potentially stuck job
into a permanently stuck job.document_domain = aptrigger = approval_completed or manualmaesn/create-booking-proposal!dispatched once DATEV export has been handed to DATEV/Maesn far
enough that ap_datev_export_audit owns the remaining async outcome tracking.failed with last_error when export initiation fails.DATEV completion remains asynchronous and continues to be represented by
ap_datev_export_audit. The output job records that the output dispatcher
accepted and dispatched the configured output work.
The worker should not invent a new concurrency convention in this plan. It should reuse the existing SQS worker pattern: virtual-thread-per-task executor plus SQS visibility extension during long-running work.
Add a nullable dispatch_job_id UUID REFERENCES document_output_job(id) column
to ap_datev_export_audit.
When the output worker calls the DATEV connector, the connector should record the
job id on the audit row it creates. This gives operators a direct link from the
generic output dispatch attempt to the DATEV-specific async task state. Existing
manual exports or historical rows can have NULL.
Dispatch-level failures must be visible even when no DATEV audit row exists.
Examples include SQS send failure after job creation, missing DATEV integration,
or worker-side eligibility failure before maesn/create-booking-proposal!
creates an audit row.
Add a Postgres NOTIFY trigger for document_output_job changes on status
insert/update. The payload should include:
event/type = output-dispatchdocument-output-job/iddocument-output-job/statusdocument/idtenant/idorganization/idold-statusThe document detail SSE handler should react to :output-dispatch by
re-rendering the DATEV export section using both:
ap_datev_export_auditdocument_output_jobThe DATEV export section should surface dispatch state when it is relevant:
pending or running: show a dispatching/export-requested state and keep the
SSE subscription active.failed with no newer DATEV audit: show an inline failure using
document_output_job.last_error and allow retry when the regular approval and
DATEV eligibility rules allow it.dispatched: prefer the DATEV audit state when present; otherwise show a
transient dispatched/exporting state until the audit event arrives.Manual re-export should also use this path. If the app creates a job but fails to
send SQS, it should mark the job failed, render the DATEV export section with
that dispatch failure, and not pretend that DATEV export has started.
The worker must tolerate duplicate SQS delivery.
Rules:
dispatched: delete the message.running: do not start another external export.running: claim atomically and retry, using the same claim function as
pending jobs.failed: delete the message. Re-dispatch of failed jobs is future manual or
repair tooling, not SQS redelivery in this implementation.Before initiating DATEV export, the worker should preserve existing DATEV eligibility behavior and avoid creating a new export when the document already has a non-retryable successful export state.
If a stale running job already has a linked DATEV audit with a task id or a
terminal status, the worker should not create another DATEV export. It should
complete the dispatch job according to the linked audit state, or leave it for
manual repair if the audit state is ambiguous.
Remove AP batch DATEV export entirely:
/toggle, /toggle-all, /deselect-all, and /export-datev routes
when they only support batch export.bulk-export-datev.Manual single-document re-export remains available in the document detail UI, but its execution moves to the document output worker. The required behavior changes for this plan are final-approval export via worker, manual re-export via worker, and removal of batch export.
Before adding the migration, review current export-related schema and code. Initial expectations:
ap_datev_export_audit. It tracks DATEV-specific request payloads,
task IDs, statuses, errors, payload hashes, and powers the existing UI/SSE
status updates. Add dispatch_job_id rather than replacing this table.tenant_datev_integration. It stores DATEV connection state,
credentials, config, and metadata.Only remove schema that is proven unused and unrelated to DATEV task auditing or DATEV connection state.
Add or update focused tests:
document_output_job and sends an SQS message after
the transaction commits.document_output_job, sends SQS,
and does not call DATEV directly from the app handler.pending or running output job.failed only with a guarded
WHERE status = 'pending' update and never overwrites running or
dispatched.:document-output with queue name, DLQ, visibility timeout,
retention, and max receive count defined in this spec.UPDATE ... WHERE status = 'pending' OR stale running older than 600 seconds ... RETURNING *.dispatched.failed with last_error.dispatch_job_id.