Changelog
v1.3007
Changes since: v1.2992 (commit 1837b0153)
Released: 2026-05-20
Total commits: 6 (orcha)
Files changed: 13 (+1,079 / -179 lines)
Email Notifications: Deliverability + HTML body + document link
User-facing notification emails now include a proper HTML body alongside
plain text, and link directly to the affected document. Headers and
configuration are tuned for deliverability (auth-aligned From,
List-Unsubscribe / List-Unsubscribe-Post, plain-text alternative,
and explicit Reply-To). Test coverage expanded accordingly.
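A minimal sketch of the header set described above, written in Python with the standard library for illustration only. The addresses, subject line, and function name are assumptions, not the application's actual values; the List-Unsubscribe-Post value is the one-click form defined by RFC 8058.

```python
from email.message import EmailMessage


def build_notification(to_addr, doc_url, unsubscribe_url):
    """Build a multipart/alternative notification email (illustrative only)."""
    msg = EmailMessage()
    msg["From"] = "Orcha <notifications@mail.example.com>"  # assumed address
    msg["To"] = to_addr
    msg["Reply-To"] = "support@example.com"  # assumed address
    msg["Subject"] = "A document needs your attention"  # assumed subject
    # One-click unsubscribe header pair (RFC 8058).
    msg["List-Unsubscribe"] = f"<{unsubscribe_url}>"
    msg["List-Unsubscribe-Post"] = "List-Unsubscribe=One-Click"
    # Plain-text part first, then the HTML alternative with the document link.
    msg.set_content(f"A document needs your attention: {doc_url}")
    msg.add_alternative(
        f'<p>A document needs your attention: <a href="{doc_url}">open document</a></p>',
        subtype="html",
    )
    return msg
```

Calling add_alternative after set_content turns the message into multipart/alternative, so clients that cannot render HTML still get the plain-text part.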
Fixes
- FVR/UVR corrections now persist to structured_data: The v2 IProcessor
migration (commit 41000fd23) silently dropped FVR's and UVR's
structured-data writes, so corrections were recorded in the diagnostic
slice (correction-applied?: true) but never reached the document row.
Downstream consumers (notably the DATEV booking-proposal builder) then
read uncorrected values and produced invalid payloads. Restored
persistence by accumulating JSON-Patch ops in the resolvers and
declaring -writes; UVR now runs in its own phase before FVR to honour
the engine's disjoint-writes invariant on overlapping paths.
- Maesn payload: strip payment-confirmation lines from fully-paid
invoices: German hotel/retail receipts often list the payment
alongside billed items in a single Debit/Kredit table (e.g.,
"Visa Card -148,46"). Extraction keeps the negative line so the UI
mirrors the invoice, but Maesn's booking-proposal endpoint requires
totalGrossAmount = sum(lineItems). structured-data->api-payload
now detects the fully-paid-with-offsetting-payment-line pattern and
strips the negative line(s) from the payload sent to Maesn.
structured-data is unchanged; only the outbound payload transforms.
- Maesn payload: skip prepayments for any fully-paid invoice: When
amount-due is 0, unconditionally skip folding prepayment items into
lineItems. The previous predicate required a negative line to be
present, which missed re-ingested cases where extraction had dropped
the payment line and FVR had re-added the payment as a prepayment.
Partial-payment behavior unchanged.
- DATEV connector: use datev-rewe-export auth endpoint: Maesn renamed
the trial integration; updated the configured auth endpoint to keep
DATEV sign-in working.
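The two Maesn payload fixes above can be sketched roughly as follows. This is a simplified Python illustration, not the actual structured-data->api-payload code: the field names (line_items, gross, total_gross, amount_due) are hypothetical, and prepayment handling is left out.

```python
from decimal import Decimal


def to_api_line_items(structured):
    """Return the line items to send; the input dict is never mutated."""
    items = structured["line_items"]
    total = structured["total_gross"]
    if structured["amount_due"] != 0:
        return list(items)  # partial payment: behaviour unchanged
    positives = [i for i in items if i["gross"] >= 0]
    negatives = [i for i in items if i["gross"] < 0]
    # Strip only when the negative lines offset the full total and the
    # remaining lines add up to it (the fully-paid receipt pattern).
    if (negatives
            and sum(i["gross"] for i in positives) == total
            and -sum(i["gross"] for i in negatives) == total):
        return positives
    return list(items)


receipt = {
    "total_gross": Decimal("148.46"),
    "amount_due": Decimal("0"),
    "line_items": [
        {"description": "Room", "gross": Decimal("148.46")},
        {"description": "Visa Card", "gross": Decimal("-148.46")},
    ],
}
payload_items = to_api_line_items(receipt)  # only the "Room" line remains
```

The stored line items keep the payment row, so the UI still mirrors the invoice; only the outbound list changes.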
v1.2992
Changes since: v1.2974 (commit 65c58660e)
Released: 2026-05-19
Total commits: 17
Files changed: 88 (+3,114 / -1,733 lines)
VAT Rate Statement Check (§14 Abs. 4 Nr. 8)
New validation enforcing that German invoices state the applicable tax rate
(or, for tax-exempt supplies, a reference to the exemption) as required by
§14 Abs. 4 Nr. 8 UStG.
- Schema: Optional :tax-rate-stated? field added to InvoiceData
- Extraction: The extraction prompt now asks whether the invoice explicitly
states the tax rate or an exemption reference
- Validation: New check-tax-rate-stated validation rule, resolved as a
fourth sub-path in the UVR (uncertain-validations-resolver) check,
reading legal-basis and document-type to decide applicability
- Findings: The tax-rate-stated warning now surfaces across the ordering
lists alongside other validation findings
Fixes
- In-table prepayment/deposit rows are treated as line items: Prepayment
and deposit rows appearing inside the line-item table are now extracted as
line items rather than misclassified as prepayments, fixing totals and
reconciliation on affected invoices. Applies to newly ingested (or
recomputed) documents.
v1.2974
Changes since: v1.2971 (commit 0df4ba3aa)
Released: 2026-05-18
Total commits: 3
Files changed: 10 (+157 / -30 lines)
Fixes
- Validation line-item messages are more legible: format-line-item-issue
now caps the line-item description at 50 characters (was 30) and spells out
page references as (page 1) / (pages 1–2) instead of the cryptic
[p1] / [p1-2]. Applies to newly ingested (or recomputed) documents.
- Diagnostic boxes no longer disappear for clean documents: the validation,
fraud-signals, and reconciliation sections (invoice / contract / GRN / PO
views and the OOB refresh path) now render whenever the analyzer has run,
showing the all-clear state, and are hidden only for documents ingested
before the analyzer existed (:never-run).
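A rough Python sketch of the new line-item message format. The 50-character cap and the (page N) / (pages N–M) forms come from the note above; the truncation marker, the separator between description and message, and the function names are assumptions.

```python
def format_page_ref(pages):
    """Spell out a page reference, e.g. (page 1) or (pages 1–2)."""
    pages = sorted(set(pages))
    if not pages:
        return ""
    if len(pages) == 1:
        return f"(page {pages[0]})"
    return f"(pages {pages[0]}–{pages[-1]})"


def format_line_item_issue(description, message, pages, max_len=50):
    """Cap the description at max_len characters and append the page reference."""
    if len(description) > max_len:
        # Assumed behaviour: cut and mark the truncation with an ellipsis.
        description = description[: max_len - 1].rstrip() + "…"
    return f"{description}: {message} {format_page_ref(pages)}".rstrip()
```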
v1.2971
Changes since: v1.2459 (commit 3d8fccf8b)
Released: 2026-05-18
Total commits: 511
Files changed: 510 (+130,413 / -3,984 lines)
The majority of the repo-wide diff is the new wiki-browser/ subproject and
the www/ v3 marketing-site rewrite. The sections below cover the orcha
application changes only.
Production Resilience Hardening (Phase 1)
Post-outage (2026-05-16) hardening of the workers pipeline and infrastructure.
- Global heavy-work concurrency gate: All 5 heavy per-message worker tasks (ingestion, matching, diagnostics-recompute, document-output, acquisition) gated through a shared semaphore; configurable via ORCHA_HEAVY_CONCURRENCY (default 3), registered as a component
- call-with-permit inline gate helper: Reusable permit-acquisition wrapper
- SQS visibility ordering fix: Extend message visibility before acquiring the heavy-work permit; delete the SQS message only after gated work succeeds (fixes DLQ regression where slow permit waits dropped messages)
- Resilient poll loop: Shared run-poll-loop with exponential backoff + full jitter; backoff exponent clamped so a sustained failure can't overflow-kill the loop; all 5 poll loops routed through it
- JVM memory tuning: MaxRAMPercentage 75→60, fail-fast + heap dump on OOM; orcha capped at 3200m
- Infrastructure: 2G swapfile via cfn-init, t4g unlimited CPU credits, CW agent emits swap + ASG-aggregated mem/swap/disk, mem/swap/disk CloudWatch alarms, 15-min ALB-unhealthy-sustained alarm, guarded SSM auto-replace (alarm-state gated, skips Terminating-lifecycle instances)
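The backoff behaviour described for the poll loop can be illustrated like this. It is a generic Python sketch of exponential backoff with full jitter and a clamped exponent, not the run-poll-loop implementation; the base, cap, and exponent limit are assumed values.

```python
import random


def backoff_delay(failures, base=1.0, cap=60.0, max_exponent=10, rng=random.random):
    """Seconds to sleep after `failures` consecutive failed polls."""
    # Clamp the exponent so a long failure streak cannot overflow the
    # arithmetic (in Python, 1.0 * 2 ** 1_000_000 raises OverflowError).
    exponent = min(failures, max_exponent)
    ceiling = min(cap, base * (2 ** exponent))
    # Full jitter: pick uniformly between 0 and the exponential ceiling.
    return rng() * ceiling
```

Full jitter spreads retries from several workers across the whole interval, so they do not all hit the queue again at the same moment.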
SendGrid Outbound Email
- Provider-dispatched send!: New email-send abstraction with sendgrid and log providers
- Outbound SES removed: SES outbound client, send-email!, identity, and IAM removed
- Routed channels: User email channel, admin email, and channel-verification email all routed through email/send!
- Config-driven: Provider/api-key in notifications config; env-overridable for local dev (test stays :log)
- DNS: mail.getorcha.com SendGrid DNS records applied; SSM placeholder for the SendGrid key
Fixes
- Admin org lookup: Use [:in :id ...] for organization lookup (was emitting an invalid uuid_array cast)
- Maesn: Redact api-key from response logs, surface get-user-info failures
- DATEV: Point production at the non-sandbox Maesn API key
v1.2459
Changes since: v1.1995 (commit df240647)
Released: 2026-05-02
Total commits: 463
Files changed: 460 (+76,318 / -10,557 lines)
Inline Field Editing
- Editable invoice fields: Header, party, payment, line items, accruals, delivery, and bank/payment fields all editable inline
- Editable-value helper: New hiccup helper with hover/edit/saving CSS states and provenance threading
- Scalar edit endpoint: HTTP API with optimistic locking via document.version
- Line item operations: Add, delete, and reorder rows (SortableJS drag-and-drop)
- Master data picker: Floating popover for line-item account fields
- Reset all edits: Single button to revert human edits to latest derivation
- Per-field reset: Individual undo for human-edited fields
- Toast notifications: HX-Trigger surfacing of success/error feedback
- Locale-aware numeric input: Decimal/thousand separator handling in edit fields
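Optimistic locking on a version column, as used by the scalar edit endpoint, generally looks like the sketch below. It uses Python with an in-memory SQLite table purely for illustration; the table layout and function names are assumptions and the real endpoint is not shown in this changelog.

```python
import sqlite3


def make_db():
    """Create a throwaway table with one document at version 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE document (id INTEGER PRIMARY KEY, total TEXT, version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO document (id, total, version) VALUES (1, '100.00', 1)")
    return conn


def save_total(conn, doc_id, expected_version, new_total):
    """Apply the edit only if nobody else bumped the version in the meantime."""
    cur = conn.execute(
        "UPDATE document SET total = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_total, doc_id, expected_version),
    )
    return cur.rowcount == 1  # False means a concurrent edit won
```

A caller that gets False would typically reload the document and ask the user to retry, which matches the 409-style conflict handling mentioned elsewhere in these notes.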
Document History & Versioning
- document_history table: Append-only audit log with derivation and edit change types
- Document version: Optimistic locking column on document table
- JSON-patch applier: Path-based patch operations with [id=X] array selectors
- Transactional ingestion completion: Document write + history insert in a single transaction with FOR UPDATE lock
- Document provenance walker: Computes per-field provenance from history
- Line-item annotation: :id and :order required on every LineItem and BreakdownItem
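To show why [id=X] selectors are useful, here is a small Python sketch of a replace operation that addresses array elements by id instead of by index. The path syntax (slash-separated segments, with [id=X] as its own segment) is an assumption; the changelog does not specify the actual format.

```python
def resolve(container, segment):
    """Follow one path segment; '[id=X]' selects the list element with that id."""
    if segment.startswith("[id=") and segment.endswith("]"):
        wanted = segment[4:-1]
        for element in container:
            if str(element.get("id")) == wanted:
                return element
        raise KeyError(segment)
    return container[segment]


def apply_replace(doc, path, value):
    """Apply a 'replace' op in place and return the document."""
    *parents, last = path.strip("/").split("/")
    target = doc
    for segment in parents:
        target = resolve(target, segment)
    target[last] = value
    return doc


doc = {"line-items": [{"id": "a", "amount": 1}, {"id": "b", "amount": 2}]}
apply_replace(doc, "/line-items/[id=b]/amount", 5)
```

Because the selector names the element rather than its position, a patch stays valid after rows are reordered, which is presumably why every LineItem now carries an :id.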
Unified Processors Engine (IProcessor v2)
- Single processors engine: All processors share one core run loop with conditional filtering
- Edit-mode triggering: Engine reuses runs whose declared reads don't intersect edited paths
- Phased execution: Validations, fraud, matching, reconciliation, supplier, accounts, etc. ordered by phase
- Migrated processors: accounts, accruals, cost-center, supplier-matcher/verifier, tax-compliance, financial-validation-resolver, uncertain-validations-resolver, fraud-detector, validations, matching, reconciliation
- Reads-based gating: Each processor declares the structured-data paths it consumes
- Engine op filter: Preserves user-edited fields by skipping derivation that would overwrite them
- Cold-start exemption: Engine runs all processors on first ingestion regardless of reads overlap
- Two-phase run rows: document_processor_run rows inserted at start and updated at completion
- Reap stuck runs: Orchestrator reaps long-running runs and fires diagnostics-recomputed
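Reads-based gating, edit-mode triggering, and the cold-start exemption can be pictured with the Python sketch below. Treating two paths as overlapping when one is a prefix of the other is an assumption about the engine's semantics, and the processor maps are hypothetical.

```python
def paths_overlap(a, b):
    """True if one path is a prefix of the other (paths are tuples of keys)."""
    n = min(len(a), len(b))
    return a[:n] == b[:n]


def processors_to_rerun(processors, edited_paths, cold_start=False):
    """Names of processors whose declared reads touch any edited path."""
    if cold_start:
        # First ingestion: run everything regardless of reads overlap.
        return [p["name"] for p in processors]
    return [
        p["name"]
        for p in processors
        if any(paths_overlap(r, e) for r in p["reads"] for e in edited_paths)
    ]


processors = [
    {"name": "accounts", "reads": [("line-items",)]},
    {"name": "supplier-matcher", "reads": [("supplier",)]},
]
rerun = processors_to_rerun(processors, [("line-items", "li-1", "amount")])
```

Processors left out of the result keep their previous run, which is what lets a single-field edit avoid a full recompute.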
Document Diagnostics Atomic Updates
- document.diagnostics column: Diagnostic outputs moved out of structured_data into dedicated jsonb column
- document_processor_run table: Per-processor-per-version run tracking with status, version, input hash, and result
- Diagnostics-recompute SQS queue: Edit handler enqueues debounced recompute messages
- SSE auto-refresh: HTMX sse-swap on diagnostic sections; diagnostics-recomputed events trigger OOB renders
- Section state classifier: Per-section badge helpers and CSS for green/yellow/red states
- Atomic write helper: update-diagnostic! supports JSON-path sub-path writes
- Backfilled history: Existing matching, reconciliation, validation status seeded from prior columns
- Schema cleanup: Removed diagnostic keys from invoice, contract, GRN, and PO structured-data schemas
AP Approvals Workflow
- Tenant approvers: Configure ordered list of approvers per tenant with admin UI
- Approval handlers: Approve, reject (with cascade), revoke (most-recent only)
- Snapshot on completion: Approver list captured at ingestion completion for audit immutability
- Inline edits gated: Edits disabled when document is awaiting approval
- Auto-fire DATEV: Final approval triggers DATEV export through document output worker
- Schema migration: tenant_ap_approval_config, approvers, snapshot, and audit tables
- 409 conflict handling: Concurrent approval attempts return structured error
- Display-name rendering: Approver names resolved from identity for UI display
AP Processing Modes
- Mode schema: New processing_mode column on tenant config (straight-through, supervised, etc.)
- Centralized policy: ap.processing namespace gates edits, output requests, and approval state
- Human review snapshot: Edits gated on whether document has entered human review
- Output authorization: Output requests authorized by mode and approval state
- Admin configuration: Tenant detail page exposes processing-mode picker
- Straight-through: Output requested automatically after matching completes
Document Output Dispatcher
- Generic output job system: New document_output_job schema and worker
- DATEV via output jobs: Manual and auto-fired DATEV exports dispatched through output worker
- Status surfacing: UI badges show "Awaiting approval", "Dispatch failed", and dispatch progress
- Duplicate handling: 23505 uniqueness fall-back path tested; tolerates active jobs during final approval
- Audit linking: Export audits reference output job IDs
Admin Tenant Detail Page
- Anchor-nav shell: Tenant detail page with section anchors and quick-stats header
- Identity section: Inline PUT-per-field with column whitelist and 400/404 error tightening
- Sections added: Ingestion sources, DATEV integration, OAuth integrations, master data, prompt customizations, file store, notifications, booking history, API keys (scoped create + inline revoke), QA dataset summary, stats & cost (with chart)
- Organization rows: HTMX expand/collapse with tenants-by-organization aggregation
- Scoped create: Create-tenant modal launched from organization row actions
- Tenant chart: Cost/usage chart constrained to fixed height, scripts rendered without HTML escaping
Schema Rename: legal_entity → tenant
- Bulk rename: legal_entity → tenant across schema, code, namespaces, routes, and tests
- Old "tenant" sense → organization: Top-level tenant record renamed to organization
- Migration: Up/down migration with constraint and FK renames; Migratus statement separators
- Vocabulary docs: Architecture and plan docs updated for new naming
- Schema abbreviation cleanup: le-* keys renamed to tenant-*
- Route rename: /legal-entities → /tenants
AI Observability
- AI events namespace: Macros for emitting structured logging events
- Event types: ai.llm.call, ai.ocr.call, mcp.tool-call, agent run/iteration/tool-call
- Pipeline phase wrapping: 13 pipeline steps wrapped with with-ai-phase
- Per-iteration scope: :ai-iteration MDC scoped per-iteration with iteration events
- MDC context: Request-id, identity, scope, method, tenant pushed to MDC in MCP middleware
- Flattened MDC: Standard MDC keys flattened in JsonLayout
- Stop-reason capture: :stop-reason included in LLM generate return maps
- Invariants test: Asserts pipeline event sequence for AI events
- bb db:clone-prod: Clone production RDS to local dump file via throwaway RDS instance
- Snapshot reuse: --fresh-snapshot and --freshness-hours flags
- Snapshot tagging: Throwaway source snapshots tagged and cleaned up; leaked-snapshot listing
- bb db:list-clones / bb db:list-clone-snapshots: Surface leaked instances and snapshots
- Schema diff script: Compare local DB against prod dump
- DB credentials via JDBC properties: Avoids leaking creds in URL
- Helper tasks: db:drop-clone and ancillary clone tooling
- Runbook: New runbook for prod-clone refactor testing
MiniStack Replaces LocalStack
- MiniStack: Replaces LocalStack across dev infrastructure
- Manual seeding: bb dev:seed instead of bb dev:init-aws
- Sandbox boot: Wait for MiniStack seed in repl-entrypoint
- Service rename: localstack → ministack across service name, system property, and seed dir
- SQS reliability: maxReceiveCount and visibility-extension fixes; ignore S3 test events instead of cycling to DLQ
- Polling halt: Workers interrupt polling threads on halt in dev
Validation & Matching Fixes
- Recipient identity: Postal-code anchoring; legal-weight gate for vat-id/tax-id; review status for semantic name/address equivalence
- Demo account suppression: Recipient-identity check returns all-match for Orcha demo legal entity
- Supplier verification: Gemini responseSchema enforced; re-run on tax-id edits with staleness in UI; race-lost dedup stats
- Gemini temperature: Set to 0 for deterministic outputs
- UVR: New not-applicable LLM verdict; persist debug envelope on required-fields resolver
- Tax-compliance: Use upsert for tax-id correction ops
- Supplier matcher: Declares :validation-results/supplier-match write
- Recipient identity warning card: New UI card; unified validation card styling
AP UI & Navigation
- Supplier filter: Dropdown on AP overview list
- Next/prev navigation: Respects active sort column, status priority, and legal-entity filter
- Validation tab: Highlighting, section layout, and scroll behavior fixes
- List highlighting: Document management list highlights row on back-navigation
- Master data card: Layout fix; FVR banner full-width
- Read-only invoice view: Avoids recomputing UI on read-only views
DATEV Export Fixes
- Validation banner alignment: Cover-page findings match in-app validation banner
- Skonto1 fields: Placed per Maesn API contract
- Integration credentials: Use connected DATEV integration credentials
- export-eligible? routing: Routed through connected-datev-integration
- Diagnostics-tolerant export: Allow export when diagnostics report errors
- Re-export path: Document path parameter declared
- Unexported labels: Avoid exporting label for unexported documents
Teams Integration Fixes
- JWT verification: Hardened token validation against Microsoft JWKS
- Tenant-specific consent URL: Per-tenant consent flow with correct tenant ID propagation
- Tenant ID shadowing: Fixed callback shadowing bug that lost the tenant context
- Providers config: Updated provider configuration entries
- Manifest updates: Dev-environment Teams manifest refreshed for sideload distribution
Deploy Resilience
- Fail-fast on missing SQS queues: :com.getorcha.aws/state init now throws an explicit error naming any queue that doesn't exist (typical cause: CDK has not been deployed yet) instead of an opaque AWS SDK stack trace. Surfaces the cause clearly in CloudWatch logs instead of waiting on health-check timeouts.
- Explicit clean-orphans command: Removed automatic orphan cleanup from the regular migrate path. Cleanup now requires an explicit bb migrate clean-orphans invocation (CLI) or (com.getorcha.repl.migrations.clean-orphans/run!) (prod REPL). Prevents the rollback-corrupts-tracking failure mode where a failed deploy + rollback would silently delete tracking rows for migrations whose schema was already applied.
- CDK: document-output queue + complete SQS IAM grants: Foundation stack now creates v1-orcha-global-document-output queue and DLQ. Compute stack IAM policy extended to grant SQS access for both diagnostics-recompute and document-output queues, closing pre-existing gaps that caused QueueDoesNotExistException on app boot.
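The fail-fast check on missing queues amounts to comparing the configured queue names against what actually exists and naming the difference. A Python sketch under that assumption follows; the exception class, function name, and message wording are hypothetical, and no AWS calls are made here.

```python
class MissingQueuesError(RuntimeError):
    """Raised at startup when configured queues are absent."""


def check_queues(required, existing):
    """Raise an explicit error that names every missing queue."""
    missing = sorted(set(required) - set(existing))
    if missing:
        raise MissingQueuesError(
            f"SQS queue(s) not found: {', '.join(missing)} "
            "(has the CDK stack been deployed?)"
        )
```

Failing during init with the queue name in the message puts the cause in the first log lines instead of leaving it to be inferred from health-check timeouts.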
Documentation & Marketing
- Integration research: Pleo, Moss, Scopevisio, SAP, Weclapp added
- Proposals: EmPact, EcoG, Zattoo
- Community articles: Sub-agents (DE/EN), CFO coding (DE revised + EN new), intro-claude refresh
- Privacy/security: Data privacy & security section on home and product pages
- Compliance: O1 TOM, O3 sub-processor list, O14 Transfer Impact Assessments codebase-verified reviews
- Repository guidelines: Contributing docs, agent docs (gh, clj-nrepl-eval)
- Plans/specs: Unified processors, document diagnostics, edit history, AP approvals, AP processing modes, document output dispatcher, account master data picker, prod-clone refactor, legal_entity → tenant rename, fix-segmentation, pipeline correctness testing
Other Notable Changes
- Sandbox tooling: jq and playwright-cli with chromium deps installed in sandbox
- HTTP logging: Unsupported HTML response bodies logged
- Removed: Batch DATEV export, dead bulk-selection CSS, deprecated matching_status writes, processor-run-event trigger, per-processor diagnostic event schemas
- Auto-update Claude Code CLI: On container start in sandbox
- End-to-end edit recompute test: Covers full edit → diagnostics-recompute path
- Self-heal integrant.repl: Preparer self-heals on every (reset)
v1.1995
Changes since: v1.1767 (commit cae6408c)
Released: 2026-04-12
Total commits: 227
Files changed: 139 (+13,559 / -15,339 lines)
Multi-Document PDF Splitting
- Segmentation gate: Ingestion pipeline splits multi-document PDFs into child documents with page ranges
- Classification detection: LLM classifies whether a PDF contains multiple logical documents and returns page segments
- Content-hash dedup: SHA-256 hash on segmented children prevents re-processing on reingest
- Idempotency: Classification result cached in DB; segmentation skipped if already run
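Content-hash dedup for segmented children can be sketched as below, in Python for illustration. Hashing the child's raw bytes and keeping seen hashes in a set are assumptions; the changelog only states that a SHA-256 hash prevents re-processing on reingest.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hex SHA-256 digest of a child document's bytes."""
    return hashlib.sha256(data).hexdigest()


def new_children(children, seen_hashes):
    """Return only children not seen before; records their hashes in seen_hashes."""
    fresh = []
    for child in children:
        digest = content_hash(child)
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            fresh.append(child)
    return fresh
```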
Recipient Identity Validation
- New validation check: Verifies invoice recipient matches the legal entity via extracted name/address
- Vision fallback: UVR extended to resolve recipient identity warnings using document image
- Validation UI: Recipient identity warnings shown in formal requirements box with tooltips
- Demo account suppression: Warnings hidden for demo legal entity
Dense Layout Handling
- Density detection: Computes density ratio for OCR layout elements to identify tightly-packed tables
- Density-aware grouping: Row grouping algorithm adapted for dense layouts to prevent incorrect merges
- Vision fallback: Dense pages automatically routed to vision transcription pipeline
Reconciliation Improvements
- Scoped to current document: Issues filtered to only those referencing the viewed document
- Contract exclusion: Contracts no longer assigned to match clusters
- Display redesign: Summary-only view replacing details table, traffic-light badges on list pages
- Linked placeholders: [doc:<uuid>] placeholders resolved to clickable document links
- Cluster-aware resolver: Sibling documents in cluster now resolve correctly, not just direct matches
- Relaxed surcharge rules: Invoice-only charges use LLM judgment instead of fixed 5% threshold
QA Ground Truth Dataset
- Admin editor: Full snapshot capture system for QA ground truth with dataset versioning
- Enriched snapshots: Includes prompt customizations and tenant config alongside document data
Post-Processing Refactor
- Phased pipeline: Fraud detection extracted from monolithic module into focused post-processing phases
- Contract risk: Separated into dedicated contract_risk module
- Fee handling fixes: Fee line items included in amount-due validation and total formula
Document View Layout
- Compact sidebar: Sidebar collapses on document detail page to maximize viewing area
- PDF panel constraint: PDF preview capped at 720px max-width for better readability
- Smooth transitions: Sidebar animates on collapse/expand, content swaps preserve state
Master Data Management
- Delete buttons: Settings UI includes delete actions for uploaded master data records
- CSV encoding fix: Character encoding issues resolved for uploaded CSV files
Matching Improvements
- Reference overlap: Candidate retrieval widened to include documents with overlapping references
- Non-matchable types: Pipeline skips non-matchable document types instead of failing
Other Changes
- Upload limit: Manual upload limit increased from 5 to 10 documents
- UI polish: Validation box colored borders, check label colors, section icon alignment
- Notification channels: Grid layout and button styling fixes
- Service address field: New field for utility/rent invoices to support cost center matching
- IBAN/BIC extraction: Strengthened prompt rule to reduce LLM non-determinism
- Tax compliance: Filter out missing-tax-id issues when tax-id correction was applied
- Teams integration: Updated for SingleTenant bot and dev environment
- Slack notifications: Fixed document link path in anomaly notifications
- Classification prompt: Added attachment/appendix guidance for multi-doc detection
- Gross validation: Added line-item-sum vs total check to close validation gap
v1.1767
Changes since: v1.1343 (commit 0630092a)
Released: 2026-03-23
Total commits: 423
Files changed: 423 (+80,853 / -15,667 lines)
AP/AR Domain Isolation
- Namespace rename: com.getorcha.erp → com.getorcha.app, workers under workers.ap
- Table rename migration: AP-specific tables prefixed with ap_ (doc_source → ap_doc_source, ingestion → ap_ingestion, etc.)
- Route restructure: AP routes mounted at /ap instead of /documents/accounts-payable
AI Module
- New com.getorcha.ai package: Extracted ai.llm, ai.prompts, ai.tools from workers namespace
- Agent loop: LangChain4j-based agent loop with tool bridging for agentic classification
- Large document handling: Chunked extraction with page-based splitting, summary page detection, and merge of chunked results
- Large document classification: Agent loop with summary detection for multi-page invoices
Google Drive File Store
- FileStore implementation: Google Drive backend with token refresh and Shared Drive support
- Settings UI: Google Drive connection page with Picker integration
- OAuth flow: Drive-specific OAuth routes and authorization
- MCP integration: resolve-file-store with dev override for FP&A tools
Transcription Pipeline
- PDFBox-first transcription: Per-page text extraction with OCR/vision fallback only for pages below character threshold
- Image text supplementation: Spatial merge of PDFBox text elements with OCR results for mixed content pages
- PDFBox layout extraction: Positioned text elements with image detection
- Native PDF parsing disabled: Document AI fallback uses rasterized pages only
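The per-page fallback decision in the PDFBox-first approach can be pictured as follows. This Python sketch only shows the threshold test; the threshold value, the labels, and the function name are assumptions, and the actual text extraction and OCR calls are not modelled.

```python
def plan_transcription(page_texts, min_chars=40):
    """Map each 1-based page number to 'pdf-text' or 'ocr-fallback'."""
    return {
        number: "pdf-text" if len(text.strip()) >= min_chars else "ocr-fallback"
        for number, text in enumerate(page_texts, start=1)
    }
```

Pages with enough embedded text skip OCR entirely, so only scanned or image-heavy pages pay for the slower path.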
Multi-Document Upload
- Split upload button: Type-filtered upload with popover status tracking
- Three-phase status: Uploading → classifying → processing popover indicators
- SSE integration: Real-time document row updates on AP and Document Management pages
- S3-first upload: Upload to S3 before DB insert to prevent orphaned document records
Invoices: Installments & Fees
- Installment schema: New Installment type and line item category enum (service, fee)
- Extraction prompt: Updated to extract installments, fees, and corrected amount-due
- FVR updates: Handle amount-due errors with fees and installments
- Maesn export: Include installments as positive line items in booking proposals
Matching & Reconciliation
- Reconciliation: Cross-document issue detection with LLM-based reconciliation
- Match badges: Visual match status indicators on AP and Document Management list pages
- Hardened prompts: Scoped reconciliation to cross-document issues, minimum-2 document constraint
OAuth Consolidation
- Unified OIDC provider: Shared protocol for Google and Microsoft OAuth
- Shared token management: Extracted email.oauth.tokens, email.oauth.state, oauth.core/token-request!
- App auth middleware: Validated Microsoft token expiration, unified JWKS caching
- Excel sandbox: SCI-based sandboxed Clojure evaluation for Excel operations
- Excel functions: profile, find, read-range, read-ranges, merged-regions, named-ranges
- FastExcel migration: Replaced docjure/POI with FastExcel for production reads
- Data map refactor: Renamed context → data-map, added MCP resources support
- Tool registry: Migrated to ai.tools with scope-gated MCP adapter
Responsive UI
- Sidebar collapse: App and admin sidebars collapse below 1024px with overlay
- Mobile layout: Single-column KPIs, stacked integration cards
- Line item redesign: Space-efficient data panel layout with content-visibility
Ingestion Fixes
- Schema validation logging: Log only error paths instead of full malli schema (eliminates multi-KB log lines)
- Missing prompt fields: Added contact-person to PO and GRN supplier extraction prompts
- Cost center post-processor: Use assoc nil instead of dissoc for missing employee field
- Post-process stats: Idempotent insert on retry
- Extraction fixes: Single-string addresses, double-counted discount handling, tax ID party association
- Snapshot system: Dump/restore document table state with S3 sync
- Demo pages: Ported controlling, payroll, tax-filing, monthly-closing, supplier-communication
- Tolaria: systemd service, install/setup/run scripts, demo environment config
Website & Marketing
- Homepage redesign: Apple-style design with product pages (DE & EN)
- AI engine section: Chip design with starburst animation
- Community articles: 15+ new articles (OCR vs AI, Claude Enterprise, Excel, sub-agents, etc.)
- Product features: March 2026 slide deck
v1.1343
Changes since: v1.1326 (commit edc952bd)
Released: 2026-03-05
Total commits: 16
Files changed: 20 (+2,082 / -282 lines)
MCP & OAuth
- MCP protocol upgrade: Updated to version 2025-11-25
- Tool initialization fix: Call init-tools! during Link handler startup to register MCP tools in production
- OAuth flow redesign: Ocean wave animation and styled success page for MCP OAuth flow
ERP & UI
- Error pages: Styled 404/403/500 error pages for ERP
- Cross-tenant documents: Auto-switch tenant for super admins accessing cross-tenant document links
- Login animation: Replaced compass animation with ocean waves
Document Processing
- GRN extraction: Guide single quantity field mapping to quantity-received
- Evidence scoring: Normalize VAT IDs and fall back to quantity-ordered
Infrastructure
- Sandbox container: Switched to GraalVM native-image base, added rlwrap
- Debug tooling: Handle PGobject, enum, and array columns in debug:fetch-match-cluster
v1.1326
Changes since: v1.1250 (commit fcb985b9)
Released: 2026-03-05
Total commits: 75
Files changed: 161 (+53,267 / -1,172 lines)
New MCP tools for financial planning & analysis data mapping and master data access.
- orcha-fpna-data-map: Loads Data Discovery Protocol and legal entity context for FP&A mapping sessions
- orcha-fpna-save-dm: Incremental save of data map entries discovered during FP&A sessions
- orcha-fpna-list-files: List and summarize Excel files from legal entity data sources
- orcha-master-data-legal-entities: Legal entity master data lookup
- orcha-data-master-data: GL accounts, cost centers, and business partner master data
- FileStore protocol: Pluggable file access (local filesystem, S3) for FP&A tools
- OAuth scope migration: Replaced mcp:* scopes with domain-scoped docs:read, fpna:read
- Tool registry refactor: Replaced atom-based MCP tool registry with defmulti dispatch
- debug-match skill: Systematic matching error investigation via prod cluster data
- debug:fetch-match-cluster: Babashka task to fetch match cluster data from production
- Shared debug utilities: Extracted common debug helpers into debug_common.clj
Website & UI
- Presentation slides: Added slides from OrchaSlides repo
- Wave animation: Increased contrast and speed on website
- Tips card glow: White text and rotating border glow on featured AI tips card
- Pageview analytics: IP address tracking
- App nav link: Added to website navigation
Infrastructure
- sandbox:claude: --infra flag to start REPL in background
v1.1250
Changes since: v1.1227 (commit c7dfff9e)
Released: 2026-03-04
Total commits: 22
Files changed: 74 (+10,882 / -223 lines)
Invoice-GRN Matching
Direct matching between invoices and goods-received notes (GRNs).
- GRN extraction improvements: Better handling of delivery notes, deduplicated normalized references, accept delivery-note-numbers as alternative to grn-number
- Reference field vectors: Convert document reference fields from singular strings to vectors
- Evidence signals: GR-reference-exact signal and date-within-period extended for GRN receipt-date within invoice service period
- LLM prompts: Add prompt entries for invoice-GRN matching decisions
- Matchable pair registration: Register invoice-GRN as matchable pair, update UI counterpart types
- Community page: Blog posts, videos, newsletter signup, and resource pages (DE + EN)
- WebGL orb animation: Hero section orb animation for website and login page
- Pageview tracking: DIY tracking via Google Sheets with Apps Script integration
Login Page Rework
- Split-screen layout: Redesigned login page with AI tips and ocean wave animation
v1.1227
Changes since: v1.1187 (commit 906d65c2)
Released: 2026-03-03
Total commits: 39
Files changed: 54 (+4,444 / -223 lines)
Line-Item Reconciliation
LLM-based line-item comparison between matched documents (e.g., invoice vs contract).
- Reconciliation engine: New namespace with LLM-based line-item comparison and issue detection
- Pipeline integration: Wire reconciliation into process-document! after matching
- Reconciliation UI: Badge and issues list in matches section showing discrepancies
- LE-specific prompts: Add customer-specific prompt customizations for matching and reconciliation
Matching SSE Updates
Real-time matching status updates on the document detail page.
- Postgres trigger: Notify on matching status changes via pg_notify
- SSE handler: Handle matching events in detail page SSE stream
- Live UI updates: SSE connector in matches section for automatic refresh on completion
Microsoft Teams Integration
Notification channel integration with Microsoft Teams via Azure Bot.
- Teams notification channel: Send document processing notifications to Teams channels
- Single-tenant Azure Bot: Adapt for single-tenant bot registration
- Multi-tenant routing: Support multiple tenants with proper channel targeting
- Infrastructure: Add Teams SSM parameters to FoundationStack
Skip "Other" Documents
Handle documents classified as unsupported types gracefully.
- Skip extraction: Documents classified as "other" bypass extraction pipeline
- S3 cleanup: Delete S3 files for unsupported documents
- UI handling: Show friendly message for unsupported types, exclude from document list
Document Clusters
- Cluster table: Add document_cluster junction table with FK constraint
- Cluster assignment: Update assign-cluster! to use new document_cluster rows
Fixes
- Matching schema: Add matching-status to Document schema, fix SSE status checks
- False-alarm notifications: Suppress notifications for insufficient-documents status
- Migration orphan cleanup: Run orphan cleanup after migrate so schema_migrations table exists
- Test FK constraint: Use create-cluster! in document_matching_test
v1.1187
Changes since: v1.1154 (commit 07c76cef)
Released: 2026-03-02
Total commits: 32
Files changed: 107 (+4,715 / -3,042)
Many-to-Many Matching
Rewrote the matching pipeline to support many-to-many document matching with per-candidate LLM evaluation.
- Many-to-many decide-matches: Rewrite matching to evaluate each candidate independently instead of picking a single best match
- Per-candidate LLM prompts: Updated LLM prompt structure for individual candidate evaluation
- Match UI pagination: Show all matches with HTMX-powered pagination
- Always-visible match section: Two-state UI showing matches or "no matches" instead of hiding the section
- Silent failure prevention: Fix matching worker to surface errors instead of swallowing them
- Confidence display fix: Show LLM confidence level instead of blended score in match cards
Pairing-Specific LLM Prompts
- Pair-specific prompts: Add specialized LLM matching prompts per document pair type (invoice-contract, invoice-delivery, etc.)
- Candidate-type threading: Thread candidate-type through decide-matches and llm-match-decision to select the right prompt
Migration Squash
- Squash 39 migrations: Consolidate migrations up to v1.963 into a single init file
- Orphan cleanup: Auto-clean orphaned schema_migrations rows on migrate, with generic key extraction
Other
- Deploy task: Add bb deploy task to trigger CodePipeline
- Contract rename: Rename financially-active to recurring for contracts
- TOCTOU race fix: Handle race condition in supplier verification during parallel ingestion
- UI fixes: Document management navigation, matches placement, and minor improvements
v1.1154
Changes since: v1.963 (commit 2d2edd18)
Released: 2026-02-27
Total commits: 190
Files changed: 303 (+109,920 / -3,568)
Document Matching Pipeline
Full document matching system: evidence-based scoring, LLM verification, cluster management, and UI.
- Core matching engine: Evidence signal collection, scoring pipeline, and LLM-based match decisions
- Hybrid candidate retrieval: BM25 + semantic embedding search with RRF fusion, replacing unranked retrieval
- Blend scoring: Integrate cosine similarity retrieval scores into the evidence scoring pipeline
- Evidence signals: PO/contract reference matching, VAT/IBAN matching, quantity/amount comparison, date period validation, supplier name fuzzy matching, description overlap
- LLM matching: Claude-based match confirmation with retry logic and transient/non-transient error classification
- Cluster management: Automatic cluster assignment when matches are created, cluster merging
- Status tracking: matching_status column with pending/in-progress/succeeded/failed states, attempt counting
- Matching UI: Match cards on document detail page with blended score, evidence display, and section navigation
- SQS worker: Document matching queue with document-ready event publishing after ingestion
- Infrastructure: SQS queue and CloudWatch alarms for matching pipeline
- Normalized columns: normalized_counterparty and normalized_references for efficient candidate filtering
- Searchable text: Enriched with line items, deliverables, and pricing for BM25 retrieval
- Backfill migration: Populate searchable_text and embeddings for existing documents
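A minimal sketch of Reciprocal Rank Fusion (RRF), the technique named in the hybrid candidate retrieval entry above. The function name, the document ids, and the constant k=60 (a commonly used default) are illustrative assumptions, not taken from the codebase.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document ids with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per document, with rank starting at 1.
    k=60 is an assumed default, not a value confirmed by the changelog.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a deterministic order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers.
bm25_hits = ["inv-7", "inv-2", "inv-9"]
embedding_hits = ["inv-2", "inv-5", "inv-7"]
fused = rrf_fuse([bm25_hits, embedding_hits])
```

Documents that appear in both lists rise to the top even when neither retriever ranked them first, which is the usual motivation for fusing by rank rather than by raw score.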
Contract Enhancements
- Compliance & obligations post-processors: GDPR/NIS2/DORA compliance checking and key obligations extraction
- Contract detail redesign: Hero card, compliance section, obligations display, new section ordering
- Principal/counterparty rename: Rename party-a/party-b across schema, validation, matching, fraud detection, and display
- Legal entity context: Inject legal entity data into extraction prompt for principal identification
- Financially active contracts: Expanded contract types and financial activity tracking
Documents Module Refactor
- Module split: Break monolithic document-management into submodules (accounts-payable, management, upload, shared, view)
- Type-specific detail views: Dedicated view modules per document type with unified SSE and dispatch
- Route restructure: New namespace-based route assembly with type dispatch
Supplier Verification
- Automated verification: Auto-verify green suppliers, remove manual approve/reject flow
- Unknown supplier handling: Automated verification for unknown suppliers with TTL and improved name matching
Other Features
- Email viewer tab: View original email content in document detail
- Contract classification: Add contract subtype to classification step
- VAT/Tax ID validation: Offline checksum validation for 47+ countries
- Semantic search: Hybrid BM25 + embedding retrieval integration
- Claude Code sandbox: Isolated Docker development environment with Clojure CLI, psql, nREPL proxy, Testcontainers, clj-kondo
Fixes
- DATEV REWE: Fix OAuth state persistence, journal entry parameters, missing fiscal years, large batch inserts
- Matching reliability: Error handling for silent failures, DB transaction atomicity, message deletion fixes
- Migration fixes: Statement separators, pgvector/array casting, enum type casts
v1.963
Changes since: v1.888 (commit c9b2578c)
Released: 2026-02-19
Total commits: 74
Files changed: 128 (+15,038 / -3,650)
DATEV REWE Integration
Full integration with DATEV REWE accounting via Maesn API for importing booking history.
- DATEV REWE connection flow: Public HTMX pages for OAuth-based account linking
- Journal entry import: Sync REWE journal entries with sync-rewe-journal-entries!
- Booking history matching: Match invoices against supplier-specific DATEV booking history
- Journal entry mapping: Convert DATEV REWE entries to booking history items
- Link management: Create, validate, and mark DATEV REWE links as used
- Settings UI: DATEV REWE section with copyable link text
Commits: 0c8078df, 6296c617, 5ce7ad1b, 7e389206, a1cf023b, 78177236, 43d1c4bc, 2ed9bc47, 68482df9, 057db012, 95fd01cd, fa622d21, fa9e6ffb, deb97c20, 86bdc5df, 238d7d90, d1b498ea, 79bd429a, a53f6348
Link MCP Server
OAuth 2.1-authenticated MCP server for external integrations.
- OAuth 2.1 authentication: Full OAuth flow with PKCE support
- Dynamic client registration: JWT-based client IDs with automatic validation
- Google/Microsoft IdP integration: OAuth callback handlers for identity providers
- Database-backed flow storage: Secure state management for OAuth flows
- Document Query API v1: Query documents via MCP protocol
- Infrastructure: SSL certificate, target group, listener rules, DNS records, CloudWatch alarms
Commits: b993adc7, 0fba649b, 57b96dec, 426fa061, 009a298b, 94033408, 59ca32ef, 6ed1091f, fd4a821f, 57ce8344, 58cc43f3, 8ab23b0e, 9ff7f784, 45d72c58, 861fda1a, 648e40e4, 0ca38d9b, e2776540, 111588c4, af5bf9b3, 161c189f
UI Improvements
- VAT/BU reasoning display: Show tax and BU code reasoning in data panel for all statuses
- Settings page improvements: Sortable modals, all columns visible, better error handling
- Email validation: Inline app-styled errors replace native browser validation
- Multi-line SSE fix: Handle multi-line SSE data per spec to prevent truncated updates
Commits: 559ef3e9, 387c1bdd, 97f9c85e, aa1e866b, e86665e2
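For the multi-line SSE fix above, a sketch of how a multi-line payload can be framed under the Server-Sent Events format: every payload line gets its own data: field, and the client joins consecutive data: fields with a newline. The helper name is hypothetical, and the sketch assumes the fix was on the sending side.

```python
def sse_event(data, event=None):
    """Serialize one SSE event, emitting one data: field per payload line."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # A raw newline inside a single data: field would end the field early,
    # so each payload line is sent as its own field.
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


frame = sse_event("<div>\n  ok\n</div>", event="update")
```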
Bug Fixes
- Mailto link extraction: Fixed HTML mailto links leaking into forwarding chain extraction
- Booking history upload: Fixed handler for async Ring middleware
- Test schema fixes: Fixed mismatches after legal entity hierarchy migration
Commits: 0334ad8c, 7fd0d664, fbd72ce4, a70a7097
Developer Experience
- clojure-eval skill: nREPL evaluation for testing code in running REPL
- reingest-doc skill: Re-run document ingestion after code changes
- Test output silencing: Cleaner test output while preserving REPL/CI logging
- Ralph environment: Autonomous development environment configuration
Commits: d4455d62, a99a2f3d, b2d9add1, b1982af2
v1.888
Changes since: v1.847 (commit 2782e1d4)
Released: 2026-02-16
Total commits: 40
Files changed: 44 (+4,474 / -438)
Notification System
Multi-channel notification system for email and Slack alerts.
- Notification channels: Configure Slack webhooks and team member email addresses per tenant
- Admin notifications: Mirrored to Slack with legal entity context
- Email notifications: SES-based email delivery with inline HTMX updates
- Slack OAuth: OAuth flow for channel selection with redesigned UX
- Event-based alerts: Notifications for critical failures, email sync gaps, and subscription expiry
- Skip internal tenants: Admin notifications skip Orcha tenant
Commits: c7702017, 19b10465, 10348a58, 160ba8d3, 75f0b391, 939a1bb0, 9279fb94, 10fca302, 748414b8, 0ee38ce1, f182538a, 37abc8c5
Duplicate Detection
Prevent duplicate document ingestion through statement-based exclusion.
- Content hash exclusion: Skip documents matching existing statements for tenant
- Statement exclusion list: Configurable exclusion rules per legal entity
Commits: de6c79ed
Infrastructure
- Consolidated AWS clients: Single Integrant component for all AWS clients
- Input/output token display: Admin cost tables show input and output tokens separately
Commits: efc5397b, 48cabba7, 1b9a7717
UI Improvements
- Document list sorting: Fixed to preserve legal-entity filter and add stable tiebreaker
- Prompt customizations matrix: Restored on tenants admin page
Commits: 9607219a, 0f55428b, becd521f
Bug Fixes
- SES multi-recipient handling: Fixed OAuth qualified key access for multi-recipient emails
- VAT rate comparison: Fixed mismatch comparison using numeric equality
- HoneySQL FILTER syntax: Fixed admin users query aggregate function
- IP address test: Fixed to accept any valid IP format
- Bulk actions: Added deselect all button to bulk actions bar
Commits: cbeec94e, 1efe186e, 0f3e7ced, 5d7ddc99, faf673ba, b16d0508
Developer Experience
- Test guidelines: Prefer clj -X:test over REPL testing
- AWS SSO hint: Added login hint to debug-doc skill
- Routing helper: Use erp.http.routes/path-for instead of direct reitit calls
- Qualified key guideline: Added guideline for inline predicates
Commits: abce94c2, 89fe85ef, 8e8344b8, 38ea25c2, 8be8344b
v1.847
Changes since: v1.822 (commit dfd0b96b)
Released: 2026-02-13
Total commits: 24
Files changed: 45 (+4,737 / -605)
REST API for Master Data
Added REST API endpoints for managing master data programmatically.
- GL Accounts API: GET/POST endpoints for GL account management
- Cost Centers API: GET/POST endpoints for cost center management
- Business Partners API: GET/POST endpoints for business partner management
- API key authentication: JWT-based auth with tenant-scoped API keys
- CSV downloads: Download master data as CSV files
- Swagger documentation: OpenAPI spec at /api/v1/swagger.json
Commits: bbf895b0
Extraction Improvements
- VAT extraction rules: Improved guidance for VAT ID and tax rate extraction
- Service period handling: Better extraction of service/delivery periods
- Line item rules: Enhanced line item extraction accuracy
- Supplier selection rule: Added rule for supplier identification in extraction prompt
Commits: 8ca29f3a, 07222140
Account Matcher Improvements
- Structured decision procedure: Refactored prompt with clearer matching logic
- Removed internal terminology: Cleaner reasoning output for end users
Commits: cd3f7904, 45cbb254
BU Code Fixes
- Austria support: Added Austrian BU codes to mapping
- Correct DATEV codes: Fixed BU code mapping for various tax scenarios
- Token optimization: Removed DATEV doc references from docstrings
Commits: 57ba10fb, 670086c0
Financial Validation
- Percentage-based tolerances: Switched from fixed amounts to percentage-based tolerances for financial math validation
Commits: 88f248e3
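A sketch of what a percentage-based tolerance check can look like. The 0.5% default and the function name are assumptions for illustration; the changelog does not state the actual tolerance values.

```python
from decimal import Decimal


def within_tolerance(expected, actual, pct=Decimal("0.5")):
    """Return True if actual deviates from expected by at most pct percent.

    pct=0.5 is a hypothetical value, not the one used in production.
    """
    expected, actual = Decimal(expected), Decimal(actual)
    allowed = abs(expected) * pct / Decimal(100)
    return abs(expected - actual) <= allowed


large_ok = within_tolerance("10000.00", "10030.00")  # allowed deviation: 50.00
small_ok = within_tolerance("10.00", "10.30")        # allowed deviation: 0.05
```

Unlike a fixed amount, the allowed deviation scales with the invoice, so a 0.30 difference passes on a 10,000.00 total but not on a 10.00 total.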
UI Improvements
- Frozen headers: Page header and table header now freeze on scroll in documents list
- Merged settings tabs: Settings tabs consolidated into single scrollable page
- Warning readability: Improved validation and fraud warning/error message display
- Modal fixes: Fixed GL Accounts modal showing LazySeq, removed modal size constraints
- CSV upload fix: Fixed dataset CSV upload functionality
Commits: 60f606cc, 13e13760, f69fd694, 0bae7b54, 9f9af21f
Infrastructure
- X-Ray health exclusion: Excluded /health endpoint from X-Ray tracing to reduce noise
Commits: f4aacf44
v1.822
Changes since: v1.805 (commit 26761cfa)
Released: 2026-02-12
Total commits: 16
Files changed: 60 (+4,611 / -2,661)
Multi-Tenant Architecture
Tenant & Legal Entity Hierarchy
Introduced multi-tenant support with tenant/legal-entity hierarchy.
- Tenant management: Admin panel for creating and managing tenants
- Legal entity support: Multiple legal entities per tenant
- ERP multi-legal-entity UI: Legal entity selector in ERP interface
- Auth middleware refactor: Removed legal entities from auth middleware
Commits: 25ca6557, b064567c, e5ab21f0, f7b7af25
DATEV Integration Improvements
- isPaymentOrder field: Added to DATEV payload - true when payment due, false otherwise (92f8682b)
- Supplier address: Send supplier address to DATEV via addresses[].city field (047babff)
- Always send supplier info: Send both name and account number to DATEV (eaf10ca3)
- BU code validation: Validate BU codes before sending to DATEV API (54b299de)
UI Improvements
- Settings page redesign: Split integrations, redesigned master data UI (445312dd)
- Non-taxable display: Show 'Non-taxable' for line items with nil tax rate in VAT box (c283802e)
- Path refactoring: Replace hardcoded path strings with path-for calls (203c932e)
Infrastructure
- X-Ray production fix: Use IgnoreErrorContextMissingStrategy for X-Ray in production (65423b8a)
- X-Ray daemon sidecar: Added to production docker-compose (bcc2cd80)
Developer Experience
- CLAUDE.md updates: Added working relationship preferences (2fcd1210)
- Debug tool fix: Fixed debug:fetch-document for legal_entity_id schema (db8c7fad)
Bug Fixes
- Admin tenants lint fix: Fixed shadowed function names in admin tenants handler (52105fca)
v1.805
Changes since: v1.789 (commit af18810b)
Released: 2026-02-11
Total commits: 15
Files changed: 29 (+1,424 / -514)
Infrastructure
AWS X-Ray Distributed Tracing
Added X-Ray tracing support for request tracking across services.
- X-Ray integration: New com.getorcha.xray namespace for tracing
- Request correlation: Traces HTTP requests through the system
Commits: 403512cc
Testcontainers Upgrade
Upgraded Testcontainers from 1.20.4 to 2.0.3 for Docker 29 compatibility.
- Docker 29 support: Fixed "client version 1.32 is too old" error
- New artifact names: Migrated to testcontainers-* naming convention
- New package structure: Using org.testcontainers.{module} packages
Commits: a21a0886
UI Improvements
Invoice List Enhancements
Improved document list sorting, filtering, and navigation.
- Cost center filter: Filter invoices by cost center
- Account filter: Filter invoices by GL account
- Improved sorting: Better sorting behavior and navigation
- CSV downloads: Download accounts and cost centers as CSV files
Commits: 6b93bd24, 13032130, 8d4caa85
Settings Page Redesign
Reworked data panel and settings page visual design.
- Visual refresh: Cleaner layout and styling for settings page
- Data panel improvements: Reorganized data management interface
Commits: 6155566a
Bug Fixes
- Period allocation: Fixed accrual period allocation to use invoice date instead of current month (f6ff1df3)
- VAT badge: Fixed to show actual tax rate instead of expected rate (6331f6a0)
- Non-taxable items: Added support for out-of-scope items in tax-rate-breakdowns (2c2a86e9)
- Test fixes: Fixed LLM tests to mock hato/request instead of hato/post (f6a8a7c6)
v1.789
Changes since: v1.762 (commit 72beaa94)
Released: 2026-02-09
Total commits: 26
Files changed: 23 (+867 / -278)
Financial Math Validation Fix
Fixed total validation to handle both net and gross invoice formats.
- Dual formula check: check-total now tries both net-based and gross-based formulas before flagging errors
Commits: c6abdab5
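One way to picture the dual formula check, as a sketch only. The two formulas below (subtotal plus tax for net invoices, subtotal alone for gross invoices) and the 0.01 tolerance are assumptions; the real check-total may also account for discounts, shipping, or other terms.

```python
from decimal import Decimal


def check_total(subtotal, tax, total, tolerance=Decimal("0.01")):
    """Return which formula explains the total ("net" or "gross"), else None."""
    subtotal, tax, total = Decimal(subtotal), Decimal(tax), Decimal(total)
    candidates = {
        "net": subtotal + tax,  # subtotal excludes tax
        "gross": subtotal,      # subtotal already includes tax
    }
    for basis, expected in candidates.items():
        if abs(total - expected) <= tolerance:
            return basis
    return None  # neither formula fits: flag an error


net_case = check_total("100.00", "19.00", "119.00")
gross_case = check_total("119.00", "19.00", "119.00")
bad_case = check_total("100.00", "19.00", "125.00")
```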
Excel File Support
Added ability to view Excel files in the document viewer.
- PDF preview generation: Excel files are converted to PDF for viewing
- Download original: Banner with download button for the original Excel file
Commits: 6b3e874b, 0593becb
Discount Handling Improvements
Enhanced extraction and validation of invoice discounts (Skonto).
- FVR corrections: Discount and discount-type now correctable by Financial Validation Resolver
- Extraction prompt: Improved guidance for discount extraction
- DATEV Skonto fields: Fixed field placement per Maesn API requirements
Commits: 1f6b13f1, 89404bbb, 30f516a3
UI Improvements
- Original filename display: Shows uploaded filename above PDF viewer with copy-to-clipboard button
- Tax validation row merge: Tax ID Format and Tax Rate combined into single Tax row
- Overview list redesign: Softer badge styling and total amount column
- Navigation filter fix: Document navigation now respects status filter
Commits: a08b4fa9, d3639d0a, cce03045, d25b47dd
Admin Panel
- Date range filtering: Flexible date range selection on cost dashboard
Commits: a20ef211
Website
- Mobile fixes: Fixed horizontal scroll and redirect flash on mobile
Commits: 844998a4
v1.762
Changes since: v1.715 (commit 541193f9)
Released: 2026-02-08
Total commits: 46
Files changed: 33 (+2,454 / -1,154)
Document Classification
Added document classification step to distinguish invoices from credit notes, delivery notes, etc.
- Classification step: New pipeline stage classifies documents before extraction
- Document type guidance: Accounts and cost center matching prompts now receive document type context
- Schema extension: Added document-type field to support classification
Commits: 5531cb8a, 3af2ff37
Multi-Rate Tax Validation
Improved validation for invoices with multiple VAT rates.
- Breakdown-based verification: Tax validation now uses rate-by-rate breakdown comparison
- Aggregated display: Payment Summary UI aggregates tax breakdowns by rate
- Net subtotals: Fixed validation to treat breakdown subtotals as always net
Commits: 2eb45329, 5e9dbc7c, 0e749afb, 8f7acbe9
Financial Validation Improvements
Enhanced FVR (Financial Validation Resolver) capabilities.
- Quantity multiplier: FVR can now correct quantity for subscription/annual rate proration
- Unit price corrections: FVR can correct unit-price in subtotal line-item-corrections
- Line item amount corrections: FVR subtotal check can now correct individual line item amounts
- Stale reasoning filter: FVR explanations filtered out when underlying checks pass
- Removed NET bias: Improved FVR prompt to remove bias toward NET amounts
- Surcharges support: Added handling for per-line or invoice-level surcharges/fees
Commits: 0882925a, 7044c8d9, 6c3d7824, 2fc45a15, 6ee270f7, 6295d6b0, 6de7f0f0, da2be3ae
Flexible Cost Center Matching
Configurable cost center matching strategies per tenant.
- Strategy configuration: Tenants can choose between matching approaches
- Improved accuracy: Better handling of employee-based and description-based matching
Commits: facbab9a
Extraction Improvements
- Gross vs net guidance: Added extraction guidance for gross vs net line item pricing
- Reverse charge tax-rate: Improved extraction instruction for reverse charge scenarios
- Packaging field: Added packaging field to invoice schema
- UVR field resolution: UVR can now resolve missing required fields from transcription text
Commits: 2bbc0405, 316af74d, ab5b1ed7, df4ea643
Validation Fixes
- Computed subtotal: Only uses line items with amounts (fixes edge cases)
- IBAN format exclusion: IBAN format issues excluded from TCA tax-issues output
- Reverse charge line items: Fixed reverse charge validation on per-line-item basis
Commits: 93f271bd, 5005782d, 11e29a9e
Infrastructure & Operations
- Vision batch sizes: Increased from 5 to 10 pages for better throughput
- DATEV ledger-name fix: Fixed population and retry on invalid_ledger_name error
- Ring upgrade: Updated to 1.15.3 for commons-fileupload2 compatibility
- NVD security fixes: Updated nvd-clojure to 5.3.0, fixed HIGH severity vulnerabilities
- Disabled nREPL in tests: Prevents port conflicts during test runs
Commits: dfd2a874, fc0f56e5, a30dbcdc, 8fc0510a, 8fcc825c, 36ee0f35
Admin Panel Improvements
- Cost page enhancements: Added runs and cost-per-run columns
- Gemini rate card: Added missing gemini-2-5-flash rate for cost calculation
Commits: ea54e544, e75c1e9d
UI Improvements
- Payment Summary: Simplified UI with math check badge
- Debug-doc skill: Updated to check local first and prefer PDF over transcription
Commits: 81cb737f, dc08fb76
v1.715
Changes since: v1.691 (commit dc625add)
Released: 2026-02-06
Total commits: 23
Files changed: 15 (+833 / -124)
Microsoft OAuth Session Fix
Fixed auth middleware to validate Microsoft tokens in session cookies.
- Dual token validation: Auth middleware now tries Cognito first, then Microsoft JWKS
- Session persistence: Microsoft logins now work correctly after redirect
Commits: dd1658e3
Email Forwarding Chain Detection
Enhanced email acquisition to extract the full forwarding chain for fraud detection and cost center matching.
- Forwarding chain extraction: Parse full chain of forwarded emails
- Original sender detection: Identify the original sender in forwarded emails
- Cost center fallback: Use forwarding chain for cost center matching when direct match fails
Commits: d78334b4, 38dc7fa0, 3facc4e9, 61a36fb8
Tiered Invoice Validation
Invoice formal requirements now validated based on total amount thresholds.
- Tiered validation: Different required fields based on invoice amount (≤250€, ≤1000€, >1000€)
- Simplified small invoices: Reduced requirements for low-value invoices per German tax law
Commits: 9a408acf
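The thresholds above can be read as a simple tier lookup. In this sketch the tier names are hypothetical, and the changelog does not say which fields each tier requires or whether the threshold applies to the gross or net total.

```python
from decimal import Decimal


def validation_tier(total):
    """Map an invoice total in EUR to a validation tier (tier names are illustrative)."""
    total = Decimal(total)
    if total <= Decimal("250"):
        return "small"   # <= 250 EUR: reduced requirements
    if total <= Decimal("1000"):
        return "medium"  # <= 1000 EUR
    return "full"        # > 1000 EUR: all formal requirements


tiers = [validation_tier(t) for t in ("250.00", "250.01", "1000.00", "1000.01")]
```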
Tax Validation Relaxation
Improved handling of tax-related validation issues.
- Warnings instead of errors: Tax issues now show as warnings, not blocking errors
- Reverse charge handling: Better support for reverse charge invoice validation
- Needs Review filter fix: Filter now correctly shows warnings only, not errors
Commits: 932ae09a, 2060e522
LLM Token Tracking
Complete token usage tracking for all LLM call sites.
- Full coverage: All LLM calls now track token usage
- Cost visibility: Better insight into API costs per operation
Commits: 7094bd0a
v1.691
Changes since: v1.674 (commit 1b1eb60f)
Released: 2026-02-06
Total commits: 16
Files changed: 19 (+1,748 / -1,197)
Direct Microsoft OAuth
Replaced Cognito-based Microsoft authentication with direct Microsoft OAuth flow. Cognito's strict issuer validation rejects Microsoft's multi-tenant tokens where the issuer varies by tenant ID.
- Direct OAuth flow: Microsoft login now bypasses Cognito entirely
- Azure App separation: Two Auth apps (dev/prod) separate from Email apps
- HMAC state signing: CSRF protection using signed state parameter
- Microsoft JWKS validation: Fetches and caches Microsoft public keys for JWT verification
- Flexible issuer validation: Accepts any valid Microsoft tenant issuer pattern
- New SSM parameters: microsoft-auth-client-id, microsoft-auth-client-secret, microsoft-auth-state-secret
Google login continues to use Cognito unchanged.
Commits: 57240376
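A minimal sketch of HMAC-signed OAuth state, the CSRF protection listed above. The nonce.signature layout, SHA-256, and the function names are assumptions; the secret would presumably come from the microsoft-auth-state-secret parameter.

```python
import hashlib
import hmac
import secrets


def sign_state(secret: bytes) -> str:
    """Create a random state value with an HMAC-SHA256 signature appended."""
    nonce = secrets.token_urlsafe(16)
    signature = hmac.new(secret, nonce.encode(), hashlib.sha256).hexdigest()
    return f"{nonce}.{signature}"


def verify_state(secret: bytes, state: str) -> bool:
    """Check that the state returned by the identity provider was issued by us."""
    nonce, sep, signature = state.rpartition(".")
    if not sep:
        return False
    expected = hmac.new(secret, nonce.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(signature.encode(), expected.encode())


state = sign_state(b"example-secret")
```

A production flow would typically also bind the state to the user's session or add an expiry; the sketch covers only the signing step.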
Admin Prompt Customizations Matrix
Added a matrix view showing prompt customization status across all tenants.
- Matrix view: Shows which prompts are customized for which tenants
- Tenants admin page: New customizations matrix section
- HoneySQL refactor: Migrated admin queries to HoneySQL
Commits: 9330e3dc, 82477f5a, b4a6213f, 59fc8645
QA Dataset PDF Downloads
Super admins can now download original PDFs from the QA dataset.
- PDF download button: Added to QA dataset admin page
- Correct file paths: Fixed JDBC column key handling for file paths
Commits: 1e5e2fff, e60df9ea, f190e392, a2ee220d, 1747ba00
Infrastructure & Operations
- Cognito auth logging: Added detailed logging for authentication failures
- Release skill paths: Fixed to use relative paths for portability
- AWS Profiles documentation: Added profile names to CLAUDE.md
Commits: a7524c4c, d5b6fd09, d88b90d5
Website Updates
- Product dropdown: Added Product dropdown menu to navigation
- Section reorganization: Reorganized website sections
Commits: 26b72e73
v1.674
Changes since: v1.620 (commit 434eaf9d)
Released: 2026-02-05
Total commits: 22
Files changed: 16 (+1,891 / -544)
Price-Per Bulk Pricing Support
Added support for invoices that show unit prices for quantities other than 1 (e.g., price per 100 units).
- New price-per field: Line items can now specify a pricing unit divisor (100, 1000, etc.)
- Extraction prompt: Instructs LLM to extract price-per from visual indicators only (column headers like "/100", "EP/100 Stk")
- Validation formula: Updated to amount = quantity × unit-price / price-per
- FVR correction: Can now correct price-per, unit-price, and amount extraction errors
- UI display: Shows "(per N)" indicator next to unit prices when price-per > 1
Commits: da757a74, 59f7ae43
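A worked example of the validation formula amount = quantity × unit-price / price-per. The function name and the rounding to two decimals are assumptions for illustration.

```python
from decimal import Decimal


def expected_amount(quantity, unit_price, price_per=1):
    """Compute the expected line amount: quantity * unit-price / price-per."""
    amount = Decimal(quantity) * Decimal(unit_price) / Decimal(price_per)
    # Rounding to cents is an assumption; Decimal's default here is half-even.
    return amount.quantize(Decimal("0.01"))


# 250 pieces at 12.40 per 100 pieces.
bulk = expected_amount("250", "12.40", "100")
# With price-per omitted the formula reduces to quantity * unit-price.
plain = expected_amount("3", "9.99")
```

Without the divisor, the first line would validate against 3,100.00 instead of 31.00, which is the kind of mismatch the price-per field is meant to prevent.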
FVR Post-Correction Improvements
Fixed issues where corrected errors were still displayed to users.
- Re-run validation: After FVR corrections, validation is re-run to reflect the corrected data
- Filter explanations: FVR explanations now only show for checks that are still failing
- Status update: Changed "uncertain" to "error" after FVR resolves what it can
- Subtotal fix: Correcting a line item now properly resolves cascading subtotal errors
Commits: da757a74
FVR Smart Page Selection
Intelligent page selection for Financial Validation Resolver to reduce vision API costs.
- Strategy-based selection: Localized (flagged pages only), full-scan, or financial-only
- Batched processing: Large documents processed in batches of 10 pages max
- Tolerance stacking: Detects cumulative rounding errors and returns warning instead of triggering FVR
- Improved error messages: Line item errors now show description and page location
Commits: 5d5a0f42
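The batching rule above (at most 10 pages per batch) amounts to simple chunking. A sketch with a hypothetical helper name:

```python
def page_batches(pages, max_size=10):
    """Split a list of page numbers into consecutive batches of at most max_size."""
    return [pages[i:i + max_size] for i in range(0, len(pages), max_size)]


# A 23-page document yields batches of 10, 10, and 3 pages.
sizes = [len(batch) for batch in page_batches(list(range(1, 24)))]
```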
Per-Line-Item BU Codes
BU (Buchungsschlüssel) codes are now assigned per line item instead of at invoice level.
- Line-item assignment: Each line item gets its own BU code based on VAT treatment
- Sonnet 4.5 for tax: Upgraded tax compliance analyzer to Claude Sonnet 4.5
- Context prompts: Added accounts payable context to cost center and accounts matching
Commits: 3e78a73a, 61838e95
VAT Validation Improvements
- Pattern improvements: Better VAT ID validation to prevent false positives
- Batch processing: Cost center and accounts matching now processed in batches
- Extraction rules: Renumbered and simplified tax ID extraction instructions
Commits: ea9af3f6, 0f82f7b4, 85864178
Payment Summary Improvements
- Comparison grid: Prepayments moved into Original vs Verified comparison view
- Visual improvements: Enhanced payment summary layout
Commits: 433c664c, 0b922484
Fraud Detection Improvements
- Context-based analysis: Improved fraud detection prompt with better context handling
Commits: 942207bd
v1.620
Changes since: v1.537 (commit f474136c)
Released: 2026-02-03
Total commits: 72
Files changed: 54 (+5,459 / -1,166)
IBAN Extraction
Fixed LLM hallucination errors in IBAN extraction.
- Verbatim extraction: Changed prompt to extract IBANs with spaces intact (avoids transformation errors)
- Code normalization: Strip spaces in post-processing instead of asking LLM to transform
- IBAN validation: Added length-by-country and MOD-97 checksum validation
- Uncertain resolution: Invalid IBANs marked for UVR resolver to correct via PDF review
Commits: 25f5216f
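The normalization and validation steps above can be sketched as follows. The length table is a small excerpt for illustration and the function names are hypothetical; the MOD-97 check is the standard IBAN checksum.

```python
# Excerpt only; a real table would cover every IBAN country.
IBAN_LENGTHS = {"DE": 22, "AT": 20, "CH": 21, "NL": 18, "FR": 27, "GB": 22}


def normalize_iban(raw):
    """Strip whitespace in code rather than asking the LLM to transform the value."""
    return "".join(raw.split()).upper()


def is_valid_iban(raw):
    iban = normalize_iban(raw)
    if not (iban.isascii() and iban.isalnum()):
        return False
    # Length-by-country check; unknown countries are rejected in this sketch.
    if len(iban) != IBAN_LENGTHS.get(iban[:2]):
        return False
    # MOD-97: move the first four characters to the end, map A..Z to 10..35,
    # and require the resulting number to leave remainder 1 when divided by 97.
    rearranged = iban[4:] + iban[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


ok = is_valid_iban("DE89 3704 0044 0532 0130 00")
bad = is_valid_iban("DE89 3704 0044 0532 0130 01")
```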
Fraud Detection System
Comprehensive fraud detection with rule-based and identifier-based checks.
- Invoice splitting detection: Flags potential split invoices from same supplier
- Supplier bank change: Detects IBAN changes for known suppliers
- Supplier tax ID change: Flags tax ID changes for existing suppliers
- Sender domain change: Monitors email sender domain consistency
- UI integration: Fraud flags shown in document overview list with warning/error badges
Commits: 7076c44b, 9c3184b2, eeaf9fd3, 47b5362d
Financial Math Validation Overhaul
Bottom-up validation approach with LLM uncertainty resolution.
- Bottom-up validation: Validates line items → subtotal → tax → total in sequence
- PDF verification: LLM resolver now receives PDF for financial-math uncertainties
- Gross/net detection: Improved handling of gross vs net invoice calculations
- VAT validation fix: Fixed false warnings on gross invoices with correct math
Commits: c9ace564, 7f05c317, 00fd0d64, cd65d816
Bulk DATEV Export
Export multiple documents to DATEV in a single operation.
- Checkbox selection: Select documents for bulk export from overview list
- PostgreSQL session store: Handles bulk export state across requests
- Export polling: Automatic status polling after manual exports
- Disabled states: Export checkbox disabled for documents with validation errors or export in progress
Commits: 84a886d5, 234ef2e7, ea0616af, 78087cb0, b1ca8a3b
LLM Prompt Customization
Tenant-specific prompt customization for extraction and post-processing.
- StringSubstitutor templates: Replaced manual string interpolation
- Tenant prompts table: Store custom prompt overrides per tenant
- Dynamic resolver prompts: UncertainValidationsResolver builds prompts based on which checks failed
Commits: 31301ceb, 05768461, a8f7bcb1
LLM Data Corrections
Allow LLM uncertainty resolver to correct extracted data.
- Field corrections: Resolver can now fix subtotal, tax, total, discount, shipping, amount-due
- Line item updates: Can update individual line item fields by index
- Line item removal: Can remove incorrectly extracted items (e.g., section subtotals)
- Country inference: Resolver fills in missing issuer/recipient country codes
Commits: 5bd0a43d
DATEV Integration Improvements
- Auto-fill issuer fields: Missing issuer phone/fax/email populated from master data
- Amount-due fallback: Uses total when amount-due is nil
- Effective amount fallback: Handles fully-paid invoices correctly
- Payload error handling: Graceful handling of payload build errors
- DATEV polling refactor: Moved polling into create-booking-proposal!
Commits: f6806bdb, 29ddc499, 176db948, dee6e72f, 16a25a69
Validation Improvements
- Tax ID consolidation: All tax ID validation moved to TaxComplianceAnalyzer
- Invoice-date fallback: Prevents failed processing when date missing
- Invoice-splitting fix: No longer counts current document as 'other'
- Page selection fix: Financial-math verification now selects correct pages from multi-page PDFs
- Account-number type: Changed from int to string to preserve leading zeros
Commits: 5450c418, 97177e5c, 702a9ced, aaedea2d, 71c9f5d8
UI Improvements
- Fraud flags in overview: Show fraud-flags and tax-issues warnings/errors
- Date range filter styling: Dark theme styling for filter component
- Export status badge: Reflects actual export state
- Validation UI redesign: Check-row layout with status badges
Commits: 8fec4789, 50b6d182, e88751fc, 9c3184b2
Infrastructure & Operations
- Commit SHA tracking: Ingestion records now include git commit SHA at processing time
- SSM cleanup: Removed deprecated /v1-orcha/refresh-token-key-arn parameter
- MDC logging: Added request/job tracing context
- CloudWatch logs: Added to debug-doc skill
Commits: ee5f6107, 43ffd7ac, 31fa3314, 42466853, e6cdb36e
Bug Fixes
- VAT breakdown: Fixed payment summary for NET line items
- SES Sources table: Fixed admin panel rendering
- JSON parse regex: Preserve escaped quotes
- Validation status: Fixed missing tax ID and issuer country inference
Commits: dcffee5b, 3ab115bb, 881c068d, ecb4b08d
Code Quality
- Lint fixes: Removed all unused binding warnings
- Test failure patterns: Documented grep patterns in CLAUDE.md
- Debug logging cleanup: Removed temporary validation logging
Commits: 6a75743c, 5b0816bc, bd4468ef, 43eee533, 35734972
v1.537
Changes since: v1.532 (commit 15681143)
Released: 2026-01-31
Total commits: 4
Files changed: 2 (+11 / -9)
Bug Fixes
- HoneySQL STRING_AGG: Fixed ORDER BY syntax for aggregate functions
- SES admin copy button: Added clipboard copy for email addresses
- db.sql helpers: Use with-transaction and ->cast for consistent kebab-case result keys
Commits: b9adc79d, e766c3a1, 82aee1e0, 8af77fdc
v1.532
Changes since: v1.518 (commit 451c8695)
Released: 2026-01-31
Total commits: 13
Files changed: 14 (+1,758 / -192)
SES Email Multi-TO Handling & Domain Validation
Security and correctness improvements for SES email ingestion.
- Domain validation: Tokens are extracted only from addresses on the configured mail-domain (prevents token extraction from spoofed domains)
- Multi-TO handling: Emails addressed to multiple tenants are now processed for all valid recipients
- Tenant deduplication: Multiple addresses routing to the same tenant are processed once
- Lowercase-only tokens: Fixed a case-sensitivity issue in email address parsing
Commits: fef1947b
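A rough sketch of the recipient handling described above, written in Python for illustration (the codebase is Clojure). The function names, the token-to-tenant lookup, and the choice to lowercase addresses before matching are assumptions, not the actual implementation; the address shape follows the invoices+TOKEN plus-addressing scheme from v1.518.

```python
import re

MAIL_DOMAIN = "mail.prod.getorcha.com"  # assumed value of the configured mail-domain
# Assumed address shape: invoices+TOKEN@<mail-domain>, token made of letters only.
LOCAL_PART_RE = re.compile(r"^invoices\+([a-z]+)$")


def extract_tokens(recipients, mail_domain=MAIL_DOMAIN):
    """Return tokens from recipients on the configured domain, in order, without duplicates."""
    tokens = []
    for address in recipients:
        local, sep, domain = address.strip().rpartition("@")
        # Ignore malformed addresses and anything outside the configured domain.
        if not sep or domain.lower() != mail_domain:
            continue
        match = LOCAL_PART_RE.match(local.lower())
        if match and match.group(1) not in tokens:
            tokens.append(match.group(1))
    return tokens


def tenants_for(recipients, token_to_tenant, mail_domain=MAIL_DOMAIN):
    """Resolve tokens to tenants so that each tenant is processed at most once."""
    tenants = []
    for token in extract_tokens(recipients, mail_domain):
        tenant = token_to_tenant.get(token)  # hypothetical lookup table
        if tenant is not None and tenant not in tenants:
            tenants.append(tenant)
    return tenants
```

With two valid addresses that map to the same tenant plus one address on a foreign domain, `tenants_for` yields that tenant once and ignores the foreign address.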
OCR Layout Reconstruction Fix
Fixed document text extraction for complex layouts.
- Flatten raw-response vectors: OCR layout reconstruction now properly handles nested response structures
Commits: 2a339516, 67dbba11
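For illustration only, flattening a nested response structure can be sketched as a small recursive generator. This is Python rather than the project's Clojure, and the shape of the raw OCR response (nested lists of per-page maps) is an assumption:

```python
def flatten_responses(raw):
    """Yield leaf items from arbitrarily nested lists or tuples, preserving order."""
    if isinstance(raw, (list, tuple)):
        for item in raw:
            yield from flatten_responses(item)
    else:
        yield raw


# Hypothetical nested response: pages wrapped in an uneven number of vectors.
nested = [[{"page": 1}], [[{"page": 2}], {"page": 3}]]
pages = list(flatten_responses(nested))
```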
Admin Panel Improvements
Enhanced admin UI for user and tenant management.
- Users admin page: Identity whitelist management for controlling access
- Tenant management: Create and edit tenant functionality
- SES sources: Toggle active/inactive, delete actions, fixed data refresh issues
- SES domain verification: Added mail.prod.getorcha.com domain
Commits: f20da51e, b5d29938, fda42181, ed1df246, c12049bb
Code Quality
- HoneySQL conversion: Migrated raw SQL queries to HoneySQL
- Lint fixes: clj-kondo cleanup
Commits: 9a84b3c0, 598ee8f6
Infrastructure Fixes
- Artifact paths: Fixed paths for orcha/ working directory structure
- SSO configuration: Fixed Identity Store ID format
Commits: 9cda551c, 88accb50, 71663a1c
v1.518
Changes since: v1.485 (commit c927a299)
Released: 2026-01-30
Total commits: 18
Files changed: 241 (+53,623)
Monorepo Restructure
Project moved to orcha/ subdirectory for monorepo organization.
- Directory move: Main project now lives in orcha/ subdirectory
- Infrastructure move: infra/ moved to orcha/infra/
- CI/AWS paths: Updated all paths for new structure
Commits: 8355b100, fdd4139d, 5bfe88e1
Admin SES Doc Source Management
Admin UI for managing SES email sources.
- SES source list: View all SES doc sources across tenants
- Create SES source: Admin button to create new sources
- Qualified tenant keys: Consistent key usage in admin queries
- Code style fixes: Admin namespace cleanup
Commits: ba971cb1, d7591fad, 85b1a0e1, a8a17b85
SES Plus-Addressing Improvements
Enhanced SES email routing with token-only addressing.
- Plus-addressing: Route emails via invoices+TOKEN@mail.env.getorcha.com
- 10-char letter-only tokens: Simplified tokens for better compatibility
- Token-only routing: Removed reliance on sender email matching
Commits: 95380523, f9f51689
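As a sketch of what a 10-character, letter-only token and the resulting address might look like (Python for illustration; the helper names are hypothetical, and restricting tokens to lowercase is an assumption taken from the later v1.532 entry):

```python
import secrets
import string

TOKEN_LENGTH = 10  # from the changelog entry


def new_token():
    """Generate a random token made of lowercase ASCII letters only."""
    return "".join(secrets.choice(string.ascii_lowercase) for _ in range(TOKEN_LENGTH))


def address_for(token, env):
    """Build the plus-addressed mailbox for a token in a given environment."""
    return f"invoices+{token}@mail.{env}.getorcha.com"
```

Because the token alone identifies the source, routing no longer depends on who sent the email.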
UI Improvements
- Validation chips: Stack vertically with green border for passing checks
- AP list pagination: Fixed pagination behavior
- CET timezone: Dates display in Central European Time
- Date range filter: Filter documents by date range
Commits: 3cf6fabf, 74738fd6
DATEV Sandbox Fixes
- Payload adaptation: Fixed account number truncation in sandbox adapter
- Test alignment: Updated tests to match implementation behavior
Commits: eb4128c6, 8b35a76c
Release Process Improvements
- Lint warnings block: Release skill now fails on warnings, not just errors
- Version counting: Fixed to count all commits on master
- Commit guidelines: Improved CLAUDE.md instructions for committing
Commits: 473c33bf, 90f9281d, 15bcaad1, 3fd4a71b
v1.485
Changes since: v1.312 (commit 94a8e325)
Released: 2026-01-30
Total commits: 172
Files changed: 304 (+40,113 / -1,981)
Admin Statistics Dashboard
Comprehensive admin panel for monitoring system health and performance.
- Dashboard pages: Overview, Costs, Quality, Performance, Acquisition, Tenants, Activity
- Real-time metrics: Token usage, ingestion stats, processing times, error rates
- Tenant filtering: Filter all metrics by specific tenant
- Export functionality: CSV exports for costs and quality data
- Charts: Daily trends, breakdowns by model/processor, histograms
Commits: dcb99529, d490a518
QA Dataset Collection
Super admin feature to collect invoices for ground truth testing.
- Add to QA dataset button: Document detail page action for super admins
- QA dataset admin page: View and manage collected invoices
- Structured data snapshots: Captures extraction results at collection time
- Tenant-scoped: Tracks which tenant each QA item belongs to
Commits: e206df59, bee076f3, 46350cc6
International Tax ID Support
Typed tax identifier system supporting multiple international formats.
- Tax ID types: VAT (EU), USt-IdNr (DE), UID (AT), CHE-TVA (CH), NIP (PL), etc.
- Schema changes: Added :tax-id-type and :tax-id fields
- Validation: Format-specific validation per tax ID type
- DATEV compatibility: Proper handling of international identifiers
Commits: ae607559, f51b13ac, 26fb73f2
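Format-specific validation per tax ID type could look roughly like the following. This is a Python sketch, not the project's validator: the type keys are hypothetical, and the patterns cover only the basic shape of each identifier (no checksum verification).

```python
import re

# Illustrative shape checks only; real validators may be stricter.
TAX_ID_PATTERNS = {
    "ust-idnr": re.compile(r"DE\d{9}"),    # German VAT ID: DE + 9 digits
    "uid": re.compile(r"ATU\d{8}"),        # Austrian UID: ATU + 8 digits
    "nip": re.compile(r"\d{10}"),          # Polish NIP: 10 digits
    "che-tva": re.compile(r"CHE-\d{3}\.\d{3}\.\d{3}( (MWST|TVA|IVA))?"),  # Swiss VAT number
}


def valid_tax_id(tax_id_type, tax_id):
    """Return True when tax_id matches the pattern registered for tax_id_type."""
    pattern = TAX_ID_PATTERNS.get(tax_id_type)
    if pattern is None:
        return False  # unknown types are rejected in this sketch
    return pattern.fullmatch(tax_id.strip().upper()) is not None
```

Dispatching on the type keeps each format rule independent, so adding a country means adding one entry.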
DATEV Integration Improvements
Enhanced booking proposal generation and configuration.
- Contact account number: Added contactAccountNumber to Maesn payloads
- Line item amounts: Fixed amount calculations in booking proposals
- Ledger name: Auto-fetch on connect with retry logic
- §13b sub-cases: Enhanced BU code prompt for reverse charge
- Tenant integration table: Refactored storage to tenant_integration
- Customer configuration docs: Added setup documentation
Commits: d79d7901, cb7767ee, 9619a8ba, 213b0348, ea09fb67, cca59331
Supplier Matching
Automatic supplier identification during document ingestion.
- SupplierMatcher post-processor: Matches invoices to known suppliers
- CSV import: Upload supplier master data
- Confidence scoring: High/medium/low match confidence
Commits: b77b9dab, 14c59116
Business Partner Import
Import business partners from DATEV EXTF files.
- EXTF parser: Reads DATEV standard export format
- Bulk import: Import customers/suppliers from accounting system
- Field mapping: Maps DATEV fields to Orcha schema
Commits: 1de15581
Document Status UI Improvements
Unified document status display with smart sorting.
- Status column: Combined status showing validation + DATEV state
- Smart sorting: Context-aware sorting based on DATEV connection
- SSE updates: Real-time status updates via Server-Sent Events
- Status filter: Replaced document type filter with status filter
- Sortable columns: Click column headers to sort
- List state preservation: Maintains filters/sort when navigating
Commits: 8139b25d, 43f67cb2, 71e1697a, 6c89b86a, 479157cd, 2b833ac8, 9ef59135, c66a99c7
DATEV PDF Cover Page
Validation summary cover page prepended to exported PDFs.
- Cover page generation: Summarizes validation results
- Visual status indicators: Pass/fail badges for each check
- Invoice metadata: Key fields displayed on cover
- Automatic inclusion: Added to all DATEV exports
Commits: dc71b57a
DATEV Export UI Unification
Single auto-refreshing export section in document detail.
- Unified UI: Combined export button and status in one section
- Auto-refresh: Polls for status updates during export
- Progress indication: Shows export state transitions
Commits: 1ab3f22a
CRM Integration
Merged Orcha-crm module for lead management and billing.
- Lead tracking: Import and manage sales leads
- Billing module: Invoice generation and tracking
- Data sync: Automated data update commits
Commits: f5b220a5, 2f905932, ff430df5, f0eafbc3
Code Quality
- db-conn → db-pool rename: Consistent naming across codebase
- Lint fixes: clj-kondo warnings resolved
- Handler docstrings: Removed hardcoded paths
- Defensive code removal: Cleaned up type-checking code
Commits: 157e8c2b, f084b95d, 340fbe95, 9e28f52b, 7891681a, b0811f60
v1.312
Changes since: v1.300 (commit 26152aac)
Released: 2026-01-27
Total commits: 11
Files changed: 16 (+293 / -128)
MAESN API Key Configuration Externalization
Refactored DATEV/Maesn integration to move API key from runtime SSM fetch to compile-time config injection.
- Environment-aware config: API key and auth endpoint now injected via Integrant config
- Sandbox vs production endpoints: Config profile determines which endpoint to use
- Removed get-api-key SSM fetch: API key passed directly in config map
- Updated function signatures: build-auth-url, get-user-info, get-async-task, poll-until-complete! now accept datev-config map
- New SSM parameter: Added integrations/maesn/api-key-sandbox for sandbox environment
- Malli schema: Added [:integrations ...] schema to OrchestratorContext
Commits: 8792a1b8
Financial Math Validation Improvements
Enhanced line item validation for gross/net amount consistency.
- Gross/net validation fix: Corrected validation logic for line item amounts
- Financial math uncertainties: Resolver now handles ambiguous validation cases
- Shipping field extraction: Clarified to avoid double-counting in totals
Commits: f035e8af, 2d1299f2, 7ec62745
DATEV Export Improvements
- Nested error format: Fixed log message extraction for nested DATEV error responses
Commits: 5c5081bf
Code Quality
- Removed debug logging: Cleaned up financial validation logging
- Release skill update: Warnings now block releases (previously only errors did)
Commits: 79756728, 0f63cc25, 0b6dd3e3, edc697f3
v1.300
Changes since: v1.281 (commit ca09bdb4)
Released: 2026-01-26
Total commits: 18
Files changed: 15 (+2,220 / -247)
DATEV Integration Improvements
Enhanced DATEV export reliability and user feedback.
- Schema validation: New com.getorcha.schema.integration.datev namespace with Malli schemas for Maesn API payloads
- Input sanitization: Automatic string normalization (control chars, directional marks, invisible Unicode)
- Content hash per tenant: content_hash now unique per tenant instead of globally, preventing cross-tenant conflicts
- PARTIAL_SUCCESS status: Handle partial export results with JSONB error storage and modal UI
- Inline UI feedback: Improved export status display directly in document view
- Sandbox compatibility: Temporary sanitization for Maesn sandbox environment constraints
- Message normalization fix: Proper handling of DATEV error message formats
Commits: a0706f58, 80d66d22, 663996aa, 62932c00, 6022c71c
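The input sanitization step can be pictured with a short Python sketch. It is an approximation under stated assumptions: it treats Unicode general categories Cc (control) and Cf (format, which includes directional marks and zero-width characters) as the characters to drop, and the function name is hypothetical.

```python
import unicodedata


def sanitize(text):
    """Drop control and invisible format characters, then collapse whitespace."""
    cleaned = []
    for ch in text:
        if ch in "\t\n\r":
            # Keep word boundaries: line breaks and tabs become spaces.
            cleaned.append(" ")
        elif unicodedata.category(ch) in ("Cc", "Cf"):
            # Control chars, directional marks, zero-width characters, BOM.
            continue
        else:
            cleaned.append(ch)
    # split() without arguments also trims and collapses runs of whitespace.
    return " ".join("".join(cleaned).split())
```

Legitimate non-ASCII letters such as umlauts fall into letter categories and pass through unchanged.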
Settings Page Redesign
Reorganized settings page with improved layout.
- Tabbed layout: Master Data and Integrations tabs
- Email disconnect: Improved email disconnection flow
- Removed redundant header: Cleaner page layout
- Larger tab font: 18px font size for better readability
Commits: e1669f81, b69df9e7, db261b95, e00a3946
UI/UX Improvements
- Login page centering: Fixed container alignment when sidebar is hidden
- Super admin banner: Prominent page border when viewing other tenants
- Cost center employee: Only show when invoice has contact person
- Multi-file upload fix: Server-side OOB prevents duplicate rows
Commits: d6965b72, af9127b7, c3b17b33, e20158bb
v1.281
Changes since: v1.277 (commit 6833460b)
Released: 2026-01-24
Total commits: 3
Files changed: 19 (+1,597 / -21)
Admin Service Authentication
ALB-based Cognito authentication for the admin service.
- ALB authenticate-cognito action: 1-hour session timeout, Google-only OAuth
- Auth middleware (admin/http/middleware/auth.clj): Validates ALB-signed JWT, checks @getorcha.com domain
- Callback URL configuration: ALB OAuth flow via /oauth2/idpresponse
- Security group update: HTTPS egress for ALB → Cognito communication
Deployment
- Port 7777 exposed: Dockerfile and docker-compose updated for admin service
- Health check validation: validate.sh now checks both ERP (8888) and Admin (7777)
Code Organization
- Moved ERP auth middleware: http/middleware/auth.clj → erp/http/middleware/auth.clj
Commits: 668703d9, 093d2ded, e5af1034
v1.277
Changes since: v1.274 (commit f5b464c8)
Released: 2026-01-24
Total commits: 2
Files changed: 2 (+2 / -1)
DATEV Integration
- Sandbox mode enabled: Use Maesn sandbox endpoint for DATEV auth (190ccfbb)
Documentation
- Secrets runbook: Added Maesn API key to update-secrets.md (11b8aa49)
v1.274
Changes since: v1.197 (commit 5afe3df0 - Fix multi-tenant SSE security vulnerability)
Released: 2026-01-24
Total commits: 76
Files changed: 103 (+18,251 / -2,342)
DATEV Integration
Full integration with DATEV accounting software via Maesn API.
- Maesn API client (maesn.clj): OAuth flow, file uploads, booking proposal creation
- Export to DATEV button on document detail page for one-click export
- DATEV settings UI for connecting/disconnecting tenant accounts
- Prepayment recording support for advance payments
- Export audit trail: New datev_export_audit table tracks all exports with status history
- Structured logging with timing metrics for all Maesn API calls
- CloudWatch alarm for DATEV export failures (metric filter on error logs)
- SSM parameter for Maesn API key (/v1-orcha/{env}/maesn-api-key)
- Booking proposal iteration: Improved structured-data to booking-proposal mapping
- Auto-export on ingestion: Automatically exports to DATEV when ingestion completes
- Payload hashing: CBOR + SHA-256 change detection to avoid duplicate exports
- Export status derived from audit: Single source of truth, removed redundant document columns
- Improved payload handling:
- Only include EU VAT IDs (normalized and filtered)
- Filter out zero-amount line items
- Fix rounding errors by adjusting last line item (max 0.10 diff)
- Validate PO reference format
- UI improvements: Immediate feedback on export, spinner during auto-export
- Removed unsupported APIs: Bank accounts and payments endpoints (not supported by Maesn)
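The zero-amount filter and the last-line rounding adjustment from the payload handling items above can be sketched as follows. This is an illustrative Python version (the project is Clojure); the function name is hypothetical and only the 0.10 tolerance comes from the entry itself.

```python
from decimal import Decimal

MAX_ROUNDING_DIFF = Decimal("0.10")  # tolerance named in the changelog entry


def reconcile_line_items(amounts, total_gross):
    """Drop zero amounts, then absorb a small rounding gap into the last item."""
    items = [a for a in amounts if a != 0]
    if not items:
        return items
    diff = total_gross - sum(items)
    # Only small gaps are treated as rounding noise; larger ones are left alone.
    if diff != 0 and abs(diff) <= MAX_ROUNDING_DIFF:
        items[-1] += diff
    return items
```

Using Decimal rather than floats keeps the cent arithmetic exact, so the adjusted items sum to the total precisely.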
Database migrations:
20260121101354-add-datev-integration: Tenant DATEV connection storage
20260122121022-add-datev-export-audit: Export history and audit logging
Commits: cb62d680, e9ffa025, 8f2b8544, d61422ea, 391afc3b, 830b3079, 81d20f48, 65350314, 23d1c04e
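Hash-based change detection, as used to avoid duplicate exports, works along these lines. Note the substitution: the release uses CBOR encoding, while this Python sketch uses canonical JSON (sorted keys, fixed separators) as a stand-in because it is available in the standard library. Function and field names are hypothetical.

```python
import hashlib
import json


def payload_hash(payload):
    """Hash a deterministic serialization so equal payloads give equal digests."""
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def should_export(payload, last_exported_hash):
    """Export only when the payload differs from what was last sent."""
    return payload_hash(payload) != last_exported_hash
```

The important property is determinism: key order must not influence the digest, otherwise identical payloads would be re-exported.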
SES Email Acquisition
New email ingestion channel via AWS SES for mail.{env}.getorcha.com.
- SES receiving infrastructure: Route53 MX records, SES receipt rules, S3 storage
- MIME parser using Jakarta Mail for .eml file processing
- Tenant routing by sender email lookup
- Deduplication with retry support after errors
- S3 event processing: SQS messages trigger acquisition pipeline
- LocalStack simulation for local development testing
- Queue latency alarms: Alert when oldest message exceeds threshold
- Ingest queue: > 5 minutes
- Email-acquire queue: > 60 seconds
Database migrations:
20260120075416-add-ses-email-source: doc_source_ses and doc_source_ses_processed tables
Infrastructure changes:
foundation_stack.py: SES receipt rule set, S3 bucket for emails, email-acquire SQS queue
ops_stack.py: SQS ApproximateAgeOfOldestMessage alarms
Documentation:
- Architecture doc for SES email acquisition flow
- Customer setup guide (M365, Gmail, Exchange forwarding)
- Operational runbook for troubleshooting
Commits: a914e4de, 0a413a56, 9c720d63, 7e16e0db, a0b31d48, 633631ac
UI/UX Improvements
Comprehensive user interface updates for better usability.
Invoice Display
- Date/time format: Now shows timezone with 2-digit year (DD.MM.YY HH:mm z)
- Payment terms: Standardized date format (DD.MM.YYYY)
- Invoice number: Removed monospace styling for consistency
- PDF viewer sidebar: Closed by default
- Logo clickable: Navigates to /documents page
- Orcha tenant banner: Hidden for cleaner UI
Multi-File Upload
- Upload up to 5 PDF invoices simultaneously
- Single-file uploads return single map (backward compatible)
- Multi-file uploads return vector
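The response-shape rule for uploads can be summarized in a few lines. This is a Python sketch with a hypothetical function name; only the limit of 5 files and the single-map versus vector behaviour come from the notes above.

```python
MAX_FILES = 5  # upload limit stated in the changelog


def upload_response(results):
    """Return one map for a single file, a list of maps for several."""
    if not 1 <= len(results) <= MAX_FILES:
        raise ValueError(f"expected 1 to {MAX_FILES} files, got {len(results)}")
    # Single-file callers keep receiving a bare map (backward compatible).
    return results[0] if len(results) == 1 else list(results)
```

Clients written against the single-file API keep working, while multi-file callers must expect a list.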
Validation Section Redesign
- Combined layout: Validation + Tax Compliance in single section
- Two-column grid: Formal Requirements (left), Tax Compliance (right)
- Status badges: Replace dots with Correct/Warning/Review badges
- Tooltips: Hover explanations for all validation items
- Warning banners: Colored borders indicate severity
Parties Section
- Moved Issuer/Recipient into Invoice Details section
- Removed standalone Parties tab
- Invoice metadata wrapped in styled box with header
Line Items
- Cost center badge per line item (moved from invoice level)
- Debit/credit account labels with proper display
- Pricing factor support for scaled unit pricing
- PO/GR references displayed in header
Commits: dd2b3d7b, 63dd7bcc, 25eb42e1, 7d0d73d5, 2fc10836, 9714342c, f652dc80, 4084b0f3
Cost Center Enhancements
Per-line-item cost center assignment with intelligent matching.
- Line-item level cost centers: Moved from invoice level to individual line items
- Description field: Optional 5th column in CSV upload for cost center descriptions
- Dynamic confidence guidance: LLM receives context-aware scoring guidelines
- Detects employee-based cost center patterns
- Checks if descriptions are available
- Adjusts confidence when contact person missing on invoice
- Settings UI: Updated table with description column
Commits: 14a611c9, e21cdf6e, 32b188ba
Accounting Features
Enhanced double-entry bookkeeping support.
Debit/Credit Accounts
- AccountsMatcher now matches both Soll (debit) and Haben (credit) accounts
- Uses same GL accounts CSV for both account types
- Schema change: :account renamed to :debit-account, added :credit-account
- UI displays accounts with debit/credit labels
Purchase Order & Goods Receipt
- PO reference extraction from invoices
- GR reference (Goods Receipt/Delivery Note) extraction
- New extraction rule (#18) for PO/GR references
- Displayed in invoice header
Late Invoice Period Handling
- AccrualsMatcher considers current date
- If invoice-date month < current-date month, uses current month as minimum booking period
- Prevents booking to closed periods for single-period items
Commits: c669eef4, 14a611c9
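The late-invoice rule amounts to clamping the booking period to the current month. A minimal Python sketch, assuming periods are compared as (year, month) pairs so that year boundaries are handled; the function name is hypothetical and the real AccrualsMatcher logic may differ:

```python
from datetime import date


def booking_period(invoice_date, today):
    """Return the (year, month) to book into, never earlier than the current month."""
    invoice_period = (invoice_date.year, invoice_date.month)
    current_period = (today.year, today.month)
    # Tuples compare year first, then month, so max() picks the later period.
    return max(invoice_period, current_period)


# An invoice dated in December but processed in January books into January.
late = booking_period(date(2025, 12, 28), date(2026, 1, 5))
```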
Validation & Compliance
Tax compliance validation improvements.
- Compliance statements field for structured capture of:
- Reverse-charge statements
- VAT exemption declarations
- Intra-community supply notices
- Small business exemptions
- ComplianceStatement schema: type, text, legal-basis fields
- TaxIssue enum: Added :missing-compliance-statement type
- TaxComplianceAnalyzer: Validates and flags missing statements
- Generic terminology: Avoids LLM bias toward specific jurisdictions
Commits: 9ed926dc, 63f7f442, 7e17156a
Infrastructure & Operations
SSM Parameter Rename
- Renamed /v1-orcha/refresh-token-key-arn to /v1-orcha/db-secrets-key-arn
- KMS key now used for multiple purposes (refresh tokens, DATEV credentials)
- Old parameter kept temporarily for backwards compatibility (#82)
- Updated KMS alias and description
Admin Service
- New minimal admin service on port 7777
- Simple centered dashboard page
- No authentication required
- Light theme CSS
- Shared HTTP utilities extracted to com.getorcha.http:
formats.clj: Muuntaja with HTML/Hiccup encoding
middleware.clj: inject-config, exception-middleware
middleware/auth.clj: Cognito JWT authentication
Email Protocol Reorganization
- Separated OAuth concerns (ERP/HTTP) from sync concerns (Workers)
- Added erp/email/{oauth,gmail,outlook}.clj for OAuth flow
- Added email/oauth/tokens.clj for shared token operations
- Moved triage.clj to workers/acquisition/email/triage.clj
- Added acquisition/multi.clj for multimethod dispatch
Other
- ERP resources moved to resources/erp/public/
- Cheshire encoder for java.time.Instant to JSONB
- Random ports for HTTP servers in test fixtures
Commits: 5017705f, 871075bc, 2fc10836, 0ba31913
Documentation
Feature Specifications
- Product Roadmap: Technical details, 18-month timeline, €3.55-6.83M/year value
- DocuSign API: Electronic signature integration with German compliance (eIDAS, BGB, GoBD)
- AI Financial Reporting Agent: Automated board-ready reports with CFO-level analysis
- Supplier & Vendor Management: 360° profiles, contract analysis, discount tracking
- Scenario Analysis & Monte Carlo: Financial simulation capabilities
- Accounts Receivable: Full AR feature specification
- Slack Integration: Communication and notification spec
- Purchase Requests & Approvals: Procurement workflow spec
Operational Documentation
- SES email acquisition architecture
- Customer email forwarding setup guide
- SES troubleshooting runbook
- Maesn/DATEV integration documentation
Commits: 1fe2d08b, ef473e54, 4ef3c129, eefa565a, 6f3220dd, 3339c1de, 26f1da19, 6ca41205, 9714342c
Bug Fixes
- Qualified key destructuring: Fixed subscription renewal using unqualified keys causing nil values (e621bce7)
- Single-file API uploads: Return single map instead of vector for backward compatibility (ac71f31a)
- NullPointerException: Fixed BuCodeAssigner when tax-rate is nil (7f4fe882)
- Ring imports: Fixed ring.util.response imports for content-type and header (d1707175)
- Email banner: Restored accidentally removed email-connection-banner call (d1707175)
- Date format detection: Enhanced LLM extraction with locale-aware parsing (dd2b3d7b)
- Test data: Added missing pricing-factor field to extraction tests (6e2d2d49)
- Ingestion error handler: Fixed complete-ingestion! call signature (fc230b19)
Code Quality & Maintenance
- Removed BuCodeAssigner (replaced by improved matching) (2126d746)
- Removed German wording from LLM prompts (5828ee12)
- Fixed deprecated integrant.core/prep usage across dev, system, and test files (6ef32c41, f6c53732, baafb193, a6d54193, c7b359c5)
- Lint fixes: removed unused imports and bindings (d462ffbb, 92cca134, 8704205b, 7976e36b, 7ebc8a58)
- Added .lsp/.cache/ to .gitignore (9ed926dc)
- Excluded volumes/postgres and target from consult-find/ripgrep (aea8ea7d)
- Schema validation tests for new fields (9ed926dc)
- Removed unused ingestion.source_metadata column and trigger (88afb57a)
Breaking Changes
1. LineItem Schema: :account → :debit-account + :credit-account
PR: #48 (getorcha/counter-account-po-gr)
Commit: c669eef4
- Renamed :account to :debit-account in LineItem schema
- Added new :credit-account field for double-entry bookkeeping
- Commit 6ef32c41 explicitly removed legacy :account support from UI
Data migration required: Rename :account → :debit-account in all line items
2. Cost Center: Invoice-level → Line-item-level
PR: #65 (getorcha/CC-and-period-allocation-extension)
Commit: 14a611c9
- Moved :cost-center from invoice level to individual line items
- CostCenterMatcher now matches per line item
- Invoice-level cost center section and tab removed from UI
Data migration required: Copy invoice-level :cost-center to each line item, then remove invoice-level field
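The data migrations required by breaking changes 1 and 2 could be applied in a single pass over stored documents. The following Python sketch is illustrative only: it assumes documents are plain maps with a "line-items" list and string keys mirroring the Clojure keywords, which may not match the actual storage layout.

```python
def migrate_document(doc):
    """Return a migrated copy: rename account, push cost center down to line items."""
    invoice_cost_center = doc.get("cost-center")
    # Copy everything except the invoice-level cost center, which is removed.
    migrated = {k: v for k, v in doc.items() if k != "cost-center"}
    new_items = []
    for item in doc.get("line-items", []):
        item = dict(item)  # shallow copy so the input is left untouched
        if "account" in item:
            # Rename :account to :debit-account without overwriting an existing value.
            item.setdefault("debit-account", item.pop("account"))
        if invoice_cost_center is not None:
            # Line items that already carry their own cost center keep it.
            item.setdefault("cost-center", invoice_cost_center)
        new_items.append(item)
    migrated["line-items"] = new_items
    return migrated
```

The function is idempotent: running it on an already migrated document changes nothing, which makes a partially completed migration safe to re-run.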
3. S3 Config: :s3-bucket → :s3-buckets Map
Commit: a914e4de (direct to master)
;; Before
:s3-bucket "v1-orcha-global-storage-..."
;; After
:s3-buckets {:storage "v1-orcha-global-storage-..."
             :ses-emails "v1-orcha-ses-emails-..."}
Status: Fully aligned - no deployment gaps found
- CDK infrastructure creates both buckets correctly
- All application code updated in same commit
- IAM policies grant proper access to both buckets
Migration Notes
- SSM parameter rename (tracked in #82)
- Migrate from /v1-orcha/refresh-token-key-arn to /v1-orcha/db-secrets-key-arn
- Old parameter kept temporarily for backwards compatibility
Contributors