Document Management Part 1 — Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Enable ingestion of contracts, purchase orders, and goods received notes alongside invoices — classify, extract type-specific structured data, and store results.

Architecture: Shared common schema (schema/common.clj) with type-specific schemas in folders (schema/invoice/, schema/purchase_order/, schema/contract/, schema/grn/). A dispatch layer (schema/structured_data.clj) uses Malli :multi on :document-type to validate the correct schema. Pipeline stages use existing multimethod dispatch — new types get extraction prompts, stub validation, and no-op fraud/post-processing.

Tech Stack: Clojure, Malli (schemas), next.jdbc (DB), PostgreSQL (ENUM types), Anthropic Claude (LLM extraction)


Task 1: Database Migration — Add Document Types

Files:

Step 1: Write the up migration

-- Add new document types to enum.
-- Note: before PostgreSQL 12, ALTER TYPE ... ADD VALUE cannot run inside a transaction block.
ALTER TYPE document_type ADD VALUE IF NOT EXISTS 'contract';
ALTER TYPE document_type ADD VALUE IF NOT EXISTS 'purchase-order';
ALTER TYPE document_type ADD VALUE IF NOT EXISTS 'goods-received-note';
ALTER TYPE document_type ADD VALUE IF NOT EXISTS 'other';

-- Update trigger to use actual document type from structured_data instead of hard-coding 'invoice'
CREATE OR REPLACE FUNCTION update_document_from_ingestion()
RETURNS TRIGGER AS $$
BEGIN
    IF NEW.status = 'completed' AND OLD.status = 'in-progress' THEN
        UPDATE document
        SET
            type = (NEW.structured_data->>'document-type')::document_type,
            structured_data = NEW.structured_data,
            needs_human_review = (
                NOT COALESCE(NEW.valid_structured_data, true)
                OR EXISTS (
                    SELECT 1 FROM jsonb_each(NEW.structured_data->'validation-results') AS v(k, val)
                    WHERE val->>'status' = 'error'
                )
            ),
            updated_at = now()
        WHERE id = NEW.document_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
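
The needs_human_review expression in the trigger can be easier to reason about outside SQL. The following is a minimal sketch of the same predicate in Clojure, assuming the ingestion row is available as a map with string-keyed JSON; the function name and the map keys are hypothetical and not part of the plan.

```clojure
(defn needs-human-review?
  "Mirrors the trigger's needs_human_review expression for one ingestion row.
   A nil valid-structured-data is treated as true, like COALESCE(..., true)."
  [{:keys [valid-structured-data structured-data]}]
  (let [valid?  (if (nil? valid-structured-data) true valid-structured-data)
        ;; jsonb_each over validation-results: one entry per validation check
        results (vals (get structured-data "validation-results"))]
    (boolean
     (or (not valid?)
         (some #(= "error" (get % "status")) results)))))

;; A purchase order whose required-fields check failed is flagged for review.
(needs-human-review?
 {:valid-structured-data true
  :structured-data {"document-type" "purchase-order"
                    "validation-results" {"required-fields" {"status" "error"}}}})
```

A row with no validation-results at all is not flagged unless valid_structured_data is explicitly false, which matches the SQL behaviour of jsonb_each on NULL returning no rows.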

Step 2: Write the down migration

-- Revert trigger to hard-coded 'invoice'
CREATE OR REPLACE FUNCTION update_document_from_ingestion()
RETURNS TRIGGER AS $$
BEGIN
    IF NEW.status = 'completed' AND OLD.status = 'in-progress' THEN
        UPDATE document
        SET
            type = 'invoice',
            structured_data = NEW.structured_data,
            needs_human_review = (
                NOT COALESCE(NEW.valid_structured_data, true)
                OR EXISTS (
                    SELECT 1 FROM jsonb_each(NEW.structured_data->'validation-results') AS v(k, val)
                    WHERE val->>'status' = 'error'
                )
            ),
            updated_at = now()
        WHERE id = NEW.document_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Note: Cannot remove values from ENUM in PostgreSQL without dropping and recreating the type.
-- This down migration only reverts the trigger. Manual cleanup of enum values would be needed
-- if a full rollback is required.

Step 3: Run migration locally

Run: cd orcha && clojure -M:dev -m com.getorcha.dev/migrate

Expected: Migration applies successfully. Verify with:

psql -U postgres -d orcha -c "SELECT enum_range(NULL::document_type);"

Step 4: Commit

git add orcha/resources/migrations/20260217120000-add-document-types.up.sql orcha/resources/migrations/20260217120000-add-document-types.down.sql
git commit -m "feat: add contract, purchase-order, goods-received-note to document_type enum

Update trigger to derive document type from structured_data instead of
hard-coding 'invoice'."

Task 2: Create schema/common.clj — Extract Shared Components

Files:

Step 1: Create schema/common.clj

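Extract these definitions from structured_data.clj into schema/common.clj (as listed in the Step 4 commit message): TaxIdType, Issuer, Recipient, ServicePeriod, ComplianceStatement, Surcharge.

Namespace: com.getorcha.schema.common

Also add shared enums: Confidence, DocumentType.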

Step 2: Add a ContractParty schema to common.clj

Contracts need a party schema with signatory and role fields:

(def ContractParty
  "Party in a contract with signatory details."
  [:map
   [:name [:string {:min 1}]]
   [:address [:maybe :string]]
   [:country [:maybe :string]]
   [:tax-id-type [:maybe TaxIdType]]
   [:tax-id [:maybe :string]]
   [:signatory [:maybe :string]]
   [:role [:maybe :string]]])

Step 3: Verify compile

Run: cd orcha && clojure -M:dev -e "(require 'com.getorcha.schema.common)"

Expected: No errors

Step 4: Commit

git add orcha/src/com/getorcha/schema/common.clj
git commit -m "feat: add schema/common.clj with shared document components

Extract TaxIdType, Issuer, Recipient, ServicePeriod, ComplianceStatement,
Surcharge, ContractParty, Confidence, and DocumentType enums."

Task 3: Move Invoice Schema to schema/invoice/structured_data.clj

Files:

Step 1: Create schema/invoice/structured_data.clj

Namespace: com.getorcha.schema.invoice.structured-data

Move all invoice-specific definitions from structured_data.clj:

Require schema.common for shared components (Issuer, Recipient, etc.)

Step 2: Update schema/structured_data.clj to re-export

The existing structured_data.clj becomes a dispatch layer that:

  1. Requires schema.invoice.structured-data
  2. Defines StructuredData as the invoice schema for now (will become :multi dispatch later)
  3. Re-exports AccountMatch, AccrualMatch, FraudFlag, ValidationCheck, etc. for backwards compatibility

(ns com.getorcha.schema.structured-data
  "Dispatch schema for all document types.
   Routes to type-specific schemas based on :document-type."
  (:require [com.getorcha.schema.invoice.structured-data :as invoice]))

;; Re-exports for backwards compatibility
(def AccountMatch invoice/AccountMatch)
(def AccrualMatch invoice/AccrualMatch)
(def VatValidation invoice/VatValidation)
(def CostCenterMatch invoice/CostCenterMatch)
(def BuCode invoice/BuCode)
(def ServiceCategory invoice/ServiceCategory)
(def TaxIssue invoice/TaxIssue)
(def FraudFlagType invoice/FraudFlagType)
(def FraudSeverity invoice/FraudSeverity)
(def FraudFlag invoice/FraudFlag)

;; For now, StructuredData = InvoiceData. Will become :multi dispatch after all type schemas exist.
(def StructuredData invoice/InvoiceData)

Step 3: Run existing tests to verify nothing broke

Run: cd orcha && clojure -M:test -m kaocha.runner -- --focus com.getorcha.workers.ingestion.extraction-test

Expected: All tests pass (schema imports still resolve, and InvoiceData validates the same as the old StructuredData)

Also run: cd orcha && clojure -M:test -m kaocha.runner -- --focus com.getorcha.workers.ingestion.classification-test

Expected: All tests pass

Step 4: Commit

git add orcha/src/com/getorcha/schema/invoice/structured_data.clj orcha/src/com/getorcha/schema/structured_data.clj
git commit -m "refactor: move invoice schema to schema/invoice/structured_data.clj

Original structured_data.clj becomes a dispatch layer that re-exports
shared types for backwards compatibility."

Task 4: Create Purchase Order Schema

Files:

Step 1: Write the PO schema

Namespace: com.getorcha.schema.purchase-order.structured-data

(ns com.getorcha.schema.purchase-order.structured-data
  "Schema for LLM-extracted purchase order structured data."
  (:require [com.getorcha.schema.common :as common]))

(def ^:private POLineItem
  [:map
   [:description :string]
   [:article-code [:maybe :string]]
   [:quantity [:maybe number?]]
   [:unit [:maybe :string]]
   [:unit-price [:maybe number?]]
   [:amount [:maybe number?]]
   [:tax-rate [:maybe number?]]
   [:expected-delivery-date [:maybe :string]]  ;; ISO 8601
   [:page-location [:tuple :int :int]]])

(def PurchaseOrderData
  "Schema for LLM-extracted purchase order structured data."
  [:map
   ;; Classification (from common)
   [:document-type [:= "purchase-order"]]
   [:document-description [:maybe :string]]
   [:confidence common/Confidence]
   [:missing-fields [:maybe [:vector :string]]]

   ;; Core
   [:po-number [:string {:min 1}]]
   [:po-date [:maybe :string]]           ;; ISO 8601
   [:currency [:maybe :string]]
   [:total-value [:maybe number?]]
   [:status [:maybe :string]]            ;; e.g., "confirmed", "pending", "cancelled"

   ;; Parties
   [:buyer [:maybe common/Recipient]]
   [:supplier common/Issuer]

   ;; Line items
   [:line-items [:maybe [:vector POLineItem]]]

   ;; Logistics
   [:expected-delivery-date [:maybe :string]]
   [:delivery-address [:maybe :string]]
   [:delivery-country [:maybe :string]]
   [:incoterm-code [:maybe :string]]
   [:incoterm-place [:maybe :string]]
   [:shipping-terms [:maybe :string]]

   ;; Commercial
   [:payment-terms [:maybe :string]]
   [:discount-terms [:maybe :string]]
   [:validity-date [:maybe :string]]     ;; ISO 8601

   ;; References
   [:contract-reference [:maybe :string]]
   [:requisition-number [:maybe :string]]

   ;; Approval
   [:authorized-by [:maybe :string]]
   [:approval-date [:maybe :string]]])   ;; ISO 8601
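
The schema above types every date field as a plain :string and relies on comments for the ISO 8601 convention, so a malformed date would still pass schema validation. The following is a minimal sketch of a stricter check, assuming the dates are calendar dates of the form YYYY-MM-DD; iso-date? is a hypothetical helper, not something the plan defines.

```clojure
(import '(java.time LocalDate)
        '(java.time.format DateTimeParseException))

(defn iso-date?
  "True when s is a string that parses as an ISO 8601 calendar date (YYYY-MM-DD)."
  [s]
  (and (string? s)
       (try
         (LocalDate/parse s)
         true
         (catch DateTimeParseException _
           false))))

(iso-date? "2026-02-17")  ; a well-formed calendar date
(iso-date? "17/02/2026")  ; day-first format, not ISO 8601
```

If stricter validation is wanted later, a predicate like this could be attached with Malli's :fn schema, for example [:maybe [:and :string [:fn iso-date?]]]. Whether that strictness is desirable for LLM output is a judgment call the plan leaves open.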

Step 2: Verify compile

Run: cd orcha && clojure -M:dev -e "(require 'com.getorcha.schema.purchase-order.structured-data)"

Expected: No errors

Step 3: Commit

git add orcha/src/com/getorcha/schema/purchase_order/structured_data.clj
git commit -m "feat: add purchase order structured data schema"

Task 5: Create Contract Schema

Files:

Step 1: Write the contract schema

Namespace: com.getorcha.schema.contract.structured-data

(ns com.getorcha.schema.contract.structured-data
  "Schema for LLM-extracted contract structured data."
  (:require [com.getorcha.schema.common :as common]))

(def ^:private ContractType
  [:enum "service" "supply" "lease" "nda" "framework" "other"])

(def ^:private RenewalType
  [:enum "auto" "manual" "none"])

(def ^:private PaymentScheduleEntry
  [:map
   [:description :string]
   [:date [:maybe :string]]
   [:amount [:maybe number?]]])

(def ^:private VariableComponent
  [:map
   [:description :string]
   [:formula [:maybe :string]]])

(def ^:private Penalty
  [:map
   [:description :string]
   [:amount-or-percentage [:maybe :string]]])

(def ContractData
  "Schema for LLM-extracted contract structured data."
  [:map
   ;; Classification
   [:document-type [:= "contract"]]
   [:document-description [:maybe :string]]
   [:confidence common/Confidence]
   [:missing-fields [:maybe [:vector :string]]]

   ;; Core
   [:contract-number [:maybe :string]]
   [:title [:maybe :string]]
   [:contract-type [:maybe ContractType]]
   [:effective-date [:maybe :string]]
   [:expiration-date [:maybe :string]]
   [:currency [:maybe :string]]
   [:total-value [:maybe number?]]

   ;; Parties
   [:party-a [:maybe common/ContractParty]]
   [:party-b [:maybe common/ContractParty]]

   ;; Terms
   [:payment-schedule [:maybe [:vector PaymentScheduleEntry]]]
   [:payment-terms [:maybe :string]]
   [:renewal-type [:maybe RenewalType]]
   [:renewal-notice-period [:maybe :string]]
   [:termination-notice-period [:maybe :string]]
   [:termination-conditions [:maybe :string]]

   ;; Scope
   [:description [:maybe :string]]
   [:deliverables [:maybe [:vector :string]]]
   [:slas [:maybe [:vector :string]]]

   ;; Financial
   [:base-fee [:maybe number?]]
   [:variable-components [:maybe [:vector VariableComponent]]]
   [:price-escalation [:maybe :string]]
   [:penalties [:maybe [:vector Penalty]]]

   ;; References
   [:po-references [:maybe [:vector :string]]]
   [:predecessor-contract [:maybe :string]]

   ;; Legal
   [:governing-law [:maybe :string]]
   [:jurisdiction [:maybe :string]]
   [:liability-cap [:maybe :string]]
   [:insurance-requirements [:maybe :string]]
   [:confidentiality [:maybe :boolean]]])

Step 2: Verify compile

Run: cd orcha && clojure -M:dev -e "(require 'com.getorcha.schema.contract.structured-data)"

Expected: No errors

Step 3: Commit

git add orcha/src/com/getorcha/schema/contract/structured_data.clj
git commit -m "feat: add contract structured data schema"

Task 6: Create GRN Schema

Files:

Step 1: Write the GRN schema

Namespace: com.getorcha.schema.grn.structured-data

(ns com.getorcha.schema.grn.structured-data
  "Schema for LLM-extracted goods received note structured data."
  (:require [com.getorcha.schema.common :as common]))

(def ^:private ItemCondition
  [:enum "accepted" "damaged" "short"])

(def ^:private QualityAssessment
  [:enum "pass" "partial" "fail"])

(def ^:private GRNLineItem
  [:map
   [:description :string]
   [:quantity-ordered [:maybe number?]]
   [:quantity-received [:maybe number?]]
   [:quantity-rejected [:maybe number?]]
   [:unit [:maybe :string]]
   [:condition [:maybe ItemCondition]]
   [:rejection-reason [:maybe :string]]
   [:page-location [:tuple :int :int]]])

(def GRNData
  "Schema for LLM-extracted goods received note structured data."
  [:map
   ;; Classification
   [:document-type [:= "goods-received-note"]]
   [:document-description [:maybe :string]]
   [:confidence common/Confidence]
   [:missing-fields [:maybe [:vector :string]]]

   ;; Core
   [:grn-number [:maybe :string]]
   [:receipt-date [:maybe :string]]
   [:receiving-location [:maybe :string]]

   ;; References
   [:po-reference [:maybe :string]]
   [:delivery-note-number [:maybe :string]]
   [:shipping-reference [:maybe :string]]

   ;; Parties
   [:supplier [:maybe common/Issuer]]
   [:receiver-name [:maybe :string]]
   [:receiver-department [:maybe :string]]

   ;; Line items
   [:line-items [:maybe [:vector GRNLineItem]]]

   ;; Logistics
   [:carrier [:maybe :string]]
   [:delivery-date [:maybe :string]]
   [:delivery-method [:maybe :string]]

   ;; Inspection
   [:inspector-name [:maybe :string]]
   [:inspection-date [:maybe :string]]
   [:inspection-notes [:maybe :string]]
   [:quality-assessment [:maybe QualityAssessment]]

   ;; Sign-off
   [:received-by [:maybe :string]]
   [:approved-by [:maybe :string]]])

Step 2: Verify compile

Run: cd orcha && clojure -M:dev -e "(require 'com.getorcha.schema.grn.structured-data)"

Expected: No errors

Step 3: Commit

git add orcha/src/com/getorcha/schema/grn/structured_data.clj
git commit -m "feat: add goods received note structured data schema"

Task 7: Update structured_data.clj — Multi-Schema Dispatch

Files:

Step 1: Update to use Malli :multi dispatch

Replace the temporary re-export with a proper multi-dispatch schema:

(ns com.getorcha.schema.structured-data
  "Dispatch schema for all document types.
   Routes to type-specific schemas based on :document-type."
  (:require [com.getorcha.schema.invoice.structured-data :as invoice]
            [com.getorcha.schema.purchase-order.structured-data :as po]
            [com.getorcha.schema.contract.structured-data :as contract]
            [com.getorcha.schema.grn.structured-data :as grn]
            [malli.core :as m]))

;; Re-exports for backwards compatibility (used by post_process.clj, validation.clj, etc.)
(def AccountMatch invoice/AccountMatch)
(def AccrualMatch invoice/AccrualMatch)
(def VatValidation invoice/VatValidation)
(def CostCenterMatch invoice/CostCenterMatch)
(def BuCode invoice/BuCode)
(def ServiceCategory invoice/ServiceCategory)
(def TaxIssue invoice/TaxIssue)
(def FraudFlagType invoice/FraudFlagType)
(def FraudSeverity invoice/FraudSeverity)
(def FraudFlag invoice/FraudFlag)

(def StructuredData
  "Multi-schema that dispatches on :document-type.
   Validates invoice, purchase-order, contract, or goods-received-note data
   against their type-specific schemas."
  (m/schema
   [:multi {:dispatch :document-type}
    ["invoice"             invoice/InvoiceData]
    ["purchase-order"      po/PurchaseOrderData]
    ["contract"            contract/ContractData]
    ["goods-received-note" grn/GRNData]]))
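
To see how the :multi dispatch behaves before wiring in the real schemas, the following sketch uses two deliberately tiny stand-in schemas. It assumes Malli is on the classpath; DemoInvoice, DemoPurchaseOrder and DemoStructuredData are hypothetical names used only for illustration.

```clojure
(require '[malli.core :as m])

;; Tiny stand-ins for the real type-specific schemas.
(def DemoInvoice
  [:map
   [:document-type [:= "invoice"]]
   [:invoice-number :string]])

(def DemoPurchaseOrder
  [:map
   [:document-type [:= "purchase-order"]]
   [:po-number [:string {:min 1}]]])

;; Same shape as StructuredData: dispatch on the :document-type string.
(def DemoStructuredData
  (m/schema
   [:multi {:dispatch :document-type}
    ["invoice"        DemoInvoice]
    ["purchase-order" DemoPurchaseOrder]]))

;; Dispatches to the PO branch and validates against it.
(m/validate DemoStructuredData {:document-type "purchase-order" :po-number "PO-001"})

;; Right branch, but the empty po-number violates {:min 1}.
(m/validate DemoStructuredData {:document-type "purchase-order" :po-number ""})

;; No branch is registered for "other", so validation fails.
(m/validate DemoStructuredData {:document-type "other"})
```

The last call is worth keeping in mind: Task 1 adds 'other' to the database enum, but StructuredData registers no "other" branch, so a payload with that document type would not validate against it.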

Step 2: Run existing tests

Run: cd orcha && clojure -M:test -m kaocha.runner -- --focus com.getorcha.workers.ingestion.extraction-test

Expected: All tests pass (invoice data still validates through the :multi dispatch)

Step 3: Commit

git add orcha/src/com/getorcha/schema/structured_data.clj
git commit -m "feat: structured_data.clj now dispatches to type-specific schemas

Uses Malli :multi on :document-type to validate invoice, purchase-order,
contract, and goods-received-note data against their respective schemas."

Task 8: Update schema/document.clj — Add Document Types

Files:

Step 1: Update Type enum

Change lines 7-8 from:

(def Type
  [:enum :invoice])

To:

(def Type
  [:enum :invoice :contract :purchase-order :goods-received-note :other])

Step 2: Commit

git add orcha/src/com/getorcha/schema/document.clj
git commit -m "feat: add new document types to schema/document.clj Type enum"

Task 9: Update Classification — Stop Rejecting Non-Invoices

Files:

Step 1: Write test for non-invoice classification returning result

Add to classification_test.clj — test that classify! returns classification for non-invoices instead of throwing:

;; Placeholder: with an empty body this test passes vacuously. Once the body mocks the LLM
;; call and asserts on the result, it will fail until classify! stops throwing for non-invoices.
(deftest test-classify-returns-non-invoice-types
  (testing "classify! returns classification for contract (does not reject)"
    ;; TODO: mock the LLM call to return a contract classification
    ;; and verify classify! returns the result instead of throwing
    ))

Note: The classify! function requires DB + LLM, so it is hard to unit-test directly; the unit-testable piece is parse-classification-result. Existing tests already cover parsing for all types (lines 59-93 of classification_test.clj). The behavioral change in this task is in classify! itself.

Step 2: Update classify! to return all types

In classification.clj, replace lines 169-179:

    (if (= "invoice" (:document-type classification))
      ;; Invoice - return classification and stats
      {:classification classification
       :stats          stats}
      ;; Non-invoice - reject
      (throw (ex-info "Document rejected: not an invoice"
                      {:kind             ::rejection
                       :rejection-reason :not-invoice
                       :document-type    (:document-type classification)
                       :description      (:invoice-description classification)
                       :confidence       (:confidence classification)})))

With:

    {:classification classification
     :stats          stats}

Step 3: Add :document-description key to parse-classification-result

In parse-classification-result (line 129), add a :document-description key alongside :invoice-description:

    {:document-type        document-type
     :invoice-subtype      (when (= "invoice" document-type)
                             (if (invoice-subtypes subtype)
                               subtype
                               "standard-invoice"))
     :invoice-description  (:description result)
     :document-description (:description result)  ;; new, type-agnostic alias
     :confidence           (or (:confidence result) "medium")}
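
The fallback behaviour of this map can be checked in isolation. The sketch below rebuilds the same map in a standalone function; the input keys :document-type and :subtype and the contents of invoice-subtypes are assumptions made for illustration, since the real function and set live in classification.clj.

```clojure
(def invoice-subtypes
  "Hypothetical stand-in; the real set is defined in classification.clj."
  #{"standard-invoice" "credit-note"})

(defn classification-map
  "Builds the classification map from an already-parsed LLM result."
  [{:keys [document-type subtype description confidence]}]
  {:document-type        document-type
   ;; Only invoices carry a subtype; unknown subtypes fall back to the default.
   :invoice-subtype      (when (= "invoice" document-type)
                           (if (invoice-subtypes subtype)
                             subtype
                             "standard-invoice"))
   :invoice-description  description
   :document-description description   ; type-agnostic alias
   :confidence           (or confidence "medium")})

;; A contract gets no invoice subtype and the default confidence.
(classification-map {:document-type "contract"
                     :description   "Master services agreement"})
```

Keeping :invoice-description alongside the new key means existing readers keep working while new code can move to :document-description.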

Step 4: Update classification docstring

Update the ns docstring and classify! docstring to reflect that non-invoices are no longer rejected.

Step 5: Run classification tests

Run: cd orcha && clojure -M:test -m kaocha.runner -- --focus com.getorcha.workers.ingestion.classification-test

Expected: All tests pass

Step 6: Commit

git add orcha/src/com/getorcha/workers/ingestion/classification.clj orcha/test/com/getorcha/workers/ingestion/classification_test.clj
git commit -m "feat: classification returns all document types, stops rejecting non-invoices

Add :document-description as type-agnostic alias for :invoice-description."

Task 10: Add Extraction for Purchase Orders

Files:

Step 1: Add PO extraction prompt

Add a new defmethod workers/-prompt :extraction-purchase-order with a comprehensive extraction prompt for purchase orders. Follow the same structure as the invoice prompt (:extraction) but with PO-specific fields matching the PurchaseOrderData schema.

The prompt should:

Step 2: Add PO extraction multimethod

(defmethod structured-data "purchase-order"
  [{:keys [db-pool llm-config] :as _context}
   {{:keys [text]} :transcription-result :keys [document] :as _ingestion}]
  (let [started-at      (java.time.Instant/now)
        extraction-cfg  (:extraction llm-config)
        legal-entity-id (:document/legal-entity-id document)
        prompt          (workers/legal-entity-prompt db-pool legal-entity-id :extraction-purchase-order {:text text})
        llm-generation  (llm/generate extraction-cfg prompt)
        ended-at        (java.time.Instant/now)]
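    {:data  (-> (:text llm-generation)
                llm/parse-json-response
                ;; Carried over from the invoice method. PurchaseOrderData nests the
                ;; vendor under :supplier, not :issuer, so confirm this helper applies.
                normalize-issuer-iban)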
     :stats (-> llm-generation
                (dissoc :text)
                (assoc :started-at started-at
                       :ended-at ended-at))}))

Step 3: Verify compile

Run: cd orcha && clojure -M:dev -e "(require 'com.getorcha.workers.ingestion.extraction)"

Expected: No errors

Step 4: Commit

git add orcha/src/com/getorcha/workers/ingestion/extraction.clj
git commit -m "feat: add purchase order extraction prompt and multimethod"

Task 11: Add Extraction for Contracts

Files:

Step 1: Add contract extraction prompt

Add defmethod workers/-prompt :extraction-contract with a comprehensive prompt for contracts. Fields should match ContractData schema.

Step 2: Add contract extraction multimethod

(defmethod structured-data "contract"
  [{:keys [db-pool llm-config] :as _context}
   {{:keys [text]} :transcription-result :keys [document] :as _ingestion}]
  (let [started-at      (java.time.Instant/now)
        extraction-cfg  (:extraction llm-config)
        legal-entity-id (:document/legal-entity-id document)
        prompt          (workers/legal-entity-prompt db-pool legal-entity-id :extraction-contract {:text text})
        llm-generation  (llm/generate extraction-cfg prompt)
        ended-at        (java.time.Instant/now)]
    {:data  (-> (:text llm-generation)
                llm/parse-json-response)
     :stats (-> llm-generation
                (dissoc :text)
                (assoc :started-at started-at
                       :ended-at ended-at))}))

Step 3: Commit

git add orcha/src/com/getorcha/workers/ingestion/extraction.clj
git commit -m "feat: add contract extraction prompt and multimethod"

Task 12: Add Extraction for GRN

Files:

Step 1: Add GRN extraction prompt

Add defmethod workers/-prompt :extraction-grn with a comprehensive prompt for goods received notes. Fields should match GRNData schema.

Step 2: Add GRN extraction multimethod

(defmethod structured-data "goods-received-note"
  [{:keys [db-pool llm-config] :as _context}
   {{:keys [text]} :transcription-result :keys [document] :as _ingestion}]
  (let [started-at      (java.time.Instant/now)
        extraction-cfg  (:extraction llm-config)
        legal-entity-id (:document/legal-entity-id document)
        prompt          (workers/legal-entity-prompt db-pool legal-entity-id :extraction-grn {:text text})
        llm-generation  (llm/generate extraction-cfg prompt)
        ended-at        (java.time.Instant/now)]
    {:data  (-> (:text llm-generation)
                llm/parse-json-response)
     :stats (-> llm-generation
                (dissoc :text)
                (assoc :started-at started-at
                       :ended-at ended-at))}))
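
The purchase-order, contract, and GRN extraction methods differ only in the prompt key (plus one extra normalization step for purchase orders). One possible way to factor out the shared timing and stats handling is sketched below. It is not part of the plan: extract-with, the :generate and :parse keys, and the :input-tokens stat are hypothetical, and the LLM call is stubbed so the sketch runs on its own.

```clojure
(require '[clojure.edn :as edn])

(defn extract-with
  "Runs one extraction and returns {:data ... :stats ...}.
   generate and parse are injected so the sketch needs no LLM or DB."
  [{:keys [generate parse]} prompt]
  (let [started-at     (java.time.Instant/now)
        llm-generation (generate prompt)
        ended-at       (java.time.Instant/now)]
    {:data  (parse (:text llm-generation))
     ;; Everything except the raw text is kept as stats, plus timing.
     :stats (-> llm-generation
                (dissoc :text)
                (assoc :started-at started-at
                       :ended-at ended-at))}))

;; Stub dependencies: a canned response parsed as EDN instead of JSON.
(def fake-deps
  {:generate (fn [_prompt]
               {:text "{:document-type \"contract\"}" :input-tokens 10})
   :parse    edn/read-string})

(extract-with fake-deps "hypothetical prompt text")
```

With a helper like this, each defmethod would only build its prompt and pick its parse pipeline. The plan as written keeps the three methods separate, which is also a reasonable choice while the types are still diverging.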

Step 3: Commit

git add orcha/src/com/getorcha/workers/ingestion/extraction.clj
git commit -m "feat: add goods received note extraction prompt and multimethod"

Task 13: Add Validation Stubs for New Types

Files:

Step 1: Add basic validation for purchase orders

(defmethod validate "purchase-order"
  [structured-data]
  (assoc structured-data
         :validation-results
         {:required-fields (if (and (not-empty (:po-number structured-data))
                                    (not-empty (get-in structured-data [:supplier :name])))
                             {:status "pass"}
                             {:status "error"
                              :message "Missing required fields: po-number and/or supplier name"})}))

Step 2: Add basic validation for contracts

(defmethod validate "contract"
  [structured-data]
  (assoc structured-data
         :validation-results
         {:required-fields (if (or (not-empty (:contract-number structured-data))
                                   (not-empty (:title structured-data)))
                             {:status "pass"}
                             {:status "error"
                              :message "Missing required fields: contract-number or title"})}))

Step 3: Add basic validation for GRN

(defmethod validate "goods-received-note"
  [structured-data]
  (assoc structured-data
         :validation-results
         {:required-fields (if (not-empty (:grn-number structured-data))
                             {:status "pass"}
                             {:status "error"
                              :message "Missing required field: grn-number"})}))
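
The stubs above can be exercised without the rest of the pipeline. The sketch below redefines a standalone validate multimethod, assuming it dispatches on the :document-type string of the structured data (the real defmulti lives in validation.clj and may differ), and runs the purchase-order check on a passing and a failing input.

```clojure
(defmulti validate
  "Assumed dispatch: the :document-type string of the structured data."
  :document-type)

(defmethod validate "purchase-order"
  [structured-data]
  (assoc structured-data
         :validation-results
         {:required-fields (if (and (not-empty (:po-number structured-data))
                                    (not-empty (get-in structured-data [:supplier :name])))
                             {:status "pass"}
                             {:status "error"
                              :message "Missing required fields: po-number and/or supplier name"})}))

(def ok-po
  (validate {:document-type "purchase-order"
             :po-number     "PO-001"
             :supplier      {:name "Acme GmbH"}}))

(def bad-po
  (validate {:document-type "purchase-order"
             :po-number     ""          ; empty string counts as missing
             :supplier      {:name "Acme GmbH"}}))

(get-in ok-po  [:validation-results :required-fields :status])
(get-in bad-po [:validation-results :required-fields :status])
```

Note that not-empty accepts a whitespace-only string such as "  ". If that should count as missing, clojure.string/blank? would be the stricter check.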

Step 4: Run validation tests

Run: cd orcha && clojure -M:test -m kaocha.runner -- --focus com.getorcha.workers.ingestion.validation-test

Expected: All existing tests pass

Step 5: Commit

git add orcha/src/com/getorcha/workers/ingestion/validation.clj
git commit -m "feat: add basic validation for purchase-order, contract, goods-received-note

Initial required-field checks only. Financial math validation is
invoice-specific and not applicable to new types."

Task 14: Add Fraud Detection No-Ops for New Types

Files:

Step 1: Add no-op implementations

(defmethod detect "purchase-order"
  [_context ingestion]
  ingestion)

(defmethod detect "contract"
  [_context ingestion]
  ingestion)

(defmethod detect "goods-received-note"
  [_context ingestion]
  ingestion)
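
The no-ops matter because a Clojure multimethod throws when no method matches the dispatch value. The sketch below shows that failure mode with a standalone detect multimethod; the dispatch function is an assumption, since the real one is defined in fraud_detection.clj.

```clojure
(require '[clojure.string :as str])

(defmulti detect
  "Assumed dispatch: document type read from the ingestion map."
  (fn [_context ingestion]
    (get-in ingestion [:structured-data :document-type])))

;; Only purchase orders get a no-op here, to show what happens without one.
(defmethod detect "purchase-order"
  [_context ingestion]
  ingestion)

(def po-ingestion  {:structured-data {:document-type "purchase-order"}})
(def grn-ingestion {:structured-data {:document-type "goods-received-note"}})

;; Returns the ingestion unchanged.
(detect {} po-ingestion)

;; No method registered for this type: the call throws IllegalArgumentException.
(def missing-method-message
  (try
    (detect {} grn-ingestion)
    nil
    (catch IllegalArgumentException e
      (.getMessage e))))
```

Without the explicit no-op methods, every non-invoice document would fail at the fraud-detection stage with this exception.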

Step 2: Run fraud detection tests

Run: cd orcha && clojure -M:test -m kaocha.runner -- --focus com.getorcha.workers.ingestion.fraud-detection-test

Expected: All existing tests pass

Step 3: Commit

git add orcha/src/com/getorcha/workers/ingestion/fraud_detection.clj
git commit -m "feat: add no-op fraud detection for new document types"

Task 15: Add Post-Processing No-Ops for New Types

Files:

Step 1: Add no-op implementations

(defmethod run "purchase-order"
  [_context ingestion]
  ingestion)

(defmethod run "contract"
  [_context ingestion]
  ingestion)

(defmethod run "goods-received-note"
  [_context ingestion]
  ingestion)
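
An alternative to one no-op per type is a single :default method. The sketch below shows that variant with a standalone run multimethod; the dispatch function and the :post-processed? key are hypothetical, and this is not what the plan prescribes.

```clojure
(defmulti run
  "Assumed dispatch: document type read from the ingestion map."
  (fn [_context ingestion]
    (get-in ingestion [:structured-data :document-type])))

;; Stand-in for the real invoice post-processing.
(defmethod run "invoice"
  [_context ingestion]
  (assoc ingestion :post-processed? true))

;; Every other document type passes through untouched.
(defmethod run :default
  [_context ingestion]
  ingestion)

(run {} {:structured-data {:document-type "invoice"}})
(run {} {:structured-data {:document-type "contract"}})
```

The trade-off is visibility: explicit per-type no-ops, as in the plan, make an unsupported type fail loudly, whereas :default silently accepts any value, including typos in the document type.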

Step 2: Commit

git add orcha/src/com/getorcha/workers/ingestion/post_process.clj
git commit -m "feat: add no-op post-processing for new document types"

Task 16: Update Ingestion Pipeline — Remove Rejection Catch Block

Files:

Step 1: Remove the ::classification/rejection catch clause

In job-handler (around lines 574-593), remove the entire cond branch that handles ::classification/rejection:

;; REMOVE THIS ENTIRE BLOCK:
(= ::classification/rejection (:kind data))
(let [{:keys [document-type description]} data]
  (log/info "Document rejected" data)
  (notifications/notify! ...)
  (delete-document! ...)
  (delete-message!))

Step 2: Update classify! docstring

Update the docstring at lines 292-294 to remove mention of rejection:

(defn ^:private classify!
  "Classifies document type. Returns ingestion with :classification added."
  [context ingestion]
  ...)

Step 3: Run all tests

Run: cd orcha && clojure -M:test -m kaocha.runner

Expected: All tests pass

Step 4: Commit

git add orcha/src/com/getorcha/workers/ingestion.clj
git commit -m "feat: remove non-invoice document rejection from ingestion pipeline

All document types now flow through the full pipeline: classify, extract,
validate, fraud-detect, post-process, complete."
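
The stage order named in the commit message can be pictured as a simple reduction over stage functions. This is a toy model only: run-pipeline, tracing-stage, and the :trace key are hypothetical and do not describe how job-handler is actually written.

```clojure
(def stage-order
  [:classify :extract :validate :fraud-detect :post-process :complete])

(defn run-pipeline
  "Threads an ingestion map through the stage functions in stage-order.
   stages maps a stage keyword to a function of ingestion -> ingestion."
  [stages ingestion]
  (reduce (fn [acc stage]
            (let [stage-fn (get stages stage)]
              (when-not stage-fn
                (throw (ex-info "No stage function registered" {:stage stage})))
              (stage-fn acc)))
          ingestion
          stage-order))

(defn tracing-stage
  "Returns a stage function that only records its own name."
  [stage]
  (fn [ingestion]
    (update ingestion :trace (fnil conj []) stage)))

(def demo-stages
  (into {} (map (juxt identity tracing-stage)) stage-order))

;; Every document type now visits every stage, in order.
(run-pipeline demo-stages {:document-type "contract"})
```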

Task 17: Final Integration Verification

Step 1: Run full test suite

Run: cd orcha && clojure -M:test -m kaocha.runner

Expected: All tests pass

Step 2: Verify schema dispatch works

Run a REPL check:

(require '[malli.core :as m])
(require '[com.getorcha.schema.structured-data :as sd])

;; Invoice data should validate
(m/validate sd/StructuredData {:document-type "invoice" :invoice-number "INV-001" ...})

;; PO data should validate
(m/validate sd/StructuredData {:document-type "purchase-order" :po-number "PO-001" ...})

;; Contract data should validate
(m/validate sd/StructuredData {:document-type "contract" :contract-number "CTR-001" ...})

;; GRN data should validate
(m/validate sd/StructuredData {:document-type "goods-received-note" :grn-number "GRN-001" ...})
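
When one of these checks returns false, m/explain together with malli.error/humanize shows which keys failed. The sketch below uses a reduced stand-in schema (DemoGRN is hypothetical) and assumes Malli is on the classpath. It also illustrates a detail of the schemas in this plan: [:maybe ...] allows nil, but the key itself is still required unless marked {:optional true}.

```clojure
(require '[malli.core :as m]
         '[malli.error :as me])

;; Reduced stand-in for GRNData with the same key style.
(def DemoGRN
  [:map
   [:document-type [:= "goods-received-note"]]
   [:grn-number [:maybe :string]]
   [:line-items [:maybe [:vector [:map [:description :string]]]]]])

;; Valid: nil is allowed for :maybe fields as long as the keys are present.
(m/validate DemoGRN {:document-type "goods-received-note"
                     :grn-number    nil
                     :line-items    nil})

;; Invalid: :grn-number has the wrong type and :line-items is absent.
(def grn-errors
  (-> (m/explain DemoGRN {:document-type "goods-received-note"
                          :grn-number    42})
      (me/humanize)))

grn-errors
```

If the LLM omits a field instead of returning null for it, validation fails even though the field is declared with :maybe, so the extraction prompts may need to ask for explicit nulls.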

Step 3: Verify DB migration

psql -U postgres -d orcha -c "SELECT enum_range(NULL::document_type);"

Expected: {invoice,contract,purchase-order,goods-received-note,other}

Step 4: Final commit (if any remaining changes)

git add -A
git commit -m "chore: final cleanup for document management part 1"