For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Enable ingestion of contracts, purchase orders, and goods received notes alongside invoices — classify, extract type-specific structured data, and store results.
Architecture: Shared common schema (schema/common.clj) with type-specific schemas in folders (schema/invoice/, schema/purchase_order/, schema/contract/, schema/grn/). A dispatch layer (schema/structured_data.clj) uses Malli :multi on :document-type to validate the correct schema. Pipeline stages use existing multimethod dispatch — new types get extraction prompts, stub validation, and no-op fraud/post-processing.
Tech Stack: Clojure, Malli (schemas), next.jdbc (DB), PostgreSQL (ENUM types), Anthropic Claude (LLM extraction)
Files:
orcha/resources/migrations/20260217120000-add-document-types.up.sql
orcha/resources/migrations/20260217120000-add-document-types.down.sql
Step 1: Write up migration
-- Add new document types to enum
ALTER TYPE document_type ADD VALUE IF NOT EXISTS 'contract';
ALTER TYPE document_type ADD VALUE IF NOT EXISTS 'purchase-order';
ALTER TYPE document_type ADD VALUE IF NOT EXISTS 'goods-received-note';
ALTER TYPE document_type ADD VALUE IF NOT EXISTS 'other';
-- Update trigger to use actual document type from structured_data instead of hard-coding 'invoice'
CREATE OR REPLACE FUNCTION update_document_from_ingestion()
RETURNS TRIGGER AS $$
BEGIN
IF NEW.status = 'completed' AND OLD.status = 'in-progress' THEN
UPDATE document
SET
type = (NEW.structured_data->>'document-type')::document_type,
structured_data = NEW.structured_data,
needs_human_review = (
NOT COALESCE(NEW.valid_structured_data, true)
OR EXISTS (
SELECT 1 FROM jsonb_each(NEW.structured_data->'validation-results') AS v(k, val)
WHERE val->>'status' = 'error'
)
),
updated_at = now()
WHERE id = NEW.document_id;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
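The `needs_human_review` expression in the trigger may be easier to reason about as a pure function. Below is a minimal Clojure sketch of the same logic, which could serve as a reference when writing test expectations. The function name `needs-human-review?` and the keyword-keyed input are assumptions made for illustration and are not part of the codebase.

```clojure
(defn needs-human-review?
  "Mirrors the trigger's needs_human_review expression.
  `valid-structured-data` may be true, false, or nil; nil is treated as
  valid, like COALESCE(..., true). `structured-data` is the parsed JSON map."
  [valid-structured-data structured-data]
  (boolean
   (or (false? valid-structured-data)
       ;; Same idea as: EXISTS (... WHERE val->>'status' = 'error')
       (some #(= "error" (:status %))
             (vals (:validation-results structured-data))))))

(needs-human-review? nil {:validation-results {:required-fields {:status "pass"}}})
;; => false
(needs-human-review? true {:validation-results {:required-fields {:status "error"}}})
;; => true
(needs-human-review? false {})
;; => true
```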
Step 2: Write down migration
-- Revert trigger to hard-coded 'invoice'
CREATE OR REPLACE FUNCTION update_document_from_ingestion()
RETURNS TRIGGER AS $$
BEGIN
IF NEW.status = 'completed' AND OLD.status = 'in-progress' THEN
UPDATE document
SET
type = 'invoice',
structured_data = NEW.structured_data,
needs_human_review = (
NOT COALESCE(NEW.valid_structured_data, true)
OR EXISTS (
SELECT 1 FROM jsonb_each(NEW.structured_data->'validation-results') AS v(k, val)
WHERE val->>'status' = 'error'
)
),
updated_at = now()
WHERE id = NEW.document_id;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- Note: Cannot remove values from ENUM in PostgreSQL without dropping and recreating the type.
-- This down migration only reverts the trigger. Manual cleanup of enum values would be needed
-- if a full rollback is required.
Step 3: Run migration locally
Run: cd orcha && clojure -M:dev -m com.getorcha.dev/migrate
Expected: Migration applies successfully. Verify with psql -U postgres -d orcha -c "SELECT enum_range(NULL::document_type);"
Step 4: Commit
git add orcha/resources/migrations/20260217120000-add-document-types.up.sql orcha/resources/migrations/20260217120000-add-document-types.down.sql
git commit -m "feat: add contract, purchase-order, goods-received-note to document_type enum

Update trigger to derive document type from structured_data instead of
hard-coding 'invoice'."
schema/common.clj: Extract Shared Components
Files:
orcha/src/com/getorcha/schema/common.clj
orcha/src/com/getorcha/schema/structured_data.clj (source of components to extract)
Step 1: Create schema/common.clj
Extract these definitions from structured_data.clj into schema/common.clj:
TaxIdType (lines 6-14)
Issuer (lines 17-39): make public (remove ^:private)
Recipient (lines 42-53): make public
ServicePeriod (lines 56-61): make public
ComplianceStatement (lines 110-116): make public
Surcharge (lines 139-150): make public
Namespace: com.getorcha.schema.common
Also add shared enums:
Confidence: [:enum "high" "medium" "low"]
DocumentType: [:enum "invoice" "contract" "purchase-order" "goods-received-note" "other"]
Step 2: Add a ContractParty schema to common.clj
Contracts need a party schema with signatory and role fields:
(def ContractParty
"Party in a contract with signatory details."
[:map
[:name [:string {:min 1}]]
[:address [:maybe :string]]
[:country [:maybe :string]]
[:tax-id-type [:maybe TaxIdType]]
[:tax-id [:maybe :string]]
[:signatory [:maybe :string]]
[:role [:maybe :string]]])
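One Malli detail worth keeping in mind for this schema: `[:maybe :string]` allows a `nil` value, but the key itself is still required unless the entry is marked `{:optional true}`. A minimal sketch of that behavior follows, assuming Malli is on the classpath. The `TaxIdType` values here are a hypothetical stand-in, since the real enum lives in structured_data.clj and is not shown in this plan.

```clojure
(require '[malli.core :as m])

;; Hypothetical stand-in; the real TaxIdType values are defined elsewhere.
(def TaxIdType [:enum "vat" "ein"])

(def ContractParty
  [:map
   [:name [:string {:min 1}]]
   [:address [:maybe :string]]
   [:country [:maybe :string]]
   [:tax-id-type [:maybe TaxIdType]]
   [:tax-id [:maybe :string]]
   [:signatory [:maybe :string]]
   [:role [:maybe :string]]])

;; All keys present, unknown values given as nil: valid.
(m/validate ContractParty
            {:name "Acme GmbH" :address nil :country "DE"
             :tax-id-type nil :tax-id nil :signatory nil :role "supplier"})
;; => true

;; Keys omitted entirely: invalid, because :maybe does not make a key optional.
(m/validate ContractParty {:name "Acme GmbH"})
;; => false
```

If the extraction prompts allow the LLM to omit unknown fields, either the prompt would need to ask for explicit nulls or the relevant entries would need `{:optional true}`.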
Step 3: Verify compile
Run: cd orcha && clojure -M:dev -e "(require 'com.getorcha.schema.common)"
Expected: No errors
Step 4: Commit
git add orcha/src/com/getorcha/schema/common.clj
git commit -m "feat: add schema/common.clj with shared document components

Extract TaxIdType, Issuer, Recipient, ServicePeriod, ComplianceStatement,
Surcharge, ContractParty, Confidence, and DocumentType enums."
schema/invoice/structured_data.clj
Files:
orcha/src/com/getorcha/schema/invoice/structured_data.clj
orcha/src/com/getorcha/schema/structured_data.clj
Step 1: Create schema/invoice/structured_data.clj
Namespace: com.getorcha.schema.invoice.structured-data
Move all invoice-specific definitions from structured_data.clj:
Prepayment (make public)
AccountMatch
AccrualPeriod, AccrualMatch
VatValidation
CostCenterMatch
BuCode
LineItem (make public, now requires common/Surcharge)
TaxRateBreakdown (make public)
ValidationCheck, ValidationResults
ServiceCategory
TaxIssue
FraudFlagType, FraudSeverity, FraudFlag
StructuredData schema itself → rename to InvoiceData
Require schema.common for shared components (Issuer, Recipient, etc.)
Step 2: Update schema/structured_data.clj to re-export
The existing structured_data.clj becomes a dispatch layer that:
Requires schema.invoice.structured-data
Re-exports StructuredData as the invoice schema for now (will become :multi dispatch later)
Re-exports AccountMatch, AccrualMatch, FraudFlag, ValidationCheck, etc. for backwards compat
(ns com.getorcha.schema.structured-data
"Dispatch schema for all document types.
Routes to type-specific schemas based on :document-type."
(:require [com.getorcha.schema.invoice.structured-data :as invoice]))
;; Re-exports for backwards compatibility
(def AccountMatch invoice/AccountMatch)
(def AccrualMatch invoice/AccrualMatch)
(def VatValidation invoice/VatValidation)
(def CostCenterMatch invoice/CostCenterMatch)
(def BuCode invoice/BuCode)
(def ServiceCategory invoice/ServiceCategory)
(def TaxIssue invoice/TaxIssue)
(def FraudFlagType invoice/FraudFlagType)
(def FraudSeverity invoice/FraudSeverity)
(def FraudFlag invoice/FraudFlag)
;; For now, StructuredData = InvoiceData. Will become :multi dispatch after all type schemas exist.
(def StructuredData invoice/InvoiceData)
Step 3: Run existing tests to verify nothing broke
Run: cd orcha && clojure -M:test -m kaocha.runner -- --focus com.getorcha.workers.ingestion.extraction-test
Expected: All tests pass (schema imports still resolve, InvoiceData validates same as old StructuredData)
Also run: cd orcha && clojure -M:test -m kaocha.runner -- --focus com.getorcha.workers.ingestion.classification-test
Expected: All tests pass
Step 4: Commit
git add orcha/src/com/getorcha/schema/invoice/structured_data.clj orcha/src/com/getorcha/schema/structured_data.clj
git commit -m "refactor: move invoice schema to schema/invoice/structured_data.clj

Original structured_data.clj becomes a dispatch layer that re-exports
shared types for backwards compatibility."
Files:
orcha/src/com/getorcha/schema/purchase_order/structured_data.clj
Step 1: Write the PO schema
Namespace: com.getorcha.schema.purchase-order.structured-data
(ns com.getorcha.schema.purchase-order.structured-data
"Schema for LLM-extracted purchase order structured data."
(:require [com.getorcha.schema.common :as common]))
(def ^:private POLineItem
[:map
[:description :string]
[:article-code [:maybe :string]]
[:quantity [:maybe number?]]
[:unit [:maybe :string]]
[:unit-price [:maybe number?]]
[:amount [:maybe number?]]
[:tax-rate [:maybe number?]]
[:expected-delivery-date [:maybe :string]] ;; ISO 8601
[:page-location [:tuple :int :int]]])
(def PurchaseOrderData
"Schema for LLM-extracted purchase order structured data."
[:map
;; Classification (from common)
[:document-type [:= "purchase-order"]]
[:document-description [:maybe :string]]
[:confidence common/Confidence]
[:missing-fields [:maybe [:vector :string]]]
;; Core
[:po-number [:string {:min 1}]]
[:po-date [:maybe :string]] ;; ISO 8601
[:currency [:maybe :string]]
[:total-value [:maybe number?]]
[:status [:maybe :string]] ;; e.g., "confirmed", "pending", "cancelled"
;; Parties
[:buyer [:maybe common/Recipient]]
[:supplier common/Issuer]
;; Line items
[:line-items [:maybe [:vector POLineItem]]]
;; Logistics
[:expected-delivery-date [:maybe :string]]
[:delivery-address [:maybe :string]]
[:delivery-country [:maybe :string]]
[:incoterm-code [:maybe :string]]
[:incoterm-place [:maybe :string]]
[:shipping-terms [:maybe :string]]
;; Commercial
[:payment-terms [:maybe :string]]
[:discount-terms [:maybe :string]]
[:validity-date [:maybe :string]] ;; ISO 8601
;; References
[:contract-reference [:maybe :string]]
[:requisition-number [:maybe :string]]
;; Approval
[:authorized-by [:maybe :string]]
[:approval-date [:maybe :string]]]) ;; ISO 8601
Step 2: Verify compile
Run: cd orcha && clojure -M:dev -e "(require 'com.getorcha.schema.purchase-order.structured-data)"
Expected: No errors
Step 3: Commit
git add orcha/src/com/getorcha/schema/purchase_order/structured_data.clj
git commit -m "feat: add purchase order structured data schema"
Files:
orcha/src/com/getorcha/schema/contract/structured_data.clj
Step 1: Write the contract schema
Namespace: com.getorcha.schema.contract.structured-data
(ns com.getorcha.schema.contract.structured-data
"Schema for LLM-extracted contract structured data."
(:require [com.getorcha.schema.common :as common]))
(def ^:private ContractType
[:enum "service" "supply" "lease" "nda" "framework" "other"])
(def ^:private RenewalType
[:enum "auto" "manual" "none"])
(def ^:private PaymentScheduleEntry
[:map
[:description :string]
[:date [:maybe :string]]
[:amount [:maybe number?]]])
(def ^:private VariableComponent
[:map
[:description :string]
[:formula [:maybe :string]]])
(def ^:private Penalty
[:map
[:description :string]
[:amount-or-percentage [:maybe :string]]])
(def ContractData
"Schema for LLM-extracted contract structured data."
[:map
;; Classification
[:document-type [:= "contract"]]
[:document-description [:maybe :string]]
[:confidence common/Confidence]
[:missing-fields [:maybe [:vector :string]]]
;; Core
[:contract-number [:maybe :string]]
[:title [:maybe :string]]
[:contract-type [:maybe ContractType]]
[:effective-date [:maybe :string]]
[:expiration-date [:maybe :string]]
[:currency [:maybe :string]]
[:total-value [:maybe number?]]
;; Parties
[:party-a [:maybe common/ContractParty]]
[:party-b [:maybe common/ContractParty]]
;; Terms
[:payment-schedule [:maybe [:vector PaymentScheduleEntry]]]
[:payment-terms [:maybe :string]]
[:renewal-type [:maybe RenewalType]]
[:renewal-notice-period [:maybe :string]]
[:termination-notice-period [:maybe :string]]
[:termination-conditions [:maybe :string]]
;; Scope
[:description [:maybe :string]]
[:deliverables [:maybe [:vector :string]]]
[:slas [:maybe [:vector :string]]]
;; Financial
[:base-fee [:maybe number?]]
[:variable-components [:maybe [:vector VariableComponent]]]
[:price-escalation [:maybe :string]]
[:penalties [:maybe [:vector Penalty]]]
;; References
[:po-references [:maybe [:vector :string]]]
[:predecessor-contract [:maybe :string]]
;; Legal
[:governing-law [:maybe :string]]
[:jurisdiction [:maybe :string]]
[:liability-cap [:maybe :string]]
[:insurance-requirements [:maybe :string]]
[:confidentiality [:maybe :boolean]]])
Step 2: Verify compile
Run: cd orcha && clojure -M:dev -e "(require 'com.getorcha.schema.contract.structured-data)"
Expected: No errors
Step 3: Commit
git add orcha/src/com/getorcha/schema/contract/structured_data.clj
git commit -m "feat: add contract structured data schema"
Files:
orcha/src/com/getorcha/schema/grn/structured_data.clj
Step 1: Write the GRN schema
Namespace: com.getorcha.schema.grn.structured-data
(ns com.getorcha.schema.grn.structured-data
"Schema for LLM-extracted goods received note structured data."
(:require [com.getorcha.schema.common :as common]))
(def ^:private ItemCondition
[:enum "accepted" "damaged" "short"])
(def ^:private QualityAssessment
[:enum "pass" "partial" "fail"])
(def ^:private GRNLineItem
[:map
[:description :string]
[:quantity-ordered [:maybe number?]]
[:quantity-received [:maybe number?]]
[:quantity-rejected [:maybe number?]]
[:unit [:maybe :string]]
[:condition [:maybe ItemCondition]]
[:rejection-reason [:maybe :string]]
[:page-location [:tuple :int :int]]])
(def GRNData
"Schema for LLM-extracted goods received note structured data."
[:map
;; Classification
[:document-type [:= "goods-received-note"]]
[:document-description [:maybe :string]]
[:confidence common/Confidence]
[:missing-fields [:maybe [:vector :string]]]
;; Core
[:grn-number [:maybe :string]]
[:receipt-date [:maybe :string]]
[:receiving-location [:maybe :string]]
;; References
[:po-reference [:maybe :string]]
[:delivery-note-number [:maybe :string]]
[:shipping-reference [:maybe :string]]
;; Parties
[:supplier [:maybe common/Issuer]]
[:receiver-name [:maybe :string]]
[:receiver-department [:maybe :string]]
;; Line items
[:line-items [:maybe [:vector GRNLineItem]]]
;; Logistics
[:carrier [:maybe :string]]
[:delivery-date [:maybe :string]]
[:delivery-method [:maybe :string]]
;; Inspection
[:inspector-name [:maybe :string]]
[:inspection-date [:maybe :string]]
[:inspection-notes [:maybe :string]]
[:quality-assessment [:maybe QualityAssessment]]
;; Sign-off
[:received-by [:maybe :string]]
[:approved-by [:maybe :string]]])
Step 2: Verify compile
Run: cd orcha && clojure -M:dev -e "(require 'com.getorcha.schema.grn.structured-data)"
Expected: No errors
Step 3: Commit
git add orcha/src/com/getorcha/schema/grn/structured_data.clj
git commit -m "feat: add goods received note structured data schema"
structured_data.clj: Multi-Schema Dispatch
Files:
orcha/src/com/getorcha/schema/structured_data.clj
Step 1: Update to use Malli :multi dispatch
Replace the temporary re-export with a proper multi-dispatch schema:
(ns com.getorcha.schema.structured-data
"Dispatch schema for all document types.
Routes to type-specific schemas based on :document-type."
(:require [com.getorcha.schema.invoice.structured-data :as invoice]
[com.getorcha.schema.purchase-order.structured-data :as po]
[com.getorcha.schema.contract.structured-data :as contract]
[com.getorcha.schema.grn.structured-data :as grn]
[malli.core :as m]))
;; Re-exports for backwards compatibility (used by post_process.clj, validation.clj, etc.)
(def AccountMatch invoice/AccountMatch)
(def AccrualMatch invoice/AccrualMatch)
(def VatValidation invoice/VatValidation)
(def CostCenterMatch invoice/CostCenterMatch)
(def BuCode invoice/BuCode)
(def ServiceCategory invoice/ServiceCategory)
(def TaxIssue invoice/TaxIssue)
(def FraudFlagType invoice/FraudFlagType)
(def FraudSeverity invoice/FraudSeverity)
(def FraudFlag invoice/FraudFlag)
(def StructuredData
"Multi-schema that dispatches on :document-type.
Validates invoice, purchase-order, contract, or goods-received-note data
against their type-specific schemas."
(m/schema
[:multi {:dispatch :document-type}
["invoice" invoice/InvoiceData]
["purchase-order" po/PurchaseOrderData]
["contract" contract/ContractData]
["goods-received-note" grn/GRNData]]))
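To make the dispatch behavior concrete, here is a small standalone sketch that assumes Malli is on the classpath. `ToyInvoice`, `ToyPO`, and `ToyStructuredData` are hypothetical stand-ins for the real type-specific schemas, reduced to one field each.

```clojure
(require '[malli.core :as m])

;; Toy stand-ins for the real type-specific schemas.
(def ToyInvoice
  [:map
   [:document-type [:= "invoice"]]
   [:invoice-number :string]])

(def ToyPO
  [:map
   [:document-type [:= "purchase-order"]]
   [:po-number :string]])

(def ToyStructuredData
  (m/schema
   [:multi {:dispatch :document-type}
    ["invoice" ToyInvoice]
    ["purchase-order" ToyPO]]))

(m/validate ToyStructuredData {:document-type "invoice" :invoice-number "INV-001"})
;; => true

;; Dispatches to ToyPO, which requires :po-number.
(m/validate ToyStructuredData {:document-type "purchase-order" :invoice-number "INV-001"})
;; => false

;; No branch is registered for "other", so validation fails.
(m/validate ToyStructuredData {:document-type "other"})
;; => false
```

The last case applies to the real schema as well: it has no "other" branch, so data with `:document-type "other"` will not validate against `StructuredData` as written.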
Step 2: Run existing tests
Run: cd orcha && clojure -M:test -m kaocha.runner -- --focus com.getorcha.workers.ingestion.extraction-test
Expected: All tests pass (invoice data still validates through the :multi dispatch)
Step 3: Commit
git add orcha/src/com/getorcha/schema/structured_data.clj
git commit -m "feat: structured_data.clj now dispatches to type-specific schemas

Uses Malli :multi on :document-type to validate invoice, purchase-order,
contract, and goods-received-note data against their respective schemas."
schema/document.clj: Add Document Types
Files:
orcha/src/com/getorcha/schema/document.clj
Step 1: Update Type enum
Change lines 7-8 from:
(def Type
[:enum :invoice])
To:
(def Type
[:enum :invoice :contract :purchase-order :goods-received-note :other])
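Note that this enum uses keywords, while the `:document-type` values in the structured-data schemas and in the PostgreSQL enum are strings. The sketch below, which assumes Malli is on the classpath, shows why a conversion is needed at the boundary. Where and how the real code performs that conversion is not covered by this plan.

```clojure
(require '[malli.core :as m])

(def Type
  [:enum :invoice :contract :purchase-order :goods-received-note :other])

;; A string from structured data or the DB must become a keyword first.
(m/validate Type (keyword "purchase-order"))
;; => true

;; The raw string is not a member of the keyword enum.
(m/validate Type "purchase-order")
;; => false

;; Going the other way, e.g. before writing to the DB:
(name :goods-received-note)
;; => "goods-received-note"
```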
Step 2: Commit
git add orcha/src/com/getorcha/schema/document.clj
git commit -m "feat: add new document types to schema/document.clj Type enum"
Files:
orcha/src/com/getorcha/workers/ingestion/classification.clj
orcha/test/com/getorcha/workers/ingestion/classification_test.clj
Step 1: Write test for non-invoice classification returning result
Add to classification_test.clj — test that classify! returns classification for non-invoices instead of throwing:
;; This test will initially fail because classify! currently throws for non-invoices
(deftest test-classify-returns-non-invoice-types
(testing "classify! returns classification for contract (does not reject)"
;; This test mocks the LLM call to return a contract classification
;; and verifies classify! returns the result instead of throwing
))
Note: The classify! function requires DB + LLM, so the key change is in the unit-testable parse-classification-result function. Existing tests already cover parsing for all types (lines 59-93 of classification_test.clj). The main change is in classify! itself.
Step 2: Update classify! to return all types
In classification.clj, replace lines 169-179:
(if (= "invoice" (:document-type classification))
;; Invoice - return classification and stats
{:classification classification
:stats stats}
;; Non-invoice - reject
(throw (ex-info "Document rejected: not an invoice"
{:kind ::rejection
:rejection-reason :not-invoice
:document-type (:document-type classification)
:description (:invoice-description classification)
:confidence (:confidence classification)})))
With:
{:classification classification
:stats stats}
Step 3: Add :document-description key to parse-classification-result
In parse-classification-result (line 129), add a :document-description key alongside :invoice-description:
{:document-type document-type
:invoice-subtype (when (= "invoice" document-type)
(if (invoice-subtypes subtype)
subtype
"standard-invoice"))
:invoice-description (:description result)
:document-description (:description result) ;; new, type-agnostic alias
:confidence (or (:confidence result) "medium")}
Step 4: Update classification docstring
Update the ns docstring and classify! docstring to reflect that non-invoices are no longer rejected.
Step 5: Run classification tests
Run: cd orcha && clojure -M:test -m kaocha.runner -- --focus com.getorcha.workers.ingestion.classification-test
Expected: All tests pass
Step 6: Commit
git add orcha/src/com/getorcha/workers/ingestion/classification.clj orcha/test/com/getorcha/workers/ingestion/classification_test.clj
git commit -m "feat: classification returns all document types, stops rejecting non-invoices

Add :document-description as type-agnostic alias for :invoice-description."
Files:
orcha/src/com/getorcha/workers/ingestion/extraction.clj
Step 1: Add PO extraction prompt
Add a new defmethod workers/-prompt :extraction-purchase-order with a comprehensive extraction prompt for purchase orders. Follow the same structure as the invoice prompt (:extraction) but with PO-specific fields matching the PurchaseOrderData schema.
The prompt should:
Use the ${text} substitution variable
Step 2: Add PO extraction multimethod
(defmethod structured-data "purchase-order"
[{:keys [db-pool llm-config] :as _context}
{{:keys [text]} :transcription-result :keys [document] :as _ingestion}]
(let [started-at (java.time.Instant/now)
extraction-cfg (:extraction llm-config)
legal-entity-id (:document/legal-entity-id document)
prompt (workers/legal-entity-prompt db-pool legal-entity-id :extraction-purchase-order {:text text})
llm-generation (llm/generate extraction-cfg prompt)
ended-at (java.time.Instant/now)]
{:data (-> (:text llm-generation)
llm/parse-json-response
normalize-issuer-iban)
:stats (-> llm-generation
(dissoc :text)
(assoc :started-at started-at
:ended-at ended-at))}))
Step 3: Verify compile
Run: cd orcha && clojure -M:dev -e "(require 'com.getorcha.workers.ingestion.extraction)"
Expected: No errors
Step 4: Commit
git add orcha/src/com/getorcha/workers/ingestion/extraction.clj
git commit -m "feat: add purchase order extraction prompt and multimethod"
Files:
orcha/src/com/getorcha/workers/ingestion/extraction.clj
Step 1: Add contract extraction prompt
Add defmethod workers/-prompt :extraction-contract with a comprehensive prompt for contracts. Fields should match ContractData schema.
Step 2: Add contract extraction multimethod
(defmethod structured-data "contract"
[{:keys [db-pool llm-config] :as _context}
{{:keys [text]} :transcription-result :keys [document] :as _ingestion}]
(let [started-at (java.time.Instant/now)
extraction-cfg (:extraction llm-config)
legal-entity-id (:document/legal-entity-id document)
prompt (workers/legal-entity-prompt db-pool legal-entity-id :extraction-contract {:text text})
llm-generation (llm/generate extraction-cfg prompt)
ended-at (java.time.Instant/now)]
{:data (-> (:text llm-generation)
llm/parse-json-response)
:stats (-> llm-generation
(dissoc :text)
(assoc :started-at started-at
:ended-at ended-at))}))
Step 3: Commit
git add orcha/src/com/getorcha/workers/ingestion/extraction.clj
git commit -m "feat: add contract extraction prompt and multimethod"
Files:
orcha/src/com/getorcha/workers/ingestion/extraction.clj
Step 1: Add GRN extraction prompt
Add defmethod workers/-prompt :extraction-grn with a comprehensive prompt for goods received notes. Fields should match GRNData schema.
Step 2: Add GRN extraction multimethod
(defmethod structured-data "goods-received-note"
[{:keys [db-pool llm-config] :as _context}
{{:keys [text]} :transcription-result :keys [document] :as _ingestion}]
(let [started-at (java.time.Instant/now)
extraction-cfg (:extraction llm-config)
legal-entity-id (:document/legal-entity-id document)
prompt (workers/legal-entity-prompt db-pool legal-entity-id :extraction-grn {:text text})
llm-generation (llm/generate extraction-cfg prompt)
ended-at (java.time.Instant/now)]
{:data (-> (:text llm-generation)
llm/parse-json-response)
:stats (-> llm-generation
(dissoc :text)
(assoc :started-at started-at
:ended-at ended-at))}))
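The three new extraction methods differ only in the prompt key, plus the PO method's extra `normalize-issuer-iban` step. One possible way to factor out the shared timing and stats handling is sketched below. The helper name `timed-extraction` and its injected `generate-fn` and `parse-fn` arguments are hypothetical. In the real namespace they would roughly correspond to `llm/generate` applied to the extraction config and to `llm/parse-json-response`.

```clojure
(require '[clojure.edn :as edn])

(defn timed-extraction
  "Calls generate-fn with prompt and returns {:data ... :stats ...} in the
  same shape as the structured-data methods. generate-fn must return a map
  containing :text; parse-fn turns that text into data."
  [generate-fn parse-fn prompt]
  (let [started-at (java.time.Instant/now)
        generation (generate-fn prompt)
        ended-at   (java.time.Instant/now)]
    {:data  (parse-fn (:text generation))
     :stats (-> generation
                (dissoc :text)
                (assoc :started-at started-at
                       :ended-at ended-at))}))

;; Usage with stubs standing in for the LLM call and the JSON parser.
(def result
  (timed-extraction
   (fn [_prompt] {:text "{:grn-number \"GRN-001\"}" :input-tokens 42})
   edn/read-string
   "prompt text"))

(:data result)
;; => {:grn-number "GRN-001"}
(get-in result [:stats :input-tokens])
;; => 42
```

Injecting the two functions keeps the sketch runnable without the project's LLM namespace. Whether such a refactor is worthwhile is a judgment call for the implementer.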
Step 3: Commit
git add orcha/src/com/getorcha/workers/ingestion/extraction.clj
git commit -m "feat: add goods received note extraction prompt and multimethod"
Files:
orcha/src/com/getorcha/workers/ingestion/validation.clj
Step 1: Add basic validation for purchase orders
(defmethod validate "purchase-order"
[structured-data]
(assoc structured-data
:validation-results
{:required-fields (if (and (not-empty (:po-number structured-data))
(not-empty (get-in structured-data [:supplier :name])))
{:status "pass"}
{:status "error"
:message "Missing required fields: po-number and/or supplier name"})}))
Step 2: Add basic validation for contracts
(defmethod validate "contract"
[structured-data]
(assoc structured-data
:validation-results
{:required-fields (if (or (not-empty (:contract-number structured-data))
(not-empty (:title structured-data)))
{:status "pass"}
{:status "error"
:message "Missing required fields: contract-number or title"})}))
Step 3: Add basic validation for GRN
(defmethod validate "goods-received-note"
[structured-data]
(assoc structured-data
:validation-results
{:required-fields (if (not-empty (:grn-number structured-data))
{:status "pass"}
{:status "error"
:message "Missing required field: grn-number"})}))
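As a standalone illustration of how these methods behave, the sketch below defines a hypothetical `validate-sketch` multimethod that dispatches on `:document-type`. The dispatch function of the project's real `validate` multimethod is not shown in this plan, so that part is an assumption. The method body is the GRN check from Step 3.

```clojure
;; Hypothetical stand-in for the project's `validate` multimethod.
(defmulti validate-sketch :document-type)

(defmethod validate-sketch "goods-received-note"
  [structured-data]
  (assoc structured-data
         :validation-results
         {:required-fields (if (not-empty (:grn-number structured-data))
                             {:status "pass"}
                             {:status "error"
                              :message "Missing required field: grn-number"})}))

(-> (validate-sketch {:document-type "goods-received-note" :grn-number "GRN-001"})
    (get-in [:validation-results :required-fields :status]))
;; => "pass"

;; nil and the empty string both count as missing, since not-empty returns nil.
(-> (validate-sketch {:document-type "goods-received-note" :grn-number ""})
    (get-in [:validation-results :required-fields :status]))
;; => "error"
```

An "error" status here is what the updated trigger looks for when it sets `needs_human_review`.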
Step 4: Run validation tests
Run: cd orcha && clojure -M:test -m kaocha.runner -- --focus com.getorcha.workers.ingestion.validation-test
Expected: All existing tests pass
Step 5: Commit
git add orcha/src/com/getorcha/workers/ingestion/validation.clj
git commit -m "feat: add basic validation for purchase-order, contract, goods-received-note

Initial required-field checks only. Financial math validation is
invoice-specific and not applicable to new types."
Files:
orcha/src/com/getorcha/workers/ingestion/fraud_detection.clj
Step 1: Add no-op implementations
(defmethod detect "purchase-order"
[_context ingestion]
ingestion)
(defmethod detect "contract"
[_context ingestion]
ingestion)
(defmethod detect "goods-received-note"
[_context ingestion]
ingestion)
Step 2: Run fraud detection tests
Run: cd orcha && clojure -M:test -m kaocha.runner -- --focus com.getorcha.workers.ingestion.fraud-detection-test
Expected: All existing tests pass
Step 3: Commit
git add orcha/src/com/getorcha/workers/ingestion/fraud_detection.clj
git commit -m "feat: add no-op fraud detection for new document types"
Files:
orcha/src/com/getorcha/workers/ingestion/post_process.clj
Step 1: Add no-op implementations
(defmethod run "purchase-order"
[_context ingestion]
ingestion)
(defmethod run "contract"
[_context ingestion]
ingestion)
(defmethod run "goods-received-note"
[_context ingestion]
ingestion)
Step 2: Commit
git add orcha/src/com/getorcha/workers/ingestion/post_process.clj
git commit -m "feat: add no-op post-processing for new document types"
Files:
orcha/src/com/getorcha/workers/ingestion.clj
Step 1: Remove the ::classification/rejection catch clause
In job-handler (around lines 574-593), remove the entire cond branch that handles ::classification/rejection:
;; REMOVE THIS ENTIRE BLOCK:
(= ::classification/rejection (:kind data))
(let [{:keys [document-type description]} data]
(log/info "Document rejected" data)
(notifications/notify! ...)
(delete-document! ...)
(delete-message!))
Step 2: Update classify! docstring
Update the docstring at lines 292-294 to remove mention of rejection:
(defn ^:private classify!
"Classifies document type. Returns ingestion with :classification added."
[context ingestion]
...)
Step 3: Run all tests
Run: cd orcha && clojure -M:test -m kaocha.runner
Expected: All tests pass
Step 4: Commit
git add orcha/src/com/getorcha/workers/ingestion.clj
git commit -m "feat: remove non-invoice document rejection from ingestion pipeline

All document types now flow through the full pipeline: classify, extract,
validate, fraud-detect, post-process, complete."
Step 1: Run full test suite
Run: cd orcha && clojure -M:test -m kaocha.runner
Expected: All tests pass
Step 2: Verify schema dispatch works
Run a REPL check:
(require '[malli.core :as m])
(require '[com.getorcha.schema.structured-data :as sd])
;; Invoice data should validate
(m/validate sd/StructuredData {:document-type "invoice" :invoice-number "INV-001" ...})
;; PO data should validate
(m/validate sd/StructuredData {:document-type "purchase-order" :po-number "PO-001" ...})
;; Contract data should validate
(m/validate sd/StructuredData {:document-type "contract" :contract-number "CTR-001" ...})
;; GRN data should validate
(m/validate sd/StructuredData {:document-type "goods-received-note" :grn-number "GRN-001" ...})
Step 3: Verify DB migration
psql -U postgres -d orcha -c "SELECT enum_range(NULL::document_type);"
Expected: {invoice,contract,purchase-order,goods-received-note,other}
Step 4: Final commit (if any remaining changes)
git add -A
git commit -m "chore: final cleanup for document management part 1"