Status: Draft for review
Date: 2026-05-20
Related: 2026-04-15-unified-processors-design.md, 2026-04-13-document-diagnostics-design.md
Rework how Orcha performs legal analysis of contracts so that it produces a materially better, jurisdiction-aware assessment that scales beyond Germany.
Goals:
key-obligations (stays an extracted fact for now).

- Extraction (workers/ap/ingestion/extraction.clj, :extraction-contract): a single LLM
  call extracts fields and runs a hardcoded EU compliance block
  (GDPR / NIS2 / DORA) plus key-obligations.
- Schema (schema/contract/structured_data.clj) defines compliance-checks
  and risk-flags; locale is only inferred for date parsing. There is no
  jurisdiction abstraction.
- IProcessor protocol (§3 of the unified-processors spec), where post-processors declare
  reads/writes/diagnostic/modes, run -compute (often an LLM call via
  ai.prompts/tenant-prompt), and write to a diagnostic slice. Notably
  tax-compliance already varies its prompt by tenant country, so a
  jurisdiction-varying LLM post-processor is an established pattern, not new.

The German legal-skills repo (Klotzkette/claude-fuer-deutsches-recht) ships
Claude Code skills (SKILL.md files). We are not running those skills at
runtime. We use prompts — specifically composable prompt modules in Orcha's
own repo — and treat the skills as source material we mine to author those
modules (their license is MIT/Apache, so this is permitted with attribution).
Rationale:
tenant_prompt_customization path,
schema-validated structured output, and unit/eval testing.

If Orcha later builds an interactive contract assistant (chat/Q&A over a contract), skills may be the right tool there. That is out of scope here.

Two stages, mirroring AP extraction → post-processors:
Contract ingestion
  │
  ├─ Stage 1. Extraction (:extraction-contract, slimmed)
  │     → UNIVERSAL FACTS + a first-class CLAUSE INVENTORY
  │       (no good/bad/missing judgment; EU compliance block REMOVED)
  │
  └─ Stage 2. Legal Analysis (NEW :contract-legal-analysis IProcessor)
        → resolve jurisdiction (code)
        → compose BASE + jurisdiction layers + contract-type module (+ tenant additions)
        → one LLM call → schema-validated findings
        → write :legal diagnostic slice

Extraction stays jurisdiction-agnostic and makes no assessment. Its job is "what's in this contract," now including a first-class clause inventory using one canonical clause taxonomy (shared vocabulary for both stages):
:clauses
[{:type     :liability-cap               ; canonical taxonomy: :termination :auto-renewal
  :text     "<verbatim clause text>"     ;   :confidentiality :indemnification
  :summary  "<one-line plain summary>"   ;   :data-protection :price-escalation
  :location "<section / page anchor>"}   ;   :sla :governing-law :payment-terms …
 …]
…alongside the scalar facts it already extracts (parties incl. which side is the company/principal, dates, value/currency, renewal type, notice periods, payment terms, etc.). The previously-inline EU compliance analysis is removed from this prompt.
From the company's (principal/tenant) perspective, the analysis produces:
favorable / unfavorable / neutral, with rationale.

analysis_prompt =
    BASE                 (how to assess, output contract, RDG-safe framing)
  + JURISDICTION layers  (additive; see 6.2)
  + CONTRACT-TYPE module
  + tenant additions     (existing tenant_prompt_customization)

Module texts are versioned resources in the repo, editable without code:
resources/com/getorcha/legal/
base.md
layers/{generic,eu,de}.md
types/{nda,saas,service-supply,lease-rental,loan,insurance,framework,other}.md
prompt-version is recorded in the output :meta for regression tracking.
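
As a rough illustration of the composition order, the sketch below joins already-loaded module texts into one prompt. The function name `compose-prompt`, the shape of the `modules` map, and the fallback to the `other` type module are assumptions made for this example; loading the files from resources/com/getorcha/legal/ is left out.

```clojure
(require '[clojure.string :as str])

(defn compose-prompt
  "Joins module texts in the documented order: BASE, jurisdiction layers,
  contract-type module, then optional tenant additions. Blank parts are skipped.
  `modules` is a hypothetical map of already-loaded module texts."
  [{:keys [base layers types]} layer-keys contract-type tenant-additions]
  (->> (concat [base]
               (map layers layer-keys)                       ; layers in given order
               [(get types contract-type (:other types))     ; assumed fallback to `other`
                tenant-additions])
       (remove str/blank?)
       (str/join "\n\n")))

;; Placeholder texts standing in for the real .md modules.
(def modules
  {:base   "BASE"
   :layers {:generic "GENERIC" :eu "EU" :de "DE"}
   :types  {:saas "SAAS" :other "OTHER"}})

(compose-prompt modules [:generic :eu :de] :saas nil)
;; => "BASE\n\nGENERIC\n\nEU\n\nDE\n\nSAAS"
```

Keeping composition a pure function over loaded texts would make it straightforward to unit-test separately from resource loading and the LLM call.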
- generic
- + eu (GDPR/AVV Art. 28, DORA Art. 30, NIS2)
- + de (BGB §§305–310 AGB control, §309 Nr. 9 auto-renewal validity)

So a German SaaS contract receives generic + eu + de + saas; a US SaaS contract
receives generic + saas. Adding a country later = add a layer file; nothing else
changes.
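
The additive layering can be pictured as a small lookup table. This is only a sketch: the names `layer-chain` and `modules-for` are hypothetical, and the table lists just the layers named in this document.

```clojure
(def layer-chain
  "Additive layer chain per resolved jurisdiction (illustrative)."
  {:generic [:generic]
   :eu      [:generic :eu]
   :de      [:generic :eu :de]})

(defn modules-for
  "Returns the ordered module keys for a resolved jurisdiction and contract type.
  Unknown jurisdictions fall back to the generic layer only."
  [jurisdiction contract-type]
  (conj (get layer-chain jurisdiction [:generic]) contract-type))

(modules-for :de :saas) ;; => [:generic :eu :de :saas]
(modules-for :us :saas) ;; => [:generic :saas]
```

Under this shape, adding a country would mean one new layer file plus one new entry in the table.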
resolve-jurisdiction(structured-data, tenant) → {:resolved :layers :basis :confidence}
with precedence:

1. governing-law / jurisdiction clause → country/region
2. principal + counterparty countries (DE present → DE; both EU → EU)
3. tenant country (tenant.company-country)
4. generic

Deterministic and unit-tested in isolation.

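
A minimal sketch of that precedence follows. It assumes the inputs are already normalized to lowercase country keywords; the keys `:governing-law-country`, `:principal-country`, `:counterparty-country` and `:company-country`, the basis strings, and the `:medium` / `:low` confidence values are all hypothetical. Only `:high` for a governing-law clause appears in this document's example output.

```clojure
(def eu-countries
  "EU-27 member states as lowercase ISO 3166-1 alpha-2 keywords."
  #{:at :be :bg :hr :cy :cz :dk :ee :fi :fr :de :gr :hu :ie :it
    :lv :lt :lu :mt :nl :pl :pt :ro :sk :si :es :se})

(def layers
  {:generic [:generic] :eu [:generic :eu] :de [:generic :eu :de]})

(defn- country->jurisdiction [country]
  (cond
    (= :de country)        :de
    (eu-countries country) :eu
    :else                  :generic))

(defn- result [resolved basis confidence]
  {:resolved resolved :layers (layers resolved) :basis basis :confidence confidence})

(defn resolve-jurisdiction
  "Applies the precedence: governing-law clause, party countries,
  tenant country, then generic."
  [{:keys [governing-law-country principal-country counterparty-country]}
   {:keys [company-country]}]
  (let [parties (remove nil? [principal-country counterparty-country])]
    (cond
      governing-law-country
      (result (country->jurisdiction governing-law-country) "governing-law clause" :high)

      (some #{:de} parties)
      (result :de "party countries: DE present" :medium)

      (and (= 2 (count parties)) (every? eu-countries parties))
      (result :eu "party countries: both EU" :medium)

      company-country
      (result (country->jurisdiction company-country) "tenant company-country" :low)

      :else
      (result :generic "fallback" :low))))

(resolve-jurisdiction {:governing-law-country :de} {})
;; => {:resolved :de, :layers [:generic :eu :de], :basis "governing-law clause", :confidence :high}
```

Party countries that match neither rule in step 2 (for example two non-EU parties) fall through to the tenant country here; that is one reading of the precedence, not something the design states.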
:legal diagnostic slice

All three finding kinds share one shape so UI and tests handle them uniformly:
:legal
{:jurisdiction {:resolved :de
:layers [:generic :eu :de]
:basis "governing-law clause: 'Recht der Bundesrepublik Deutschland'"
:confidence :high}
:clause-assessments
[{:clause-type :auto-renewal
:verdict :unfavorable ; :favorable | :unfavorable | :neutral
:severity :warning ; :critical | :warning | :info
:rationale "12-month auto-renewal, 3-month notice — long lock-in for the company."
:citations [{:label "§ 309 Nr. 9 BGB" :note "term/auto-renewal limits in AGB"}]
:suggestion "Negotiate shorter renewal term or notice window."
:clause-ref "<id/anchor of the extracted clause>"}]
:missing-clauses
[{:clause-type :data-protection
:severity :critical
:rationale "SaaS processing personal data without an Art. 28 GDPR DPA/AVV."
:citations [{:label "Art. 28 DSGVO"}]
:suggestion "Add a data-processing agreement (AVV)."}]
:compliance ; replaces the hardcoded GDPR/NIS2/DORA block
[{:regime :gdpr :status :warning :findings [...]}
{:regime :dora :status :not-applicable}]
:summary {:overall :needs-attention ; :ok | :needs-attention | :high-risk
:counts {:unfavorable 2 :missing 1 :critical 1}
:headline "2 unfavorable clauses, 1 missing DPA."}
:meta {:prompt-version "legal-v1" :model "..." :analyzed-at "..."}}
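
If the :counts in :summary were derived in code rather than produced by the model, the derivation could look like the sketch below. The function name `summary-counts` is hypothetical, and counting :critical across both clause assessments and missing clauses is an assumption.

```clojure
(defn summary-counts
  "Derives the :counts map from a :legal slice.
  :critical is counted across clause assessments and missing clauses (assumption)."
  [{:keys [clause-assessments missing-clauses]}]
  {:unfavorable (count (filter #(= :unfavorable (:verdict %)) clause-assessments))
   :missing     (count missing-clauses)
   :critical    (count (filter #(= :critical (:severity %))
                               (concat clause-assessments missing-clauses)))})

(summary-counts
 {:clause-assessments [{:clause-type :auto-renewal :verdict :unfavorable :severity :warning}]
  :missing-clauses    [{:clause-type :data-protection :severity :critical}]})
;; => {:unfavorable 1, :missing 1, :critical 1}
```

Deriving the counts this way would keep them consistent with the findings by construction, leaving only the headline wording to the model.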
- LegalAnalysis schema enforces the shape; the LLM fills the judgment. Invalid output → handled as an analysis error (tested).
- :citations reference and (for assessments) a :clause-ref back to the Stage-1 clause.
- compliance-checks and the unused risk-flags.

- Extraction: auto-renewal ("verlängert sich um 12 Monate, Kündigungsfrist 3 Monate"), liability-cap, no data-protection clause.
- Analysis (generic + eu + de + saas): auto-renewal → unfavorable + §309 Nr. 9 BGB validity risk; liability-cap present → favorable; missing AVV/DPA (Art. 28 GDPR) → critical.

New processor :contract-legal-analysis, registered next to
validations/fraud/tax-compliance:
| Method | Behavior |
|---|---|
| -id | :contract-legal-analysis |
| -reads | contract clauses, parties (+countries), governing-law, jurisdiction, contract-type, dates, value, renewal/notice (dispatches on state → contracts only) |
| -writes | [] (diagnostic-only) |
| -diagnostic | {:slice :legal} |
| -modes | #{:ingestion :edit} |
| -always? | false |
| -compute | resolve-jurisdiction → compose-prompt → LLM (tenant-prompt) → parse → validate vs LegalAnalysis → {:result … :stats …} |
| -apply-ops | [] |

compose-legal-prompt(layers, contract-type) is registered as
defmethod ai.prompts/-prompt :contract-legal-analysis, so it picks up
per-tenant additions automatically through tenant-prompt.

A German "Prüfung" that applies statutes to a customer's specific contract can edge toward a regulated Rechtsdienstleistung (RDG). The BASE module therefore instructs the model to produce informational risk signals with references for the user's own review, in hedged language, not legal conclusions or advice. A standing disclaimer is stored with the output. The exact wording should get a counsel sign-off before launch.

The prompt modules are authored by mining (not running) these skills:
| Module | Mined from |
|---|---|
| layers/de | vertragsrecht/{vertragspruefung, agb-pruefung, vertragsverlaengerungs-monitor, saas-msa-pruefung, nda-pruefung} |
| layers/eu | datenschutzrecht/avv-pruefung (Art. 28), regulatorisches-recht/dora-ikt-vertragspruefung (Art. 30), NIS2 |
| layers/generic | universal contract-review heuristics (parties, value, renewal/notice, liability, risk) |
| types/* | nda-pruefung, saas-msa-pruefung, lieferantenvertrag-pruefung, plus fachanwalt-* norm appendices (it/versicherung/bank/transport) per type |

- LegalAnalysis; malformed output → analysis error, with a test for the rejection path.
- clause-type + verdict/kind + severity + citation present), tolerant to wording, e.g. "DE SaaS without DPA → missing data-protection, critical." Track corpus pass-rate as the quality metric. Reuse the dev/snapshots harness.
- prompt-version; re-run the golden evals and diff findings before shipping.
- :legal slice.
- :clauses inventory in extraction; keep existing fields.
- :contract-legal-analysis processor + modules + LegalAnalysis schema, writing the :legal slice (shadow, not yet displayed).
- :extraction-contract.
- compliance-checks consumers to read from :legal; retire the dead risk-flags.
- :legal, separate later effort.
- :basis/:confidence; low confidence can downgrade severities or surface a "confirm jurisdiction" hint.
- key-obligations becomes an assessed finding later.
- :legal in the UI (list badge from :summary.overall, detail panel).