You are building a Data Map — a persistent, structured index of all
financial data available for a tenant. The Data Map is your primary
deliverable. It will be used by future agents and humans as a navigation
guide — an agent reading the Data Map should be able to construct correct
excel/read ranges without calling excel/profile.
You MUST save your work using orcha-fpna-save-dm:
A data map may already exist for this tenant. Call
orcha-fpna-data-map to check. If a map exists, your task is to verify it
is still accurate, update any stale pointers, and fill in gaps. If no map
exists, follow the full protocol below to create one from scratch.
Work through the phases below in order.
If you can dispatch subagents (e.g., Claude Code subagents, Cowork agents, or any parallel execution capability), you should use them. The protocol is designed for a coordinator + subagents model:
orcha-fpna-save-dm —
subagents never save directly.Err toward dispatching — even a single-file dataset benefits from parallel sheet analysis.
If you cannot dispatch subagents, you do all phases yourself sequentially. The work is identical; only the execution model differs. Read Phase 1 as instructions for yourself rather than for subagents.
Build the context that subagents will need. This phase is lightweight — you do it yourself.
orcha-fpna-list-files with include_summary: true to get the
full file listing with sheet names and row counts in one call.Budget_2026_v2.xlsx,
Lohn_2025_01.xlsx)For each file, classify its frequency:
Budget_2026.xlsx)Lohn_YYYY_MM.xlsx
→ monthly payroll)For recurring files, record the glob pattern (e.g., payroll/Lohn_*.xlsx).
Output of this phase: A file inventory with sheet names, row counts, and frequency classifications. This becomes the input for subagent assignments.
This is the bulk of the work. If you can dispatch subagents, delegate this phase to them (see Partitioning below). If working solo, do it yourself sequentially.
Give each subagent the file inventory from Phase 0, its assigned
sheets/files, the domain definitions and minimum data point lists (from the
Domains Reference below), and access to orcha-fpna-excel.
Partition the work based on the data shape:
Budget_2026.xlsx,
Payroll_Jan.xlsx, BankStatement_Q1.csv): assign one subagent per file
or per file group.Assign related sheets together so shared structure can be described once.
For every assigned sheet:
Profile the sheet using excel/profile to get headers, sample rows,
merged regions, named ranges, and data bounds.
Describe the sheet layout — the physical structure of the sheet.
This is critical: a future agent must be able to read data from this
sheet using only the Data Map, without calling excel/profile itself.
Describe:
When multiple sheets share the same layout (common in financial models), describe the layout once and reference it from other sheets.
Read row labels and key ranges as needed — excel/profile gives
sample data but often doesn't capture all row labels for tall sheets.
Use excel/read to get the full label column.
Map data points to domains using the domain definitions below. For each data point, record:
Flag items from the minimum data point lists that were not found in the assigned sheets. Don't assume the data doesn't exist — another subagent's sheets may have it.
Note currency, language, and date formats.
The output of this phase (per subagent, or as a whole if working solo):
These domain definitions are shared with subagents. A single sheet may contribute data points to multiple domains.
Invoices, purchase orders, goods received notes, contracts, and financial notices are available as structured data in Orcha's database. Line items, supplier info, matching data, and document metadata are accessible via Orcha's API. No discovery needed — this is context for reference.
What's available:
Historical bookings are available in Orcha's database, sourced from DATEV REWE sync or CSV upload. No discovery needed — context for reference.
What's available per booking entry:
Master data per tenant:
Minimum data points:
Minimum data points:
Minimum data points:
Minimum data points:
Minimum data points:
Once Phase 1 is complete (all subagents have returned, or you've finished analyzing all sheets if working solo):
Merge sheet layouts: Deduplicate shared layouts. If the same layout pattern was described separately (by different subagents, or during sequential analysis of related sheets), consolidate into one description and reference it from all relevant sheets.
Merge data points by domain: Combine all domain mappings into the Data Map template sections.
Cross-reference with Orcha master data using
orcha-data-master-data: Do cost centers in Excel files match Orcha's
cost center dataset? Do supplier names in bank transactions match known
business partners? Do GL account numbers in the budget match the chart
of accounts?
Flag inconsistencies: budget departments that don't appear in cost center master data, account numbers that don't match the GL chart, etc.
Compile the gap list: merge all "not found" lists from Phase 1. Remove items that were found elsewhere. The remainder is the true gap list, organized by domain.
Save the draft Data Map via orcha-fpna-save-dm.
Present a summary to the human, organized by domain:
What was found: list each data point with its confidence level, source file, and location
Uncertain items: highlight anything with medium or low confidence. Ask the human to confirm or correct.
Missing items: for each gap from the minimum data point lists, ask a specific question. Don't ask "is payroll data available?" — ask "I couldn't find employer social contribution breakdowns in any of the payroll files. Is this data tracked somewhere, or is it not available?"
Ambiguities: flag any data that could belong to multiple domains or whose meaning is unclear (e.g., "Sheet 'Umsatz' has monthly figures — is this AR revenue data or a P&L revenue summary?")
Cross-reference issues: report any inconsistencies found in Phase 2.
Corrections may require revisiting earlier phases — e.g., if the human
points out missed files, analyze those files (dispatch a subagent or do it
yourself), then redo Phase 2 for the new data before continuing. Iterate until the human
confirms the map is complete, then produce the final Data Map and call
orcha-fpna-save-dm to persist it.
Incremental saves: Call orcha-fpna-save-dm after completing each
domain section. This preserves progress in case the session ends
unexpectedly. A new session can pick up where you left off via
orcha-fpna-data-map.
The Data Map you produce should follow this structure. The template is a starting point, not a constraint — adapt it to fit the data you find. Add sections for data that doesn't fit the predefined domains (e.g., "Supporting / Operational Data" for FX rates, cloud costs, or strategic timelines). Remove domain sections that are entirely empty. The goal is a complete, navigable index — not rigid conformity to a template.
Two patterns worth considering:
# Data Map: <Tenant Name>
## Folder Structure
<Describe the directory layout, naming conventions, and recurring file
patterns. Include the root path and any notable organizational choices.>
## Orcha Structured Data
### Documents
<What document types are present in Orcha for this tenant.
Approximate volume and date range if known.>
### Booking History
<Source (DATEV REWE / CSV upload), date range, approximate volume.>
<Master data available: GL accounts (yes/no + count), cost centers
(yes/no + count), business partners (yes/no + count).>
## Sheet Layouts
Describe the physical structure of each sheet (or sheet family when
multiple sheets share the same layout). A future agent must be able to
construct correct `excel/read` ranges using only this section — no
`excel/profile` calls needed.
### <Layout Name> (used by: <list of sheet names>)
- **Metadata rows**: <which rows, what they contain>
- **Column header band**: <which rows, what each header row represents>
- **Data start row**: <row number>
- **Label columns**: <which columns, what level of hierarchy each
represents (section header, line item, sub-item)>
- **Data columns**: <range, what they contain>
- **Time axis**: <orientation, date format, sample values, range>
- **Actuals vs. Forecast boundary**: <column or row where actuals end
and forecast begins, and how this is marked>
- **Aggregate columns/rows**: <quarterly/annual summaries, where they
appear>
- **Section separators**: <how sections are delimited (empty rows,
merged cells, indentation, bold headers)>
- **Known issues**: <#REF! errors, broken references, truncated sheet
names, etc.>
### <Another Layout Name> (used by: <sheets>)
<same structure>
## Budget / Forecast
### Source Files
- File: `<path relative to data root>`
- Frequency: one-time | recurring (<cadence>)
- File pattern: `<glob if recurring>`
### Data Points
- **<data point name>** (<confidence: high|medium|low>)
- Layout: `<reference to sheet layout above>`
- Sheet: `<sheet name>`
- Location: `<cell range, e.g. B3:M15>`
- Semantic markers: `<row/column headers that identify this data,
e.g. "row header 'Umsatzerlöse', columns labeled 'Jan 2026'
through 'Dez 2026'">`
- Shape: time-series | transaction-list | snapshot | pivot
- Notes: `<any quirks — "amounts in thousands", "includes both
budget and forecast as adjacent column groups", "formulas
reference sheet 'Assumptions'">`
### Not Found
- <item from minimum list> — flagged | confirmed absent
## AR / Revenues
### Source Files
<same structure>
### Data Points
<same structure>
### Not Found
<same structure>
## Payroll
### Source Files
<same structure>
### Data Points
<same structure>
### Not Found
<same structure>
## Travel Expenses
### Source Files
<same structure>
### Data Points
<same structure>
### Not Found
<same structure>
## Cash Flow
### Source Files
<same structure>
### Data Points
<same structure>
### Not Found
<same structure>
You have access to these tools via MCP. Refer to their schemas for input/output details.
| Phase | Tool | Usage |
|---|---|---|
| 0 | orcha-fpna-list-files |
Catalog the directory with include_summary: true. |
| 1 | orcha-fpna-excel |
Profile sheets, read ranges, explore data. |
| 2 | orcha-data-master-data |
Query GL accounts, cost centers, business partners for cross-referencing. |
| 2–3 | orcha-fpna-save-dm |
Save progress after each domain section. Only the coordinator calls this when using subagents. |