Collaborative annotations — Domain model
Purpose
Wiki-browser today is a read-only viewer over the orcha repo's .md and .html files. We want to extend it into a system where humans hold conversations about a document, and an AI agent owns the lifecycle of the document itself — refactoring it when humans reach consensus, and tailoring its presentation to different audiences on demand.
This document is a domain model, not a feature spec. It fixes the vocabulary and the boundaries between sub-features so each can be specified independently without drifting out of alignment with the others. Every subsequent spec under this initiative should reference these terms.
Companion: decisions & parking lot — a living log of cross-cutting decisions made during sub-project brainstorms, and parking-lot items destined for future ones. Read it at the start of any sub-project brainstorm under this initiative.
The Agent is a Claude Code instance that the harness invokes on demand; it is not an API call to a model provider and not a bespoke fine-tune. This domain model assumes Claude Code because the initiative is a personal project built on top of it. In practice, "what the Agent should do" translates 1:1 into Claude Code skills, prompts, and tool access; see the Agent skill noted under each sub-project below.
Vocabulary
| Term | Meaning |
|---|---|
| Document | The logical unit the system maintains. Not a file. Has exactly one Source and zero or more Perspectives. Topics belong here. |
| Source | The canonical, authoritative content of a Document — markdown or HTML. The "truth" the Agent edits when a Topic is Incorporated. |
| Perspective | An audience-tailored rendering of a Source — e.g. engineer, CFO, ops. Generated and maintained by the Agent. Read-mostly; does not feed back into the Source. |
| Topic | A discussion thread attached to a Document, written between humans. Lives at the Document level; surfaces in every Perspective. The closest analogue is a Google Doc comment thread. The Agent maintains a Topic's Anchor but does not (in the initial version) reply inside it. |
| Anchor | The Agent-maintained mapping from a Topic to the relevant region(s) of the Source and each Perspective. Semantic, not a character range — it survives content edits, including the Agent's own rewrites. A Topic may also be anchor-less if no region maps cleanly; those surface in a document-level UI instead of inline. |
| Resolution | A Topic's terminal state: either Incorporated (Agent applies the discussion's outcome to the Source, refreshes affected Perspectives, re-anchors any surviving open Topics) or Discarded. |
| Participant | A human who acts inside a Topic — proposes ideas, replies, decides Resolution. (The Agent is not a Topic participant in the initial version; see Out of scope.) |
| Agent | The AI process that owns the Document — concretely, a Claude Code instance the harness invokes on demand. Maintains Anchors, generates and refreshes Perspectives, and rewrites the Source on Incorporation. Does not chat inside Topics. |
| Harness | The wiki-browser surface (UI + server) that hosts Documents, Topics, the Agent, and the humans collaborating with them. |
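
The vocabulary above can be read as a small set of types. The following is a minimal sketch, not the storage schema (that is owned by sub-project #1). Every class, field, and method name here (`TopicState`, `intent`, `anchors`, `is_anchorless`) is a hypothetical chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class TopicState(Enum):
    OPEN = "open"
    INCORPORATED = "incorporated"  # terminal: outcome applied to the Source
    DISCARDED = "discarded"        # terminal: no Source change


@dataclass
class Anchor:
    # Semantic description of a region, not a character range.
    # `intent` is a hypothetical field name.
    intent: str


@dataclass
class Topic:
    topic_id: str
    messages: list[str] = field(default_factory=list)
    state: TopicState = TopicState.OPEN
    # One optional Anchor per rendering, keyed by "source" or an audience name.
    # No entries (or only None entries) means the Topic is anchor-less.
    anchors: dict[str, Optional[Anchor]] = field(default_factory=dict)

    def is_anchorless(self) -> bool:
        return all(anchor is None for anchor in self.anchors.values())


@dataclass
class Perspective:
    audience: str  # e.g. "engineer", "CFO", "ops"
    content: str


@dataclass
class Document:
    source: str  # exactly one Source (markdown or HTML)
    perspectives: dict[str, Perspective] = field(default_factory=dict)
    # Topics hang off the Document, never off a single Perspective.
    topics: list[Topic] = field(default_factory=list)
```

Keeping `topics` on `Document` rather than on `Perspective` is the point of the sketch: a Topic opened from any rendering is the same object everywhere.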
Object model
Three relationships in the object model do the heavy lifting:
- Source → Perspectives is one-way. Perspectives are derived from the Source — when the Source changes (e.g. on Incorporation), the Agent refreshes every Perspective. The reverse never happens: nothing in a Perspective propagates back into the Source.
- Topic ↔ Anchor is many-to-many across renderings. A single Topic anchors into the Source and into each Perspective independently — the Agent computes per-rendering anchors. A Topic with no clean mapping may be left anchor-less.
- Agent → Document is ownership, not collaboration. Humans never write to the Source directly; they propose changes through Topics, and the Agent decides how to apply them on Incorporation.
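
The one-way Source → Perspectives relationship can be sketched as a pure function: Perspectives are recomputed from the Source and there is no function going the other way. This is an illustrative sketch only; `refresh_perspectives`, `Render`, and `fake_render` are hypothetical names, and `fake_render` stands in for an Agent call that would really be model-generated.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the Agent's "generate a Perspective" skill:
# takes (source_text, audience) and returns the tailored rendering.
Render = Callable[[str, str], str]


def refresh_perspectives(source: str, audiences: List[str], render: Render) -> Dict[str, str]:
    """Derive every Perspective from the Source; nothing flows back."""
    return {audience: render(source, audience) for audience in audiences}


def fake_render(source: str, audience: str) -> str:
    # Placeholder for an Agent invocation (assumption, not the real skill).
    return f"[{audience}] {source}"


perspectives = refresh_perspectives("Q3 plan", ["engineer", "CFO"], fake_render)
```

Because the function only reads `source`, a Perspective can always be thrown away and regenerated without losing anything authoritative.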
Invariants
Rules that hold across every sub-feature. If a sub-spec breaks one of these, either the spec is wrong or this model needs to be updated explicitly.
- Topics belong to the Document. A Topic opened from the CFO's Perspective is the same Topic an engineer sees on the Source. There is no per-Perspective Topic.
- Source is the only authoritative content. Perspectives are derived from the Source and never feed back into it. Topic threads are conversation state, not document content — they live alongside the Document, not inside the Source.
- Only the Agent writes the Source. Humans propose; the Agent applies. This is what lets the system maintain anchors and Perspectives consistently across rewrites.
- Anchors are recomputed, not migrated. When the Source changes, the Agent re-anchors every still-open Topic from semantic intent — it does not patch character offsets. A Topic whose intent no longer maps to anything in the new Source surfaces as orphaned and waits for a human decision.
- Anchors are optional. The Agent may decide a Topic doesn't map cleanly to any region — for instance, a Topic about the document as a whole. Anchor-less Topics surface in a document-level UI (think Google Docs' "general comments") rather than as inline annotations. The Agent's anchoring routine returns "anchor-less" as a valid outcome.
- Resolution mutates the Source, then propagates. Incorporation: rewrite Source → refresh affected Perspectives → re-anchor surviving open Topics. Discard: no Source change; Topic moves to a closed state and stops occupying anchor space.
- Perspectives never carry conversation state. Anything stateful (a Topic, a draft, a Resolution decision) lives at the Document level.
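
The Resolution ordering and the three anchoring outcomes described in these invariants can be sketched as follows. This is a sketch under stated assumptions, not the design for sub-project #4: the `Agent` callables are hypothetical stand-ins for Claude Code skills, all names are illustrative, and the sketch refreshes every Perspective rather than working out which ones are affected.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class AnchorOutcome(Enum):
    ANCHORED = "anchored"
    ANCHORLESS = "anchor-less"  # valid result, e.g. a Topic about the whole document
    ORPHANED = "orphaned"       # intent no longer maps to the new Source; needs a human


@dataclass
class Topic:
    topic_id: str
    intent: str          # hypothetical: what the thread is about
    state: str = "open"  # "open", "incorporated", or "discarded"
    outcome: Optional[AnchorOutcome] = None


@dataclass
class Document:
    source: str
    perspectives: dict[str, str] = field(default_factory=dict)
    topics: list[Topic] = field(default_factory=list)


@dataclass
class Agent:
    # Each callable is a hypothetical stand-in for an Agent skill.
    rewrite: Callable[[str, Topic], str]
    render: Callable[[str, str], str]
    anchor: Callable[[str, Topic], AnchorOutcome]


def incorporate(doc: Document, topic: Topic, agent: Agent) -> None:
    # 1. Only the Agent writes the Source.
    doc.source = agent.rewrite(doc.source, topic)
    topic.state = "incorporated"
    # 2. Refresh Perspectives from the new Source (all of them, in this sketch).
    for audience in doc.perspectives:
        doc.perspectives[audience] = agent.render(doc.source, audience)
    # 3. Recompute anchors for surviving open Topics from intent, not offsets.
    for other in doc.topics:
        if other.state == "open":
            other.outcome = agent.anchor(doc.source, other)


def discard(topic: Topic) -> None:
    # No Source change; the Topic closes and stops occupying anchor space.
    topic.state = "discarded"
    topic.outcome = None
```

Note that `incorporate` never patches an existing anchor: step 3 asks the Agent for a fresh outcome per open Topic, which is what lets "anchor-less" and "orphaned" fall out as ordinary results.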
Sub-projects
Each row gets its own spec → plan → implementation cycle. Order is foundation-first; later sub-projects assume earlier ones.
| # | Sub-project | What it nails down |
|---|---|---|
| 1 | Document model & persistence | What a Document is in storage; the relationship between Source, Perspectives, and the existing repo files; versioning. Resolves the in-repo-files vs. harness-managed-store question. Blocks everything. |
| 2 | Topic core: data model + anchoring | Topic schema, message thread, state machine, and the semantic-anchor mechanism (including the anchor-less outcome). Minimal UI. The anchor algorithm is the technically hardest piece and deserves isolated thought. Agent skill: re-anchor Topics for a given Source. |
| 3 | Agent runtime & harness invocation | How the wiki-browser server spawns or talks to the Claude Code instance, what context it ships, how results come back, and where the agent's skills/prompts live in the repo. Foundational for sub-projects #4 and #5. |
| 4 | Topic resolution & incorporation | The Resolve flow, the Agent's Source-rewrite job, Perspective refresh, re-anchoring surviving Topics. Failure modes here (bad rewrites, lost Topics) need their own design. Agent skill: apply a Topic's outcome to the Source. |
| 5 | Perspectives | Generation policy, refresh triggers, switching UI, how a Perspective references the Source. Independent enough to specify alone; depends on #1 and #3. Agent skill: generate/refresh a Perspective for a given audience. |
| 6 | Real-time collaboration mechanics | Presence, simultaneous editing of Topic threads, notifications, conflict handling. Cross-cutting; designed once and reused. |
| 7 | Identity & permissions | Wiki-browser has no auth today. Covers who can create Topics, who can resolve them, and who can override the Agent. Surprisingly load-bearing: required before any of this is multi-user. |
| 8 | Wiki-browser UI integration | How Topics and Resolution UI render inside the existing chrome and iframe-content split. The current architecture constrains options here. Perspective UI is split off to its own sub-project (the successor to #5). |
| 9 | Batched incorporation | Allow one Agent rewrite + one commit to incorporate N Topics together (today: one Topic per rewrite). Touches #4's schema (composite FKs, CHECKs, stale checks), wb-agent surface, wb-incorporate skill, and the batch-selection UI in the chrome. Agent skill: incorporate N Topics' outcomes into the Source in one pass. |
Out of scope (initial version)
These are reasonable extensions but are explicitly not part of the first build. Calling them out so future specs don't accidentally treat them as requirements.
- Agent participation in Topic discussions. No @-mentioning the Agent, no Agent replying inside a Topic thread, no "ask the Agent" affordance. Topics are human-to-human. The Agent acts on a Topic only at Resolution time, on a human's request to Incorporate.
- Cross-Document Topics. A Topic is scoped to one Document.
- Direct human edits to the Source. All Source changes go through a Topic + Incorporation. If a human wants to fix a typo, they open a Topic for it.
Cross-cutting open questions
These span multiple sub-projects. They aren't resolved here — but every sub-spec should know they exist and either resolve them in scope or explicitly defer.
- In-repo files (git as VCS) vs. a harness-managed store. Affects offline edits, the wiki-browser's "browse the repo" identity, and the Agent's write authority. Owned by sub-project #1.
- One Agent per Document, or one Agent serving many Documents? Affects context, cost, and how Claude Code sessions are managed. Owned by #3.
- Perspectives as cached artifacts vs. generated on every request. Affects latency and cost. Owned by #5.