Topic resolution & incorporation — Design
Problem
The collaborative-annotations initiative needs a way for a Topic to reach its terminal state. Sub-projects #1 and #2 fixed the data shape — proposals are an append-only log, Topics carry commit_sha / discarded_at outcomes, anchors are data-orcha-anchor markers stamped into Source — and #3 shipped the subprocess runtime that can invoke the Agent. Three holes remain:
- The wb-incorporate skill exists only as a scaffold with a <REWRITE CONTRACT OWNED BY #4> placeholder. No actual prompt body tells the Agent what to do.
- There is no HTTP surface to start a proposal job, fetch its diff, approve it, or discard a Topic. The only Topic-mutating endpoints today are message append and Topic creation.
- The diff review gate from #1's "commit-immediately with UI review" decision has no implementation: Tier 1 (unified) and Tier 2 (rendered side-by-side) are unbuilt.
Without these, a Topic is a one-way thread that can never close.
Goals & non-goals
Goals
- Fill in the wb-incorporate rewrite contract: how the Agent produces a proposed Source given a Topic + thread + every other open Topic on the same Source.
- Ship the seven HTTP endpoints needed for the Resolve flow, all behind collaborator auth from #7.
- Ship Tier 1 (unified) and Tier 2 (side-by-side rendered) diff views as decided in #1.
- Add a post-job invariant check that catches silent anchor drops without imposing arbitrary caps on prompt context.
- Have the Agent attach a short natural-language explanation to each proposal so humans can verify it understood the discussion before reading the diff.
- Simplify wb-agent's surface to match its single consumer (the skill): no redundant or defensive flags.
- Reshape the prompt the Agent receives so it reads as if a human asked the Agent to help, not as if a system is feeding it parameters. The bootstrap minimum is three values (Job ID, Config path, wb-agent path); everything else the Agent discovers on its own.
Non-goals
- Perspective refresh on Incorporation. Owned by #5. #4 lands the Source commit and stops there.
- Resolve UI chrome. Button styling, modal layouts, the affordance for adding rework feedback, the diff toggle widget — all #8. #4 ships functional endpoints + minimal client wiring so #8 has something to skin.
- Push notifications. v1 polls agent_jobs and the proposals list. SSE is #8.
- Per-Document personas / Perspective switcher in the Resolve view. #5 + #8.
- New topic_messages.kind values. #4 introduces no new kinds. Rework feedback rides on existing 'human' messages (see the Decision on "no rework concept in the data model" below); discard reasons likewise (see the Decision on "discard reason rides on kind='human'" below). No annotation columns on proposals either.
Approach
Three changes, layered on the existing #1–#3 substrate:
- Slim and shape the wire contract for one consumer. wb-agent's only caller is the skill. Each subcommand keys on whatever it naturally operates on: get-topic and insert-proposal on --job-id; list-open-topics on --source-path + --exclude-topic. Every subcommand still takes --config=<path>. The prompt the skill receives drops everything derivable: only Job ID, Config path, and wb-agent path remain, framed as a human asking the Agent for help (not as a harness shipping parameters). Proposal rows keep the producing job ID for audit and precise UI joins.
- Fill in the rewrite contract. Replace the <REWRITE CONTRACT OWNED BY #4> marker in wb-incorporate/SKILL.md with a body that (a) applies the Topic's discussion to the Source, (b) drops the incorporated Topic's marker, (c) places at least one data-orcha-anchor marker for every other open non-global Topic on the Source, naturally where each maps and otherwise in an "Other ideas (potentially to discard)" parking-lot section at the bottom of the Source, (d) writes a 1–3 paragraph explanation of what it understood the Topic to be asking for and how its rewrite addresses it, persisted as the body of the agent-proposal message that already accompanies every proposal row.
- Add the Resolve flow. Seven HTTP endpoints exposing the operations the Resolve UI needs: enqueue a proposal job (idempotent per Topic), list proposals for a Topic, render the unified diff, serve the proposed Source through the existing render pipeline (Tier 2), approve a proposal (calls existing collab.Incorporate), discard a Topic, and poll a job's status (the existing endpoint from #3). All behind collaborator auth from #7.
Subsequent proposal requests are identical to the first. The only difference is that more messages have accumulated in the Topic's thread between attempts. The propose endpoint takes no body. Rework feedback is delivered through the existing POST /api/topics/{id}/messages endpoint from #2, not as an inline field on propose.
Why: the schema already supports threaded conversation and an append-only proposal log; carving out a separate "rework" path would duplicate both for no semantic gain. Future richer affordances (e.g. inline annotations on the diff) compose by writing more elaborate 'human' message bodies — no schema change required.
get-topic --config=<path> --job-id=<id> and insert-proposal --config=<path> --job-id=<id> are job-scoped. list-open-topics --config=<path> --source-path=<abs> --exclude-topic=<id> is source-scoped. The --id / --topic-id / --base-sha flags shipped in #3's scaffold are removed, along with any defensive validation that existed only to support them.
Why: the only caller is the skill, and the skill should ask the wire for what it naturally needs. list-open-topics is a question about a Source; get-topic and insert-proposal are scoped to the job. A single uniform flag would force semantic awkwardness ("ask about a source by passing a job") for marginal uniformity gain.
#2's "each open non-global Topic has at most one marker in its Source" relaxes to "at least one." A Topic can wrap multiple regions when its semantic intent naturally spans them. The post-job invariant checks for ≥1 occurrence of data-orcha-anchor="<id>" in the proposed Source, not exactly one. The parking-lot section stays as the safety valve when a Topic doesn't map to any region.
Why: some discussions actually span multiple regions (e.g., "rename foo everywhere"), and the renderer already handles overlapping/multi-occurrence anchors via data-topic-ids. Forcing one was conservative; the relaxation costs nothing in implementation and removes an artificial constraint on the Agent's expression.
Every proposal carries a 1–3 paragraph natural-language explanation of what the Agent understood the Topic to be asking for and how its rewrite addresses it. Stored as the body of the agent-proposal topic_messages row that already accompanies each proposal (per #2's kind enum). No schema change. The Resolve UI surfaces it above the diff.
Why: humans approving a rewrite need to know what the Agent thought they were asking for, not just what it produced. The diff shows the bytes; the explanation shows the intent. Without it, every disagreement requires reverse-engineering the prompt.
#4 adds migration 004: incorporation_proposals.agent_job_id TEXT REFERENCES agent_jobs(id), nullable for legacy/manual proposals and populated by wb-agent insert-proposal --job-id=… for Agent-authored rows. The column is not part of the Topic outcome FK; it is provenance and UI/debugging data. A failed post-exit invariant can leave behind a proposal row whose linked job is failed; the proposals list keeps it in history but does not expose it as approvable.
Why: the old design inferred proposal/job association from timestamps, which is brittle and loses the actual audit relationship. The producing job is already the natural scope for insert-proposal; persisting it makes "which run produced this revision?" a direct join instead of a guess.
If a collaborator opens a new Topic while a proposal job is in flight, or if the Agent silently drops an older Topic, the dangerous condition is the same: the proposal bytes do not contain a data-orcha-anchor="<topic-id>" marker for a currently open non-global Topic on that Source. So freshness checks compare the current open-Topic set to the proposed bytes directly. At read time and approval time, the server verifies that every current open non-global Topic on the Source, excluding the Topic being incorporated, appears at least once in proposal.proposed_source. Missing markers make the proposal stale with stale_reasons including "missing_topic_markers" and a missing_topic_ids list.
Why: timestamps are an indirect proxy for the real invariant and fail under same-second creation, delayed list-open-topics calls, retries, and Agent omissions. Marker-set staleness tests the approval safety condition itself: no open Topic should be stranded by committing the proposal. The SHA guard remains separate as stale_reasons including "source_sha"; if both checks fail, the response returns both reasons.
Design
User-visible state machine
The Resolve UI surfaces seven derived states on an open Topic, all read from DB rows plus current Source/proposal bytes. No new state column.
| State | Condition | UI affordances |
|---|---|---|
| no-proposal | No incorporation_proposals row for the Topic, no running/queued agent_jobs row. | "Propose rewrite" · Discard |
| generating | Latest agent_jobs row for this Topic has status IN ('queued','running'). | Spinner; client polls GET /api/agent/jobs/{id}. Other actions disabled. |
| proposal-fresh | Latest proposal (highest revision_number) exists, is approvable (agent_job_id IS NULL or the linked job is succeeded), base_source_sha matches the current Source SHA on disk, and every current open non-global Topic on this Source except the incorporated Topic has at least one marker in proposed_source. | Diff · Approve · Propose rewrite (regenerate) · Discard |
| proposal-stale | Latest proposal exists, is approvable by job status, and at least one freshness invariant fails: base_source_sha ≠ current Source SHA on disk, or one or more current open non-global Topics on this Source are missing from proposed_source. | Diff (with "stale" banner; reasons shown: source changed and/or missing topic markers) · Propose rewrite · Discard. Approve disabled. |
| job-failed | Latest agent_jobs row for this Topic is failed/timed_out; no fresh proposal landed. | Error toast with error_tail · Propose rewrite (retry) · Discard. Previous fresh/stale proposal (if any) remains visible. |
| incorporated | topics.commit_sha IS NOT NULL. | "Incorporated as <commit-sha>"; no actions. |
| discarded | topics.discarded_at IS NOT NULL. | "Discarded by <user> on <date>"; no actions. |
"Latest" everywhere in the table means the row with the highest revision_number for proposals, and the row with the highest started_at (then created_at as tiebreaker) for agent_jobs filtered to this Topic. A linked job gates only whether an Agent proposal is approvable at all; it is not used for freshness. Agent proposals whose linked job is failed/timed_out remain in history but are not approvable.
The proposal-fresh / proposal-stale distinction is detected at read time, not stored. The server computes fresh = (agent_job_id IS NULL OR linked job status == 'succeeded') and (SourceSHA(repo_root, source_path) == proposal.base_source_sha) and (all current open non-global Topic IDs on this Source, excluding the incorporated Topic, occur as data-orcha-anchor markers in proposed_source) when serving the proposals list. The same checks happen server-side at approval time before collab.Incorporate writes the file; the UI's display is purely advisory. The list response includes stale_reasons (["source_sha"], ["missing_topic_markers"], both, or empty) plus missing_topic_ids so the UI's banner can name the cause. Proposal rows from failed jobs stay in the DB for audit history but never surface as actionable, and Approve is gated server-side.
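The read-time check described above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the server's actual code: the Proposal struct, its field names, and the Approvable / StaleReasons / Fresh function names are hypothetical. Only the reason strings and the marker syntax come from this spec.

```go
package main

import (
	"fmt"
	"strings"
)

// Proposal carries only the fields the freshness check reads.
// The struct and its field names are hypothetical.
type Proposal struct {
	BaseSourceSHA  string
	ProposedSource string
	AgentJobID     string // "" stands in for NULL (legacy/manual proposal)
	JobStatus      string // status of the linked agent job, if any
}

// Approvable is the job gate only; it says nothing about freshness.
func Approvable(p Proposal) bool {
	return p.AgentJobID == "" || p.JobStatus == "succeeded"
}

// StaleReasons compares the proposal to the current Source SHA and to the
// current open non-global Topic IDs on the Source. The incorporated Topic is
// skipped because its marker is absent by contract.
func StaleReasons(p Proposal, currentSHA string, openTopicIDs []string, incorporatedID string) (reasons, missing []string) {
	if p.BaseSourceSHA != currentSHA {
		reasons = append(reasons, "source_sha")
	}
	for _, id := range openTopicIDs {
		if id == incorporatedID {
			continue
		}
		if !strings.Contains(p.ProposedSource, `data-orcha-anchor="`+id+`"`) {
			missing = append(missing, id)
		}
	}
	if len(missing) > 0 {
		reasons = append(reasons, "missing_topic_markers")
	}
	return reasons, missing
}

// Fresh mirrors the boolean returned in the proposals list.
func Fresh(p Proposal, currentSHA string, openTopicIDs []string, incorporatedID string) bool {
	reasons, _ := StaleReasons(p, currentSHA, openTopicIDs, incorporatedID)
	return Approvable(p) && len(reasons) == 0
}

func main() {
	p := Proposal{BaseSourceSHA: "abc", ProposedSource: `<span data-orcha-anchor="t2">x</span>`}
	reasons, missing := StaleReasons(p, "def", []string{"t1", "t2", "t3"}, "t1")
	fmt.Println(reasons, missing)
}
```

Running the same function at list time and at approval time keeps the advisory UI state and the server-side gate from drifting apart.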
HTTP surface
Seven endpoints, all behind auth.RequireCollaborator (the helper from #7). The API endpoints return application/json with the same error envelope as #2 ({ "error": "<code>" }); the preview route returns rendered HTML.
| Method · path | Body | Returns | Notes |
|---|---|---|---|
| POST /api/topics/{id}/proposals | empty | { job_id } · 202 (new) / 200 (existing) | Enqueues a wb-incorporate job. Idempotent per Topic: if the latest agent_jobs row for this Topic has status IN ('queued','running'), returns that job_id with 200 instead of creating a new row. Any other state (no prior job, succeeded, failed, timed_out) enqueues a fresh job; the propose-rewrite button on proposal-fresh and the retry button on job-failed rely on this. Jobs for different Topics on the same Source are allowed to queue; the per-Source agent queue (from #3) serializes them. 422 if the Topic is already incorporated/discarded. |
| GET /api/topics/{id}/proposals | — | [{ id, revision_number, base_source_sha, agent_job_id, job_status, fresh, stale_reasons, missing_topic_ids, created_at }, …] | Ordered by revision_number descending. fresh, stale_reasons, and missing_topic_ids computed at read time. Body omitted from list view. |
| GET /api/proposals/{id}/diff | — | { unified, base_sha, proposed_sha, fresh } | Tier 1 viewer source. unified is computed server-side with internal/diff (Go); no shell-out to git diff. 404 if proposal not found; 410 if Topic terminal. |
| GET /content/preview/proposals/{id} | — | text/html | Tier 2 right-iframe source. Renders the proposed-Source bytes through the existing internal/render pipeline at the same source_path namespace. Cache-Control: no-store. Left iframe uses the existing /content/{source_path}. |
| POST /api/proposals/{id}/incorporate | { subject?, body? } | { commit_sha, topic_id } | Calls existing collab.Incorporate. If subject is absent or empty, the server defaults it to "Incorporate Topic: <summary>", where <summary> is the Topic's first kind='human' message body with leading Markdown syntax (# , - , * , > ) trimmed, whitespace collapsed, then truncated to 60 Unicode runes (not bytes; Go's utf8.DecodeRuneInString preserves multibyte codepoints) and ellipsised with "…" if truncated. 409 with "stale_proposal" and { stale_reasons, missing_topic_ids } if the Source SHA drifted or if current open Topic markers are missing from the proposal bytes. 422 if proposal is for a terminal Topic or if its linked Agent job did not succeed. |
| POST /api/topics/{id}/discard | { reason? } | { discarded_at } | Pure DB transition (see "Discard flow" below). 422 if Topic is already terminal. |
| GET /api/agent/jobs/{id} | — | { id, kind, status, started_at, completed_at, exit_code, error_tail } | From #3; used here for the generating spinner. No body change. |
Approve, propose, and discard attribute to the request principal via users.id (the normalised email from #7). The bootstrap-operator path is already gone.
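The default-subject rule in the incorporate row can be sketched as below. This is a hedged illustration: the function name DefaultSubject is hypothetical, and two details are assumptions the spec does not pin down: leading markers are stripped repeatedly (so "> - text" loses both), and the "…" is appended after the 60 runes rather than counted inside them. The sketch converts to []rune instead of looping with utf8.DecodeRuneInString; both count codepoints, not bytes.

```go
package main

import (
	"fmt"
	"strings"
)

// DefaultSubject derives the commit subject from the Topic's first
// kind='human' message body. Name and exact edge-case behaviour are
// assumptions for illustration.
func DefaultSubject(firstHumanBody string) string {
	s := strings.TrimSpace(firstHumanBody)

	// Strip the four leading Markdown markers named in the spec, repeatedly.
	for {
		trimmed := false
		for _, prefix := range []string{"# ", "- ", "* ", "> "} {
			if strings.HasPrefix(s, prefix) {
				s = strings.TrimSpace(s[len(prefix):])
				trimmed = true
			}
		}
		if !trimmed {
			break
		}
	}

	// Collapse all whitespace runs (including newlines) to single spaces.
	s = strings.Join(strings.Fields(s), " ")

	// Truncate by runes so multibyte codepoints are never split.
	if r := []rune(s); len(r) > 60 {
		s = string(r[:60]) + "…"
	}
	return "Incorporate Topic: " + s
}

func main() {
	fmt.Println(DefaultSubject("# Rename   foo\neverywhere"))
}
```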
wb-incorporate rewrite contract
The skill replaces the <REWRITE CONTRACT OWNED BY #4> placeholder. The body is written in the voice of someone asking the Agent for help, not as system instructions. Below is the spec text; the actual SKILL.md mirrors it.
SKILL.md (body)

# What you're helping with

You're helping a small group of writers iterate on a shared document. They hold conversations about specific parts of the document (call those conversations Topics), and when they reach agreement, they ask you to translate the outcome into a concrete rewrite. The rewritten document is then reviewed and (usually) committed as the new version.

Vocabulary you'll see:
• Source: the canonical Markdown/HTML file being edited.
• Topic: a discussion thread attached to a region (or the whole) of the Source. Topics are open until they are incorporated or discarded.
• Anchor: an inline marker stamped into the Source as an HTML element with a data-orcha-anchor="<topic-id>" attribute. Tells the UI which region each open Topic is about. You maintain these.
• Proposal: the rewritten Source you produce, plus a short explanation of what you understood the Topic to be asking for.

# Inputs you have

The message that invoked this skill carries three values:

Job ID: <uuid>
Config path: <absolute path to wiki-browser.yaml>
wb-agent path: <absolute path to the wb-agent binary>

Always invoke wb-agent via the absolute path you were given. Always pass --config=<Config path>.

# Steps

1. Load the working set

Run:

<wb-agent path> get-topic --config=… --job-id=…

It returns the Topic you're working on with its anchor and the full message thread, the absolute path to the Source file, and the base_source_sha of the Source at this moment. Prior proposals you've made for this Topic appear inside the message thread as kind "agent-proposal", with their full proposed_source bodies inlined; read them as your own earlier thinking.

Then list the other open discussions on the same Source so you can keep them anchored:

<wb-agent path> list-open-topics --config=… --source-path=<abs> --exclude-topic=<current-id>

Pass the absolute source_path get-topic returned, and exclude the Topic you're working on. The response is every other open non-global Topic on that Source, each with its anchor and message thread.

Read the current Source via the Read tool at the absolute source_path.

2. Produce a rewrite

Apply the Topic's discussion to the Source. The discussion is authoritative; don't invent changes the humans didn't agree to. If the discussion is ambiguous or unresolved, prefer the most recent human messages and explicit decision markers ("yes", "approved", "let's go with X"); if still ambiguous, prefer the smallest change that reflects the conversation.

3. Re-anchor every other open non-global Topic

For every Topic returned by list-open-topics, the rewritten Source MUST contain at least one occurrence of:

data-orcha-anchor="<that-topic-id>"

How to place markers:
• If the Topic's intent still maps to a region of the new Source, wrap the relevant content. Inline anchors use <span data-orcha-anchor="<id>">…</span>; block-level anchors use <div data-orcha-anchor="<id>"></div> immediately preceding the block, separated by a blank line.
• If a Topic's intent naturally spans multiple regions, use multiple markers; one per region is fine.
• If a Topic's intent no longer maps cleanly to anything in the new Source, append it to a section titled exactly:

## Other ideas (potentially to discard)

at the very bottom of the Source. Each parked Topic gets at least one sub-bullet referencing the discussion, wrapped in a marker. If a parked discussion has multiple distinct sub-ideas to preserve, use multiple sub-bullets; each with its own marker is fine.
• Do NOT include any marker for the Topic you're incorporating. Its discussion outcome lives in the prose now; the marker would be a dead UUID in the committed Source.

4. Write a short explanation

In 1–3 paragraphs, summarise what you understood the Topic to be asking for and how your rewrite addresses it. This goes in the UI alongside the diff so the humans reviewing know what your thinking was. Plain prose; no Markdown headings.

5. Persist

Pipe the rewritten Source bytes on stdin to wb-agent, with the explanation inline as a flag:

<wb-agent path> insert-proposal --config=… --job-id=… \
  --explanation="$EXPLANATION" < rewritten-source

Build $EXPLANATION as a single shell variable so multi-line text and special characters pass safely as one argv value. No temp files.

Exit 0 on success. Exit non-zero on any unrecoverable error; the stderr you produce becomes the error message humans see in the UI.
Four invariants the prompt encodes:
- At least one marker per other open non-global Topic. Missing markers mean a Topic was silently dropped (the post-job invariant catches this).
- No marker for the incorporated Topic. Its outcome is in the prose now; the marker would be a dead UUID in committed Source.
- The parking-lot section is the safety valve. The Agent never has to drop a Topic to fit; it can always park unmappable ideas under ## Other ideas (potentially to discard).
- An explanation accompanies every proposal. Empty explanations fail the job; the human-facing review depends on this.
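The harness enforces the first, second, and fourth post-exit (see "Job-kind invariants" below). The parking-lot section has no check of its own; it is how the Agent satisfies the first invariant when a Topic no longer maps to a region.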
Diff viewer — Tier 1 (unified)
GET /api/proposals/{id}/diff computes the unified diff server-side using an in-process Go diff library (github.com/sergi/go-diff/diffmatchpatch and github.com/hexops/gotextdiff are both viable; the implementation plan picks one). No git diff --no-index shell-out. The response carries the diff text plus base_sha, proposed_sha, and fresh. The client renders it as a <pre> block with line-level +/− classes. The chrome (toolbar, tier toggle button) belongs to #8.
Diff viewer — Tier 2 (side-by-side rendered)
Two iframes side by side. The left iframe loads /content/{source_path} — the current Source rendered through the existing internal/render pipeline. The right iframe loads a new route:
internal/server/handler_preview.go
GET /content/preview/proposals/{id}
→ look up proposal by id (auth check)
→ load proposed_source bytes from the row
→ render through internal/render using source_path for relative-link
resolution (so images, links, mermaid all resolve identically to the
current Source)
→ Content-Type: text/html; Cache-Control: no-store
Same render pipeline as live Source means rich content (images, tables, SVG, mermaid, JS-driven knobs) works for free. Base-URL handling matches today's /content/ route because the path under the namespace is the same.
Anchor resolution on the preview. The preview route runs ResolveAnchors (the same post-render pass used by /content/{source_path}) in a render-only post-commit mode. It reads current open non-global Topics on the Source, excludes the incorporated Topic (its marker is gone from the bytes by contract), and keeps only Topics whose data-orcha-anchor="<id>" marker appears in proposed_source. For those Topics, the route passes a temporary {"kind":"marker"} anchor to ResolveAnchors even if the stored DB anchor is still pre-marker. The DB is not mutated by preview rendering. The result looks like the post-commit page would: re-anchored Topics highlight in place, the parking-lot section highlights its parked entries, and the incorporated Topic has no highlight because its outcome is now in the prose.
The preview iframe deliberately does not emit the <meta name="wb-source-sha"> tag the live /content/ route uses for Topic creation, so client-side Topic-composer code on the preview cannot mistake the preview for a live Source and create a Topic against bytes that aren't yet committed.
Default tier on first open: Tier 2 (rendered side-by-side). Tier 1 is a toggle for precision. Users approve content, not patches; the rendered view shows the actual user-facing change.
Mobile fallback for Tier 2. Below ~700px wide, side-by-side iframes are unusable. The recommended pattern is tabs (Current | Proposed) — same two URLs, one visible at a time. The API surface doesn't change; the breakpoint logic, tab chrome, and toggle widget belong to #8.
Discard flow
Discard is a pure SQLite transition. No Agent job, no Source write. The endpoint:
1. Validates the Topic exists and is non-terminal.
2. If reason is present and non-empty, appends a kind='human' message with that body, author_user_id = the discarding user, through the existing message-append path so the topic_messages.sequence allocation stays consistent. If reason is absent or empty, no message is appended; the bare outcome columns suffice as the audit trail.
3. Sets topics.discarded_at and topics.discarded_by.
4. Returns { discarded_at }.
Steps 2 and 3 run through the existing single-writer goroutine in one SQLite transaction.
Discard reason rides on kind='human', no new message kind
The discard reason is metadata about the closing action, not a continuation of the discussion, so a dedicated kind='topic-discarded' would be cleaner in the abstract. We reuse 'human' anyway: the discarding user is captured by topic_messages.author_user_id (and matches topics.discarded_by), the discard timestamp is captured by topics.discarded_at, and the thread display naturally ends with the human's reason as the last message — which is what a reader wants to see.
Why: #2's kind enum can grow when a concrete need surfaces, but this need is cosmetic, not structural. A new kind would force every reader of the thread (current and future) to special-case rendering for one row that already reads correctly as a human message. The non-goal against new topic_messages.kind values applies here on purpose, not by accident. If a future affordance needs to distinguish discard-reasons from regular messages (e.g., a "closing remark" badge), we can derive it from (topic_messages.sequence == max sequence) AND (topics.discarded_at IS NOT NULL) without changing the schema.
If the Topic had a marker-kind anchor, the data-orcha-anchor element is still in the committed Source after Discard. That is acceptable: the renderer treats markers for non-open Topics as inert (no highlight, no sidebar entry), so stale markers are invisible to readers. The next Incorporation on the same Source rewrites it cleanly because list-open-topics filters by open Topics only.
Removing the marker on Discard would either (a) make the harness write to Source — violating "only the Agent writes the Source" — or (b) spawn a no-decision Agent job, burning model budget for no user-visible benefit. Both fail the cost/benefit check.
Job-kind invariants for incorporate
#3 ships a single post-exit check ("a new incorporation_proposals row exists for the expected topic_id with created_at >= job.started_at"). #4 replaces the timestamp inference for Agent proposals with the explicit link from migration 004: a proposal row must exist with agent_job_id = job.id, then pass the three anchor/explanation invariants below. All checks run after wb-agent insert-proposal has committed the proposal row and its accompanying agent-proposal message to the DB but before the harness reports status='succeeded' on the job. Failure on any invariant flips the job to status='failed' with a descriptive error_tail; the proposal row stays (the proposal log is append-only) but the UI does not surface a failed proposal as fresh.
1. Every other open non-global Topic on the Source has at least one marker. For each Topic returned by list-open-topics at job start, the proposed Source must contain at least one occurrence of the literal substring data-orcha-anchor="<topic-id>". Missing occurrences fail the job with error_tail = "anchor invariant: topic <id> not stamped in proposal". Multiple occurrences are allowed (a Topic may span multiple regions).
2. The incorporated Topic's marker is absent. The proposed Source must not contain the literal substring data-orcha-anchor="<incorporated-topic-id>". Failure: error_tail = "anchor invariant: incorporated topic's marker leaked into proposal".
3. The accompanying agent-proposal message has a non-empty body. The Agent must attach an explanation. Empty body fails the job with error_tail = "explanation invariant: agent-proposal body is empty".
The checks run in internal/agent/service.go's post-exit step alongside the existing "did a row appear?" verification. Anchor checks read the proposal's proposed_source column and do plain substring matching on data-orcha-anchor="…" — counting literal substring occurrences, not parsing HTML. The marker syntax is fixed (Topic IDs are UUIDs, attribute uses double quotes), so substring matching is unambiguous. Explanation check is a simple LENGTH(TRIM(body)) > 0 against the agent-proposal message row.
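A minimal sketch of the three post-exit checks, assuming the harness already has the proposal bytes, the Topic IDs, and the explanation body in hand. The function name and parameter shapes are hypothetical; the error_tail strings are the ones given above. One caveat: strings.TrimSpace trims all Unicode whitespace, whereas SQL TRIM(body) trims spaces only, so this sketch is slightly stricter than the LENGTH(TRIM(body)) > 0 form.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// marker builds the literal substring the invariants search for.
func marker(topicID string) string {
	return `data-orcha-anchor="` + topicID + `"`
}

// CheckIncorporateInvariants returns nil when all three invariants hold, or an
// error whose message is the error_tail for the first one that fails.
// The name and signature are assumptions for illustration.
func CheckIncorporateInvariants(proposedSource, incorporatedID string, otherOpenIDs []string, explanation string) error {
	// 1. Every other open non-global Topic has at least one marker.
	for _, id := range otherOpenIDs {
		if strings.Count(proposedSource, marker(id)) < 1 {
			return fmt.Errorf("anchor invariant: topic %s not stamped in proposal", id)
		}
	}
	// 2. The incorporated Topic's marker must be gone.
	if strings.Contains(proposedSource, marker(incorporatedID)) {
		return errors.New("anchor invariant: incorporated topic's marker leaked into proposal")
	}
	// 3. The agent-proposal message body must be non-empty.
	if strings.TrimSpace(explanation) == "" {
		return errors.New("explanation invariant: agent-proposal body is empty")
	}
	return nil
}

func main() {
	src := `<span data-orcha-anchor="b">kept</span>`
	fmt.Println(CheckIncorporateInvariants(src, "a", []string{"b", "c"}, "Applied the rename."))
}
```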
No explicit cap on open Topics per Source. Earlier drafts considered capping the number of open Topics whose context the skill loads, to bound prompt size. The first invariant above replaces that — it catches the actual failure mode (silent drop) regardless of prompt size, so we don't need an arbitrary cap rejecting work that might fit. If runs start timing out on real Sources, #5 or a future iteration can revisit per-Topic message truncation.
Explanation length, unified contract. Three numbers appear in this spec and have caused confusion: "1–3 paragraphs" in the SKILL.md body is content guidance only — what the Agent should aim for so humans can use the explanation. The only hard constraints are #2's validateMessage rule (non-empty body, body length ≤ 64 KiB — applied to all topic_messages rows including kind='agent-proposal') and the Linux argv ceiling (~128 KiB on the wire from skill to wb-agent, well above 64 KiB). wb-agent insert-proposal validates the explanation against the same validateMessage the server uses, so the cap is enforced server-side regardless of whether the Agent's prompt followed the guidance. Failure: non-zero exit from wb-agent with the validator's error message; the harness reports status='failed' and the error_tail surfaces in the UI.
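The two hard constraints on the explanation can be sketched as a single validator. This is a hypothetical stand-in, not #2's validateMessage: its real signature and error text are not shown in this spec, so the name ValidateExplanation, the error messages, and the choice to treat whitespace-only bodies as empty are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// maxBodyBytes is the 64 KiB cap quoted in the spec.
const maxBodyBytes = 64 * 1024

// ValidateExplanation is an illustrative stand-in for the shared message
// validator. Error messages here are invented for the sketch.
func ValidateExplanation(body string) error {
	if strings.TrimSpace(body) == "" {
		return errors.New("body is empty")
	}
	// len on a string counts bytes, which is what a KiB cap measures.
	if len(body) > maxBodyBytes {
		return fmt.Errorf("body is %d bytes, exceeds %d", len(body), maxBodyBytes)
	}
	return nil
}

func main() {
	fmt.Println(ValidateExplanation("The Topic asked to rename foo; the rewrite does so."))
}
```

The "1–3 paragraphs" guidance deliberately has no counterpart here: it shapes what the Agent writes, not what the validator rejects.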
Auth wiring
All seven endpoints sit behind auth.RequireCollaborator. The request principal (users.id = normalised email) is used for:
- incorporation_proposals.proposed_by stays NULL for Agent-produced rows (per #3's migration 003).
- topics.incorporated_by = principal at approval time.
- topics.discarded_by = principal at discard time.
- topic_messages.author_user_id = principal for rework feedback messages.
- incorporation_attempts.approved_by = principal at approval time (already wired by #1's IncorporateInput.ApproverID).
Anonymous users get 401; authenticated non-collaborators get 403, per #7's envelope decision. API endpoints never redirect to Google; the preview content route returns HTML only after collaborator auth passes.
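The 401/403 split can be sketched as a small net/http middleware. This is not the real auth.RequireCollaborator from #7, whose signature is not shown here: the Principal type, the lookup callback, and the error codes "unauthenticated" and "forbidden" are assumptions. Only the status codes and the { "error": "<code>" } envelope come from this spec.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Principal is a hypothetical view of the authenticated user.
type Principal struct {
	Email        string
	Collaborator bool
}

// authStatus decides the response for a principal: 401 for anonymous,
// 403 for authenticated non-collaborators, 200 otherwise.
func authStatus(p *Principal) (int, string) {
	switch {
	case p == nil:
		return http.StatusUnauthorized, "unauthenticated"
	case !p.Collaborator:
		return http.StatusForbidden, "forbidden"
	default:
		return http.StatusOK, ""
	}
}

// RequireCollaborator wraps next and never redirects; it only writes the
// JSON error envelope or passes the request through.
func RequireCollaborator(lookup func(*http.Request) *Principal, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		status, code := authStatus(lookup(r))
		if status != http.StatusOK {
			w.Header().Set("Content-Type", "application/json")
			w.WriteHeader(status)
			json.NewEncoder(w).Encode(map[string]string{"error": code})
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	anonymous := func(*http.Request) *Principal { return nil }
	h := RequireCollaborator(anonymous, http.NotFoundHandler())
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest("GET", "/api/topics/x/proposals", nil))
	fmt.Println(rec.Code, rec.Body.String())
}
```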
All five mutating endpoints (POST /api/topics/{id}/proposals, POST /api/proposals/{id}/incorporate, POST /api/topics/{id}/discard, plus the message-append and topic-create endpoints from #2 that they coexist with) additionally require X-CSRF-Token per #7. The two GET endpoints (/api/topics/{id}/proposals, /api/proposals/{id}/diff) and the preview content route do not.
Schema migration 004
#4 adds one provenance column to the proposal log:
ALTER TABLE incorporation_proposals ADD COLUMN agent_job_id TEXT REFERENCES agent_jobs(id);
CREATE INDEX incorporation_proposals_agent_job ON incorporation_proposals(agent_job_id);
The column is nullable. Existing rows and any future manual/user-authored proposals can keep NULL; Agent-authored rows inserted by wb-agent insert-proposal --job-id=… must set it. The migration does not rebuild the table because SQLite allows adding a nullable FK column. Tests cover insertion with a valid job ID, rejection of a missing job ID when foreign keys are enabled, and legacy NULL rows.
Anchor JSON updates on Incorporation
The Agent's rewrite contract requires every other open non-global Topic on the Source to receive at least one data-orcha-anchor marker in the proposed Source. That stamps the new Source bytes correctly, but doesn't touch the corresponding topics.anchor JSON in SQLite — and #2's invariant is that any Topic whose anchor.kind == "pre-marker" resolves through byte offsets in the Source pinned by source_sha. Once Incorporation lands and that source_sha no longer matches the committed Source, those offsets refer to nothing useful; the locator must flip to marker (UUID-based) for resolution to keep working.
So Incorporation extends the SQLite transaction inside collab.CompleteIncorporation to also re-anchor the other open Topics. The harness:
1. Reads the open-Topic set on the Source at approval time (same query list-open-topics uses, minus the incorporated Topic), then filters it to Topic IDs whose marker appears in proposed_source. This is the post-stale-check set, so if any current open Topic is missing a marker, approval has already returned 409 stale_proposal.
2. Passes those Topic IDs to collab.CompleteIncorporation as a new ReanchorTopicIDs []string field on CompleteIncorporationInput.
3. Inside the existing transaction (so the commit-sha write, attempt completion, and anchor updates all succeed together or all roll back), runs:

UPDATE topics
SET anchor = '{"kind":"marker"}', updated_at = unixepoch()
WHERE id IN (<reanchor list>)
  AND discarded_at IS NULL
  AND commit_sha IS NULL
  AND json_extract(anchor, '$.kind') = 'pre-marker'

The WHERE filters guard against Topics that transitioned terminal in the time between read and write (concurrent Discard, concurrent Incorporation on a sibling Topic that somehow made it past the per-Source queue), and skip Topics already at marker kind (idempotent retry safety). Topics in the global-anchor kind aren't in the input set to begin with (list-open-topics excludes them).
The post-job invariant (every other open Topic has data-orcha-anchor="<id>" in the proposed Source) is the partner check: it guarantees that when the anchor JSON flips to marker, the marker that locator refers to actually exists in the committed bytes. Without both halves, the system silently corrupts anchor resolution the first time two Topics share a Source.
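The marker-set check can be pictured as a partition of the open Topic IDs by whether their marker attribute occurs in the proposed bytes. This is a minimal sketch that assumes a literal byte search for data-orcha-anchor="<id>" is sufficient; the real check may parse the Source instead, and partitionByMarker is a hypothetical name.

```go
package main

import (
	"bytes"
	"fmt"
)

// partitionByMarker (hypothetical helper) splits openTopicIDs into those whose
// marker attribute appears in proposedSource and those that are missing.
// Input order is preserved in both result slices.
func partitionByMarker(proposedSource []byte, openTopicIDs []string) (present, missing []string) {
	for _, id := range openTopicIDs {
		// Assumption: the marker is serialised with double quotes, as in the spec text.
		marker := []byte(`data-orcha-anchor="` + id + `"`)
		if bytes.Contains(proposedSource, marker) {
			present = append(present, id)
		} else {
			missing = append(missing, id)
		}
	}
	return present, missing
}

func main() {
	src := []byte(`<p data-orcha-anchor="t1">Intro</p><p>Body</p>`)
	present, missing := partitionByMarker(src, []string{"t1", "t2"})
	fmt.Println(present, missing) // prints: [t1] [t2]
}
```

Under this reading, present is what feeds ReanchorTopicIDs, and a non-empty missing slice is what the post-job invariant and the approval-time stale check report.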
The Discard-flow callout's claim "the renderer treats markers for non-open Topics as inert" still holds: markers for discarded or already-incorporated Topics stay in committed Source until the next Incorporation rewrites the file. This update only converts open Topics' anchor JSON.
Slim prompt body (replaces #3's parameter block)
The harness's prompt builder in internal/agent/claude_cli_runner.go writes the prompt as a brief human-style request, not a parameter dump. Concretely:
Prompt body:

    Please help me incorporate a Topic discussion into a shared document.
    Use the wb-incorporate skill to load the conversation, produce a rewrite,
    and persist it as a proposal for review.
    Job ID: ab12cd34-…
    Config path: /etc/wiki-browser/wiki-browser.yaml
    wb-agent path: /opt/wiki-browser/dist/wb-agent
The skill discovers everything else through wb-agent get-topic --job-id=…. Removed from the prompt: Repo root, Source path, Base source SHA, Topic ID. The #3 decision "job parameters travel in the prompt body, not env vars" still holds for the three remaining values; only the derivable ones drop.
The skill's description field stays accurate (#3 already wrote it); the body change is the one that swaps from infrastructure tone to user-task tone.
wb-agent surface (the only consumer is the skill)
Every subcommand takes --config=<path>. The identifier flag matches what each call naturally operates on.
| Subcommand | Identifying flags | Reads | Writes | Returns / accepts |
|---|---|---|---|---|
| get-topic | --job-id | agent_jobs · topics · topic_messages · incorporation_proposals · filesystem (for base_source_sha) | none | stdout JSON: { topic, source_path, base_source_sha, anchor, messages: [{kind, body\|proposed_source, sequence, author}…] }. source_path is the absolute filesystem path; repo_root is not returned separately. |
| list-open-topics | --source-path · --exclude-topic | topics · topic_messages | none | stdout JSON array: [{ id, anchor, messages: […] }, …]. Returns every open non-global Topic on the source, minus the one identified by --exclude-topic. --source-path accepts an absolute path; wb-agent computes filepath.Rel(cfg.Root, source_path) to normalise to the repo-relative path stored in the DB, then runs the relative path through collab.ValidateSourcePath. Any path that escapes cfg.Root or fails validation exits non-zero with a descriptive stderr; the harness's responsibility for trust-boundary enforcement does not relax just because the only caller is the skill. |
| insert-proposal | --job-id · --explanation | agent_jobs · topics · incorporation_proposals (for revision_number) · filesystem (for current base_source_sha) | incorporation_proposals (one row, with agent_job_id = job ID) · topic_messages (one agent-proposal row with body = explanation) | stdin: proposed Source bytes (no temp files). --explanation=<inline text> argv. stdout: { proposal_id, revision_number, message_id }. Insert happens in one SQLite transaction opened as BEGIN IMMEDIATE so the cross-process write-lock acquires up-front; inside the transaction, topic_messages.sequence is allocated as COALESCE(MAX(sequence), 0) + 1 for the topic (same pattern as collab.InsertMessage's funnel, promoted to BEGIN IMMEDIATE because wb-agent is a different process than the server and cannot share the in-process write goroutine). SQLite's busy_timeout (set per-connection from #1's DSN) handles contention with concurrent server-side writes. Proposal row + agent-proposal message row are linked via proposal_id in the same transaction, and the proposal row is linked back to the producing agent_jobs row via agent_job_id. |
No temp files. The explanation rides as a single argv value (Linux caps a single argument string at 128 KiB with 4 KiB pages, the kernel's MAX_ARG_STRLEN; 1–3 paragraphs is well under that). The skill's SKILL.md documents the Bash pattern (single-quoted variable assignment) for safely passing multi-line text with arbitrary characters.
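The absolute-path boundary on list-open-topics can be sketched as follows. This only illustrates the filepath.Rel step and the escape check; relativeSourcePath is a hypothetical name, and the real command would still pass the result through collab.ValidateSourcePath, which is not reproduced here.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// relativeSourcePath (hypothetical helper) converts an absolute --source-path
// into the repo-relative form stored in the DB, rejecting anything that is
// not absolute, resolves to the root itself, or escapes the root.
func relativeSourcePath(root, sourcePath string) (string, error) {
	if !filepath.IsAbs(sourcePath) {
		return "", fmt.Errorf("source path %q is not absolute", sourcePath)
	}
	rel, err := filepath.Rel(root, sourcePath)
	if err != nil {
		return "", fmt.Errorf("source path %q: %w", sourcePath, err)
	}
	// filepath.Rel returns a cleaned path, so an escape always shows up as a
	// leading ".." element.
	if rel == "." || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("source path %q is not inside root %q", sourcePath, root)
	}
	return filepath.ToSlash(rel), nil
}

func main() {
	rel, err := relativeSourcePath("/srv/wiki", "/srv/wiki/docs/intro.md")
	fmt.Println(rel, err)

	_, err = relativeSourcePath("/srv/wiki", "/etc/passwd")
	fmt.Println(err != nil)
}
```

A non-nil error here would map to the non-zero exit with a descriptive stderr message that the table describes.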
Reshaping wb-agent and the prompt collapses several places where the scaffolds carry parallel or now-misshapen flags. The implementation plan that follows this spec will sweep:
- cmd/wb-agent/get_topic.go: keep --config and --job-id; drop --id and any path-related flags. Stop returning repo_root; return absolute source_path. Drop any defensive validation that existed to reconcile redundant inputs.
- cmd/wb-agent/list_open_topics.go: change from --source-path + job-implicit-exclusion to --source-path + --exclude-topic; the scaffold currently keys on --source-path alone, so this is a small extension. Add filepath.Rel(cfg.Root, source_path) + ValidateSourcePath defence at the absolute-path boundary.
- cmd/wb-agent/insert_proposal.go: keep --config and --job-id; drop --topic-id and --base-sha; add --explanation. Insert proposal row (agent_job_id set to the job ID) + accompanying agent-proposal message in one BEGIN IMMEDIATE transaction, with cross-process MAX(sequence)+1 allocation.
- internal/agent/claude_cli_runner.go: rewrite the prompt builder: human-task voice in the body, three values only (Job ID, Config path, wb-agent path).
- internal/collab/migrations/004_proposal_job_id.sql: add nullable incorporation_proposals.agent_job_id plus an index.
- internal/collab/mutators.go: extend CompleteIncorporationInput with ReanchorTopicIDs []string and add the in-transaction UPDATE topics SET anchor = '{"kind":"marker"}' WHERE id IN (…) guarded by the open-state filters described in "Anchor JSON updates on Incorporation" above. Add a proposal freshness helper that checks the current Source SHA and marker-set invariant, returning stale_reasons and missing_topic_ids; the approval handler runs it before invoking CompleteIncorporation.
- internal/server/handler_preview.go (new): Tier 2 preview route. Renders proposed-Source bytes through internal/render and runs ResolveAnchors against render-only marker anchors for open Topics whose markers exist in the proposed bytes, excluding the incorporated Topic. Omits the <meta name="wb-source-sha"> tag so the Topic composer cannot mistake it for a live Source.
- internal/server/handlers_proposals.go / wherever the proposals-list and approval handlers land: implement fresh + stale_reasons + missing_topic_ids computation, idempotent enqueue on the propose endpoint, rune-based subject-default trimming on approve, CSRF gating on the five mutating endpoints.
- Test fixtures in internal/agent/{runner_test,service_test,claude_cli_runner_test,e2e_test}.go and cmd/wb-agent/main_test.go that assert the old prompt and flag shape: rewrite to match the new surface.
- .claude/skills/wb-incorporate/SKILL.md: replace the scaffold wholesale with the body in "wb-incorporate rewrite contract" above.
Concurrency & recovery
The per-Source agent queue from #3 (agent.max_concurrent_jobs=1, keyed by source_path) serialises proposal jobs on the same Source. Approval is serialised by two staleness checks before the file write: (1) the existing SHA guard — the first commit on a Source wins; any in-flight proposal for another Topic on that Source becomes stale (base_source_sha mismatch, stale_reasons includes "source_sha"); (2) the marker-set guard — every current open non-global Topic on the Source, excluding the Topic being incorporated, must have at least one marker in the proposal bytes (stale_reasons includes "missing_topic_markers" when not). The UI surfaces both via the fresh, stale_reasons, and missing_topic_ids fields in the proposals list and the 409 stale_proposal response from approval. No new locking primitives — Topic creation, proposal generation, and message append all stay unblocked.
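The two guards combine into the fresh, stale_reasons, and missing_topic_ids fields roughly as in this sketch. The Freshness type and evaluateFreshness function are hypothetical names; the list of missing Topic IDs is assumed to have been computed already by scanning the proposal bytes for markers.

```go
package main

import "fmt"

// Freshness (hypothetical type) mirrors the three fields the proposals list
// and the 409 stale_proposal response expose.
type Freshness struct {
	Fresh           bool     `json:"fresh"`
	StaleReasons    []string `json:"stale_reasons"`
	MissingTopicIDs []string `json:"missing_topic_ids"`
}

// evaluateFreshness applies the SHA guard first and the marker-set guard
// second. A proposal is fresh only when neither guard trips.
func evaluateFreshness(baseSourceSHA, currentSourceSHA string, missingTopicIDs []string) Freshness {
	f := Freshness{MissingTopicIDs: missingTopicIDs}
	if baseSourceSHA != currentSourceSHA {
		f.StaleReasons = append(f.StaleReasons, "source_sha")
	}
	if len(missingTopicIDs) > 0 {
		f.StaleReasons = append(f.StaleReasons, "missing_topic_markers")
	}
	f.Fresh = len(f.StaleReasons) == 0
	return f
}

func main() {
	f := evaluateFreshness("abc", "def", []string{"t2"})
	fmt.Println(f.Fresh, f.StaleReasons, f.MissingTopicIDs) // prints: false [source_sha missing_topic_markers] [t2]
}
```

Because the function is a pure read over the current state, running it in both the proposals-list handler and the approval handler needs no extra locking, which matches the "no new locking primitives" constraint.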
Startup recovery is unchanged: #3's agent_jobs sweep marks any orphaned running/queued rows as failed; #1's collab.Recover reconciles any partial Incorporation attempt against the working tree. #4 introduces proposal/job provenance but no new durable in-flight state.
Open questions
- Per-Source approval race. Two collaborators clicking Approve on different fresh proposals in the same second (each for a different Topic on the same Source) will both pass the read-side fresh check; the second one's collab.Incorporate will fail the SHA check after the first one commits. UX is acceptable (the loser sees a stale error, the propose endpoint dedups so they re-queue the same Topic without ceremony), but worth observing once two-collaborator usage starts in earnest.
- Bash-quoting robustness for very long or unusual explanations. The skill builds the --explanation argv value as a shell-quoted variable. Standard 1–3 paragraph natural-language explanations pass through cleanly; pathological inputs (literal null bytes, etc.) would fail at the shell level before reaching wb-agent. If observed, we can switch to a stdin framing (length-prefixed) without changing the spec; the SKILL.md is the only thing that knows the wire shape.