Draft

Real-time collaboration mechanics — Design

2026-05-13Danielwiki-browser · sub-project #6

Problem

Sub-projects #1–#4 and #7 settle the data model for Topics, proposals, agent jobs, and authenticated collaborators. The UI today reads that data through REST and refreshes only on iframe navigation or after the calling user's own mutations. With two collaborators acting on the same Documents — one drafting Topics while the other reviews proposals — staleness is now visible in normal use: Max posts a reply, Daniel does not see it; Daniel discards a Topic, Max keeps clicking on a Resolve view that no longer maps to an open Topic; a proposal stales mid-review and the reviewer has no way to know until they refresh.

The domain model assigns this to sub-project #6 — Real-time collaboration mechanics. Its job is to make the second collaborator's view feel within ~1 s of the first's actions, surface presence ("Max is also reading this Document, currently on Topic X"), and codify how the existing data-layer races are reflected in the UI before the losing client's next click. The mechanics are cross-cutting: #5 (Perspectives) and #8 (UI integration) will both ride them.

References: domain model, decisions log.

Goals & non-goals

Goals

Live data flow per Document. Topic creation, message append, discard, incorporation, proposal creation, and agent-job status changes reach every connected collaborator within ~1 s.
Document-level presence with focused Topic. Each collaborator sees who else is on the same Document and which Topic (if any) the other is reading.
Conflict visibility, not new conflict resolution. The data-layer races already settled by #1, #3, #4, and #7 are surfaced faster: the losing client's UI updates before its next click.
One transport for everything. A single channel carries topic events, agent-job events, and presence — no separate mechanism for any of them.
Reuse by #5 and #8. The event types and transport reserved here are the same ones #5 will publish to (perspective refresh) and #8 will consume to render chrome (toasts, presence chip, "regenerating…" indicator).

Non-goals

Cross-document or session-wide notifications. "Max replied on /doc/foo while you're on /doc/bar" is parking-lot for a future sub-project; the v1 stream is scoped to the Document you have open.
Typing indicators and draft sync. Topic threads are append-only and visible the moment they hit SQLite. "Max is typing…" is polish; defer to #8 if missed.
WebSocket. Duplex transport is not needed; clients still POST through REST. SSE is the v1 wire. Swapping later is a localized change inside internal/realtime.
Event replay / Last-Event-ID. Reconnect is "refetch via existing GETs." No server-side ring buffer.
Out-of-app push. Email, OS notifications, mobile push — not in scope.
New conflict-resolution semantics. Race outcomes (one writer wins, second gets 409; stale-proposal banners; incorporation_attempts recovery) are owned upstream. #6 only surfaces them.

Approach

Three transport options were considered.

Transport comparison — recommended row is SSE.
Option	Effort	Fit for #6's load	Survives Nginx Proxy Manager
Keep polling, expand scope	Low	Misses the ~1 s target	Trivially
SSE, per-Document stream	Low–Medium	Matches the target; matches the iframe-per-Document UX	Yes with `proxy_buffering off`
WebSocket per session	Medium	Overkill; nothing needs duplex in v1	Needs Upgrade wiring

SSE is one HTTP request, ships in Go's stdlib via net/http + http.Flusher, and has a documented client (EventSource) with built-in reconnect semantics. For two collaborators viewing 1–2 Documents at a time, the server holds 2–4 long-lived connections per process — well below any meaningful limit on the Pi. The transport is thin enough that swapping to WebSocket later, if push-from-client ever becomes load-bearing, is a localized change inside internal/realtime.

Decision — transport

SSE behind a process-local pub/sub hub, one connection per source_path per signed-in client. Hub publishes are advisory; REST endpoints remain the authoritative source of every record.

Decision — stream scope

Per-Document only for v1. The client opens an EventSource(/api/stream?source_path=X) on Document load and closes it on navigation. Activity on Documents the user is not currently viewing surfaces only when they next visit them. Session-wide / cross-Document streams remain parking-lot for a future sub-project.

Decision — REST is authoritative; events advise

Hub events carry IDs and previews; clients refetch the canonical row through the existing REST endpoints. Reconnect after a drop = "refetch the three relevant endpoints for the focused Document." No event replay, no Last-Event-ID, no per-subscriber ring buffer.

Design

Architecture overview

A new internal/realtime package holds the hub. The hub is in-process, in-memory, keyed by source_path. The HTTP layer adds two endpoints — one SSE stream, one tiny focus-update POST. Existing server-process mutation handlers call Hub.Publish after their SQLite write commits; the agent runner publishes after it observes rows committed by the wb-agent subprocess.

Hub is process-local. Clients subscribe per Document. Mutation handlers and the agent runner publish after their SQLite writes commit.

The `internal/realtime` package

One package, one type. Internal storage is map[string]map[*subscription]struct{} per source guarded by a sync.RWMutex (standard pattern; sync.Map is the wrong shape here — we iterate the set on every publish). Each subscription owns a buffered channel and per-session metadata.

go// internal/realtime/hub.go (sketch)

type Event struct {
    Type    string          // e.g. "topic.created"
    Payload any             // JSON-marshalable
}

type Principal struct {
    UserID      string
    DisplayName string
    SessionID   string
}

type Hub interface {
    // Subscribe allocates a Subscriber but does NOT yet register it for fanout.
    // The SSE handler writes the `subscribed` handshake frame, then calls
    // Activate. This guarantees `subscribed` is the first frame on the wire.
    Subscribe(sourcePath string, p Principal) (*Subscriber, error)
    Activate(s *Subscriber)
    Publish(sourcePath string, evt Event)
    Close()
}

type Subscriber interface {
    ID() string          // the subscriber_id
    Events() <-chan Event
    SetFocus(topicID string)
    Close()
}

Error vocabulary for Subscribe: ErrHubClosed only (returned when Hub.Close has been called — shutdown path). No per-source subscriber cap in v1; the auth allowlist (two users) is the de-facto cap. sourcePath is assumed pre-validated by the SSE handler; the hub does not re-validate.

Subscribers are created by the SSE handler when a client connects, and closed when the handler returns. The 64-event buffer is per subscriber; overflow closes the subscriber's channel and the SSE handler returns, prompting the client to reconnect and refetch. (See Backpressure below for buffer-size rationale.)

Decision — events are hints, not an ordered log

Server-process mutation handlers call Hub.Publish after their SQLite write commits, inside the write-funnel path where possible. Agent-produced rows are different: wb-agent is a separate process, writes directly to SQLite, and cannot call the server's in-memory hub from inside its transaction. For those rows the server runner publishes only after the subprocess exits and the runner observes the committed DB state.

No event sequence number or replay log ships in v1. Clients must treat every event as a hint to refetch canonical REST state and then sort/render by durable fields such as topic-message sequence, proposal revision_number, and job status timestamps. The transport promises fast convergence, not a complete ordered event history.

HTTP surface

Method · Path	Auth	Body / Query	Responses
`GET /api/stream`	`RequireCollaborator`	`?source_path=<path>`	`200 text/event-stream` · `400 bad_source_path` · `401 unauthenticated` · `403 forbidden` · `404 unknown_source` · `503 collab_unavailable`
`POST /api/stream/focus`	`RequireCollaborator` + `RequireCSRF`	`{subscriber_id, topic_id?}` (JSON)	`204 no content` · `400 bad_request` · `401 unauthenticated` · `403 forbidden` · `404 unknown_subscriber` / `unknown_topic` · `429 too_many_focus_calls`

Both endpoints mount in internal/server/server.go alongside the existing collab routes, behind the same session middleware. The SSE handler does not require CSRF — it is a read-only GET, same treatment as GET /api/topics. The focus POST does require CSRF, same as every other mutating endpoint.

SSE handler responsibilities

Validate source_path via collab.ValidateSourcePath; reject 400 on failure. Verify the file exists on disk (mirrors the existing content handler) and reject 404 if not.
Set headers: Content-Type: text/event-stream, Cache-Control: no-store, Connection: keep-alive, X-Accel-Buffering: no. The header is a best-effort hint; deployment still requires proxy_buffering off in Nginx Proxy Manager.
Call Hub.Subscribe(source_path, principal) to allocate a Subscriber (not yet in the fanout set). Write the subscribed handshake frame carrying the server-generated subscriber_id — the only field — and Flusher.Flush(). Only then call Hub.Activate(subscriber), which adds the subscriber to the source's fanout set and triggers the presence recompute. This split guarantees subscribed is the first frame on the wire; concurrent publishes from other sources cannot race in ahead of it. The client renders presence from the presence.updated event that follows, not from subscribed.
Loop: select on the subscriber's event channel, the request context, and a 15 s keepalive ticker. Write each event as one SSE frame — event: and data: lines only, never id: (v1 has no replay log, so honouring a Last-Event-ID reconnect header is impossible):
```
textevent: topic.created
data: {"topic_id":"…","source_path":"…",…}
```
On each keepalive tick, read the session row. If revoked_at is set or expires_at has passed, fall through to step 5 (subscriber close). Otherwise, if now − last_seen_at ≥ auth.SessionTouchInterval (10 min), call store.TouchSession to extend the sliding expiry — without this the existing middleware semantics would let a passive SSE-only observer time out under their feet. Then write a single :keepalive comment line. After every write, call Flusher.Flush().
On context cancellation, channel close, write error, or session-revalidation failure (step 4), call Subscriber.Close() and return. The hub recomputes presence and republishes. This is the single termination path.

Focus endpoint responsibilities

Decode {subscriber_id, topic_id?}. topic_id empty/absent means "no Topic focused, just viewing the Document."
Apply a 1-call/sec limiter (in-memory token bucket keyed by (session_id, subscriber_id) so two tabs in one session get independent quotas). Reject 429 too_many_focus_calls on burst.
Resolve the subscriber by subscriber_id. Verify it belongs to the calling session (defense against a hijacked id from another principal). If absent or session-mismatched, return 404 unknown_subscriber — recoverable by reconnecting. This handle disambiguates two tabs on the same source: each tab focuses its own subscription independently.
If topic_id is non-empty, load the Topic and require topic.source_path to match the subscriber's source. Unknown / cross-source IDs return 404 unknown_topic. Terminal Topics (incorporated or discarded) silently coerce to topic_id = "" — focus on a closed Topic is treated as "no Topic focused, just viewing the Document," because under normal contention the topic.discarded / topic.incorporated event is in flight and the chrome will repaint anyway. Avoids forcing chrome to hand-roll the "swallow 409 silently" path on every focus call.
Call Subscriber.SetFocus(topic_id); the hub fans out the recomputed presence snapshot to every subscriber on the same source as the resolved subscription.
Return 204.

Event surface

Every SSE frame's data payload is a JSON object. Field shape is fixed by this spec; previews are short (≤160 chars) to keep event frames small. Clients re-fetch canonical bodies through REST.

Event surface. Payloads carry IDs + minimal previews; canonical state stays in REST. Exception: `subscribed` is a handshake event, not a state-bearing event — its payload is the session-bound handle the client uses to authenticate `POST /api/stream/focus`.
Type	Payload	Published by	Subscribers do
`subscribed`	`{subscriber_id}`	SSE handler on connect (before joining hub fanout)	Store `subscriber_id` for use on `POST /api/stream/focus`. Presence arrives separately via the `presence.updated` that follows immediately as a side-effect of subscribe.
`topic.created`	`{topic_id, source_path, anchor_kind, first_message_preview, created_by, created_at}`	`handleCreateTopic`	Append to Topic list; refetch `/api/topics?source_path` if rendering depends on more than the preview; refetch `/api/topics/{id}/proposals` for any open Resolve view to surface stale banners caused by the new Topic
`topic.message_appended`	`{topic_id, message_id, sequence, kind, body_preview, author_user_id?, proposal_id?, created_at}` — `proposal_id` is non-null only when `kind = 'agent-proposal'`; `author_user_id` is null for the same case (per #1's "Agent is not a user")	`handleAppendTopicMessage` (human messages) and the agent runner after `wb-agent insert-proposal` commits (agent-proposal messages)	If the focused thread matches, refetch `/api/topics/{id}/messages`; otherwise badge the topic card
`topic.discarded`	`{topic_id, discarded_by, discarded_at}`	Discard handler (ships with #4)	Mark Topic closed; remove from open list; if focused, close the thread view
`topic.incorporated`	`{topic_id, source_path, commit_sha, incorporated_by, incorporated_at}`	Incorporate handler (ships with #4)	Remove the incorporated Topic from the list; reload the current Document iframe so the rendered Source, the `wb-source-sha` meta tag, and anchors reflect `commit_sha`; refetch `/api/topics?source_path` (other Topics' anchors flipped to `marker`); for every other Topic on this Source whose Resolve drawer is open, refetch `/api/topics/{id}/proposals` AND reload both diff iframes — the proposal's `base_source_sha` is now superseded, so both the left "current Source" iframe and the right `/content/preview/proposals/{id}` iframe must re-render against the new state to keep #4's stale banner accurate
`proposal.created`	`{proposal_id, topic_id, source_path, revision_number, agent_job_id}`	Agent runner after post-invariant check succeeds (#4)	Refetch `/api/topics/{id}/proposals`; if a job-running spinner is up, swap to a "review proposal" affordance
`job.updated`	`{job_id, kind, status, topic_id?, persona_name?}`	Agent runner on every state transition	Replaces the legacy `GET /api/agent/jobs/{id}` poll; refetch the job for `error_tail` if `status` is `failed` / `timed_out`
`presence.updated`	`{subscriptions: [{subscriber_id, user_id, display_name, focused_topic_id?}]}` — one entry per live subscription (so two tabs from the same user appear as two entries with the same `user_id`)	Hub on subscribe / unsubscribe / focus change	Re-render the presence chip (dedup display by `user_id` if showing a chip-per-person; or one badge per subscription if showing a chip-per-tab) and any per-Topic "Max is reading this" indicator (derive from any subscription whose `focused_topic_id` matches the Topic)

Decision — no proposal.staled event

Stale-proposal status (#4) is derived on read from the comparison of the proposal's bytes against current open Topics. The events that can make a proposal stale (topic.created, topic.incorporated) already cause the client to refetch /api/topics/{id}/proposals, where the existing #4 stale-check resurfaces the banner. A dedicated proposal.staled event would duplicate state without adding information. Reserve the name in case a future case requires distinct payload, but do not emit it in v1.

Decision — publisher receives its own events

The hub does not filter the publisher out of fanout. The publishing client receives its own topic.created / topic.message_appended / etc. Clients dedupe by record ID; the result is that the publisher's UI converges through the same render path as every other subscriber. Optimistic-update divergence is one bug class we avoid by spending a few extra bytes on self-events.

Decision — publish-after-commit, with a pinned order for Agent incorporate jobs

Every publisher calls Hub.Publish only after its SQLite write commits. For in-process handlers (handleCreateTopic, handleAppendTopicMessage, handleDiscardTopic) this means inside the write-funnel callback after the row commits.

For Agent incorporate jobs, the runner publishes only after collab.CompleteJob succeeds (so consumers cannot observe job.updated{status=succeeded} while the row is still running), and emits events in this exact order so the proposer's tab transitions correctly:

topic.message_appended for the agent-proposal row (carries proposal_id).
proposal.created for the proposal row.
job.updated with the terminal status.

For failed / timed-out incorporate jobs only job.updated is published. For perspective jobs (#5), the analogous order is settled there. Why: job.updated{succeeded} arriving before proposal.created would briefly transition the proposer's UI to "no proposal" before bouncing back to "review proposal." Pinning the order eliminates the flicker without needing a synthetic "generating-finished" state.

Presence model

Hub state per source:

gotype subscription struct {
    SubscriberID    string
    Principal       Principal
    FocusedTopicID  string // "" = viewing the Document, no Topic open
    sender          chan Event
    // ...
}

On every subscribe, unsubscribe, or SetFocus, the hub:

Recomputes the snapshot for the affected source: one entry per live subscription, {subscriber_id, user_id, display_name, focused_topic_id?}. No collapse by user_id — two tabs from the same user produce two entries with the same user_id, each carrying its own focused_topic_id. This matches the focus-POST design (each tab focuses its own subscription independently) and avoids the "tab B subscribed last with empty focus overwrites tab A's focus on Topic X" bug that any collapse policy is forced to invent a tiebreak for.
Publishes presence.updated to every subscriber on that source — including the user whose state changed (so the publisher's UI also re-renders).

The full snapshot is published, not a delta. With two collaborators × small tab counts the payload is trivially small; deltas would be premature optimisation. Clients decide whether the rendered presence chip dedups by user_id (one chip per person, "Max is reading…" with any focus from any of their tabs) or surfaces per-tab entries — the event surface supports both without server-side state.

Auth wiring

Both endpoints mount behind auth.RequireCollaborator. Anonymous requests get 401; signed-in non-collaborators (an unlikely state in v1 since the allowlist is the whole user set) get 403. Initial SSE auth failures, like other protected APIs, return 401/403 with JSON bodies and no HTML redirect for tests and non-browser clients. Browser EventSource does not expose those bodies or status codes, so chrome must classify stream failures with a normal GET /auth/me probe.

Open streams revalidate auth on the 15 s keepalive tick. If the session expires or is revoked while a stream is open, the handler closes the subscriber and returns; the browser reconnect path then probes /auth/me. Anonymous probe result ⇒ location.reload() so the page routes through /auth/login. Still-authenticated probe result ⇒ keep the normal EventSource reconnect running and treat the error as a transient transport failure.

Client wiring (contract — implemented in #8)

This subsection is the canonical contract chrome.js must satisfy. The visual polish — chip placement, toast styling, breakpoint behaviour, choice between per-user and per-tab presence rendering — is owned by #8 and listed under cross-cutting items below. The wiring itself is fixed here.

On iframe load: (a) call loadTopics() as today, (b) fetch GET /api/agent/jobs?source_path=… once to bootstrap in-flight job state (the hub does not replay job state on subscribe), (c) open new EventSource('/api/stream?source_path=' + encodeURIComponent(sourcePath)).
On every subscribed event (initial connect and every EventSource auto-reconnect), replace the per-source subscriber_id in the local map — the previous one is dead. Presence renders from the presence.updated event that follows.
For each event type, dispatch to a handler that updates UI state and triggers REST refetches per the table above. Dedup events by record ID so self-events are idempotent.
On Topic focus (open thread / close thread), POST /api/stream/focus with the stored subscriber_id and topic_id? (null when closing). Debounce locally at 1/sec to match the server limiter. On 404 unknown_subscriber, drop the stale ID and re-issue focus once the next subscribed event arrives.
On iframe navigation, close the EventSource and clear the per-source subscriber_id.
On EventSource.onerror, let the built-in reconnect run and issue a normal GET /auth/me probe. If the probe returns anonymous / 401, force a full page reload; otherwise keep reconnecting without disrupting the page. A 404 unknown_source on the initial connect is terminal — EventSource does not retry on 4xx, and chrome should not either; surface a "this document no longer exists" state and stop.

Concurrent edits and conflict surfacing

The data layer already prevents bad writes. The stream surfaces them faster.

Race	Data layer	What #6 adds
Append race in the same Topic	Per-topic monotonic `sequence` via the write funnel (#1)	Both clients receive both `topic.message_appended` events; both render in sequence order; views converge
Discard race	First commit wins; second POST gets `409 topic_closed`	Losing client receives `topic.discarded` before its click; Discard button greys; if click slipped through, the `409` is the safety net
Approval race on the same Topic	Per-source agent slot (#3); stale-proposal check (#4) returns `409 stale_proposal`	Loser receives `topic.incorporated`; refetches `/api/topics/{id}/proposals`; stale banner appears immediately
New Topic opened mid-proposal	Stale-check resurfaces missing-marker reason (#4)	Proposer's tab receives `topic.created`; refetches proposals; Resolve view shows the stale banner before approval is attempted
Two collaborators creating Topics on the same Source	No conflict — both writes succeed	Both clients see both `topic.created` events; both Topic lists converge

Backpressure and slow consumers

Per-subscriber buffer is 64 events. A single Agent incorporation produces a burst of 3 events (topic.message_appended for agent-proposal, proposal.created, job.updated{succeeded}) plus prior job.updated{running}; combined with presence churn from focus changes and concurrent collaborator activity, 16 is borderline under realistic bursts. 64 events × ~256 bytes per Event struct ≈ 16 KiB per subscriber; even at 64 subscribers (well beyond v1's expected ~4) total channel memory stays under 1 MiB. The reconnect-and-refetch recovery path is the expected behaviour on overflow, not pathological.

If Publish finds the channel full, the hub closes the subscriber and removes it from the source's set. The SSE handler's next select sees the channel closed, returns, and the client reconnects. This drops queued events but the reconnect refetches authoritative state via REST, so no record is lost — only events that were never meaningful (the client was not consuming them) are.

The hub never blocks publishers waiting on slow subscribers. Publish is non-blocking: it tries each subscriber's channel; on full, it triggers async close.

Process lifecycle

Startup. Hub starts empty. internal/server wires it into Deps alongside Collab and AgentService; mutation handlers and the agent runner take a realtime.Hub dependency.
Graceful shutdown. Hub.Close() closes every subscriber channel and waits for handlers to return. http.Server.Shutdown drains in-flight requests; SSE writers return on context cancellation.
Restart. Clients EventSource-reconnect through the proxy; on reconnect the new hub's subscribed event re-establishes presence. Any events emitted during the restart window are lost; reconnect refetches authoritative state.

Operational notes

Nginx Proxy Manager. Set the proxy host's Advanced config to proxy_buffering off; for the wiki-browser host so SSE frames flush through. The handler additionally emits X-Accel-Buffering: no as a belt-and-braces hint.
Pi resource budget. 2 collaborators × 1–2 open tabs each ≤ 4 long-lived connections per process. Each carries a 64-event channel (~16 KiB) plus the SSE writer goroutine. Aggregate < 100 KiB; negligible.
Logging. Hub logs subscribe/unsubscribe at debug; backpressure-closes at warn; publish failures (channel closed mid-send) at debug. No per-event logs in steady state.

Testing strategy

Hub unit tests (internal/realtime): fanout to N subscribers; slow-consumer drop at the 16-event ceiling; no cross-source leak (publish to source A does not reach subscribers on source B); focus-change triggers presence republish with merged snapshot; subscriber close removes from snapshot.
SSE handler integration tests (internal/server): auth gating (401/403 paths), bad source_path (400), missing file (404), keepalive cadence under a fake clock, expired / revoked session closes the stream on the next keepalive, graceful shutdown closes streams without write-races, X-Accel-Buffering header present.
Focus endpoint integration tests: CSRF required; rate limiter triggers 429 on burst; 404 unknown_subscriber when no live stream matches; 404 unknown_topic for unknown or cross-source Topic IDs; 409 topic_closed for terminal Topics; presence republish observed by a sibling subscriber.
End-to-end (internal/server with two httptest clients): A creates a Topic, B's stream receives topic.created within the test deadline. Symmetric tests for message append, discard, incorporate, proposal landing, agent-job status. Presence: A connects, B connects, A focuses Topic, B observes correct snapshots after each transition.
Agent-proposal event ordering (highest-risk path, cross-process + two events + invariant gating): a fake wb-agent commits the proposal row + agent-proposal message row; the runner's post-invariant check succeeds; assert both clients receive events in exactly the order topic.message_appended (with proposal_id set) → proposal.created → job.updated{succeeded}. Mirror test with post-invariant failure: only job.updated{failed} is published, no proposal.created or topic.message_appended for the agent row.

Migration considerations

No SQL migration. Hub is fully in-memory.
Existing polling. Chrome.js's GET /api/agent/jobs/{id} poll is replaced by job.updated consumption. The endpoint stays (one-shot lookups, error_tail fetch on failed/timed_out); only the poll loop goes.
Existing loadTopics() after own actions. Stays. The publisher's own topic.created arrival is idempotent with the explicit refetch; we keep the optimistic refresh for snappier UX and let event-driven refresh cover everyone else.

Open questions

None blocking. Items below are deliberate parking-lot, owned elsewhere.

Cross-document / session-wide stream — deferred. If a future sub-project wants "@-mention me" or "you have unread on /doc/foo", it adds a second endpoint (GET /api/stream/global) without changing the per-Document one.
Event versioning — none in v1. Adding a field to a payload is backwards-compatible for any client that ignores unknown keys. Renaming or removing fields needs a versioning scheme; revisit only when forced.
Per-subscription vs per-user rendering of the presence chip — the event surface emits one entry per subscription; #8 decides whether to render one chip per person (deduping by user_id) or one chip per tab. No server-side change either way.

Cross-cutting items for sister sub-projects

For #5 — Perspectives

Transport is settled. wb-perspective job lifecycle is already covered by job.updated; #5 inherits the wire without adding a new event.
Reserved event type: perspective.refreshed — payload {source_path, perspective_id, source_sha, persona_sha}. #5 decides the trigger (lazy-on-read vs eager-on-Incorporation) and emits this event when a regenerated Perspective is ready to swap in. The client's "regenerating…" indicator (UI in #8) hides on receipt.
Cache invalidation does not need an event. Per #1, invalidation is implicit via SHA mismatch. The "regenerating" UI bridges the read-to-render window; perspective.refreshed closes it.

For #8 — Wiki-browser UI integration

The wire-level contract (when to open the EventSource, what to call on focus, how to handle reconnect / auth-expiry, where to bootstrap job state) is fixed in the Client wiring subsection above and is not restated here. The items below are the genuinely-open UI decisions #8 owns.

Presence chip rendering. The hub emits one entry per subscription. #8 decides whether the chip dedups by user_id (one chip per person) or surfaces per-tab entries, where the chip lives in chrome (toolbar / sidebar header / topic-card metadata), and the mobile breakpoint.
Per-Topic "Max is reading this" indicator. Derive from any subscription in presence.updated whose focused_topic_id matches the Topic. Styling, placement, and whether to show count or just a dot are #8's call.
Toast / badge strategy. Decide which events get a transient toast (topic.created from another user? proposal.created on a Topic you authored?) vs. silent badge update vs. live append. The event surface is fixed; the UX policy is open.
"New messages above" pill vs. live append. The focused thread can either auto-append new topic.message_appended rows or surface a "N new messages — click to load" pill that protects the user's scroll position. #6's payload supports both.
Drop the legacy GET /api/agent/jobs/{id} polling loop in favour of job.updated consumption. The one-shot endpoint stays for fetching error_tail on failed / timed_out.
Stale Source / dead Document state. When the SSE handler returns 404 unknown_source, render a friendly "this document no longer exists" state in chrome and stop reconnecting. Today's handleDoc already returns a 404 page for missing files; align with that.

Problem

Goals & non-goals

Goals

Non-goals

Approach

Design

Architecture overview

The internal/realtime package

HTTP surface

SSE handler responsibilities

Focus endpoint responsibilities

Event surface

Presence model

Auth wiring

Client wiring (contract — implemented in #8)

Concurrent edits and conflict surfacing

Backpressure and slow consumers

Process lifecycle

Operational notes

Testing strategy

Migration considerations

Open questions

Cross-cutting items for sister sub-projects

For #5 — Perspectives

For #8 — Wiki-browser UI integration

References

The `internal/realtime` package