Draft

wiki-browser — Design

v1: render & navigate · 2026-05-09 · Daniel

Problem

The orcha monorepo accumulates a lot of authored content: HTML proposals (docs/proposals/*.html), product docs (docs/orcha-*.html), Markdown plans (orcha/docs/plans/*.md), feature specs (feature-specs/*.md), the product roadmap, and so on. Today, reading any of it well requires a local checkout and either opening files in a browser by path or loading them in an editor preview. Sharing a doc with a cofounder mid-conversation means screenshots or pasted snippets.

What's missing is a simple, always-on view: open one URL on the LAN, see every authored document in the repo, navigate between them, render Markdown like GitHub does (including mermaid), and find content by name or text. The host is a Raspberry Pi, so resource frugality is part of the problem.

Goals & non-goals

Goals (v1)

Serve every .html and .md file in the orcha repo, except a configured exclude list, without per-folder configuration.
Render Markdown server-side as GitHub-flavored HTML, with code highlighting and mermaid diagrams (rendered client-side).
Pass through authored HTML files with their own styling intact, wrapped only in minimal navigation chrome.
Provide search by filename/path and by content. Results show within ~200 ms on a Pi-class machine.
Run as a single static binary on a Raspberry Pi with steady-state RAM ≤ 50 MB.
Keep navigation snappy via HTMX content swaps so the sidebar and search box don't reload between pages.

Non-goals (v1)

Annotations, comments, threads, or resolve workflow. Slated for v2.
User identity, login, or auth in the app. Access is gated by WireGuard at the network layer (the Pi runs a WireGuard server with one peer issued to the cofounder; the operator reaches the Pi directly over LAN).
In-browser editing of any file.
Pulling content (the operator runs git pull separately; the server reacts to filesystem changes).
Match-highlighting inside an opened document. Search returns snippets and a link; in-page highlight is v2.
Advanced query syntax. FTS5's default tokenizer with bm25 ranking is the contract.

Approach

Two decisions: the server/frontend architecture, and how rendered content is hosted in the browser.

Server/frontend architecture

Decision matrix. Option B is the recommended row.
Option	Resource cost	Snappiness	v2 readiness	Complexity
A. Single binary, server-side render, full reloads	Lowest	OK (~50 ms LAN reload)	Low — would need rework for live annotations	Lowest
B. Go API + HTMX chrome (recommended)	Low (~14 KB of HTMX, 25–40 MB steady-state)	Good (sidebar persists, content swaps via iframe `src`)	High — annotation panel is a sibling DOM node; `postMessage` bridge to in-iframe content	Low
C. Pre-rendered static site	Lowest at request time	Best (FileServer)	Low — v2 dynamic features force a real server anyway	Medium (watcher + build dir + invalidation)

Decision

Option B: Go server with html/template + HTMX driving the chrome (sidebar, search, keyboard). Content rendering is delegated to an iframe; see the next decision.

Content rendering: iframe vs same-document

The chrome is HTMX-driven, but the rendered document body still has to live somewhere. Either inline inside the chrome's DOM (same-document) or as a standalone HTML document loaded into an <iframe>.

Iframe pros

CSS isolation is total — authored HTML can use body, *, @page without bleeding into chrome
JS isolation — authored <script> tags can't touch HTMX, mermaid, or sidebar listeners
No HTML body-extraction step — pass-through HTML is served as-is
Relative URLs and <base> in authored HTML work naturally
Markdown and HTML render through one path — server emits a standalone document either way

Iframe cons

Outer URL must be synced from iframe load events via history.pushState for shareable deep-links
Keyboard shortcuts need listeners in both contexts; iframe forwards unhandled keys to parent via postMessage
Mermaid loads per iframe (cached by the browser after first hit; ~500 KB)
Annotations (v2) need a small postMessage protocol — well-trodden (Hypothesis, code-review tools), bounded scope

Decision

Render all content (Markdown and authored HTML alike) inside an <iframe> served from the same origin. The iframe boundary eliminates the chrome-vs-content CSS contract that same-document rendering would otherwise force into the codebase. Content is trusted (the user's own repo), so v1 ships with only a permissive sandbox attribute (allow-same-origin allow-scripts allow-popups); see Content frame.

Note

An earlier sketch considered iframe for HTML and same-document for Markdown. The dual-path tax (two annotation rendering paths to test in v2) outweighed the marginal speed gain on MD-to-MD navigation, so v1 unifies on iframe.

Design

System overview

One Go process, no external services. It reads the repo from disk, holds an in-memory render cache and a SQLite FTS5 index on disk, and serves HTTP. A single root directory is configured; everything under it is auto-discovered, filtered by an exclude list, and indexed by a watcher. Browser-side, the server returns two kinds of HTML: a chrome shell (sidebar, topbar, empty iframe) and standalone content documents loaded into the iframe.

Components and data flow. The browser holds a chrome shell (HTMX) with an iframe child for content. The render cache is in-memory; FTS5 lives in a single SQLite file alongside the binary.

Packages and boundaries

Each package has one responsibility and a small surface; consumers depend on functions, not internals.
Package	Responsibility	Depends on
`config`	Load and validate `wiki-browser.yaml`; produce a typed `Config`.	—
`walker`	Walk the configured root, apply excludes, emit the canonical file list. Watch via `fsnotify` with one `Add()` per directory (recursion is the walker's job, not the kernel's). Debounce events per-path. Owns the source of truth for "what files exist".	`config`
`render`	Pure function `Render(absPath) → (Document, err)` where `Document` is a complete HTML document for iframe consumption plus a `HasMermaid` flag set when the source contained at least one `mermaid` fence. `.md` via `goldmark` + `chroma` wrapped in the prose template; `.html` served verbatim. The server uses `HasMermaid` to gate mermaid script injection — most docs don't need it. In-memory LRU cache keyed by `(absPath, mtime, size)`, byte-bounded.	—
`index`	Owns the SQLite FTS5 database. Exposes `Reindex(path)`, `Remove(path)`, `Search(q, limit)`. Serializes mutations through a single goroutine to avoid write/remove races. Reacts to `walker` events.	`walker`, `render`
`nav`	Builds the sidebar HTML from the walker's file list, grouped by top-level directory. Re-rendered per request — cheap at the FTS5 scaling target (~5,000 entries) — so the live walker view is always reflected.	`walker`
`server`	Wires `net/http` routes. Two template families: the chrome shell (sidebar + topbar + iframe) and the content document (rendered MD or pass-through HTML, served standalone for the iframe).	everything above
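
The single-goroutine serialization described for the index package could look roughly like the sketch below. It is illustrative only: the in-memory map stands in for the FTS5 table, the done channel returned by Reindex and Remove is a convenience this sketch adds (the design only names Reindex(path) and Remove(path)), and lateWriteLeavesRow is a hypothetical helper that replays the REMOVE-then-late-WRITE race.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// op is one queued index mutation.
type op struct {
	path   string
	remove bool
	done   chan struct{} // closed once applied; a convenience for callers and tests
}

// Indexer funnels every mutation through one worker goroutine.
type Indexer struct {
	ops  chan op
	mu   sync.Mutex       // guards rows for readers outside the worker
	rows map[string]int64 // path -> size; stand-in for the FTS5 table (assumption)
}

func NewIndexer() *Indexer {
	ix := &Indexer{ops: make(chan op, 64), rows: make(map[string]int64)}
	go func() {
		for o := range ix.ops { // ops are applied strictly in arrival order
			ix.apply(o)
			close(o.done)
		}
	}()
	return ix
}

func (ix *Indexer) apply(o op) {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	if !o.remove {
		// Re-stat as the last step before insert; a file that vanished
		// in the meantime demotes the reindex to a remove.
		if st, err := os.Stat(o.path); err == nil {
			ix.rows[o.path] = st.Size()
			return
		}
	}
	delete(ix.rows, o.path)
}

func (ix *Indexer) submit(path string, remove bool) <-chan struct{} {
	o := op{path: path, remove: remove, done: make(chan struct{})}
	ix.ops <- o
	return o.done
}

func (ix *Indexer) Reindex(path string) <-chan struct{} { return ix.submit(path, false) }
func (ix *Indexer) Remove(path string) <-chan struct{}  { return ix.submit(path, true) }

func (ix *Indexer) Has(path string) bool {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	_, ok := ix.rows[path]
	return ok
}

// lateWriteLeavesRow indexes a file, deletes it on disk, then queues a
// Remove followed by a late Reindex. It reports whether a row survived.
func lateWriteLeavesRow() (bool, error) {
	dir, err := os.MkdirTemp("", "wiki-index")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)
	p := filepath.Join(dir, "plan.md")
	if err := os.WriteFile(p, []byte("# plan"), 0o644); err != nil {
		return false, err
	}
	ix := NewIndexer()
	<-ix.Reindex(p)
	if !ix.Has(p) {
		return false, fmt.Errorf("expected %s to be indexed", p)
	}
	if err := os.Remove(p); err != nil {
		return false, err
	}
	ix.Remove(p)
	<-ix.Reindex(p) // late WRITE event arriving after the REMOVE
	return ix.Has(p), nil
}

func main() {
	zombie, err := lateWriteLeavesRow()
	fmt.Println("zombie row:", zombie, "err:", err)
}
```

Because the worker applies operations in arrival order and re-stats before inserting, the late Reindex finds no file and falls through to a delete.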

Configuration

A single YAML file drives the server. Sensible defaults are baked into the code; the file lists only what differs.

```yaml
# wiki-browser.yaml
listen: ":8080"
title: "Orcha wiki"
root: "/home/volrath/code/orcha"
extensions: [".md", ".html"]
index_db: "./wiki-browser-index.db"
exclude:
  # user-configurable additions, on top of baked-in defaults:
  - "www/**"
  - "marketing/**"
```

Baked-in default excludes (always applied; no way to opt out): **/.git/**, **/node_modules/**, **/.worktrees/**, **/.obsidian/**, **/.claude/**, **/tmp-*/**. These exist purely to keep the index sane on any repo; they are not specific to orcha.
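
As a rough illustration of how the default and user exclude lists might be combined and matched, here is a minimal stdlib-only sketch. The matching semantics are an assumption: `**` matches zero or more whole path segments, and every other segment goes through path.Match. A real implementation may well use a glob library with slightly different rules; the function names are hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// defaultExcludes mirrors the baked-in list from this design.
var defaultExcludes = []string{
	"**/.git/**", "**/node_modules/**", "**/.worktrees/**",
	"**/.obsidian/**", "**/.claude/**", "**/tmp-*/**",
}

// matchSegs reports whether the path segments match the pattern segments.
// "**" matches zero or more whole segments (an assumption of this sketch);
// any other pattern segment is matched with path.Match.
func matchSegs(pat, segs []string) bool {
	if len(pat) == 0 {
		return len(segs) == 0
	}
	if pat[0] == "**" {
		// Let "**" consume 0..len(segs) segments.
		for i := 0; i <= len(segs); i++ {
			if matchSegs(pat[1:], segs[i:]) {
				return true
			}
		}
		return false
	}
	if len(segs) == 0 {
		return false
	}
	ok, err := path.Match(pat[0], segs[0])
	if err != nil || !ok {
		return false
	}
	return matchSegs(pat[1:], segs[1:])
}

// Excluded reports whether a repo-relative, slash-separated path is
// filtered out by the defaults plus the user-configured patterns.
func Excluded(rel string, userExcludes []string) bool {
	segs := strings.Split(rel, "/")
	patterns := append(append([]string{}, defaultExcludes...), userExcludes...)
	for _, p := range patterns {
		if matchSegs(strings.Split(p, "/"), segs) {
			return true
		}
	}
	return false
}

func main() {
	user := []string{"www/**", "marketing/**"}
	for _, p := range []string{"docs/plan.md", "www/index.html", "a/node_modules/x/readme.md", "tmp-42/notes.md"} {
		fmt.Println(p, Excluded(p, user))
	}
}
```

Under these semantics a pattern such as `www/**` also matches the directory `www` itself, which lets the walker prune the subtree (and skip its watcher subscription) without descending into it.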

Discovery and indexing

Startup. Walk root, apply default + user excludes, collect every file whose extension is in extensions. Diff against the FTS5 database by (path, mtime, size); reindex only changed files; drop entries for files that no longer exist. Size is part of the key because some workflows (notably git checkout) can preserve mtime across content changes.
Live updates. An fsnotify watcher subscribes to the root subtree. On Linux, inotify is non-recursive: the walker calls watcher.Add() once per directory at startup and again when a directory-CREATE event arrives. The same exclude list filters which dirs are subscribed and which events are honored.
Watch budget. A large monorepo can blow through fs.inotify.max_user_watches (default 8,192 on many distros, 524,288 on a Pi running recent Manjaro/Debian). On watcher.Add() failure with ENOSPC, log the limit, the failing path, and the suggested sysctl fix, then continue indexing without live updates for that subtree.
Debounce. Coalesce events per absolute path through a 300 ms window: a git pull over a wide diff fires hundreds of WRITE/RENAME events in milliseconds, often several per file. Editor swap files are filtered by suffix (~, .swp, .swx, .tmp), as is the numeric probe file Vim creates to test directory writability (4913).
Race serialization. All Reindex/Remove calls are funneled through a single goroutine keyed by path so a late WRITE can't reinsert a row that REMOVE just deleted. The reindex worker re-stats the file as the last step before insert; missing-on-disk demotes the operation to Remove.
FTS5 schema. One virtual table with three indexed columns and two regular columns for change detection:

```sql
CREATE VIRTUAL TABLE docs USING fts5(
  path,                 -- repo-relative path, tokenized as text
  title,                -- front-matter title or first H1, fallback: filename
  body,                 -- rendered plain text (HTML/MD stripped)
  mtime UNINDEXED,      -- unix seconds; used for change detection
  size  UNINDEXED,      -- bytes; tiebreaker when mtime is preserved across edits
  tokenize = 'unicode61 remove_diacritics 2'
);
```
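
The mtime and size columns above exist for the startup diff. A minimal sketch of that comparison, assuming both sides have already been loaded into maps keyed by repo-relative path (FileStat and Diff are hypothetical names, not part of the design):

```go
package main

import (
	"fmt"
	"sort"
)

// FileStat is the change-detection pair stored next to each indexed path.
type FileStat struct {
	MTime int64 // unix seconds
	Size  int64 // bytes
}

// Diff compares the walker's view of the disk with what the index holds.
// It returns the paths to reindex (new or changed) and the paths to drop
// (indexed but no longer on disk), both sorted for stable output.
func Diff(onDisk, indexed map[string]FileStat) (reindex, remove []string) {
	for p, st := range onDisk {
		if old, ok := indexed[p]; !ok || old != st {
			reindex = append(reindex, p)
		}
	}
	for p := range indexed {
		if _, ok := onDisk[p]; !ok {
			remove = append(remove, p)
		}
	}
	sort.Strings(reindex)
	sort.Strings(remove)
	return reindex, remove
}

func main() {
	onDisk := map[string]FileStat{
		"docs/a.md": {MTime: 100, Size: 10}, // unchanged
		"docs/b.md": {MTime: 100, Size: 25}, // same mtime, different size
		"docs/c.md": {MTime: 300, Size: 5},  // new file
	}
	indexed := map[string]FileStat{
		"docs/a.md": {MTime: 100, Size: 10},
		"docs/b.md": {MTime: 100, Size: 20},
		"docs/d.md": {MTime: 50, Size: 7}, // deleted from disk
	}
	reindex, remove := Diff(onDisk, indexed)
	fmt.Println("reindex:", reindex)
	fmt.Println("remove:", remove)
}
```

The docs/b.md case is the reason size is part of the key: its mtime is unchanged, but the size difference still triggers a reindex.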

Filename matches use the FTS5 query path:<q> OR title:<q>, ranked with column-weighted bm25(docs, 8.0, 4.0, 1.0) (path > title > body). Content matches use body:<q> with the FTS5 snippet() helper for ~10-word context windows. The 8/4/1 weights are a starting point: tune them during the Pi smoke test against 5–10 known queries from the actual orcha corpus, then update the values here. The unicode61 tokenizer with remove_diacritics 2 handles English/Spanish/Portuguese/French well; CJK content would need the trigram tokenizer (SQLite ≥ 3.34) and is deferred, since none of the orcha corpus is CJK today.
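
The per-path debounce described in this section could be sketched as follows. This is one possible shape, not the walker's actual code: Debouncer, Event, and countFires are hypothetical names, and the demo uses a 50 ms window instead of 300 ms so it runs quickly.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Debouncer coalesces bursts of events per path: fire runs once per path
// after the path has been quiet for the whole window.
type Debouncer struct {
	mu     sync.Mutex
	window time.Duration
	seq    uint64            // global event counter, so stale timers never match
	latest map[string]uint64 // path -> seq of its most recent event
	fire   func(path string)
}

func NewDebouncer(window time.Duration, fire func(path string)) *Debouncer {
	return &Debouncer{window: window, latest: make(map[string]uint64), fire: fire}
}

// Event records activity on path and schedules a check one window later.
// Only the timer belonging to the most recent event for that path fires.
func (d *Debouncer) Event(path string) {
	d.mu.Lock()
	d.seq++
	mine := d.seq
	d.latest[path] = mine
	d.mu.Unlock()

	time.AfterFunc(d.window, func() {
		d.mu.Lock()
		quiet := d.latest[path] == mine
		if quiet {
			delete(d.latest, path)
		}
		d.mu.Unlock()
		if quiet {
			d.fire(path)
		}
	})
}

// countFires sends n rapid events for one path and reports how many
// times the callback ran once the burst has settled.
func countFires(n int) int {
	fired := make(chan string, n)
	d := NewDebouncer(50*time.Millisecond, func(p string) { fired <- p })
	for i := 0; i < n; i++ {
		d.Event("docs/plan.md")
	}
	time.Sleep(300 * time.Millisecond)
	return len(fired)
}

func main() {
	fmt.Println("callbacks after a burst of 200 events:", countFires(200))
}
```

The sketch allocates one timer per event, which keeps the code short. Resetting a single timer per path would be lighter during a very wide git pull.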

Routing and templates

Two top-level namespaces — /doc for chrome shells, /content for iframe documents — keep file paths from colliding with reserved chrome routes (search, partials, static, healthz).

Chrome and content responses come from separate template families; the chrome is rendered once per visit, content per navigation.
Method & path	Behavior
`GET /`	Landing page (chrome shell, iframe pointing at the welcome content).
`GET /doc/{path...}`	Chrome shell with the iframe pre-pointed at `/content/{path...}`. This is the deep-linkable URL.
`GET /content/{path...}`	Standalone HTML document for iframe consumption. `.md` rendered with prose CSS + mermaid; `.html` served verbatim. `?raw=1` returns the raw file bytes with `text/plain` for either extension (view-source convenience).
`GET /search?q=...`	Returns the `search-results.html` fragment with two sections — Filename matches, Content matches. Result clicks navigate via the chrome's iframe-swap helper, not full reloads.
`GET /partials/nav`	Sidebar fragment. Used rarely; the sidebar is normally rendered with the chrome shell.
`GET /static/...`	Embedded assets (CSS, chrome.js, content.js, htmx.min.js, mermaid.esm.min.mjs, fonts) via `embed.FS`.
`GET /healthz`	200 OK plain text.

Three reserved content paths are baked into the binary via embed.FS: /content/_welcome, /content/_404, and /content/_search-offline. They are registered as literal routes, which take precedence over the /content/{path...} wildcard, so a real file can never shadow them. GET / always serves the chrome shell pointing at /content/_welcome, which doubles as the empty-tree landing: when the walker finds zero files matching extensions, the server still starts, and the welcome page explains the empty index and references the config. The server never fails fast on an empty tree. 404 responses serve the chrome shell with the iframe pointing at /content/_404; the search-offline fragment is sourced from /content/_search-offline.

Content frame

Content lives in a same-origin iframe; chrome lives in the parent document. There is no shared CSS or JS scope, so no naming discipline is required to keep them apart.

Chrome shell emits <iframe id="content" src="/content/<path>" name="content" sandbox="allow-same-origin allow-scripts allow-popups">, sized to fill the available area. The sandbox blocks top.location hijacks at near-zero cost while keeping same-origin status (so postMessage, contentWindow.location reads, and the v2 annotation client all keep working). If an authored doc later needs form submission or storage, add allow-forms / allow-storage-access-by-user-activation selectively rather than dropping the sandbox.
Markdown content document is a complete HTML5 document: prose CSS in <head>, rendered Markdown in <body>, content.js at the end of body. The mermaid ESM script tag is injected only when Document.HasMermaid is true — most docs save the ~500 KB load and parse cost. The output is a valid standalone document and opens directly in any browser.
Authored HTML is served byte-identical from disk — no body extraction, no style hoisting, no content.js injection. The server only sets Content-Type: text/html; charset=utf-8. Trade-off: keyboard shortcuts (/, Esc) won't reach the chrome when the iframe has focus on an authored HTML doc — there's no script in the iframe to forward them. Documented v1 limitation; not worth injecting into otherwise-untouched bytes.
Same-origin guarantee. Both frames are served from the wiki-browser origin, so the parent can read iframe URL/title via contentWindow and exchange postMessage with targetOrigin: location.origin. We use postMessage rather than direct cross-frame property access because (a) the sandbox attribute makes the boundary an explicit contract that survives future hardening, (b) the v2 annotation client benefits from a typed envelope, and (c) Web Annotation Data Model tooling assumes message passing. Future maintainers: don't "simplify" this back to direct calls.
Navigation inside the iframe (a link from one doc to another, or a search-result swap) replaces the iframe URL via iframe.contentWindow.location.replace(...) so the iframe doesn't grow its own history stack. The parent listens to the iframe's load event and updates the outer URL via history.pushState.
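
To make the Markdown content document and the HasMermaid gate concrete, here is a minimal html/template sketch. The template layout, the /static/prose.css name, and the includesMermaid helper are assumptions for illustration; only the mermaid.esm.min.mjs and content.js asset names come from this design.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
	"strings"
)

// Document is the render output consumed by the content template.
type Document struct {
	Title      string
	Body       template.HTML // already-rendered Markdown; trusted, so not re-escaped
	HasMermaid bool
}

// contentTmpl emits a complete standalone HTML5 document for the iframe.
var contentTmpl = template.Must(template.New("content").Parse(`<!doctype html>
<html>
<head>
<meta charset="utf-8">
<title>{{.Title}}</title>
<link rel="stylesheet" href="/static/prose.css">
</head>
<body>
{{.Body}}
{{if .HasMermaid}}<script type="module" src="/static/mermaid.esm.min.mjs"></script>
{{end}}<script src="/static/content.js"></script>
</body>
</html>
`))

// RenderContent executes the template into a string.
func RenderContent(d Document) (string, error) {
	var buf bytes.Buffer
	if err := contentTmpl.Execute(&buf, d); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// includesMermaid reports whether the rendered document references the
// mermaid bundle; it exists only to make the gate easy to check.
func includesMermaid(d Document) bool {
	out, err := RenderContent(d)
	return err == nil && strings.Contains(out, "mermaid.esm.min.mjs")
}

func main() {
	plain := Document{Title: "Plan", Body: "<h1>Plan</h1>"}
	diagram := Document{Title: "Flow", Body: `<pre class="mermaid">graph TD; A-->B</pre>`, HasMermaid: true}
	fmt.Println("plain doc loads mermaid:", includesMermaid(plain))
	fmt.Println("diagram doc loads mermaid:", includesMermaid(diagram))
}
```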

No Content-Security-Policy header in v1 — content is trusted. With assets already vendored under static/, a future default-src 'self' policy would apply without changes and is the right v2 hardening lever to reach for if untrusted authored HTML ever needs to be served.

Search UX

One input in the topbar (parent document); / focuses it, Esc clears.
hx-get="/search" hx-trigger="keyup changed delay:200ms" hx-target="#search-results". Empty query returns an empty fragment.
Results split into Filename matches (top 10) and Content matches (top 20). Each item shows the title, repo-relative path, and (for content matches) a snippet.
Result links carry data-path; the chrome's click delegate calls the iframe-swap helper instead of letting the browser navigate.

Client JS

Split into two scripts. Together ~150 lines.

static/chrome.js (parent). On DOMContentLoaded: init theme toggle, expand the sidebar folder containing the current page, wire keyboard shortcuts (/, Esc), attach the iframe-swap helper. On the iframe's load event: read contentWindow.location.pathname, derive the doc path, call history.pushState({}, '', '/doc/' + path), set document.title from the iframe's title, update aria-current="page" on the matching sidebar link. On popstate: reverse the mapping and call iframe.contentWindow.location.replace('/content/' + path). Listens for postMessage events from the iframe (forwarded keys in v1; range-selection events in v2).
static/content.js (injected into every rendered MD document; absent from pass-through HTML, where authored HTML carries its own JS or none). On DOMContentLoaded: when the mermaid script was injected, call mermaid.run({ querySelector: 'pre.mermaid' }). Forwards unhandled keydowns (/, Esc) to the parent via postMessage so chrome shortcuts work even when the iframe has focus. Placeholder hook for the v2 annotation client.

postMessage protocol uses a typed envelope: { kind: 'key', key: '/' }, { kind: 'nav', path: '...' }. targetOrigin is always location.origin.

Resource budget on a Raspberry Pi

Server-side targets, not promises. Verified by smoke test on the actual Pi before declaring v1 done.
Component	Budget	Notes
Binary size	≤ 25 MB	Static, includes embedded assets.
Steady-state RAM	≤ 50 MB	Render cache (LRU, ≤ 32 MB) + FTS5 page cache (2 MB default) + Go runtime (~12 MB) + headroom.
Render cache	≤ 32 MB (LRU, byte-bounded)	Approximate byte size = `len(html) + len(plaintext)`, computed on insert. Entry count is not capped — bytes is the right proxy for memory pressure.
Markdown render time	≤ 5 ms typical	goldmark + chroma; cached after first hit.
FTS5 search latency	≤ 50 ms typical	For corpora up to ~5,000 docs.
Cold start	≤ 1 s	Index diff against existing DB; only changed files reindex.
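
The byte-bounded render cache in the table could be built on container/list along these lines. This is a sketch under the stated cost model (len(html) + len(plaintext)); the type and method names are hypothetical, and the locking a real cache would need is omitted.

```go
package main

import (
	"container/list"
	"fmt"
)

// cacheKey matches the (absPath, mtime, size) key from the render package.
type cacheKey struct {
	AbsPath string
	MTime   int64
	Size    int64
}

type entry struct {
	key             cacheKey
	html, plaintext string
}

// cost approximates the entry's memory footprint in bytes.
func (e *entry) cost() int { return len(e.html) + len(e.plaintext) }

// LRU evicts least-recently-used entries once the byte budget is exceeded.
// It is not safe for concurrent use; a real cache would add a mutex.
type LRU struct {
	maxBytes, used int
	order          *list.List // front = most recently used
	items          map[cacheKey]*list.Element
}

func NewLRU(maxBytes int) *LRU {
	return &LRU{maxBytes: maxBytes, order: list.New(), items: make(map[cacheKey]*list.Element)}
}

func (c *LRU) Get(k cacheKey) (string, bool) {
	el, ok := c.items[k]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).html, true
}

func (c *LRU) Put(k cacheKey, html, plaintext string) {
	if el, ok := c.items[k]; ok {
		c.remove(el) // replace any existing entry for this key
	}
	e := &entry{key: k, html: html, plaintext: plaintext}
	if e.cost() > c.maxBytes {
		return // larger than the whole budget; do not cache
	}
	c.items[k] = c.order.PushFront(e)
	c.used += e.cost()
	for c.used > c.maxBytes {
		c.remove(c.order.Back())
	}
}

func (c *LRU) remove(el *list.Element) {
	e := el.Value.(*entry)
	c.order.Remove(el)
	delete(c.items, e.key)
	c.used -= e.cost()
}

func main() {
	c := NewLRU(10) // tiny budget so eviction is visible
	a, b, d := cacheKey{AbsPath: "a.md"}, cacheKey{AbsPath: "b.md"}, cacheKey{AbsPath: "d.md"}
	c.Put(a, "aaaa", "")
	c.Put(b, "bbbb", "")
	c.Get(a)             // touch a, so b becomes the eviction candidate
	c.Put(d, "dddd", "") // 12 bytes > 10, so the oldest entry is evicted
	_, hasA := c.Get(a)
	_, hasB := c.Get(b)
	fmt.Println("a cached:", hasA, "b cached:", hasB, "bytes used:", c.used)
}
```

Because the key includes mtime and size, an edited file simply misses under its new key; the stale entry ages out through normal eviction.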

Client-side cost (browser, not Pi). Mermaid is ~500 KB per cold load and is gated on Document.HasMermaid, so only docs that actually contain a mermaid fence pay the script load and parse cost. It is browser-cached after the first hit. LAN bandwidth is not the bottleneck on a Pi; this cost lands on the user's tab, not the host.

Library choices

SQLite driver: modernc.org/sqlite — pure Go, no CGO, easy to cross-compile to ARM from a dev box. FTS5 is bundled in the amalgamation it ships.
Markdown: github.com/yuin/goldmark with extensions for GFM (tables, task lists, strikethrough), autolinks, footnotes, definition lists, and front-matter.
Syntax highlighting: github.com/alecthomas/chroma/v2 via github.com/yuin/goldmark-highlighting/v2.
File watcher: github.com/fsnotify/fsnotify (uses inotify on Linux; subscriptions are per-directory, see Discovery and indexing).
Mermaid: mermaid.esm.min.mjs, vendored under static/, loaded only inside rendered MD documents that contain a mermaid fence (Document.HasMermaid).
HTMX: htmx.min.js vendored under static/, loaded only by the chrome shell.

Build & deploy

The whole reason for picking modernc.org/sqlite and embedding assets via embed.FS is a one-line cross-compile from a dev box.

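```bash
# build for the Pi (64-bit Raspberry Pi OS / Manjaro ARM)
GOOS=linux GOARCH=arm64 CGO_ENABLED=0 \
  go build -trimpath -ldflags="-s -w" \
  -o dist/wiki-browser ./cmd/wiki-browser

# deploy: stage under /tmp, then install. The SSH user cannot write to
# /etc/systemd/system directly, and scp cannot overwrite a running binary.
scp dist/wiki-browser deploy/wiki-browser.service pi:/tmp/
ssh pi "install -m 755 /tmp/wiki-browser /home/pi/bin/wiki-browser \
  && sudo install -m 644 /tmp/wiki-browser.service /etc/systemd/system/ \
  && sudo systemctl daemon-reload && sudo systemctl restart wiki-browser"
```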

A systemd unit is committed at deploy/wiki-browser.service:

```ini
[Unit]
Description=wiki-browser
After=network.target

[Service]
ExecStart=/home/pi/bin/wiki-browser -config=/home/pi/.config/wiki-browser.yaml
Restart=on-failure
RestartSec=2s
User=pi

[Install]
WantedBy=multi-user.target
```

No Docker, no orchestrator — the binary is the unit of deployment. Logs go to journald (journalctl -u wiki-browser).

Error handling

Path traversal. Membership in the walker's canonical file set is checked first (cheap map lookup); only if that passes are filepath.Clean and symlink-eval invariants verified as a defense-in-depth layer. Membership-first short-circuits the per-component lstat cost on the hot path. Any failure → 404.
Missing file. 404 chrome shell with an iframe pointing at /content/_404 (a baked-in helpful page with a breadcrumb back to the parent group).
Render failure. 500 content document that shows the error — this is an internal tool reachable only over the operator's LAN or a WireGuard peer link; surface what broke instead of hiding it.
Trust model. Access is gated at the network layer: the Pi runs a WireGuard server (one peer for the cofounder); the operator's own access is over LAN. The wiki-browser process binds 0.0.0.0 by default (override via listen) and exposes raw error text. Don't put this on a public IP.
Startup — index DB.
- Missing: create the schema and reindex from scratch.
- Schema mismatch: log the detected vs expected schema version and exit non-zero. The operator deletes the file (or runs a future --migrate flag) and restarts.
- Locked: retry with backoff for ~5 s, then start in degraded mode (search returns "offline") and log a warning. Navigation keeps working.
- Corrupt (open succeeds but a sentinel SELECT fails): log and exit non-zero.
Index unavailable mid-flight. Search endpoint returns a fragment that says "search is offline; navigation still works." The rest of the server keeps serving.
Config error at startup. Fail fast with a clear message; do not start serving.
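
The membership-first path check from the Path traversal item might look like the sketch below. Resolve, ErrNotFound, and demo are hypothetical names, and the known set stands in for the walker's canonical file list. The demo deliberately admits a symlink that points outside the root to show the defense-in-depth layer rejecting it (it assumes a platform where os.Symlink works).

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// ErrNotFound is returned for every rejected path; callers map it to a 404.
var ErrNotFound = errors.New("not found")

// Resolve maps a repo-relative, slash-separated request path to a real file.
func Resolve(root string, known map[string]struct{}, rel string) (string, error) {
	// 1. Membership first: a cheap map lookup rejects most bad input.
	if _, ok := known[rel]; !ok {
		return "", ErrNotFound
	}
	// 2. Defense in depth: the path must already be clean...
	if rel != filepath.ToSlash(filepath.Clean(filepath.FromSlash(rel))) {
		return "", ErrNotFound
	}
	// ...and must still sit under the root once symlinks are resolved.
	realRoot, err := filepath.EvalSymlinks(root)
	if err != nil {
		return "", ErrNotFound
	}
	real, err := filepath.EvalSymlinks(filepath.Join(realRoot, filepath.FromSlash(rel)))
	if err != nil {
		return "", ErrNotFound
	}
	inside, err := filepath.Rel(realRoot, real)
	if err != nil || inside == ".." || strings.HasPrefix(inside, ".."+string(filepath.Separator)) {
		return "", ErrNotFound
	}
	return real, nil
}

// demo builds a throwaway tree and reports which request paths resolve.
func demo() (map[string]bool, error) {
	root, err := os.MkdirTemp("", "wiki-root")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	outside, err := os.CreateTemp("", "secret")
	if err != nil {
		return nil, err
	}
	outside.Close()
	defer os.Remove(outside.Name())
	if err := os.WriteFile(filepath.Join(root, "plan.md"), []byte("# plan"), 0o644); err != nil {
		return nil, err
	}
	if err := os.Symlink(outside.Name(), filepath.Join(root, "escape.md")); err != nil {
		return nil, err
	}
	// Pretend the walker admitted the symlink, to exercise the second layer.
	known := map[string]struct{}{"plan.md": {}, "escape.md": {}}
	got := make(map[string]bool)
	for _, p := range []string{"plan.md", "escape.md", "../etc/passwd"} {
		_, err := Resolve(root, known, p)
		got[p] = err == nil
	}
	return got, nil
}

func main() {
	got, err := demo()
	fmt.Println(got, err)
}
```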

Testing

walker. Table tests for default + user excludes, traversal, symlinks, hidden files, debounce-window coalescing, swap-file filtering, dir-CREATE triggering a recursive subscribe. Fixture trees under testdata/.
render. Golden-file tests: basic Markdown, GFM features, mermaid fence, fenced code with chroma, front-matter parsing. Verify that pass-through HTML round-trips byte-identical (no body extraction).
index. Round-trip tests: index a fixture corpus, search by name, search by content, verify ranking under bm25(8,4,1), verify deletion, verify (mtime, size)-driven reindex. Race test: interleave Reindex and Remove for the same path through the funnel goroutine; assert no zombie rows.
server. httptest coverage of every route. /doc/... returns the chrome shell with the iframe src set to the matching /content/...; /content/... returns a complete HTML5 document with prose CSS for .md and verbatim bytes for .html; ?raw=1 returns text/plain; 404 / 500 paths covered.
Iframe boundary. Headless-browser test (Playwright or Rod): load /doc/<known-path>, assert the iframe loads, assert postMessage from iframe to parent is delivered, assert the parent's history.pushState updates the URL on iframe navigation.
Smoke (Pi). Run against the actual orcha repo on the target Pi; click through all top-level groups, verify mermaid renders, verify search returns sane results for a known query, and capture RSS, binary size, and a pprof snapshot of cache occupancy. Compare against the resource-budget table; declare v1 done only if every row is met.

v2 readiness

The shape we're building maps cleanly onto the v2 features that were brainstormed and deferred:

Identity. A name field set once and stored in a cookie. No structural change to v1 needed; a small middleware reads the cookie when present.
Annotations. Text-range, keyed on the canonical file path the walker already exposes. Adds a db package, annotations handlers, an HTMX panel sibling to the iframe in the chrome shell, and a small client script injected into every content document. The client uses TextQuoteSelector + TextPositionSelector (W3C Web Annotation Data Model — same algorithm as Hypothesis) to serialize ranges and exchanges them with the parent via postMessage. Highlights render in-iframe via the CSS Custom Highlight API (Chromium ≥ 105, Safari ≥ 17.2, Firefox ≥ 140) with a span-wrapping fallback.
Realtime. Server-Sent Events at /events?path=... for new annotations. The connection lives in the chrome (HTMX SSE extension); new events forward into the iframe via postMessage. No new server infrastructure beyond the SSE endpoint.
Anchor drift on doc edits. Best-effort re-anchor with a stored text-quote fallback; uncertain anchors surface as "orphaned" in the panel.

Open questions

None blocking v1. The three previously-open items (iframe sandbox, render-cache bound, empty-tree behavior) are now resolved in the body of the design. Below: v2-adjacent items, no decision needed for v1.

Whether to add in-page match highlighting at search-result click time (uses URL fragment #:~:text= on Chromium, custom highlighter elsewhere — both work inside the iframe).
Whether dark mode is worth shipping in v1 chrome (low effort; punted to keep v1 focused).

References

Spec design system — tokens and components used by this document.
goldmark — Markdown engine.
chroma — syntax highlighter.
modernc.org/sqlite — pure-Go SQLite driver with FTS5.
HTMX — fragment-swapping frontend layer.
fsnotify — filesystem watcher.