Skip to content

feat(web): expand ContentViewer to render PDF, CSV, HTML, XML, JSON, CSS (closes #1329)#1551

Merged
cliffhall merged 2 commits into
v2/mainfrom
feat/1329-expand-content-viewer
Jun 30, 2026
Merged

feat(web): expand ContentViewer to render PDF, CSV, HTML, XML, JSON, CSS (closes #1329)#1551
cliffhall merged 2 commits into
v2/mainfrom
feat/1329-expand-content-viewer

Conversation

@cliffhall

@cliffhall cliffhall commented Jun 29, 2026

Copy link
Copy Markdown
Member

Closes #1329.

Turns ContentViewer into a per-MIME dispatch surface for resource content. It now accepts raw TextResourceContents | BlobResourceContents (in addition to the existing ContentBlock path), so ResourcePreviewPanel.toContentBlock collapses to a passthrough and blob-typed resources render natively instead of falling back to opaque [Binary content …] text.

Type matrix

Type MIME Renderer
PDF application/pdf <iframe> over a Blob URL (#view=FitH), revoked on unmount
CSV text/csv (or .csv) papaparse → Mantine Table (first 100 rows), plain-Code fallback
JSON application/json (or .json) pretty-printed + lazy syntax highlight
XML text/xml, application/xml (or .xml) indented + lazy highlight
HTML text/html (or .html/.htm) sandbox="" iframe over a Blob URL + injected CSP
CSS text/css (or .css) lazy syntax highlight
Markdown text/markdown (or .md) react-markdown, now with a safe-href anchor allowlist
Image / Audio / plain text unchanged

Cross-cutting

  • Lazy highlighter (elements/CodeHighlight): the prism-light runtime, tomorrow theme, and each grammar dynamic-import on first use — one emitted chunk per grammar (prism-light, json, markup, css, tomorrow). Module-level registeredLanguages / failedLoads / loadingPromises caches; renders plain Code until a grammar resolves.
  • HTML sandbox: wrapHtmlWithCsp injects default-src 'none'; … with no script-src (so it stays load-bearing even if sandbox is loosened); served from a Blob URL, revoked on unmount.
  • Markdown anchor safety: SAFE_HREF allowlist (^(https?:|mailto:|#|/(?!/))); non-matching anchors render as inert <span>.
  • MIME inference: ResourcePreviewPanel extends its URI-suffix table to cover the new renderers.

Dependencies

papaparse + react-syntax-highlighter (+ @types/*), both named in the issue. Added tsconfig paths stubs under src/types/ because @types/react-syntax-highlighter's ambient deep-module declarations are shadowed by the real .js under bundler resolution — the same redirect pattern the repo already uses for react/pino.

Notes / deviations

  • papaparse is bundled, not lazy — the issue only mandated lazy-loading for the syntax highlighter. Easy to make lazy if preferred.
  • Stories added for JSON/XML/CSS/CSV/HTML/binary; no PDF story (would need real PDF bytes inline).

Screenshots

Smoke-tested against a stdio MCP server exposing one resource per MIME type. Every renderer below was verified live in the web client. Drag each NN-*.png into its placeholder to embed it.

PDF — application/pdf

07-pdf

CSV — text/csv

01-csv

JSON — application/json

02-json

XML — application/xml

03-xml

HTML (sandboxed iframe) — text/html

04-html

The HTML capture shows the external image broken with CSP console errors — confirming the injected default-src 'none' policy is load-bearing.

CSS — text/css

05-css

Markdown — text/markdown

06-markdown

PNG — image/png

08-png

JPEG — image/jpeg

09-jpeg

WEBP — image/webp

10-webp

Plain text — text/plain

11-plaintext

Testing

npm run validate (all 4 clients), npm run test:coverage (per-file gate — every new file ≥90 on all four dimensions), npm run test:storybook, and the integration suite all pass. New tests cover PDF blob-URL creation + cleanup, CSV table render + fallback, HTML sandbox attributes + CSP injection, JSON/XML/CSS highlighted output, the lazy-load / alias / failed-load / shared-runtime paths, and the URI-suffix MIME-inference paths.

🤖 Generated with Claude Code

…CSS (closes #1329)

Turn ContentViewer into a per-MIME dispatch surface for resource content.
It now accepts raw TextResourceContents | BlobResourceContents (in addition
to the existing ContentBlock path), so ResourcePreviewPanel.toContentBlock
collapses to a passthrough and blob-typed resources render natively instead
of falling back to opaque "[Binary content …]" text.

New renderers (clients/web/src/components/elements/):
- CodeHighlight: lazy react-syntax-highlighter prism-light element. Runtime,
  tomorrow theme, and each grammar dynamic-import on first use (one chunk per
  grammar). Module-level registeredLanguages / failedLoads / loadingPromises
  caches; plain Code block until a grammar resolves.
- PdfFrame: iframe over a Blob URL (#view=FitH), revoked on unmount.
- CsvTable: papaparse -> Mantine Table (first 100 rows), plain-Code fallback.
- HtmlFrame: sandbox="" iframe over a Blob URL with an injected CSP meta
  (no script-src, falls through to default-src 'none').
- contentViewerUtils: getMimeKind, base64 decode helpers, formatXml,
  wrapHtmlWithCsp, SAFE_HREF markdown-anchor allowlist (unsafe href -> span).

ResourcePreviewPanel: extend the URI-suffix MIME inference table
(.csv/.json/.xml/.html/.htm/.css/.pdf alongside .md).

Deps: papaparse + react-syntax-highlighter (+ @types). tsconfig paths stubs
redirect the deep RSH imports for typecheck (the @types ambient declarations
are shadowed by the real .js under bundler resolution) — same pattern the
repo uses for react/pino.

Tests cover PDF blob-URL creation + cleanup, CSV table render + fallback,
HTML sandbox attributes + CSP injection, JSON/XML/CSS highlighted output,
the lazy-load/alias/failed-load/shared-runtime paths, and the URI-suffix
MIME-inference paths. validate, test:coverage (per-file gate), test:storybook,
and integration all pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cliffhall cliffhall added the v2 Issues and PRs for v2 label Jun 29, 2026
@cliffhall

Copy link
Copy Markdown
Member Author

@claude review

@claude

claude Bot commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Claude finished @cliffhall's task in 2m 33s —— View job


Code review — ContentViewer MIME dispatch

Read through the new dispatch surface, the lazy highlighter, the three blob frames, and the panel wiring. This is a clean, well-factored change — pure helpers split out for isolated testing, layered HTML hardening, a genuinely lazy highlighter, and thorough tests/stories. A few things worth addressing, one of which I think is a real (dev-only) bug.

🔴 useObjectUrl is fragile under React StrictMode (PDF/HTML previews can go blank in dev)

useObjectUrl.ts:14-23 creates the URL in useMemo but revokes it in a separate cleanup effect:

const url = useMemo(() => URL.createObjectURL(blob), [blob]);
useEffect(() => () => URL.revokeObjectURL(url), [url]);

main.tsx:21 wraps the app in <StrictMode>. In dev, StrictMode runs effects setup → cleanup → setup. The cleanup revokes url, but useMemo does not re-run on the second setup (no re-render occurs), so the committed url stays revoked and is never recreated. The <iframe src={url}> in PdfFrame/HtmlFrame then points at a dead blob URL — the preview shows blank/broken (flaky, depending on whether the iframe fetched the blob before the synchronous revoke). Production (no StrictMode) is unaffected, which is likely why it looked fine in the smoke test, but cd clients/web && npm run dev is the day-to-day path.

The create-in-useMemo / revoke-in-effect split is the canonical StrictMode pitfall. The robust fix is to pair creation and revocation in the same effect so each setup gets a fresh URL:

const [url, setUrl] = useState<string>();
useEffect(() => {
  const next = URL.createObjectURL(blob);
  setUrl(next);
  return () => URL.revokeObjectURL(next);
}, [blob]);

(That costs the one extra render the current comment was trying to avoid; alternatively, defer the revoke to a microtask so the StrictMode re-setup can cancel it — the same trick AppRenderer.tsx already uses.) The unit test (useObjectUrl.test.tsx) renders without StrictMode, so it doesn't catch this — worth adding a StrictMode-wrapped case. Fix this →

🟡 CSV table silently truncates at 100 rows

CsvTable.tsx:35-38 slices to MAX_ROWS with no indication to the user that rows were dropped. AGENTS.md's "no silent caps" guidance applies here — a footer like "Showing first 100 of N rows" (only when result.data.length > MAX_ROWS) would make the cap visible. Fix this →

🟡 Unguarded atob can crash the panel on malformed blobs

decodeBase64ToUtf8 (contentViewerUtils.ts:81) and decodeBase64ToBytes (:94) call atob, which throws InvalidCharacterError on non-base64 input. These run during render (ResourceContentTextualContent, and PdfFrame's useMemo), so a server returning a malformed blob throws synchronously and takes down the whole ResourcePreviewPanel rather than degrading to the BinaryNotice fallback. Worth wrapping the decode and falling back to the binary notice / a plain error block. The blob comes from an external MCP server, so this is reachable in practice.

🟢 Minor / optional

  • formatXml regex (contentViewerUtils.ts:124) splits on >\s*<, so >/< inside attribute values, text nodes, comments, or CDATA will misindent. It's display-only (the original text is preserved for copy), so low impact — just flagging that the indentation is best-effort.
  • copyable is dropped for CSV/HTML/PDF. ResourcePreviewPanel passes copyable, but the csv/html branches and the blob pdf branch render no CopyButton. Reasonable for a table/iframe, but a user can't copy the raw CSV/HTML source. Intentional? If so, ignore.
  • PDF iframe has no sandbox while HTML gets sandbox="". Browser PDF viewers don't execute embedded JS by default so risk is low, but sandbox as defense-in-depth would be cheap and consistent.
  • Redundant <Stack gap="xs"> wrappers around single children in the csv/html/pdf branches (ContentViewer.tsx:232-243, 309-313) — harmless, just noise.

Nice work overall — the lazy-load caching, the load-bearing CSP comment, and the markdown SAFE_HREF allowlist are all solid. The StrictMode item is the one I'd prioritize.
feat/1329-expand-content-viewer

Address review feedback on the ContentViewer MIME-dispatch PR:

- useObjectUrl: fix a dev-only StrictMode bug where the committed object
  URL was revoked but never recreated (setup→cleanup→setup with no
  re-render), blanking PDF/HTML previews. Revocation is now deferred to a
  microtask and guarded by a "live URL" ref so a StrictMode re-setup
  cancels it, while a real unmount / blob change still releases the URL.
  Avoids set-state-in-effect; mirrors AppRenderer's deferred-disposal.
- atob: guard base64 decoding with tryDecodeBase64ToUtf8 /
  tryDecodeBase64ToBytes so a malformed blob from an external server
  degrades to the binary-content notice instead of throwing during render
  and taking down the whole preview panel.
- CsvTable: surface the 100-row display cap with a
  "Showing first N of M rows" caption (no silent truncation).
- Extract BinaryNotice into its own module so PdfFrame can reuse it
  without an import cycle.

Tests added for the StrictMode path, malformed-base64 fallbacks (PDF +
textual blob), the safe-decode helpers, and the truncation caption.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cliffhall

Copy link
Copy Markdown
Member Author

Thanks for the thorough review — pushed e2b058ec addressing the findings.

🔴 useObjectUrl StrictMode bug — Fixed. Confirmed the setup→cleanup→setup-with-no-rerender failure mode. I didn't take the useState route since it trips the repo's react-hooks/set-state-in-effect rule (which we deliberately don't suppress). Instead: keep create-in-useMemo (live URL on first paint, stable string return) and defer revocation to a microtask guarded by a liveUrlRef — a StrictMode re-setup restores the ref to the same URL so the microtask skips the revoke, while a real unmount / blob change leaves the ref pointing elsewhere so the stale URL is released. Mirrors the deferred-disposal trick already in AppRenderer. Added a StrictMode-wrapped test asserting the committed URL is never revoked.

🟡 Unguarded atob — Fixed. Added tryDecodeBase64ToUtf8 / tryDecodeBase64ToBytes (return null instead of throwing). The textual-blob branch and PdfFrame now degrade to the binary-content notice on malformed base64 instead of throwing during render. Tests added for both paths plus the helpers.

🟡 CSV silent truncation — Fixed. CsvTable now renders a Showing first 100 of N rows caption when the row count exceeds MAX_ROWS (only when truncated). Test added.

🟢 Minor:

  • formatXml regex — acknowledged; left as-is. It's display-only (raw text is preserved for copy) and the indentation is explicitly best-effort.
  • copyable dropped for CSV/HTML/PDF — intentional. The copy overlay is absolutely-positioned top-right and would sit over the iframe/table chrome, and copying the raw source of a rendered table/iframe/PDF is low-value. Left as-is.
  • PDF iframe sandbox — intentionally not added. Chrome's built-in PDF viewer needs scripts + same-origin to render; sandbox="" blanks the preview (verified). The HTML path keeps sandbox="" because that content is untrusted markup we render ourselves; the PDF is handed to the browser's native viewer.
  • Redundant Stack gap="xs" wrappers — left for consistency: every ResourceContent branch (image/audio/binary/csv/html/pdf) uses the same single-child Stack as a uniform spacing container, so removing just three would make the dispatch less symmetric.

Also extracted BinaryNotice into its own module so PdfFrame can reuse it without an import cycle. Full validate + coverage gate (per-file ≥90) + storybook all green.

@claude review

@claude

claude Bot commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Claude finished @cliffhall's task in 1m 17s —— View job


Re-review — ContentViewer fixes (e2b058ec)

Re-read the three flagged areas plus the new wiring. All three findings are resolved correctly, and the reasoning in each reply holds up against the code.

  • useObjectUrl StrictMode fix
  • atob guarding (tryDecode*)
  • CSV truncation caption
  • Wiring in ContentViewer / PdfFrame / HtmlFrame / BinaryNotice

🔴 → ✅ useObjectUrl StrictMode

The deferred-revoke-guarded-by-liveUrlRef approach is sound. I traced all three paths:

  • StrictMode setup→cleanup→setup: cleanup nulls the ref + schedules the microtask; re-setup restores liveUrlRef.current = url (same url, no re-render); microtask sees liveUrlRef.current === url → skips revoke. ✅
  • Real unmount: cleanup nulls the ref, no re-setup; microtask sees null !== url → revokes. ✅
  • Blob change: old cleanup nulls + schedules; new setup sets ref to newUrl; microtask sees newUrl !== oldUrl → revokes the stale one. ✅

The react-hooks/set-state-in-effect constraint is a legitimate reason to avoid the useState shape, and liveUrlRef is a clean alternative that keeps a live URL on first paint. The StrictMode-wrapped test asserts the committed URL is never revoked — exactly the regression guard that was missing. Good fix.

🟢 Minor (optional, dev-only): createObjectURL inside useMemo is double-invoked by StrictMode, so the discarded render pass leaks one un-revoked URL per mount in dev (the committed URL is the one the ref guards). This is inherent to the create-in-useMemo shape and harmless in prod (no StrictMode) — just noting it's the residual cost of avoiding useState. Not worth changing.

🟡 → ✅ Unguarded atob

tryDecodeBase64ToUtf8 / tryDecodeBase64ToBytes wrap the throwing decoders and return null. Both call sites degrade correctly:

  • ResourceContent textual-blob branch (ContentViewer.tsx:305-307) → BinaryNotice on null.
  • PdfFrame (PdfFrame.tsx:22,29-31) → BinaryNotice on null. Worth noting the hooks ordering is correct here: useObjectUrl is still called unconditionally (with an empty Blob when bytes is null) before the early return, so no conditional-hook violation. ✅

🟡 → ✅ CSV truncation

parseCsv now carries total, and the Table.Caption renders Showing first 100 of N rows only when parsed.total > MAX_ROWS (CsvTable.tsx:50-57). Matches the "no silent caps" guidance. ✅

🟢 Minor items — all reasonably dispositioned

formatXml best-effort, dropped copyable for CSV/HTML/PDF, no PDF sandbox (Chrome's viewer needs scripts/same-origin — confirmed reasonable), and the uniform Stack gap="xs" wrappers: each acknowledgement is defensible. No objection.

The BinaryNotice extraction to break the would-be PdfFrame → ContentViewer import cycle is a nice touch, and reusing it as the malformed-base64 fallback keeps the degradation consistent across branches.


Verdict: All blocking/medium findings addressed; the StrictMode fix is correct and now has a regression test. Nothing here blocks merge. ✅
feat/1329-expand-content-viewer

@cliffhall cliffhall merged commit baf79b8 into v2/main Jun 30, 2026
1 check passed
@cliffhall cliffhall deleted the feat/1329-expand-content-viewer branch June 30, 2026 13:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

v2 Issues and PRs for v2

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Expand the resource content viewer to render PDF, CSV, HTML, XML, and syntax-highlighted code

1 participant