[newsfeed] add allowBasicHtmlTags option for basic emphasis#4176
[newsfeed] add allowBasicHtmlTags option for basic emphasis#4176egeekial wants to merge 2 commits into
Conversation
Render a strict allowlist of basic formatting tags (b, strong, i, em, u) in news titles and descriptions, while neutralizing all other HTML. Feeds such as The Atlantic encode emphasis as entities (<em>), which html-to-text decoded to a literal <em> string that the template then auto-escaped, so the raw tag was shown on screen. The new opt-in allowBasicHtmlTags option (default false) sanitizes both fields by escaping everything and restoring only the exact, attribute-free allowlisted tags, so the result is safe to render and arbitrary HTML/script injection is impossible. Adds unit tests for the sanitizer and an e2e test covering rendering and an injection attempt. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
do you ever envision the user wanting/needing to adjust the list of tags in newfeedfetcher.js if this is approved, will you also create a matching PR for the documentation repo? |
My intent was to keep this limited to basic formatting and make the feature something users can simply turn on or off. I could imagine a user wanting finer control, such as allowing italics but not bold or underline, but I suspect most users would not want or need that level of granularity. If you think that configurability would be beneficial, I would prefer exposing it as a subset of the existing safe list rather than allowing users to add arbitrary tags. Letting users add new tags seems more likely to create rendering issues or introduce an injection risk.
Yes, I can do that. |
|
I think this is a useful improvement. Thanks! :) Building on @sdetweil’s point about configurability, I think a single option like Suggested behavior:
This would keep the security model strict, avoids surprising defaults, no need for multiple options and still gives users flexible control. I think it should be possible to implement this with a reasonable level of complexity. |
I totally agree: nice pr and idea, but needs a little tweaking like kristjan proposes. |
I don't know enough about the tags in the content. Are we sure this is the finite list.. (without us having to take PRs every so often, slowing functional delivery. One thing I've learned over the years is that I learn new stuff constantly about how users expect the system to behave. I like the PR, and the selectable list.. |
…gs allowlist Per PR feedback, expose a single configurable array option instead of a boolean. Users opt into individual formatting tags chosen from a hardcoded safe list (b, strong, i, em, u, br, code, s, sub, sup); the default is [] so current behavior is unchanged. Any requested tag outside the safe list is ignored with a warning, keeping the strict allowlist-only security model. sanitizeBasicHtml now takes the validated allowlist as an argument and builds its selectors/restore-regex from it. br is handled as a void element (emitted as a single <br>) when allowed, and otherwise keeps collapsing to a space. The Nunjucks template uses a length check since an empty array is truthy. Updates the unit and e2e tests/config to the new option, adding cases for per-tag allowlisting, the empty-list default, the new safe tags, and br. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Please make sure that you have followed these 3 rules before submitting your Pull Request:
Render a strict allowlist of basic formatting tags (b, strong, i, em, u) in news titles and descriptions, while neutralizing all other HTML.
Feeds such as The Atlantic encode emphasis as entities (<em>), which html-to-text decoded to a literal string that the template then auto-escaped, so the raw tag was shown on screen. The new opt-in allowBasicHtmlTags option (default false) sanitizes both fields by escaping everything and restoring only the exact, attribute-free allowlisted tags, so the result is safe to render and arbitrary HTML/script injection is impossible.
Adds unit tests for the sanitizer and an e2e test covering rendering and an injection attempt.
Before screenshot:

After screenshot:
Note: Sometimes the development moves very fast. It is highly
recommended that you update your branch of
developbefore creating apull request to send us your changes. This makes everyone's lives
easier (including yours) and helps us out on the development team.
Thanks again and have a nice day!