just remove figures if they don't attach to any content#45
Conversation
instead of making contentless spans
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review infoConfiguration used: defaults Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAttempt removal of contentless Figure and Link structural elements from their parent before falling back to demotion; add removal-tracking counters and identity helpers; update logs and tests to stop asserting a specific Changes
Sequence Diagram(s)(Skipped — changes are internal to remediation logic and tests, not multi-component control flow.) Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/server.tests/Integration/Remediate/PdfRemediationProcessorAltTextAssociationTests.cs (1)
14-14:⚠️ Potential issue | 🟡 MinorStale test method names and assertion reason strings still describe demotion, not removal
- Method
ProcessAsync_ContentlessFigureWithoutAlt_DemotesRole(Line 14) andProcessAsync_ContentlessFigureWithPlaceholderAlt_RemovesAltAndDemotesRole(Line 58) still refer to "DemotesRole" even though the new primary behavior is removal from the parent.- The
becausestrings on Lines 45 and 93 ("contentless /Figure nodes should be demoted") contradict the PR intent.🛠️ Suggested rename
- public async Task ProcessAsync_ContentlessFigureWithoutAlt_DemotesRole() + public async Task ProcessAsync_ContentlessFigureWithoutAlt_RemovesFigureFromStructTree() - ListStructElementsByRole(outputPdf, PdfName.Figure).Should().BeEmpty("contentless /Figure nodes should be demoted"); + ListStructElementsByRole(outputPdf, PdfName.Figure).Should().BeEmpty("contentless /Figure nodes should be removed"); - public async Task ProcessAsync_ContentlessFigureWithPlaceholderAlt_RemovesAltAndDemotesRole() + public async Task ProcessAsync_ContentlessFigureWithPlaceholderAlt_RemovesAltAndFigureFromStructTree() - ListStructElementsByRole(outputPdf, PdfName.Figure).Should().BeEmpty("contentless /Figure nodes should be demoted"); + ListStructElementsByRole(outputPdf, PdfName.Figure).Should().BeEmpty("contentless /Figure nodes should be removed");Also applies to: 45-45, 58-58, 93-93
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/server.tests/Integration/Remediate/PdfRemediationProcessorAltTextAssociationTests.cs` at line 14, Rename the stale test method names and update assertion reason strings to reflect removal (not demotion): change ProcessAsync_ContentlessFigureWithoutAlt_DemotesRole and ProcessAsync_ContentlessFigureWithPlaceholderAlt_RemovesAltAndDemotesRole to names that mention "RemovesFromParent" (or similar) and update every assertion `because` string that says "contentless /Figure nodes should be demoted" to "contentless /Figure nodes should be removed from their parent" (or equivalent). Ensure you update all references to these test method names within the test class and any test runners/mappings so names remain consistent with the PR behavior.
🧹 Nitpick comments (2)
tests/server.tests/Integration/Remediate/PdfRemediationProcessorAltTextAssociationTests.cs (1)
45-46: Tests do not cover the Span-demotion fallback path
ListStructElementsByRole(outputPdf, PdfName.Figure).Should().BeEmpty()passes whether the figure was removed or demoted to Span (since a Span-demoted element no longer has role/Figure). The newListStructElements(...).Should().NotBeEmpty()check (Lines 46, 94) only verifies something remains in the structure tree, not specifically that aSpandemotion fallback element does or does not exist.Consider adding a dedicated test where the parent struct element makes removal impossible (e.g., single direct/non-indirect child dict), asserting that the Span fallback is used and the element is still present with role Span. This would give meaningful regression coverage for the two distinct code paths in
TryRemoveStructElemFromParent.Would you like me to draft a test case for the Span-demotion fallback path?
Also applies to: 93-94
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/server.tests/Integration/Remediate/PdfRemediationProcessorAltTextAssociationTests.cs` around lines 45 - 46, The tests currently don't verify the Span-demotion fallback path in TryRemoveStructElemFromParent: add a new test in PdfRemediationProcessorAltTextAssociationTests that sets up a parent struct element that cannot be removed (e.g., a parent with a single direct/non-indirect child dict), run the remediation, then assert that ListStructElementsByRole(outputPdf, PdfName.Span) contains the demoted element (and that ListStructElementsByRole(outputPdf, PdfName.Figure) no longer contains it), so the Span fallback is explicitly covered; use the existing helpers ListStructElements, ListStructElementsByRole and reference PdfName.Figure/PdfName.Span and TryRemoveStructElemFromParent behavior when constructing the scenario.server.core/Remediate/PdfRemediationProcessor.cs (1)
1058-1073: EmptyKarray left on parent after removing the last childWhen
kidsis aPdfArraycontaining only the target struct elem, afterarray.Remove(i)the parent retainsK: []. The scalar-child path (Lines 1076-1079) correctly removes theKentry entirely, but the array path does not clean up an empty result. While not technically invalid per the PDF spec, an empty K array can confuse accessibility checkers and diverges from the cleaner scalar-removal behavior.🛠️ Proposed fix: remove empty K array from parent
if (kids is PdfArray array) { var removedAny = false; for (var i = array.Size() - 1; i >= 0; i--) { var item = array.Get(i); if (!IsSameStructElem(item, structElem, targetRef)) { continue; } array.Remove(i); removedAny = true; } + if (removedAny && array.Size() == 0) + { + parent.Remove(PdfName.K); + } return removedAny; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server.core/Remediate/PdfRemediationProcessor.cs` around lines 1058 - 1073, The array-branch logic in PdfRemediationProcessor that iterates a PdfArray of kids and removes matching items (uses PdfArray, IsSameStructElem, structElem, targetRef and array.Remove(i)) can leave an empty K: [] on the parent; after the removal loop, if removedAny is true and array.Size() == 0, remove the "K" entry from the parent (the same way the scalar-child path does in the nearby code) so the parent dictionary no longer contains an empty K array — locate the parent object used by this method and delete its K key when the array becomes empty.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server.core/Remediate/PdfRemediationProcessor.cs`:
- Around line 580-591: The counters contentlessFiguresRemoved and
contentlessLinksRemoved are incremented regardless of whether elements were
actually removed or only demoted; update the logic in the loop that iterates
PdfStructTreeIndex.ListStructElementsByRole so you increment separate counters
for removals vs demotions (e.g., contentlessFiguresRemoved vs
contentlessFiguresDemoted and contentlessLinksRemoved vs
contentlessLinksDemoted) based on the boolean result of
TryRemoveStructElemFromParent(figure) / TryRemoveStructElemFromParent(link), and
then change the telemetry/log messages to report removed and demoted counts
distinctly instead of claiming all were "removed".
---
Outside diff comments:
In
`@tests/server.tests/Integration/Remediate/PdfRemediationProcessorAltTextAssociationTests.cs`:
- Line 14: Rename the stale test method names and update assertion reason
strings to reflect removal (not demotion): change
ProcessAsync_ContentlessFigureWithoutAlt_DemotesRole and
ProcessAsync_ContentlessFigureWithPlaceholderAlt_RemovesAltAndDemotesRole to
names that mention "RemovesFromParent" (or similar) and update every assertion
`because` string that says "contentless /Figure nodes should be demoted" to
"contentless /Figure nodes should be removed from their parent" (or equivalent).
Ensure you update all references to these test method names within the test
class and any test runners/mappings so names remain consistent with the PR
behavior.
---
Nitpick comments:
In `@server.core/Remediate/PdfRemediationProcessor.cs`:
- Around line 1058-1073: The array-branch logic in PdfRemediationProcessor that
iterates a PdfArray of kids and removes matching items (uses PdfArray,
IsSameStructElem, structElem, targetRef and array.Remove(i)) can leave an empty
K: [] on the parent; after the removal loop, if removedAny is true and
array.Size() == 0, remove the "K" entry from the parent (the same way the
scalar-child path does in the nearby code) so the parent dictionary no longer
contains an empty K array — locate the parent object used by this method and
delete its K key when the array becomes empty.
In
`@tests/server.tests/Integration/Remediate/PdfRemediationProcessorAltTextAssociationTests.cs`:
- Around line 45-46: The tests currently don't verify the Span-demotion fallback
path in TryRemoveStructElemFromParent: add a new test in
PdfRemediationProcessorAltTextAssociationTests that sets up a parent struct
element that cannot be removed (e.g., a parent with a single direct/non-indirect
child dict), run the remediation, then assert that
ListStructElementsByRole(outputPdf, PdfName.Span) contains the demoted element
(and that ListStructElementsByRole(outputPdf, PdfName.Figure) no longer contains
it), so the Span fallback is explicitly covered; use the existing helpers
ListStructElements, ListStructElementsByRole and reference
PdfName.Figure/PdfName.Span and TryRemoveStructElemFromParent behavior when
constructing the scenario.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
server.core/Remediate/PdfRemediationProcessor.cstests/server.tests/Integration/Remediate/PdfRemediationProcessorAltTextAssociationTests.cs
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
server.core/Remediate/PdfRemediationProcessor.cs (1)
580-595: Counter semantics now correctly track removed vs. demoted — LGTM.The previous review's concern about unconditional counter increments has been addressed:
contentlessFiguresRemovedis only incremented on a successfulTryRemoveStructElemFromParentreturn, andcontentlessFiguresDemotedonly on the fallback path. Same pattern is applied consistently at Lines 610–625 for links.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server.core/Remediate/PdfRemediationProcessor.cs` around lines 580 - 595, Counters for contentless figures were fixed to increment only on the branch taken; apply the same semantics to the link-handling block: in the loop over PdfStructTreeIndex.ListStructElementsByRole(pdf, PdfName.Link) ensure you call TryRemoveStructElemFromParent(link) and only increment contentlessLinksRemoved when it returns true, otherwise set link.Put(PdfName.S, RoleSpan) and increment contentlessLinksDemoted; reference the existing methods StructElemHasAssociatedContent and TryRemoveStructElemFromParent to locate and mirror the corrected pattern used for figures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server.core/Remediate/PdfRemediationProcessor.cs`:
- Around line 1070-1086: In TryRemoveStructElemFromParent, when handling the
PdfArray case (variable array/kids), after removing matching items set
removedAny as now and if removedAny is true and array.Size() == 0 remove the
parent's /K entry (call parent.Remove(PdfName.K)) before returning; ensure the
method still returns true when removals occurred. This mirrors the single-child
branch and prevents leaving an empty /K array on the parent.
---
Duplicate comments:
In `@server.core/Remediate/PdfRemediationProcessor.cs`:
- Around line 580-595: Counters for contentless figures were fixed to increment
only on the branch taken; apply the same semantics to the link-handling block:
in the loop over PdfStructTreeIndex.ListStructElementsByRole(pdf, PdfName.Link)
ensure you call TryRemoveStructElemFromParent(link) and only increment
contentlessLinksRemoved when it returns true, otherwise set link.Put(PdfName.S,
RoleSpan) and increment contentlessLinksDemoted; reference the existing methods
StructElemHasAssociatedContent and TryRemoveStructElemFromParent to locate and
mirror the corrected pattern used for figures.
instead of making contentless spans
Summary by CodeRabbit
Bug Fixes
Tests