Demotes struct elements lacking associated content#44
Conversation
📝 WalkthroughWalkthroughThe changes add logic to detect and demote contentless structure elements in PDF remediation. Helper methods determine when structure elements lack associated content; a fallback pass then removes placeholder Alt text and demotes these elements to neutral Span roles. Integration tests validate the demotion behavior on tagged PDFs with contentless figures. Changes
Sequence DiagramsequenceDiagram
participant Proc as PdfRemediationProcessor
participant StructTree as Structure Tree
participant AltSvc as Alt Text Service
participant Logger as Logger
rect rgba(70, 130, 180, 0.5)
Note over Proc: Remediation Process (Fallback Pass)
end
loop For each Figure/Link element
Proc->>StructTree: Check StructElemHasAssociatedContent()
StructTree-->>Proc: has content? (true/false)
alt No Associated Content
Proc->>StructTree: Remove Alt text
Proc->>StructTree: Set Role to Span
Proc->>Proc: Increment demotion counter
Proc->>Logger: Log demotion event
else Has Associated Content
Note over Proc: Skip remediation
end
end
Proc->>Logger: Log final summary with demotion counts
Logger-->>Proc: Summary emitted
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~28 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
server.core/Remediate/PdfRemediationProcessor.cs (1)
1082-1090:DereferenceStructTreeNodeduplicatesPdfStructTreeIndex.Dereference.This method (lines 1082-1090) is byte-for-byte identical to the private
Dereferencemethod inside the nestedPdfStructTreeIndexrecord (lines 2102-2110). Consider extracting a single sharedstatichelper at thePdfRemediationProcessorclass level to avoid the duplication.♻️ Suggested consolidation
- private static PdfObject DereferenceStructTreeNode(PdfObject obj) - { - if (obj is PdfIndirectReference reference) - { - return reference.GetRefersTo(true); - } - - return obj; - }Then update
StructElemKidsContainAssociatedContent(line 1046) andPdfStructTreeIndex.Dereferencecallers to use a single shared helper, e.g.:+ private static PdfObject Dereference(PdfObject obj) + => obj is PdfIndirectReference reference ? reference.GetRefersTo(true) : obj;(rename the existing
PdfStructTreeIndex.Dereferenceto call through, or inline)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server.core/Remediate/PdfRemediationProcessor.cs` around lines 1082 - 1090, Extract the duplicated dereference logic into a single private static helper on PdfRemediationProcessor (e.g., a method that takes PdfObject and returns PdfIndirectReference.GetRefersTo(true) when the input is a PdfIndirectReference), then update the existing PdfRemediationProcessor.DereferenceStructTreeNode and the nested PdfStructTreeIndex.Dereference to delegate to that shared helper (or remove the nested duplicate and call the new helper directly from StructElemKidsContainAssociatedContent and any PdfStructTreeIndex callers); reference the existing methods DereferenceStructTreeNode, PdfStructTreeIndex.Dereference, StructElemKidsContainAssociatedContent and the PdfStructTreeIndex record when making the change.tests/server.tests/Integration/Remediate/PdfRemediationProcessorAltTextAssociationTests.cs (1)
223-252: Note:GetFallbackAltTextForImage()return value is coupled to the processor'sPlaceholderImageAltTextconstant.
FakeAltTextService.GetFallbackAltTextForImage()returns"alt text for image", which must match the processor'sPlaceholderImageAltTextconstant forHasPlaceholderAltto trigger. If that constant ever changes, this test will silently pass without actually testing the placeholder path. Consider adding a brief comment noting the coupling, or extracting the value to a shared constant.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/server.tests/Integration/Remediate/PdfRemediationProcessorAltTextAssociationTests.cs` around lines 223 - 252, The test's FakeAltTextService.GetFallbackAltTextForImage() is hard-coded to "alt text for image" and is coupled to the processor's PlaceholderImageAltText constant which makes HasPlaceholderAlt assertions fragile; either reference a shared constant for the fallback value (extract the processor's PlaceholderImageAltText into a test-visible constant and use that in FakeAltTextService.GetFallbackAltTextForImage and/or the test assertions) or add a clear comment above FakeAltTextService.GetFallbackAltTextForImage noting the coupling to PlaceholderImageAltText so future changes don't silently break the test; update references to use the shared symbol (PlaceholderImageAltText or newly extracted constant) and adjust HasPlaceholderAlt assertions accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@server.core/Remediate/PdfRemediationProcessor.cs`:
- Around line 1082-1090: Extract the duplicated dereference logic into a single
private static helper on PdfRemediationProcessor (e.g., a method that takes
PdfObject and returns PdfIndirectReference.GetRefersTo(true) when the input is a
PdfIndirectReference), then update the existing
PdfRemediationProcessor.DereferenceStructTreeNode and the nested
PdfStructTreeIndex.Dereference to delegate to that shared helper (or remove the
nested duplicate and call the new helper directly from
StructElemKidsContainAssociatedContent and any PdfStructTreeIndex callers);
reference the existing methods DereferenceStructTreeNode,
PdfStructTreeIndex.Dereference, StructElemKidsContainAssociatedContent and the
PdfStructTreeIndex record when making the change.
In
`@tests/server.tests/Integration/Remediate/PdfRemediationProcessorAltTextAssociationTests.cs`:
- Around line 223-252: The test's
FakeAltTextService.GetFallbackAltTextForImage() is hard-coded to "alt text for
image" and is coupled to the processor's PlaceholderImageAltText constant which
makes HasPlaceholderAlt assertions fragile; either reference a shared constant
for the fallback value (extract the processor's PlaceholderImageAltText into a
test-visible constant and use that in
FakeAltTextService.GetFallbackAltTextForImage and/or the test assertions) or add
a clear comment above FakeAltTextService.GetFallbackAltTextForImage noting the
coupling to PlaceholderImageAltText so future changes don't silently break the
test; update references to use the shared symbol (PlaceholderImageAltText or
newly extracted constant) and adjust HasPlaceholderAlt assertions accordingly.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
server.core/Remediate/PdfRemediationProcessor.cstests/server.tests/Integration/Remediate/PdfRemediationProcessorAltTextAssociationTests.cs
don't try to add alt text to fake figures
Summary by CodeRabbit
New Features
Tests