Feature/ab#32008 add office extraction support #2048

Draft
jacobwillsmith wants to merge 3 commits into dev from feature/AB#32008-AddOfficeExtractionSupport

@jacobwillsmith

Pull request overview

This pull request adds Office document text extraction support for AI attachment summarization.

Content from .docx, .xls, and .xlsx files is extracted and normalized before it is sent to AI attachment analysis. This improves summary quality and reduces metadata-only fallbacks when readable file content exists.

Changes:

  • Added Word/Excel extraction paths in TextExtractionService for .docx, .xls, and .xlsx
  • Reused shared normalization/cleanup helpers for Office extraction output before AI prompt usage
  • Added bounded extraction limits for Office parsing to keep processing constrained
  • Added NPOI package reference to the application project
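
For readers unfamiliar with bounded extraction, here is a minimal sketch of the idea behind the changes listed above. It is not the PR's code: the actual implementation lives in the C# TextExtractionService and uses NPOI, neither of which is shown in this conversation. The sketch is written in Python with only the standard library, handles .docx only, and every name in it (`extract_docx_text`, `normalize`, `MAX_CHARS`) and the limit value are assumptions made for illustration.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace for paragraph (w:p) and text run (w:t) elements.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Hypothetical cap; the limits actually used in the PR are not stated.
MAX_CHARS = 10_000


def normalize(text: str) -> str:
    """Collapse runs of spaces/tabs, trim each line, and drop blank lines."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def extract_docx_text(data: bytes, max_chars: int = MAX_CHARS) -> str:
    """Return normalized paragraph text from a .docx payload.

    Collection stops once roughly max_chars characters have been gathered,
    and the final result is truncated to exactly max_chars.
    Simplification: paragraphs nested inside other paragraphs (for example
    text boxes) are not de-duplicated.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        with archive.open("word/document.xml") as doc:
            root = ET.parse(doc).getroot()

    parts = []
    total = 0
    for para in root.iter(W_NS + "p"):
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if not text:
            continue
        parts.append(text)
        total += len(text) + 1  # +1 for the newline added when joining
        if total >= max_chars:
            break  # bound reached: stop walking the document

    return normalize("\n".join(parts))[:max_chars]
```

A caller would pass the attachment bytes and fall back to metadata-only handling when the result is empty. A spreadsheet path would follow the same pattern, iterating over sheets, rows, and cells with its own caps.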

# Conflicts:
#	applications/Unity.GrantManager/src/Unity.GrantManager.Application/AI/TextExtractionService.cs
#	applications/Unity.GrantManager/src/Unity.GrantManager.Application/Unity.GrantManager.Application.csproj
@github-actions

🧪 Unit Test Results (Parallel Execution)

Tests

📊 Summary

Result Count
✅ Passed 451
❌ Failed 0
⚠️ Skipped 0

📄 HTML Reports

  • Merged Tests (HTML): Included in artifacts
    Generated automatically by CI.
