
Implement Cross-Platform PDF Generation without MS Word dependency#2111

Merged
sydseter merged 6 commits into OWASP:master from abhijit9040:master
Feb 4, 2026

Conversation

@abhijit9040
Contributor

Key Changes
PDF Conversion Engine:

  • Prioritized LibreOffice (headless mode) for PDF generation across all platforms.
  • Added automatic detection of LibreOffice on Windows, including the default installation path when it is not available in PATH.
  • Implemented a temporary user profile using -env:UserInstallation to:
      • Prevent file locking
      • Enable parallel and reliable PDF conversions
      • Improve stability in CI/CD and Docker environments
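
The LibreOffice bullets above can be sketched roughly as follows. This is a minimal illustration and not the code in scripts/convert.py: the helper names `find_soffice`, `build_convert_command` and `convert_to_pdf`, the timeout value, and the Windows install path are all assumptions made for the example.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import List, Optional

# Assumed default install location on Windows; the real script may check other paths.
WINDOWS_DEFAULT = Path(r"C:\Program Files\LibreOffice\program\soffice.exe")


def find_soffice() -> Optional[str]:
    """Return a LibreOffice executable from PATH, else the assumed Windows default."""
    for name in ("soffice", "libreoffice"):
        found = shutil.which(name)
        if found:
            return found
    if WINDOWS_DEFAULT.is_file():
        return str(WINDOWS_DEFAULT)
    return None


def build_convert_command(
    soffice: str, source: Path, out_dir: Path, profile_dir: Path
) -> List[str]:
    """Build a headless conversion command that uses its own user profile."""
    return [
        soffice,
        # A private profile avoids sharing the default profile between runs.
        f"-env:UserInstallation={profile_dir.resolve().as_uri()}",
        "--headless",
        "--convert-to",
        "pdf",
        "--outdir",
        str(out_dir),
        str(source),
    ]


def convert_to_pdf(source: Path, out_dir: Path) -> Path:
    """Convert one document to PDF using a throwaway profile directory."""
    soffice = find_soffice()
    if soffice is None:
        raise FileNotFoundError("LibreOffice (soffice) was not found")
    with tempfile.TemporaryDirectory() as profile:
        cmd = build_convert_command(soffice, source, out_dir, Path(profile))
        subprocess.run(cmd, check=True, timeout=300)  # timeout is an arbitrary choice
    return out_dir / (source.stem + ".pdf")
```

Because each call gets its own profile directory, several conversions can in principle run side by side without contending for the shared default profile, which is the behaviour the bullets describe.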

Template Migration:

  • Replaced .docx guide templates with .odt (OpenDocument Text) to ensure consistent, pixel-accurate PDF output via LibreOffice.
  • Updated scripts/convert.py to prioritize .odt templates for the guide layout.

Enhanced XML Processing:

  • Refactored replace_text_in_xml_file to robustly support:
      • ODT internal XML (content.xml, styles.xml)
      • Existing IDML XML structures
  • Optimized replacement logic by sorting keys from longest to shortest, preventing partial placeholder corruption during substitution.
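
To illustrate why the longest-first ordering matters, here is a small sketch. The function name `replace_placeholders` and the keys `CARD_1` and `CARD_10` are hypothetical and are not taken from replace_text_in_xml_file; they only show the prefix problem.

```python
from typing import Dict


def replace_placeholders(text: str, replacements: Dict[str, str]) -> str:
    """Replace each key with its value, processing the longest keys first."""
    for key in sorted(replacements, key=len, reverse=True):
        text = text.replace(key, replacements[key])
    return text


def replace_in_dict_order(text: str, replacements: Dict[str, str]) -> str:
    """Naive variant for comparison: keys are applied in insertion order."""
    for key, value in replacements.items():
        text = text.replace(key, value)
    return text


# Hypothetical keys where one is a prefix of the other.
mapping = {"CARD_1": "one", "CARD_10": "ten"}
source = "CARD_10 and CARD_1"

print(replace_placeholders(source, mapping))   # ten and one
print(replace_in_dict_order(source, mapping))  # one0 and one
```

In the naive variant the shorter key matches inside the longer placeholder first and leaves a stray `0` behind, which is the kind of partial corruption the sorting avoids.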

Script Robustness:

  • Moved Microsoft Word–dependent imports (docx, docx2pdf) into targeted functions, allowing:
      • The script to run without these libraries when LibreOffice is available
      • Cleaner execution in Linux, Docker, and CI environments
  • Improved metadata extraction and language file matching to better handle:
      • Legacy naming conventions
      • Edge cases such as against-security editions
      • Variations in language code formats
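
The deferred-import pattern described above might look something like this. It is a sketch under assumptions: the function names `convert_with_word` and `word_fallback_available` are invented for the example and the actual layout of scripts/convert.py may differ.

```python
from importlib.util import find_spec
from pathlib import Path


def convert_with_word(docx_path: Path, pdf_path: Path) -> None:
    """Fallback path: docx2pdf is imported only when this function is called."""
    # Deferred import: merely importing this module never requires docx2pdf.
    from docx2pdf import convert

    convert(str(docx_path), str(pdf_path))


def word_fallback_available() -> bool:
    """Report whether docx2pdf is installed, without importing it."""
    return find_spec("docx2pdf") is not None
```

With the import inside the function, a Linux or Docker host that only has LibreOffice can load the module and never touch the Word-based path; an ImportError would surface only if the fallback is actually invoked.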

Cleanup and Performance:

  • Automated cleanup of intermediate .odt files after successful PDF generation.
  • Added more granular debug-level logging to improve observability and troubleshooting of the conversion pipeline.
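
As a rough sketch of the cleanup step, the helper below deletes the intermediate file only after checking that a PDF was produced, and logs both outcomes at debug level. The name `cleanup_intermediate` and the non-empty-file check are assumptions for illustration, not details confirmed by the PR.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def cleanup_intermediate(odt_path: Path, pdf_path: Path) -> bool:
    """Delete the intermediate .odt only if the PDF exists and is non-empty."""
    if not pdf_path.is_file() or pdf_path.stat().st_size == 0:
        logger.debug("Keeping %s: no usable PDF at %s", odt_path, pdf_path)
        return False
    odt_path.unlink(missing_ok=True)  # requires Python 3.8+
    logger.debug("Removed intermediate file %s", odt_path)
    return True
```

Keeping the intermediate file when conversion fails leaves something to inspect while troubleshooting.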

Issue: #2110 – Implement Cross-Platform PDF Generation without MS Word dependency

Collaborator

@sydseter sydseter left a comment

Have a look at my comments. Thank you for your efforts!

@sydseter
Collaborator

sydseter commented Feb 1, 2026

Remember to make sure you run the tests before pushing your commits.

@sydseter
Collaborator

sydseter commented Feb 1, 2026

nc

@sydseter sydseter closed this Feb 1, 2026
@sydseter sydseter reopened this Feb 1, 2026
@abhijit9040 abhijit9040 requested a review from sydseter February 2, 2026 05:20
@abhijit9040
Contributor Author

Hi @sydseter, please review my PR and let me know if any further changes are needed.

@sydseter
Collaborator

sydseter commented Feb 3, 2026

I will need some time to test it out. I'll get back to you.

@sydseter
Collaborator

sydseter commented Feb 3, 2026

This works quite well. It would be great if we could also do the following:
Could we document the LibreOffice installation here? https://github.com/owasp/cornucopia/blob/master/scripts/README.md

  • Windows: winget install -e --id TheDocumentFoundation.LibreOffice
  • Mac OS X: ?
  • Ubuntu: ?

@abhijit9040
Contributor Author

Yes, that makes sense. I’ll add LibreOffice installation instructions to scripts/README.md.

@abhijit9040
Contributor Author

Hi @sydseter, please take a final look. I have updated the LibreOffice installation instructions. Let me know if any further changes are needed.

sydseter previously approved these changes Feb 4, 2026
@sydseter
Collaborator

sydseter commented Feb 4, 2026

@abhijit9040 some of your commits aren’t signed. Could you do a git reset and recommit using git commit signing?

…elines

- Resolved cyclomatic complexity and mypy errors in scripts/convert.py.
- Added LibreOffice installation instructions to scripts/README.md.
- Added Abhijit Sahoo to the volunteer contributor list.
- Improved GitHub Actions workflows for PR artifact commenting and secure checkout.
- Reverted unintentional changes to copi.owasp.org/Dockerfile as per feedback.
@abhijit9040
Contributor Author

The conflicts are due to recent upstream changes. I’ll push an update shortly.

@sydseter sydseter merged commit 010475d into OWASP:master Feb 4, 2026
12 checks passed