v5.10.1 - 2026-04-24

Fix

  • Memory management for docling upstream (#263) (db84017)

v5.10.0 - 2026-04-22

Feature

Fix

v5.9.0 - 2026-04-15

Feature

  • Add the C++ analysis script and enhance the extraction of bitmap types (fix for rotated images) (#250) (70fa300)

v5.8.0 - 2026-04-08

Feature

  • Improve extraction from fillable fields (#247) (c3c1e85)

Fix

v5.7.0 - 2026-04-01

Feature

v5.6.2 - 2026-03-29

Fix

  • Prevent infinite loop in TOC extraction with circular PDF references (#246) (092d1b8)

v5.6.1 - 2026-03-24

Fix

v5.6.0 - 2026-03-20

Feature

Performance

  • Optimize stream decoding with regexp fast path (#242) (e54775c)

v5.5.0 - 2026-03-04

Feature

v5.4.2 - 2026-03-03

Fix

  • Ligatures and unicode chars in Differences (#234) (856c0fe)

v5.4.1 - 2026-03-03

Fix

v5.4.0 - 2026-02-24

Feature

  • Add config option to remove glyph output (#231) (9657023)

Fix

v5.3.4 - 2026-02-23

Fix

v5.3.3 - 2026-02-20

Fix

  • Replace fixed-size utf8::append buffers with std::back_inserter to prevent segfaults (#224) (237cef6)
  • Bridge PointerHolder to std::shared_ptr for qpdf 10.x + (#221) (b0817db)

v5.3.2 - 2026-02-17

Fix

v5.3.1 - 2026-02-17

Fix

  • Deal with image containing rotated pages (#217) (0b592f6)

v5.3.0 - 2026-02-16

Feature

  • Refactor pdf resources to pdf page item (#215) (e7812a1)
  • Refactored the code and removed a lot of extra json parameters (#213) (67d2922)
  • Removing the json from the pdf-parser (#210) (3272dd8)
  • Renaming lines to shapes and enriching with graphics (color, filling and stroking) (#209) (ea5f1d8)
  • Add decoding config to decode_page (#208) (f01ce84)
  • Add image extraction (#207) (25672da)

Fix

  • Recursively traverse parent chain for inherited MediaBox (#204) (bb0b4ef)

Performance

v5.2.0 - 2026-01-30

Feature

Performance

v5.1.0 - 2026-01-26

Feature

v5.0.0 - 2026-01-20

Feature

Breaking

v4.7.3 - 2026-01-13

Fix

v4.7.2 - 2025-12-02

Fix

  • Fix "could not find the page-dimensions" error by restoring the parent MediaBox (#181) (1d3f78e)

v4.7.1 - 2025-11-05

Fix

v4.7.0 - 2025-10-20

Feature

  • Support reading password protected PDF (#169) (0c64402)

v4.6.0 - 2025-10-17

Feature

v4.5.1 - 2025-10-16

Fix

v4.5.0 - 2025-09-17

Feature

v4.4.0 - 2025-09-04

Feature

  • Reset to the old parameters in sanitation (#163) (0402b3f)

v4.3.0 - 2025-09-03

Feature

v4.2.3 - 2025-08-22

Fix

v4.2.2 - 2025-08-19

Fix

v4.2.1 - 2025-08-19

Fix

v4.2.0 - 2025-08-19

Feature

v4.1.0 - 2025-06-24

Feature

Fix

v4.0.5 - 2025-06-13

Fix

v4.0.4 - 2025-06-10

Fix

  • Fix cropbox if it is larger than mediabox (#126) (a157d5a)

v4.0.3 - 2025-06-05

Fix

  • Filenames with unicode chars on Windows (#124) (ec6556b)

v4.0.2 - 2025-06-04

Fix

v4.0.1 - 2025-04-09

Fix

  • Use FontMatrix to scale Type3 font metrics (#113) (38ddbb5)

v4.0.0 - 2025-03-14

Feature

  • Update API, naming, and tests. Move data model to docling-core (#107) (ca7d584)

Fix

  • Update mergify config for major releases (#109) (e6225c9)

Breaking

  • Update API, naming, and tests. Move data model to docling-core (#107) (ca7d584)

v3.4.0 - 2025-02-18

Feature

  • Establish char_cells, word_cells and line_cells, other fixes (#101) (c2f9741)

v3.3.1 - 2025-02-13

Fix

Documentation

  • Updated import for pdf_parser_v2 in README (#100) (01238dd)
  • Fixed broken link in README.md (#97) (8ec116e)

v3.3.0 - 2025-02-06

Feature

Fix

v3.2.0 - 2025-02-02

Feature

  • Added the pure chars and fixed the duplicate text (#91) (9718762)

Fix

Documentation

  • Fix unit of measure of processing speed (#89) (760b932)

v3.1.2 - 2025-01-27

Fix

  • Added more updates to better font-parsing (#87) (de18986)

v3.1.1 - 2025-01-21

Fix

  • Move autoflake to dev dependencies (#86) (eed5080)

v3.1.0 - 2025-01-17

Feature

  • Update for complex fonts, rendering, and experimental high-level API (#82) (525ed8e)

v3.0.0 - 2024-12-09

Feature

  • Massive quality improvements to v2 parser and new sanitize_cells API (#73) (1fccb29)

Breaking

  • Massive quality improvements to v2 parser and new sanitize_cells API (#73) (1fccb29)

v2.1.2 - 2024-11-22

Fix

v2.1.1 - 2024-11-21

Fix

v2.1.0 - 2024-11-20

Feature

  • Add the export of annotations and ToC (#58) (22cf280)

v2.0.5 - 2024-11-20

Fix

v2.0.4 - 2024-11-13

Fix

  • Removing asserts that break parse-v2 (#55) (bb978c2)

v2.0.3 - 2024-11-05

Fix

  • Replace all the FATAL with ERROR messages in the v2 parser (#53) (cd15d00)

v2.0.2 - 2024-10-30

Fix

  • Improve qpdf optimization options (#52) (82284d4)

v2.0.1 - 2024-10-25

Fix

v2.0.0 - 2024-10-23

Feature

  • Upgrade to v2.0.0 (#48) (6fdd748)
  • Fixed the v2 parser to only return the pages that are requested (#47) (48451ad)

Breaking

v1.6.2 - 2024-10-18

Fix

  • Cmake-cxxopts by using similar approach as glm (#44) (6427726)

v1.6.1 - 2024-10-18

Fix

v1.6.0 - 2024-10-11

Feature

  • Add an experimental v2 parser to improve performance (#29) (e5856f0)

v1.5.1 - 2024-10-10

Fix

  • Allow more compatible pywin32 versions (#40) (68b848c)

v1.5.0 - 2024-10-10

Feature

v1.4.1 - 2024-10-02

Fix

  • Windows build properly linking to system libraries (#36) (e26ed05)

v1.4.0 - 2024-10-02

Feature

Fix

v1.3.1 - 2024-09-30

Fix

v1.3.0 - 2024-09-20

Feature

v1.2.1 - 2024-09-18

Fix

v1.2.0 - 2024-09-09

Feature

v1.1.3 - 2024-08-30

Fix

v1.1.2 - 2024-08-29

Fix

v1.1.1 - 2024-08-23

Fix

v1.1.0 - 2024-08-22

Feature

  • Deal with qpdf errors on a page by page basis (#11) (400fcb3)

v1.0.0 - 2024-08-22

Feature

Breaking

v0.3.1 - 2024-08-22

Fix

v0.3.0 - 2024-08-21

Feature

v0.2.0 - 2024-08-13

Feature

v0.1.0 - 2024-08-07

Feature

v0.0.1 - 2024-08-07

Fix