Docling v2.91.0 Improves Document Image Extraction and Parsing Reliability

Docling v2.91.0 delivers a focused update aimed at improving document conversion accuracy and parser reliability across several formats. The headline change is new support for extracting VML images embedded through v:imagedata elements in DOCX files, while the rest of the release tightens validation and fixes edge cases in OCR, HTML processing, VLM prompting, and PowerPoint note handling.

What Changed

The main feature in this release is improved DOCX media extraction. Docling can now extract VML-based images referenced with v:imagedata, which should improve coverage for older or non-standard Word documents where images are not stored through the more typical modern drawing paths.
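To make the DOCX change more concrete, here is a minimal sketch of how a VML image reference can be resolved using only the Python standard library. It is not Docling's implementation. The function name find_vml_images is hypothetical, and the sketch assumes the usual package layout: the main part at word/document.xml and its relationships at word/_rels/document.xml.rels. Each v:imagedata element carries an r:id attribute, and that id maps to a media file through the relationships part.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by VML shapes and by OOXML relationship parts.
VML_NS = "urn:schemas-microsoft-com:vml"
REL_ATTR_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def find_vml_images(docx_bytes: bytes) -> dict[str, bytes]:
    """Return {part_name: image_bytes} for each v:imagedata in word/document.xml.

    Hypothetical helper for illustration; assumes the standard DOCX layout.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        names = set(zf.namelist())

        # Map relationship ids (e.g. "rId5") to their targets.
        rels_root = ET.fromstring(zf.read("word/_rels/document.xml.rels"))
        targets = {
            rel.get("Id"): rel.get("Target")
            for rel in rels_root.iter(f"{{{PKG_REL_NS}}}Relationship")
        }

        doc_root = ET.fromstring(zf.read("word/document.xml"))
        images: dict[str, bytes] = {}
        for node in doc_root.iter(f"{{{VML_NS}}}imagedata"):
            target = targets.get(node.get(f"{{{REL_ATTR_NS}}}id"))
            if target is None:
                continue  # dangling or missing r:id
            if target.startswith("/"):
                part = target.lstrip("/")  # package-absolute target
            else:
                # Relative targets resolve against the word/ folder.
                part = posixpath.normpath(posixpath.join("word", target))
            if part in names:  # skips external (linked) images
                images[part] = zf.read(part)
        return images
```

Calling find_vml_images on the raw bytes of a .docx file returns the embedded image bytes keyed by their path inside the package. Images that are linked externally rather than embedded are skipped.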

On the reliability side, the update strengthens input validation for METS-GBS processing, reducing the risk of malformed inputs causing downstream parsing issues. EasyOCR model downloading was also fixed, which should make setup and first-run behavior more dependable for OCR workflows.

Docling also removed a bogus preamble from the VLM chat template, so prompts sent to vision-language models no longer carry that stray text. For HTML ingestion, the release refines image URL and size handling and includes broader fixes to the html_backend, which should make extraction from web-based sources more stable.

Finally, PPTX notes are now assigned to ContentLayer.NOTES, which gives presentation note content a clearer structural destination in the output pipeline.
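For context on where that note text comes from, the sketch below reads speaker notes directly from a PPTX package with the Python standard library. It does not use Docling or show its API. The function name read_pptx_notes is hypothetical, and the sketch assumes notes live in parts under ppt/notesSlides/ and that the note text sits in the shape whose placeholder type is "body". Other placeholders on a notes page, such as the slide number, are ignored.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
}
SHAPE_TAG = "{" + NS["p"] + "}sp"


def read_pptx_notes(pptx_bytes: bytes) -> dict[str, str]:
    """Return {notes_part_name: notes_text} for each notes slide.

    Hypothetical helper for illustration; assumes the standard PPTX layout.
    """
    notes: dict[str, str] = {}
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as zf:
        for name in zf.namelist():
            if not (name.startswith("ppt/notesSlides/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(zf.read(name))
            paragraphs = []
            for shape in root.iter(SHAPE_TAG):
                # Keep only the notes body placeholder.
                ph = shape.find("p:nvSpPr/p:nvPr/p:ph", NS)
                if ph is None or ph.get("type") != "body":
                    continue
                for para in shape.iterfind(".//a:p", NS):
                    # A paragraph's text is split across one or more runs.
                    text = "".join(t.text or "" for t in para.iterfind(".//a:t", NS))
                    if text:
                        paragraphs.append(text)
            notes[name] = "\n".join(paragraphs)
    return notes
```

A converter that routes this text into a dedicated notes layer, as the release describes for ContentLayer.NOTES, lets downstream consumers include or exclude speaker notes separately from slide body content.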

Why It Matters

This is the kind of release that matters most to teams using Docling in production pipelines. The new DOCX VML image extraction closes an important compatibility gap for legacy Word documents and enterprise files that do not always follow modern formatting expectations.

The remaining fixes improve reliability across multimodal and document ingestion workflows. Better METS-GBS validation can reduce processing failures, the EasyOCR download fix lowers operational friction, and the HTML and PPTX handling improvements should lead to cleaner structured outputs. Together, these changes make v2.91.0 less about flashy new capability and more about broadening real-world document coverage while reducing breakage in edge cases.

Official Source: https://github.com/docling-project/docling/releases/tag/v2.91.0
