Docling has released version 2.96.0 with a notable new feature: a threaded PDF backend for docling-parse. The update aims to speed up document processing, especially for large-scale operations. The release also fixes a bug in which the JSON transformers model type was not accepted, closing a gap in configuration flexibility.
The headline addition is the new threaded docling-parse (v6) PDF backend. It uses threads to parallelize PDF parsing, which can significantly reduce processing time for multi-page documents. The change landed in commit 3c26f5a. For developers running docling in batch processing pipelines, that could translate directly into shorter run times.
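To make the idea of parallel page parsing concrete, here is a minimal sketch using Python's standard thread pool. It illustrates the general technique only and is not Docling's implementation; `parse_page` and `parse_document` are hypothetical names invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_page(page_number: int) -> str:
    """Hypothetical stand-in for per-page parsing work; not a Docling function."""
    return f"parsed page {page_number}"


def parse_document(page_count: int, max_workers: int = 4) -> list[str]:
    """Parse pages 1..page_count on a thread pool, returning results in page order."""
    page_numbers = range(1, page_count + 1)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, even if pages finish out of order.
        return list(pool.map(parse_page, page_numbers))


if __name__ == "__main__":
    print(parse_document(3))
```

In pure Python, threads mainly help when the per-page work waits on I/O or runs in native code that releases the interpreter lock, so the gain from a pattern like this depends on where the parsing time is actually spent.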
Additionally, a fix (commit d25aea1) addresses a problem where the JSON transformers model type was not being accepted. Configurations that specify the model type as JSON now work as expected, which matters most to users who rely on JSON-based model specifications.
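The general class of problem can be sketched as follows: JSON has no enum type, so a model type read from a JSON configuration arrives as a plain string and has to be coerced before use. This is an assumption-laden illustration, not Docling's code; `ExampleModelType`, its members, the `model_type` key, and `load_model_type` are all hypothetical.

```python
import json
from enum import Enum


class ExampleModelType(str, Enum):
    """Hypothetical model-type enum; the member names are illustrative only."""
    TEXT = "text"
    VISION = "vision"


def load_model_type(raw_config: str) -> ExampleModelType:
    """Read a JSON config and coerce its string 'model_type' into the enum."""
    config = json.loads(raw_config)
    # The JSON value is a plain string; looking it up by value returns the
    # matching enum member, or raises ValueError if no member has that value.
    return ExampleModelType(config["model_type"])


if __name__ == "__main__":
    print(load_model_type('{"model_type": "vision"}'))
```

Code that only accepts the enum member itself, and not its string form, would reject a configuration like this one, which is the kind of surprise the fix is described as removing.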
Documentation also got a minor update: rendering of icons was fixed, improving readability.
Threading in document parsing isn’t new, but the integration into docling-parse v6 is a practical upgrade. I’ve seen many teams struggle with I/O-bound PDF extraction, and this should help. The fix for JSON transformers removes a frustrating roadblock for anyone using JSON configurations. It’s a small change, but it eliminates a surprising failure mode.
For an open-source project used in AI workflows, these incremental improvements matter. The update doesn’t break existing setups – it adds optional performance gains and fixes a bug. If you're running docling in production, upgrading is a no-brainer.
Official Source: https://github.com/docling-project/docling/releases/tag/v2.96.0