2026-02-04, 14:00–18:00, B.1.031
Docling is rapidly becoming the de facto standard in open source document AI. The project has more than 45K GitHub stars and over 1.5 million monthly downloads, and it has repeatedly ranked at the top of the global GitHub and HuggingFace trending leaderboards. Incubated as a Linux Foundation AI & Data project, Docling provides local-first, enterprise-grade capabilities: it excels at parsing complex layouts, extracting tables, and converting unstructured documents into AI-ready structured formats.
In this hands-on session, you'll get a chance to:
- ingest and parse multiple document formats, including PDF and DOCX
- convert complex tables into usable formats
- extract and prepare images for AI processing
- preserve metadata for visual grounding
- explore AI integration with frameworks like LangChain to power RAG and model training
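As a taste of the first step, here is a minimal sketch of converting a single document to Markdown. It assumes the `docling` package is installed and uses the `DocumentConverter` class, its `convert` method, and `export_to_markdown` as shown in Docling's quick-start. The input file `report.pdf`, the `converted` output directory, and the `output_path_for` helper are hypothetical names chosen only for illustration.

```python
from pathlib import Path


def convert_to_markdown(source: str) -> str:
    """Convert one document (local path or URL) to Markdown with Docling."""
    # Imported lazily so this module still loads where docling is not installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)
    return result.document.export_to_markdown()


def output_path_for(source: str, out_dir: str = "converted") -> Path:
    """Hypothetical helper: map an input file name to a .md path under out_dir."""
    return Path(out_dir) / (Path(source).stem + ".md")


if __name__ == "__main__":
    src = "report.pdf"  # placeholder input; replace with a real file
    target = output_path_for(src)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(convert_to_markdown(src), encoding="utf-8")
```

The first conversion may take a while, because Docling typically downloads its layout and table models before processing the document locally.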
Project links:
- https://www.docling.ai/
- https://github.com/docling-project
- https://lfaidata.foundation/projects/docling/
Carol Chen is a Community Architect at Red Hat, supporting and promoting various upstream communities such as Docling, InstructLab, Ansible and ManageIQ. She was previously active in open source communities while working at Jolla and Nokia. She also has experience in software development and integration from her 12 years in the mobile industry. Carol has spoken at events around the world, including AI_Dev in France and OpenInfra Summit in China. On a personal note, Carol plays the timpani in an orchestra in Tampere, Finland, which she now calls home.