Chandra is an OCR model that converts images and PDFs into structured HTML, Markdown, or JSON while preserving layout information. It handles complex tables, forms, and handwriting in more than 90 languages.
Chandra OCR 2 is a state-of-the-art OCR model developed by Datalab that converts images and PDFs into structured HTML, Markdown, or JSON while preserving layout details. It brings significant improvements in math, tables, and multilingual OCR, and supports more than 90 languages, including strong handwriting recognition. The model accurately reconstructs forms, including checkboxes, and performs well on complex layouts. Users can extract images and diagrams together with their captions and structured data. Chandra offers two inference modes, local (HuggingFace) and remote (vLLM server), along with CLI tools and an interactive Streamlit app. A managed platform with higher accuracy, zero data retention, and SOC 2 Type 2 compliance is available through Datalab's API.
Best used for
Ideal for developers and data scientists who need to automate document intelligence, extract data from complex forms, and process multilingual content. Especially valuable for high-volume workloads requiring accurate layout preservation and structured output for further analysis or integration.
What kind of documents can Chandra OCR 2 process effectively?
Chandra OCR 2 is designed to handle a wide range of complex documents, including those with intricate tables, various forms, and diverse handwriting styles. It also excels with mathematical equations, multi-column layouts, and documents in over 90 languages, preserving full layout information.
What are the different ways to use Chandra OCR 2?
You can use Chandra OCR 2 via CLI tools for single files or directories, with options for vLLM (recommended) or HuggingFace backends. An interactive Streamlit web app is also available for single-page processing. For production, a vLLM server can be launched via Docker.
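As a rough illustration of the CLI workflow described above, the following Python sketch assembles the argument list for a single run over a file or directory. The executable name `chandra`, the `--method` flag, and the backend values `vllm` and `hf` are assumptions that do not appear in this text, and `build_chandra_command` is a hypothetical helper; check the tool's own help output before relying on any of them.

```python
from pathlib import Path

# Assumed names: the `chandra` executable, its `--method` flag, and the
# backend identifiers below are not stated in the text above.
BACKENDS = {"vllm", "hf"}


def build_chandra_command(input_path, output_dir, backend="vllm"):
    """Return an argv list for one CLI run over a file or a directory.

    `backend` defaults to vLLM, which the text above calls the
    recommended option.
    """
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    return [
        "chandra",
        str(Path(input_path)),
        str(Path(output_dir)),
        "--method",
        backend,
    ]
```

Once the CLI is installed, the returned list could be passed to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.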
Is Chandra OCR 2 suitable for commercial use?
The code for Chandra is licensed under Apache 2.0. The model weights, however, use a modified OpenRAIL-M license, which is free for research, personal use, and startups with under $2M in funding/revenue. Broader commercial use, or use without the OpenRAIL requirements, requires a commercial license.
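The free-use conditions for the weights can be read as a simple rule, sketched below. This is only an illustration of the summary above, not a statement of the license terms: the function name, the use-case labels, and the choice to require funding and revenue to each stay under the threshold are all assumptions, and the license text itself governs.

```python
# Threshold taken from the "$2M funding/revenue" figure in the text above.
FREE_TIER_CAP_USD = 2_000_000


def weights_free_to_use(use_case, funding_usd=0, revenue_usd=0):
    """Return True if the summary above suggests the weights are free to use.

    Assumption: a startup qualifies only when funding and revenue are each
    below the cap. The actual license may define this differently.
    """
    if use_case in {"research", "personal"}:
        return True
    if use_case == "startup":
        return funding_usd < FREE_TIER_CAP_USD and revenue_usd < FREE_TIER_CAP_USD
    # Any other use is treated as needing a commercial license.
    return False
```

For example, `weights_free_to_use("startup", funding_usd=1_500_000, revenue_usd=100_000)` returns `True` under these assumptions, while a startup with $3M in funding would need a commercial license.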
How does Chandra OCR 2 compare to other OCR models?
Chandra OCR 2 tops external benchmarks such as olmocr and shows significant improvements on internal multilingual benchmarks. It outperforms previous Chandra versions and many competing systems, including dots.ocr and general-purpose large language models such as GPT-4o and Gemini Flash, especially on complex documents.