mPLUG-DocOwl
mPLUG-DocOwl is an open-source, modularized multimodal large language model for document understanding. It supports OCR-free analysis and allows finetuning with custom data.
About
mPLUG-DocOwl is an open-source, modularized multimodal large language model for document understanding. It performs OCR-free document analysis: it extracts information from many document types directly, without a separate optical character recognition step. The project provides training code, so users can finetune stronger models on their own datasets for specific research or application needs. It includes several releases, such as DocOwl1.5 and DocOwl2, along with related models for chart understanding (TinyChart) and scientific diagram analysis (PaperOwl). Demos are available on HuggingFace and ModelScope and cover tasks such as DocVQA, InfoVQA, and ChartQA.
Pricing & Plans
Open Source
Free