3D-LLM
3D-LLM is an academic research tool that injects the 3D world into Large Language Models. It is the first LLM capable of taking 3D representations as input, handling both object and scene data.
At a glance
Trending
About
3D-LLM is a Large Language Model developed by UMass-Embodied-AGI, designed to process and understand 3D representations. It is the first LLM to take 3D data directly as input, covering both individual objects (such as those from Objaverse) and full scenes (such as those from ScanNet and HM3D). The project supports pretraining and finetuning, and releases checkpoints for tasks including ScanQA, SQA3D, and 3DMV_VQA. It also provides instructions for installation, inference, finetuning, and generating 3D-language data through a three-step feature extraction process, which makes it a useful resource for researchers in embodied AI and 3D vision-language understanding.
Capabilities
Pricing & Plans
Open Source
Free