Minimind-V
minimind-v is an open-source project for training a small visual language model (VLM). It enables training a 65M-parameter VLM from scratch in just 2 hours at minimal cost.
About
minimind-v is an open-source project for training small visual language models (VLMs) from scratch. With a focus on accessibility and efficiency, it lets users train a 65M-parameter VLM in approximately two hours for as little as 3 RMB. The project provides a minimal VLM model structure along with code for dataset cleaning, pre-training, and supervised fine-tuning (SFT). It serves as both a minimal implementation of an open-source VLM and a concise tutorial for newcomers to visual language models, with the aim of making multimodal AI development more widely accessible. The project supports model sizes from 26M to 200M parameters and includes features such as dynamic model scanning and WebUI support.
Pricing & Plans
Open Source
Free