Qwen600
qwen600 is an open-source tool, listed under Open Source & Models, that provides a static, CUDA-only mini inference engine for the QWEN3-0.6B model. It offers faster inference than llama.cpp and Hugging Face with flash-attn.