KTransformers
KTransformers is an open-source framework for optimizing large language model inference and fine-tuning. It uses CPU-GPU heterogeneous computing to improve efficiency and performance.
About
KTransformers is an open-source research project for efficient inference and fine-tuning of large language models (LLMs) through CPU-GPU heterogeneous computing. It comprises two core modules: kt-kernel, which provides high-performance inference kernels, and kt-sft, a fine-tuning framework. kt-kernel offers CPU-optimized operations with AMX/AVX acceleration, mixture-of-experts (MoE) optimization, and quantization support (INT4/INT8 on CPU, GPTQ on GPU), and it integrates through a Python API. kt-sft integrates with LLaMA-Factory for resource-efficient fine-tuning of ultra-large MoE models; it supports LoRA as well as production-oriented features such as chat and batch inference. The framework is aimed at researchers and engineers who optimize LLM performance across diverse hardware configurations.
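To make the INT8 CPU quantization mentioned above more concrete, here is a minimal, generic sketch of symmetric per-row INT8 weight quantization in plain Python. It is an illustration of the general technique only: the function names (quantize_int8, dequantize_int8, quantized_dot) are hypothetical and are not part of the kt-kernel API, and the scheme kt-kernel actually uses (grouping, zero points, packing) may differ.

```python
from typing import List, Tuple


def quantize_int8(row: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-row INT8 quantization (hypothetical helper, not kt-kernel API).

    Maps each weight to an integer code in [-127, 127] and returns the
    codes together with the single floating-point scale for the row.
    """
    max_abs = max((abs(x) for x in row), default=0.0)
    if max_abs == 0.0:
        # An all-zero (or empty) row needs no scaling.
        return [0] * len(row), 1.0
    scale = max_abs / 127.0
    codes = [max(-127, min(127, round(x / scale))) for x in row]
    return codes, scale


def dequantize_int8(codes: List[int], scale: float) -> List[float]:
    """Recover approximate weights from integer codes and the row scale."""
    return [c * scale for c in codes]


def quantized_dot(codes: List[int], scale: float, activations: List[float]) -> float:
    """Dot product using the integer codes; the scale is applied once at the end."""
    acc = sum(c * a for c, a in zip(codes, activations))
    return acc * scale


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    codes, scale = quantize_int8(weights)
    print(codes, scale)
    print(dequantize_int8(codes, scale))
    print(quantized_dot(codes, scale, [1.0, 2.0, 3.0, 4.0]))
```

Storing one scale per row keeps the memory overhead small while bounding the rounding error of each weight to about half a quantization step, which is the usual trade-off behind low-bit weight formats.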
Pricing & Plans
Open Source
Free