Distributed-Llama
Distributed-llama runs distributed LLM inference by connecting home devices into a cluster. It speeds up inference through tensor parallelism and high-speed synchronization between devices. It supports Linux, macOS, and Windows, and is optimized for ARM and x86_64 AVX2 CPUs.
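To make the idea of tensor parallelism concrete, here is a minimal sketch in plain Python. It is not Distributed-llama's actual code or API. The function names (`matvec`, `split_rows`, `tensor_parallel_matvec`) and the row-wise split are assumptions chosen for illustration. Each hypothetical device holds a slice of a weight matrix, computes its share of the output, and a synchronization step gathers the pieces.

```python
# Illustrative sketch only: names and the row-wise split are assumptions,
# not taken from the Distributed-llama codebase.

def matvec(weight_rows, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight_rows]


def split_rows(weight_rows, n_nodes):
    """Give each hypothetical node a contiguous slice of output rows."""
    chunk = (len(weight_rows) + n_nodes - 1) // n_nodes  # ceiling division
    return [weight_rows[i:i + chunk] for i in range(0, len(weight_rows), chunk)]


def tensor_parallel_matvec(weight_rows, x, n_nodes):
    """Compute W @ x as if the rows of W were spread across n_nodes devices."""
    shards = split_rows(weight_rows, n_nodes)
    # In a real cluster each shard would be computed on its own device.
    partials = [matvec(shard, x) for shard in shards]
    # Synchronization step: gather the partial outputs in node order.
    return [y for part in partials for y in part]


W = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, 1]
result = tensor_parallel_matvec(W, x, n_nodes=2)
print(result)  # same values as matvec(W, x)
```

In this sketch each device stores only its own slice of the weights, so the per-device memory and compute shrink as devices are added, while the gather step is where the network synchronization cost appears.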
At a glance