Llama3.Java
llama3.java is an open-source tool that runs Llama 3+ inference in pure Java. It supports several Llama models and quantization formats and includes fast matrix-vector multiplication routines.
About
llama3.java is an open-source project that implements Llama 3, 3.1, and 3.2 inference entirely in a single Java file. Based on Andrej Karpathy's llama2.c, it serves both as an educational resource and as a platform for testing and tuning compiler optimizations on the JVM, particularly for the Graal compiler. Key features include a single-file, dependency-free implementation, GGUF format parsing, a Llama 3+ tokenizer, and support for several weight formats (F16, BF16, F32) and quantizations (Q4_0, Q4_1, Q4_K, Q5_K, Q6_K, Q8_0). It also offers fast matrix-vector multiplication built on Java's Vector API, a simple CLI with chat and instruct modes, and GraalVM Native Image support with AOT model pre-loading for instant time-to-first-token.
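To make the quantized matrix-vector multiplication mentioned above more concrete, here is a minimal scalar sketch of a Q8_0-style scheme: weights are grouped into blocks of 32, and each block stores one scale plus 32 signed bytes. This is not code from llama3.java. The class name `Q8MatVecSketch`, the method names, and the split into separate `quants` and `scales` arrays are assumptions made for illustration, and the scale is kept as a plain `float` rather than the half-precision value used in GGUF files.

```java
public class Q8MatVecSketch {
    // Q8_0 groups weights into blocks of 32 values that share one scale.
    public static final int BLOCK_SIZE = 32;

    /**
     * Quantizes a flat float array (length must be a multiple of 32).
     * Hypothetical layout: quants[i] holds the signed byte for src[i],
     * scales[b] holds the scale of block b.
     */
    public static void quantize(float[] src, byte[] quants, float[] scales) {
        int blocks = src.length / BLOCK_SIZE;
        for (int b = 0; b < blocks; b++) {
            int base = b * BLOCK_SIZE;
            float maxAbs = 0f;
            for (int i = 0; i < BLOCK_SIZE; i++) {
                maxAbs = Math.max(maxAbs, Math.abs(src[base + i]));
            }
            float scale = maxAbs / 127f; // map the largest magnitude to +/-127
            scales[b] = scale;
            for (int i = 0; i < BLOCK_SIZE; i++) {
                float v = scale == 0f ? 0f : src[base + i] / scale;
                quants[base + i] = (byte) Math.round(v);
            }
        }
    }

    /**
     * Computes out[r] = sum over c of W[r][c] * x[c], where W is stored
     * row-major in quantized form and cols is a multiple of 32.
     */
    public static void matVec(byte[] quants, float[] scales, float[] x,
                              float[] out, int rows, int cols) {
        int blocksPerRow = cols / BLOCK_SIZE;
        for (int r = 0; r < rows; r++) {
            float sum = 0f;
            for (int b = 0; b < blocksPerRow; b++) {
                int base = r * cols + b * BLOCK_SIZE;
                float blockSum = 0f;
                for (int i = 0; i < BLOCK_SIZE; i++) {
                    blockSum += quants[base + i] * x[b * BLOCK_SIZE + i];
                }
                // Apply the block scale once per block instead of per element.
                sum += scales[r * blocksPerRow + b] * blockSum;
            }
            out[r] = sum;
        }
    }
}
```

The inner loop over a block is the part a vectorized implementation would replace; the project is described as using Java's Vector API for that step, while this sketch stays scalar so it runs without incubator modules.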
Pricing & Plans
Open Source · Free
Free