LLMSpeculativeSampling
LLMSpeculativeSampling is a Coding & Development tool that accelerates large language model inference using speculative decoding. A smaller approximation model drafts several token guesses, and a larger target model then validates them, so more than one token can be produced per target-model pass.
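To make the draft-then-validate idea concrete, here is a minimal sketch of one speculative sampling round in plain Python. It is not taken from the tool itself: the toy models `target_probs` and `draft_probs`, the vocabulary size `VOCAB`, and the helper `speculative_step` are all hypothetical names invented for illustration, and the acceptance rule shown is the standard one from the speculative sampling literature.

```python
import random

VOCAB = 4  # toy vocabulary size (assumption for illustration)

def target_probs(prefix):
    # Hypothetical "large" model: strongly favors the token after the last one.
    nxt = (prefix[-1] + 1) % VOCAB
    return [0.7 if t == nxt else 0.1 for t in range(VOCAB)]

def draft_probs(prefix):
    # Hypothetical "small" model: a blurrier approximation of the target.
    nxt = (prefix[-1] + 1) % VOCAB
    return [0.4 if t == nxt else 0.2 for t in range(VOCAB)]

def sample(weights, rng):
    # Draw one token index in proportion to the given (unnormalized) weights.
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

def speculative_step(prefix, draft, target, gamma, rng):
    """Run one draft-then-verify round and return the tokens it produced."""
    # 1. The draft model proposes gamma tokens autoregressively.
    guesses, q_dists = [], []
    ctx = list(prefix)
    for _ in range(gamma):
        q = draft(ctx)
        tok = sample(q, rng)
        guesses.append(tok)
        q_dists.append(q)
        ctx.append(tok)

    # 2. The target model checks each guess in order. A real system would
    #    score all positions in one batched forward pass; here we loop.
    out = []
    ctx = list(prefix)
    for tok, q in zip(guesses, q_dists):
        p = target(ctx)
        if rng.random() < min(1.0, p[tok] / q[tok]):
            out.append(tok)  # guess accepted
            ctx.append(tok)
        else:
            # Guess rejected: resample from the residual max(0, p - q)
            # and discard the remaining guesses.
            residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
            out.append(sample(residual, rng))
            return out

    # 3. Every guess was accepted: take one extra token from the target.
    out.append(sample(target(ctx), rng))
    return out

if __name__ == "__main__":
    rng = random.Random(0)
    print(speculative_step([0], draft_probs, target_probs, gamma=3, rng=rng))
```

Each round returns between 1 and `gamma + 1` tokens. The closer the draft distribution is to the target distribution, the more guesses survive validation, which is where the speedup comes from.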
At a glance