Yet Another Applied LLM Benchmark
Yet Another Applied LLM Benchmark is an open-source tool that evaluates language models on specific programming and problem-solving tasks. It uses a dataflow DSL to build diverse and sophisticated test cases, and it runs the resulting code in isolated containers.
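To make the dataflow idea concrete, here is a minimal Python sketch of how a test case could be composed from chained stages. It is an illustration only, not the benchmark's actual code: the class names (`Node`, `Chain`, `FakeModel`, `RunPython`, `Contains`) are hypothetical, the model call is replaced by a canned answer, and a plain subprocess stands in for the isolated container.

```python
import subprocess
import sys


class Node:
    """One pipeline stage. Writing `a >> b` feeds a's output into b."""

    def __rshift__(self, other):
        return Chain(self, other)

    def __call__(self, value):
        raise NotImplementedError


class Chain(Node):
    """Two stages run back to back."""

    def __init__(self, first, second):
        self.first = first
        self.second = second

    def __call__(self, value):
        return self.second(self.first(value))


class FakeModel(Node):
    """Hypothetical stand-in for a language model: ignores the prompt
    and returns a canned program."""

    def __call__(self, prompt):
        return "print('hello world')"


class RunPython(Node):
    """Runs the code in a separate Python process and returns its stdout.
    Assumption: a real harness would run this inside a container instead."""

    def __call__(self, code):
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=10,
        )
        return result.stdout


class Contains(Node):
    """Final check: passes if the expected text appears in the output."""

    def __init__(self, needle):
        self.needle = needle

    def __call__(self, text):
        return self.needle in text


# A test case reads left to right: prompt -> model -> execute -> check.
pipeline = FakeModel() >> RunPython() >> Contains("hello world")
passed = pipeline("Write a Python program that prints hello world")
```

Each stage only needs to know its own input and output, so new kinds of checks can be added without touching the others.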
At a glance