Shypd.ai

MLE-bench

MLE-bench is an open-source benchmark for measuring how well AI agents perform at machine learning engineering. It provides the code for constructing the datasets, the evaluation logic, and the agents used for benchmarking.

At a glance

Pricing: Open Source
Free tier: Yes
API: No
Skill level: Technical
