Reading List
Putnam-AXIOM Variation from Michael Tsai RSS feed.
Aryan Gulati et al. (PDF, via Hacker News):

As large language models (LLMs) continue to advance, many existing benchmarks designed to evaluate their reasoning capabilities are becoming saturated. Therefore, we present the Putnam-AXIOM Original benchmark consisting of 236 mathematical problems from the William Lowell Putnam Mathematical Competition, along with detailed step-by-step solutions. To preserve the […]