Dwayne.xyz

Reading List

Putnam-AXIOM Variation

Michael Tsai

Aryan Gulati et al. (PDF, via Hacker News): As large language models (LLMs) continue to advance, many existing benchmarks designed to evaluate their reasoning capabilities are becoming saturated. Therefore, we present the Putnam-AXIOM Original benchmark consisting of 236 mathematical problems from the William Lowell Putnam Mathematical Competition, along with detailed step-by-step solutions. To preserve the […]

Tags: tech, apple