R1 surpassed OpenAI o1’s performance on benchmarks including AIME (mathematical reasoning), MATH-500 (word problems) and SWE-bench Verified (programming).