OpenAI on X

--- Summary:

Today we’re launching SWE-Lancer—a new, more realistic benchmark to evaluate the coding performance of AI models.
SWE-Lancer includes over 1,400 freelance software engineering tasks from Upwork, valued at $1 million USD total in real-world payouts.
https://t.co/c3pFcL41uK
OpenAI (@OpenAI), February 18, 2025

--- Full Article:

Author: OpenAI
Profile: https://twitter.com/OpenAI
Source: https://x.com/OpenAI/status/1891911123517018521

