Cristina Criddle in San Francisco and Melissa Heikkilä in London
 
Artificial intelligence is poised to outperform humans in writing code as leading groups, including OpenAI, Anthropic and Google, race to release systems that are reshaping the software industry.
San Francisco-based OpenAI released a suite of new models this week that independent benchmarks suggest are among the best yet for computer programming.
Its new GPT-4.1, o3 and o4-mini models are more effective at solving programming problems, with the latter two using “reasoning”, which gives them more time to think through complex queries.
OpenAI on Wednesday also announced a freely available system called Codex CLI, a so-called AI “agent” designed to use its models to help users with coding tasks.
These moves match recent efforts from rivals Anthropic, Google, Meta and a host of start-ups that are betting on coding as one of the clearest early uses for large language models.
The emphasis on programming as the next frontier for AI systems is one of the most tangible examples of how the technology could transform industries, with thousands of software developers already using new models in their work.
“This is the year . . . that AI becomes better than humans at competitive code forever,” said OpenAI’s chief product officer Kevin Weil on the Overpowered podcast this week. He compared the advances to AI surpassing humans at chess several years ago, but argued this had a more democratising impact “on the world if everybody can create software”.
Leading industry figures say LLMs have sped up the software development process by generating entire blocks of code based on a few text instructions. AI systems can also identify errors and attempt to correct them.
Over the past year, AI models have become far more capable of understanding complex patterns, reasoning over problems presented in programming, and solving them logically.
In 2023, AI systems were only able to solve 4.4 per cent of coding problems based on an industry test called SWE-bench. This figure jumped to 69.1 per cent this year.
Meanwhile, research from Microsoft’s coding platform GitHub found 92 per cent of US-based developers use AI coding tools.
“AI coding is saving thousands of dollars for an engineer,” said Misha Laskin, co-founder and chief executive of coding start-up Reflection AI. “For some of these categories, it is able to do it for you on demand with something you might have paid $10,000 for. We’re entering an unprecedentedly large market.”
Start-ups such as Reflection have attracted strong investor interest. Reflection has raised $130mn to date, with funding from Sequoia and Lightspeed. Anysphere, the company behind coding automation tool Cursor, raised $105mn at a $2.5bn valuation in January.
“We are driving the cost down of what it means to do intelligent work, that means we’re going to have to rethink what some of our roles are,” said Eiso Kant, co-founder of Poolside, a start-up that raised $500mn in October at a $3bn valuation from investors including Bain Capital and Nvidia.
Meta launched a model called Code Llama last year, which uses text prompts to generate and discuss code. Anthropic has its own coding product, Claude Code, which was launched in February.
Mike Krieger, chief product officer at Anthropic, said the software engineer’s role would increasingly involve “understanding the requirements [of users], working as a team, and figuring out that what you built was actually the right thing to build”.
“It is more about advocating for your idea or seeing how those things play out [and becoming] almost like a puppet master or an orchestra conductor of these [AI] agents,” he added.
“I don’t think coding will disappear at all,” said Thomas Wolf, co-founder of Hugging Face, an open source AI platform. “Coders will just use this tool to go faster.”