Screenshot from a Devin introduction speech by Cognition Labs.

Devin – The AI developer

Devin was released to the world on 12th March 2024 by Cognition Labs. This stealth startup consists of 10 people and has received $21 million from Peter Thiel’s Founders Fund. With that backing, it has produced an AI agent that outperforms the current offerings from tech giants.

You can see it in action in this tweet from Cognition AI’s Twitter / X account:

Origin story

Perhaps more remarkable than the agent are its founders:

Steven Hao, Scott Wu and Walden Yan (left to right)
  • Scott Wu – Chief Executive Officer

    A child maths prodigy who won gold at the International Olympiad in Informatics (IOI) three years in a row, achieving a perfect 100% score in 2014. Watch this impressive video of him in action as a child.

    He is also highly regarded in the competitive programming scene, and has achieved legendary grandmaster rank on CodeForces.


  • Steven Hao – Chief Technology Officer

    An international grandmaster on CodeForces, and a gold and silver medallist at the IOI.


  • Walden Yan – Chief Product Officer

    A highly capable programmer who has reached grandmaster on CodeForces. He also won gold at the 2020 IOI.

The current team at Cognition Labs consists of 10 members, who also have some remarkable credentials. Between them, they hold 10 IOI gold medals, including three won by Scott’s brother, Neal Wu.

Benchmark

How does Devin compare against other AI models?

This graph from the SWE-bench benchmark shows the percentage of real-world software engineering issues each model resolved:

Bar graph of Real World Software Engineering Performance showing that Devin resolved 13.86% of issues. The next highest is Claude 2, which resolved 4.80% of issues.

At 13.86% of issues resolved, Devin outperforms its closest rival, Claude 2 (4.80%), by almost three times. The gap is all the more notable because Devin was completely unassisted, whereas the other models were told exactly which files to edit.
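To check the "almost three times" figure, here is a minimal sketch of the arithmetic using only the two percentages from the graph. The variable names are just illustrative and are not taken from Cognition Labs or SWE-bench.

```python
# Percentage of SWE-bench issues resolved, as quoted in the graph above.
devin_resolved_pct = 13.86
claude2_resolved_pct = 4.80

# How many times more issues Devin resolved than Claude 2.
ratio = devin_resolved_pct / claude2_resolved_pct

# The same gap expressed in percentage points.
gap_points = devin_resolved_pct - claude2_resolved_pct

print(f"Devin resolved {ratio:.2f}x as many issues as Claude 2")
print(f"Gap: {gap_points:.2f} percentage points")
```

The ratio comes out at roughly 2.9, which is where "almost three times" comes from.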

What’s more impressive is the difference in resources behind Cognition Labs’ Devin and Anthropic’s Claude 2.

Anthropic is a 240-person company that raised over $7 billion in funding in July 2023 alone. This far outweighs Cognition Labs, which has 10 people and $21 million in funding.

Capabilities

Devin can complete many tasks that a software engineer may come across, including:

Keep in mind that Devin is only in alpha, so it is likely to improve from here. That raises the possibility that it will replace human software engineers, but whether it will have the common sense to work without close human oversight remains to be seen.

Read more about Devin on the Cognition blog.