AI May Automate AI R&D by EOY 2028

Claude Mythos Preview now resolves 93.9% of the real-world GitHub issues in SWE-Bench, up from 2% for Claude 2 in late 2023. With the benchmark this close to saturation, the result suggests that AI may soon be able to automate its own engineering. Based on this trend, Anthropic's Jack Clark puts the chance of AI R&D with no human involvement by the end of 2028 at 60% or higher. The focus shifts from AI-assisted coding to fully automated AI development.
Claude Mythos Preview's 93.9% score on the SWE-Bench benchmark, up from just 2% for Claude 2 in late 2023, signals an inflection point: the model can resolve most of the benchmark's real-world GitHub issues autonomously. According to Anthropic's Jack Clark, this capability, combined with AI's growing proficiency at chaining multi-step tasks, lays the groundwork for fully automated AI R&D. He assigns a probability of 60% or more that an AI system could autonomously build its own successor by the end of 2028. For engineering teams, this points to a future beyond AI as a copilot, in which AI systems could manage the entire R&D pipeline and sharply accelerate progress.
Read the original → Import AI (Jack Clark)
- #ai
- #llms
- #swe-bench
- #self-improvement