The Class of 2023: What Happened to AutoGPT, BabyAGI, and the First Wave of AI Agents

The spring of agents

In March and April 2023, a gold rush happened on GitHub. Within a few weeks, AutoGPT passed 100,000 stars, BabyAGI was covered widely in the tech press, AgentGPT launched a polished SaaS wrapper, and dozens of copycats appeared. The premise across all of them was that GPT-4, wrapped in a loop that let it plan, call tools, and feed the results back into its next step, would approximate a general-purpose worker. Give it a goal, walk away, come back to a finished project.

For a few weeks, it really did feel like something had changed. Like many developers, I spent weekends watching these systems generate plans, execute sub-tasks, get stuck, and get unstuck.

Three years later, most of those projects have pivoted or gone quiet. The ones that survived look different, and the infrastructure that ended up mattering came from different directions entirely. This retrospective gives the verifiable current state of each major project.

AutoGPT

Then: Launched 30 March 2023 by Toran Bruce Richards. The iconic project that defined the category.

Now: Still active at github.com/Significant-Gravitas/AutoGPT, with around 183,000 stars, a high count that is plateauing. The project pivoted to "AutoGPT Platform", a low-code workflow builder with Agent Builder, the Forge framework, and the agbenchmark suite. The original autonomous-loop premise was replaced with configurable agents running inside defined workflows, which is structurally the opposite of the 2023 vision.

The lesson: The AutoGPT team iterated honestly. When the autonomous-goal-execution premise did not produce a sellable product, they rebuilt the core as a workflow tool with AI steps rather than an autonomous agent. That is what most of the copycats did not do.
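To make "a workflow tool with AI steps" concrete, here is a minimal sketch of the shape, not AutoGPT Platform's actual API. The step names, the ticket text, and `stub_model` are all hypothetical; the point is only that the order of steps is fixed by the author and the model fills in one bounded slot instead of deciding what happens next.

```python
from typing import Callable

Step = Callable[[dict], dict]


def fetch_ticket(state: dict) -> dict:
    # Deterministic step: no model involved (the ticket text is made up).
    return {**state, "ticket": "Login page returns 500 after password reset"}


def make_classify_step(model: Callable[[str], str]) -> Step:
    # The only AI step: the model answers one narrow question.
    def classify(state: dict) -> dict:
        label = model(f"Classify this ticket as bug or question: {state['ticket']}")
        return {**state, "label": label}

    return classify


def route(state: dict) -> dict:
    # Deterministic step: plain code decides what to do with the model's answer.
    queue = "engineering" if state["label"] == "bug" else "support"
    return {**state, "queue": queue}


def run_workflow(steps: list[Step], state: dict) -> dict:
    # The step order is fixed up front; the model never chooses the next step.
    for step in steps:
        state = step(state)
    return state


def stub_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call, so the sketch runs offline.
    return "bug"


final_state = run_workflow([fetch_ticket, make_classify_step(stub_model), route], {})
print(final_state["queue"])
```

In the 2023 design the model's output would have chosen the next action; here it only fills in a field that ordinary code then acts on.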

BabyAGI

Then: Launched late March 2023 by Yohei Nakajima. A few hundred lines of Python wiring a task-management loop to GPT-4.

Now: The original repo is archived as babyagi_archive (September 2024 snapshot). Nakajima shipped BabyAGI 2 (functionz) in September 2024, but activity has been minimal through 2026. The project is effectively a research sketch, not a production platform.

The lesson: BabyAGI was always a research sketch, not a product, and that was clear from how it was written. Its influence lies in the patterns it contributed (a task list, a queue, a loop), which turned up in almost every later agent project. Treat it as a seminal paper in code form, not a failed product.
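The task list, queue, and loop pattern fits in a few lines. This is an illustrative reconstruction of the pattern, not Nakajima's code: `execute_task` and `create_followups` are hypothetical stand-ins for the model calls, stubbed here so the sketch runs without an API key, and the `max_steps` cap is my own addition.

```python
from collections import deque
from typing import Callable


def run_task_loop(
    objective: str,
    first_task: str,
    execute_task: Callable[[str, str], str],
    create_followups: Callable[[str, str, str], list[str]],
    max_steps: int = 10,
) -> list[tuple[str, str]]:
    """Pop a task, execute it, enqueue follow-ups; stop when empty or capped."""
    queue: deque[str] = deque([first_task])
    completed: list[tuple[str, str]] = []
    while queue and len(completed) < max_steps:
        task = queue.popleft()
        result = execute_task(objective, task)
        completed.append((task, result))
        # New tasks derived from the result go to the back of the queue.
        queue.extend(create_followups(objective, task, result))
    return completed


# Hypothetical stubs standing in for the two LLM calls.
def fake_execute(objective: str, task: str) -> str:
    return f"done: {task}"


def fake_followups(objective: str, task: str, result: str) -> list[str]:
    # Only the first task spawns follow-ups, so this run terminates on its own.
    if task == "draft a plan":
        return ["research step", "write summary"]
    return []


history = run_task_loop("write a report", "draft a plan", fake_execute, fake_followups)
for task, result in history:
    print(task, "->", result)
```

With a real model behind `create_followups`, nothing guarantees the queue ever empties, which is why the sketch carries an explicit step cap.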

AgentGPT / reworkd

Then: A clean web UI for an AutoGPT-style agent, launched April 2023 by reworkd. First time many non-technical people encountered agent-style AI.

Now: The company pivoted in July 2024 to AI-powered web scraping. TechCrunch coverage reports they were spending around $2,000 per day on API calls for AgentGPT without reaching profitability. They raised $2.75 million from Paul Graham, Nat Friedman and Daniel Gross's AI Grant, SV Angel, and General Catalyst, and now sell structured data extraction at scale. AgentGPT's legacy interface still exists at agentgpt.reworkd.ai, but that is not where the revenue is.
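A quick back-of-envelope calculation shows why that API bill mattered. It uses only the two figures quoted above and assumes, purely for illustration, that the spend stayed flat at $2,000 per day; the 30-day month and the comparison against the round size are my simplifications, not something the TechCrunch piece computes.

```python
DAILY_API_SPEND_USD = 2_000      # reported daily API spend on AgentGPT
FUNDING_RAISED_USD = 2_750_000   # round size quoted above

# Assumption: spend stays constant; a month is treated as 30 days.
monthly_spend = DAILY_API_SPEND_USD * 30
annual_spend = DAILY_API_SPEND_USD * 365

# How long the round would last if it paid for nothing but API calls.
days_covered = FUNDING_RAISED_USD / DAILY_API_SPEND_USD
annual_share_of_round = annual_spend / FUNDING_RAISED_USD

print(f"Monthly API spend: ${monthly_spend:,}")
print(f"Annual API spend:  ${annual_spend:,}")
print(f"Days of API spend the round covers: {days_covered:,.0f}")
print(f"One year of API spend as a share of the round: {annual_share_of_round:.1%}")
```

Under those assumptions, a single year of API calls for a product that was not making money would consume roughly a quarter of the round before salaries or anything else.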

The lesson: reworkd recognised early that a general-purpose agent product was not going to find paying customers, and pivoted to a narrow, data-extraction use case where the AI-agent pattern actually worked. Probably the best-executed example of "start with the hype, end with narrow-and-useful" in this cohort.

MetaGPT

Then: Launched mid-2023 as a multi-agent software-engineering framework ("assign different agents to different roles in a software team").

Now: Still very active. github.com/FoundationAgents/MetaGPT (renamed from geekan/MetaGPT) has around 67,200 stars and ongoing commits. The team launched MGX (MetaGPT X) in February 2025, and their AFlow paper was accepted at ICLR 2025 as an oral presentation (top 1.8% of submissions). This is one of the success stories of the cohort: they found a specific niche (multi-agent software engineering research) and stayed focused on it.

The lesson: Narrow focus and academic rigour worked. MetaGPT is not a consumer product, but it has become a credible research platform, which is a real outcome.

GPT-Engineer

Then: Launched mid-2023 by Anton Osika as a command-line code-generation tool.

Now: The original repo has around 55,200 stars, but the last release was v0.3.1 in June 2024. The README now points readers to lovable.dev (Osika's next company) or aider as alternatives. The repo is effectively in maintenance mode; the community-run gpt-engineer-org exists as a successor shell, but with reduced activity.

The lesson: When the founder moves to a new project, the original usually does not survive. Osika moved to lovable.dev, which has its own growth trajectory. gpt-engineer is frozen as a legacy artifact.

smol-ai / developer

Then: "smol developer" was a minimal, opinionated code-generation agent that trended in mid-2023.

Now: github.com/smol-ai/developer has around 12,200 stars but only 124 total commits and 69 open issues. The smol.ai team has shifted attention to AI News, their newsletter and publication, which is actively updated through 2026. The developer project itself is technically available but effectively dormant.

The lesson: Open-source projects need sustained maintainer attention. When the founder's attention moves, the project usually freezes unless a committed successor community forms. For smol developer, neither happened at scale.

Microsoft AutoGen

Then: Released late 2023 by Microsoft Research as a multi-agent conversation framework. Became one of the most-starred agent frameworks.

Now: github.com/microsoft/autogen has around 57,200 stars and the last release was python-v0.7.5 on 30 September 2025. The repo banner explicitly states "maintenance mode, no new features", and Microsoft recommends the new Microsoft Agent Framework (RC 1.0 on 19 February 2026) as the successor. The AutoGen community forked the project as AG2 (ag2ai/ag2) in November 2024 and continues active development toward v1.0.

The lesson: Even well-resourced Microsoft projects can be sunset when internal strategy shifts. AutoGen sitting in maintenance mode while the Agent Framework is positioned as its successor is a clear signal about where Microsoft thinks the market is going.

HuggingGPT / JARVIS

Then: Microsoft Research paper published at NeurIPS 2023 (arXiv 2303.17580), demonstrating an LLM orchestrating specialised models via Hugging Face.

Now: The microsoft/JARVIS repo still exists but has had no major releases through 2025-2026. The research team shipped follow-ups (TaskBench in November 2023, EasyTool in January 2024) but the HuggingGPT brand did not become a product. The paper itself remains frequently cited in tool-routing research, so it survived as an academic contribution, not as an artifact.

The lesson: Academia processed the agent moment differently from industry. The papers contributed real ideas that survived into 2026 in refined form. The industry projects that copied the same ideas without the academic framing mostly did not survive.

What actually survived, from different roots

None of the 2026 agent-shaped businesses making real money trace their lineage to AutoGPT or BabyAGI. The successful patterns came from different directions:

Cursor (founded 2022, refocused on AI coding 2023): over $2 billion ARR, 1 million+ daily active users (TechCrunch, March 2026).
Claude Code (launched February 2025): around $2.5 billion run-rate by early 2026.
Replit Agent (launched 2024): $240 million ARR in 2025, $9 billion valuation reported in 2026.
GitHub Copilot: 20 million users, about $800 million ARR.
Devin (launched March 2024, walked back after Upwork demo debunking): became a more bounded product. Cognition reported over $73 million ARR after the Windsurf acquisition.

This is the pattern I wrote about in The Quiet Death of AI Agents: narrow scope, human oversight, artifact-producing outputs. The 2023 class tried to build the general case. The class of 2024 and later accepted that the general case is a research problem and built for specific, valuable niches.

The useful test for the next wave

Something new is going to hit later this year or next. The ingredients are in place: better models, better tools, MCP as infrastructure, the GAIA leaderboard showing real progress on agent benchmarks. When it happens, the test I will apply:

Is the scope bounded? If the pitch starts with "general-purpose", it is likely the 2023 pattern again.
Is the output an artifact? Can a human see what the agent produced and reject it? If the "output" is actions in external systems, the blast radius of errors is too wide.
Can a human supervise? Does the product expect you to be in the loop, or does it promise you can walk away?
Are the failure modes documented? The best 2026 agent products (Claude Code, Computer Use) are remarkably frank about what does not work.
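The four questions above can be written down as a small checklist. This is only a sketch of how I would record the answers; the field names and the two example pitches are hypothetical, and the answers themselves still have to come from reading the product's documentation, not from code.

```python
from dataclasses import dataclass


@dataclass
class AgentPitch:
    name: str
    scope_is_bounded: bool
    output_is_reviewable_artifact: bool
    expects_human_in_loop: bool
    documents_failure_modes: bool


# One entry per question in the test: (field to check, flag raised when it is False).
CHECKS = [
    ("scope_is_bounded", "pitched as general-purpose"),
    ("output_is_reviewable_artifact", "acts in external systems instead of producing an artifact"),
    ("expects_human_in_loop", "promises you can walk away"),
    ("documents_failure_modes", "no documented failure modes"),
]


def red_flags(pitch: AgentPitch) -> list[str]:
    """Return the flag for every question the pitch fails."""
    return [flag for field, flag in CHECKS if not getattr(pitch, field)]


# Hypothetical examples, not real products.
walk_away_agent = AgentPitch("WalkAwayAgent", False, False, False, True)
bounded_tool = AgentPitch("BoundedCodeTool", True, True, True, True)

for pitch in (walk_away_agent, bounded_tool):
    flags = red_flags(pitch)
    print(pitch.name, "->", flags if flags else "passes all four checks")
```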

The class of 2023 was not a failure. It was a useful experiment that taught the field what does not work. The people who learned from it built the tools that are now genuinely useful. The people who kept trying to make AutoGPT-style general agents work mostly moved on. That is a useful outcome, even if it was not the one the early posts promised.

Sources

AutoGPT repo: https://github.com/Significant-Gravitas/AutoGPT
BabyAGI archive: https://github.com/yoheinakajima/babyagi_archive
BabyAGI 2: https://github.com/yoheinakajima/babyagi
Reworkd pivot, TechCrunch, July 2024: https://techcrunch.com/2024/07/24/reworkd-paul-graham-nat-friedman-daniel-gross-scrape-ai-agents/
MetaGPT repo: https://github.com/FoundationAgents/MetaGPT
GPT-Engineer repo: https://github.com/AntonOsika/gpt-engineer
smol-ai/developer: https://github.com/smol-ai/developer
Microsoft AutoGen: https://github.com/microsoft/autogen
Microsoft Agent Framework: https://github.com/microsoft/agent-framework
AG2 community fork: https://github.com/ag2ai/ag2
HuggingGPT paper (arXiv 2303.17580): https://arxiv.org/abs/2303.17580
Microsoft JARVIS repo: https://github.com/microsoft/JARVIS
Devin walkback, Pragmatic Engineer: https://newsletter.pragmaticengineer.com/p/the-pulse-90
Cursor $2B ARR, TechCrunch March 2026: https://techcrunch.com/2026/03/02/cursor-has-reportedly-surpassed-2b-in-annualized-revenue/

Roland Hentschel

AI & Web Technology Expert

