At the start of February, Prosus Group brought together 100 agent builders from its portfolio companies at AI House Amsterdam for its first Agent Bootcamp. The event aimed to advance the development of AI agents that can deliver measurable business outcomes. Nishikant Dhanuka, Senior Director of AI at Prosus Group, shared insights on the current state and future direction of AI agents.
Dhanuka noted a shift in industry focus over the past year. “A year ago, the conversation across the AI industry centered on the question: which is the smartest model? Today, the conversation has shifted to something far more consequential: for how long can your agent work autonomously before it breaks? This transition from one-shot intelligence to endurance, from conversational AI to genuine autonomy, is the defining narrative of 2026. And much of the industry is still grappling with what this shift truly means.”
In 2025, several key developments marked progress in AI agent technology. DeepSeek's release popularized models that separate their reasoning process from their final output, enabling better planning by agents. The Model Context Protocol (MCP) became the standard for integrating agents with tools. Developers increasingly adopted agentic Integrated Development Environments (IDEs), such as Claude Code, shifting their workflow toward agent panels rather than traditional code panels. Agentic browsers also emerged, allowing web tasks to be performed autonomously.
By year-end, OpenAI and Stripe introduced the Agentic Commerce Protocol, which enabled agents to conduct real transactions beyond information retrieval. New models like Claude Opus 4.5 and GPT-5.2 were designed specifically for autonomous work rather than chat functions.
Dhanuka described a changing landscape in model performance among major AI labs: “When you plot model performance across major AI labs, the picture is striking. OpenAI’s initial lead was enormous, but today the field is crowded.” He explained that as access to advanced models becomes widespread, competitive advantage now lies in orchestration layers and application logic rather than model intelligence alone.
One notable trend was the rise of terminals as interfaces for autonomous agents. Tools like Claude Code made terminal use accessible even to non-technical users by providing structured loops—gathering context, taking action, verifying results—which enable sustained complex work.
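The gather-act-verify loop described above can be sketched as a minimal control structure. This is an illustrative sketch only: `gather_context`, `take_action`, and `verify` are hypothetical stand-ins for a model call, a tool execution, and a validation step (tests, linters, assertions), not any vendor's API.

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    context: list = field(default_factory=list)  # accumulated (observation, result) pairs
    done: bool = False

def gather_context(state: AgentState) -> str:
    # Hypothetical: read files, inspect prior tool output, query the environment.
    return "observation"

def take_action(state: AgentState, observation: str) -> str:
    # Hypothetical: the model chooses and executes a tool call.
    return "result"

def verify(state: AgentState, result: str) -> bool:
    # Hypothetical: run tests or checks to confirm the action worked.
    return True

def agent_loop(state: AgentState, max_steps: int = 10) -> AgentState:
    """Minimal gather -> act -> verify loop; stops when verification passes."""
    for _ in range(max_steps):
        observation = gather_context(state)
        result = take_action(state, observation)
        state.context.append((observation, result))
        if verify(state, result):
            state.done = True
            break
    return state
```

The key design point is that the loop, not any single model call, carries the task: each iteration re-grounds the agent in fresh context, which is what makes sustained multi-hour work possible.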
The pace continued into January 2026 with new features such as Claude Code’s “Ralph Wiggum Loop,” CoWork for non-engineers, and sub-agent swarms through Claude Code Tasks. Clawdbot (now OpenClaw), an open-source local agent capable of managing browser and file operations while communicating via messaging apps, gained popularity as a proactive assistant.
Another significant development was Moonshot AI’s Kimi 2.5 release featuring Agent Swarm mode—a model trained through reinforcement learning to decide when to launch sub-agents.
Newer frontier models are now able to handle longer and more complex tasks autonomously—up to five hours per session—with task duration doubling roughly every six months.
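Taking the two figures in the text at face value (a five-hour horizon today, doubling every six months), the claim is simple exponential growth, which compounds quickly:

```python
def projected_horizon(hours_now: float, months_ahead: float,
                      doubling_months: float = 6.0) -> float:
    """Extrapolate autonomous task duration under a fixed doubling period."""
    return hours_now * 2 ** (months_ahead / doubling_months)

# Under these assumptions, a 5-hour horizon today becomes
# 20 hours in one year and 80 hours in two.
```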
Dhanuka emphasized that advances benefit all types of agents: “You might think these advances only matter for coding agents. The truth is that any agent can be represented as a coding agent if you give it access to a terminal and filesystem.” This approach allows finance or customer support agents to automate multi-step processes without human intervention.
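One concrete reading of this idea is to expose the terminal to any agent as a single tool that runs a command and returns structured output. The `run_shell` helper below is an illustrative sketch, not a production design; a real deployment would need sandboxing, command allow-lists, and resource limits.

```python
import subprocess

def run_shell(command: str, timeout: int = 30) -> dict:
    """Run a shell command on the agent's behalf and return a structured result.

    Illustrative only: production use requires sandboxing and allow-lists.
    """
    proc = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return {
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

With this one tool plus filesystem access, a finance or support agent can write and run small scripts to chain its own steps, rather than relying on a fixed menu of purpose-built integrations.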
He summarized three main takeaways:
1. Production-grade autonomy is now available; agents can maintain focus for hours.
2. Code has become a universal interface for autonomy; giving an agent shell access extends its capabilities.
3. As models become commoditized, value shifts toward building robust orchestration layers—including context management and memory architectures—that provide lasting advantages.
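The "orchestration layer" in the third takeaway often amounts to controlling what the model sees. A toy context manager that trims history to a token budget, while always preserving the system prompt, illustrates the kind of application-layer logic meant here; the four-characters-per-token estimate is a crude illustrative heuristic, not a real tokenizer.

```python
class ContextManager:
    """Toy orchestration-layer memory: keep a rolling window of messages
    under a token budget, always preserving the system prompt."""

    def __init__(self, system_prompt: str, budget_tokens: int = 1000):
        self.system_prompt = system_prompt
        self.budget = budget_tokens
        self.history = []  # list of (role, content) tuples

    @staticmethod
    def estimate_tokens(text: str) -> int:
        return max(1, len(text) // 4)  # crude heuristic: ~4 chars per token

    def add(self, role: str, content: str) -> None:
        self.history.append((role, content))
        self._trim()

    def _trim(self) -> None:
        def total() -> int:
            return self.estimate_tokens(self.system_prompt) + sum(
                self.estimate_tokens(content) for _, content in self.history
            )
        while self.history and total() > self.budget:
            self.history.pop(0)  # drop the oldest turns first

    def render(self) -> list:
        return [("system", self.system_prompt)] + self.history
```

Real harnesses go further, with summarization, retrieval, and persistent memory, but the principle is the same: the durable advantage sits in this layer, not in the model call itself.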
“At Prosus, we see this play out daily across a portfolio spanning four continents, and from food delivery to classifieds to travel,” Dhanuka said. “The teams building great agents aren’t waiting for smarter models. They’re investing in the harness, the orchestration, and the infrastructure that turns autonomy from a demo into a business outcome.”
As Dhanuka finalized his article on February 5th, both Anthropic and OpenAI released new frontier models: Claude Opus 4.6 introduced features like agent teams and adaptive thinking within a large context window, while GPT-5.3-Codex is notable as OpenAI's first model to help debug its own training run.
“The race is intensifying—in one direction: autonomy,” Dhanuka concluded.


