Through the Lens of Manus and MCP: The Cross-Boundary Exploration of AI Agents in Web3
On March 6, Manus, the world’s first universal AI agent product, launched by Chinese startup Monica, went viral across Chinese tech media and social networks. Invitation codes were so scarce on the first day that they became a hot commodity, with some listed for as much as 50,000 yuan on second-hand platforms. Many industry KOLs received early access, and a flood of hands-on reviews followed.
As a universal AI agent product, Manus can autonomously carry a task from planning through execution, such as drafting a report or building a spreadsheet. Rather than only generating ideas, it plans independently, acts on the plan, and delivers a complete result, which gives it notable versatility and execution capacity.
The explosive popularity of Manus has not only attracted attention within the industry but has also provided valuable product ideas and design inspiration for various AI agent developments. With the rapid advancements in AI technology, AI agents are gradually transitioning from concept to reality and are demonstrating tremendous application potential across various industries, including the Web3 sector.
Background Knowledge
An AI Agent (artificial intelligence agent) is a computer program that makes decisions and executes tasks autonomously based on its environment, its inputs, and predefined goals. Its core components are a Large Language Model (LLM) as the “brain,” which processes information, learns from interactions, makes decisions, and executes actions; perception and observation mechanisms to sense the environment; a reasoning process that analyzes observations and memories while weighing potential actions; action execution as the explicit response to thought and observation; and memory and retrieval to store past experiences for learning.
The design patterns of AI Agents start from the ReAct model and then follow two developmental paths: one focuses on the AI Agent’s planning ability, including ReWOO, Plan & Execute, and LLM Compiler, while the other emphasizes reflective ability, including Basic Reflection, Reflexion, Self Discovery, and LATS.
Among these, ReAct was the first AI Agent design pattern to emerge and remains the most widely used, so we focus on it here. ReAct solves language reasoning and decision-making tasks by interleaving reasoning and acting within a language model. Its typical process is a cycle of Thought → Action → Observation, abbreviated as the TAO cycle.
- Thought: When faced with a problem, we need to engage in deep thinking. This thinking process involves defining the problem, identifying the key information necessary to solve it, and outlining reasoning steps.
- Action: Once the thinking direction has been established, the next step is action. Based on our thoughts, we take appropriate measures or execute specific tasks to drive the problem toward a resolution.
- Observation: After taking action, we must carefully observe the results. This step tests the effectiveness of our actions and whether we are approaching the solution to the problem.
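- Iterative Loop: The Thought → Action → Observation cycle repeats, with each observation feeding the next round of thinking, until the problem is solved.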
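The TAO cycle can be sketched in a few lines of Python. This is a minimal illustration, not how any particular framework implements ReAct: `scripted_policy` is a hypothetical stand-in for the LLM call, and the `multiply` tool and the `finish` action name are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str, str]  # (thought, action, observation)

def scripted_policy(question: str, history: List[Step]) -> Tuple[str, str, str]:
    """Hypothetical stand-in for an LLM: returns (thought, action, action_input)."""
    if not history:
        return ("I need the product of the two numbers.", "multiply", "6,7")
    last_observation = history[-1][2]
    return ("The tool result answers the question.", "finish", last_observation)

def multiply(arg: str) -> str:
    """Toy tool: multiplies two comma-separated integers."""
    a, b = (int(x) for x in arg.split(","))
    return str(a * b)

TOOLS: Dict[str, Callable[[str], str]] = {"multiply": multiply}

def react_loop(question: str, policy, tools, max_steps: int = 5) -> str:
    history: List[Step] = []
    for _ in range(max_steps):
        # Thought: the policy reasons over the question and past observations.
        thought, action, action_input = policy(question, history)
        if action == "finish":
            return action_input
        # Action: run the chosen tool.
        observation = tools[action](action_input)
        # Observation: record the result so the next thought can use it.
        history.append((thought, action, observation))
    raise RuntimeError("no answer within max_steps")

answer = react_loop("What is 6 times 7?", scripted_policy, TOOLS)
print(answer)  # prints 42
```

The `max_steps` cap is a design choice for the sketch: without it, a policy that never emits `finish` would loop forever.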
AI Agents can also be categorized as Single Agent or Multi-Agent based on the number of agents. Single Agent focuses on the collaboration between the LLM and tools, with the Agent interacting with the user over multiple iterations while completing a task. Multi-Agent assigns different roles to different Agents, which cooperate to complete complex tasks; compared to Single Agent, user interaction tends to be less frequent. Most current frameworks are centered on Single Agent scenarios.
The Model Context Protocol (MCP), launched by Anthropic on November 25, 2024, is an open-source protocol designed to solve the connection and interaction problems between LLMs and external data sources. If an LLM is likened to an operating system, MCP is its USB interface: external data and tools can be plugged in flexibly, and users can then read and use those external resources.
MCP offers three capabilities to extend LLMs: Resources (knowledge expansion), Tools (executable functions that call external systems), and Prompts (pre-written prompt templates). MCP adopts a client-server architecture, with messages exchanged using the JSON-RPC protocol. Anyone can develop and host an MCP Server, and can also take the service offline at any time.
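To make the client-server exchange concrete, here is a toy sketch of a Tools request and response as JSON-RPC 2.0 messages. The method names `tools/list` and `tools/call` and the `content` result shape follow the MCP specification as we understand it; the `get_token_price` tool, its data, and the in-process `handle` function are hypothetical, and a real server would sit behind an actual transport rather than a function call.

```python
import json
from typing import Any, Callable, Dict

def get_token_price(symbol: str) -> str:
    """Hypothetical tool backed by made-up data, for illustration only."""
    prices = {"ETH": "2000.00"}
    return prices.get(symbol, "unknown")

TOOLS: Dict[str, Callable[..., str]] = {"get_token_price": get_token_price}

def make_request(request_id: int, method: str, params: Dict[str, Any]) -> str:
    """Client side: serialize a JSON-RPC 2.0 request."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )

def error(request_id: Any, code: int, message: str) -> str:
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}
    )

def handle(raw: str) -> str:
    """Toy server side: dispatch one request and serialize the response."""
    req = json.loads(raw)
    if req["method"] == "tools/list":
        result: Dict[str, Any] = {"tools": [{"name": name} for name in TOOLS]}
    elif req["method"] == "tools/call":
        tool = TOOLS.get(req["params"]["name"])
        if tool is None:
            return error(req["id"], -32602, "Unknown tool")  # invalid params
        text = tool(**req["params"].get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}]}
    else:
        return error(req["id"], -32601, "Method not found")
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})

request = make_request(
    1, "tools/call", {"name": "get_token_price", "arguments": {"symbol": "ETH"}}
)
response = json.loads(handle(request))
print(response["result"]["content"][0]["text"])
```

Because the server is just a process answering these messages, whoever hosts it controls its availability, which is the single-operator dependency noted above.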
Current State of AI Agents in Web3
In the Web3 industry, the buzz surrounding AI Agents peaked in January 2025 and then declined sharply, with overall market value dropping by over 90%. The remaining attention and valuation are concentrated on the exploration of AI Agent frameworks, which fall into three models: the launchpad model exemplified by Virtuals Protocol, the DAO model represented by ElizaOS, and the business model represented by Swarms.
Launchpads allow users to create, deploy, and monetize AI Agents, similar to pump.fun in the meme sphere but targeted at AI Agents. Virtuals Protocol is currently the largest launchpad, with over 100,000 agents issued, and the popular “cryptocurrency KOL” AIXBT was created on Virtuals. The platform features a modular Agent framework called G.A.M.E, designed to give developers an efficient and open framework that simplifies AI Agent development and launch, much as WordPress simplifies building a website.
DAO stands for decentralized autonomous organization. ElizaOS (formerly ai16z) was created by @shawmakesmagic on the daos.fun platform. Its initial idea was to use AI models to simulate the investment decisions of the renowned venture capital firm a16z and its co-founder Marc Andreessen, while incorporating feedback from DAO members into those investments. It has since evolved into a DAO centered on the Eliza framework for AI agent developers. The Eliza framework, built with TypeScript, provides a flexible and scalable platform for AI Agent development, allowing agents to interact across multiple platforms while maintaining a consistent personality and knowledge.
Swarms, initiated in 2022 by @KyeGomezB (now 20 years old), is an enterprise-level multi-agent framework. It lets multiple AI Agents collaborate like a team, meeting complex business operation needs through smart orchestration and efficient cooperation. Swarms began as a purely Web2 AI agent project; according to the founder, it has over 45 million agents running in production environments, serving the world’s largest financial, insurance, and healthcare institutions. It moved from Web2 into Web3 only after issuing the $SWARMS token in December 2024.
From an economic model perspective, only launchpads can currently achieve a self-sustaining economic cycle. Take Virtuals as an example:
- Agent Creation: Creators launch new AI agents on the Virtuals platform.
- Bonding Curve Setup: The creator pays 100 $VIRTUAL tokens to create a bonding curve for the new agent’s token, pairing it with $VIRTUAL.
- Liquidity Pool Creation: Once the bonding curve limit is reached, the agent “graduates” and a liquidity pool is created for the agent’s token paired with $VIRTUAL, adhering to a fair launch principle with no pre-mining or internal distribution, a fixed total supply, and a long-term liquidity lock.
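The steps above can be sketched as a toy model. Only the 100 $VIRTUAL creation payment comes from the text; the linear price curve, the `slope` value, and the graduation threshold are assumptions chosen to keep the arithmetic easy to follow, and they do not describe the curve Virtuals actually uses.

```python
from dataclasses import dataclass
from typing import Tuple

CREATION_COST = 100.0  # $VIRTUAL paid by the creator, as stated in the article

@dataclass
class BondingCurve:
    slope: float               # assumption: price rises by `slope` $VIRTUAL per token sold
    graduation_reserve: float  # assumption: $VIRTUAL reserve at which the agent graduates
    sold: float = 0.0          # agent tokens sold so far
    reserve: float = 0.0       # $VIRTUAL collected by the curve
    graduated: bool = False

    def price(self) -> float:
        """Marginal price of the next token on a linear curve."""
        return self.slope * self.sold

    def buy(self, tokens: float) -> float:
        """Buy `tokens` agent tokens; returns the $VIRTUAL cost."""
        if self.graduated:
            raise RuntimeError("agent has graduated; trade in the liquidity pool")
        new_sold = self.sold + tokens
        # Area under the linear price curve between `sold` and `new_sold`.
        cost = self.slope * (new_sold ** 2 - self.sold ** 2) / 2
        self.sold = new_sold
        self.reserve += cost
        if self.reserve >= self.graduation_reserve:
            self.graduated = True  # at this point a liquidity pool would be created
        return cost

def launch_agent(
    creator_balance: float, slope: float, graduation_reserve: float
) -> Tuple[float, BondingCurve]:
    """Deduct the creation payment and open a bonding curve for the new agent token."""
    if creator_balance < CREATION_COST:
        raise ValueError("creator cannot cover the creation payment")
    return creator_balance - CREATION_COST, BondingCurve(slope, graduation_reserve)

balance, curve = launch_agent(500.0, slope=0.5, graduation_reserve=100.0)
first = curve.buy(10)   # costs 25.0; marginal price is now 5.0
second = curve.buy(10)  # costs 75.0; reserve reaches 100.0 and the agent graduates
print(balance, first, second, curve.graduated)
```

Later buyers pay more per token than earlier ones, which is what lets demand alone push the reserve toward the graduation threshold.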
Virtuals charges launch fees for AI Agents, transaction fees on each agent token trade, and inference fees when agents access LLMs through the Virtuals API. ElizaOS and Swarms are currently planning to build their own launchpads.
However, launchpads have their own issues. An asset issuance model only forms a positive feedback loop if the issued assets are inherently attractive. Currently, the vast majority of launched AI Agents are essentially memes with no intrinsic value behind them, so they lose value rapidly once market attention fades. In the current sluggish market, launchpads struggle to attract creators, and the economic model cannot operate effectively.
MCP’s Exploration in Web3
The emergence of MCP offers new exploratory directions for current AI Agents in Web3, most intuitively in two areas:
- Deploying MCP servers on blockchain networks, which removes the single point of failure of a centrally hosted MCP server and enhances censorship resistance.
- Equipping MCP servers with the capability to interact with blockchains, such as facilitating DeFi transactions and management, thus lowering technical barriers.
The first direction places high demands on the underlying blockchain’s storage system, data management capabilities, and asynchronous computing abilities, which points to blockchains such as 0G. 0G is a modular AI blockchain featuring a scalable, programmable DA layer suited to AI DApps. Its modular technology is intended to enable frictionless inter-chain interoperability while ensuring security, minimizing fragmentation, and maximizing connectivity, thereby creating a decentralized AI ecosystem.
The second direction resembles a variant of DeFAI; however, each current DeFAI backend wraps its own set of Tool functions. UnifAI is building a unified DeFAI MCP server to avoid this duplicated effort. UnifAI offers a platform where autonomous AI agents can perform on-chain and off-chain tasks within the Web3 ecosystem. It includes UniQ for task automation, an agent service market, and infrastructure for tool discovery.
In addition to these two directions, @brucexu_eth, founder of LXDAO and ETHPanda, proposed building OpenMCP.Network, a creator incentive network on Ethereum. An MCP Server needs to be hosted and to provide stable service. Under the proposal, users pay LLM vendors, and the vendors distribute incentives through the network to keep MCP servers sustainable and reliable, which encourages MCP creators to keep producing high-quality content. The network would rely on smart contracts to automate incentive distribution and to keep it transparent, trustworthy, and censorship-resistant. During operation, signing, permission validation, and privacy protection can be implemented with Ethereum wallets, ZK technologies, and similar tools.
While the combination of MCP and Web3 could in theory inject decentralized trust mechanisms and an economic incentive layer into AI Agent applications, current zero-knowledge proof (ZKP) technologies still struggle to verify the authenticity of Agent behavior, and decentralized networks still face efficiency issues. The combination is therefore not feasible in the short term.
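The incentive flow in this proposal can be illustrated with a small off-chain sketch. The proposal as summarized here does not say how incentives are weighted, so splitting the pool pro rata by call count is purely an assumption, as are the server names and the integer token units; on Ethereum this logic would live in a smart contract rather than in Python.

```python
from typing import Dict

def distribute(pool: int, calls: Dict[str, int]) -> Dict[str, int]:
    """Split `pool` (in smallest token units) pro rata by each server's call count.

    Integer division mirrors on-chain arithmetic; any rounding remainder is
    simply left undistributed in this sketch.
    """
    total_calls = sum(calls.values())
    if total_calls == 0:
        return {server: 0 for server in calls}
    return {server: pool * n // total_calls for server, n in calls.items()}

# Hypothetical usage figures for three MCP servers over one payout period.
usage = {"price-feed": 60, "dex-swap": 30, "nft-metadata": 10}
payouts = distribute(1000, usage)
print(payouts)  # prints {'price-feed': 600, 'dex-swap': 300, 'nft-metadata': 100}
```

Call counts are the easiest metric to game, so a real network would need the signature and permission checks mentioned above before trusting usage reports.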
Conclusion
The release of Manus marks a significant milestone for universal AI Agent products. The Web3 world needs a milestone product of its own to answer the skepticism about Web3’s practical value, since the sector is still viewed primarily as speculative.
The emergence of MCP has opened new exploratory directions for AI Agents in Web3: deploying MCP servers on blockchain networks, enabling MCP servers to interact with blockchains, and establishing a creator incentive network for MCP Servers.
AI represents one of the grandest narratives in history; for Web3, integrating with AI is inevitable. We still need to maintain patience and confidence as we continue our explorations.
About ZAN
As a technology brand of Ant Digital Technologies for Web3 products and services, ZAN provides rich and reliable services for business innovations and a development platform for Web3 endeavors.
The ZAN product family includes ZAN Node Service, ZAN PowerZebra (zk acceleration), ZAN Identity (Know your customers and clients), ZAN Smart Contract Review, with more products in the pipeline.