In the era of intelligent agents, how do agents interact with each other?

At the recently concluded OpenAI developer conference, Sam Altman stated that o1 has reached AGI Level 2 (reasoners) and that Level 3 (agents) will follow soon. Before intelligent agents can be widely adopted, however, one key problem remains unsolved: how these agents interact with each other.

1. Challenges in the Era of Intelligent Agents

In the future, billions of intelligent agents will permeate all aspects of life, assisting humans in completing various tasks.

For example, a personal assistant agent ordering food needs to interact with a restaurant agent to inquire about the menu, prices, cooking methods, and delivery times. All of this relies on the collaboration between agents.

For these agents to work efficiently together, they need to communicate and interoperate, seamlessly collaborating like friends and colleagues.

This brings two main challenges:

How can an agent verify the identity of another?
How can we ensure secure communication between intelligent agents?

If intelligent agents can verify identities and communicate only within a single platform, the industry faces one of two outcomes: a platform monopoly, or agents isolated on separate platforms, with the fragmentation and incompatibility that follow. Even if some major platforms establish communication mechanisms among themselves, such mechanisms tend to be closed and controlled, leaving the ecosystem without openness or flexibility.

Therefore, before the era of intelligent agents fully arrives, we need an open, low-cost cross-platform identity authentication and secure communication solution that allows all intelligent agents to connect freely and securely on any platform, achieving seamless collaboration.

2. Limitations of Existing Identity Authentication Methods

Most existing identity authentication relies on centralized platforms and services. Common methods include social media accounts, email accounts, and bank accounts, which are independently managed by their respective platforms, requiring users to register and authenticate separately on each platform.

Identity information is siloed within each of these centralized platforms. For example, a Google user cannot easily communicate securely with users on Facebook or X, because identity information is not shared across platforms and no common secure authentication method exists. This makes it hard for users on different platforms to communicate directly and raises the complexity and cost of collaboration.

For intelligent agents, the limitations of traditional identity authentication schemes are more pronounced, making it difficult for agents on different platforms to recognize each other’s identities and communicate seamlessly.

3. Email and Bitcoin: Benchmarks for Cross-Platform Authentication

Currently, there are two identity authentication schemes on the Internet that differ from mainstream centralized methods: email and Bitcoin.

Email: Email enables cross-platform authentication and communication. Whether it is Google, Microsoft, or other service providers, email accounts can recognize each other and send emails, providing a natural cross-platform authentication and communication mechanism.
Bitcoin: Bitcoin’s identity authentication is fully decentralized without the intervention of a centralized platform. The process of creating a Bitcoin address is simple: the user generates a private key, then derives a public key from it, and performs a series of hashing operations on the public key to generate a unique Bitcoin address. This address functions like a user’s identity ID and can be used to receive and send Bitcoin. The security of transactions and identity verification relies on the protection of the private key. This method achieves true decentralized identity authentication.

These identity authentication schemes have some common features:

Cross-Platform: With both email and Bitcoin, users are not restricted to a specific platform. Email users can communicate across service providers, and Bitcoin users can trade freely without depending on any particular platform.
Simplicity: Both email and Bitcoin addresses are simple by design. A single ID is all a user needs to be recognized. The underlying technologies differ, but both can build on asymmetric cryptography: Bitcoin uses it for every transaction, and email deployments commonly add it through mechanisms such as DKIM or S/MIME.
Decentralization: Bitcoin removes centralized institutions entirely, letting users authenticate and trade with their own private keys. Email relies on service providers, but it is still decentralized to a degree because different providers can exchange mail with one another.

Although neither solution can be applied directly to intelligent agents, both serve as valuable technical references.

4. W3C DID Specification: A New Approach to Cross-Platform Authentication

Is there a technology that can solve the issue of cross-platform identity authentication now? This brings us to the W3C’s recently released DID (Decentralized Identifier) specification. Although its current application is relatively limited, I believe it is the most suitable foundational technology for intelligent agent communication.

DID is a new identity standard specifically designed to address cross-platform and decentralized identity authentication. Simply put, DID allows each individual or agent to have an independent, decentralized identity that is not dependent on any platform but rather controlled by the user.

This way, intelligent agents can implement a unified identity authentication system that allows them to recognize and authenticate each other, regardless of the platform they run on. This not only enhances user control over identity information but also allows for seamless interoperability across platforms, breaking down traditional identity barriers and enabling true cross-platform collaboration.

5. DID Alliance Method: Specific Implementation

However, the DID specification is just a framework, with many specific methods, each having different goals and technical implementations. Currently, there is no ready-made solution that fully meets the needs of intelligent agents. Therefore, we have developed a new standardized approach under the DID framework to solve the identity authentication problem for intelligent agents.

This is the DID Alliance Method that we propose.

Even if you are not familiar with DID, it doesn’t matter. Let me briefly introduce how to use our defined DID Alliance Method to achieve cross-platform identity authentication and secure encrypted communication.

6. Specific Steps of DID Alliance Method

Step 1: Create DID and DID Documents

A DID is a unique decentralized identifier that the user creates and controls without relying on any centralized platform. The DID document contains information related to the DID, such as public keys, verification methods, and service details, which helps others verify the legitimacy of the identity. The DID Alliance Method is inspired by the Bitcoin address creation process: users generate a private key, derive a public key, and then hash the public key to generate a unique DID. In this way, the private key, public key, and DID form a strict one-to-one correspondence.
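
The derivation in Step 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the DID Alliance Method's actual encoding: the text above only says the DID is a hash of the public key, so the choice of secp256k1 (the curve Bitcoin uses), SHA-256, and the `did:example:` prefix are all placeholders. The pure-Python curve arithmetic is not constant-time and should not be used with real keys.

```python
import hashlib
import secrets

# secp256k1 domain parameters (the curve Bitcoin uses).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with double-and-add."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def generate_private_key():
    """Pick a random private key in the range [1, N - 1]."""
    return secrets.randbelow(N - 1) + 1


def derive_public_key(private_key):
    """Return the 33-byte compressed public key for a private key."""
    x, y = scalar_mult(private_key, G)
    prefix = b"\x03" if y & 1 else b"\x02"
    return prefix + x.to_bytes(32, "big")


def derive_did(public_key):
    """Hash the public key into a DID (hypothetical 'did:example:' format)."""
    return "did:example:" + hashlib.sha256(public_key).hexdigest()


private_key = generate_private_key()
public_key = derive_public_key(private_key)
did = derive_did(public_key)
print(did)
```

Because each function is deterministic, the same private key always yields the same public key and the same DID, which is the one-to-one correspondence described above.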

Step 2: Publish the DID Document

Users can publish DID documents on their own platform or a third-party hosting platform. All these platforms must comply with the publication and viewing standards for DID documents. Since building a self-hosted DID document platform is costly for most agents, third-party hosting platforms are an option. All compliant nodes are alliance nodes, hence the name “DID Alliance”.
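
To make the idea of a DID document concrete, here is a rough sketch of what one might contain. The top-level property names (`id`, `verificationMethod`, `authentication`, `service`) follow the general shape of the W3C DID specification, but the key type, the hex key encoding, the `MessageService` type, and the endpoint URL are assumptions chosen for illustration, not values the DID Alliance Method prescribes.

```python
import json


def build_did_document(did, public_key_hex, endpoint):
    """Assemble a DID document as a plain dict (illustrative field values)."""
    key_id = did + "#key-1"
    return {
        "@context": "https://www.w3.org/ns/did/v1",
        "id": did,
        "verificationMethod": [
            {
                "id": key_id,
                "type": "EcdsaSecp256k1VerificationKey2019",  # assumed key type
                "controller": did,
                "publicKeyHex": public_key_hex,  # assumed encoding
            }
        ],
        "authentication": [key_id],
        "service": [
            {
                "id": did + "#messaging",
                "type": "MessageService",  # hypothetical service type
                "serviceEndpoint": endpoint,
            }
        ],
    }


document = build_did_document(
    did="did:example:1234abcd",
    public_key_hex="02" + "11" * 32,  # placeholder bytes, not a real key
    endpoint="wss://agent.example.com/messages",
)
print(json.dumps(document, indent=2))
```

The resulting JSON is what a self-hosted or third-party platform would serve to anyone who asks for this DID.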

Step 3: Disseminate the DID of the Agent

Disseminate the DID of an agent through reliable means, such as email, official websites, or authentication websites. The first step in communicating with an agent is to find its DID.

Step 4: Download the DID Document According to the DID

Download the DID document from a self-built platform or third-party hosting platform to obtain relevant information about the DID.
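
A download step along these lines might look like the sketch below. The `/did/<did>` URL layout and the `host.example.com` address are hypothetical, since the text does not define how a hosting platform exposes documents. The fetcher is passed in as a parameter so the example can run against an in-memory stand-in instead of a real server.

```python
import json
import urllib.parse
import urllib.request


def http_fetch(url):
    """Default fetcher: download the raw bytes at a URL."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def download_did_document(did, host_base_url, fetch=http_fetch):
    """Fetch the DID document for `did` from a hosting platform.

    The '/did/<did>' path is a hypothetical layout; a real platform would
    follow whatever publication standard the alliance defines.
    """
    url = host_base_url.rstrip("/") + "/did/" + urllib.parse.quote(did, safe="")
    document = json.loads(fetch(url))
    # Reject a document that describes a different identity than requested.
    if document.get("id") != did:
        raise ValueError("DID document id does not match the requested DID")
    return document


# Usage with an in-memory stand-in for the hosting platform.
hosted = {
    "https://host.example.com/did/did%3Aexample%3A1234abcd": json.dumps(
        {"id": "did:example:1234abcd"}
    ).encode()
}
document = download_did_document(
    "did:example:1234abcd", "https://host.example.com", fetch=hosted.__getitem__
)
print(document["id"])
```

Checking the `id` field here is only a basic sanity check; the real trust decision happens in the next step.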

Step 5: Verify the Identity of the Agent

Using the DID document, find the agent’s message service endpoint and send an identity verification request. Verification has two parts: first check that the public key in the document matches the DID, then confirm through signature verification that the agent holds the corresponding private key.
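
The first of these checks, that the public key really belongs to the DID, can be sketched as below. It assumes the same illustrative scheme as before (DID = `did:example:` plus the SHA-256 hash of the public key, and a `publicKeyHex` field in the document), none of which is confirmed by the method itself. The second check, verifying a signature over a fresh challenge, needs a signature library and is outside this sketch.

```python
import hashlib
import hmac


def extract_public_key_hex(document):
    """Read the first verification key from a DID document (assumed layout)."""
    return document["verificationMethod"][0]["publicKeyHex"]


def public_key_matches_did(did, public_key_hex):
    """Return True if hashing the public key reproduces the DID."""
    digest = hashlib.sha256(bytes.fromhex(public_key_hex)).hexdigest()
    expected = "did:example:" + digest
    return hmac.compare_digest(expected, did)


# Build a matching DID and document from placeholder key bytes.
public_key_hex = "02" + "11" * 32  # placeholder bytes, not a real key
did = "did:example:" + hashlib.sha256(bytes.fromhex(public_key_hex)).hexdigest()
document = {"id": did, "verificationMethod": [{"publicKeyHex": public_key_hex}]}

genuine = public_key_matches_did(did, extract_public_key_hex(document))
tampered = public_key_matches_did(did, "03" + "11" * 32)
print(genuine, tampered)  # True False
```

Because the DID is derived from the key, a hosting platform cannot swap in a different public key without the mismatch being detected.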

Step 6: Conduct Secure Encrypted Communication

If verification succeeds, end-to-end encrypted communication can proceed using the public and private keys. Our design is modeled on the standard TLS handshake in order to provide strong security.
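
One piece of a TLS-style design is turning the secret agreed during the handshake into separate encryption keys for each direction. The sketch below uses HKDF with SHA-256 (RFC 5869), the same building block TLS 1.3 relies on. It is only an approximation under assumptions: the labels `i2r key` and `r2i key`, the use of a transcript hash as salt, and the all-zero placeholder secret are invented for illustration, and the key exchange and message encryption themselves are not shown.

```python
import hashlib
import hmac


def hkdf_extract(salt, input_key_material):
    """HKDF-Extract (RFC 5869) with SHA-256."""
    return hmac.new(salt, input_key_material, hashlib.sha256).digest()


def hkdf_expand(pseudo_random_key, info, length):
    """HKDF-Expand (RFC 5869) with SHA-256."""
    if length > 255 * 32:
        raise ValueError("requested output is too long for HKDF-SHA256")
    output, block, counter = b"", b"", 1
    while len(output) < length:
        block = hmac.new(
            pseudo_random_key, block + info + bytes([counter]), hashlib.sha256
        ).digest()
        output += block
        counter += 1
    return output[:length]


def derive_session_keys(shared_secret, handshake_transcript):
    """Derive one 32-byte key per direction from a shared secret.

    Salting with a hash of the handshake messages ties the keys to this
    particular session (an assumption, loosely modeled on TLS 1.3).
    """
    salt = hashlib.sha256(handshake_transcript).digest()
    prk = hkdf_extract(salt, shared_secret)
    return {
        "initiator_to_responder": hkdf_expand(prk, b"i2r key", 32),
        "responder_to_initiator": hkdf_expand(prk, b"r2i key", 32),
    }


shared_secret = bytes(32)  # placeholder; would come from a key exchange
keys = derive_session_keys(shared_secret, b"handshake messages from both agents")
print({name: key.hex() for name, key in keys.items()})
```

Both agents run the same derivation on the same inputs, so they arrive at identical keys without ever sending those keys over the network.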

At this point, the identity authentication and encrypted communication process between intelligent agents is complete, without requiring users to rely on any platform.

Future Outlook

Currently, we mainly transmit text information. In the future, we will expand to more diverse formats, including files (such as audio and video), live streams, and real-time communication. This will cover the vast majority of business scenarios on the Internet.

The DID we designed is essentially equivalent to a blockchain address and can also serve as a blockchain wallet address, so building a fully decentralized identity system on a blockchain becomes an option. We could publish DID documents on the blockchain for public access, create business-related tokens for DID-based transactions and settlement so that value transfer between intelligent agents becomes more convenient, and use the blockchain to organize distributed computing power into a decentralized messaging service network for intelligent agents. The future holds many possibilities.