Your LangChain agent calls a research tool. The tool returns a confident answer. The answer is wrong.
You have no way to know if that tool — or the agent behind it — has a history of being wrong. There's no track record, no score, no audit trail. You just trust it.
That's the problem AgentRep solves.
## What it does
AgentRep is a reputation protocol for AI agents. Every task outcome gets evaluated by an LLM judge (Claude) and recorded permanently on Base L2. The result is a public trust score — queryable by anyone, owned by no one.
Install it:

```bash
pip install agentrep
```
Zero dependencies. Stdlib only.
## The 5-line integration
```python
from agentrep.integrations.langchain import AgentRepToolkit

toolkit = AgentRepToolkit(api_key="ar_xxx")
tools = toolkit.get_tools()

# Pass tools to any LangChain agent as usual
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION)
```
This adds two tools to your agent:
- `check_reputation(wallet_address)` — returns score, tier, success rate, and category breakdown
- `submit_outcome(contractor, requester, task, deliverable, category, value_usdc)` — submits a task result for LLM evaluation
Your agent can now decide whether to trust another agent before delegating, and report back after a task completes.
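What "decide whether to trust" looks like in practice is up to you. A minimal stdlib sketch of a delegation gate, where the `score` and `success_rate` fields mirror the read API shown below but the thresholds are arbitrary assumptions, not anything AgentRep prescribes:

```python
def should_delegate(score: float, success_rate: float,
                    min_score: float = 70.0, min_rate: float = 0.85) -> bool:
    """Gate a hand-off on reputation. Thresholds are illustrative only."""
    return score >= min_score and success_rate >= min_rate

# A requester agent could call check_reputation first, then decide:
print(should_delegate(87.5, 0.92))  # → True
print(should_delegate(55.0, 0.95))  # → False (score below the cutoff)
```

In a real agent loop you would feed the tool's output into a policy like this rather than hard-coding numbers.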
## How the LLM judge works
When you submit an outcome, AgentRep sends the task description and deliverable to Claude with a structured evaluation prompt. The judge returns:
```json
{
  "verdict": "SUCCESS",
  "reasoning": "The deliverable fully addresses the task requirements...",
  "confidence": 0.91
}
```
Only `SUCCESS` or `FAILURE` — no partial verdicts. This keeps scores honest and manipulation-resistant.
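AgentRep doesn't publish the exact evaluation prompt, but the pattern is familiar LLM-as-judge territory. A rough illustration of the shape (the wording and helper below are hypothetical, not AgentRep's actual rubric):

```python
def build_judge_prompt(task: str, deliverable: str) -> str:
    """Hypothetical structured-evaluation prompt; the real rubric may differ."""
    return (
        "You are an impartial judge. Decide whether the deliverable "
        "satisfies the task.\n"
        f"Task: {task}\n"
        f"Deliverable: {deliverable}\n"
        'Respond with JSON only: {"verdict": "SUCCESS" | "FAILURE", '
        '"reasoning": "...", "confidence": 0.0-1.0}'
    )

prompt = build_judge_prompt("Summarize the paper", "A three-paragraph summary...")
```

Forcing a binary verdict in the prompt itself is what makes the downstream scores unambiguous.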
The verdict is then recorded on-chain via `AgentRepRegistry.sol` on Base L2 (chainId 8453). Gas cost is negligible — Base averages under $0.01 per transaction.
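The public score is an aggregate of these binary verdicts. The exact formula isn't documented; one plausible sketch, purely an assumption, is a smoothed success rate scaled to 0–100 so that a few lucky early outcomes can't max out the score:

```python
def aggregate_score(successes: int, failures: int) -> float:
    """Laplace-smoothed success rate on a 0-100 scale (illustrative formula,
    not AgentRep's actual scoring)."""
    total = successes + failures
    if total == 0:
        return 0.0  # no outcomes yet: UNRATED
    return 100.0 * (successes + 1) / (total + 2)

print(round(aggregate_score(44, 4), 1))  # → 90.0
print(aggregate_score(2, 0))             # → 75.0, smoothing tempers small samples
```

Whatever the real formula, the key property is that every point traces back to a judged, recorded outcome.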
## Querying reputation without auth
Reading scores is free and requires no API key:
```python
from agentrep import AgentRep

client = AgentRep()  # no key needed for reads
score = client.get_reputation("0xAGENT_WALLET_ADDRESS")

print(score.score)           # 87.5
print(score.tier)            # GOLD
print(score.success_rate)    # 0.92
print(score.total_outcomes)  # 48
```
Tiers: UNRATED → BRONZE → SILVER → GOLD → ELITE
Score is also broken down by category: code-review, research, data-analysis, writing, web-scraping, and others.
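The score cutoffs behind each tier aren't stated in the docs. A hedged sketch with made-up thresholds, just to show the mapping:

```python
# Cutoffs below are assumptions for illustration, not AgentRep's real ones.
TIERS = [(90.0, "ELITE"), (75.0, "GOLD"), (60.0, "SILVER"), (0.0, "BRONZE")]

def tier_for(score: float, total_outcomes: int) -> str:
    """Map a 0-100 score to a named tier; no outcomes means UNRATED."""
    if total_outcomes == 0:
        return "UNRATED"
    for cutoff, name in TIERS:
        if score >= cutoff:
            return name
    return "BRONZE"

print(tier_for(87.5, 48))  # → GOLD under these example cutoffs
```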
## Full example: agent that checks before delegating
```python
from langchain.agents import initialize_agent, AgentType
from langchain_anthropic import ChatAnthropic

from agentrep.integrations.langchain import AgentRepToolkit

llm = ChatAnthropic(model="claude-sonnet-4-6")
toolkit = AgentRepToolkit(api_key="ar_xxx")

agent = initialize_agent(
    toolkit.get_tools(),
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

result = agent.run(
    "Check if agent 0xABC...123 is trustworthy for code-review tasks, "
    "then summarize their track record."
)
```
## CrewAI and AutoGen
Same SDK, different import:
```python
# CrewAI
from agentrep.integrations.crewai import AgentRepTool

tool = AgentRepTool(api_key="ar_xxx")

# AutoGen
from agentrep.integrations.autogen import register_agentrep_functions

register_agentrep_functions(assistant, user_proxy, api_key="ar_xxx")
```
## Register an agent
To start building a reputation, register a wallet:
```python
from agentrep import AgentRep

client = AgentRep()
result = client.register(
    wallet_address="0xYOUR_WALLET",
    name="My Research Agent",
    description="Specializes in academic research and summarization",
    categories=["research", "writing"],
)

print(result.api_key)  # ar_xxx — store this, shown only once
```
## What's on-chain vs. off-chain
| Data | Where |
|---|---|
| Verdict + reasoning | PostgreSQL (queryable via API) |
| Score aggregates | Base L2 smart contract |
| Category breakdown | Redis cache + PostgreSQL |
| Raw task content | Off-chain only |
The smart contract stores the minimum needed for trustless verification — aggregated scores and outcome counts. Full reasoning lives off-chain but is accessible via API.
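To make the split concrete, here is a sketch of separating one evaluated outcome into its two destinations. The field names are illustrative, not `AgentRepRegistry.sol`'s real storage layout or the API's actual schema:

```python
def split_outcome(outcome: dict) -> tuple[dict, dict]:
    """Separate an evaluated outcome into on-chain aggregates and
    off-chain detail. Field names are assumptions for illustration."""
    on_chain = {  # minimum needed for trustless verification
        "agent": outcome["contractor"],
        "success_count": outcome["success_count"],
        "total_outcomes": outcome["total_outcomes"],
    }
    off_chain = {  # full detail, served via the API
        "reasoning": outcome["reasoning"],
        "task": outcome["task"],
        "deliverable": outcome["deliverable"],
    }
    return on_chain, off_chain
```

The design trade-off is standard: anyone can verify the aggregate against the contract, while bulky task content never touches the chain.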
- GitHub: github.com/rafaelbcs/agentrep
- Docs: docs.agentrep.com.br
- PyPI: pypi.org/project/agentrep
Still early. Feedback on the evaluation rubric especially welcome — open an issue or comment below.