Slack Just Throttled Your OpenClaw Agent. You Probably Haven't Noticed Yet.
On March 3rd, Slack flipped a switch. If your OpenClaw agent connects to Slack through a non-Marketplace app (which is most of them), the conversations.history and conversations.replies API methods are now limited to one request per minute, returning a maximum of 15 messages each time.
That's not a typo. One request per minute. Fifteen messages.
If your agent reads channel history for context, searches threads for decisions, or retrieves recent messages to understand what it's being asked about, it just got a lot worse at its job. And the failure mode is silent. No error messages. No warnings in the logs. The agent just... knows less than it used to.
What Actually Changed
Slack announced the rate limit changes back in May 2025, but the initial rollout only affected newly created apps and new installations. Most teams running OpenClaw didn't notice because their existing installations were grandfathered in.
That grace period ended March 3, 2026. Now every non-Marketplace app gets the new limits. The two methods that matter most for AI agents — conversations.history (read recent channel messages) and conversations.replies (read thread replies) — went from roughly 50 requests per minute with 100+ messages each, to 1 request per minute with 15 messages max.
Internal apps built by the workspace's own team aren't affected. Slack Marketplace apps aren't affected. But OpenClaw's gateway, which registers as an external app using OAuth, gets hammered. Same for most third-party agent platforms.
Why This Kills Agent Context
Here's how a typical OpenClaw agent uses these APIs:
When someone messages the agent in a channel, the agent calls conversations.history to grab the last 50-100 messages for context. It needs to know what the team's been discussing, what decisions were made, what names and projects are relevant. Without that context, it's answering in a vacuum.
When someone replies in a thread, the agent calls conversations.replies to get the full thread. This is how it maintains conversational state across multi-turn interactions.
Under the old limits, this was instant. Call the API, get 100 messages back, process them, respond. Maybe 200ms of added latency.
Under the new limits, the agent gets 15 messages per request. If it needs more context, it has to wait 60 seconds between requests. Need 60 messages of history? That's four requests, and with a minute between each, three extra minutes of waiting before the agent can respond.
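The arithmetic is worth making concrete. A back-of-the-envelope helper, assuming exactly one request per minute and 15 messages per page (the function name and parameters are illustrative, not part of any Slack SDK):

```python
def history_fetch_cost(messages_needed, per_page=15, interval_s=60):
    """Requests and wall-clock waiting needed to page through channel
    history under the new limit (15 messages/request, 1 request/minute)."""
    requests = -(-messages_needed // per_page)  # ceiling division
    # The first request is immediate; every further page waits out the interval.
    wait_s = (requests - 1) * interval_s
    return requests, wait_s

print(history_fetch_cost(60))  # -> (4, 180): four requests, three minutes of waiting
```

At 100 messages (the old single-response payload) that's seven requests and six minutes of dead time, which is why most frameworks give up and take the first 15.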
In practice, most agent frameworks don't wait. They grab the 15 messages they can get and work with that. Which means your agent just lost 85% of its available context, and nobody told it.
The Symptoms
If any of these sound familiar, the rate limits are probably the cause:
The agent forgets context from earlier in the day. It used to recall conversations from 50 messages back. Now it only sees the last 15. Anything older might as well not exist.
Thread responses get weird after the first few messages. The agent loses track of what was said early in a long thread because it can only retrieve 15 replies. Message 16 onwards is gone.
The agent asks questions you already answered. It literally can't see your earlier messages. It's not being dumb. It's blind.
Response latency spikes randomly. If your framework does retry on rate limit errors (HTTP 429), those retries add seconds or minutes. Users see the agent "thinking" for an unusually long time.
Context retrieval tools return partial results. If your agent has an MCP tool that searches channel history, the search results got a lot thinner.
What You Can Do
There are a few approaches, ranging from quick fixes to architectural changes.
1. Cache aggressively
The rate limit is on API calls, not on how many messages you hold in memory. If your agent reads channel history once and caches it, subsequent requests in the same channel don't need another API call.
Build a simple message cache keyed by channel ID. When conversations.history returns 15 messages, store them. On the next request, check the cache first. Periodically refresh (once per minute, since that's all you're allowed anyway), and merge new messages into the existing cache.
This doesn't help with the first request, but it helps with everything after. The agent starts with 15 messages of context and builds up over time rather than starting fresh every interaction.
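A minimal sketch of that cache, assuming a `fetch_page` callable standing in for the real `conversations.history` call (the class and its interface are hypothetical, not an OpenClaw or Slack SDK API):

```python
import time

class ChannelHistoryCache:
    """Merges 15-message pages from conversations.history into a
    per-channel store, refreshing at most once per refresh interval."""

    def __init__(self, fetch_page, refresh_interval=60.0):
        self.fetch_page = fetch_page        # callable: channel_id -> list of message dicts
        self.refresh_interval = refresh_interval
        self._messages = {}                 # channel_id -> {ts: message}
        self._last_fetch = {}               # channel_id -> monotonic timestamp

    def get(self, channel_id):
        now = time.monotonic()
        if now - self._last_fetch.get(channel_id, float("-inf")) >= self.refresh_interval:
            store = self._messages.setdefault(channel_id, {})
            for msg in self.fetch_page(channel_id):
                store[msg["ts"]] = msg      # Slack's ts is unique within a channel
            self._last_fetch[channel_id] = now
        # Newest first, matching what conversations.history returns.
        return sorted(self._messages.get(channel_id, {}).values(),
                      key=lambda m: float(m["ts"]), reverse=True)
```

Keying the merge on Slack's `ts` field means repeated fetches of overlapping pages are harmless: duplicates overwrite themselves and the store only grows.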
2. Pre-fetch during idle time
Your agent doesn't need to wait until someone asks it a question to read channel history. Run a background job that calls conversations.history once per minute for your most active channels. By the time someone actually messages the agent, the cache already has 30-60 minutes of history stacked up.
At one request per minute, 60 minutes of pre-fetching gives you 900 messages (15 per request * 60 requests). That's more context than most agents had before the rate limit change. The trick is you have to start gathering it before you need it.
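One way to run that background job is a simple daemon thread that rotates through your busiest channels, one API call per interval. A sketch, where `fetch_page` and `store` are placeholders for your Slack client and message cache:

```python
import itertools
import threading

def start_prefetcher(channels, fetch_page, store, interval_s=60.0):
    """Background loop: one history fetch per interval, cycling through
    the given channels so context accumulates before anyone asks."""
    import time

    def loop():
        for channel in itertools.cycle(channels):
            for msg in fetch_page(channel):
                # Merge by ts so repeated pages don't duplicate messages.
                store.setdefault(channel, {})[msg["ts"]] = msg
            time.sleep(interval_s)

    t = threading.Thread(target=loop, daemon=True)
    t.start()
    return t
```

The trade-off is breadth versus depth: ten channels in rotation means each one refreshes every ten minutes, so weight the rotation toward where the agent actually gets asked questions.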
3. Use Slack's Events API instead
The rate limits apply to the Web API. They don't apply to the Events API. If your app subscribes to message events, Slack pushes every message to your webhook in real time. No polling. No rate limits on incoming events.
This is the architectural shift most teams should make. Instead of calling conversations.history to retroactively read what happened, subscribe to events and maintain your own message store. You get every message the moment it's sent, you store as much history as you want, and you never hit the rate limit at all.
The downside: you need persistent storage. A database, even SQLite, to hold the message stream. And you only have messages from the point you started subscribing forward; you can't backfill older history quickly.
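An SQLite-backed store for that event stream can be very small. A sketch, assuming you wire `handle_event` to your Events API webhook handler or a Bolt `message` listener (the class itself is illustrative, not a library API):

```python
import sqlite3

class EventMessageStore:
    """Persists Slack `message` events as they arrive, so the agent reads
    history locally instead of polling conversations.history."""

    def __init__(self, path="slack_messages.db"):
        self.db = sqlite3.connect(path)
        self.db.execute("""CREATE TABLE IF NOT EXISTS messages (
            channel TEXT, ts TEXT, user TEXT, text TEXT,
            PRIMARY KEY (channel, ts))""")

    def handle_event(self, event):
        # (channel, ts) is the natural primary key, so webhook
        # redeliveries become harmless upserts.
        self.db.execute(
            "INSERT OR REPLACE INTO messages VALUES (?, ?, ?, ?)",
            (event["channel"], event["ts"],
             event.get("user", ""), event.get("text", "")))
        self.db.commit()

    def recent(self, channel, limit=100):
        cur = self.db.execute(
            "SELECT ts, user, text FROM messages WHERE channel = ? "
            "ORDER BY CAST(ts AS REAL) DESC LIMIT ?", (channel, limit))
        return cur.fetchall()
```

Once this is in place, `recent()` replaces the `conversations.history` call entirely: any limit you pass is yours, not Slack's.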
4. Get on the Marketplace
Marketplace apps aren't subject to the new rate limits. If you're running a commercially distributed app (which is how most OpenClaw gateways register), going through Slack's Marketplace review process removes the throttle.
The review process isn't trivial. Slack reviews your app for security, data handling, user experience, and policy compliance. But if your agent is a core part of your workflow, it's worth pursuing. The alternative is building increasingly elaborate workarounds for an API that's going to keep getting more restrictive.
5. Summarise instead of retrieving
If you can't store all the messages, store summaries. After the agent processes a conversation, generate a 200-token summary of what was discussed and decided. Store that. When the agent needs historical context, it reads summaries rather than raw messages.
This is what I covered in a previous article about agent memory. The rate limit change makes it not just a nice-to-have but a necessity. Fifteen messages of history isn't enough for most agent use cases. But fifteen messages plus a stack of conversation summaries is plenty.
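Assembling that context at request time is mostly string-building. A sketch, where `summaries` holds the stored per-conversation digests (however you generate them; the function and its shape are illustrative):

```python
def build_context(recent_messages, summaries, max_summaries=5):
    """Combine the 15 raw messages you can still fetch with stored
    conversation summaries, newest summaries last."""
    parts = ["Earlier context (summarised):"]
    parts += [f"- {s}" for s in summaries[-max_summaries:]]
    parts.append("Most recent messages:")
    parts += [f"{m['user']}: {m['text']}" for m in recent_messages]
    return "\n".join(parts)
```

At roughly 200 tokens per summary, five summaries cost about a thousand tokens of prompt for weeks of institutional memory, which is a far better deal than any number of raw messages.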
Why Slack Did This
The official rationale: preventing data exfiltration by unvetted applications. And honestly, it's not wrong. conversations.history with no rate limit is a data scraper's dream. Connect an OAuth app, vacuum up every message in every channel, train a model on it. Slack had a legitimate reason to restrict this.
The cynical reading: Slack wants AI features to go through their platform. Their own Slack AI product and the Agents API aren't subject to these limits. Third-party agents that compete with Slack's built-in AI get throttled. You can read it either way.
The practical reality doesn't change regardless of motive. The limits exist, they're getting stricter, and building your agent architecture around unlimited API access to Slack message history is a bet against the platform.
What SlackClaw Does About This
SlackClaw handles the rate limit problem at the platform level. The managed service uses the Events API approach by default — messages stream in real time and get stored in a context layer that the agent draws from. No polling, no rate limits, no 15-message ceiling.
For teams that hit the rate limit wall and don't want to re-architect their setup, the platform also includes the caching and summarisation patterns I described above, pre-configured. You don't build the message cache or the summary pipeline yourself.
The platform is Marketplace-registered too, which helps with rate limit headroom for the API methods it does need to call (reactions, file uploads, user profiles, etc.).
The Bigger Picture
This rate limit change is a signal. Slack is moving toward a model where AI agents either go through official channels (the Marketplace, the Agents API) or accept significant constraints. The free-for-all era of "connect your bot and do whatever you want" is ending.
If you're building anything serious on Slack, plan for a world where API access gets more restrictive over time, not less. Design your agent to work with minimal API calls and maximum local state. Cache everything. Summarise everything. Subscribe to events instead of polling history.
The agents that break aren't the ones that hit the limit. They're the ones that don't know it exists.
Helen Mireille is chief of staff at an early-stage tech startup. She writes about what breaks when AI agents meet real platform constraints.