Rahsi™ SharePoint Online Throttling
Internals, Limits and Survival Patterns (Copilot Era Edition)
If your SharePoint automation “randomly” dies, it’s not random.
It’s throttling doing exactly what it was designed to do.
In the Copilot era, this is no longer a developer inconvenience — it is a tenant reliability boundary.
Every Microsoft Graph crawl, SharePoint REST burst, Power Automate loop, Logic App, Azure Function, migration job, and agentic workload competes for the same service protection envelope. When Microsoft 365 is treated like an infinite API, the platform enforces limits — decisively.
This article is not an anti-Microsoft critique.
It is respect for a hyperscale system — and a blueprint for building automation that survives real-world load.
Why SharePoint Online throttling exists
Throttling protects four things simultaneously:
- Shared service health across tenants
- Fair resource usage inside a tenant
- Backend stability (indexing, permissions, search, content services)
- Burst suppression that prevents cascading outages
Throttling is the platform saying:
You can do this — but not at this shape, not at this rate, not with this concurrency.
Change the shape of your workload and throughput usually improves without lifting limits.
What actually triggers throttling
Throttling is not just about volume.
It is about who, what, and how.
1. User-driven bursts
- Large UI downloads or sync storms
- Broad searches or list enumerations
- Parallel browser requests across libraries
2. App-driven bursts
- Crawlers and daemons
- Migration tools
- Service principals running wide enumeration loops
3. Tenant-level contention
- Many “small” automations running together
- Background indexing and compliance services
- Copilot-era always-on workloads
4. Workload shape (the real trigger)
The fastest way to hit throttling:
- High concurrency
- Repeated calls to the same site or library
- Expensive queries and expansions
- Inefficient pagination
- Aggressive retry loops
Microsoft Graph vs SharePoint REST (the incomplete myth)
You will often hear: “Use Graph, it’s safer.”
Graph is the strategic API — but it does not remove throttling.
What changes:
- Different gateways and limits
- Different cost per request
- Different response headers and error shapes
- Different aggregation behavior
The truth:
Graph vs REST is not safe vs unsafe.
It is different throttle envelopes with different cost models.
Bad workload shape throttles everywhere.
The most important rule: obey Retry-After
Treat throttling like a traffic signal.
Correct behavior
- If Retry-After exists, wait exactly that duration
- If not, apply exponential backoff with jitter
- Reduce concurrency after throttling
- Cap retry attempts
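The rules above can be sketched in Python roughly as follows. This is a minimal illustration, not a definitive client: `retry_delay`, `call_with_retry`, and the `send` callable (which returns a status code, a headers mapping, and a body) are hypothetical names, the base delay and cap are arbitrary, and the code assumes throttled responses arrive as HTTP 429 or 503 with `Retry-After` given in seconds.

```python
import random
import time
from typing import Callable, Mapping, Tuple

THROTTLE_STATUSES = {429, 503}  # assumed throttling status codes


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next try; `attempt` is 0-based."""
    # Plain dict lookup: real HTTP libraries usually expose case-insensitive headers.
    value = headers.get("Retry-After")
    if value is not None:
        try:
            return max(0.0, float(value))  # the server-provided delay wins
        except ValueError:
            pass  # non-numeric value (for example an HTTP date): fall back below
    # Exponential backoff with full jitter: a random point in [0, min(cap, base * 2^attempt)].
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retry(send: Callable[[], Tuple[int, Mapping[str, str], object]],
                    max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep):
    """Call `send` until it is no longer throttled or the attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status not in THROTTLE_STATUSES:
            return status, body
        if attempt < max_attempts - 1:  # no point sleeping after the final attempt
            sleep(retry_delay(headers, attempt))
    raise RuntimeError(f"still throttled after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. Reducing concurrency after a throttled response is deliberately left out of this sketch.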
What breaks tenants
- Immediate retries
- Fixed delays for all errors
- Parallel retries across workers
- Re-reading the same large page repeatedly
That turns throttling into an outage.
Survival patterns that actually work
1. Global concurrency control
Do not let each worker decide concurrency.
Use a single global limit per workload class.
Reliability beats raw speed.
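One way to picture a single global limit is a shared gate that every worker must pass through. The sketch below is an assumption-laden illustration for threads in one process: `GlobalLimit` and its methods are hypothetical names, and halving the cap on throttling is just one possible policy. Workers spread across processes or machines would need an external coordinator instead.

```python
import threading


class GlobalLimit:
    """One shared cap on in-flight requests for a whole workload class."""

    def __init__(self, max_in_flight: int, floor: int = 1):
        self._limit = max_in_flight
        self._floor = floor
        self._in_flight = 0
        self._cond = threading.Condition()

    def __enter__(self):
        with self._cond:
            # Block until a slot is free under the current (possibly reduced) cap.
            while self._in_flight >= self._limit:
                self._cond.wait()
            self._in_flight += 1
        return self

    def __exit__(self, *exc):
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()
        return False  # never swallow exceptions from the guarded call

    def on_throttled(self) -> None:
        """Halve the cap (never below the floor) after a throttled response."""
        with self._cond:
            self._limit = max(self._floor, self._limit // 2)

    @property
    def limit(self) -> int:
        with self._cond:
            return self._limit
```

A module-level instance such as `MIGRATION_LIMIT = GlobalLimit(8)` would then wrap each request in `with MIGRATION_LIMIT:`, so no individual worker picks its own concurrency.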
2. Queue + token bucket
The most throttle-resistant pattern:
- Push work to a queue
- Pull with a token bucket rate limiter
- Reduce tokens dynamically when throttled
This pattern survives busy tenants.
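The three steps above might look like this in a single-threaded Python sketch. `TokenBucket`, `drain`, and the `handle` callback (which returns True when the call was throttled) are hypothetical names, and the halving factor and minimum rate are arbitrary choices, not recommended values.

```python
import queue
import time


class TokenBucket:
    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self._tokens = capacity
        self._clock = clock
        self._last = clock()

    def _refill(self) -> None:
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now

    def try_acquire(self) -> bool:
        """Take one token if available; never blocks."""
        self._refill()
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False

    def on_throttled(self, factor: float = 0.5, min_rate: float = 0.1) -> None:
        """Cut the refill rate and discard any saved-up burst."""
        self._refill()
        self.rate = max(min_rate, self.rate * factor)
        self._tokens = 0.0


def drain(work: queue.Queue, bucket: TokenBucket, handle, sleep=time.sleep) -> None:
    """Pull items from the queue no faster than the bucket allows."""
    while True:
        try:
            item = work.get_nowait()
        except queue.Empty:
            return
        while not bucket.try_acquire():
            sleep(1.0 / bucket.rate)  # roughly the time needed for one new token
        if handle(item):              # True means the call was throttled
            bucket.on_throttled()
            work.put(item)            # requeue; real code should also cap attempts per item
```

The injected `clock` and `sleep` exist only so the behavior can be exercised without real delays.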
3. Idempotency is non-negotiable
If a call can be retried, it must be safe to retry.
- Deterministic IDs
- External state tracking
- Clear “done” markers
- No partial writes without boundaries
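As a rough illustration of deterministic IDs and "done" markers, the sketch below derives an operation ID from its inputs and records completion only after success. `operation_id` and `run_once` are hypothetical helpers, and the in-memory set stands in for whatever durable external store a real job would use.

```python
import hashlib
from typing import Callable, MutableSet


def operation_id(site: str, item_path: str, action: str) -> str:
    """The same inputs always give the same ID, so a retry maps to the same operation."""
    key = "\n".join((site, item_path, action)).encode("utf-8")
    return hashlib.sha256(key).hexdigest()


def run_once(op_id: str, done: MutableSet[str], action: Callable[[], None]) -> bool:
    """Run `action` unless `op_id` is already marked done. Returns True if it ran."""
    if op_id in done:
        return False   # already completed by an earlier attempt
    action()           # if this raises, no marker is written and a retry runs it again
    done.add(op_id)    # the "done" marker is written only after success
    return True
```

Because a crash between `action()` and `done.add()` would cause one more run, the action itself still has to be safe to repeat; the marker only removes the common duplicate.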
4. Intelligent batching
Batching helps only when:
- Limits are respected
- Failures do not re-run entire batches
- Payload size is controlled
Fewer calls matter more than bigger calls.
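A sketch of batching that respects a size limit and resubmits only the failed items could look like this. `chunks`, `run_batches`, `send_batch` (which returns a status code per item), and the `pause` hook are hypothetical; the limit of 20 reflects Microsoft Graph JSON batching, and other APIs will have different limits.

```python
from typing import Callable, Dict, Iterator, List, Sequence

MAX_BATCH = 20  # Microsoft Graph JSON batching accepts at most 20 requests per batch


def chunks(items: Sequence[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Split items into consecutive batches of at most `size`."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def run_batches(items: Sequence[str],
                send_batch: Callable[[List[str]], Dict[str, int]],
                max_rounds: int = 3,
                pause: Callable[[], None] = lambda: None) -> List[str]:
    """Send items in batches; each later round resubmits only the items that failed.

    Returns the items that were still failing after the last round.
    """
    pending = list(items)
    for round_no in range(max_rounds):
        if not pending:
            break
        if round_no > 0:
            pause()  # hook for the Retry-After or backoff wait between rounds
        failed: List[str] = []
        for batch in chunks(pending):
            statuses = send_batch(batch)
            # An item missing from the response is treated as failed.
            failed.extend(i for i in batch if statuses.get(i, 500) >= 400)
        pending = failed
    return pending
```

The point of the design is that one throttled item costs one extra small request, not a replay of the 19 that already succeeded.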
5. Partition awareness
Avoid hammering the same site or library.
- Rotate targets
- Schedule per site
- Spread load across time
Hot partitions throttle first.
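Rotating targets can be as simple as reordering the work list so consecutive calls go to different sites. The sketch below is one hypothetical way to do that; `interleave_by_site` and the `(site, item)` task shape are assumptions made for illustration, and it covers ordering only, not per-site scheduling over time.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Iterator, Tuple


def interleave_by_site(tasks: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Yield (site, item) tasks round-robin across sites instead of site by site."""
    per_site: Dict[str, Deque[str]] = defaultdict(deque)
    for site, item in tasks:
        per_site[site].append(item)
    sites = deque(per_site)  # sites in first-seen order
    while sites:
        site = sites.popleft()
        yield site, per_site[site].popleft()
        if per_site[site]:
            sites.append(site)  # the site still has work: back of the line


tasks = [("A", "1"), ("A", "2"), ("A", "3"), ("B", "1"), ("C", "1"), ("C", "2")]
print(list(interleave_by_site(tasks)))
# [('A', '1'), ('B', '1'), ('C', '1'), ('A', '2'), ('C', '2'), ('A', '3')]
```

Per-item order within each site is preserved, which matters when later items depend on earlier ones.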
Why “run it off-peak” no longer works
Off-peak execution is unreliable because:
- Background services never stop
- Copilot increases baseline usage
- Global tenants have no true quiet window
- Indexing, sync, and compliance workloads persist
Off-peak reduces friction.
It does not replace architecture.
Throttling is not Microsoft blocking you
It is Microsoft 365 protecting your tenant — and teaching your automation how to behave at scale.
If you build anything serious on SharePoint Online or Microsoft Graph, mastering throttling will save you months.
Read the full article:
https://www.aakashrahsi.online/post/rahsi-sharepoint-online-throttling
A minimal throttle-handling template
    if throttled:
        wait = Retry-After if present
               else exponential_backoff + jitter
        reduce concurrency
        retry until max_attempts
    else:
        proceed