Last week we noticed something alarming in our API usage logs. Nydar was making nearly 15,000 API calls per day to our primary market data provider — against a daily budget of 2,000. We were overshooting by 7.5x. Not by a little. Not by double. By seven and a half times our allocation.
The data was still flowing because the provider throttles gradually rather than hard-blocking, but we were living on borrowed time. One policy change on their end and our entire stock data pipeline goes dark. No quotes, no heatmaps, no analyst ratings, no institutional holdings. Everything that makes Nydar useful for stock traders — gone.
We had to fix it. And we had to fix it without users noticing anything had changed.
This is the story of how we diagnosed the problem, fixed it in four phases over 48 hours, and ended up with a genuinely better product than we started with.
## Why not just buy more quota?
Before we get into the technical work, let's address the obvious question. Our data provider offers paid tiers with higher limits. Why not just upgrade?
Two reasons. First, the economics don't add up at our stage. We're a growing platform, not a hedge fund. Moving from the free tier to a plan that would cover 15,000 calls/day would cost more per month than our entire server infrastructure. That's not a good trade when most of those calls are wasted.
Second — and more importantly — buying more quota doesn't fix the underlying problem. If our architecture is making 7.5x more calls than necessary, throwing money at the rate limit just means we're paying 7.5x more than we should. We'd rather fix the root cause and keep that budget for when we actually need it — when the user base grows and the real demand exceeds the free tier.
Optimisation first. Scaling second.
## How Nydar's data pipeline works
To understand where the waste was happening, you need to understand how data flows through the system.
Nydar supports three asset classes: stocks, crypto, and forex. Each has different data providers, different update frequencies, and different trading hours. The architecture looks roughly like this:
- External APIs — providers like Finnhub (stocks, forex), Binance (crypto), TwelveData (supplementary quotes), and others
- Backend data sources — Python classes that wrap each provider, handle authentication, and implement caching
- In-memory cache layer — a simple dictionary with TTL-based expiration. No Redis, no external cache. Just a dict with timestamps
- REST endpoints — FastAPI routes that widgets call to get data
- WebSocket layer — a persistent connection that pushes real-time updates to the frontend at configurable intervals
- Frontend widgets — 40+ React components that render the data
Every API call happens at layer 2. The cache at layer 3 is supposed to prevent redundant calls. The WebSocket at layer 5 determines polling frequency. The waste was happening because layers 2, 3, and 5 weren't coordinating properly.
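That layer-3 cache really is just a dict with timestamps. A minimal sketch of the idea (class and method names are illustrative, not Nydar's actual code):

```python
import time

class TTLCache:
    """Plain dict mapping key -> (value, stored_at). No Redis, no external cache."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None  # never cached
        value, stored_at = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic())
```

A data source checks `get()` before making a network call and writes back with `set()` afterwards, so the TTL decides how often the provider actually gets hit.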
## Discovering the problem
It started at 11 PM on a Tuesday. A routine check of our data provider's dashboard showed we'd used 14,898 calls against our 2,000-call daily allocation. That's not a typo. Fourteen thousand eight hundred and ninety-eight.
The natural reaction is to assume something is broken — a runaway loop, a misconfigured retry, maybe a bot hammering the API. But our logs didn't show anything obviously wrong. The system was behaving exactly as designed. The problem wasn't a bug. It was architecture.
Nydar aggregates real-time data from multiple providers — live quotes, order books, analyst ratings, institutional filings, options chains, earnings calendars, and more. Each of these features makes API calls. Individually, each one is reasonable. A quote here, a filing there. Collectively, they were drowning us.
## The audit that changed everything
The first step was figuring out where all these calls were actually going. We had basic daily counters, but that's like knowing your electricity bill is high without knowing which appliance is the problem. We needed per-endpoint, per-symbol granularity.
We rebuilt our API usage tracker with a detailed breakdown structure: which API, which date, which endpoint, which symbol, how many calls. We let it run for a full 24-hour cycle and looked at the results the next morning.
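The breakdown structure can be as simple as nested counters keyed by API, date, endpoint, and symbol. A hedged sketch (names are illustrative; the real tracker also records cache hits and persists to disk):

```python
from collections import defaultdict

def _nested_counter():
    # api -> date -> endpoint -> symbol -> call count
    return defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(int))))

class UsageTracker:
    def __init__(self):
        self._counts = _nested_counter()

    def record(self, api: str, date: str, endpoint: str, symbol: str) -> None:
        self._counts[api][date][endpoint][symbol] += 1

    def daily_total(self, api: str, date: str) -> int:
        return sum(
            count
            for per_symbol in self._counts[api][date].values()
            for count in per_symbol.values()
        )
```

One `record()` call at the single place that makes HTTP requests is enough to get the per-endpoint, per-symbol resolution the audit needed.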
They were eye-opening:
| Source | Calls/hour | The problem |
|---|---|---|
| Forex rate lookups | ~991 | Each of 15 currency pairs hit the bulk rates endpoint individually — but that endpoint returns every rate in one response |
| Heatmap quotes | ~3,843 | 25 individual stock quotes per refresh, fired sequentially, with only a 60-second cache |
| WebSocket polling | ~480 | Polling every 30 seconds for stock data even when the US market was closed |
| Public snapshot | ~240 | Fetching 4 stock quotes every 120 seconds with no market-hours check |
| TwelveData duplication | uncounted | A get_quote() method that bypassed the cache entirely and duplicated what get_ticker() already does |
Four categories of waste: redundant calls, unnecessary calls, excessive calls, and duplicate calls.
### The forex revelation
The forex issue deserves its own section because it was the single most absurd finding.
Our data provider has a /forex/rates endpoint. You pass it a base currency — say EUR — and it returns exchange rates for every quote currency in a single response. EUR/USD, EUR/GBP, EUR/JPY, EUR/AUD, all of them. One call, all the data.
We were calling it once per pair.
So fetching EUR/USD, EUR/GBP, and EUR/JPY meant three separate API calls to the same endpoint, each returning the exact same blob of data, and we'd extract one rate from each response and throw away the rest. Multiply that by 15 active currency pairs refreshing every 60 seconds, and you get nearly a thousand wasted calls per hour.
The fix was embarrassingly simple: call /forex/rates?base=EUR once, cache the full response for 60 seconds, and have all EUR-based pair lookups read from that cache. One call instead of fifteen. A 93% reduction in forex API usage from a single architectural insight.
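In code, the fix is a bulk-fetch-and-cache pattern. A simplified sketch, with `fetch_rates` standing in for the real `/forex/rates` call and names chosen for illustration:

```python
import time

_rates_cache = {}  # base currency -> (rates dict, fetched_at)
RATES_TTL = 60     # seconds, matching the refresh interval

def fetch_rates(base: str) -> dict:
    """Placeholder for the real /forex/rates?base=... request (one API call)."""
    raise NotImplementedError

def get_rate(base: str, quote: str, fetch=fetch_rates) -> float:
    """Serve every pair sharing a base currency from one cached bulk call."""
    entry = _rates_cache.get(base)
    if entry is None or time.monotonic() - entry[1] > RATES_TTL:
        _rates_cache[base] = (fetch(base), time.monotonic())  # the single bulk call
    return _rates_cache[base][0][quote]
```

Every EUR-based lookup within the TTL window reads from the same cached response, which is exactly the one-call-instead-of-fifteen behaviour described above.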
This is why auditing matters. We would never have guessed this was the biggest offender. The forex widget looked simple and lightweight from the outside. But underneath, it was our most expensive feature per data point.
## Phase 1: Stop the bleeding
With the audit data in hand, we tackled the four biggest offenders in order of impact.
Forex bulk caching was the first and easiest win. We already described it above — one call per base currency instead of one per pair. Forex API usage dropped from ~900 calls/hour to ~60 calls/hour. We could have stopped here and still cut our total daily calls by nearly a third.
Market hours gating was the second-biggest win and required more thought. We built an is_stock_market_open() utility that determines whether the US stock market is in regular trading hours — 9:30 AM to 4:00 PM Eastern Time, Monday through Friday, accounting for DST. When the market is closed, stock-related endpoints return stale cached data with a "market_closed": true flag. No API call at all.
This eliminated roughly 4,000 calls per day that were happening between market close and market open. Think about that: for 17.5 hours out of every 24, we were making API calls for data that literally cannot change (regular-hours stock prices don't move when the exchange is closed). Pure waste.
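A sketch of the gating utility using Python's zoneinfo, which resolves Eastern Time against the IANA database so DST is handled automatically (simplified; the actual implementation may differ):

```python
from datetime import datetime, time as dtime
from typing import Optional
from zoneinfo import ZoneInfo

NY = ZoneInfo("America/New_York")

def is_stock_market_open(now: Optional[datetime] = None) -> bool:
    """Regular US session only: 9:30-16:00 ET, Monday-Friday."""
    now_et = (now or datetime.now(tz=NY)).astimezone(NY)
    if now_et.weekday() >= 5:  # Saturday = 5, Sunday = 6
        return False
    return dtime(9, 30) <= now_et.time() < dtime(16, 0)
```

Endpoints consult this before spending an API call; when it returns False, they serve the cached value with the market_closed flag instead.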
But this one almost killed us.
### The midnight UTC war story
Our first implementation used is_stock_market_active() instead of is_stock_market_open(). The "active" version includes pre-market (4 AM–9:30 AM ET) and after-hours (4 PM–8 PM ET) sessions. It seemed like the responsible choice — broader coverage means fresher data, right?
The problem is timezone arithmetic, and it's the kind of bug that only manifests at specific times of day.
Our daily API counter resets at midnight UTC. Midnight UTC is 7 PM Eastern Time in winter (8 PM during daylight saving), right in the middle of the after-hours session. So here's what happened every single night:
- 11:59 PM UTC: Counter at 1,987 of 2,000. System is throttled. Everything is fine.
- 12:00 AM UTC (7 PM ET): Counter resets to zero. is_stock_market_active() returns true because after-hours runs until 8 PM ET.
- 12:00:01 AM UTC: The system thinks the market is active and it has a fresh budget. Full-speed polling begins.
We had a widget called AssetBands that shows support and resistance levels for a watchlist of stocks. It polls 50 symbols every 30 seconds via the bulk tickers endpoint. At full speed, with a fresh daily counter, it burned through 2,000 API calls in under 30 minutes. By 12:30 AM UTC — 7:30 PM Eastern, still in after-hours — our entire daily quota was gone.
The first morning this happened, we thought our Phase 1 fixes hadn't worked. The usage dashboard showed 2,000+ calls, same as before. It took an hour of staring at the per-hour breakdown to spot it: a massive spike at exactly midnight UTC, then nothing. The detailed usage tracking we'd built for the audit — and would later formalise in Phase 3 — paid for itself on day one.
The fix was a one-word change in one file: is_stock_market_active() to is_stock_market_open(). Regular hours only. Pre-market and after-hours data is genuinely useful, but not at the cost of your entire daily budget. If we ever need extended-hours data, we'll budget for it as a separate, explicit cost — not as a side effect of a boolean function that happens to return true at the wrong time.
Heatmap optimisation addressed both speed and volume. The heatmap widget fetches quotes for 25 stocks to build a sector performance grid. Two changes: we switched from sequential fetches to asyncio.gather() for parallel execution, and bumped the cache TTL from 60 seconds to 180 seconds.
The parallel fetch doesn't reduce API call count — it's still 25 calls — but it cuts the response time from around 5 seconds to under 1 second. Users were seeing a blank heatmap for five seconds on every refresh. Now it snaps in. The cache bump from 60s to 180s reduces call volume by two-thirds. For a heatmap showing sector performance, three-minute-old data is perfectly fine.
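The sequential-to-parallel change is a one-line restructuring with asyncio.gather. A sketch with a stand-in fetch_quote (the simulated 10ms latency and the function names are illustrative):

```python
import asyncio

async def fetch_quote(symbol: str) -> dict:
    """Placeholder for one provider quote call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"symbol": symbol, "price": 100.0}

async def fetch_heatmap(symbols: list) -> list:
    # Sequential: total latency ~ n * per-call latency.
    # gather: all 25 requests in flight at once, latency ~ slowest single call.
    return await asyncio.gather(*(fetch_quote(s) for s in symbols))
```

The call count is unchanged, which is why the TTL bump was still needed; gather only fixes the latency half of the problem.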
TwelveData deduplication was the last Phase 1 fix. Our TwelveData integration had two methods that hit the same underlying API: get_ticker() (cached) and get_quote() (not cached). The quote method existed for historical reasons and was bypassing the cache on every call. We routed it through the cached get_ticker() path. Usage dropped by roughly 50%.
WebSocket back-off. This one isn't in the original audit table because it overlaps with market-hours gating, but it's worth calling out. Our WebSocket layer pushes real-time updates to connected clients. During market hours, it polls every 30 seconds. But we had no concept of "the market is closed, slow down." After adding market-hours awareness, the WebSocket backs off to a 5-minute interval outside regular hours. Still polling — in case of corporate events or overnight moves — but at 1/10th the frequency.
We also added an extended OHLCV cache for candlestick data. During market hours, OHLCV data has a standard cache TTL. When the market is closed, we bump it to 4 hours. Candlestick data from 3 PM isn't going to change at 11 PM. A 4-hour cache during off-hours means the data is there when someone opens the app late at night, but we're not refreshing it every few minutes for no reason.
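Both back-off rules reduce to picking an interval or TTL from market state. The values below are the ones from this post; the function names and the 60-second standard TTL are illustrative:

```python
POLL_OPEN = 30           # seconds between WebSocket polls during regular hours
POLL_CLOSED = 300        # back off to 5 minutes off-hours: 1/10th the frequency
OHLCV_TTL_OPEN = 60      # standard candle cache TTL during the session (assumed value)
OHLCV_TTL_CLOSED = 4 * 3600  # 4 hours when the market is closed

def poll_interval(market_open: bool) -> int:
    """Keep polling off-hours (corporate events, overnight moves), just rarely."""
    return POLL_OPEN if market_open else POLL_CLOSED

def ohlcv_ttl(market_open: bool) -> int:
    """Candles from 3 PM can't change at 11 PM, so cache them much longer."""
    return OHLCV_TTL_OPEN if market_open else OHLCV_TTL_CLOSED
```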
Combined, these changes reduced our total API calls by an estimated 60–70%. But we weren't done. Reducing calls is only half the problem.
## Phase 2: What happens when you hit the wall anyway
Even with a 70% reduction, there will be days when traffic spikes, when a user has 40 widgets open, when something unexpected happens. You will hit your quota ceiling eventually. The question is: what does the user see when you do?
Before our work, the answer was "broken widgets." API calls would fail with HTTP errors, the frontend would show generic error states or infinite spinners, and the user would assume the platform was down. Terrible.
We built a three-layer quota exhaustion system that transforms this failure mode into something that actually makes sense.
Layer 1: The backend quota guard. Before every single API call to a rate-limited provider, we check a daily counter. If the quota is exhausted, we raise a QuotaExhaustedError immediately — no network call, no wasted time, no ambiguity. This check happens in a single method called _get_params() that all 17+ endpoints flow through. One chokepoint, complete coverage.
The key insight here is that quota exhaustion isn't an error. It's a state. The system should handle it as gracefully as it handles "market closed" or "no data available." Raising a typed exception (rather than returning None or an empty response) means every layer of the stack can handle it explicitly.
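A minimal sketch of the Layer 1 guard. Names are illustrative, and the real _get_params also builds provider-specific auth parameters:

```python
class QuotaExhaustedError(Exception):
    """Daily API budget is spent. A state, not a failure."""

DAILY_LIMIT = 2000

class DataSource:
    def __init__(self, limit: int = DAILY_LIMIT):
        self.limit = limit
        self.calls_today = 0

    def _get_params(self, **params) -> dict:
        # The single chokepoint every endpoint flows through.
        if self.calls_today >= self.limit:
            raise QuotaExhaustedError("daily quota exhausted")
        self.calls_today += 1
        return {**params, "token": "..."}  # auth details elided
```

Because the check lives before the request is built, an exhausted quota costs zero network calls and every caller sees the same typed signal.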
Layer 2: The global exception handler. A FastAPI exception handler catches QuotaExhaustedError from any route — not just the ones we've thought of — and returns a structured 429 response: {"code": "QUOTA_EXHAUSTED", "message": "..."}. This means we never need to remember to wrap a new endpoint in quota-handling logic. The safety net is global.
We also had to deal with a provider-specific quirk: our data provider returns HTTP 403 (Forbidden) when you're rate-limited, not the standard 429 (Too Many Requests). Our HTTP client's response.raise_for_status() turns a 403 into a generic HTTPStatusError, which our existing error handling was logging as a server error. We built a _check_response() static method that intercepts 403 and 429 responses before anything else touches them and converts them to QuotaExhaustedError. Without this, quota events would masquerade as server errors throughout the codebase.
Layer 3: Frontend intelligence. An Axios interceptor catches 429 responses and rewrites the error message to "Daily data limit reached — resets at midnight UTC." Clean, human-readable, no jargon.
But we went further. Each widget now has a smart error state component called WidgetErrorState that checks context before deciding what to show. If the stock market is closed and there's a quota error, the user sees "Market Closed" with the last known values — not "quota exhausted." Because from the user's perspective, there's nothing wrong. The market is closed. That's why the numbers aren't moving. Showing them a quota error would be technically accurate but experientially misleading.
This distinction — between what's technically true and what's useful for the user — turned out to be one of the most important decisions in the entire project. Ten widgets now use this smart error dispatcher: Chart, VolumeProfile, MarketBreadth, IPO, ShortSqueeze, OrderFlow, OrderBook, AggregatedBook, LiquidationMap, and VolumeDelta.
The WebSocket channel. Most of Nydar's real-time data flows over a WebSocket connection, not REST. So the quota signal needs to travel that path too. When the backend detects quota exhaustion, it pushes a quota_exhausted message type over the WebSocket. The frontend listens for this event and fires an internal ws:quota-exhausted event that any component can subscribe to.
This is important because WebSocket-driven widgets wouldn't see the REST 429 responses. Without the WebSocket channel, a widget that gets its data purely from the push connection would keep showing stale data with no indication of why it stopped updating. The user would see prices frozen at their last value with no error, no market-closed badge, nothing. Just silence. That's worse than an error message — it's a lie.
### What the user actually sees before and after
Let's make this concrete. Before the optimisation work, here's what happened when a user opened Nydar at 10 PM on a Tuesday:
- Chart widget: Showed a loading spinner for 5 seconds, then "Error loading data"
- Order book: Infinite spinner
- Heatmap: Loaded after 5 seconds but with random gaps where individual stock quotes failed
- Market breadth: "Something went wrong"
- Volume profile: Empty canvas
After the work:
- Chart widget: Shows the day's chart with a subtle "Market Closed" badge and the closing price
- Order book: Shows the last known state with a "Market Closed" indicator
- Heatmap: Loads in under a second showing end-of-day sector performance
- Market breadth: Shows the day's final breadth reading with a closing timestamp
- Volume profile: Shows the day's volume profile, which is actually more useful at end of day than during the session (the full profile is complete)
Every widget shows data. Every widget explains its state. No spinners, no errors, no confusion. The platform feels alive and informed even when the market isn't.
## Phase 3: You can't optimise what you can't see
The is_stock_market_active() bug taught us that we needed much better visibility into what the system was doing in real time. Phase 3 was about building that observability layer.
Per-endpoint, per-symbol tracking. Not just "you made 2,000 calls today" but "the heatmap endpoint made 847 calls, 340 of which were for AAPL, across the quote and ticker sub-endpoints." This level of detail is what lets you spot anomalies instantly. If AAPL suddenly accounts for 80% of your API calls, you know something is wrong with the AAPL-specific code path.
Rotating log files. We set up a dedicated logs/api_usage.log with a rotating handler: 10MB per file, 5 files max. That gives us roughly a week of history without unbounded disk growth. The logs capture every API call with timestamp, endpoint, symbol, and cache status (hit or miss). When something goes wrong at 3 AM, we have the data to reconstruct what happened.
Admin dashboard endpoint. We enriched our existing /api/admin/api-usage endpoint to return top consumers, per-endpoint breakdowns, cache hit rates, and exhaustion status. This isn't a fancy dashboard — it's a JSON endpoint that we curl when we need to check things. Simple, but it's saved us multiple times already.
Throttled persistence. The usage tracker writes its stats to a JSON file, but doing that on every API call would create its own performance problem. We added a 30-second throttle: stats accumulate in memory and flush to disk periodically, with an explicit flush on application shutdown. This is one of those boring-but-essential details that separates production systems from prototypes.
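The throttle is a timestamp comparison around the write. A hedged sketch (names illustrative; the real tracker stores the nested per-endpoint counters rather than a flat dict):

```python
import json
import time
from pathlib import Path

FLUSH_INTERVAL = 30  # seconds between disk writes

class ThrottledStore:
    def __init__(self, path, interval: float = FLUSH_INTERVAL):
        self.path = Path(path)
        self.interval = interval
        self.stats = {}
        self._last_flush = time.monotonic()

    def record(self, key: str) -> None:
        # Accumulate in memory; only touch disk when the window has elapsed.
        self.stats[key] = self.stats.get(key, 0) + 1
        if time.monotonic() - self._last_flush >= self.interval:
            self.flush()

    def flush(self) -> None:
        # Also called explicitly on application shutdown so nothing is lost.
        self.path.write_text(json.dumps(self.stats))
        self._last_flush = time.monotonic()
```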
## Phase 4: The paranoia phase
The final phase was about edge cases and defensive coding. The kind of work that doesn't show up in metrics but prevents 3 AM pages.
Silent quota propagation. Here's a problem we didn't anticipate: when a QuotaExhaustedError gets raised inside a method, it can get caught by a generic except Exception block higher up the call stack. Generic exception handlers typically log the error at ERROR level. Quota exhaustion isn't an error — it's expected behaviour. But without explicit handling, every quota event generates a log line that looks like something is broken.
With 15 forex pairs refreshing every 30 seconds, that's potentially 1,800 false-alarm ERROR log entries per hour. Our log aggregator would light up like a Christmas tree for something that's completely normal.
The fix is tedious but essential: add except QuotaExhaustedError: raise before every except Exception block in every method that might transitively call an API. The quota error punches through all the generic handlers and gets caught only by the global exception handler where it belongs. We did this across all 17+ endpoint methods in both our stock and forex data sources.
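The pattern looks like this in every method that might transitively call an API (function names are illustrative):

```python
import logging
from typing import Optional

logger = logging.getLogger("nydar")

class QuotaExhaustedError(Exception):
    """Expected state, not a failure; redefined here for self-containment."""

def get_forex_pair(base: str, quote: str, fetch) -> Optional[dict]:
    try:
        return fetch(base, quote)
    except QuotaExhaustedError:
        raise  # punch through every generic handler; only the global handler catches it
    except Exception:
        # Genuine failures still get logged at ERROR level with a traceback.
        logger.exception("forex fetch failed for %s/%s", base, quote)
        return None
```

The bare re-raise before the generic block is the whole fix: quota events bypass the ERROR logging that everything else gets.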
Premium interest capture. This is the business side of quota management, and it's an example of turning a constraint into a feature.
When users encounter the quota exhaustion state during market hours (which is now rare, but happens on high-traffic days), they don't just see an error. They see a subtle call-to-action: "Want uninterrupted data? Register your interest for premium access."
Clicking it opens a modal that captures their email and what tier of data access they'd value. We deliberately didn't build a payment flow. We're not selling something that doesn't exist yet. We're gathering signal. The registrations go to a simple JSON store on the backend — nothing fancy, no email marketing platform, just a file that grows by one line when someone expresses interest. Aggregate stats are available at an admin endpoint so we can track conversion rates from quota events to signups.
The insight here is that the users who encounter quota limitations during market hours are, by definition, our most active users. They have multiple widgets open, they're checking data frequently, they care enough to be trading during peak hours. These are exactly the people we want to talk to about a premium tier. The quota limit acts as a natural filter for our highest-value users.
We didn't plan this. It emerged from asking "what should the user see when we hit the wall?" and realising that the answer isn't just "an error message" — it's an opportunity to understand what our most engaged users would pay for.
## The numbers
After all four phases rolled out over 48 hours, here's where we landed:
| Metric | Before | After | Change |
|---|---|---|---|
| Daily API calls | ~14,898 | <2,000 | ≈ -87% |
| Forex calls/hour | ~991 | ~60 | -94% |
| Off-hours waste | ~4,000/day | 0 | -100% |
| Heatmap response time | ~5 seconds | <1 second | -80% |
| Error logs from quota events | ~1,800/hour | 0 | -100% |
| User-facing errors on quota | Broken widgets | "Market Closed" or graceful CTA | qualitative |
We're now comfortably within our API budget with headroom for growth. During market hours, data freshness is identical to before — we didn't increase any cache TTLs that affect real-time trading data. The 180-second heatmap cache is the only user-visible change, and for a sector heatmap, that granularity is more than sufficient.
## The frontend side: market-aware error states
One piece of this work deserves a deeper look because it applies to any product that depends on external data: the frontend error state hierarchy.
Before this project, our widgets had a binary state: either data loaded successfully, or the widget showed a generic error. After the work, every widget can be in one of four states:
- Loading — data is being fetched
- Success — data rendered normally
- Market Closed — it's outside trading hours, showing last known values with a clear indicator
- Quota Exhausted — daily limit reached, with a premium interest CTA
The market-closed state is checked first. This is critical. If the market is closed and the quota is exhausted, the user sees "Market Closed" — because that's the real reason the data isn't updating. The quota state only shows during market hours when the user would actually expect fresh data.
We also built the market hours check into the frontend itself, and this was trickier than it sounds.
### The DST problem
Daylight Saving Time is the bane of any system that cares about US market hours. The NYSE opens at 9:30 AM Eastern, but "Eastern" means different UTC offsets depending on the time of year: UTC-5 during winter (EST), UTC-4 during summer (EDT). If you hardcode the offset, your market-hours check breaks twice a year — in March and November — and it breaks silently. Prices keep updating normally; it's just your cache gating that's wrong, burning extra API calls for two weeks until someone notices.
We solved this using the browser's built-in Intl.DateTimeFormat API with the America/New_York timezone. The browser handles DST transitions correctly because it uses the IANA timezone database, which is updated regularly. No external API calls, no timezone libraries, no hardcoded offsets.
The implementation is lightweight: format the current UTC time into America/New_York components, extract the hour and minute, and check if it falls between 9:30 and 16:00 on a weekday. This runs entirely on the client. Zero network overhead. The frontend knows what time it is in New York without asking anyone.
We use this in multiple places: the WidgetErrorState component checks it before deciding whether to show "Market Closed" vs "Quota Exhausted", and the AssetBands widget uses it to switch between 30-second active polling and 5-minute idle polling.
## What we learned
Audit before you optimise. We would have guessed wrong about the biggest offenders. The forex bulk-cache fix was the single highest-impact change, but it wasn't on anyone's radar until we saw the per-endpoint numbers. Intuition is good for generating hypotheses. Data is what tells you where to actually spend your time.
"Active" and "open" aren't the same thing. Pre-market and after-hours sessions exist, and traders do care about them. But they're not worth burning your entire daily API budget on. For cache gating and polling decisions, regular trading hours is the right boundary. If you need extended-hours data, budget for it explicitly rather than treating it as a side effect of your regular polling.
Error states are features, not afterthoughts. The quota exhaustion UX is arguably better than what we had before the optimisation work. Users now see contextual, market-aware status messages instead of generic loading spinners when data isn't updating. The system is more honest about what's happening and why. A well-designed error state builds trust. A broken widget destroys it.
Typed exceptions beat boolean flags. Using QuotaExhaustedError as a proper exception type — rather than returning None or setting is_exhausted = True on some state object — meant that every layer of the stack could handle quota events explicitly. The global exception handler catches everything we forgot. The silent propagation pattern keeps logs clean. Types are documentation that the compiler enforces.
Observability pays for itself on day one. The per-endpoint usage dashboard caught the is_stock_market_active() regression within hours of deployment. Without it, we'd have woken up to another 15,000-call day with no idea why the fix we'd just shipped didn't work. Every minute spent building observability tools saves hours of debugging later.
Cache TTLs are product decisions, not just engineering ones. How stale can heatmap data be before it misleads a trader? Is 60-second-old data meaningfully different from 180-second-old data for a sector overview? These aren't questions engineers should answer alone. They require understanding what the user is actually doing with the data and what decisions they're making based on it.
## The pattern
If you're building a trading platform — or any application that aggregates data from rate-limited APIs — the pattern is the same:
- Measure first. Per-endpoint, per-symbol, per-hour. You need the resolution to find the real problems.
- Eliminate redundancy. Bulk endpoints exist for a reason. If you're calling an API that returns 100 results to extract 1, cache the other 99.
- Respect time boundaries. If data can't change, don't ask for it. Market hours, business hours, weekends — build these boundaries into your polling logic.
- Design your error states. When you hit the limit, what does the user see? If the answer is "a broken widget," you have work to do.
- Build the dashboard before you need it. You'll need it sooner than you think.
## What's next
We're now running comfortably within our API budget with room to grow. But this work opened up questions we hadn't considered before.
Multi-provider failover. If one data provider hits its limit, could we transparently fall back to another? We already have TwelveData and AlphaVantage as secondary sources for some data types. The quota guard infrastructure makes this possible — instead of raising QuotaExhaustedError, we could try the next provider in a priority chain. We haven't built this yet, but the architecture is ready for it.
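A sketch of what that priority chain could look like, to show why the typed exception makes failover cheap. This is hypothetical, not something we've built:

```python
class QuotaExhaustedError(Exception):
    """Redefined here so the sketch is self-contained."""

def fetch_with_failover(symbol: str, providers) -> dict:
    """Try each provider fetch function in priority order, skipping spent ones."""
    last_error = None
    for fetch in providers:
        try:
            return fetch(symbol)
        except QuotaExhaustedError as exc:
            last_error = exc  # this provider is out of budget; try the next
    raise last_error or QuotaExhaustedError("all providers exhausted")
```

Because quota exhaustion is a typed, expected state rather than a generic error, the chain can distinguish "out of budget, try elsewhere" from "real failure, stop".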
Per-user quota awareness. Right now, all users share the same API budget. As we grow, we'll need to think about fair scheduling — making sure one power user with 40 widgets doesn't crowd out everyone else. The per-symbol tracking from Phase 3 gives us the data to understand usage patterns. The question is whether to implement server-side rate limiting per user, or to solve it at the data layer with smarter caching and sharing.
Holiday calendars. Our market-hours check handles weekdays and DST, but not market holidays. On Christmas Day, Martin Luther King Day, and a dozen other holidays, the US stock market is closed but our code thinks it's a normal Monday. We need to add a holiday calendar — either hardcoded for the year or fetched from an API (which, ironically, would be another API call to manage).
These are good problems to have. They're the problems of a platform that's working, growing, and thinking about the next level of sophistication. The 48 hours we spent on API optimisation didn't just fix a quota crisis — they gave us the observability, the error-state architecture, and the caching patterns to build on for everything that comes next.
Not bad for 48 hours' work.
Originally published at Nydar. Nydar is a free trading platform with AI-powered signals and analysis.