<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Uche Wendy</title>
    <description>The latest articles on DEV Community by Uche Wendy (@uche_wendy_9f87dcb3b339d0).</description>
    <link>https://dev.to/uche_wendy_9f87dcb3b339d0</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3642304%2F7c55bbcc-aadf-4402-9b84-c9a434ae1324.png</url>
      <title>DEV Community: Uche Wendy</title>
      <link>https://dev.to/uche_wendy_9f87dcb3b339d0</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/uche_wendy_9f87dcb3b339d0"/>
    <language>en</language>
    <item>
      <title>Optimizing "GitHub-as-a-Database": Solving Rate Limits with Server-Side Caching</title>
      <dc:creator>Uche Wendy</dc:creator>
      <pubDate>Wed, 03 Dec 2025 02:47:10 +0000</pubDate>
      <link>https://dev.to/uche_wendy_9f87dcb3b339d0/optimizing-github-as-a-database-solving-rate-limits-with-server-side-caching-2aa5</link>
      <guid>https://dev.to/uche_wendy_9f87dcb3b339d0/optimizing-github-as-a-database-solving-rate-limits-with-server-side-caching-2aa5</guid>
<description>&lt;h2&gt;The Context: An Open-Source Educational Tool&lt;/h2&gt;

&lt;p&gt;I currently serve as the Tech Lead for DigitalBoneBox, an open-source educational platform designed to render high-fidelity anatomical resources for anatomy students.&lt;/p&gt;

&lt;p&gt;Our architectural constraints are unique: we needed a "database" that was completely open, version-controlled, and accessible to non-technical contributors (like anatomy professors) who might want to fix a typo or add a description without touching a database console.&lt;/p&gt;

&lt;p&gt;The solution? &lt;strong&gt;"GitHub-as-a-Database."&lt;/strong&gt; We store our data (JSON files and images) in a specific data branch of our public repository. The application fetches this content via the GitHub Raw API to render the UI.&lt;/p&gt;

&lt;p&gt;While this lowered the barrier to entry for contributors, it introduced a critical engineering challenge: The N+1 Fetch Problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Problem: Latency and Rate Limits&lt;/strong&gt;&lt;br&gt;
In our initial architecture, the client (or a thin server wrapper) would:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fetch the "Manifest" file (a list of all bones).&lt;/li&gt;
&lt;li&gt;Iterate through that list.&lt;/li&gt;
&lt;li&gt;Fire a separate HTTP request for each bone to get its metadata (images, sub-bones, descriptions).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For a boneset like the "Bony Pelvis," this resulted in dozens of simultaneous HTTP requests to raw.githubusercontent.com.&lt;/p&gt;
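&lt;p&gt;Expressed as code, the naive flow looked roughly like this (a sketch with hypothetical helper and path names, not our actual client code; the fetcher is passed in so the request count is easy to see):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Sketch of the original N+1 flow: one manifest request, then one per bone
async function loadBonesetNaively(fetchJSON, manifestUrl) {
    let requestCount = 0;
    async function counted(url) {
        requestCount += 1;
        return fetchJSON(url);
    }

    // Step 1: one request for the manifest (the list of all bones)
    const manifest = await counted(manifestUrl);

    // Steps 2-3: one further request per bone -- N+1 requests per page view
    const bones = [];
    for (const boneId of manifest.bones) {
        bones.push(await counted(`bones/${boneId}.json`));
    }
    return { bones, requestCount };
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;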

&lt;p&gt;&lt;strong&gt;The Consequences:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Latency:&lt;/strong&gt; UI elements would "pop in" slowly as each request completed.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The Kill Switch:&lt;/strong&gt; GitHub imposes a strict rate limit on unauthenticated requests (60 per hour per IP). A single student clicking through the application aggressively could exhaust this limit in minutes, causing requests to fail with 403 Forbidden errors.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As the application grew, this architecture became unsustainable. We needed a way to preserve the open-source data model while ensuring high availability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution: Server-Side "Warm Cache" Strategy&lt;/strong&gt;&lt;br&gt;
To solve this, I led the refactoring of our Node.js backend to implement an In-Memory Warm Cache. Instead of fetching data on request, we shifted the heavy lifting to the startup phase.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Architecture Shift&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Cold Start: When the Node.js server boots up, it enters a "Warming" state.&lt;/li&gt;
&lt;li&gt;Bulk Fetching: The server traverses the GitHub data structure once, fetching the manifest and all constituent bone files.&lt;/li&gt;
&lt;li&gt;In-Memory Indexing: These files are parsed and stored in a local JavaScript object (searchCache).&lt;/li&gt;
&lt;li&gt;Serving: All subsequent user requests (Search, Navigation, Dropdowns) are served instantly from the local memory.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This reduces the GitHub API load from N requests per user action to a single warm-up pass per server deployment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Implementation&lt;/strong&gt;&lt;br&gt;
Here is a simplified look at the logic we implemented in our server.js to handle this caching strategy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Cache storage
let searchCache = null;

// The "Warming" Logic
async function initializeSearchCache() {
    try {
        console.log("Initializing search cache...");

        // 1. Fetch the master manifest once
        const bonesetData = await fetchJSON(BONESET_JSON_URL);

        const searchData = [];

        // 2. Iterate and fetch details server-side (only happens once at startup)
        for (const boneId of bonesetData.bones || []) {
            const boneData = await fetchJSON(`${BONES_DIR_URL}${boneId}.json`);

            if (boneData) {
                // Flatten the data for efficient searching
                searchData.push({
                    id: boneData.id,
                    name: boneData.name,
                    type: "bone",
                    // ... additional metadata
                });
            }
        }

        // 3. Store in memory
        searchCache = searchData;
        console.log(`Cache warmed: ${searchData.length} items ready.`);

    } catch (error) {
        console.error("Critical: Failed to warm cache:", error);
    }
}

// Initialize on server start
initializeSearchCache();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
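&lt;p&gt;Once the cache is warm, serving a search query reduces to an in-memory filter with no network round-trip. Here is a sketch of that read path (&lt;code&gt;searchBones&lt;/code&gt; is a hypothetical helper name; the actual route handler in server.js may shape its results differently):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// In-memory search over the warmed cache; no query ever hits the GitHub API
function searchBones(searchCache, query) {
    if (searchCache === null) {
        return []; // cache still warming; callers can retry or show a loader
    }
    const q = query.toLowerCase();
    return searchCache.filter(function (item) {
        return item.name.toLowerCase().includes(q);
    });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;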



&lt;p&gt;&lt;strong&gt;The Results&lt;/strong&gt;&lt;br&gt;
The impact of this refactoring was immediate:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Zero Rate-Limit Errors: Since the server only fetches data once upon reboot (or scheduled refresh), we stay well below GitHub's API limits, regardless of how many students are using the app concurrently.&lt;/li&gt;
&lt;li&gt;Sub-100ms Response Times: Search queries that previously waited for network round-trips now return instantly from memory.&lt;/li&gt;
&lt;li&gt;Resilience: If GitHub goes down temporarily, the application continues to function for all connected users because the data is already cached in the server's RAM.&lt;/li&gt;
&lt;/ol&gt;
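&lt;p&gt;The scheduled refresh mentioned above can be as simple as re-running the warm-up on a timer, so new commits to the data branch appear without a redeploy. A sketch, with an illustrative interval rather than our production value:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Re-run the warming pass periodically. If a refresh fails, the catch
// leaves the previous cache in place, so users keep getting (stale) data.
const SIX_HOURS_MS = 6 * 60 * 60 * 1000; // illustrative interval

function scheduleCacheRefresh(refreshFn, intervalMs) {
    return setInterval(function () {
        refreshFn().catch(function (error) {
            console.error("Cache refresh failed; serving stale data:", error);
        });
    }, intervalMs);
}

// Example wiring: scheduleCacheRefresh(initializeSearchCache, SIX_HOURS_MS);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;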

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;br&gt;
In open-source development, we often have to balance contributor experience (using simple files on GitHub) with user experience (performance and reliability). By introducing a server-side caching layer, we maintained the simplicity of our "Git-based database" for our anatomy professors while delivering a robust, production-grade experience for our students.&lt;/p&gt;

&lt;p&gt;For developers building similar read-heavy applications on static data sources: &lt;strong&gt;don't let your clients fetch directly&lt;/strong&gt;. Build a caching layer early—your API limits will thank you.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>node</category>
      <category>opensource</category>
      <category>performance</category>
    </item>
  </channel>
</rss>
