DEV Community

Shannon

Utilizing OS-recommended caching in Go to store files on disk

#go

Table of Contents

  1. Overview
  2. Local caching
  3. API Scaffolding
  4. Creating the HTTP client
  5. Seeing it in action

Overview

I recently started writing a wrapper around an API that rate limits you after 60 requests/minute. This is pretty low in the grand scheme of software, but its responses effectively don't change after a single query. With that in mind, it seemed like a great time to implement a local cache that is stored on disk and platform agnostic.

Local caching

There are plenty of great caching options, ranging from storing things in memory at startup to using Redis or other database services, but this tutorial will keep things straightforward: storing responses locally on disk.

To start, let's briefly look at how Go picks a location for cache files, regardless of the operating system you may be using. We'll use os.UserCacheDir() from the os package for this part. Its documentation states: "UserCacheDir returns the default root directory to use for user-specific cached data. Users should create their own application-specific subdirectory within this one and use that." Going into that function, we can see there's specific logic for Windows, macOS, Linux, and more.

A quick peek at the function:

...
    switch runtime.GOOS {
    case "windows":
        dir = Getenv("LocalAppData")
        if dir == "" {
            return "", errors.New("%LocalAppData% is not defined")
        }

    case "darwin", "ios":
        dir = Getenv("HOME")
        if dir == "" {
            return "", errors.New("$HOME is not defined")
        }
        dir += "/Library/Caches"
...

As I'm using a Mac, I thought it would be interesting to show what that looks like once the example program from this post has run. Running ls $HOME/Library/Caches/slucas-caching-testing prints the following:

user_1.json user_2.json

API Scaffolding

I want to frame this from the perspective of querying an API, so we'll set up some scaffolding that serves responses for GET /users/1, GET /users/2, and so on. I won't go into much detail here because it's just an httptest server returning canned responses, but it's worth showing for completeness.

type Response struct {
    ID   int    `json:"id"`
    Name string `json:"name"`
}

// getResponseByIndex returns a specific response by index (0-3)
func getResponseByIndex(index int) ([]byte, error) {
    responses := make([]Response, 4)
    names := []string{"User1", "User2", "User3", "User4"}
    for i, name := range names {
        responses[i].ID = i + 1 // Use 1-indexed IDs
        responses[i].Name = name
    }

    return json.Marshal(responses[index])
}

// setupFakeServer creates a test HTTP server that serves stub data
func setupFakeServer() *httptest.Server {
    handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // Set content type
        w.Header().Set("Content-Type", "application/json")

        // Extract the user ID from the path
        parts := strings.Split(r.URL.Path, "/")
        if len(parts) != 3 {
            http.NotFound(w, r)
            return
        }

        // Get integer of userID
        userID, err := strconv.Atoi(parts[2])
        if err != nil || userID < 1 || userID > 4 {
            http.NotFound(w, r)
            return
        }

        // Get the response corresponding to the user ID (convert to 0-indexed)
        jsonData, err := getResponseByIndex(userID - 1)
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }

        // Write the response
        w.WriteHeader(http.StatusOK)
        w.Write(jsonData)
    })

    return httptest.NewServer(handler)
}

Above, we're doing the following:

  1. Building some mock responses with the Response data type
  2. Creating a new handler func that splits the path and looks for a specific user
  3. Writing a JSON response
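The path handling in step 2 is the part that tends to trip people up, so here is a small standalone sketch of it. The function name parseUserID and the maxID parameter are my own for illustration; the checks themselves mirror the handler above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseUserID extracts the numeric ID from a path such as "/users/2".
// Splitting "/users/2" on "/" yields ["", "users", "2"]: the first element
// is empty because the path starts with a slash, which is why the handler
// expects exactly three parts. IDs outside 1..maxID are rejected.
func parseUserID(path string, maxID int) (int, bool) {
	parts := strings.Split(path, "/")
	if len(parts) != 3 {
		return 0, false
	}
	id, err := strconv.Atoi(parts[2])
	if err != nil || id < 1 || id > maxID {
		return 0, false
	}
	return id, true
}

func main() {
	for _, p := range []string{"/users/2", "/users/9", "/users", "/users/abc"} {
		id, ok := parseUserID(p, 4)
		fmt.Printf("%-12q id=%d ok=%v\n", p, id, ok)
	}
}
```

Note that, like the handler, this never inspects the middle segment, so any path with the same shape is accepted.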

Creating the HTTP Client

Next up, we'll implement our client, which is just a wrapper around an HTTP client with a directory name.

type CacheClient struct {
    httpClient *http.Client
    cacheDir   string
}

And a constructor to set up both of these pieces.

// NewCacheClient creates an HTTP client with cache directory initialized
func NewCacheClient() (*CacheClient, error) {
    cacheDir, err := os.UserCacheDir()
    if err != nil {
        return nil, fmt.Errorf("failed to get cache dir: %w", err)
    }

    cacheDir = filepath.Join(cacheDir, "slucas-caching-testing")
    if err := os.MkdirAll(cacheDir, 0755); err != nil {
        return nil, fmt.Errorf("failed to create cache dir: %w", err)
    }

    return &CacheClient{
        httpClient: &http.Client{
            Timeout: 10 * time.Second,
        },
        cacheDir: cacheDir,
    }, nil
}

A couple of points to note about this function.

  1. We're following the suggestion from the UserCacheDir documentation ("Users should create their own application-specific subdirectory within this one and use that") by creating a directory for the cached files at startup.
  2. We're explicitly setting up our struct and constructor to mention Cache in the name. It would be totally reasonable to also implement a normal client that has no caching. We'll be writing methods below to get a user, which is tied to the CacheClient struct.

Caching

First, we're setting up some methods to find, read, and write from the cache.

// getCachePath returns the cache file path for a given user ID
func (c *CacheClient) getCachePath(userID int) string {
    return filepath.Join(c.cacheDir, fmt.Sprintf("user_%d.json", userID))
}

// getCachedResponse retrieves a cached response for a user ID
func (c *CacheClient) getCachedResponse(userID int) ([]byte, error) {
    cachePath := c.getCachePath(userID)
    data, err := os.ReadFile(cachePath)
    if err != nil {
        if os.IsNotExist(err) {
            return nil, nil // Cache miss, not an error
        }
        return nil, fmt.Errorf("failed to read cache: %w", err)
    }
    return data, nil
}

// saveToCache saves a response to cache for a user ID
func (c *CacheClient) saveToCache(userID int, data []byte) error {
    cachePath := c.getCachePath(userID)
    return os.WriteFile(cachePath, data, 0644)
}

All of this is pretty straightforward; the key point is that we now have a way to save JSON responses as files on disk. It's a relatively naive approach: there's no mutex to prevent concurrent writes to the same file and no expiration, but that's fine for this concise example.

And finally, we create our caching logic in a GetUser method!

// GetUser retrieves user data, checking cache first, then making HTTP request if needed
func (c *CacheClient) GetUser(baseURL string, userID int) (*Response, error) {
    // Check cache first
    cachedData, err := c.getCachedResponse(userID)
    if err != nil {
        return nil, err
    }
    if cachedData != nil {
        var resp Response
        if err := json.Unmarshal(cachedData, &resp); err == nil {
            fmt.Printf("User %d found in cache\n", userID)
            return &resp, nil
        }
        // Cached file is not valid JSON; fall through and refetch
    }

    // Cache miss, make HTTP request
    url := fmt.Sprintf("%s/users/%d", baseURL, userID)
    httpResp, err := c.httpClient.Get(url)
    if err != nil {
        return nil, fmt.Errorf("failed to make request: %w", err)
    }
    defer httpResp.Body.Close()

    // Don't cache error responses such as a 404
    if httpResp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("unexpected status: %s", httpResp.Status)
    }

    body, err := io.ReadAll(httpResp.Body)
    if err != nil {
        return nil, fmt.Errorf("failed to read response: %w", err)
    }

    // Parse first so an invalid body never lands in the cache
    var resp Response
    if err := json.Unmarshal(body, &resp); err != nil {
        return nil, fmt.Errorf("failed to parse response: %w", err)
    }

    // Save to cache; a failed write is only a warning
    if err := c.saveToCache(userID, body); err != nil {
        fmt.Printf("Warning: failed to save to cache: %v\n", err)
    }

    return &resp, nil
}

I think this is all quite readable as well, but it never hurts to walk through step-by-step.

  1. Check for a cached response. If it exists and parses, return it
  2. On a cache miss, run the GET request
  3. Save the response to the cache
  4. Return the parsed response

Seeing it in action

func main() {
    // Set up httptest server
    server := setupFakeServer()
    defer server.Close()

    fmt.Printf("Server running at: %s\n", server.URL)

    // Create a client with caching enabled
    client, err := NewCacheClient()
    if err != nil {
        fmt.Printf("Error creating client: %v\n", err)
        return
    }

    // Example: Get user 1 (will fetch from server and cache)
    fmt.Println("\nFirst request (should fetch from server):")
    user, err := client.GetUser(server.URL, 1)
    if err != nil {
        fmt.Printf("Error getting user: %v\n", err)
        return
    }
    fmt.Printf("User: %+v\n", user)

    // Example: Get user 1 again (should use cache)
    fmt.Println("\nSecond request (should use cache):")
    user, err = client.GetUser(server.URL, 1)
    if err != nil {
        fmt.Printf("Error getting user: %v\n", err)
        return
    }
    fmt.Printf("User: %+v\n", user)
}

I'll clean up my environment to simulate starting from scratch: rm $HOME/Library/Caches/slucas-caching-testing/*

From here, we should see that the first result is pulled from the HTTP server and saved, and that the second request is served from the cache:

$ ./caching
Server running at: http://127.0.0.1:60219

First request (should fetch from server):
User: &{ID:1 Name:User1}

Second request (should use cache):
User 1 found in cache
User: &{ID:1 Name:User1}

And that's it! I hope this small, contrived example was useful. From here, we could add mutexes, add expirations, pre-load certain data, or track hit rates to learn which endpoints are queried most often. Productionizing something opens up a whole new can of worms!
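To give a taste of one of those follow-ups, here is a minimal sketch of hit-rate tracking guarded by a mutex. The cacheStats type and its methods are hypothetical and not wired into the CacheClient from this post; the goroutines in main only simulate lookups.

```go
package main

import (
	"fmt"
	"sync"
)

// cacheStats is a hypothetical counter for cache hits and misses.
// The mutex makes it safe to update from multiple goroutines.
type cacheStats struct {
	mu     sync.Mutex
	hits   int
	misses int
}

// record counts one lookup as either a hit or a miss.
func (s *cacheStats) record(hit bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if hit {
		s.hits++
	} else {
		s.misses++
	}
}

// hitRate returns hits / (hits + misses), or 0 when nothing was recorded.
func (s *cacheStats) hitRate() float64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	total := s.hits + s.misses
	if total == 0 {
		return 0
	}
	return float64(s.hits) / float64(total)
}

func main() {
	var stats cacheStats
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			stats.record(i%4 != 0) // pretend every fourth lookup misses
		}(i)
	}
	wg.Wait()
	fmt.Printf("hit rate: %.2f\n", stats.hitRate()) // prints "hit rate: 0.75"
}
```

In the client, a call to record could sit next to the "found in cache" message and the HTTP request, with one cacheStats per endpoint if you want per-endpoint numbers.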

I'm dropping the full code below as well.

All code

package main

import (
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "net/http/httptest"
    "os"
    "path/filepath"
    "strconv"
    "strings"
    "time"
)

type CacheClient struct {
    httpClient *http.Client
    cacheDir   string
}

type Response struct {
    ID   int    `json:"id"`
    Name string `json:"name"`
}

// getResponseByIndex returns a specific response by index (0-3)
func getResponseByIndex(index int) ([]byte, error) {
    responses := make([]Response, 4)
    names := []string{"User1", "User2", "User3", "User4"}
    for i, name := range names {
        responses[i].ID = i + 1 // Use 1-indexed IDs
        responses[i].Name = name
    }

    return json.Marshal(responses[index])
}

// setupFakeServer creates a test HTTP server that serves stub data
func setupFakeServer() *httptest.Server {
    handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // Set content type
        w.Header().Set("Content-Type", "application/json")

        // Extract the user ID from the path
        parts := strings.Split(r.URL.Path, "/")
        if len(parts) != 3 {
            http.NotFound(w, r)
            return
        }

        // Get integer of userID
        userID, err := strconv.Atoi(parts[2])
        if err != nil || userID < 1 || userID > 4 {
            http.NotFound(w, r)
            return
        }

        // Get the response corresponding to the user ID (convert to 0-indexed)
        jsonData, err := getResponseByIndex(userID - 1)
        if err != nil {
            http.Error(w, err.Error(), http.StatusInternalServerError)
            return
        }

        // Write the response
        w.WriteHeader(http.StatusOK)
        w.Write(jsonData)
    })

    return httptest.NewServer(handler)
}

// NewCacheClient creates an HTTP client with cache directory initialized
func NewCacheClient() (*CacheClient, error) {
    cacheDir, err := os.UserCacheDir()
    if err != nil {
        return nil, fmt.Errorf("failed to get cache dir: %w", err)
    }

    cacheDir = filepath.Join(cacheDir, "slucas-caching-testing")
    if err := os.MkdirAll(cacheDir, 0755); err != nil {
        return nil, fmt.Errorf("failed to create cache dir: %w", err)
    }

    return &CacheClient{
        httpClient: &http.Client{
            Timeout: 10 * time.Second,
        },
        cacheDir: cacheDir,
    }, nil
}

// getCachePath returns the cache file path for a given user ID
func (c *CacheClient) getCachePath(userID int) string {
    return filepath.Join(c.cacheDir, fmt.Sprintf("user_%d.json", userID))
}

// getCachedResponse retrieves a cached response for a user ID
func (c *CacheClient) getCachedResponse(userID int) ([]byte, error) {
    cachePath := c.getCachePath(userID)
    data, err := os.ReadFile(cachePath)
    if err != nil {
        if os.IsNotExist(err) {
            return nil, nil // Cache miss, not an error
        }
        return nil, fmt.Errorf("failed to read cache: %w", err)
    }
    return data, nil
}

// saveToCache saves a response to cache for a user ID
func (c *CacheClient) saveToCache(userID int, data []byte) error {
    cachePath := c.getCachePath(userID)
    return os.WriteFile(cachePath, data, 0644)
}

// GetUser retrieves user data, checking cache first, then making HTTP request if needed
func (c *CacheClient) GetUser(baseURL string, userID int) (*Response, error) {
    // Check cache first
    cachedData, err := c.getCachedResponse(userID)
    if err != nil {
        return nil, err
    }
    if cachedData != nil {
        var resp Response
        if err := json.Unmarshal(cachedData, &resp); err == nil {
            fmt.Printf("User %d found in cache\n", userID)
            return &resp, nil
        }
        // Cached file is not valid JSON; fall through and refetch
    }

    // Cache miss, make HTTP request
    url := fmt.Sprintf("%s/users/%d", baseURL, userID)
    httpResp, err := c.httpClient.Get(url)
    if err != nil {
        return nil, fmt.Errorf("failed to make request: %w", err)
    }
    defer httpResp.Body.Close()

    // Don't cache error responses such as a 404
    if httpResp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("unexpected status: %s", httpResp.Status)
    }

    body, err := io.ReadAll(httpResp.Body)
    if err != nil {
        return nil, fmt.Errorf("failed to read response: %w", err)
    }

    // Parse first so an invalid body never lands in the cache
    var resp Response
    if err := json.Unmarshal(body, &resp); err != nil {
        return nil, fmt.Errorf("failed to parse response: %w", err)
    }

    // Save to cache; a failed write is only a warning
    if err := c.saveToCache(userID, body); err != nil {
        fmt.Printf("Warning: failed to save to cache: %v\n", err)
    }

    return &resp, nil
}

func main() {
    // Set up httptest server
    server := setupFakeServer()
    defer server.Close()

    fmt.Printf("Server running at: %s\n", server.URL)

    // Create a client with caching enabled
    client, err := NewCacheClient()
    if err != nil {
        fmt.Printf("Error creating client: %v\n", err)
        return
    }

    // Example: Get user 1 (will fetch from server and cache)
    fmt.Println("\nFirst request (should fetch from server):")
    user, err := client.GetUser(server.URL, 1)
    if err != nil {
        fmt.Printf("Error getting user: %v\n", err)
        return
    }
    fmt.Printf("User: %+v\n", user)

    // Example: Get user 1 again (should use cache)
    fmt.Println("\nSecond request (should use cache):")
    user, err = client.GetUser(server.URL, 1)
    if err != nil {
        fmt.Printf("Error getting user: %v\n", err)
        return
    }
    fmt.Printf("User: %+v\n", user)

    // Example: Get user 2
    fmt.Println("\nRequest for user 2:")
    user2, err := client.GetUser(server.URL, 2)
    if err != nil {
        fmt.Printf("Error getting user: %v\n", err)
        return
    }
    fmt.Printf("User: %+v\n", user2)
}
