DEV Community

Serhii Mariiekha


Leveraging AI Models with Go and Replicate API: A Comprehensive Guide

In the rapidly evolving landscape of artificial intelligence, Replicate.com has emerged as a powerful platform that provides access to a wide array of pre-trained AI models through a simple API interface. This essay explores how to effectively utilize Replicate's API using Go, demonstrating how to integrate various models into your applications while maintaining clean, maintainable code.

Understanding Replicate's Architecture

Replicate provides a RESTful API that allows developers to run machine learning models in the cloud. The platform handles the complexity of model deployment, scaling, and infrastructure management, letting developers focus on integration and application logic. When working with Replicate in Go, we'll need to understand a few key concepts:

  1. Model Versions: Each model on Replicate has specific versions, identified by unique hashes
  2. Predictions: Running a model creates a "prediction" - an asynchronous job that processes your inputs
  3. Webhooks: Optional callbacks that notify your application when predictions complete

Setting Up the Development Environment

Before diving into implementation, let's set up our Go project with the necessary dependencies. Besides the standard library, the client we build below relies on hashicorp's go-retryablehttp package for automatic retries, so install it first with go get github.com/hashicorp/go-retryablehttp:

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
    "time"

    "github.com/hashicorp/go-retryablehttp"
)

// Config holds our Replicate API settings.
type Config struct {
    Token   string
    BaseURL string
}

// NewConfig creates a new configuration instance.
func NewConfig() *Config {
    return &Config{
        Token:   os.Getenv("REPLICATE_API_TOKEN"),
        BaseURL: "https://api.replicate.com/v1",
    }
}

Creating a Robust Client

Let's implement a reusable client structure that will handle our API interactions:

// Client represents our Replicate API client
type Client struct {
    config *Config
    http   *http.Client
}

// NewClient creates a new Replicate client instance with retry functionality.
func NewClient(config *Config) *Client {
    client := retryablehttp.NewClient()
    client.RetryMax = 3
    client.RetryWaitMin = 1 * time.Second
    client.RetryWaitMax = 30 * time.Second
    client.HTTPClient.Timeout = time.Second * 30

    return &Client{
        config: config,
        http:   client.StandardClient(),
    }
}

// createRequest helps build HTTP requests with appropriate headers
func (c *Client) createRequest(method, endpoint string, body any) (*http.Request, error) {
    var buf bytes.Buffer

    if body != nil {
        if err := json.NewEncoder(&buf).Encode(body); err != nil {
            return nil, fmt.Errorf("encoding request body: %w", err)
        }
    }

    req, err := http.NewRequest(method, c.config.BaseURL+endpoint, &buf)
    if err != nil {
        return nil, fmt.Errorf("creating request: %w", err)
    }

    req.Header.Set("Authorization", "Token "+c.config.Token)
    req.Header.Set("Content-Type", "application/json")

    return req, nil
}

Working with Stable Diffusion

Let's implement a practical example using Stable Diffusion, a popular image generation model:

// StableDiffusionInput represents the input parameters for Stable Diffusion.
type StableDiffusionInput struct {
    Prompt string `json:"prompt"`
    Width  int    `json:"width,omitempty"`
    Height int    `json:"height,omitempty"`
}

// CreateStableDiffusionPrediction starts a new image generation task.
func (c *Client) CreateStableDiffusionPrediction(input *StableDiffusionInput) (*Prediction, error) {
    payload := map[string]any{
        // "version" must be the exact version hash from the model's page
        // on Replicate; the value below is a readable placeholder.
        "version": "stable-diffusion-v1-5",
        "input":   input,
    }

    req, err := c.createRequest("POST", "/predictions", payload)
    if err != nil {
        return nil, err
    }

    resp, err := c.http.Do(req)
    if err != nil {
        return nil, fmt.Errorf("making request: %w", err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusCreated {
        body, _ := io.ReadAll(resp.Body)
        return nil, fmt.Errorf("API error: %s: %s", resp.Status, body)
    }

    var prediction Prediction
    if err := json.NewDecoder(resp.Body).Decode(&prediction); err != nil {
        return nil, fmt.Errorf("decoding response: %w", err)
    }

    return &prediction, nil
}

Implementing a Prediction Poller

Since Replicate's predictions are asynchronous, we need a way to poll for results:

// PredictionStatus represents the current state of a prediction
type PredictionStatus string

const (
    StatusStarting    PredictionStatus = "starting"
    StatusProcessing  PredictionStatus = "processing"
    StatusSucceeded   PredictionStatus = "succeeded"
    StatusFailed      PredictionStatus = "failed"
)

// PollPrediction continuously checks a prediction's status until completion
func (c *Client) PollPrediction(id string) (*Prediction, error) {
    ticker := time.NewTicker(2 * time.Second)
    defer ticker.Stop()

    timeout := time.After(10 * time.Minute)

    for {
        select {
        case <-ticker.C:
            prediction, err := c.GetPrediction(id)
            if err != nil {
                return nil, err
            }

            switch prediction.Status {
            case StatusSucceeded:
                return prediction, nil

            case StatusFailed:
                return nil, fmt.Errorf("prediction failed: %s", prediction.Error)

            case StatusStarting, StatusProcessing:
                continue

            default:
                return nil, fmt.Errorf("unknown status: %s", prediction.Status)
            }

        case <-timeout:
            return nil, fmt.Errorf("prediction timed out after 10 minutes")
        }
    }
}

Putting It All Together

Here's how we can use our implementation to generate images with Stable Diffusion:

func main() {
    config := NewConfig()
    client := NewClient(config)

    input := &StableDiffusionInput{
        Prompt: "matte black sports car with purple neon wheels:1.2, low rider, neon lights, sunset, reflective puddles, scifi, concept car, sideview, tropical background, 35mm photograph, film, bokeh, professional, 4k, highly detailed",
        Width:  768,
        Height: 512,
    }

    // Create the prediction
    prediction, err := client.CreateStableDiffusionPrediction(input)
    if err != nil {
        fmt.Printf("Error creating prediction: %v\n", err)
        return
    }

    // Poll for results
    result, err := client.PollPrediction(prediction.ID)
    if err != nil {
        fmt.Printf("Error polling prediction: %v\n", err)
        return
    }

    // Handle the result. Decoding JSON into an `any` field yields []any,
    // not []string, so assert the element type separately.
    if images, ok := result.Output.([]any); ok && len(images) > 0 {
        if url, ok := images[0].(string); ok {
            fmt.Printf("Generated image URL: %s\n", url)
        }
    }
}
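With the snippets assembled into a single main package, running the example only requires exporting the environment variable NewConfig reads:

```shell
export REPLICATE_API_TOKEN=<your-token>   # your API token from replicate.com
go run .
```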

Error Handling and Best Practices

When working with external APIs, robust error handling is crucial. Here are some key practices implemented in our code:

  1. Timeouts: Our HTTP client includes a timeout to prevent hanging requests
  2. Proper error wrapping: We use fmt.Errorf with %w to maintain error context
  3. Resource cleanup: We properly close response bodies using defer
  4. Type safety: We use strong typing for API responses and requests
  5. Configuration management: API tokens are loaded from environment variables

Extending to Other Models

The structure we've created can be easily extended to work with other models on Replicate. Here's how you might adapt it for a different model:

// Generic prediction creation function.
func (c *Client) CreatePrediction(version string, input any) (*Prediction, error) {
    payload := map[string]any{
        "version": version,
        "input":   input,
    }

    req, err := c.createRequest(http.MethodPost, "/predictions", payload)
    if err != nil {
        return nil, err
    }

    resp, err := c.http.Do(req)
    if err != nil {
        return nil, fmt.Errorf("making request: %w", err)
    }

    defer resp.Body.Close()

    if resp.StatusCode != http.StatusCreated {
        return nil, fmt.Errorf("api error: %s", resp.Status)
    }

    var prediction Prediction
    if err := json.NewDecoder(resp.Body).Decode(&prediction); err != nil {
        return nil, fmt.Errorf("decoding response: %w", err)
    }

    return &prediction, nil
}

Conclusion

Working with Replicate's API in Go provides a powerful way to integrate AI capabilities into your applications. By following good software engineering practices and implementing robust error handling, we can create reliable integrations that scale well as our applications grow.

The code structure we've explored provides a solid foundation that you can build upon, whether you're working with image generation, natural language processing, or any other AI models available on Replicate. Remember to always consider rate limiting, error handling, and proper resource cleanup when working with external APIs.

For production applications, consider adding features like:

  • Retry mechanisms with exponential backoff
  • Metric collection for API calls
  • Proper logging and monitoring
  • Rate limiting implementation
  • Webhook support for asynchronous notifications

This foundation will serve as an excellent starting point for building sophisticated AI-powered applications using Go and Replicate's extensive model ecosystem.
