Jakob Khalil

Posted on Feb 11

Building a YAML Schema to Go Model Generator for OpenAI's API

#go #openai #codegen #api

When working with OpenAI's API, I needed to generate Go models from their YAML schema specification. Instead of manually creating these models, I built a generator using Go's text/template package. Here's how I did it and what I learned along the way.

The Problem

OpenAI provides their API specification in YAML format, but I needed strongly-typed Go models for my application. Manual conversion would be time-consuming and error-prone, especially when the API specs change.

The Solution

I created a generator that:

Parses OpenAI's YAML schema
Generates Go structs with proper types and JSON tags
Handles complex types like oneOf/anyOf
Generates proper marshaling/unmarshaling methods

Let's look at an example of what the generator produces:

// output/ChatCompletionRequestUserMessage.go
// Messages sent by an end user, containing prompts or additional context
// information.
type ChatCompletionRequestUserMessage struct {
    Content UserMessageContent `json:"content"`
    Name *string `json:"name,omitempty"`
    Role Role `json:"role"`
}

This was generated from a YAML schema like this:

components:
  schemas:
    ChatCompletionRequestUserMessage:
      description: Messages sent by an end user, containing prompts or additional context information.
      type: object
      required:
        - content
        - role
      properties:
        content:
          $ref: '#/components/schemas/UserMessageContent'
        name:
          type: string
        role:
          $ref: '#/components/schemas/Role'
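The mapping from that schema to the struct above can be sketched roughly as follows. This is a minimal illustration, not the generator's actual code: the `Property` type, `goType`, and `goField` are hypothetical names, and the rule "optional primitives become pointers with `omitempty`" is inferred from the example output.

```go
package main

import (
	"fmt"
	"strings"
)

// Property is a hypothetical, simplified view of one schema property.
type Property struct {
	Name     string // property key in the YAML, e.g. "content"
	Type     string // primitive type ("string", "integer", ...); empty when Ref is set
	Ref      string // "$ref" target, e.g. "#/components/schemas/Role"
	Required bool   // true when the key is listed under "required"
}

// goType returns the Go type for a property. Optional primitives become
// pointers so that "absent" can be told apart from the zero value.
// Optional references are left as plain types in this sketch.
func goType(p Property) string {
	if p.Ref != "" {
		// Use the last path segment of the reference as the type name.
		return p.Ref[strings.LastIndex(p.Ref, "/")+1:]
	}
	primitives := map[string]string{
		"string":  "string",
		"integer": "int",
		"number":  "float64",
		"boolean": "bool",
	}
	t, ok := primitives[p.Type]
	if !ok {
		return "any" // fallback for types this sketch does not cover
	}
	if !p.Required {
		return "*" + t
	}
	return t
}

// goField renders one struct field line with its JSON tag.
// It assumes a non-empty ASCII property name.
func goField(p Property) string {
	tag := p.Name
	if !p.Required {
		tag += ",omitempty"
	}
	// Exported Go name: upper-case the first letter of the key.
	name := strings.ToUpper(p.Name[:1]) + p.Name[1:]
	return fmt.Sprintf("%s %s `json:%q`", name, goType(p), tag)
}

func main() {
	props := []Property{
		{Name: "content", Ref: "#/components/schemas/UserMessageContent", Required: true},
		{Name: "name", Type: "string"},
		{Name: "role", Ref: "#/components/schemas/Role", Required: true},
	}
	for _, p := range props {
		fmt.Println(goField(p))
	}
}
```

Keeping the type decision (`goType`) separate from the rendering (`goField`) makes it easier to extend the mapping later, for example to arrays or nested objects.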

Key Features

1. OneOf/AnyOf Handling

One of the trickier parts was handling OpenAI's oneOf/anyOf patterns. For example, UserMessageContent can be either a plain string or an array of content parts:

// output/ChatCompletionRequestUserMessage.go
type UserMessageContent struct {
    UserMessageContentString *string
    UserMessageContentArray *[]ChatCompletionRequestUserMessageContentPart
}

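The generator creates a struct with one pointer field per variant, only one of which should be non-nil, and implements custom marshaling through a Match method that dispatches to whichever variant is populated: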

// output/ChatCompletionRequestUserMessage.go
func (x UserMessageContent) MarshalJSON() ([]byte, error) {
    rawResult, err := x.Match(
        func(y string) (any, error) {
            return json.Marshal(y)
        },
        func(y []ChatCompletionRequestUserMessageContentPart) (any, error) {
            return json.Marshal(y)
        },
    )
    if err != nil {
        return nil, err
    }
    result, ok := rawResult.([]byte)
    if !ok {
        return nil, fmt.Errorf("expected match to return type '[]byte'")
    }
    return result, nil
}

2. Enum Support

The generator also maps string enums to a named string type with one typed constant per allowed value. For example, the ReasoningEffort enum:

// output/ChatCompletionRequest.go
type ReasoningEffortEnum string

const (
    ReasoningEffortEnumHigh ReasoningEffortEnum = "high"
    ReasoningEffortEnumLow ReasoningEffortEnum = "low"
    ReasoningEffortEnumMedium ReasoningEffortEnum = "medium"
)

3. Documentation Preservation

Notice how the generator preserves documentation from the YAML schema:

// output/ChatCompletionRequestDeveloperMessage.go
// Developer-provided instructions that the model should follow, regardless of
// messages sent by the user. With o1 models and newer, `developer` messages
// replace the previous `system` messages.
type ChatCompletionRequestDeveloperMessage struct {
    // ... fields omitted
}

The Generator Template

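Here's a simplified version of the main template I used. Unlike the real output shown earlier, this simplified version adds omitempty to every field, required or not: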

const modelTemplate = `
// THIS IS GENERATED CODE. DO NOT EDIT
package {{.PackageName}}

import (
    "encoding/json"
    "fmt"
)

{{if .Description}}// {{.Description}}{{end}}
type {{.Name}} struct {
    {{range .Properties}}
    {{.Name}} {{.Type}} ` + "`json:\"{{.JsonTag}},omitempty\"`" + `
    {{end}}
}

{{if .HasOneOf}}
func (x *{{.Name}}) Match(
    {{range .OneOfTypes}}
    fn{{.Name}} func(y {{.Type}}) (any, error),
    {{end}}
) (any, error) {
    {{range .OneOfTypes}}
    if x.{{.Name}} != nil {
        return fn{{.Name}}(*x.{{.Name}})
    }
    {{end}}
    return nil, fmt.Errorf("invalid content: all variants are nil")
}
{{end}}
`

Usage

Using the generator is straightforward:

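package main

import "log"

// LoadYAMLSchema and NewGenerator belong to the generator itself and are
// not shown in this post.
func main() {
    // Parse the OpenAI specification from disk.
    schema, err := LoadYAMLSchema("openai-schema.yaml")
    if err != nil {
        log.Fatal(err)
    }

    // Write the generated Go files into output/.
    generator := NewGenerator(schema)
    if err := generator.Generate("output/"); err != nil {
        log.Fatal(err)
    }
}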

Benefits

Type Safety: Generated models provide compile-time type checking
Maintainability: Easy to regenerate when the API spec changes
Consistency: Ensures all models follow the same patterns
Documentation: Preserves API documentation in Go comments

Conclusion

Building this generator has made my OpenAI API integration much easier to maintain. The generated code is consistent, well-documented, and type-safe. When OpenAI updates their API, I can regenerate the models instead of editing them by hand.

DEV Community