Leapcell

Posted on Dec 30, 2024

Go Generics: A Deep Dive

#go #webdev #programming

1. Go Without Generics

Before the introduction of generics, there were several approaches to implementing generic functions that support different data types:

Approach 1: Implement a function for each data type
This approach produces highly redundant code that is costly to maintain: any change must be repeated in every copy of the function. In addition, because Go does not support function overloading, each copy needs a distinct name, which makes the functions awkward to expose to other packages.
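As a minimal sketch of this approach, the two helpers below (MaxInt and MaxFloat64 are hypothetical names chosen for illustration) have identical bodies and differ only in their types:

```go
package main

import "fmt"

// MaxInt and MaxFloat64 are hypothetical helpers. Without generics or
// overloading, every type needs its own, distinctly named copy.
func MaxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

func MaxFloat64(a, b float64) float64 {
	if a > b {
		return a
	}
	return b
}

func main() {
	fmt.Println(MaxInt(1, 2))         // 2
	fmt.Println(MaxFloat64(1.5, 0.5)) // 1.5
}
```

Supporting one more type, such as int64, means adding a third copy and a third name.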

Approach 2: Use the data type with the largest range
To avoid code redundancy, another method is to accept the numeric type with the largest range. A typical example is math.Max, which returns the larger of two numbers. It takes and returns float64, which has the widest range among Go's numeric types, so values of other numeric types can be passed after conversion. Although this reduces code redundancy, every argument must first be converted to float64. For example, comparing two int values still requires converting both arguments and converting the result back, which adds overhead and reads unnaturally. The conversion is also not always lossless: float64 has a 53-bit significand, so int64 or uint64 values above 2^53 may be rounded.
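The following sketch shows what this looks like for int arguments, and where the conversion stops being exact. MaxIntViaFloat is a hypothetical wrapper introduced only for this illustration:

```go
package main

import (
	"fmt"
	"math"
)

// MaxIntViaFloat is a hypothetical wrapper: it converts both arguments to
// float64 for math.Max and converts the result back to int.
func MaxIntViaFloat(a, b int) int {
	return int(math.Max(float64(a), float64(b)))
}

func main() {
	fmt.Println(MaxIntViaFloat(3, 7)) // 7

	// float64 has a 53-bit significand, so 2^53+1 does not survive the
	// round trip through float64.
	big := int64(1)<<53 + 1
	fmt.Println(int64(float64(big)) == big) // false
}
```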

Approach 3: Use the interface{} type
Using the interface{} type avoids both problems above. However, interface{} introduces runtime overhead because values must be inspected with type assertions or type switches at runtime. In addition, the compiler cannot statically check the types passed through interface{}, so some type errors are only discovered at runtime.
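A rough sketch of this approach follows. MaxAny is a hypothetical function that supports only int and float64; note that the mismatched call compiles and is rejected only when it runs:

```go
package main

import (
	"errors"
	"fmt"
)

// MaxAny is a hypothetical helper. It accepts any values and has to
// discover their types at runtime.
func MaxAny(a, b interface{}) (interface{}, error) {
	switch x := a.(type) {
	case int:
		y, ok := b.(int)
		if !ok {
			return nil, errors.New("type mismatch")
		}
		if x > y {
			return x, nil
		}
		return y, nil
	case float64:
		y, ok := b.(float64)
		if !ok {
			return nil, errors.New("type mismatch")
		}
		if x > y {
			return x, nil
		}
		return y, nil
	default:
		return nil, errors.New("unsupported type")
	}
}

func main() {
	v, err := MaxAny(1, 2)
	fmt.Println(v, err) // 2 <nil>

	// This call compiles; the error only shows up at runtime.
	_, err = MaxAny(1, "two")
	fmt.Println(err) // type mismatch
}
```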

2. Advantages of Generics

Go 1.18 introduced support for generics, the most significant change to the language since Go was open-sourced.
Generics are a programming language feature that lets programmers write code against type parameters instead of concrete types. When the code is used, each type parameter is replaced by a concrete type argument, either passed explicitly or inferred by the compiler, which makes the code reusable across types. In Go, type parameters can be declared on functions and on type definitions; methods cannot declare type parameters of their own, but they can use those of their receiver type.
The main advantages of generics are better code reuse and type safety. Generic code is more concise and flexible than per-type copies, can handle different types of data, and increases the expressiveness of Go. Because the concrete types are determined at compile time, the compiler can type-check every use, which avoids type conversion errors at runtime.
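For illustration, here is a minimal generic function. The constraint Number and the function Max are hypothetical names, and the constraint lists only a few numeric types to keep the sketch short:

```go
package main

import "fmt"

// Number is a hypothetical constraint covering a few numeric types.
type Number interface {
	~int | ~int64 | ~float64
}

// Max is written once and works for every type in Number's type set.
func Max[T Number](a, b T) T {
	if a > b {
		return a
	}
	return b
}

func main() {
	fmt.Println(Max(1, 2))            // type argument inferred as int: 2
	fmt.Println(Max[float64](1, 2.5)) // type argument passed explicitly: 2.5

	// Max(1, "two") would be rejected at compile time.
}
```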

3. Differences Between Generics and `interface{}`

In the Go language, both interface{} and generics are tools for handling multiple data types. To discuss their differences, let's first look at the implementation principles of interface{} and generics.

3.1 `interface{}` Implementation Principle

interface{} is the empty interface: an interface type that declares no methods. Since every type implements interface{}, it can be used to write functions, methods, or data structures that accept values of any type. At runtime, an interface{} value is represented by the eface structure shown below, which contains two fields, _type and data.

type eface struct {
    _type *_type
    data  unsafe.Pointer
}

// In recent Go versions, runtime._type is an alias for internal/abi.Type,
// whose fields are shown here.
type _type struct {
    Size_       uintptr
    PtrBytes    uintptr // number of (prefix) bytes in the type that can contain pointers
    Hash        uint32  // hash of type; avoids computation in hash tables
    TFlag       TFlag   // extra type information flags
    Align_      uint8   // alignment of variable with this type
    FieldAlign_ uint8   // alignment of struct field with this type
    Kind_       uint8   // enumeration for C
    // function for comparing objects of this type
    // (ptr to object A, ptr to object B) -> ==?
    Equal func(unsafe.Pointer, unsafe.Pointer) bool
    // GCData stores the GC type data for the garbage collector.
    // If the KindGCProg bit is set in kind, GCData is a GC program.
    // Otherwise it is a ptrmask bitmap. See mbitmap.go for details.
    GCData    *byte
    Str       NameOff // string form
    PtrToThis TypeOff // type for pointer to this type, may be zero
}

_type points to a type descriptor that records information about the dynamic type of the value, such as its size, kind, hash, equality function, and string representation. data is a pointer to the actual data. For pointer-shaped types (such as pointers, maps, and channels), the value itself is stored in the data field; for all other types, data points to a copy of the value.
When a value of a concrete type is assigned to a variable of type interface{}, Go implicitly boxes it into an eface, setting the _type field to the type of the value and the data field to refer to the value. For example, when the statement var i interface{} = 123 is executed, Go creates an eface whose _type field describes int and whose data field points to the value 123.
Retrieving the stored value from an interface{} is the reverse, unboxing step, done with a type assertion or a type switch. The expected type must be named explicitly. If the dynamic type of the value stored in the interface{} matches the expected type, the assertion succeeds and the value is returned. Otherwise it fails: the two-value form v, ok := i.(T) reports false, and the single-value form panics.

var i interface{} = "hello"
s, ok := i.(string)
if ok {
    fmt.Println(s) // Output "hello"
} else {
    fmt.Println("not a string")
}

As shown, interface{} supports multiple data types by boxing and unboxing values at runtime.
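When several types are possible, a type switch is the usual way to unbox. The sketch below assumes a hypothetical Describe function; each call boxes its argument into an interface{}, and the switch inspects the dynamic type:

```go
package main

import "fmt"

// Describe is a hypothetical helper that unboxes v with a type switch.
func Describe(v interface{}) string {
	switch x := v.(type) {
	case int:
		return fmt.Sprintf("int %d", x)
	case string:
		return fmt.Sprintf("string %q", x)
	default:
		return fmt.Sprintf("other %T", x)
	}
}

func main() {
	fmt.Println(Describe(123))     // int 123
	fmt.Println(Describe("hello")) // string "hello"
	fmt.Println(Describe(1.5))     // other float64
}
```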

3.2 Generics Implementation Principle

The Go core team was very cautious when evaluating the implementation schemes of Go generics. A total of three implementation schemes were submitted:

Stenciling scheme
Dictionaries scheme
GC Shape Stenciling scheme

The Stenciling scheme is the approach used by languages such as C++ and Rust, where it is usually called monomorphization. At compile time, a separate copy of the generic function is generated for each distinct type argument it is instantiated with, which ensures type safety and lets each copy be optimized for its type. However, this slows down the compiler, and when a generic function is instantiated with many types, the many generated copies can make the binary very large. The larger code size can in turn hurt runtime performance through more instruction cache misses and worse branch prediction.

The Dictionaries scheme generates only one copy of the code for a generic function and adds a dictionary parameter, dict, as its first parameter. The dictionary holds the type-related information for the type arguments of a given call, and it is passed in the AX register on AMD64. The advantage is less work at compile time and no increase in binary size. The disadvantages are higher runtime overhead, fewer opportunities to optimize the function body at compile time, and complications such as recursive dictionaries.

type Op interface {
    int | float64
}

func Add[T Op](m, n T) T {
    return m + n
}

// Conceptually, after compilation (pseudocode, not valid Go):
//
//     dict = map[type]typeInfo{
//         int:     intInfo{newFunc, lessFunc /* ... */},
//         float64: floatInfo{ /* ... */ },
//     }
//     func Add(dict[T], m, n T) T
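The idea can be imitated by hand in ordinary Go. The sketch below is only an analogy, not what the compiler emits: opDict, intDict, floatDict, and addShared are hypothetical names, and a real dictionary holds type descriptors rather than closures. It shows one shared function body whose type-specific behavior comes entirely from the dictionary passed as the first argument:

```go
package main

import "fmt"

// opDict is a hypothetical, hand-written stand-in for the hidden dictionary.
// It carries the type-specific operation the shared body cannot know.
type opDict struct {
	add func(m, n any) any
}

var (
	intDict   = &opDict{add: func(m, n any) any { return m.(int) + n.(int) }}
	floatDict = &opDict{add: func(m, n any) any { return m.(float64) + n.(float64) }}
)

// addShared is the single shared body. Every call goes through the
// dictionary, which is the runtime cost this scheme pays.
func addShared(d *opDict, m, n any) any {
	return d.add(m, n)
}

func main() {
	fmt.Println(addShared(intDict, 1, 2))       // 3
	fmt.Println(addShared(floatDict, 1.5, 2.0)) // 3.5
}
```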

Go ultimately combined the two schemes into GC Shape Stenciling. Code is generated per GC shape rather than per type: types with the same GC shape share one instantiation (the GC shape of a type is how it appears to the memory allocator and garbage collector). In the Go 1.18 implementation, two types have the same GC shape if they have the same underlying type, and all pointer types share a single shape, written *uint8. Each shared instantiation also receives a hidden dict parameter so that it can distinguish the different types that share its GC shape.

type V interface {
    int | float64 | *int | *float64
}

func F[T V](m, n T) {}

// Conceptually, the compiler generates (pseudocode, not valid Go):
//
// 1. One instantiation per GC shape for the non-pointer types
//     func F[go.shape.int_0](m, n int)
//     func F[go.shape.float64_0](m, n float64)
// 2. A single instantiation shared by all pointer types
//     func F[go.shape.*uint8_0](m, n *uint8)
// 3. A dictionary argument added to each instantiation
//     func F[go.shape.int_0](dict, m, n int)

3.3 Differences

From the underlying implementation principles of interface{} and generics, we can find that the main difference between them is that interface{} supports handling different data types during runtime, while generics support handling different data types statically at the compilation stage. There are mainly the following differences in practical use:

(1) Performance difference: Assigning values to interface{} and retrieving them requires boxing and unboxing, which adds runtime overhead and may allocate. Generic code does not box its arguments, and instantiations can be optimized for their GC shape, so it usually avoids that overhead. The gain is not unconditional, though: because instantiations are shared per GC shape and type information is passed through a dictionary, some generic code (for example, method calls on pointer-shaped type arguments) still carries runtime indirection.

(2) Type safety: When using the interface{} type, the compiler cannot perform static type checking and can only perform type assertions at runtime. Therefore, some type errors may only be discovered at runtime. In contrast, Go's generic code is generated at compile time, so the generic code can obtain type information at compile time, ensuring type safety.
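To make the type safety difference concrete, the sketch below contrasts two hypothetical functions: SumAny takes []interface{} and can only detect a wrong element at runtime, while Sum is generic and rejects a wrong element type at compile time:

```go
package main

import "fmt"

// SumAny is a hypothetical interface{}-based sum. A non-int element is
// only detected when the function runs.
func SumAny(values []interface{}) (int, error) {
	total := 0
	for _, v := range values {
		n, ok := v.(int)
		if !ok {
			return 0, fmt.Errorf("unexpected type %T", v)
		}
		total += n
	}
	return total, nil
}

// Sum is the generic counterpart. The element type is checked by the compiler.
func Sum[T int | float64](values []T) T {
	var total T
	for _, v := range values {
		total += v
	}
	return total
}

func main() {
	_, err := SumAny([]interface{}{1, "2"})
	fmt.Println(err) // unexpected type string

	fmt.Println(Sum([]int{1, 2, 3})) // 6

	// Sum([]string{"1"}) would not compile: string does not satisfy int | float64.
}
```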

4. Scenarios for Generics

4.1 Applicable Scenarios

When implementing general data structures: By using generics, you can write code once and reuse it on different data types. This reduces code duplication and improves code maintainability and extensibility.
When operating on native container types in Go: If a function uses parameters of Go built-in container types such as slices, maps, or channels, and the function code does not make any specific assumptions about the element types in the containers, using generics can completely decouple the container algorithms from the element types in the containers. Before the generics syntax was available, reflection was usually used for implementation, but reflection makes the code less readable, cannot perform static type checking, and greatly increases the runtime overhead of the program.
When the logic of methods implemented for different data types is the same: When methods of different data types have the same function logic and the only difference is the data type of the input parameters, generics can be used to reduce code redundancy.
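The first two scenarios above can be sketched as follows. Stack and Keys are hypothetical names: Stack is a general data structure written once for any element type, and Keys works on the built-in map type without assuming anything about the element types beyond what a map key requires:

```go
package main

import "fmt"

// Stack is a hypothetical generic LIFO container.
type Stack[T any] struct {
	items []T
}

// Push adds v to the top of the stack.
func (s *Stack[T]) Push(v T) {
	s.items = append(s.items, v)
}

// Pop removes and returns the top element, or false if the stack is empty.
func (s *Stack[T]) Pop() (T, bool) {
	var zero T
	if len(s.items) == 0 {
		return zero, false
	}
	last := len(s.items) - 1
	v := s.items[last]
	s.items = s.items[:last]
	return v, true
}

// Keys is a hypothetical helper that returns the keys of any map, in
// unspecified order.
func Keys[K comparable, V any](m map[K]V) []K {
	keys := make([]K, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	return keys
}

func main() {
	var s Stack[string]
	s.Push("a")
	s.Push("b")
	v, ok := s.Pop()
	fmt.Println(v, ok) // b true

	fmt.Println(len(Keys(map[string]int{"x": 1, "y": 2}))) // 2
}
```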

4.2 Not Applicable Scenarios

Do not replace interface types with type parameters: Interfaces already support a form of generic programming. If the code only calls methods on a value, use the interface type directly instead of a type parameter. For example, io.Reader lets the same code read from files, random number generators, and other sources. Code written against io.Reader is easy to read and efficient, and a type parameter would bring almost no gain in execution speed, so there is no need for one.
When the implementation details of methods for different data types are different: If each type needs a different implementation of a method, use an interface type instead of generics.
In scenarios with strong runtime dynamics: For example, when the code dispatches on the dynamic type with a type switch, using interface{} directly works better.
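As a sketch of the io.Reader point above, the hypothetical function CountBytes below only calls the Read method (through io.Copy), so a plain interface parameter is enough. The limitedZeros type is also made up for this example and stands in for "some other source of bytes":

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// CountBytes is a hypothetical helper. It needs only the Read method, so it
// takes io.Reader rather than a type parameter.
func CountBytes(r io.Reader) (int64, error) {
	return io.Copy(io.Discard, r)
}

// limitedZeros is a hypothetical reader that produces a fixed number of
// zero bytes and then reports io.EOF.
type limitedZeros struct {
	remaining int
}

func (z *limitedZeros) Read(p []byte) (int, error) {
	if z.remaining == 0 {
		return 0, io.EOF
	}
	n := len(p)
	if n > z.remaining {
		n = z.remaining
	}
	for i := 0; i < n; i++ {
		p[i] = 0
	}
	z.remaining -= n
	return n, nil
}

func main() {
	n, err := CountBytes(strings.NewReader("hello"))
	fmt.Println(n, err) // 5 <nil>

	n, err = CountBytes(&limitedZeros{remaining: 10})
	fmt.Println(n, err) // 10 <nil>
}
```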

5. Traps in Generics

5.1 `nil` Comparison

In Go, a value whose type is a type parameter generally cannot be compared with nil. The constraint defines the set of types that T may be, and with a constraint such as any, that set includes types like int or struct types, for which nil is not a valid value. Since the compiler accepts only operations that are valid for every type in the type set, it rejects v == nil. To obtain or return the zero value of T, use one of the correct forms below instead.

// Wrong example
func ZeroValue0[T any](v T) bool {
    return v == nil  
}
// Correct example 1
func Zero1[T any]() T {
    return *new(T) 
}
// Correct example 2
func Zero2[T any]() T {
    var t T
    return t 
}
// Correct example 3
func Zero3[T any]() (t T) {
    return 
}
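If the goal is to test whether a value is the zero value of its type, one option is to constrain T with comparable and compare against a zero value of T. IsZero is a hypothetical helper written for this sketch:

```go
package main

import "fmt"

// IsZero is a hypothetical helper. The comparable constraint permits ==,
// and "var zero T" yields the zero value for whatever T is.
func IsZero[T comparable](v T) bool {
	var zero T
	return v == zero
}

func main() {
	var p *int
	fmt.Println(IsZero(0))   // true
	fmt.Println(IsZero("x")) // false
	fmt.Println(IsZero(p))   // true: the zero value of *int is nil
}
```

This does not cover types that are not comparable, such as slices and maps.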

5.2 Invalid Underlying Elements

In a constraint term of the form ~T, the underlying type of T must be T itself, and T cannot be an interface type.

// Wrong definition!
type MyInt int
type I0 interface {
        ~MyInt // Wrong! The underlying type of MyInt is int, not MyInt
        ~error // Wrong! error is an interface type
}
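For contrast, a valid version uses ~int, whose type set already includes MyInt. Integer and Double are hypothetical names used only in this sketch:

```go
package main

import "fmt"

type MyInt int

// Integer is a hypothetical constraint. ~int is valid because the
// underlying type of int is int itself.
type Integer interface {
	~int
}

// Double accepts int, MyInt, and any other type whose underlying type is int.
func Double[T Integer](v T) T {
	return v * 2
}

func main() {
	fmt.Println(Double(MyInt(21))) // 42
	fmt.Println(Double(4))         // 8
}
```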

5.3 Invalid Union Type Elements

Union type elements cannot be type parameters, and non-interface elements must be pairwise disjoint. If there is more than one element, it cannot contain an interface type with non-empty methods, nor can it be comparable or embed comparable.

func I1[K any, V interface{ K }]() { // Wrong, K in interface{ K } is a type parameter
}
type MyInt int
func I5[K any, V interface{ int | MyInt }]() { // Correct
}
func I6[K any, V interface{ ~int | MyInt }]() { // Wrong! ~int and MyInt overlap: MyInt is already in the type set of ~int
}
type MyInt2 = int
func I7[K any, V interface{ int | MyInt2 }]() { // Wrong! int and MyInt2 are the same type, they intersect
}
// Wrong! A union with more than one term cannot contain comparable
func I13[K comparable | int]() {
}
// Wrong! A union with more than one term cannot contain an interface that embeds comparable
func I14[K interface{ comparable } | int]() {
func I14[K interface{ comparable } | int]() {
}
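The sketch below shows two constraints that stay within these rules. IntLike, Key, Sum, and Contains are hypothetical names: IntLike relies on ~int alone instead of a union of overlapping terms, and Key embeds comparable as its own element next to the union rather than as a term inside it:

```go
package main

import "fmt"

type MyInt int

// IntLike is a hypothetical constraint. ~int covers int, MyInt, and every
// other type whose underlying type is int, so no union is needed.
type IntLike interface {
	~int
}

// Key is a hypothetical constraint. comparable is a separate embedded
// element that is intersected with the union, not a union term.
type Key interface {
	comparable
	~int | ~string
}

func Sum[T IntLike](xs ...T) T {
	var total T
	for _, x := range xs {
		total += x
	}
	return total
}

func Contains[K Key](xs []K, k K) bool {
	for _, x := range xs {
		if x == k {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(Sum(MyInt(1), MyInt(2)))           // 3
	fmt.Println(Contains([]string{"a", "b"}, "b")) // true
}
```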

5.4 Interface Types Cannot Be Recursively Embedded

// Wrong! Cannot embed itself
type Node interface {
        Node
}
// Wrong! Tree cannot embed itself through TreeNode
type Tree interface {
        TreeNode
}
type TreeNode interface {
        Tree
}

6. Best Practices

To make good use of generics, the following points should be noted during use:

Avoid over-generalizing. Generics do not suit every scenario, so consider carefully where they are appropriate.
Use reflection when appropriate. Go has runtime reflection, which also supports a form of generic programming. Consider reflection when both of the following hold: (1) the operation must work on types that have no methods, so an interface type is not applicable; (2) the logic differs for each type, so generics are not applicable. An example is the encoding/json package. Requiring every encoded type to implement a MarshalJSON method is not desirable, so an interface type cannot be used, and because the encoding logic differs per type, generics are not a fit either.
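As a small illustration of the reflection case, the hypothetical FieldNames function below works on any struct type, even one with no methods, by inspecting it at runtime. The point type is made up for the example:

```go
package main

import (
	"fmt"
	"reflect"
)

// FieldNames is a hypothetical helper that lists the field names of a
// struct value using reflection. It returns nil for non-struct values.
func FieldNames(v interface{}) []string {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return nil
	}
	names := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		names = append(names, t.Field(i).Name)
	}
	return names
}

// point is a hypothetical struct with no methods.
type point struct {
	X, Y int
}

func main() {
	fmt.Println(FieldNames(point{1, 2})) // [X Y]
	fmt.Println(FieldNames(42) == nil)   // true
}
```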
Write *T, []T, and map[T1]T2 explicitly instead of letting T itself stand for a pointer, slice, or map type. Unlike a C++ template parameter, which is a placeholder that is replaced by the real type, a Go type parameter T is a type in its own right, and the operations allowed on it are determined by its constraint. Letting T stand for a pointer, slice, or map type therefore leads to surprising restrictions, as shown below:

func Set[T *int|*uint](ptr T) {
        *ptr = 1 // Compile error: invalid operation
}
func main() {
        i := 0
        Set(&i)
        fmt.Println(i)
}

The above code fails to compile with: invalid operation: pointers of ptr (variable of type T constrained by *int | *uint) must have identical base types. The reason is that ptr has type T, a type parameter, rather than one specific pointer type. Dereferencing a value of type T is only allowed when all pointer types in T's type set have the same base type; here they are *int and *uint, so *ptr has no single type. This can be solved by changing the definition to the following:

func Set[T int|uint](ptr *T) {
        *ptr = 1
}
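With the pointer written explicitly as *T, the same call compiles and works for both element types. The following is a complete, runnable version of that fix:

```go
package main

import "fmt"

// Set takes a pointer to T, so *ptr always has the single type T.
func Set[T int | uint](ptr *T) {
	*ptr = 1
}

func main() {
	i := 0
	Set(&i)
	fmt.Println(i) // 1

	var u uint
	Set(&u)
	fmt.Println(u) // 1
}
```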

Summary

Overall, the benefits of generics can be summarized in three aspects:

Types are determined during the compilation period, ensuring type safety. What is put in is what is taken out.
Readability is improved. The actual data type is explicitly known from the coding stage.
Generics merge the processing code for different types, improving code reuse and making programs more flexible. However, generics are not required for every piece of general-purpose code. It is still necessary to consider case by case whether generics are the right tool.

Leapcell: The Advanced Platform for Go Web Hosting, Async Tasks, and Redis

Finally, let me introduce Leapcell, the most suitable platform for deploying Go services.

1. Multi-Language Support

Develop with JavaScript, Python, Go, or Rust.

2. Deploy unlimited projects for free

Pay only for usage: no requests, no charges.

3. Unbeatable Cost Efficiency

Pay-as-you-go with no idle charges.
Example: $25 supports 6.94M requests at a 60ms average response time.

4. Streamlined Developer Experience

Intuitive UI for effortless setup.
Fully automated CI/CD pipelines and GitOps integration.
Real-time metrics and logging for actionable insights.

5. Effortless Scalability and High Performance

Auto-scaling to handle high concurrency with ease.
Zero operational overhead — just focus on building.

Explore more in the documentation!

Leapcell Twitter: https://x.com/LeapcellHQ
