Hugo Oliveira

Posted on Oct 20

Understanding Strings in Go: Bytes, Runes, and the Truth Behind len

#go #programming #tutorial #beginners

When you first start working with Go, strings might seem simple — until you try to count characters, index them, or work with emojis. Then you realize that Go treats strings in a way that’s both elegant and slightly tricky.

Let’s clear the confusion once and for all.

💡 What a string really is

In Go, a string is an immutable sequence of bytes, not a list of characters.

Internally, it’s represented roughly like this:

type stringStruct struct {
    Data *byte // pointer to the first byte of the read-only data
    Len  int   // length in bytes, not in characters
}

By convention those bytes hold UTF-8 encoded text, and Go source code itself is UTF-8, but a string can technically contain arbitrary bytes. In UTF-8, characters like á or 🚀 take more than one byte.
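One caveat you can check in code: a string is just bytes, so nothing forces it to be valid UTF-8. Here is a minimal sketch using utf8.ValidString from the standard library. The helper name describe is made up for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper for this example: it reports the byte
// length of s and whether those bytes form valid UTF-8.
func describe(s string) (int, bool) {
	return len(s), utf8.ValidString(s)
}

func main() {
	valid := "Olá"     // plain literal, stored as UTF-8
	broken := "Ol\xc3" // only the first byte of "á": not valid UTF-8

	n, ok := describe(valid)
	fmt.Println(n, ok) // 4 true

	n, ok = describe(broken)
	fmt.Println(n, ok) // 3 false
}
```

Both values are perfectly legal strings. Only the first one is well-formed text.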

📏 len() counts bytes, not characters

This one surprises a lot of newcomers. The len() function returns the number of bytes, not the number of visible characters.

s := "Olá"
fmt.Println(len(s)) // 4

Looks like three letters, right?
But “á” uses two bytes in UTF-8 (0xC3 0xA1).
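If what you actually want is the number of code points, the standard library has utf8.RuneCountInString. A quick sketch follows; runeCount is just a wrapper named for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeCount is a hypothetical wrapper, named only for this example.
// It counts Unicode code points without allocating a []rune.
func runeCount(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	s := "Olá"
	fmt.Println(len(s))       // 4 (bytes)
	fmt.Println(runeCount(s)) // 3 (code points)
}
```

Keep in mind this counts code points, not what a user sees as one character: an "e" followed by a combining accent (U+0301) counts as two.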

🧩 Accessing by index (s[i])

When you index a string like s[i], you get a single byte (uint8), not a character.

s := "Olá"
fmt.Println(s[2]) // 195 (0xC3)

Here you’re only seeing part of the letter “á”.
Strings in Go are sequences of bytes, not runes.

🔁 Iterating with for

There are two main ways to loop over strings in Go — and they behave very differently.

1. Byte by byte

s := "Olá"
for i := 0; i < len(s); i++ {
    fmt.Printf("%d: %x\n", i, s[i])
}

Output:

0: 4f
1: 6c
2: c3
3: a1

Each byte of “á” is printed separately (c3 and a1).

2. Rune by rune

s := "Olá"
for i, r := range s {
    fmt.Printf("%d: %c (%[2]U)\n", i, r)
}

Output:

0: O (U+004F)
1: l (U+006C)
2: á (U+00E1)

Now Go decodes the UTF-8 for you and yields one rune (a Unicode code point) per iteration, along with the byte offset where it starts.
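The output above hides one detail because á is the last character: the index is a byte offset, so it skips ahead after any multi-byte rune. Here is a small sketch to make that visible (runeOffsets is a made-up helper name):

```go
package main

import "fmt"

// runeOffsets is a hypothetical helper: it collects the byte offset at which
// each rune of s starts, exactly as `for range` reports them.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	// O and l take 1 byte each, á takes 2, 🚀 takes 4.
	fmt.Println(runeOffsets("Olá🚀!")) // [0 1 2 4 8]
}
```

So if you need a running character count inside the loop, keep your own counter instead of relying on i.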

⚙️ byte vs rune

Type	Size	Meaning	Example
`byte`	1 byte (alias for `uint8`)	A single raw byte, such as one byte of UTF-8 data	`'a' = 97`
`rune`	4 bytes (alias for `int32`)	A Unicode code point	`'á' = 225`, `'🚀' = 128640`

You can convert a string to a slice of either type:

s := "🚀"
b := []byte(s)
r := []rune(s)

fmt.Println(len(b)) // 4 bytes
fmt.Println(len(r)) // 1 rune
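Both conversions copy the data, which means you can modify the slice freely and turn it back into a string afterwards. As a rough illustration, here is one way to reverse a string by code points; reverseRunes is a name invented for this sketch.

```go
package main

import "fmt"

// reverseRunes is a hypothetical helper: it reverses s one code point at a
// time, so multi-byte characters are kept whole.
func reverseRunes(s string) string {
	r := []rune(s) // copy of s, decoded into code points
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r) // re-encoded as UTF-8
}

func main() {
	fmt.Println(reverseRunes("Olá 🚀")) // 🚀 álO
}
```

Doing the same swap on a []byte would scramble the bytes of á and 🚀. Note that this still works per code point, so text with combining marks would need extra care.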

🧠 Choosing the right type

Use case	Recommended type	Why
File IO or network data	`[]byte`	Performance and control
Counting or printing characters	`[]rune` or `for range`	Handles Unicode properly
Storing text	`string`	Simple, safe, and immutable
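To make the second row concrete, here is a sketch of cutting a string down to at most n code points with for range, without allocating a []rune. The function name truncateRunes and its behavior for n <= 0 are choices made for this example.

```go
package main

import "fmt"

// truncateRunes is a hypothetical helper: it returns the first n code points
// of s, slicing on a rune boundary so no character is cut in half.
func truncateRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s {
		if count == n {
			return s[:i] // i is the byte offset of rune number n+1
		}
		count++
	}
	return s // s has n runes or fewer
}

func main() {
	fmt.Println(truncateRunes("Olá🚀!", 3)) // Olá
	fmt.Println(truncateRunes("Olá", 10))  // Olá
}
```

A naive s[:3] on the same input would return "Ol" plus half of "á".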

🧭 Quick summary

Operation	Returns	Interprets as
`len(string)`	`int`	Number of bytes
`string[i]`	`uint8`	Byte value
`for range string`	`int`, `rune` (`int32`)	Byte offset and Unicode code point
`[]byte(string)`	Slice of bytes	UTF-8 encoded
`[]rune(string)`	Slice of runes	Unicode code points

💬 Final thoughts

Go treats strings as byte sequences for a reason: it keeps them fast, memory-safe, and predictable.
Once you understand the difference between bytes and runes, working with Unicode text in Go becomes straightforward.

Next time you work with text in Go, remember:
len() doesn’t count letters. It counts bytes.
And that tiny detail makes all the difference.
