When you first start working with Go, strings might seem simple — until you try to count characters, index them, or work with emojis. Then you realize that Go treats strings in a way that’s both elegant and slightly tricky.
Let’s clear the confusion once and for all.
💡 What a string really is
In Go, a string is an immutable sequence of bytes, not a list of characters.
Internally, it’s represented roughly like this:
type stringStruct struct {
	Data *byte // pointer to the first byte
	Len  int   // length in bytes, not characters
}
By convention, those bytes hold UTF-8-encoded text (Go source code is UTF-8), although a string can technically contain arbitrary bytes. This means that characters like á or 🚀 may use multiple bytes.
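To make the "immutable sequence of bytes" idea concrete, here is a minimal sketch. The helper name upperFirstASCII is hypothetical and only handles a lowercase ASCII first byte; it shows that changing a string means copying its bytes first.

```go
package main

import "fmt"

// upperFirstASCII is a hypothetical helper: it returns a copy of s with its
// first byte upper-cased, but only if that byte is a lowercase ASCII letter.
func upperFirstASCII(s string) string {
	if len(s) == 0 || s[0] < 'a' || s[0] > 'z' {
		return s
	}
	b := []byte(s) // copies the bytes; s itself is never modified
	b[0] -= 'a' - 'A'
	return string(b)
}

func main() {
	s := "olá"
	// s[0] = 'O' would not compile: strings are immutable.
	t := upperFirstASCII(s)
	fmt.Println(s, t)      // olá Olá
	fmt.Printf("% x\n", t) // 4f 6c c3 a1
}
```

The last line prints the raw bytes: "á" shows up as the two bytes c3 a1.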
📏 len() counts bytes, not characters
This one surprises a lot of newcomers. The len() function returns the number of bytes, not the number of visible characters.
s := "Olá"
fmt.Println(len(s)) // 4
Looks like three letters, right?
But “á” uses two bytes in UTF-8 (0xC3 0xA1).
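If you need the number of code points instead of bytes, the standard library's unicode/utf8 package can count them. A minimal sketch, where countRunes is just an illustrative wrapper name:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// countRunes returns the number of Unicode code points in s.
// It is a thin, illustrative wrapper around utf8.RuneCountInString.
func countRunes(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	for _, s := range []string{"Go", "Olá", "🚀"} {
		fmt.Printf("%s: %d bytes, %d runes\n", s, len(s), countRunes(s))
	}
}
```

Keep in mind that a rune is a code point, which is not always what a reader sees as one character: an "e" followed by a combining accent counts as two runes.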
🧩 Accessing by index (s[i])
When you index a string like s[i], you get a single byte (uint8), not a character.
s := "Olá"
fmt.Println(s[2]) // 195 (0xC3)
Here 195 (0xC3) is only the first of the two bytes that encode “á”, so you’re seeing just part of the letter.
Strings in Go are sequences of bytes, not runes.
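When you do want the whole character that starts at a given byte offset, one option is utf8.DecodeRuneInString. A small sketch, where runeAt is a hypothetical helper and i is assumed to point at the start of a character:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeAt is a hypothetical helper: it decodes the rune that starts at byte
// offset i of s and also returns how many bytes that rune occupies.
func runeAt(s string, i int) (rune, int) {
	return utf8.DecodeRuneInString(s[i:])
}

func main() {
	s := "Olá"
	r, size := runeAt(s, 2)
	fmt.Printf("%c uses %d bytes\n", r, size) // á uses 2 bytes
}
```

If i lands in the middle of a multi-byte character, the decoder reports utf8.RuneError with a size of 1 instead of a valid rune.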
🔁 Iterating with for
There are two main ways to loop over strings in Go — and they behave very differently.
1. Byte by byte
s := "Olá"
for i := 0; i < len(s); i++ {
	fmt.Printf("%d: %x\n", i, s[i])
}
Output:
0: 4f
1: 6c
2: c3
3: a1
Each byte of “á” is printed separately (c3 and a1).
2. Rune by rune
s := "Olá"
for i, r := range s {
	fmt.Printf("%d: %c (%[2]U)\n", i, r)
}
Output:
0: O (U+004F)
1: l (U+006C)
2: á (U+00E1)
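Now Go decodes the UTF-8 for you and yields one full code point (a rune) per iteration. Note that i is the byte offset where each rune starts, not a character count: if another character followed “á”, its index would be 4, not 3.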
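Two details of for range are easy to miss: the index is a byte offset, and bytes that are not valid UTF-8 come back as U+FFFD, the replacement character. A short sketch of both, where runeOffsets is an illustrative helper name:

```go
package main

import "fmt"

// runeOffsets is an illustrative helper: it returns the byte offset at which
// each rune of s starts, exactly as for range reports them.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	// The rocket takes 4 bytes, so the offset jumps from 1 to 5.
	fmt.Println(runeOffsets("a🚀b")) // [0 1 5]

	// \xff is not valid UTF-8; range yields U+FFFD and advances one byte.
	for i, r := range "a\xffb" {
		fmt.Printf("%d: %U\n", i, r)
	}
	// 0: U+0061
	// 1: U+FFFD
	// 2: U+0062
}
```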
⚙️ byte vs rune
Type | Size | Meaning | Example |
---|---|---|---|
byte | 1 byte (uint8) | Raw UTF-8 data | 'a' = 97 |
rune | 4 bytes (int32) | A Unicode code point | 'á' = 225, '🚀' = 128640 |
You can convert between them:
s := "🚀"
b := []byte(s)
r := []rune(s)
fmt.Println(len(b)) // 4 bytes
fmt.Println(len(r)) // 1 rune
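The conversion works in both directions, which is handy when you need to rearrange characters. As a sketch, this hypothetical reverseRunes helper reverses a string by code point so multi-byte characters stay intact:

```go
package main

import "fmt"

// reverseRunes is a hypothetical helper: it returns s with its code points
// in reverse order. Reversing the raw bytes instead would corrupt any
// multi-byte character.
func reverseRunes(s string) string {
	r := []rune(s) // decode UTF-8 into code points
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r) // encode back to UTF-8
}

func main() {
	fmt.Println(reverseRunes("Olá 🚀")) // 🚀 álO
}
```

Both conversions allocate and copy, so for a simple read-only pass over the characters, for range is usually the cheaper choice.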
🧠 Choosing the right type
Use case | Recommended type | Why |
---|---|---|
File IO or network data | []byte | Performance and control |
Counting or printing characters | []rune or for range | Handles Unicode properly |
Storing text | string | Simple, safe, and immutable |
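As one example of the "counting characters" row, here is a rough sketch of truncating text to a maximum number of characters with for range. The name truncateRunes is made up for illustration, and "character" here means code point.

```go
package main

import "fmt"

// truncateRunes is a hypothetical helper: it returns at most n runes of s
// and never cuts a multi-byte character in half, unlike s[:n].
func truncateRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s {
		if count == n {
			return s[:i] // i is the byte offset where rune number n+1 starts
		}
		count++
	}
	return s // s has n runes or fewer
}

func main() {
	fmt.Println(truncateRunes("Olá🚀", 3)) // Olá
}
```

Slicing with s[:3] instead would keep only the first byte of “á” and produce invalid UTF-8.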
🧭 Quick summary
Operation | Returns | Interprets as |
---|---|---|
len(string) | int | Number of bytes |
string[i] | uint8 | Byte value |
for range string | int32 | Unicode character |
[]byte(string) | Slice of bytes | UTF-8 encoded |
[]rune(string) | Slice of runes | Unicode code points |
💬 Final thoughts
Go treats strings as byte sequences for a reason. It keeps len() and indexing cheap and predictable, and immutability makes strings safe to share.
Once you understand the difference between bytes and runes, working with text in Go stops being surprising.
Next time you work with text in Go, remember:
len() doesn’t count letters. It counts bytes.
And that tiny detail makes all the difference.