unicode/utf8
Work with UTF-8-encoded strings at the rune level: counts, encoding, decoding, validation.
RuneCountInString vs len
len(s) returns the number of bytes, not runes. Use utf8.RuneCountInString when you need the rune count.
s := "héllo"
fmt.Println(len(s)) // 6 — bytes
fmt.Println(utf8.RuneCountInString(s)) // 5 — runes
6
5
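One place the byte/rune difference tends to matter is truncation: slicing by byte index can cut a multi-byte rune in half. The following is a minimal sketch, not a library function; truncateRunes is a hypothetical helper name introduced only for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateRunes is a hypothetical helper: it returns at most n runes of s
// without splitting a multi-byte rune.
func truncateRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	// Ranging over a string yields the byte offset where each rune starts.
	for i := range s {
		if count == n {
			return s[:i]
		}
		count++
	}
	return s // s has n runes or fewer
}

func main() {
	s := "héllo"
	fmt.Println(len(s), utf8.RuneCountInString(s)) // byte count, then rune count
	fmt.Println(utf8.ValidString(s[:2]))           // byte slicing cuts é in half
	fmt.Println(truncateRunes(s, 2))               // rune-aware truncation
}
```

Converting to []rune and slicing would also work, at the cost of allocating a copy of the whole string.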
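ValidString
utf8.ValidString reports whether a string consists entirely of valid UTF-8-encoded runes.
fmt.Println(utf8.ValidString("hello")) // true
fmt.Println(utf8.ValidString("a\xffb")) // false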
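Validation is often paired with repair. As a hedged sketch, the hypothetical helper sanitize below checks a string with utf8.ValidString and, if the check fails, uses strings.ToValidUTF8 from the standard library to replace each run of invalid bytes with U+FFFD. The choice of replacement is an assumption; an empty string would drop the bad bytes instead.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// sanitize is a hypothetical helper: it returns s unchanged when it is valid
// UTF-8, and otherwise replaces each run of invalid bytes with U+FFFD.
func sanitize(s string) string {
	if utf8.ValidString(s) {
		return s
	}
	return strings.ToValidUTF8(s, "\uFFFD")
}

func main() {
	fmt.Printf("%q\n", sanitize("hello"))  // already valid, returned as is
	fmt.Printf("%q\n", sanitize("a\xffb")) // the 0xff byte is replaced
}
```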
DecodeRuneInString
utf8.DecodeRuneInString returns the first rune in a string and its width in bytes. Use it for manual rune iteration; a range loop does this for you.
s := "héllo"
for i := 0; i < len(s); {
	r, w := utf8.DecodeRuneInString(s[i:])
	fmt.Printf("%c at %d (%d bytes)\n", r, i, w)
	i += w
}
h at 0 (1 bytes)
é at 1 (2 bytes)
l at 3 (1 bytes)
l at 4 (1 bytes)
o at 5 (1 bytes)
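Manual decoding is also a way to locate malformed input: for an invalid byte, utf8.DecodeRuneInString returns utf8.RuneError with a width of 1. The sketch below relies on that behavior; invalidOffsets is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// invalidOffsets is a hypothetical helper: it returns the byte offsets at
// which s contains bytes that are not part of a valid UTF-8 sequence.
func invalidOffsets(s string) []int {
	var offsets []int
	for i := 0; i < len(s); {
		r, w := utf8.DecodeRuneInString(s[i:])
		// RuneError with width 1 marks an invalid byte; a correctly
		// encoded U+FFFD in the input has width 3 and is not flagged.
		if r == utf8.RuneError && w == 1 {
			offsets = append(offsets, i)
		}
		i += w
	}
	return offsets
}

func main() {
	fmt.Println(invalidOffsets("a\xffb")) // one bad byte at offset 1
	fmt.Println(invalidOffsets("héllo"))  // no invalid bytes
}
```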
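EncodeRune
utf8.EncodeRune writes the UTF-8 encoding of a rune into a byte slice and returns the number of bytes written.
buf := make([]byte, 4) // 4 bytes (utf8.UTFMax) is enough for any rune
n := utf8.EncodeRune(buf, '🎉')
fmt.Printf("%d bytes: % x\n", n, buf[:n])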
4 bytes: f0 9f 8e 89
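When encoding several runes, two related functions may be more convenient than managing a fixed buffer: utf8.RuneLen reports how many bytes a rune needs, and utf8.AppendRune (available since Go 1.18) appends the encoding to a slice. The sketch below assumes Go 1.18 or later; encodeAll is a hypothetical helper.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodeAll is a hypothetical helper: it encodes a slice of runes into a
// new byte slice using utf8.AppendRune.
func encodeAll(runes []rune) []byte {
	// Reserve the worst case: utf8.UTFMax (4) bytes per rune.
	buf := make([]byte, 0, len(runes)*utf8.UTFMax)
	for _, r := range runes {
		buf = utf8.AppendRune(buf, r)
	}
	return buf
}

func main() {
	for _, r := range []rune{'h', 'é', '€', '🎉'} {
		fmt.Printf("%c needs %d bytes\n", r, utf8.RuneLen(r))
	}
	fmt.Printf("% x\n", encodeAll([]rune("hé")))
}
```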
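range over a string
A plain for-range loop over a string already decodes UTF-8, so the utf8 package is not needed. The index is the byte offset at which each rune starts, which is why it skips from 1 to 3 after the two-byte é.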
for i, r := range "héllo" {
	fmt.Printf("%d: %c\n", i, r)
}
0: h
1: é
3: l
4: l
5: o
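Two properties of the range loop are worth seeing side by side: the index advances by each rune's byte width, and an invalid byte is yielded as U+FFFD while the loop advances by one byte. The sketch below illustrates both; runeOffsets is a hypothetical helper name.

```go
package main

import "fmt"

// runeOffsets is a hypothetical helper: it returns the starting byte offset
// of every rune in s, as reported by a range loop.
func runeOffsets(s string) []int {
	offsets := make([]int, 0, len(s))
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	fmt.Println(runeOffsets("héllo")) // offsets skip over the second byte of é

	// The invalid byte 0xff is reported as U+FFFD and consumes one byte.
	for i, r := range "a\xffb" {
		fmt.Printf("%d: %U\n", i, r)
	}
}
```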