Concatenate strings in golang a quick benchmark : + or fmt.Sprintf ?

pmalhaire ・5 min read

Introduction

When I began working in golang code bases with developers of various seniority levels, I was surprised to see so many ways to join strings. Making onetwo out of one and two is a very common operation in any programming language.

I decided to do a quick benchmark. Coming from the C/C++ world, the result in golang surprised me (in a good way).

The origin

Let's do it the C way, using sprintf.

I use pointers here to simulate the fact that we are in a function.

#include <stdlib.h>
#include <stdio.h>

int main(){
    char str[] = "my_string";
    char *s = malloc(sizeof(str) + sizeof(str));
    sprintf(s, "my_string%s", str);
    printf("%s\n", s);
}

Let's go golang

Let's do it like in C. This was (at least when I began with golang) an instinctive way for me.

package main

import "fmt"

func main() {
    str := "my_string"
    fmt.Println(fmt.Sprintf("my_string%s", str))
}

Let's try another way: strings.Join

Discussing with others, I realized that strings.Join is quite popular.

package main

import (
    "fmt"
    "strings"
)

func main() {
    str := "my_string"
    fmt.Println(strings.Join([]string{"my_string", str}, ""))
}

Let's try the C++ way using strings.Builder

Having used string builders in C++ and Java, I thought I could give it a try.

package main

import (
    "fmt"
    "strings"
)

func main() {
    str := "my_string"
    var b strings.Builder
    b.WriteString("my_string")
    b.WriteString(str)
    // Print the accumulated string, as the other examples do.
    fmt.Println(b.String())
}
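
When the final length is known in advance, as it is here, the builder can reserve its buffer up front with Grow, much like the C version sizes its buffer before writing. This is a minimal sketch and was not part of the benchmark; the helper name concatBuilder is introduced here for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// concatBuilder is a hypothetical helper (not from the original post).
// It joins a and b through a strings.Builder whose buffer is reserved up front.
func concatBuilder(a, b string) string {
	var sb strings.Builder
	sb.Grow(len(a) + len(b)) // reserve room for both parts before writing
	sb.WriteString(a)
	sb.WriteString(b)
	return sb.String()
}

func main() {
	fmt.Println(concatBuilder("my_string", "my_string"))
}
```

Whether this beats a plain + for two short strings is something to measure rather than assume.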

Let's try the C++ way using +

Plus is kind of obvious but does not feel smart.

package main

import "fmt"

func main() {
    str := "my_string"
    fmt.Println("my_string" + str)
}

Bench tells us

Golang has tremendous test utilities. We can do a benchmark really easily.

$ go test -bench=. > bench.result
goos: linux
goarch: amd64

BenchmarkSprintf-4          10000000           156 ns/op
BenchmarkLongSprintf-4      10000000           186 ns/op
BenchmarkConstSprintf-4     10000000           158 ns/op

BenchmarkJoin-4             20000000            68.8 ns/op
BenchmarkLongJoin-4         20000000            86.9 ns/op
BenchmarkConstJoin-4        20000000            66.1 ns/op

BenchmarkBuilder-4          20000000           104 ns/op
BenchmarkLongBuilder-4      20000000           101 ns/op
BenchmarkConstBuilder-4     20000000           102 ns/op

BenchmarkPlus-4             50000000            25.9 ns/op
BenchmarkLongPlus-4         20000000            74.4 ns/op
BenchmarkConstPlus-4        2000000000           0.39 ns/op
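
The timings alone do not show how much memory each approach allocates. As a rough sketch, the testing package can also run a benchmark from a regular program through testing.Benchmark and report allocations per operation. The names benchPlus, benchSprintf and sink are assumptions made for this example. Storing the result in a package-level sink also differs from the benchmark code in this post, which discards the result, so the numbers will not match the ones above.

```go
package main

import (
	"fmt"
	"testing"
)

var str = "my_string"

// sink keeps each result alive so the work cannot be discarded.
// It is an assumption of this sketch, not part of the original benchmark.
var sink string

func benchPlus(b *testing.B) {
	b.ReportAllocs()
	for n := 0; n < b.N; n++ {
		sink = "my_string" + str
	}
}

func benchSprintf(b *testing.B) {
	b.ReportAllocs()
	for n := 0; n < b.N; n++ {
		sink = fmt.Sprintf("my_string%s", str)
	}
}

func main() {
	cases := []struct {
		name string
		fn   func(*testing.B)
	}{
		{"plus", benchPlus},
		{"sprintf", benchSprintf},
	}
	for _, c := range cases {
		// testing.Benchmark picks b.N itself and returns the measurements.
		r := testing.Benchmark(c.fn)
		fmt.Printf("%-8s %6d ns/op %3d allocs/op\n", c.name, r.NsPerOp(), r.AllocsPerOp())
	}
}
```

With go test, the -benchmem flag prints the same allocation columns without any code change.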

Conclusion

Surprisingly (at least for me) a simple + is the fastest option, and fmt.Sprintf the slowest. Here Go makes the simple way the most powerful, which rocks.
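
The 0.39 ns/op figure for BenchmarkConstPlus is most likely explained by constant folding: when both operands are constants, Go treats the concatenation as a constant expression and evaluates it at compile time, so nothing is left to do at run time. A small sketch of that behaviour (the name joined is introduced here for illustration):

```go
package main

import "fmt"

const cStr = "my_string"

// joined is itself a constant: this declaration only compiles because
// "my_string" + cStr is a constant expression evaluated by the compiler.
const joined = "my_string" + cStr

func main() {
	fmt.Println(joined, len(joined)) // prints "my_stringmy_string 18"
}
```

Replacing cStr with a variable makes the const declaration fail to compile, which shows that the variable case has to be handled at run time.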

Behind the curtain

While writing this post I noticed that the binary sizes for golang and C are quite different:

  • golang: 1.9M
  • c: 8.3K

It's not surprising, as the C binary does not embed a runtime the way the Go binary does, but it made me curious.

To get an overview of what makes golang and C programs different, I looked at the system calls they generate using strace. See this interesting post on strace.

Note: as pointed out in the comments, this is not related to string concatenation, but it deserves a future post.

C version

#include <stdlib.h>
#include <stdio.h>

int main(){
    char str[] = "my_string";
    char *s = malloc(sizeof(str) + sizeof(str));
    sprintf(s, "my_string%s", str);
    printf("%s\n", s);
}

Let's have a look at strace.

$ gcc -o sprintf sprintf.c
$ strace -c ./sprintf
my_stringmy_string
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 24.84    0.000305          23        13           mmap
 23.05    0.000283          28        10           mprotect
 10.18    0.000125          25         5           openat
  8.63    0.000106         106         1           munmap
  8.14    0.000100          17         6         6 access
  4.48    0.000055          14         4           read
  4.15    0.000051           9         6           fstat
  3.83    0.000047          47         1           write
  3.34    0.000041          41         1           arch_prctl
  2.44    0.000030           6         5           close
  2.12    0.000026           9         3           brk
  1.47    0.000018          18         1           execve
  1.14    0.000014           7         2           rt_sigaction
  0.65    0.000008           8         1           prlimit64
  0.57    0.000007           7         1           set_tid_address
  0.49    0.000006           6         1           rt_sigprocmask
  0.49    0.000006           6         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.001228                    62         6 total

Golang version

package main

import "fmt"

func main() {
    str := "my_string"
    fmt.Println("my_string" + str)
}

Let's have a look at strace.

$ go build plus.go
$ strace -c ./plus
my_stringmy_string
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 69.80    0.001900          17       114           rt_sigaction
  8.74    0.000238          79         3           clone
  5.95    0.000162          20         8           rt_sigprocmask
  4.70    0.000128          32         4           futex
  4.41    0.000120         120         1           readlinkat
  2.24    0.000061           8         8           mmap
  1.84    0.000050          50         1           write
  1.69    0.000046          15         3           fcntl
  0.62    0.000017          17         1           gettid
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         2           sigaltstack
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         1           sched_getaffinity
------ ----------- ----------- --------- --------- ----------------
100.00    0.002722                   148           total

What strace tells us

Go makes more system calls than C does (148 versus 62), which seems reasonable given that it ships a runtime with a gc. Most of the difference comes from the 114 rt_sigaction calls.

To understand more, I invite you to disassemble the resulting binaries.

We must also look at the default compilation parameters for C and golang to make a proper comparison.

Code reference

Here is the code used to benchmark the different options.

package main_test

import (
    "fmt"
    "strings"
    "testing"
)

var str, longStr string = "my_string", `qwertyuiopqwertyuiopqwertyuio
qwertyuiopqwertyuiopqwertyuiopqwertyuiopqwertyuiopqwertyuiopqwertyuiop`

const cStr = "my_string"

func BenchmarkPlus(b *testing.B) {
    for n := 0; n < b.N; n++ {
        _ = "my_string" + str
    }
}

func BenchmarkLongPlus(b *testing.B) {
    for n := 0; n < b.N; n++ {
        _ = "my_string" + longStr
    }
}

func BenchmarkConstPlus(b *testing.B) {
    for n := 0; n < b.N; n++ {
        _ = "my_string" + cStr
    }
}

func BenchmarkJoin(b *testing.B) {
    for n := 0; n < b.N; n++ {
        // No "%s" here: Join does no formatting, it only concatenates.
        _ = strings.Join([]string{"my_string", str}, "")
    }
}

func BenchmarkLongJoin(b *testing.B) {
    for n := 0; n < b.N; n++ {
        _ = strings.Join([]string{"my_string", longStr}, "")
    }
}

func BenchmarkConstJoin(b *testing.B) {
    for n := 0; n < b.N; n++ {
        _ = strings.Join([]string{"my_string", cStr}, "")
    }
}

func BenchmarkSprintf(b *testing.B) {
    for n := 0; n < b.N; n++ {
        _ = fmt.Sprintf("my_string%s", str)
    }
}

func BenchmarkLongSprintf(b *testing.B) {
    for n := 0; n < b.N; n++ {
        _ = fmt.Sprintf("my_string%s", longStr)
    }
}

func BenchmarkConstSprintf(b *testing.B) {
    for n := 0; n < b.N; n++ {
        _ = fmt.Sprintf("my_string%s", cStr)
    }
}

func BenchmarkBuilder(b *testing.B) {
    for n := 0; n < b.N; n++ {
        // Named sb so it does not shadow the *testing.B parameter.
        var sb strings.Builder
        sb.WriteString("my_string")
        sb.WriteString(str)
        _ = sb.String()
    }
}

func BenchmarkLongBuilder(b *testing.B) {
    for n := 0; n < b.N; n++ {
        var sb strings.Builder
        sb.WriteString("my_string")
        sb.WriteString(longStr)
        _ = sb.String()
    }
}

func BenchmarkConstBuilder(b *testing.B) {
    for n := 0; n < b.N; n++ {
        var sb strings.Builder
        sb.WriteString("my_string")
        sb.WriteString(cStr)
        _ = sb.String()
    }
}
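
Before comparing timings, it can help to check that every variant really produces the same string. The sketch below does that with four small helpers; the names viaPlus, viaSprintf, viaJoin and viaBuilder are hypothetical and only serve this example.

```go
package main

import (
	"fmt"
	"strings"
)

// The helpers below are hypothetical wrappers around the four approaches.
func viaPlus(a, b string) string    { return a + b }
func viaSprintf(a, b string) string { return fmt.Sprintf("%s%s", a, b) }
func viaJoin(a, b string) string    { return strings.Join([]string{a, b}, "") }

func viaBuilder(a, b string) string {
	var sb strings.Builder
	sb.WriteString(a)
	sb.WriteString(b)
	return sb.String()
}

func main() {
	want := "my_stringmy_string"
	variants := []struct {
		name string
		fn   func(a, b string) string
	}{
		{"plus", viaPlus},
		{"sprintf", viaSprintf},
		{"join", viaJoin},
		{"builder", viaBuilder},
	}
	for _, v := range variants {
		got := v.fn("my_string", "my_string")
		// Report whether each variant matches the expected result.
		fmt.Printf("%-8s ok=%v\n", v.name, got == want)
	}
}
```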

Discussion

I tried to do this micro optimization a few months ago and kinda failed.

The recommended way is to use strings.Builder. As I did not know the string size, a simple + worked better in benchmarks (at least for strings of less than ~20 characters).

I ended up approximating the result size (and preallocating memory with a buffer) and got the best result, but most of the time + is the best choice.

 

Thanks for your comment, I'll update my post accordingly. Note that the C version preallocates the buffer.

 

On strace: string concatenation isn't a system call. The strace for these programs should be the same as the strace for hello world.

 

I didn't mean to say that. What made you think so? Maybe I can make my post clearer with your help.

 

The article is about string concatenation. Why look at strace at all?

It's to explain why C is more efficient than Go, which is not explicitly explained.

I don't agree that a syscall count has anything to do with a language's efficiency compared to another language.

I'll make it more clear.