There was a discussion on Stack Overflow involving that question, so I genuinely asked myself: is it really that simple?
They had this piece of sample code:
type MyStruct struct {
	F1, F2, F3, F4, F5, F6, F7 string
	I1, I2, I3, I4, I5, I6, I7 int64
}

func BenchmarkAppendingStructs(b *testing.B) {
	var s []MyStruct
	for i := 0; i < b.N; i++ {
		s = append(s, MyStruct{})
	}
}

func BenchmarkAppendingPointers(b *testing.B) {
	var s []*MyStruct
	for i := 0; i < b.N; i++ {
		s = append(s, &MyStruct{})
	}
}
Getting these results:
BenchmarkAppendingStructs 1000000 3528 ns/op
BenchmarkAppendingPointers 5000000 246 ns/op
But the question here is certainly not about the number of allocations one makes; what matters is how much memory actually gets allocated.
I used the same code, but with different benchmark options:
$ go test -bench=. -benchmem
And eventually got these results:
BenchmarkAppendingStructs-12 3168454 439.8 ns/op 996 B/op 0 allocs/op
BenchmarkAppendingPointers-12 14701717 97.50 ns/op 221 B/op 1 allocs/op
Which can basically be interpreted as: the case with []T involves roughly 4x more memory per operation than the case with []*T.
So what’s the actual answer? The conversation in Golang Group simply states this:
On the flip side, []*T will be less cache/TLB/GC friendly than []T, unless T is much larger than a ptr. You are also allocating an extra pointer per element. For small slices it probably doesn’t matter much either way, but when slices hold millions of objects… Elements of []T are less likely to be mangled from random places compared to []*T, as in the latter case you may have access through other ptrs. IO and comm will also be easier with []T. []*T makes sense where there is genuine need to share, e.g. put an object on multiple lists. Otherwise it can often just be premature optimization.
The reasoning is that one pointer weighs 4 or 8 bytes depending on the architecture (32-bit vs 64-bit). And let us assume our structs are heavy as hell, consisting of at least 10 string fields. A string header by itself weighs 16 bytes (on 64-bit: a pointer plus a length), and you have to count the string's contents as well (let us assume the average string length is 10), thus:
len(s) + 16 -> 10 + 16 = 26 bytes per string
10 string fields -> 260 bytes per struct
10 elements in the slice -> 2600 bytes (~2.6KiB)
So in the case of []T you’d end up copying ~2.6KiB, versus copying 10*8 = 80 bytes of pointers in the case of []*T.
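The back-of-envelope arithmetic above can be sketched as a tiny program (the constants are the article's assumptions, not measured values):

```go
package main

import "fmt"

func main() {
	const (
		headerSize = 16 // string header on 64-bit: pointer + length
		avgStrLen  = 10 // assumed average string length
		numFields  = 10 // string fields per struct
		numElems   = 10 // elements in the slice
		ptrSize    = 8  // pointer size on 64-bit
	)

	perString := headerSize + avgStrLen // bytes per string field
	perStruct := perString * numFields  // bytes per struct
	total := perStruct * numElems       // bytes for the whole slice

	fmt.Println(perString, perStruct, total) // 26 260 2600
	fmt.Println(ptrSize * numElems)          // 80 bytes of pointers for []*T
}
```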
That’s all folks!
tldr; If your struct is heavy, it’s better to use []*T; otherwise []T is the answer.