$ gotip version
go version devel +a62071a209 Sat Jan 6 04:52:00 2018 +0000 linux/amd64
type T struct {
	s1, s2 string
}

//go:noinline
func foo(t T) { _ = t }

func bar() {
	var t T
	foo(t)
}
generates
0x0020 00032 (test.go:14) MOVUPS X0, (SP)
0x0024 00036 (test.go:14) MOVUPS X0, 16(SP)
0x0029 00041 (test.go:14) CALL "".foo(SB)
but when T has one more field,
type T struct {
	s1, s2, s3 string // one more string
}
it generates
0x001d 00029 (test.go:13) XORPS X0, X0
0x0020 00032 (test.go:13) MOVUPS X0, "".t+48(SP)
0x0025 00037 (test.go:13) MOVUPS X0, "".t+64(SP)
0x002a 00042 (test.go:13) MOVUPS X0, "".t+80(SP)
0x002f 00047 (test.go:13) MOVQ SP, DI
0x0032 00050 (test.go:14) LEAQ "".t+48(SP), SI
0x0037 00055 (test.go:14) DUFFCOPY $854
0x004a 00074 (test.go:14) CALL "".foo(SB)
The stack frame is bigger; first we MOVUPS a bunch of zeros to 48/64/80(SP), then we call DUFFCOPY to move them again to (SP). This seems wasteful. Even if we cross the multiple-MOVs/DUFF threshold, it seems it would be possible to just DUFFZERO at (SP), which is essentially what the first snippet does.
This also happens when there's no zeroing going on. For example, for struct { a, b, c, d int64 }, when initialized as t = {1, 2, 3, 4}, the values are moved directly to (SP), but for struct { a, b, c, d, e int64 }, which is bigger than 32 bytes, they aren't. There are 5 moves high into the stack, and then a DUFFCOPY call moves them to (SP).