Skip to content

cmd/compile/internal/pgo: error parsing profile (for pgo) after scaling #73640

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
matthinrichsen-wf opened this issue May 8, 2025 · 10 comments
Labels
BugReport Issues describing a possible bug in the Go implementation. compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@matthinrichsen-wf
Copy link

Go version

go version go1.24.2 darwin/arm64

Output of go env in your module/workspace:

AR='ar'
CC='cc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='c++'
GCCGO='gccgo'
GO111MODULE=''
GOARCH='arm64'
GOARM64='v8.0'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/Users/gopher/Library/Caches/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/Users/gopher/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/4x/zvfqpl3d4zx4f7jq76sy_m6r0000gs/T/go-build2991403458=/tmp/go-build -gno-record-gcc-switches -fno-common'
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMOD=''
GOMODCACHE='/Users/gopher/workspaces/wf/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/gopher/workspaces/wf/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/opt/homebrew/Cellar/go/1.24.2/libexec'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/Users/gopher/Library/Application Support/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/opt/homebrew/Cellar/go/1.24.2/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.24.2'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?

I have a profile which I'd like to use for PGO, however it is exceptionally large. I scaled the profile to reduce the size. However after scaling, the profile cannot be used for PGO. I've included an example scaler:

package main

import (
	"bytes"
	"os"
	"time"

	"github.com/google/pprof/profile"
)

func main() {
	b, err := os.ReadFile(`profile.pb`)
	if err != nil {
		panic(err)
	}
	p, err := profile.ParseData(b)
	if err != nil {
		panic(err)
	}

	// scale the profile to reduce the size
	p.Scale(1 / (float64(time.Millisecond)))
	p.Scale(float64(time.Millisecond))

	buf := &bytes.Buffer{}
	err = p.Write(buf)
	if err != nil {
		panic(err)
	}
	os.WriteFile(`scaled.pb`, buf.Bytes(), 0644)
}

I then attempted to use PGO with scaled profile
go build -pgo scaled.pb main.go

What did you see happen?

Received the following error:

preprofile: error parsing profile: profile missing Function.start_line data (Go version of profiled application too old? Go 1.20+ automatically adds this to profiles)

The original profile can be used with pgo, as expected:
go build -pgo profile.pb main.go

What did you expect to see?

I expected both the scaled and non-scaled profiles to be usable for PGO.

@gabyhelp gabyhelp added the BugReport Issues describing a possible bug in the Go implementation. label May 8, 2025
@cherrymui cherrymui changed the title compile/internal/pgo: error parsing profile (for pgo) after scaling cmd/compile/internal/pgo: error parsing profile (for pgo) after scaling May 9, 2025
@cherrymui
Copy link
Member

Hmmm, apparently the github.com/google/pprof/profile package doesn't preserve the start_line field? It probably should, as it is in the proto definition.

cc @prattmic

@cherrymui cherrymui added NeedsFix The path to resolution is known, but the work has not been done. compiler/runtime Issues related to the Go compiler and/or runtime. labels May 9, 2025
@cherrymui cherrymui added this to the Backlog milestone May 9, 2025
@prattmic
Copy link
Member

prattmic commented May 9, 2025

I think this is just a bad error message. I suspect what actually happened is that your profile still has samples, but every sample has value.

For example, here is what I see when running your program on a profile I collected from a simple spinning program.

Before:

$ pprof -raw profile.pb
PeriodType: cpu nanoseconds
Period: 10000000
Time: 2025-05-09 11:33:34.982942248 -0400 EDT
Duration: 4.81
Samples:
samples/count cpu/nanoseconds
        462 4620000000: 1 2 
          1   10000000: 3 4 5 6 7 8 
Locations
     1: 0x4b5324 M=1 main.spin /tmp/prof/main.go:11:0 s=10
             main.main /tmp/prof/main.go:22:0 s=16
     2: 0x43b12a M=1 runtime.main /usr/lib/google-golang/src/runtime/proc.go:283:0 s=147
     3: 0x449026 M=1 runtime.mput /usr/lib/google-golang/src/runtime/proc.go:6576:0 s=6570
     4: 0x43ff06 M=1 runtime.stopm /usr/lib/google-golang/src/runtime/proc.go:2916:0 s=2902
     5: 0x4419fb M=1 runtime.findRunnable /usr/lib/google-golang/src/runtime/proc.go:3655:0 s=3282
     6: 0x442af0 M=1 runtime.schedule /usr/lib/google-golang/src/runtime/proc.go:4028:0 s=3999
     7: 0x442fa4 M=1 runtime.park_m /usr/lib/google-golang/src/runtime/proc.go:4152:0 s=4101
     8: 0x46ef6d M=1 runtime.mcall /usr/lib/google-golang/src/runtime/asm_amd64.s:459:0 s=429
Mappings
1: 0x400000/0x4b6000/0x0 /tmp/prof/example.com a129a9fa53e87d8f7c56925d3bc0f9d8f6868cd5 [FN]

After:

$ pprof -raw scaled.pb 
PeriodType: cpu nanoseconds
Period: 10000000
Time: 2025-05-09 11:33:34.982942248 -0400 EDT
Duration: 4.81
Samples:
samples/count cpu/nanoseconds
          0 4620000000: 1 2 
          0   10000000: 3 4 5 6 7 8 
Locations
     1: 0x4b5324 M=1 main.spin /tmp/prof/main.go:11:0 s=10
             main.main /tmp/prof/main.go:22:0 s=16
     2: 0x43b12a M=1 runtime.main /usr/lib/google-golang/src/runtime/proc.go:283:0 s=147
     3: 0x449026 M=1 runtime.mput /usr/lib/google-golang/src/runtime/proc.go:6576:0 s=6570
     4: 0x43ff06 M=1 runtime.stopm /usr/lib/google-golang/src/runtime/proc.go:2916:0 s=2902
     5: 0x4419fb M=1 runtime.findRunnable /usr/lib/google-golang/src/runtime/proc.go:3655:0 s=3282
     6: 0x442af0 M=1 runtime.schedule /usr/lib/google-golang/src/runtime/proc.go:4028:0 s=3999
     7: 0x442fa4 M=1 runtime.park_m /usr/lib/google-golang/src/runtime/proc.go:4152:0 s=4101
     8: 0x46ef6d M=1 runtime.mcall /usr/lib/google-golang/src/runtime/asm_amd64.s:459:0 s=429
Mappings
1: 0x400000/0x4b6000/0x0 /tmp/prof/example.com a129a9fa53e87d8f7c56925d3bc0f9d8f6868cd5 [FN]

Two things to notice here.

  1. The function start lines are still present (that's the s=NUMBER on the Locations).
  2. All of the Samples have a samples/count value of zero. This means the samples/count part of the profile is effectively empty.

What happens in PGO parsing:

  1. PGO can actually use either samples/count or cpu/nanoseconds. We just pick the first one we find, which is samples/count.
  2. Normally we bail out if the profile contains no Samples at all, but this one does contain samples.
  3. We build a call graph from all of the samples/count samples. Samples with value 0 are treated as not present, so this graph is empty.
  4. We process this graph and complain that we never saw a start line (because we didn't have any nodes to look at!).

We should probably bail out if (3) creates an empty graph, the same way we bail out if the entire profile is empty. Though it might be nice to warn users, or even fall back to using cpu/nanoseconds, which would have worked fine.


@matthinrichsen-wf can you confirm that go tool pprof -raw scaled.pb has 0 for samples/count for every single sample? If not, I think I found a different bug.

I think the problem with this scaling approach is that the cpu/nanoseconds values are very large, so the 1/1000 scaling drives the samples/counts to 0, but the cpu/nanoseconds are still large.

I think this approach would work if you stripped out the cpu/nanoseconds sample type entirely first.

@matthinrichsen-wf
Copy link
Author

@prattmic Yep! All the samples/count are 0.

I don't think I quite understand how to strip out cpu/nanoseconds samples with the given interface. Is there some documentation on how to do this?

@prattmic
Copy link
Member

prattmic commented May 9, 2025

I don't think there is any nice helper function. I think it would work fine to simply remove the cpu/nanoseconds entry from Profile.SampleType (e.g., in my profile that would just be p.SampleType = p.SampleType[:1]), and similarly drop the same index from every Sample.Value in Profile.Sample.

@matthinrichsen-wf
Copy link
Author

Gotcha! Like this?

package main

import (
	"bytes"
	"os"
	"time"

	"github.com/google/pprof/profile"
)

func removeI[T any](in []T, index int) []T {
	if index < 0 || index >= len(in) {
		return in
	}
	out := make([]T, 0, len(in)-1)
	out = append(out, in[:index]...)
	out = append(out, in[index+1:]...)
	return out
}

func main() {
	b, err := os.ReadFile(`profile.pb`)
	if err != nil {
		panic(err)
	}
	p, err := profile.ParseData(b)
	if err != nil {
		panic(err)
	}

	sampleTypes := make([]*profile.ValueType, 0, len(p.SampleType))
	for i, st := range p.SampleType {
		if st.Type == `cpu` && st.Unit == `nanoseconds` {
			for _, s := range p.Sample {
				s.Value = removeI(s.Value, i)
			}
			continue
		}
		sampleTypes = append(sampleTypes, st)

	}
	p.SampleType = sampleTypes

	// scale the profile to reduce the size
	p.Scale(1 / (float64(time.Millisecond)))
	p.Scale(float64(time.Millisecond))

	buf := &bytes.Buffer{}
	err = p.Write(buf)
	if err != nil {
		panic(err)
	}
	os.WriteFile(`scaled.pb`, buf.Bytes(), 0644)
}

I've used it in go build -pgo scaled.pb and it seems to work

@matthinrichsen-wf
Copy link
Author

matthinrichsen-wf commented May 9, 2025

I now notice that when I use go tool pprof -http=: scaled.pb that it shows an empty profile

@matthinrichsen-wf
Copy link
Author

I also notice that samples/count is also empty from the raw output:

PeriodType: cpu nanoseconds
Period: 10000000
Time: 2025-05-07 00:40:10.243376642 -0500 CDT
Duration: 10m1
Samples:
samples/count
Locations
     1: 0x5dc1d0 M=1 runtime.memclrNoHeapPointers /usr/local/go/src/runtime/memclr_amd64.s:193:0 s=15
     2: 0x573695 M=1 runtime.mallocgcSmallNoscan /usr/local/go/src/runtime/malloc.go:1287:0 s=1254
     3: 0x5d0998 M=1 runtime.mallocgc /usr/local/go/src/runtime/malloc.go:1055:0 s=1010
     4: 0x5d56a8 M=1 runtime.makeslice /usr/local/go/src/runtime/slice.go:116:0 s=101
     5: 0x5a9648 M=1 runtime.saveAncestors /usr/local/go/src/runtime/proc.go:5163:0 s=5145
     6: 0x5a916d M=1 runtime.newproc1 /usr/local/go/src/runtime/proc.go:5077:0 s=5035
     7: 0x5a8fe4 M=1 runtime.newproc.func1 /usr/local/go/src/runtime/proc.go:5021:0 s=5020
     8: 0x5d94e9 M=1 runtime.systemstack /usr/local/go/src/runtime/asm_amd64.s:514:0 s=483
     9: 0x5a8f7e M=1 runtime.newproc /usr/local/go/src/runtime/proc.go:5020:0 s=5017

I'm curious if that's to be expected and if it will impact the ability of pgo to optimize well?

@prattmic
Copy link
Member

prattmic commented May 9, 2025

That profile has no samples. With no samples PGO will completely ignore it and do nothing. I suspect you just scaled away every sample (were there any samples with more than 1000 count?). Try a less extreme scaling.

@matthinrichsen-wf
Copy link
Author

@prattmic Looking at the original, no there we no counts above 1,000. I did find there's a scaling method that seems to work:

package main

import (
	"bytes"
	"os"
	"time"

	"github.com/google/pprof/profile"
)

func scaleCPUSamples(p *profile.Profile, ratio float64) {
	ratios := make([]float64, len(p.SampleType))
	for i, st := range p.SampleType {
		if st.Type == `cpu` && st.Unit == `nanoseconds` {
			ratios[i] = ratio
		} else {
			ratios[i] = 1
		}
	}
	p.ScaleN(ratios)
}

func main() {
	b, err := os.ReadFile(`profile.pb`)
	if err != nil {
		panic(err)
	}
	p, err := profile.ParseData(b)
	if err != nil {
		panic(err)
	}

	// scale the profile to reduce the size
	scaleCPUSamples(p, 1/(float64(time.Millisecond)))
	scaleCPUSamples(p, float64(time.Millisecond))

	buf := &bytes.Buffer{}
	err = p.Write(buf)
	if err != nil {
		panic(err)
	}
	os.WriteFile(`scaled.pb`, buf.Bytes(), 0644)
}

^ this reduces profile size and can still be used by pgo and seems to solve the problem. For context: the original profile is >100MB.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BugReport Issues describing a possible bug in the Go implementation. compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

4 participants