Description
Go version
go version go1.23.1 darwin/amd64
Output of go env
in your module/workspace:
go env
GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/Users/user/Library/Caches/go-build'
GOENV='/Users/user/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/user/go/pkg/mod'
GONOPROXY='github.com/vsivsi'
GONOSUMDB='github.com/vsivsi'
GOOS='darwin'
GOPATH='/Users/user/go'
GOPRIVATE='github.com/vsivsi'
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/Cellar/go/1.23.1/libexec'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='local'
GOTOOLDIR='/usr/local/Cellar/go/1.23.1/libexec/pkg/tool/darwin_amd64'
GOVCS=''
GOVERSION='go1.23.1'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/Users/user/Library/Application Support/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='cc'
CXX='c++'
CGO_ENABLED='1'
GOMOD='/Users/user/wordcounter/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/h5/p_k0bq5d55g3r_v_sz5ghz5h0000gn/T/go-build693439929=/tmp/go-build -gno-record-gcc-switches -fno-common'
What did you do?
I modified an existing "pull"-style io.Reader loop to use a Go 1.23 iterator instead, with no other changes to the logic in the loop body.
So this:
for {
	n, err := wordReader.Read(buf)
	if err == io.EOF {
		break
	} else if err != nil {
		panic(err)
	}
	word := string(buf[:n])
	// rest of loop logic
}
Became this:
for word := range wordReader.All() {
	// rest of loop logic (same as case above)
}
where the new wordReader.All() iterator implementation is based on wordReader.Read():
// All returns an iterator allowing the caller to iterate over the WordReader using for/range.
func (wr *WordReader) All() iter.Seq[string] {
	word := make([]byte, 1024)
	return func(yield func(string) bool) {
		var err error
		var n int
		for n, err = wr.Read(word); err == nil; n, err = wr.Read(word) {
			if !yield(string(word[:n])) {
				return
			}
		}
		if err != io.EOF {
			fmt.Fprintf(os.Stderr, "error reading word: %v\n", err)
		}
	}
}
A full reproduction including both the original "pull" and new "iterator" versions of the loop is available here:
https://go.dev/play/p/ENULYowKCLS
What did you see happen?
Executing with the new range/iterator loop does not terminate. In the Go Playground link above, this manifests as the original "pull"-style loop completing immediately (in EstimateUniqueWordsPull()), after which the program times out in the playground while running EstimateUniqueWordsIter().
If run locally as a CLI, the program does not terminate.
Note: frustratingly, there are many cases that rescue this non-termination behavior. Running under a debug build or using the race detector works as expected, as do even seemingly trivial changes to the code, e.g. in the Playground repro:
// fmt.Println("Rounds", rounds) // ######### Uncomment line to rescue
Simply adding a single Println restores the expected behavior. This feels very racy to me, but the reproducing code starts no goroutines, so this is a pretty clear indication that something bad is happening out of sight.
What did you expect to see?
I expect the two loops documented above, as implemented in EstimateUniqueWordsPull() and EstimateUniqueWordsIter(), to behave essentially identically: each should run nearly instantaneously on a small test input, and the program should return a result for each and terminate properly.
Note that in the playground reproduction it is normal for either of the above functions to return values that result in either the "PASS" case or the "expected X, got Y" mismatch. The code implements a probabilistic counting estimation algorithm that uses randomness, as described here:
https://www.quantamagazine.org/computer-scientists-invent-an-efficient-new-way-to-count-20240516/