Commit a4b2c5e

cmd/go: work around occasional ETXTBSY running cgo
Fixes #3001. (This time for sure!)

R=golang-dev, r, fullung
CC=golang-dev
https://golang.org/cl/5845044
1 parent 11cc5a2 commit a4b2c5e

1 file changed: +61 -8 lines changed

src/cmd/go/build.go

Lines changed: 61 additions & 8 deletions
@@ -21,6 +21,7 @@ import (
 	"runtime"
 	"strings"
 	"sync"
+	"time"
 )
 
 var cmdBuild = &Command{
@@ -1047,14 +1048,66 @@ func (b *builder) runOut(dir string, desc string, cmdargs ...interface{}) ([]byt
 		}
 	}
 
-	var buf bytes.Buffer
-	cmd := exec.Command(cmdline[0], cmdline[1:]...)
-	cmd.Stdout = &buf
-	cmd.Stderr = &buf
-	cmd.Dir = dir
-	// TODO: cmd.Env
-	err := cmd.Run()
-	return buf.Bytes(), err
+	nbusy := 0
+	for {
+		var buf bytes.Buffer
+		cmd := exec.Command(cmdline[0], cmdline[1:]...)
+		cmd.Stdout = &buf
+		cmd.Stderr = &buf
+		cmd.Dir = dir
+		// TODO: cmd.Env
+		err := cmd.Run()
+
+		// cmd.Run will fail on Unix if some other process has the binary
+		// we want to run open for writing. This can happen here because
+		// we build and install the cgo command and then run it.
+		// If another command was kicked off while we were writing the
+		// cgo binary, the child process for that command may be holding
+		// a reference to the fd, keeping us from running exec.
+		//
+		// But, you might reasonably wonder, how can this happen?
+		// The cgo fd, like all our fds, is close-on-exec, so that we need
+		// not worry about other processes inheriting the fd accidentally.
+		// The answer is that running a command is fork and exec.
+		// A child forked while the cgo fd is open inherits that fd.
+		// Until the child has called exec, it holds the fd open and the
+		// kernel will not let us run cgo. Even if the child were to close
+		// the fd explicitly, it would still be open from the time of the fork
+		// until the time of the explicit close, and the race would remain.
+		//
+		// On Unix systems, this results in ETXTBSY, which formats
+		// as "text file busy". Rather than hard-code specific error cases,
+		// we just look for that string. If this happens, sleep a little
+		// and try again. We let this happen three times, with increasing
+		// sleep lengths: 100+200+400 ms = 0.7 seconds.
+		//
+		// An alternate solution might be to split the cmd.Run into
+		// separate cmd.Start and cmd.Wait, and then use an RWLock
+		// to make sure that copyFile only executes when no cmd.Start
+		// call is in progress. However, cmd.Start (really syscall.forkExec)
+		// only guarantees that when it returns, the exec is committed to
+		// happen and succeed. It uses a close-on-exec file descriptor
+		// itself to determine this, so we know that when cmd.Start returns,
+		// at least one close-on-exec file descriptor has been closed.
+		// However, we cannot be sure that all of them have been closed,
+		// so the program might still encounter ETXTBSY even with such
+		// an RWLock. The race window would be smaller, perhaps, but not
+		// guaranteed to be gone.
+		//
+		// Sleeping when we observe the race seems to be the most reliable
+		// option we have.
+		//
+		// http://golang.org/issue/3001
+		//
+		if err != nil && nbusy < 3 && strings.Contains(err.Error(), "text file busy") {
+			time.Sleep(100 * time.Millisecond << uint(nbusy))
+			nbusy++
+			continue
+		}
+
+		return buf.Bytes(), err
+	}
+	panic("unreachable")
 }
 
 // mkdir makes the named directory.
