x/net/http2: Race in handler execution results in zero-byte data frame, causing incompatibility with gRPC #56317

Closed
@LINKIWI

What version of Go are you using (go version)?

$ go version
go version go1.19.2 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE=""
GOENV=""
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH=""
GOPRIVATE=""
GOPROXY=""
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.19.2"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/tmp/bug/go.mod"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3528461531=/tmp/go-build -gno-record-gcc-switches"

What did you do?

In a proprietary HTTP + gRPC reverse proxy, when issuing unary gRPC calls, I observe intermittent occurrences of the error:

rpc error: code = Internal desc = server closed the stream without sending trailers

The occurrence of this error is not deterministically reproducible, and affects only a small percentage of requests. Usually, a client retry of the RPC alleviates the problem.

See the investigation notes after the survey questions in this issue.

What did you expect to see?

I expect to see no occurrences of this error under regular operation.

What did you see instead?

I see this error affecting 1-5% of requests.


Context

I'm working with a proprietary HTTP reverse proxy with built-in support for gRPC over HTTP/2.

Example

The proxy is proprietary, but its core logic is demonstrated below.

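package main

import (
	"crypto/tls"
	"log"
	"net"
	"net/http"
	"net/http/httputil"

	"golang.org/x/net/http2"
	"golang.org/x/net/http2/h2c"
)

func main() {
	// Upstream transport: HTTP/2 to the gRPC backend on localhost:7001.
	tr := &http2.Transport{
		AllowHTTP: true,
		// Ignore the TLS config and dial plain TCP, so the transport speaks
		// cleartext HTTP/2 to the backend.
		DialTLS: func(network string, addr string, cfg *tls.Config) (net.Conn, error) {
			return net.Dial("tcp", "localhost:7001")
		},
	}

	rp := &httputil.ReverseProxy{
		Transport: tr,
		// A negative FlushInterval flushes to the client after each write.
		FlushInterval: -1,
		Director: func(req *http.Request) {
			req.URL.Scheme = "http"
		},
	}

	// Accept cleartext HTTP/2 (h2c) from clients and hand requests to the proxy.
	handler := h2c.NewHandler(rp, &http2.Server{})

	srv := &http.Server{
		Addr:    "localhost:7000",
		Handler: handler,
	}

	// ListenAndServe always returns a non-nil error; report it instead of dropping it.
	log.Fatal(srv.ListenAndServe())
}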

Symptom

Clients issuing gRPC calls through the proxy that return gRPC application-level errors intermittently (non-deterministically) observe the following error from the grpc-go library: "server closed the stream without sending trailers".

GODEBUG=http2debug=2 reveals that the issue manifests only when Go's http2.Server writes a HEADERS frame with flag END_HEADERS followed by a zero-byte DATA frame with flag END_STREAM.

The issue does not manifest (i.e. the application-level error is propagated correctly) when Go's http2.Server writes a HEADERS frame with flags END_HEADERS | END_STREAM.

Note that there are no trailers included in this message.

Example trace with no errors (RPC returns successfully)

2022/10/18 00:12:57 http2: Transport encoding header ":method" = "POST"
2022/10/18 00:12:57 http2: Transport encoding header ":path" = "/service/Method"
2022/10/18 00:12:57 http2: Transport encoding header ":scheme" = "http"
2022/10/18 00:12:57 http2: Transport encoding header "te" = "trailers"
2022/10/18 00:12:57 http2: Transport encoding header "grpc-timeout" = "998862u"
2022/10/18 00:12:57 http2: Transport encoding header "content-type" = "application/grpc"
2022/10/18 00:12:57 http2: Transport encoding header "user-agent" = "grpc-go/1.49.0"
2022/10/18 00:12:57 http2: Transport encoding header "x-forwarded-for" = "127.0.0.1"
2022/10/18 00:12:57 http2: Transport encoding header "accept-encoding" = "gzip"
2022/10/18 00:12:57 http2: Framer 0xc000366000: wrote HEADERS flags=END_HEADERS stream=7 len=17
2022/10/18 00:12:57 http2: Framer 0xc000366000: wrote DATA stream=7 len=12 data="\x00\x00\x00\x00\a\n\x05\b\x02\x12\x01a"
2022/10/18 00:12:57 http2: Framer 0xc000366000: wrote DATA flags=END_STREAM stream=7 len=0 data=""
2022/10/18 00:12:57 http2: Framer 0xc000366000: read WINDOW_UPDATE len=4 (conn) incr=12
2022/10/18 00:12:57 http2: Transport received WINDOW_UPDATE len=4 (conn) incr=12
2022/10/18 00:12:57 http2: Framer 0xc000366000: read PING len=8 ping="\x02\x04\x10\x10\t\x0e\a\a"
2022/10/18 00:12:57 http2: Transport received PING len=8 ping="\x02\x04\x10\x10\t\x0e\a\a"
2022/10/18 00:12:57 http2: Framer 0xc000366000: wrote PING flags=ACK len=8 ping="\x02\x04\x10\x10\t\x0e\a\a"
2022/10/18 00:12:57 http2: Framer 0xc000366000: read HEADERS flags=END_STREAM|END_HEADERS stream=7 len=4
2022/10/18 00:12:57 http2: decoded hpack field header field ":status" = "200"
2022/10/18 00:12:57 http2: decoded hpack field header field "content-type" = "application/grpc"
2022/10/18 00:12:57 http2: decoded hpack field header field "grpc-status" = "5"
2022/10/18 00:12:57 http2: decoded hpack field header field "grpc-message" = "open /tmp/a: no such file or directory"
2022/10/18 00:12:57 http2: Transport received HEADERS flags=END_STREAM|END_HEADERS stream=7 len=4

Example trace with error (internal error raised by grpc-go)

2022/10/18 00:12:55 http2: Transport encoding header ":method" = "POST"
2022/10/18 00:12:55 http2: Transport encoding header ":path" = "/service/Method"
2022/10/18 00:12:55 http2: Transport encoding header ":scheme" = "http"
2022/10/18 00:12:55 http2: Transport encoding header "grpc-timeout" = "998585u"
2022/10/18 00:12:55 http2: Transport encoding header "content-type" = "application/grpc"
2022/10/18 00:12:55 http2: Transport encoding header "user-agent" = "grpc-go/1.49.0"
2022/10/18 00:12:55 http2: Transport encoding header "te" = "trailers"
2022/10/18 00:12:55 http2: Transport encoding header "x-forwarded-for" = "127.0.0.1"
2022/10/18 00:12:55 http2: Transport encoding header "accept-encoding" = "gzip"
2022/10/18 00:12:55 http2: Framer 0xc000366000: wrote HEADERS flags=END_HEADERS stream=5 len=18
2022/10/18 00:12:55 http2: Framer 0xc000366000: wrote DATA stream=5 len=12 data="\x00\x00\x00\x00\a\n\x05\b\x02\x12\x01a"
2022/10/18 00:12:55 http2: Framer 0xc000366000: wrote DATA flags=END_STREAM stream=5 len=0 data=""
2022/10/18 00:12:55 http2: Framer 0xc000366000: read WINDOW_UPDATE len=4 (conn) incr=12
2022/10/18 00:12:55 http2: Transport received WINDOW_UPDATE len=4 (conn) incr=12
2022/10/18 00:12:55 http2: Framer 0xc000366000: read PING len=8 ping="\x02\x04\x10\x10\t\x0e\a\a"
2022/10/18 00:12:55 http2: Transport received PING len=8 ping="\x02\x04\x10\x10\t\x0e\a\a"
2022/10/18 00:12:55 http2: Framer 0xc000366000: wrote PING flags=ACK len=8 ping="\x02\x04\x10\x10\t\x0e\a\a"
2022/10/18 00:12:55 http2: Framer 0xc000366000: read HEADERS flags=END_STREAM|END_HEADERS stream=5 len=4
2022/10/18 00:12:55 http2: decoded hpack field header field ":status" = "200"
2022/10/18 00:12:55 http2: decoded hpack field header field "content-type" = "application/grpc"
2022/10/18 00:12:55 http2: decoded hpack field header field "grpc-status" = "5"
2022/10/18 00:12:55 http2: decoded hpack field header field "grpc-message" = "open /tmp/a: no such file or directory"
2022/10/18 00:12:55 http2: Transport received HEADERS flags=END_STREAM|END_HEADERS stream=5 len=4
2022/10/18 00:12:55 http2: server encoding header ":status" = "200"
2022/10/18 00:12:55 http2: server encoding header "content-type" = "application/grpc"
2022/10/18 00:12:55 http2: server encoding header "grpc-message" = "open /tmp/a: no such file or directory"
2022/10/18 00:12:55 http2: server encoding header "grpc-status" = "5"
2022/10/18 00:12:55 http2: server encoding header "date" = "Tue, 18 Oct 2022 07:12:55 GMT"
2022/10/18 00:12:55 http2: Framer 0xc0003900e0: wrote HEADERS flags=END_HEADERS stream=1 len=89
2022/10/18 00:12:55 http2: Framer 0xc0003900e0: wrote DATA flags=END_STREAM stream=1 len=0 data=""

RCA

I believe this is due to a race caused by concurrent http.Handler execution in http2/server.go.

If handler execution completes before the headers are written, rws.handlerDone is true and Go includes END_STREAM in the initial HEADERS frame. If handler execution is still in progress when the first write occurs, the HEADERS frame is written without END_STREAM, and a subsequent write sends a zero-byte DATA frame with END_STREAM, acting purely as a control message.
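The two outcomes described above can be sketched as a small model. This is only an illustration of the observed behavior, not the code in http2/server.go: the frame type and the responseFrames function are hypothetical names, and the model covers only a response with no body and no trailers.

```go
package main

import "fmt"

// frame is a simplified stand-in for an HTTP/2 frame. It is a hypothetical
// type for illustration, not the golang.org/x/net/http2 Frame interface.
type frame struct {
	kind      string // "HEADERS" or "DATA"
	endStream bool   // whether the END_STREAM flag is set
	length    int    // payload length in bytes (only used for DATA here)
}

// responseFrames models which frames terminate a response that has no body
// and no trailers, depending on whether the handler had already returned
// when the response headers were first written.
func responseFrames(handlerDoneAtFirstWrite bool) []frame {
	if handlerDoneAtFirstWrite {
		// Handler finished first: a single HEADERS frame closes the stream.
		return []frame{{kind: "HEADERS", endStream: true}}
	}
	// Headers were written while the handler was still running: the stream
	// is closed later by a zero-byte DATA frame.
	return []frame{
		{kind: "HEADERS", endStream: false},
		{kind: "DATA", endStream: true, length: 0},
	}
}

func main() {
	for _, done := range []bool{true, false} {
		fmt.Printf("handlerDone=%v -> %+v\n", done, responseFrames(done))
	}
}
```

Both sequences carry the same headers; only the frame that ends the stream differs, which matches the two cases seen with GODEBUG=http2debug=2.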

Ultimately, this makes the behavior non-deterministic, and unary gRPC methods that return errors quickly are disproportionately affected.

According to the gRPC specification, END_STREAM should be included in the last HEADERS frame to indicate termination of the response. In grpc-go, encountering END_STREAM in a DATA frame is an explicit error case. However, the HTTP/2 protocol specification itself doesn't prohibit this.
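A minimal sketch of the client-side rule described above, assuming a simplified frame model. The frame type, checkStreamEnd, and errNoTrailers are hypothetical names for illustration and are not grpc-go's actual implementation; only the error text is taken from the message reported in this issue.

```go
package main

import (
	"errors"
	"fmt"
)

// frame is a simplified, hypothetical stand-in for a received HTTP/2 frame.
type frame struct {
	kind      string // "HEADERS" or "DATA"
	endStream bool   // whether the END_STREAM flag is set
}

// errNoTrailers mirrors the text of the error reported by clients.
var errNoTrailers = errors.New("server closed the stream without sending trailers")

// checkStreamEnd accepts a response only if the frame that carries END_STREAM
// is a HEADERS frame. A DATA frame with END_STREAM is treated as an error,
// even when it is empty and the status was already sent in earlier headers.
func checkStreamEnd(frames []frame) error {
	for _, f := range frames {
		if !f.endStream {
			continue
		}
		if f.kind == "HEADERS" {
			return nil
		}
		return errNoTrailers
	}
	return errors.New("stream was never closed")
}

func main() {
	good := []frame{{kind: "HEADERS", endStream: true}}
	bad := []frame{{kind: "HEADERS"}, {kind: "DATA", endStream: true}}
	fmt.Println(checkStreamEnd(good))
	fmt.Println(checkStreamEnd(bad))
}
```

Under this model, both frame sequences are valid HTTP/2, but only the first one is accepted as a complete gRPC response.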

Proposal

A similar (identical?) issue was identified in nghttp2 (see: nghttp2/nghttp2#588). The submitted fix was to include END_STREAM in the HEADERS payload if the body is empty and no trailers exist. I'm not sure if a similar approach is feasible in http2.Server.
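The rule from the nghttp2 fix can be sketched as follows. This is a model under a strong assumption: the whole response (body chunks and whether trailers exist) is known before the first frame is written, which a server cannot always know while the handler is still running. The frame type and planResponse are hypothetical names, not part of http2.Server or nghttp2.

```go
package main

import "fmt"

// frame is a simplified, hypothetical stand-in for an HTTP/2 frame.
type frame struct {
	kind      string // "HEADERS" or "DATA"
	length    int    // payload length in bytes (DATA frames only)
	endStream bool   // whether the END_STREAM flag is set
}

// planResponse lays out the frames for one response. END_STREAM goes on the
// initial HEADERS frame when the body is empty and there are no trailers;
// otherwise it goes on the last DATA frame, or on the trailing HEADERS frame
// when trailers exist.
func planResponse(body [][]byte, hasTrailers bool) []frame {
	// Ignore empty chunks so they never become zero-byte DATA frames.
	var chunks [][]byte
	for _, c := range body {
		if len(c) > 0 {
			chunks = append(chunks, c)
		}
	}

	if len(chunks) == 0 && !hasTrailers {
		return []frame{{kind: "HEADERS", endStream: true}}
	}

	frames := []frame{{kind: "HEADERS"}}
	for i, c := range chunks {
		last := i == len(chunks)-1
		frames = append(frames, frame{
			kind:      "DATA",
			length:    len(c),
			endStream: last && !hasTrailers,
		})
	}
	if hasTrailers {
		// Trailers are sent as a final HEADERS frame that closes the stream.
		frames = append(frames, frame{kind: "HEADERS", endStream: true})
	}
	return frames
}

func main() {
	fmt.Printf("%+v\n", planResponse(nil, false))
	fmt.Printf("%+v\n", planResponse([][]byte{[]byte("abc")}, false))
	fmt.Printf("%+v\n", planResponse(nil, true))
}
```

In this model a zero-byte DATA frame with END_STREAM is never produced, which is the property the gRPC client depends on.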
