Skip to content

rpc: removed dependency on deprecated (*grpc.ClientConn).State() #7949

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

d4l3k
Copy link
Contributor

@d4l3k d4l3k commented Jul 20, 2016

Fixes #7430.

Tests fail with the updated dependencies.

E160720 16:46:41.551691 server.go:490  http2: server: error reading preface from client 127.0.0.1:49478: bogus greeting "GET /_status/metrics//1 "
I160720 16:46:41.551791 server/status_test.go:174  could not GET https://127.0.0.1:49466/_status/metrics//1 - Get https://127.0.0.1:49466/_status/metrics//1: EOF
--- FAIL: TestMetricsEndpoint (1.71s)
    status_test.go:194: There was an error retrieving https://127.0.0.1:49466/_status/metrics//1

Doesn't seem to be an issue when actually running a node.


This change is Reviewable

@d4l3k
Copy link
Contributor Author

d4l3k commented Jul 21, 2016

Fixed the issue. A grpc update started set NextProtos on our global tls.Config object instead of making a copy with it.

grpc/grpc-go@78e558b

@tamird
Copy link
Contributor

tamird commented Jul 21, 2016

I don't think you can copy it like that; vet is going to complain. That's probably why upstream made the change begin with.

@d4l3k
Copy link
Contributor Author

d4l3k commented Jul 21, 2016

I thought so too but it isn't complaining for some reason

@tamird
Copy link
Contributor

tamird commented Jul 21, 2016

Reviewed 9 of 10 files at r3.
Review status: 0 of 17 files reviewed at latest revision, 7 unresolved discussions, some commit checks failed.


rpc/context.go, line 73 [r3] (raw file):

const (
  // Unknown indicates that the health status of the remote server isn't known.

nit: you can have a single comment for this entire const block when the names are descriptive. In this case, these comments aren't really useful.


rpc/context.go, line 240 [r3] (raw file):

// ConnState returns whether the most recent heartbeat succeeded or not. This
// should not be used as a definite status of a nodes health and just used to
// prioritized healthy nodes over unhealthy ones.

s/prioritized/prioritize/


rpc/context.go, line 248 [r3] (raw file):

}

// ConnStateChange returns a channel that will trigger when the connection

"trigger"?


rpc/context.go, line 250 [r3] (raw file):

// ConnStateChange returns a channel that will trigger when the connection
// health state changes.
func (ctx *Context) ConnStateChange(remoteAddr string) <-chan ConnectivityState {

instead of hanging these method on the context, can you make the dial method return a thin wrapper around the grpc.ClientConn that implements these methods? I don't see any reason to go back to the context every time.

I'd use that technique to just provide exactly the API that grpc gave us and continue to use that.


rpc/context.go, line 258 [r3] (raw file):

// WaitForConnected waits until the connection is no longer in the unknown
// state. If the state is unhealthy, or times out it returns an error.

"error. state."


rpc/context.go, line 260 [r3] (raw file):

// state. If the state is unhealthy, or times out it returns an error.
// state.
func (ctx *Context) WaitForConnected(remoteAddr string, timeout time.Duration) error {

seems like you could generalize this to exactly what grpc's WaitForStateChange provided.


rpc/context.go, line 263 [r3] (raw file):

  var timerC <-chan time.Time
  if timeout > 0 {
      timerC = time.NewTimer(timeout).C

why not time.After? call it timeoutChan if you make the change.


Comments from Reviewable

if state == grpc.Shutdown {
return nil, roachpb.NewErrorf("roachpb.Batch RPC failed as client connection was closed")
}
if err := s.rpcContext.WaitForConnected(s.remoteAddr, healthyTimeout); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't WaitForConnected take the context.Context and fail early when it expires?

@bdarnell
Copy link
Contributor

Reviewed 4 of 5 files at r1, 1 of 10 files at r3, 9 of 9 files at r4, 1 of 1 files at r5, 2 of 2 files at r6.
Review status: all files reviewed at latest revision, 10 unresolved discussions, some commit checks failed.


security/tls.go, line 122 [r6] (raw file):

      return cfg
  }
  newConfig := *cfg

Have you checked this with go 1.7 vet? It's gotten better about detecting copied locks. Whether vet complains or not, it's unsafe to use this unless you can be sure that no one has started using a non-copied version of this tls.Config.

The Go standard library ran into this and solved it for themselves with a non-public method (grumble, grumble): golang/go@d931716 (two subtly different non-public methods, in fact)

Other projects appear to have copied the same code (which will need to be updated for go 1.7 as two new fields have been added to the config struct). We could do the same but this really needs to be part of the standard library (See golang/go#15771). Or maybe we could pass around func() *tls.Config wrappers instead of pointers to shared tls.Config objects (eww, but the alternative is a brittle manual field-by-field copy).


Comments from Reviewable

d4l3k added 4 commits July 25, 2016 16:36
…iginal

GRPC used to make a copy of the tls.Config when using it in grpc.NewTLS. Not
making a copy lead to globally setting `(*tls.Config).NextProtos = []string{"h2"}`
which messed up protocol negotation in HTTP endpoint tests.

grpc/grpc-go@78e558b
@d4l3k
Copy link
Contributor Author

d4l3k commented Jul 25, 2016

Review status: all files reviewed at latest revision, 10 unresolved discussions, some commit checks failed.


internal/client/rpc_sender.go, line 57 [r6] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

shouldn't WaitForConnected take the context.Context and fail early when it expires?

Done.

kv/transport.go, line 180 [r6] (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

Here, too.

Done.

rpc/context.go, line 73 [r3] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

nit: you can have a single comment for this entire const block when the names are descriptive. In this case, these comments aren't really useful.

Done.

rpc/context.go, line 240 [r3] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

s/prioritized/prioritize/

Done.

rpc/context.go, line 248 [r3] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

"trigger"?

Done.

rpc/context.go, line 250 [r3] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

instead of hanging these method on the context, can you make the dial method return a thin wrapper around the grpc.ClientConn that implements these methods? I don't see any reason to go back to the context every time.

I'd use that technique to just provide exactly the API that grpc gave us and continue to use that.

Done.

rpc/context.go, line 258 [r3] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

"error. state."

Done.

rpc/context.go, line 260 [r3] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

seems like you could generalize this to exactly what grpc's WaitForStateChange provided.

Done.

rpc/context.go, line 263 [r3] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why not time.After? call it timeoutChan if you make the change.

Done.

security/tls.go, line 122 [r6] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Have you checked this with go 1.7 vet? It's gotten better about detecting copied locks. Whether vet complains or not, it's unsafe to use this unless you can be sure that no one has started using a non-copied version of this tls.Config.

The Go standard library ran into this and solved it for themselves with a non-public method (grumble, grumble): golang/go@d931716 (two subtly different non-public methods, in fact)

Other projects appear to have copied the same code (which will need to be updated for go 1.7 as two new fields have been added to the config struct). We could do the same but this really needs to be part of the standard library (See golang/go#15771). Or maybe we could pass around func() *tls.Config wrappers instead of pointers to shared tls.Config objects (eww, but the alternative is a brittle manual field-by-field copy).

Updated to be a copy of (*tls.Config).clone() with a TODO about updating for 1.8. I can't imagine tls.Config fields changing between now and the 1.8 release.

Comments from Reviewable

@bdarnell
Copy link
Contributor

Review status: 0 of 30 files reviewed at latest revision, 10 unresolved discussions, some commit checks pending.


security/tls.go, line 122 [r6] (raw file):

Previously, d4l3k (Tristan Rice) wrote…

Updated to be a copy of (*tls.Config).clone() with a TODO about updating for 1.8. I can't imagine tls.Config fields changing between now and the 1.8 release.

I can imagine it: https://github.com/golang/go/issues/16363 is proposing to add a new field and is tagged for the go1.8 milestone.

Comments from Reviewable

@d4l3k
Copy link
Contributor Author

d4l3k commented Jul 25, 2016

security/tls.go, line 122 [r6] (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I can imagine it: golang/go#16363 is proposing to add a new field and is tagged for the go1.8 milestone.

Sorry for the poor wording. `(*tls.Config).Clone()`is also slated for 1.8. When that gets added, we'll remove this function.

Comments from Reviewable

@tamird
Copy link
Contributor

tamird commented Jul 25, 2016

Can you rebase this into two commits: 1bfe1db in one and everything else in another?


Reviewed 18 of 30 files at r7, 9 of 9 files at r8, 1 of 1 files at r9, 2 of 2 files at r10.
Review status: all files reviewed at latest revision, 10 unresolved discussions, some commit checks pending.


base/context.go, line 201 [r10] (raw file):

  })

  return security.CopyTLSConfig(ctx.clientTLSConfig.tlsConfig), ctx.clientTLSConfig.err

can you add comments explaining why?


internal/client/rpc_sender.go, line 33 [r10] (raw file):

type sender struct {
  remoteAddr string

are these new fields still used? looks like no.


internal/client/rpc_sender.go, line 64 [r10] (raw file):

  for state := c.State(); state != grpc.Ready; state, err = c.WaitForStateChange(ctxWithTimeout, state) {
      if err != nil {
          return nil, roachpb.NewError(errors.Wrap(err, "roachpb.Batch RPC failed"))

while you're here, give this a better message (WaitForStateChange failed)


kv/send_test.go, line 137 [r10] (raw file):

  }

  waitForConnState := func(desiredState grpc.ConnectivityState) {

why not use the method provided by conn?


rpc/client.go, line 35 [r10] (raw file):

}

// State returns whether the most recent heartbeat succeeded or not. This should

this comment implies a boolean return value, which is not the case.


rpc/client.go, line 47 [r10] (raw file):

      cc.state = grpc.Ready
  } else {
      cc.state = grpc.Shutdown

why shutdown?


rpc/context.go, line 69 [r10] (raw file):

type connMeta struct {
  conn *ClientConn

why is this a pointer? in fact, why does connMeta still exist at all?


security/tls.go, line 122 [r6] (raw file):

Previously, d4l3k (Tristan Rice) wrote…

Sorry for the poor wording. (*tls.Config).Clone()is also slated for 1.8. When that gets added, we'll remove this function.

add a link to the upstream implementation.

Comments from Reviewable

@tbg
Copy link
Member

tbg commented Jul 25, 2016

Reviewed 5 of 5 files at r1, 9 of 9 files at r2, 10 of 10 files at r3, 9 of 9 files at r4, 1 of 1 files at r5, 2 of 2 files at r6, 30 of 30 files at r7, 9 of 9 files at r8, 1 of 1 files at r9, 2 of 2 files at r10.
Review status: all files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


Comments from Reviewable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants