
Added maxBadConnRetries to limit the number of times to obtain broken… #1837


Closed
wants to merge 3 commits

Conversation

monkey92t
Collaborator

This PR is not intended to solve #1737.

It only limits the number of attempts: if all the connections in a pool are broken, Get() should not end up checking every connection in the entire pool.
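
To make the intent concrete, here is a self-contained toy sketch of the bound, not the actual pool.go code; the type names are made up, and the value 2 is taken from the discussion below:

package main

import "fmt"

// maxBadConnRetries bounds how many broken idle connections get() will
// inspect before giving up on the idle list (value 2 per this thread).
const maxBadConnRetries = 2

type conn struct{ broken bool }

type pool struct{ idle []*conn }

func (p *pool) popIdle() *conn {
	if len(p.idle) == 0 {
		return nil
	}
	cn := p.idle[len(p.idle)-1]
	p.idle = p.idle[:len(p.idle)-1]
	return cn
}

// get inspects at most maxBadConnRetries idle connections; if they are all
// broken it stops scanning and returns a freshly "dialed" connection instead
// of walking the entire idle list.
func (p *pool) get() *conn {
	for i := 0; i < maxBadConnRetries; i++ {
		cn := p.popIdle()
		if cn == nil {
			break
		}
		if cn.broken {
			continue // discard the broken connection and try the next one
		}
		return cn
	}
	return &conn{} // dial a new connection
}

func main() {
	p := &pool{idle: []*conn{{broken: true}, {broken: true}, {broken: true}, {broken: true}}}
	cn := p.get()
	fmt.Printf("got healthy conn: %v, idle connections left unscanned: %d\n", !cn.broken, len(p.idle))
}

With four broken idle connections, get() inspects only two of them before falling back to a fresh connection; the rest of the idle list is never scanned.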

@monkey92t
Collaborator Author

This is an improvement to #1824.

Signed-off-by: monkey92t <[email protected]>
Signed-off-by: monkey92t <[email protected]>
@monkey92t monkey92t requested a review from vmihailenco July 28, 2021 07:05
@@ -235,7 +239,7 @@ func (p *ConnPool) Get(ctx context.Context) (*Conn, error) {
return nil, err
}

for {
for i := 0; i < maxBadConnRetries; i++ {
Collaborator

I guess we need some limit here after #1824 was merged, but this won't help with #1737.

Collaborator

And I see that you already realize this :)

@vmihailenco
Collaborator

vmihailenco commented Jul 28, 2021

I think that with this change we may open more than PoolSize connections if there are more than maxBadConnRetries broken connections in the pool (and probably break the pool). So this does not look right.

I guess we should introduce something like ErrBadConn from database/sql and change the code to use it. I don't have more details, so it is fine if you leave this for me. And we should probably do this in the v9 branch since it is a tricky/risky change. I will try to find some time in August to prepare a PR.

Speaking about v9: I think it is time we start thinking about releasing the first beta version so we can start testing it. But before that we should remove/replace deprecated APIs. Could you take care of that?
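
For context on the ErrBadConn suggestion above: database/sql/driver.ErrBadConn is a sentinel error a driver returns when a cached connection turns out to be dead, and database/sql reacts by discarding that connection and retrying the operation on another one. Below is a rough sketch of what an equivalent could look like around this pool; the types, stub methods, and the withConn helper are stand-ins for illustration, not go-redis API:

package pool

import (
	"context"
	"errors"
)

// ErrBadConn mirrors database/sql/driver.ErrBadConn: it tells the caller the
// connection is unusable and the operation should be retried on another one.
// (Hypothetical name; go-redis does not define this today.)
var ErrBadConn = errors.New("pool: bad connection")

const maxBadConnRetries = 2

// Conn and ConnPool are stand-ins for the real pool types; the stub methods
// below exist only to make the sketch compile.
type Conn struct{}

type ConnPool struct{}

func (p *ConnPool) Get(ctx context.Context) (*Conn, error)             { return &Conn{}, nil }
func (p *ConnPool) Put(ctx context.Context, cn *Conn)                  {}
func (p *ConnPool) Remove(ctx context.Context, cn *Conn, reason error) {}

// withConn shows the retry-on-ErrBadConn flow used by database/sql: a dead
// connection is removed and the operation is retried on a fresh one.
func (p *ConnPool) withConn(ctx context.Context, fn func(*Conn) error) error {
	var err error
	for i := 0; i < maxBadConnRetries; i++ {
		cn, getErr := p.Get(ctx)
		if getErr != nil {
			return getErr
		}
		err = fn(cn)
		if errors.Is(err, ErrBadConn) {
			p.Remove(ctx, cn, err) // drop the dead connection and retry
			continue
		}
		p.Put(ctx, cn)
		return err
	}
	return err
}

The point of the pattern is that the retry happens at the call site that executes the command, so Get() itself never has to guess how many idle connections to probe.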

@monkey92t
Collaborator Author

I don't quite understand what you mean; there are only two situations in which a new connection is created:

  1. The connection pool is empty
  2. Get() has fetched and closed maxBadConnRetries=2 broken connections

In both cases, there is no problem creating a new connection.

I agree with the ErrBadConn plan, but we can't add it in a short time.

Compared to v8.11.0, we have recently added (or fixed) some features, and we need to consider releasing a new version.
But #1824 does not impose any restrictions on Get() *Conn; this PR is a temporary risk-avoidance solution.

@monkey92t
Collaborator Author

This PR will be replaced by ErrBadConn in v9. I just want users to be able to use the improvements from #1811, #1820, #1824, #1825....

@vmihailenco
Collaborator

In both cases, there is no problem creating a new connection.

Well, you can't create new connections while there are idle connections in the pool. Otherwise you have > PoolSize connections in the pool, which almost certainly breaks the pool, because queue chan struct{} is used for synchronization and its size is equal to PoolSize.

does not impose any restrictions on Get() *Conn; this PR is a temporary risk-avoidance solution

I am not sure it is a big problem, since connCheck should be fast enough to handle 100 connections. I think it is fine to release v8 as it is now.
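
For reference, connCheck is the connection liveness check added in #1824. The usual technique, also used by other Go database drivers, is a single non-blocking read on the socket, which is why probing even a hundred idle connections is cheap. A rough Unix-only sketch of that technique follows; the exact go-redis implementation may differ:

package pool

import (
	"errors"
	"io"
	"net"
	"syscall"
)

var errUnexpectedRead = errors.New("pool: unexpected read from socket")

// connCheck does one non-blocking read on the raw socket: EAGAIN means the
// connection is alive and idle, EOF means the peer closed it, and stray data
// means the protocol state is broken. Either way it costs a single syscall.
func connCheck(conn net.Conn) error {
	sysConn, ok := conn.(syscall.Conn)
	if !ok {
		return nil // not a raw socket (e.g. a test stub); skip the check
	}
	rawConn, err := sysConn.SyscallConn()
	if err != nil {
		return err
	}

	var sysErr error
	err = rawConn.Read(func(fd uintptr) bool {
		var buf [1]byte
		n, rerr := syscall.Read(int(fd), buf[:])
		switch {
		case n == 0 && rerr == nil:
			sysErr = io.EOF // peer closed the connection
		case n > 0:
			sysErr = errUnexpectedRead // unread data: connection is unusable
		case rerr == syscall.EAGAIN || rerr == syscall.EWOULDBLOCK:
			sysErr = nil // nothing to read: connection looks healthy
		default:
			sysErr = rerr
		}
		return true // done after a single attempt
	})
	if err != nil {
		return err
	}
	return sysErr
}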

@monkey92t
Collaborator Author

Well, you can't create new connections while there are idle connections in the pool. Otherwise you have > PoolSize connections in the pool, which almost certainly breaks the pool, because queue chan struct{} is used for synchronization and its size is equal to PoolSize.

I still can't understand. When I have closed maxBadConnRetries=2 broken connections and then create a new connection, why would the connection pool overflow (> PoolSize)?

Suppose now len(idleConns) = 10, maxBadConnRetries=2,

for i := 0; i < maxBadConnRetries; i++ {
	p.connsMu.Lock()
	cn := p.popIdle()
	p.connsMu.Unlock()

	if cn == nil {
		// no idle connections left; fall through and dial a new one
		break
	}

	if p.isStaleConn(cn) {
		// broken/stale connection: close it and try the next idle one
		_ = p.CloseConn(cn)
		continue
	}

	atomic.AddUint32(&p.stats.Hits, 1)
	return cn, nil
}

After execution, len(idleConns) = 8; it seems there is no problem with creating a new connection.

@monkey92t
Collaborator Author

I am not sure it is a big problem since connCheck should be fast enough to handle 100 connections. I think it is fine to release v8 as it is now.

OK, I just think it is risky; if you think there is nothing wrong with it, we can close this issue.

@monkey92t monkey92t closed this Jul 29, 2021
@vmihailenco
Collaborator

vmihailenco commented Jul 29, 2021

After execution, len(idleConns) = 8; it seems there is no problem with creating a new connection.

There were 8 idle conns at some point in the past, but in the next instant there can be 100 or 0 idle connections because there are other threads that can add/remove connections.

The pool is synchronized around p.queue chan struct{}:

  1. you must acquire a struct{} from p.queue
  2. holding the struct{}, you must use an existing idle conn, or open a new conn if there are no idle conns
  3. in the end, you must put the struct{} back into p.queue to let other threads use the pool

Hopefully that helps.
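
To make the invariant concrete, here is a minimal self-contained sketch of that acquire/use/release pattern; the field and method names are illustrative, not the exact pool.go ones, and the mutex that would guard the idle slice is omitted:

package pool

import "net"

// Pool is a stripped-down illustration: queue is a semaphore whose capacity
// equals PoolSize, so at most PoolSize connections can be checked out or
// being dialed at any moment.
type Pool struct {
	queue chan struct{}
	idle  []net.Conn
}

func NewPool(poolSize int) *Pool {
	return &Pool{queue: make(chan struct{}, poolSize)}
}

func (p *Pool) Get() (net.Conn, error) {
	p.queue <- struct{}{} // 1. acquire a turn; blocks once PoolSize conns are in use

	if cn := p.popIdle(); cn != nil {
		return cn, nil // 2. reuse an existing idle connection...
	}
	cn, err := net.Dial("tcp", "localhost:6379") // ...or open a new one (address illustrative)
	if err != nil {
		<-p.queue // give the turn back on failure
		return nil, err
	}
	return cn, nil
}

func (p *Pool) Put(cn net.Conn) {
	p.idle = append(p.idle, cn)
	<-p.queue // 3. release the turn so other goroutines can use the pool
}

func (p *Pool) popIdle() net.Conn {
	if len(p.idle) == 0 {
		return nil
	}
	cn := p.idle[len(p.idle)-1]
	p.idle = p.idle[:len(p.idle)-1]
	return cn
}

Because the channel's capacity equals PoolSize, at most PoolSize goroutines can hold a connection (or be dialing one) at a time; dialing outside this acquire/release protocol is what could push the pool past PoolSize.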

@monkey92t monkey92t deleted the pool branch February 11, 2023 15:49