-
Notifications
You must be signed in to change notification settings - Fork 25.3k
Utility to broadcast to unpromotable shards #93552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Utility to broadcast to unpromotable shards #93552
Conversation
Introduces: * IndexShardRoutingTable now lists unpromotable assigned shards. * New utility TransportActions#broadcastToUnpromotableShards(). * New hook in ReplicationOperation for custom logic when the primary operation completes. * Refresh action now uses the new hook to broadcast the unpromotable refresh action to all unpromotable shards. Fixes ES-5454
Hi @kingherc, I've created a changelog YAML for you. |
@DaveCTurner take a look at this new PR, which uses a simple hook similar to the one you proposed in ReplicationOperation. Also extracted the utility function to broadcast to unpromotable shards. |
Pinging @elastic/es-distributed (Team:Distributed) |
server/src/main/java/org/elasticsearch/cluster/routing/IndexShardRoutingTable.java
Outdated
Show resolved
Hide resolved
TransportService transportService, | ||
String responseHandlerExecutor, | ||
ActionListener<Void> listener | ||
) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am new to this area, but I wonder if this should be a static utility?
It seems like there is a pattern of making abstract actions like TransportBroadcastAction
.
May be @tlrx or @DaveCTurner could advice on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Written like this, I would agree. However, I think the caller should be able to pass in an IndexShardRoutingTable
rather than having to read the cluster state from the ClusterService
, because the state in the ClusterService
may differ from the one that the caller expects. With that change, it's a bit harder to express this as another kind of abstract action - the IndexShardRoutingTable
would have to be carried in the request itself, which must be an ActionRequest
, and hence Writeable
. Maybe that's ok? It still seems a bit weird cos we never actually want to send such a request over the wire.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also do we want to test this logic or it is okay that it is implicitly covered by other tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @idegtiarenko and @DaveCTurner ! Another reason we cannot do an abstract action is that the Refresh action already inherits TransportReplicationAction
. I tried that in the previous days with PRs #93518 and #93532 but after discussing with @DaveCTurner , the conclusion was to not make the replication action more complex. And instead provide a simple hook (this PR) that they can do custom stuff, such as broadcasting to unpromotable shards. Since we precluded extending TransportReplicationAction
, we cannot make an abstract action as Refresh cannot inherit two classes. So either Refresh would need to hardcode the broadcasting logic, or we refactor it in a utility function like this that can be re-used across the board (e.g., by Stateless actions).
should be able to pass in an IndexShardRoutingTable rather than having to read the cluster state from the ClusterService
@DaveCTurner , I can make it pass IndexShardRoutingTable. But I believe I would still need the ClusterService to get the cluster state to convert shards' node IDs to DiscoveryNodes, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also do we want to test this logic or it is okay that it is implicitly covered by other tests?
For the moment, this is tested in org.elasticsearch.cluster.routing.ShardRoutingRoleIT#testRefreshOfUnpromotableShards indirectly. That is the only IT file so far in ES that tests unpromotable shards. I will see if/how I can make a generic test for the new utility function there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we cannot make an abstract action as Refresh cannot inherit two classes.
That's ok tho, the unpromotable bit would be a separate action.
But I believe I would still need the ClusterService to get the cluster state to convert shards' node IDs to DiscoveryNodes, right?
Ah ok you'd need to accept the DiscoveryNodes
too.
Maybe that's ok?
I've thought about this for a bit and decided that yes this is ok and indeed I prefer it 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've thought about this for a bit and decided that yes this is ok and indeed I prefer it
Trying to connect the dots, you mean we can make a new abstract action that underneaths broadcasts it to unpromotable shards, and Refresh does not inherit it, but rather executes it? I guess that will work and is in line with @idegtiarenko 's comment. Let me see if/how I can make this.
BTW, we'd still need the hook of this PR for the custom logic so that Refresh actually shoots the new abstract action.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refresh does not inherit it, but rather executes it
Yep.
we'd still need the hook of this PR for the custom logic
Yep.
I will see if/how I can make a generic test for the new utility function there.
I would prefer to have unit tests covering this logic. It is a little awkward to set everything up for such a test, but it's well worth it. See e.g. TransportNodesActionTests
or TransportBroadcastByNodeActionTests
for some test suites of similar things.
Closing in favor of new PR #93600 |
Introduces:
Fixes ES-5454