Clarify the behavior of remote/info and resolve/cluster for connected status of remotes #118993

Merged
merged 16 commits into from
Jan 29, 2025
17 changes: 14 additions & 3 deletions docs/reference/cluster/remote-info.asciidoc
@@ -26,10 +26,18 @@ Returns configured remote cluster information.
[[cluster-remote-info-api-desc]]
==== {api-description-title}

The cluster remote info API allows you to retrieve all of the configured
remote cluster information. It returns connection and endpoint information keyed
The cluster remote info API allows you to retrieve information about configured
remote clusters. It returns connection and endpoint information keyed
by the configured remote cluster alias.

TIP: This API returns information that reflects current state on the local cluster.
The `connected` field does not necessarily reflect whether a remote cluster is
down or unavailable, only whether there is currently an open connection to it.
Elasticsearch does not spontaneously try to reconnect to a disconnected remote
cluster. To trigger a reconnection, attempt a <<modules-cross-cluster-search,{ccs}>>,
<<esql-cross-clusters,{esql} {ccs}>>, or try the
<<indices-resolve-cluster-api,resolve cluster>> endpoint.


[[cluster-remote-info-api-response-body]]
==== {api-response-body-title}
@@ -39,7 +47,10 @@ by the configured remote cluster alias.
`proxy`.

`connected`::
True if there is at least one connection to the remote cluster.
True if there is at least one open connection to the remote cluster. When
false, it means that the cluster no longer has an open connection to the
remote cluster. It does not necessarily mean that the remote cluster is
down or unavailable, just that at some point a connection was lost.

`initial_connect_timeout`::
The initial connect timeout for remote cluster connections.
56 changes: 21 additions & 35 deletions docs/reference/indices/resolve-cluster.asciidoc
@@ -11,9 +11,7 @@ For the most up-to-date API details, refer to {api-es}/group/endpoint-indices[In
--

Resolves the specified index expressions to return information about
each cluster, including the local "querying" cluster, if included. If no index expression
is provided, this endpoint will return information about all the remote
clusters that are configured on the querying cluster.
each cluster, including the local "querying" cluster, if included.

This endpoint is useful before doing a <<modules-cross-cluster-search,{ccs}>> in
order to determine which remote clusters should be included in a search.
@@ -24,10 +22,12 @@ with this endpoint.

For each cluster in scope, information is returned about:

1. whether the querying ("local") cluster is currently connected to it
1. whether the querying ("local") cluster was able to connect to each remote cluster
specified in the index expression. Note that this endpoint actively attempts to
contact the remote clusters, unlike the <<cluster-remote-info,remote/info>> endpoint.
2. whether each remote cluster is configured with `skip_unavailable` as `true` or `false`
3. whether there are any indices, aliases or data streams on that cluster that match
the index expression (if one provided)
the index expression
4. whether the search is likely to have errors returned when you do a {ccs} (including any
authorization errors if your user does not have permission to query a remote cluster or
the indices on that cluster)
@@ -42,12 +42,6 @@ Once the proper security permissions are obtained, then you can rely on the `con
in the response to determine whether the remote cluster is available and ready for querying.
====

NOTE: When querying older clusters that do not support the _resolve/cluster endpoint
without an index expression, the local cluster will send the index expression `dummy*`
to those remote clusters, so if an errors occur, you may see a reference to that index
expression even though you didn't request it. If it causes a problem, you can instead
include an index expression like `*:*` to this endpoint to bypass the issue.

Contributor Author

These deleted sections will be added back in another PR that doesn't target 8.17.2.

////
[source,console]
--------------------------------
@@ -77,14 +71,6 @@ PUT _cluster/settings
// TEST[s/35.238.149.\d+:930\d+/\${transport_host}/]
////

[source,console]
----
GET /_resolve/cluster
----
// TEST[continued]

Returns information about all remote clusters configured on the local cluster.

[source,console]
----
GET /_resolve/cluster/my-index-*,cluster*:my-index-*
@@ -140,21 +126,28 @@ ignored when frozen. Defaults to `false`.
+
deprecated:[7.16.0]

[TIP]
====
The index options above are only allowed when specifying an index expression.
You will get an error if you specify index options to the _resolve/cluster API
that takes no index expression.
====


[discrete]
[[usecases-for-resolve-cluster]]
=== Test availability of remote clusters

The <<cluster-remote-info,remote/info>> endpoint is commonly used to test whether the "local"
cluster (the cluster being queried) is connected to its remote clusters, but it does not
necessarily reflect whether the remote cluster is available or not. The remote cluster may
be available, while the local cluster is not currently connected to it.

You can use the resolve-cluster API to attempt to reconnect to remote clusters
(for example with `GET _resolve/cluster/*:*`) and
the `connected` field in the response will indicate whether it was successful or not.
Contributor

This still isn't true given the (IMO misguided) move to using FAIL_IF_DISCONNECTED. The connected field indicates whether the responding node was connected to the remote cluster at the start of the check. If it's disconnected at that point then we return connected: false and trigger a background connection attempt (unless one was already running) in the hope that some future call will eventually return connected: true.

Contributor Author

Thanks David. Pawan and I discussed this last week and came to a similar conclusion. We're going to revert #119516 in favor of another approach. We identified a couple of options and Pawan is going to follow up with the distributed team on some of the options that are thinking of. Once that's decided, I'll revise this PR according to the new model.
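For readers following this thread, here is a rough sketch of what the behavior described in the review comment would mean for a caller: a single `connected: false` is not final, because a background connection attempt may make a later call return `connected: true`. This is only an illustration under that assumption (which the reply above says is going to be revisited). The `wait_until_connected` helper, its parameters, and the injected `resolve` callable are hypothetical names, not part of Elasticsearch or any client library; `resolve` stands in for whatever performs the `_resolve/cluster` request and returns the parsed JSON body.

```python
import time
from typing import Any, Callable, Dict


def wait_until_connected(
    resolve: Callable[[], Dict[str, Any]],
    alias: str,
    attempts: int = 5,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll a resolve-cluster-style response until `alias` reports connected.

    `resolve` is a hypothetical callable returning the parsed response body,
    keyed by cluster alias, e.g. {"cluster_two": {"connected": False, ...}}.
    """
    for attempt in range(attempts):
        info = resolve().get(alias, {})
        if info.get("connected") is True:
            return True
        # Not connected yet: wait before asking again, except after the last try.
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

The retry count and delay are arbitrary defaults; a real caller would pick values that suit its own timeout budget.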

If a connection was (re-)established, this will also cause the
<<cluster-remote-info,remote/info>> endpoint to now indicate a connected status.


=== Advantages of using this endpoint before a {ccs}

You may want to exclude a cluster or index from a search when:

1. A remote cluster is not currently connected and is configured with `skip_unavailable`=`false`.
1. A remote cluster could not be connected to and is configured with `skip_unavailable`=`false`.
Executing a {ccs} under those conditions will cause
<<cross-cluster-search-failures,the entire search to fail>>.
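As an illustration of the first condition in that list, the sketch below filters a parsed `_resolve/cluster` response for clusters that could not be connected to and have `skip_unavailable` set to `false`. It is a minimal example, not part of the documented API: the `clusters_to_exclude` function name is hypothetical, and treating a missing field as `false` is an assumption made here for safety. Only the `connected` and `skip_unavailable` fields shown in the example responses in this file are used.

```python
from typing import Any, Dict, List


def clusters_to_exclude(response: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return aliases that would make a cross-cluster search fail outright.

    A cluster is listed when it is not connected and is configured with
    skip_unavailable=false. Missing fields are treated as false (assumption).
    """
    return sorted(
        alias
        for alias, info in response.items()
        if not info.get("connected", False)
        and not info.get("skip_unavailable", False)
    )


# Example input shaped like the responses documented in this file.
sample = {
    "cluster_one": {"connected": True, "skip_unavailable": True},
    "cluster_two": {"connected": False, "skip_unavailable": False},
    "cluster_three": {"connected": False, "skip_unavailable": True},
}
print(clusters_to_exclude(sample))
```

The returned aliases could then be left out of the index expression passed to the search.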

@@ -268,14 +261,7 @@ GET /_resolve/cluster/not-present,clust*:my-index*,oldcluster:*?ignore_unavailab
},
"cluster_two": {
"connected": false, <3>
"skip_unavailable": false,
"matching_indices": true,
"version": {
"number": "8.13.0",
"build_flavor": "default",
"minimum_wire_compatibility_version": "7.17.0",
"minimum_index_compatibility_version": "7.0.0"
}
"skip_unavailable": false
},
"oldcluster": { <4>
"connected": true,