fix: do not let _resolve/cluster hang if remote is unresponsive (#119516) (#119528)

* fix: do not let `_resolve/cluster` hang if remote is unresponsive

Previously, `_resolve/cluster` waited for a response from a remote
as part of the connection strategy. If the remote was unresponsive,
the API waited until `netty` terminated the connection with a
handshake exception. The threshold for terminating the connection is
`10s`, so the API took `10s` to determine that the remote was
unresponsive. This strategy is now replaced with a fail-fast approach:
a response is sent back to the user immediately rather than after the
connection is terminated.

* Update docs/changelog/119516.yaml
Pawan Kartik 2025-01-03 17:45:12 +00:00 committed by GitHub
parent 46ec08f2de
commit d41813c99e
2 changed files with 6 additions and 1 deletions

@@ -0,0 +1,5 @@
+pr: 119516
+summary: "Fix: do not let `_resolve/cluster` hang if remote is unresponsive"
+area: Search
+type: bug
+issues: []

@@ -141,7 +141,7 @@ public class TransportResolveClusterAction extends HandledTransportAction<Resolv
                 RemoteClusterClient remoteClusterClient = remoteClusterService.getRemoteClusterClient(
                     clusterAlias,
                     searchCoordinationExecutor,
-                    RemoteClusterService.DisconnectedStrategy.RECONNECT_IF_DISCONNECTED
+                    RemoteClusterService.DisconnectedStrategy.FAIL_IF_DISCONNECTED
                 );
                 var remoteRequest = new ResolveClusterActionRequest(originalIndices.indices(), request.indicesOptions());
                 // allow cancellation requests to propagate to remote clusters
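
The behavioural difference between the two strategies described in the commit message can be illustrated with a small standalone sketch. This is a toy model, not the Elasticsearch implementation: only the two enum constant names come from the diff above, while `DisconnectedStrategySketch`, `ToyRemote`, `resolve`, and `timeMillis` are hypothetical names introduced here, and a short sleep stands in for the `10s` handshake threshold.

```java
public class DisconnectedStrategySketch {

    /** Mirrors the two constants named in the diff; everything else here is hypothetical. */
    public enum DisconnectedStrategy { RECONNECT_IF_DISCONNECTED, FAIL_IF_DISCONNECTED }

    /** Toy stand-in for a remote cluster; not an Elasticsearch class. */
    public static final class ToyRemote {
        final boolean connected;
        final long handshakeTimeoutMillis; // stands in for the 10s threshold

        public ToyRemote(boolean connected, long handshakeTimeoutMillis) {
            this.connected = connected;
            this.handshakeTimeoutMillis = handshakeTimeoutMillis;
        }
    }

    /** Returns a status string for the remote under the given strategy. */
    public static String resolve(ToyRemote remote, DisconnectedStrategy strategy) {
        if (remote.connected) {
            return "connected";
        }
        if (strategy == DisconnectedStrategy.FAIL_IF_DISCONNECTED) {
            // Fail fast: report the disconnected state without attempting a reconnect.
            return "not connected (failed fast)";
        }
        // Reconnect attempt against an unresponsive remote: the caller is
        // blocked until the simulated handshake timeout elapses.
        try {
            Thread.sleep(remote.handshakeTimeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "not connected (handshake timed out)";
    }

    /** Measures how long resolve() blocks the caller, in milliseconds. */
    public static long timeMillis(ToyRemote remote, DisconnectedStrategy strategy) {
        long start = System.nanoTime();
        resolve(remote, strategy);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        ToyRemote unresponsive = new ToyRemote(false, 200);
        for (DisconnectedStrategy strategy : DisconnectedStrategy.values()) {
            System.out.println(strategy + ": " + resolve(unresponsive, strategy)
                + ", blocked for about " + timeMillis(unresponsive, strategy) + " ms");
        }
    }
}
```

In this sketch both strategies end up reporting the remote as not connected; the only difference is how long the caller waits for that answer, which is the point of the one-line change.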