elasticsearch/docs/reference/modules/cluster/remote-clusters-connect.asciidoc
Michael Peterson a451511e3a
Change skip_unavailable default value to true (#105792)
In order to improve the experience of cross-cluster search, we are changing
the default value of the remote cluster `skip_unavailable` setting from `false` to `true`.

This setting causes any cross-cluster _search (or _async_search) to entirely fail when
any remote cluster with `skip_unavailable=false` is either unavailable (connection to it fails)
or if the search on it fails on all shards.

Setting `skip_unavailable=true` allows partial results from other clusters to be
returned. In that case, the search response cluster metadata will show a `skipped`
status, so the user can see that no data came in from that cluster. Kibana also
now leverages this metadata in the cross-cluster search responses to allow users
to see how many clusters returned data and drill down into which clusters did not
(including failure messages).

Currently, the user/admin has to specifically set the value to `true` in the configs, like so:

```
cluster:
    remote:
        remote1:
            seeds: 10.10.10.10:9300
            skip_unavailable: true
```

even though that is probably what search admins want in the vast majority of cases.

Setting `skip_unavailable=false` should be a conscious (and probably rare) choice
by an Elasticsearch admin that a particular cluster's results are so essential to a
search (or visualization in dashboard or Discover panel) that no results at all should
be shown if it cannot return any results.
2024-04-29 15:53:47 -04:00

249 lines
8 KiB
Text

ifeval::["{trust-mechanism}"=="cert"]
:remote-interface: transport
:remote-interface-default-port: 9300
:remote-interface-default-port-plus1: 9301
:remote-interface-default-port-plus2: 9302
endif::[]
ifeval::["{trust-mechanism}"=="api-key"]
:remote-interface: remote cluster
:remote-interface-default-port: 9443
:remote-interface-default-port-plus1: 9444
:remote-interface-default-port-plus2: 9445
endif::[]
NOTE: You must have the `manage` cluster privilege to connect remote clusters.
The local cluster uses the <<modules-network,{remote-interface} interface>> to
establish communication with remote clusters. The coordinating nodes in the
local cluster establish <<long-lived-connections,long-lived>> TCP connections
with specific nodes in the remote cluster. {es} requires these connections to
remain open, even if the connections are idle for an extended period.
To add a remote cluster from Stack Management in {kib}:
. Select *Remote Clusters* from the side navigation.
. Enter a name (_cluster alias_) for the remote cluster.
. Specify the {es} endpoint URL, or the IP address or host name of the remote
cluster followed by the {remote-interface} port (defaults to
+{remote-interface-default-port}+). For example,
+cluster.es.eastus2.staging.azure.foundit.no:{remote-interface-default-port}+ or
+192.168.1.1:{remote-interface-default-port}+.
Alternatively, use the <<cluster-update-settings,cluster update settings API>>
to add a remote cluster. You can also use this API to dynamically configure
remote clusters for _every_ node in the local cluster. To configure remote
clusters on individual nodes in the local cluster, define static settings in
`elasticsearch.yml` for each node.
The following request adds a remote cluster with an alias of `cluster_one`. This
_cluster alias_ is a unique identifier that represents the connection to the
remote cluster and is used to distinguish between local and remote indices.
[source,console,subs=attributes+]
----
PUT /_cluster/settings
{
"persistent" : {
"cluster" : {
"remote" : {
"cluster_one" : { <1>
"seeds" : [
"127.0.0.1:{remote-interface-default-port}" <2>
]
}
}
}
}
}
----
// TEST[setup:host]
// TEST[s/127.0.0.1:\{remote-interface-default-port\}/\${transport_host}/]
<1> The cluster alias of this remote cluster is `cluster_one`.
<2> Specifies the hostname and {remote-interface} port of a seed node in the
remote cluster.
You can use the <<cluster-remote-info,remote cluster info API>> to verify that
the local cluster is successfully connected to the remote cluster:
[source,console]
----
GET /_remote/info
----
// TEST[continued]
The API response indicates that the local cluster is connected to the remote
cluster with the cluster alias `cluster_one`:
[source,console-result,subs=attributes+]
----
{
"cluster_one" : {
"seeds" : [
"127.0.0.1:{remote-interface-default-port}"
],
"connected" : true,
"num_nodes_connected" : 1, <1>
"max_connections_per_cluster" : 3,
"initial_connect_timeout" : "30s",
"skip_unavailable" : true, <2>
ifeval::["{trust-mechanism}"=="api-key"]
"cluster_credentials": "::es_redacted::", <3>
endif::[]
"mode" : "sniff"
}
}
----
// TESTRESPONSE[s/127.0.0.1:\{remote-interface-default-port\}/$body.cluster_one.seeds.0/]
// TESTRESPONSE[s/ifeval::(.|\n)*endif::\[\]//]
// TEST[s/"connected" : true/"connected" : $body.cluster_one.connected/]
// TEST[s/"num_nodes_connected" : 1/"num_nodes_connected" : $body.cluster_one.num_nodes_connected/]
<1> The number of nodes in the remote cluster the local cluster is
connected to.
<2> Indicates whether to skip the remote cluster if searched through {ccs} but
no nodes are available.
ifeval::["{trust-mechanism}"=="api-key"]
<3> If present, indicates the remote cluster has connected using API key
authentication.
endif::[]
===== Dynamically configure remote clusters
Use the <<cluster-update-settings,cluster update settings API>> to dynamically
configure remote settings on every node in the cluster. The following request
adds three remote clusters: `cluster_one`, `cluster_two`, and `cluster_three`.
The `seeds` parameter specifies the hostname and
<<modules-network,{remote-interface} port>> (default
+{remote-interface-default-port}+) of a seed node in the remote cluster.
The `mode` parameter determines the configured connection mode, which defaults
to <<sniff-mode,`sniff`>>. Because `cluster_one` doesn't specify a `mode`, it
uses the default. Both `cluster_two` and `cluster_three` explicitly use
different modes.
[source,console,subs=attributes+]
----
PUT _cluster/settings
{
"persistent": {
"cluster": {
"remote": {
"cluster_one": {
"seeds": [
"127.0.0.1:{remote-interface-default-port}"
]
},
"cluster_two": {
"mode": "sniff",
"seeds": [
"127.0.0.1:{remote-interface-default-port-plus1}"
],
"transport.compress": true,
"skip_unavailable": true
},
"cluster_three": {
"mode": "proxy",
"proxy_address": "127.0.0.1:{remote-interface-default-port-plus2}"
}
}
}
}
}
----
// TEST[setup:host]
// TEST[s/127.0.0.1:\{remote-interface-default-port\}/\${transport_host}/]
// TEST[s/\{remote-interface-default-port-plus1\}/9301/]
// TEST[s/\{remote-interface-default-port-plus2\}/9302/]
You can dynamically update settings for a remote cluster after the initial
configuration. The following request updates the compression settings for
`cluster_two`, and the compression and ping schedule settings for
`cluster_three`.
NOTE: When the compression or ping schedule settings change, all existing
node connections must close and re-open, which can cause in-flight requests to
fail.
[source,console]
----
PUT _cluster/settings
{
"persistent": {
"cluster": {
"remote": {
"cluster_two": {
"transport.compress": false
},
"cluster_three": {
"transport.compress": true,
"transport.ping_schedule": "60s"
}
}
}
}
}
----
// TEST[continued]
You can delete a remote cluster from the cluster settings by passing `null`
values for each remote cluster setting. The following request removes
`cluster_two` from the cluster settings, leaving `cluster_one` and
`cluster_three` intact:
[source,console]
----
PUT _cluster/settings
{
"persistent": {
"cluster": {
"remote": {
"cluster_two": {
"mode": null,
"seeds": null,
"skip_unavailable": null,
"transport.compress": null
}
}
}
}
}
----
// TEST[continued]
===== Statically configure remote clusters
If you specify settings in `elasticsearch.yml`, only the nodes with
those settings can connect to the remote cluster and serve remote cluster
requests.
NOTE: Remote cluster settings that are specified using the
<<cluster-update-settings,cluster update settings API>> take precedence over
settings that you specify in `elasticsearch.yml` for individual nodes.
In the following example, `cluster_one`, `cluster_two`, and `cluster_three` are
arbitrary cluster aliases representing the connection to each cluster. These
names are subsequently used to distinguish between local and remote indices.
[source,yaml,subs=attributes+]
----
cluster:
remote:
cluster_one:
seeds: 127.0.0.1:{remote-interface-default-port}
cluster_two:
mode: sniff
seeds: 127.0.0.1:{remote-interface-default-port-plus1}
transport.compress: true <1>
skip_unavailable: true <2>
cluster_three:
mode: proxy
proxy_address: 127.0.0.1:{remote-interface-default-port-plus2} <3>
----
<1> Compression is explicitly enabled for requests to `cluster_two`.
<2> Disconnected remote clusters are optional for `cluster_two`.
<3> The address for the proxy endpoint used to connect to `cluster_three`.
:!remote-interface:
:!remote-interface-default-port:
:!remote-interface-default-port-plus1:
:!remote-interface-default-port-plus2: