[DOCS] Rename mount types for searchable snapshots (#72699)

Changes:

* Renames 'full copy searchable snapshot' to 'fully mounted index.'
* Renames 'shared cache searchable snapshot' to 'partially mounted index.'
* Removes some unneeded cache setup instructions for the frozen tier. We added a default cache size with #71844.
James Rodewig 2021-05-05 16:35:33 -04:00 committed by GitHub
parent 15e42fd748
commit ba66669eb3
10 changed files with 74 additions and 92 deletions


@@ -10,7 +10,8 @@
[id="{upid}-{api}-request"]
==== Request

Retrieves statistics about the shared cache for
{ref}/searchable-snapshots.html#partially-mounted[partially mounted indices].

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------


@@ -259,12 +259,14 @@ Total size, in bytes, of all shards assigned to the node.

`total_data_set_size`::
(<<byte-units,byte value>>)
Total data set size of all shards assigned to the node.
This includes the size of shards not stored fully on the node, such as the
cache for <<partially-mounted,partially mounted indices>>.

`total_data_set_size_in_bytes`::
(integer)
Total data set size, in bytes, of all shards assigned to the node.
This includes the size of shards not stored fully on the node, such as the
cache for <<partially-mounted,partially mounted indices>>.

`reserved`::
(<<byte-units,byte value>>)


@@ -243,12 +243,14 @@ Total size, in bytes, of all shards assigned to selected nodes.

`total_data_set_size`::
(<<byte-units,byte value>>)
Total data set size of all shards assigned to selected nodes.
This includes the size of shards not stored fully on the nodes, such as the
cache for <<partially-mounted,partially mounted indices>>.

`total_data_set_size_in_bytes`::
(integer)
Total data set size, in bytes, of all shards assigned to selected nodes.
This includes the size of shards not stored fully on the nodes, such as the
cache for <<partially-mounted,partially mounted indices>>.

`reserved`::
(<<byte-units,byte value>>)


@@ -78,8 +78,9 @@ Once data is no longer being updated, it can move from the warm tier to the cold
stays while being queried infrequently.
The cold tier is still a responsive query tier, but data in the cold tier is not normally updated.
As data transitions into the cold tier it can be compressed and shrunken.
For resiliency, the cold tier can use <<fully-mounted,fully mounted indices>> of
<<ilm-searchable-snapshot,{search-snaps}>>, eliminating the need for
replicas.

[discrete]
[[frozen-tier]]
@@ -88,20 +89,11 @@ For resiliency, indices in the cold tier can rely on
Once data is no longer being queried, or being queried rarely, it may move from
the cold tier to the frozen tier where it stays for the rest of its life.

The frozen tier uses <<partially-mounted,partially mounted indices>> to store
and load data from a snapshot repository. This reduces local storage and
operating costs while still letting you search frozen data. Because {es} must
sometimes fetch frozen data from the snapshot repository, searches on the frozen
tier are typically slower than on the cold tier.

[discrete]
[[data-tier-allocation]]


@@ -167,7 +167,7 @@ Discontinue use of the removed settings. If needed, use
cluster recovery pending a certain number of data nodes.
====

.You can no longer set `xpack.searchable.snapshot.shared_cache.size` on non-frozen nodes.
[%collapsible]
====
*Details* +
@@ -176,10 +176,10 @@ that does not have the `data_frozen` role was deprecated in {es} 7.12.0 and has
been removed in {es} 8.0.0.

*Impact* +
{es} only allocates partially mounted indices to nodes with the `data_frozen`
role. Do not set `xpack.searchable.snapshot.shared_cache.size` on nodes without
the `data_frozen` role. Removing the setting on nodes without the `data_frozen`
role will not impact functionality.
====

.By default, destructive index actions do not allow wildcards.


@@ -8,8 +8,8 @@

experimental::[]

Clears indices and data streams from the shared cache for
<<partially-mounted,partially mounted indices>>.

[[searchable-snapshots-api-clear-cache-request]]
==== {api-request-title}
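As a sketch of the request shape, clearing the shared cache entries for a single index looks like the following; the index name `my-index` is a placeholder:

[source,console]
--------------------------------------------------
POST /my-index/_searchable_snapshots/cache/clear
--------------------------------------------------

Omit the index name (`POST /_searchable_snapshots/cache/clear`) to clear the cache for all partially mounted indices.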


@@ -48,12 +48,18 @@ include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=master-timeout]
Defaults to `false`.

`storage`::
+
--
(Optional, string)
<<searchable-snapshot-mount-storage-options,Mount option>> for the
{search-snap} index. Possible values are:

`full_copy` (Default):::
<<fully-mounted,Fully mounted index>>.

`shared_cache`:::
<<partially-mounted,Partially mounted index>>.
--

[[searchable-snapshots-api-mount-request-body]]
==== {api-request-body-title}
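For illustration, a request that partially mounts an index passes `storage` as a query parameter; the repository, snapshot, and index names here are placeholders:

[source,console]
--------------------------------------------------
POST /_snapshot/my_repository/my_snapshot/_mount?storage=shared_cache
{
  "index": "my_index"
}
--------------------------------------------------

Omitting `storage` (or passing `full_copy`) fully mounts the index instead.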


@@ -6,7 +6,8 @@
<titleabbrev>Cache stats</titleabbrev>
++++

Retrieves statistics about the shared cache for <<partially-mounted,partially
mounted indices>>.

[[searchable-snapshots-api-cache-stats-request]]
==== {api-request-title}
@@ -22,12 +23,6 @@ If the {es} {security-features} are enabled, you must have the
`manage` cluster privilege to use this API.
For more information, see <<security-privileges>>.

[[searchable-snapshots-api-cache-stats-path-params]]
==== {api-path-parms-title}
@@ -97,7 +92,8 @@ Contains statistics about the shared cache file.
[[searchable-snapshots-api-cache-stats-example]]
==== {api-examples-title}

Gets the statistics about the shared cache for partially mounted indices from
all data nodes:

[source,console]
--------------------------------------------------


@@ -117,16 +117,15 @@ copying data from the primary.
To search a snapshot, you must first mount it locally as an index. Usually
{ilm-init} will do this automatically, but you can also call the
<<searchable-snapshots-api-mount-snapshot,mount snapshot>> API yourself. There
are two options for mounting an index from a snapshot, each with different
performance characteristics and local storage footprints:

[[fully-mounted]]
Fully mounted index::
Loads a full copy of the snapshotted index's shards onto node-local storage
within the cluster. {ilm-init} uses this option in the `hot` and `cold` phases.
+
Search performance for a fully mounted index is normally
comparable to a regular index, since there is minimal need to access the
snapshot repository. While recovery is ongoing, search performance may be
slower than with a regular index because a search may need some data that has
@@ -134,11 +133,11 @@ not yet been retrieved into the local copy. If that happens, {es} will eagerly
retrieve the data needed to complete the search in parallel with the ongoing
recovery.

[[partially-mounted]]
Partially mounted index::
Uses a local cache containing only recently searched parts of the snapshotted
index's data. This cache has a fixed size and is shared across nodes in the
frozen tier. {ilm-init} uses this option in the `frozen` phase.
+
If a search requires data that is not in the cache, {es} fetches the missing
data from the snapshot repository. Searches that require these fetches are
@@ -146,39 +145,39 @@ slower, but the fetched data is stored in the cache so that similar searches
can be served more quickly in future. {es} will evict infrequently used data
from the cache to free up space.
+
Although slower than a fully mounted index or a regular index, a
partially mounted index still returns search results quickly, even for
large data sets, because the layout of data in the repository is heavily
optimized for search. Many searches will need to retrieve only a small subset of
the total shard data before returning results.

To partially mount an index, you must have one or more nodes with a shared cache
available. By default, dedicated frozen data tier nodes (nodes with the
`data_frozen` role and no other data roles) have a shared cache configured using
the greater of 90% of total disk space and total disk space minus a 100GB
headroom.

Using a dedicated frozen tier is highly recommended for production use. If you
do not have a dedicated frozen tier, you must configure the
`xpack.searchable.snapshot.shared_cache.size` setting to reserve space for the
cache on one or more nodes. Partially mounted indices
are only allocated to nodes that have a shared cache.

[[searchable-snapshots-shared-cache]]
`xpack.searchable.snapshot.shared_cache.size`::
(<<static-cluster-setting,Static>>)
Disk space reserved for the shared cache of partially mounted indices.
Accepts a percentage of total disk space or an absolute <<byte-units,byte
value>>. Defaults to `90%` of total disk space for dedicated frozen data tier
nodes. Otherwise defaults to `0b`.

`xpack.searchable.snapshot.shared_cache.size.max_headroom`::
(<<static-cluster-setting,Static>>, <<byte-units,byte value>>)
For dedicated frozen tier nodes, the max headroom to maintain. If
`xpack.searchable.snapshot.shared_cache.size` is not explicitly set, this
setting defaults to `100GB`. Otherwise it defaults to `-1` (not set). You can
only configure this setting if `xpack.searchable.snapshot.shared_cache.size` is
set as a percentage.

To illustrate how these settings work together, consider two examples using the
default values on a dedicated frozen node:

@@ -186,7 +185,7 @@ when using the default values of the settings on a dedicated frozen node:
* A 4000 GB disk will result in a shared cache sized at 3900 GB. 90% of 4000 GB
is 3600 GB, leaving 400 GB headroom. The default `max_headroom` of 100 GB
takes effect, and the result is therefore 3900 GB.
* A 400 GB disk will result in a shared cache sized at 360 GB.
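The default sizing rule behind both examples (the greater of 90% of total disk space and total disk space minus the 100 GB max headroom) can be sketched as follows; this is an illustration of the arithmetic, not code from {es}:

```python
def default_shared_cache_gb(total_disk_gb: float) -> float:
    """Default shared cache size on a dedicated frozen node:
    the greater of 90% of total disk space and total disk space
    minus a 100 GB headroom."""
    return max(0.9 * total_disk_gb, total_disk_gb - 100)

# 4000 GB disk: 90% is 3600 GB, but disk minus the 100 GB headroom is 3900 GB
print(default_shared_cache_gb(4000))  # 3900.0
# 400 GB disk: 90% (360 GB) exceeds disk minus headroom (300 GB)
print(default_shared_cache_gb(400))   # 360.0
```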
You can configure the settings in `elasticsearch.yml`:

@@ -199,11 +198,6 @@ IMPORTANT: You can only configure these settings on nodes with the
<<data-frozen-node,`data_frozen`>> role. Additionally, nodes with a shared
cache can only have a single <<path-settings,data path>>.

[discrete]
[[back-up-restore-searchable-snapshots]]
=== Back up and restore {search-snaps}


@@ -36,19 +36,8 @@ node.roles: [ data_cold ]
node.roles: [ data_frozen ]
----

We recommend you use dedicated nodes in the frozen tier. If needed, you can
assign other nodes to more than one tier.

[source,yaml]
----