[DOCS] Rename mount types for searchable snapshots (#72699)

Changes:

* Renames 'full copy searchable snapshot' to 'fully mounted index.'
* Renames 'shared cache searchable snapshot' to 'partially mounted index.'
* Removes some unneeded cache setup instructions for the frozen tier. We added a default cache size with #71844.
Author: James Rodewig
Date: 2021-05-05 16:35:33 -04:00 (committed by GitHub)
parent 15e42fd748
commit ba66669eb3
10 changed files with 74 additions and 92 deletions


@@ -10,7 +10,8 @@
[id="{upid}-{api}-request"]
==== Request
-The Cache Stats API provides statistics about searchable snapshot shared cache.
+Retrieves statistics about the shared cache for
+{ref}/searchable-snapshots.html#partially-mounted[partially mounted indices].
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------


@@ -259,12 +259,14 @@ Total size, in bytes, of all shards assigned to the node.
`total_data_set_size`::
(<<byte-units,byte value>>)
Total data set size of all shards assigned to the node.
-This includes the size of shards not stored fully on the node (shared cache searchable snapshots).
+This includes the size of shards not stored fully on the node, such as the
+cache for <<partially-mounted,partially mounted indices>>.
`total_data_set_size_in_bytes`::
(integer)
Total data set size, in bytes, of all shards assigned to the node.
-This includes the size of shards not stored fully on the node (shared cache searchable snapshots).
+This includes the size of shards not stored fully on the node, such as the
+cache for <<partially-mounted,partially mounted indices>>.
`reserved`::
(<<byte-units,byte value>>)


@@ -243,12 +243,14 @@ Total size, in bytes, of all shards assigned to selected nodes.
`total_data_set_size`::
(<<byte-units,byte value>>)
Total data set size of all shards assigned to selected nodes.
-This includes the size of shards not stored fully on the nodes (shared cache searchable snapshots).
+This includes the size of shards not stored fully on the nodes, such as the
+cache for <<partially-mounted,partially mounted indices>>.
`total_data_set_size_in_bytes`::
(integer)
Total data set size, in bytes, of all shards assigned to selected nodes.
-This includes the size of shards not stored fully on the nodes (shared cache searchable snapshots).
+This includes the size of shards not stored fully on the nodes, such as the
+cache for <<partially-mounted,partially mounted indices>>.
`reserved`::
(<<byte-units,byte value>>)


@@ -78,8 +78,9 @@ Once data is no longer being updated, it can move from the warm tier to the cold
stays while being queried infrequently.
The cold tier is still a responsive query tier, but data in the cold tier is not normally updated.
As data transitions into the cold tier it can be compressed and shrunken.
-For resiliency, indices in the cold tier can rely on
-<<ilm-searchable-snapshot, searchable snapshots>>, eliminating the need for replicas.
+For resiliency, the cold tier can use <<fully-mounted,fully mounted indices>> of
+<<ilm-searchable-snapshot,{search-snaps}>>, eliminating the need for
+replicas.
[discrete]
[[frozen-tier]]
@@ -88,20 +89,11 @@ For resiliency, indices in the cold tier can rely on
Once data is no longer being queried, or being queried rarely, it may move from
the cold tier to the frozen tier where it stays for the rest of its life.
-The frozen tier uses <<searchable-snapshots,{search-snaps}>> to store and load
-data from a snapshot repository. Instead of using a full local copy of your
-data, these {search-snaps} use smaller <<shared-cache,local caches>> containing
-only recently searched data. If a search requires data that is not in a cache,
-{es} fetches the data as needed from the snapshot repository. This decouples
-compute and storage, letting you run searches over very large data sets with
-minimal compute resources, which significantly reduces your storage and
-operating costs.
-The <<ilm-index-lifecycle, frozen phase>> automatically converts data
-transitioning into the frozen tier into a shared-cache searchable snapshot.
-Search is typically slower on the frozen tier than the cold tier, because {es}
-must sometimes fetch data from the snapshot repository.
+The frozen tier uses <<partially-mounted,partially mounted indices>> to store
+and load data from a snapshot repository. This reduces local storage and
+operating costs while still letting you search frozen data. Because {es} must
+sometimes fetch frozen data from the snapshot repository, searches on the frozen
+tier are typically slower than on the cold tier.
[discrete]
[[data-tier-allocation]]


@@ -167,7 +167,7 @@ Discontinue use of the removed settings. If needed, use
cluster recovery pending a certain number of data nodes.
====
-.Setting the searchable snapshots shared cache size on non-frozen nodes is no longer permitted.
+.You can no longer set `xpack.searchable.snapshot.shared_cache.size` on non-frozen nodes.
[%collapsible]
====
*Details* +
@@ -176,10 +176,10 @@ that does not have the `data_frozen` role was deprecated in {es} 7.12.0 and has
been removed in {es} 8.0.0.
*Impact* +
-Discontinue use of the removed setting. Note that searchable snapshots mounted
-using the `shared_cache` storage option were only allocated to nodes that had
-the `data_frozen` role, so removing this setting on nodes that do not have the
-`data_frozen` role will have no impact on functionality.
+{es} only allocates partially mounted indices to nodes with the `data_frozen`
+role. Do not set `xpack.searchable.snapshot.shared_cache.size` on nodes without
+the `data_frozen` role. Removing the setting on nodes without the `data_frozen`
+role will not impact functionality.
====
.By default, destructive index actions do not allow wildcards.


@@ -8,8 +8,8 @@
experimental::[]
-Clears indices and data streams from the <<shared-cache,shared searchable
-snapshot cache>>.
+Clears indices and data streams from the shared cache for
+<<partially-mounted,partially mounted indices>>.
[[searchable-snapshots-api-clear-cache-request]]
==== {api-request-title}


@@ -48,12 +48,18 @@ include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=master-timeout]
Defaults to `false`.
`storage`::
-(Optional, string) Selects the kind of local storage used to accelerate
-searches of the mounted index. If `full_copy`, each node holding a shard of the
-searchable snapshot index makes a full copy of the shard to its local storage.
-If `shared_cache`, the shard uses the
-<<searchable-snapshots-shared-cache,shared cache>>. Defaults to `full_copy`.
-See <<searchable-snapshot-mount-storage-options>>.
++
+--
+(Optional, string)
+<<searchable-snapshot-mount-storage-options,Mount option>> for the
+{search-snap} index. Possible values are:
+`full_copy` (Default):::
+<<fully-mounted,Fully mounted index>>.
+`shared_cache`:::
+<<partially-mounted,Partially mounted index>>.
+--
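
As a hedged illustration of the two mount options, the following mount API
request uses the `shared_cache` storage option to create a partially mounted
index. The repository, snapshot, and index names are placeholders:

[source,console]
--------------------------------------------------
POST /_snapshot/my_repository/my_snapshot/_mount?storage=shared_cache
{
  "index": "my_index"
}
--------------------------------------------------

Omitting the `storage` parameter, or setting it to `full_copy`, creates a fully
mounted index instead.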
[[searchable-snapshots-api-mount-request-body]]
==== {api-request-body-title}


@@ -6,7 +6,8 @@
<titleabbrev>Cache stats</titleabbrev>
++++
-Provide statistics about the searchable snapshots <<shared-cache,shared cache>>.
+Retrieves statistics about the shared cache for <<partially-mounted,partially
+mounted indices>>.
[[searchable-snapshots-api-cache-stats-request]]
==== {api-request-title}
@@ -22,12 +23,6 @@ If the {es} {security-features} are enabled, you must have the
`manage` cluster privilege to use this API.
For more information, see <<security-privileges>>.
-[[searchable-snapshots-api-cache-stats-desc]]
-==== {api-description-title}
-You can use the Cache Stats API to retrieve statistics about the
-usage of the <<shared-cache,shared cache>> on nodes in a cluster.
[[searchable-snapshots-api-cache-stats-path-params]]
==== {api-path-parms-title}
@@ -97,7 +92,8 @@ Contains statistics about the shared cache file.
[[searchable-snapshots-api-cache-stats-example]]
==== {api-examples-title}
-Retrieves the searchable snapshots shared cache file statistics for all data nodes:
+Gets the statistics about the shared cache for partially mounted indices from
+all data nodes:
[source,console]
--------------------------------------------------


@@ -117,16 +117,15 @@ copying data from the primary.
To search a snapshot, you must first mount it locally as an index. Usually
{ilm-init} will do this automatically, but you can also call the
<<searchable-snapshots-api-mount-snapshot,mount snapshot>> API yourself. There
-are two options for mounting a snapshot, each with different performance
-characteristics and local storage footprints:
+are two options for mounting an index from a snapshot, each with different
+performance characteristics and local storage footprints:
-[[full-copy]]
-Full copy::
+[[fully-mounted]]
+Fully mounted index::
Loads a full copy of the snapshotted index's shards onto node-local storage
-within the cluster. This is the default mount option. {ilm-init} uses this
-option by default in the `hot` and `cold` phases.
+within the cluster. {ilm-init} uses this option in the `hot` and `cold` phases.
+
-Search performance for a full-copy searchable snapshot index is normally
+Search performance for a fully mounted index is normally
comparable to a regular index, since there is minimal need to access the
snapshot repository. While recovery is ongoing, search performance may be
slower than with a regular index because a search may need some data that has
@@ -134,11 +133,11 @@ not yet been retrieved into the local copy. If that happens, {es} will eagerly
retrieve the data needed to complete the search in parallel with the ongoing
recovery.
-[[shared-cache]]
-Shared cache::
+[[partially-mounted]]
+Partially mounted index::
Uses a local cache containing only recently searched parts of the snapshotted
-index's data. {ilm-init} uses this option by default in the `frozen` phase and
-corresponding frozen tier.
+index's data. This cache has a fixed size and is shared across nodes in the
+frozen tier. {ilm-init} uses this option in the `frozen` phase.
+
If a search requires data that is not in the cache, {es} fetches the missing
data from the snapshot repository. Searches that require these fetches are
@@ -146,39 +145,39 @@ slower, but the fetched data is stored in the cache so that similar searches
can be served more quickly in future. {es} will evict infrequently used data
from the cache to free up space.
+
-Although slower than a full local copy or a regular index, a shared-cache
-searchable snapshot index still returns search results quickly, even for large
-data sets, because the layout of data in the repository is heavily optimized
-for search. Many searches will need to retrieve only a small subset of the
-total shard data before returning results.
+Although slower than a fully mounted index or a regular index, a
+partially mounted index still returns search results quickly, even for
+large data sets, because the layout of data in the repository is heavily
+optimized for search. Many searches will need to retrieve only a small subset of
+the total shard data before returning results.
-To mount a searchable snapshot index with the shared cache mount option, you
-must have one or more nodes with a shared cache available. By default,
-dedicated frozen data tier nodes (nodes with the `data_frozen` role and no other
-data roles) have a shared cache configured using the greater of 90% of total
-disk space and total disk space subtracted a headroom of 100GB.
+To partially mount an index, you must have one or more nodes with a shared cache
+available. By default, dedicated frozen data tier nodes (nodes with the
+`data_frozen` role and no other data roles) have a shared cache configured using
+the greater of 90% of total disk space and total disk space minus a 100GB
+headroom.
Using a dedicated frozen tier is highly recommended for production use. If you
do not have a dedicated frozen tier, you must configure the
`xpack.searchable.snapshot.shared_cache.size` setting to reserve space for the
-cache on one or more nodes. Indices mounted with the shared cache mount option
+cache on one or more nodes. Partially mounted indices
are only allocated to nodes that have a shared cache.
[[searchable-snapshots-shared-cache]]
`xpack.searchable.snapshot.shared_cache.size`::
(<<static-cluster-setting,Static>>)
-The size of the space reserved for the shared cache, either specified as a
-percentage of total disk space or an absolute <<byte-units,byte value>>.
-Defaults to 90% of total disk space on dedicated frozen data tier nodes,
-otherwise `0b`.
+Disk space reserved for the shared cache of partially mounted indices.
+Accepts a percentage of total disk space or an absolute <<byte-units,byte
+value>>. Defaults to `90%` of total disk space for dedicated frozen data tier
+nodes. Otherwise defaults to `0b`.
`xpack.searchable.snapshot.shared_cache.size.max_headroom`::
(<<static-cluster-setting,Static>>, <<byte-units,byte value>>)
-For dedicated frozen tier nodes, the max headroom to maintain. Defaults to 100GB
-on dedicated frozen tier nodes when
-`xpack.searchable.snapshot.shared_cache.size` is not explicitly set, otherwise
--1 (not set). Can only be set when `xpack.searchable.snapshot.shared_cache.size`
-is set as a percentage.
+For dedicated frozen tier nodes, the max headroom to maintain. If
+`xpack.searchable.snapshot.shared_cache.size` is not explicitly set, this
+setting defaults to `100GB`. Otherwise it defaults to `-1` (not set). You can
+only configure this setting if `xpack.searchable.snapshot.shared_cache.size` is
+set as a percentage.
To illustrate how these settings work in concert, let us look at two examples
when using the default values of the settings on a dedicated frozen node:
@@ -186,7 +185,7 @@ when using the default values of the settings on a dedicated frozen node:
* A 4000 GB disk will result in a shared cache sized at 3900 GB. 90% of 4000 GB
is 3600 GB, leaving 400 GB headroom. The default `max_headroom` of 100 GB
takes effect, and the result is therefore 3900 GB.
-* A 400 GB disk will result in a shard cache sized at 360 GB.
+* A 400 GB disk will result in a shared cache sized at 360 GB.
You can configure the settings in `elasticsearch.yml`:
@@ -199,11 +198,6 @@ IMPORTANT: You can only configure these settings on nodes with the
<<data-frozen-node,`data_frozen`>> role. Additionally, nodes with a shared
cache can only have a single <<path-settings,data path>>.
-You can set `xpack.searchable.snapshot.shared_cache.size` to any size between a
-couple of gigabytes up to 90% of available disk space. We only recommend larger
-sizes if you use the node exclusively on a frozen tier or for searchable
-snapshots.
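
For reference, a hedged sketch of these settings in `elasticsearch.yml` on a
dedicated frozen node. The values are illustrative, not recommendations:

[source,yaml]
--------------------------------------------------
node.roles: [ data_frozen ]
xpack.searchable.snapshot.shared_cache.size: 90%
xpack.searchable.snapshot.shared_cache.size.max_headroom: 100GB
--------------------------------------------------

Because `max_headroom` can only be used when `size` is a percentage, an
absolute `size` such as `50GB` would be set without the `max_headroom` line.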
[discrete]
[[back-up-restore-searchable-snapshots]]
=== Back up and restore {search-snaps}


@@ -36,19 +36,8 @@ node.roles: [ data_cold ]
node.roles: [ data_frozen ]
----
-For nodes in the frozen tier, set
-<<searchable-snapshots-shared-cache,`xpack.searchable.snapshot.shared_cache.size`>>
-to up to 90% of the node's available disk space. The frozen tier uses this space
-to create a <<shared-cache,shared, fixed-size cache>> for
-<<searchable-snapshots,searchable snapshots>>.
-[source,yaml]
-----
-node.roles: [ data_frozen ]
-xpack.searchable.snapshot.shared_cache.size: 50GB
-----
-If needed, you can assign a node to more than one tier.
+We recommend you use dedicated nodes in the frozen tier. If needed, you can
+assign other nodes to more than one tier.
[source,yaml]
----