[DOCS] Address local vs. remote storage + shard limits feedback (#109360)
parent 47edae4fbd
commit 900eb82c99
7 changed files with 40 additions and 26 deletions
@@ -22,6 +22,9 @@ mounted indices>> of <<ilm-searchable-snapshot,{search-snaps}>> exclusively.
This extends the storage capacity even further — by up to 20 times compared to
the warm tier.

TIP: The performance of an {es} node is often limited by the performance of the underlying storage.
Review our recommendations for optimizing your storage for <<indexing-use-faster-hardware,indexing>> and <<search-use-faster-hardware,search>>.

IMPORTANT: {es} generally expects nodes within a data tier to share the same
hardware profile. Variations not following this recommendation should be
carefully architected to avoid <<hotspotting,hot spotting>>.
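
As an illustration, a minimal sketch of steering an index onto a tier with the `index.routing.allocation.include._tier_preference` index setting (the index name is hypothetical):

[source,console]
----
PUT /my-index-000001/_settings
{
  "index.routing.allocation.include._tier_preference": "data_warm,data_hot"
}
----

With this preference the index is allocated to warm nodes when they are available and falls back to hot nodes otherwise.
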
@@ -94,6 +94,7 @@ auto-generated ids, Elasticsearch can skip this check, which makes indexing
faster.
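
As a minimal sketch (the index name `my-index-000001` is only an example), omitting the document id from the request lets {es} generate one automatically:

[source,console]
----
POST /my-index-000001/_doc
{
  "message": "a document indexed with an auto-generated id"
}
----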

[discrete]
[[indexing-use-faster-hardware]]
=== Use faster hardware

If indexing is I/O-bound, consider increasing the size of the filesystem cache
@@ -110,13 +111,10 @@ different nodes so there's redundancy for any node failures. You can also use
<<snapshot-restore,snapshot and restore>> to backup the index for further
insurance.
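
For instance, a minimal sketch of registering a shared filesystem snapshot repository (the repository name `my_backup` and the mount point are illustrative, and the location must be listed in `path.repo` on every node):

[source,console]
----
PUT _snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "/mount/backups/my_backup"
  }
}
----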

Directly-attached (local) storage generally performs better than remote storage
because it is simpler to configure well and avoids communications overheads.
With careful tuning it is sometimes possible to achieve acceptable performance
using remote storage too. Benchmark your system with a realistic workload to
determine the effects of any tuning parameters. If you cannot achieve the
performance you expect, work with the vendor of your storage system to identify
the problem.

[discrete]
==== Local vs. remote storage

include::./remote-storage.asciidoc[]

[discrete]
=== Indexing buffer size

docs/reference/how-to/remote-storage.asciidoc (new file, 11 additions)
@@ -0,0 +1,11 @@
Directly-attached (local) storage generally performs
better than remote storage because it is simpler to configure well and avoids
communications overheads.

Some remote storage performs very poorly, especially
under the kind of load that {es} imposes. However, with careful tuning, it is
sometimes possible to achieve acceptable performance using remote storage too.
Before committing to a particular storage architecture, benchmark your system
with a realistic workload to determine the effects of any tuning parameters. If
you cannot achieve the performance you expect, work with the vendor of your
storage system to identify the problem.
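
One way to run such a benchmark is with Rally, the {es} benchmarking tool. The invocation below is only a sketch; the track, target hosts, and flags are illustrative and should be adapted to your environment:

[source,sh]
----
# Run the http_logs track against an existing cluster without provisioning a new one
esrally race --track=http_logs --target-hosts=localhost:9200 --pipeline=benchmark-only
----
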
@@ -38,6 +38,7 @@ for `/dev/nvme0n1`, specify `blockdev --setra 256 /dev/nvme0n1`.
// end::readahead[]
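
To verify the value before and after changing it, you can read the current readahead setting back (shown for the same example device):

[source,sh]
----
# Prints the current readahead value, in 512-byte sectors
blockdev --getra /dev/nvme0n1
----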

[discrete]
[[search-use-faster-hardware]]
=== Use faster hardware

If your searches are I/O-bound, consider increasing the size of the filesystem
@@ -46,16 +47,13 @@ sequential and random reads across multiple files, and there may be many
searches running concurrently on each shard, so SSD drives tend to perform
better than spinning disks.

Directly-attached (local) storage generally performs better than remote storage
because it is simpler to configure well and avoids communications overheads.
With careful tuning it is sometimes possible to achieve acceptable performance
using remote storage too. Benchmark your system with a realistic workload to
determine the effects of any tuning parameters. If you cannot achieve the
performance you expect, work with the vendor of your storage system to identify
the problem.

If your searches are CPU-bound, consider using a larger number of faster CPUs.
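
To check whether CPU is actually the bottleneck, one option (a sketch, not the only approach) is the nodes hot threads API, which reports what the busiest threads on each node are doing:

[source,console]
----
GET _nodes/hot_threads
----

Frequent search-related stack traces in the output suggest that searches are CPU-bound rather than I/O-bound.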

[discrete]
==== Local vs. remote storage

include::./remote-storage.asciidoc[]

[discrete]
=== Document modeling

docs/reference/how-to/shard-limits.asciidoc (new file, 4 additions)
@@ -0,0 +1,4 @@
<<cluster-shard-limit,Cluster shard limits>> prevent creation of more than
1000 non-frozen shards per node, and 3000 frozen shards per dedicated frozen
node. Make sure you have enough nodes of each type in your cluster to handle
the number of shards you need.
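
To see how close a cluster is to these limits, a quick sketch is to compare the current shard total from the cluster health API with the number of data nodes multiplied by the per-node limit:

[source,console]
----
GET _cluster/health?filter_path=status,number_of_data_nodes,active_shards
----
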
@@ -34,6 +34,9 @@ cluster sizing video]. As you test different shard configurations, use {kib}'s
{kibana-ref}/elasticsearch-metrics.html[{es} monitoring tools] to track your
cluster's stability and performance.

The performance of an {es} node is often limited by the performance of the underlying storage.
Review our recommendations for optimizing your storage for <<indexing-use-faster-hardware,indexing>> and <<search-use-faster-hardware,search>>.

The following sections provide some reminders and guidelines you should
consider when designing your sharding strategy. If your cluster is already
oversharded, see <<reduce-cluster-shard-count>>.
@@ -225,10 +228,7 @@ GET _cat/shards?v=true
[[shard-count-per-node-recommendation]]
==== Add enough nodes to stay within the cluster shard limits

The <<cluster-shard-limit,cluster shard limits>> prevent creation of more than
1000 non-frozen shards per node, and 3000 frozen shards per dedicated frozen
node. Make sure you have enough nodes of each type in your cluster to handle
the number of shards you need.
include::./shard-limits.asciidoc[]

[discrete]
[[field-count-recommendation]]
@@ -1,5 +1,5 @@
[[modules-node]]
=== Node
=== Nodes

Any time that you start an instance of {es}, you are starting a _node_. A
collection of connected nodes is called a <<modules-cluster,cluster>>. If you
@@ -14,6 +14,10 @@ All nodes know about all the other nodes in the cluster and can forward client
requests to the appropriate node.
// end::modules-node-description-tag[]

TIP: The performance of an {es} node is often limited by the performance of the underlying storage.
Review our recommendations for optimizing your storage for <<indexing-use-faster-hardware,indexing>> and
<<search-use-faster-hardware,search>>.

[[node-roles]]
==== Node roles
@@ -236,6 +240,8 @@ assign data nodes to specific tiers: `data_content`,`data_hot`, `data_warm`,

If you want to include a node in all tiers, or if your cluster does not use multiple tiers, then you can use the generic `data` role.

include::../how-to/shard-limits.asciidoc[]

WARNING: If you assign a node to a specific tier using a specialized data role, then you shouldn't also assign it the generic `data` role. The generic `data` role takes precedence over specialized data roles.
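
For example, a minimal `elasticsearch.yml` sketch for a node that serves the hot and content tiers (the exact role combination is just an illustration); note that it lists only specialized data roles and not the generic `data` role:

[source,yaml]
----
node.roles: [ data_hot, data_content, ingest ]
----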

[[generic-data-node]]
@@ -471,12 +477,6 @@ properly-configured remote block devices (e.g. a SAN) and remote filesystems
storage. You can run multiple {es} nodes on the same filesystem, but each {es}
node must have its own data path.

The performance of an {es} cluster is often limited by the performance of the
underlying storage, so you must ensure that your storage supports acceptable
performance. Some remote storage performs very poorly, especially under the
kind of load that {es} imposes, so make sure to benchmark your system carefully
before committing to a particular storage architecture.

TIP: When using the `.zip` or `.tar.gz` distributions, the `path.data` setting
should be configured to locate the data directory outside the {es} home
directory, so that the home directory can be deleted without deleting your data!
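
A minimal sketch of such a configuration in `elasticsearch.yml`, using an example directory:

[source,yaml]
----
# Keep the data directory outside the Elasticsearch home directory
path.data: /var/data/elasticsearch
----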