// mirror of https://github.com/elastic/elasticsearch.git, synced 2025-04-24 23:27:25 -04:00
[[modules-node]]
=== Node settings

Any time that you start an instance of {es}, you are starting a _node_. A
collection of connected nodes is called a <<modules-cluster,cluster>>. If you
are running a single node of {es}, then you have a cluster of one node.

Every node in the cluster can handle <<modules-network,HTTP and transport>>
traffic by default. The transport layer is used exclusively for communication
between nodes; the HTTP layer is used by REST clients.

[[modules-node-description]]
// tag::modules-node-description-tag[]
All nodes know about all the other nodes in the cluster and can forward client
requests to the appropriate node.
// end::modules-node-description-tag[]

TIP: The performance of an {es} node is often limited by the performance of the
underlying storage. Review our recommendations for optimizing your storage for
<<indexing-use-faster-hardware,indexing>> and
<<search-use-faster-hardware,search>>.
[[node-name-settings]]
==== Node name setting

include::{es-ref-dir}/setup/important-settings/node-name.asciidoc[]
[[node-roles]]
==== Node role settings

You define a node's roles by setting `node.roles` in `elasticsearch.yml`. If you
set `node.roles`, the node is only assigned the roles you specify. If you don't
set `node.roles`, the node is assigned the following roles:

* [[master-node]]`master`
* [[data-node]]`data`
* `data_content`
* `data_hot`
* `data_warm`
* `data_cold`
* `data_frozen`
* `ingest`
* [[ml-node]]`ml`
* `remote_cluster_client`
* [[transform-node]]`transform`

The following additional roles are available:

* `voting_only`
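
For example, to assign only the master and data roles to a node, you might set
the following in `elasticsearch.yml` (the role list shown is illustrative;
choose the roles each node needs):

[source,yaml]
----
node.roles: [ master, data ]
----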

[NOTE]
[[coordinating-only-node]]
.Coordinating node
===============================================
Requests like search requests or bulk-indexing requests may involve data held
on different data nodes. A search request, for example, is executed in two
phases which are coordinated by the node which receives the client request --
the _coordinating node_.

In the _scatter_ phase, the coordinating node forwards the request to the data
nodes which hold the data. Each data node executes the request locally and
returns its results to the coordinating node. In the _gather_ phase, the
coordinating node reduces each data node's results into a single global
result set.

Every node is implicitly a coordinating node, and this behavior cannot be
disabled. A node with an explicitly empty list of roles in the `node.roles`
setting acts only as a coordinating node. As a result, such a node needs
enough memory and CPU to handle the gather phase.
===============================================
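
For example, to run a node as a coordinating-only node, you can give it an
empty roles list:

[source,yaml]
----
node.roles: [ ]
----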

[IMPORTANT]
====
If you set `node.roles`, ensure you specify every node role your cluster needs.
Every cluster requires the following node roles:

* `master`
* {blank}
+
--
`data_content` and `data_hot` +
OR +
`data`
--

Some {stack} features also require specific node roles:

- {ccs-cap} and {ccr} require the `remote_cluster_client` role.
- {stack-monitor-app} and ingest pipelines require the `ingest` role.
- {fleet}, the {security-app}, and {transforms} require the `transform` role.
The `remote_cluster_client` role is also required to use {ccs} with these
features.
- {ml-cap} features, such as {anomaly-detect}, require the `ml` role.
====

As the cluster grows and in particular if you have large {ml} jobs or
{ctransforms}, consider separating dedicated master-eligible nodes from
dedicated data nodes, {ml} nodes, and {transform} nodes.
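
For instance, a dedicated master-eligible node can be given only the `master`
role (an illustrative split; the right roles depend on your workload):

[source,yaml]
----
node.roles: [ master ]
----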

To learn more about the available node roles, see <<node-roles-overview>>.

[discrete]
=== Node data path settings

[[data-path]]
==== `path.data`

Every data and master-eligible node requires access to a data directory where
shards and index and cluster metadata will be stored. `path.data` defaults
to `$ES_HOME/data` but can be configured in the `elasticsearch.yml` config
file as an absolute path or a path relative to `$ES_HOME`, as follows:

[source,yaml]
----
path.data: /var/elasticsearch/data
----

Like all node settings, it can also be specified on the command line as:

[source,sh]
----
./bin/elasticsearch -Epath.data=/var/elasticsearch/data
----

The contents of the `path.data` directory must persist across restarts, because
this is where your data is stored. {es} requires the filesystem to act as if it
were backed by a local disk, but this means that it will work correctly on
properly-configured remote block devices (e.g. a SAN) and remote filesystems
(e.g. NFS) as long as the remote storage behaves no differently from local
storage. You can run multiple {es} nodes on the same filesystem, but each {es}
node must have its own data path.

TIP: When using the `.zip` or `.tar.gz` distributions, the `path.data` setting
should be configured to locate the data directory outside the {es} home
directory, so that the home directory can be deleted without deleting your data!
The RPM and Debian distributions do this for you already.

// tag::modules-node-data-path-warning-tag[]
WARNING: Don't modify anything within the data directory or run processes that
might interfere with its contents. If something other than {es} modifies the
contents of the data directory, then {es} may fail, reporting corruption or
other data inconsistencies, or may appear to work correctly having silently
lost some of your data. Don't attempt to take filesystem backups of the data
directory; there is no supported way to restore such a backup. Instead, use
<<snapshot-restore>> to take backups safely. Don't run virus scanners on the
data directory. A virus scanner can prevent {es} from working correctly and may
modify the contents of the data directory. The data directory contains no
executables so a virus scan will only find false positives.
// end::modules-node-data-path-warning-tag[]
[[custom-node-attributes]]
==== Custom node attributes

If needed, you can add custom attributes to a node. These attributes can be
used to <<cluster-routing-settings,filter which nodes a shard can be allocated
to>>, or to group nodes together for
<<shard-allocation-awareness,shard allocation awareness>>.
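
For example, you could set an attribute named `rack_id` (an illustrative
attribute name) in `elasticsearch.yml`:

[source,yaml]
----
node.attr.rack_id: rack_one
----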

[TIP]
===============================================
You can also set a node attribute using the `-E` command line argument when you
start a node:

[source,sh]
--------------------------------------------------------
./bin/elasticsearch -Enode.attr.rack_id=rack_one
--------------------------------------------------------
===============================================

`node.attr.<attribute-name>`::
(<<dynamic-cluster-setting,Dynamic>>)
A custom attribute that you can assign to a node. For example, you might assign
a `rack_id` attribute to each node to ensure that primary and replica shards
are not allocated on the same rack. You can specify multiple attributes as a
comma-separated list.

[discrete]
[[other-node-settings]]
=== Other node settings

More node settings can be found in <<settings>> and <<important-settings>>,
including:

* <<cluster-name,`cluster.name`>>
* <<node-name,`node.name`>>
* <<modules-network,network settings>>