Migrated documentation into the main repo
This commit is contained in:
parent b9558edeff
commit 822043347e
316 changed files with 23987 additions and 0 deletions
docs/reference/modules/cluster.asciidoc (new file, 230 lines)

[[modules-cluster]]
== Cluster

[float]
=== Shards Allocation

Shards allocation is the process of allocating shards to nodes. This can
happen during initial recovery, replica allocation, rebalancing, or when
nodes are added or removed.

The following settings may be used:

`cluster.routing.allocation.allow_rebalance`::
    Controls when rebalancing will happen based on the total state of all
    the index shards in the cluster. `always`, `indices_primaries_active`,
    and `indices_all_active` are allowed, defaulting to `indices_all_active`
    to reduce chatter during initial recovery.

`cluster.routing.allocation.cluster_concurrent_rebalance`::
    Controls how many concurrent shard rebalances are allowed cluster wide.
    Defaults to `2`.

`cluster.routing.allocation.node_initial_primaries_recoveries`::
    Controls the number of initial primary recoveries allowed per node.
    Since the local gateway is used most of the time, these recoveries are
    fast and more of them can be handled per node without creating load.

`cluster.routing.allocation.node_concurrent_recoveries`::
    How many concurrent recoveries are allowed to happen on a node.
    Defaults to `2`.

`cluster.routing.allocation.disable_new_allocation`::
    Disables allocation of new primaries. Note that this will prevent
    allocation for newly created indices. This setting really only makes
    sense when it is updated dynamically using the cluster update settings
    API.

`cluster.routing.allocation.disable_allocation`::
    Disables either primary or replica allocation (it does not apply to
    newly created primaries, see `disable_new_allocation` above). Note that
    a replica will still be promoted to primary if one does not exist. This
    setting really only makes sense when it is updated dynamically using
    the cluster update settings API.

`cluster.routing.allocation.disable_replica_allocation`::
    Disables replica allocation only. Like the previous setting, it mainly
    makes sense when it is updated dynamically using the cluster update
    settings API.

`indices.recovery.concurrent_streams`::
    The number of streams to open (on a *node* level) to recover a shard
    from a peer shard. Defaults to `3`.
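
As a rough sketch, a few of these settings can also be set statically in
`elasticsearch.yml`; the values below are only illustrative defaults, not
recommendations:

[source,js]
--------------------------------------------------
# Illustrative values only -- tune them for your own cluster.
cluster.routing.allocation.allow_rebalance: indices_all_active
cluster.routing.allocation.cluster_concurrent_rebalance: 2
cluster.routing.allocation.node_concurrent_recoveries: 2
indices.recovery.concurrent_streams: 3
--------------------------------------------------

Most of these can also be changed on a live cluster through the cluster
update settings API described below.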

[float]
=== Shard Allocation Awareness

Cluster allocation awareness allows you to configure shard and replica
allocation across generic attributes associated with nodes. Let's explain
it through an example:

Assume we have several racks. When we start a node, we can configure an
attribute called `rack_id` (any attribute name works), for example, here
is a sample config:

----------------------
node.rack_id: rack_one
----------------------

The above sets an attribute called `rack_id` for the relevant node with
a value of `rack_one`. Now, we need to configure the `rack_id` attribute
as one of the awareness allocation attributes (set it on *all* (master
eligible) nodes config):

--------------------------------------------------------
cluster.routing.allocation.awareness.attributes: rack_id
--------------------------------------------------------

The above means that the `rack_id` attribute will be used to do awareness
based allocation of a shard and its replicas. For example, let's say we
start 2 nodes with `node.rack_id` set to `rack_one`, and deploy a single
index with 5 shards and 1 replica. The index will be fully deployed on
the current nodes (5 shards and 1 replica each, total of 10 shards).

Now, if we start two more nodes, with `node.rack_id` set to `rack_two`,
shards will relocate to even out the number of shards across the nodes,
but a shard and its replica will not be allocated to nodes with the same
`rack_id` value.

The awareness attributes can hold several values, for example:

-------------------------------------------------------------
cluster.routing.allocation.awareness.attributes: rack_id,zone
-------------------------------------------------------------

*NOTE*: When using awareness attributes, shards will not be allocated to
nodes that don't have values set for those attributes.

[float]
=== Forced Awareness

Sometimes, we know in advance the number of values an awareness attribute
can have, and moreover, we would never like to have more replicas than
needed allocated on a specific group of nodes with the same awareness
attribute value. For that, we can force awareness on specific attributes.

For example, let's say we have an awareness attribute called `zone`, and
we know we are going to have two zones, `zone1` and `zone2`. Here is how
we can force awareness on a node:

[source,js]
-------------------------------------------------------------------
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2
cluster.routing.allocation.awareness.attributes: zone
-------------------------------------------------------------------

Now, let's say we start 2 nodes with `node.zone` set to `zone1` and
create an index with 5 shards and 1 replica. The index will be created,
but only 5 shards will be allocated (with no replicas). Only when we
start more nodes with `node.zone` set to `zone2` will the replicas be
allocated.

[float]
==== Automatic Preference When Searching / GETing

When executing a search, or doing a get, the node receiving the request
will prefer to execute the request on shards that exist on nodes that
have the same attribute values as the executing node.

[float]
==== Realtime Settings Update

The settings can be updated using the <<cluster-update-settings,cluster
update settings API>> on a live cluster.

[float]
=== Shard Allocation Filtering

Allows control over the allocation of indices on nodes based on
include/exclude filters. The filters can be set both on the index level
and on the cluster level. Let's start with an example of setting it on
the cluster level:

Let's say we have 4 nodes, each with a specific attribute called `tag`
associated with it (the name of the attribute can be any name). Each
node has a specific value associated with `tag`. Node 1 has a setting
`node.tag: value1`, Node 2 a setting of `node.tag: value2`, and so on.

We can create an index that will only deploy on nodes that have `tag`
set to `value1` and `value2` by setting
`index.routing.allocation.include.tag` to `value1,value2`. For example:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.include.tag" : "value1,value2"
}'
--------------------------------------------------

On the other hand, we can create an index that will be deployed on all
nodes except for nodes with a `tag` of value `value3` by setting
`index.routing.allocation.exclude.tag` to `value3`. For example:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.exclude.tag" : "value3"
}'
--------------------------------------------------

From version 0.90, `index.routing.allocation.require.*` can be used to
specify a number of rules, all of which MUST match in order for a shard
to be allocated to a node. This is in contrast to `include` which will
include a node if ANY rule matches.

The `include`, `exclude` and `require` values can have generic simple
matching wildcards, for example, `value1*`. A special attribute name
called `_ip` can be used to match on node IP values. In addition, the
`_host` attribute can be used to match on either the node's hostname or
its IP address.

Obviously a node can have several attributes associated with it, and
both the attribute name and value are controlled in the settings. For
example, here is a sample of several node configurations:

[source,js]
--------------------------------------------------
node.group1: group1_value1
node.group2: group2_value4
--------------------------------------------------

In the same manner, `include`, `exclude` and `require` can work against
several attributes, for example:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test/_settings -d '{
    "index.routing.allocation.include.group1" : "xxx",
    "index.routing.allocation.include.group2" : "yyy",
    "index.routing.allocation.exclude.group3" : "zzz",
    "index.routing.allocation.require.group4" : "aaa"
}'
--------------------------------------------------

The provided settings can also be updated in real time using the update
settings API, allowing you to "move" indices (shards) around in real time.

Cluster wide filtering can also be defined, and updated in real time
using the cluster update settings API. This setting can come in handy
for things like decommissioning nodes (even if the replica count is set
to 0). Here is a sample of how to decommission a node based on its `_ip`
address:

[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/_cluster/settings -d '{
    "transient" : {
        "cluster.routing.allocation.exclude._ip" : "10.0.0.1"
    }
}'
--------------------------------------------------
docs/reference/modules/discovery.asciidoc (new file, 26 lines)

[[modules-discovery]]
== Discovery

The discovery module is responsible for discovering nodes within a
cluster, as well as electing a master node.

Note that ElasticSearch is a peer to peer based system in which nodes
communicate with one another directly when operations are delegated or
broadcast. All the main APIs (index, delete, search) do not communicate
with the master node. The responsibility of the master node is to
maintain the global cluster state and act when nodes join or leave the
cluster by reassigning shards. Each time the cluster state is changed,
the state is made known to the other nodes in the cluster (the manner
depends on the actual discovery implementation).

[float]
=== Settings

The `cluster.name` setting allows clusters to be separated from one
another. The default value for the cluster name is `elasticsearch`,
though it is recommended to change this to reflect the logical group
name of the cluster running.
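
For example, a minimal config sketch (the name is purely illustrative):

[source,js]
--------------------------------------------------
# "logging-prod" is an illustrative name; pick one that describes your cluster.
cluster.name: logging-prod
--------------------------------------------------

Nodes will only join a cluster whose name matches their own `cluster.name`.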

include::discovery/ec2.asciidoc[]

include::discovery/zen.asciidoc[]
docs/reference/modules/discovery/ec2.asciidoc (new file, 82 lines)

[[modules-discovery-ec2]]
=== EC2 Discovery

EC2 discovery uses the EC2 APIs to perform automatic discovery (similar
to multicast in non hostile multicast environments). Here is a simple
sample configuration:

[source,js]
--------------------------------------------------
cloud:
    aws:
        access_key: AKVAIQBF2RECL7FJWGJQ
        secret_key: vExyMThREXeRMm/b/LRzEB8jWwvzQeXgjqMX+6br

discovery:
    type: ec2
--------------------------------------------------

You'll need to install the `cloud-aws` plugin. Please check the
https://github.com/elasticsearch/elasticsearch-cloud-aws[plugin website]
to find the most up-to-date version to install before (re)starting
elasticsearch.

The following is a list of settings (prefixed with `discovery.ec2`) that
can further control the discovery:

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description
|`groups` |Either a comma separated list or array based list of
(security) groups. Only instances with the provided security groups will
be used in the cluster discovery.

|`host_type` |The type of host to use to communicate with other
instances. Can be one of `private_ip`, `public_ip`, `private_dns`,
`public_dns`. Defaults to `private_ip`.

|`availability_zones` |Either a comma separated list or array based list
of availability zones. Only instances within the provided availability
zones will be used in the cluster discovery.

|`any_group` |If set to `false`, will require all security groups to be
present for the instance to be used for the discovery. Defaults to
`true`.

|`ping_timeout` |How long to wait for existing EC2 nodes to reply during
discovery. Defaults to `3s`.
|=======================================================================

[float]
==== Filtering by Tags

EC2 discovery can also filter machines to include in the cluster based
on tags (and not just groups). The settings to use include the
`discovery.ec2.tag.` prefix. For example, setting
`discovery.ec2.tag.stage` to `dev` will only include instances whose
`stage` tag has the value `dev`. If several tags are set, all of them
must be present for the instance to be included.
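
As a sketch, such a filter could be configured like this (the tag name and
value are illustrative):

[source,js]
--------------------------------------------------
discovery.type: ec2
# Only consider instances tagged stage=dev (illustrative tag and value).
discovery.ec2.tag.stage: dev
--------------------------------------------------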

One practical use for tag filtering is when an EC2 cluster contains many
nodes that are not running elasticsearch. In this case (particularly
with high `ping_timeout` values) there is a risk that a new node's
discovery phase will end before it has found the cluster (which will
result in it declaring itself master of a new cluster with the same name
- highly undesirable). Tagging elasticsearch EC2 nodes and then
filtering by that tag will resolve this issue.

[float]
==== Region

The `cloud.aws.region` can be set to a region and will automatically use
the relevant settings for both `ec2` and `s3`. The available values are:
`us-east-1`, `us-west-1`, `ap-southeast-1`, `eu-west-1`.
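
For example, picking one of the values listed above:

[source,js]
--------------------------------------------------
cloud.aws.region: us-east-1
--------------------------------------------------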

[float]
==== Automatic Node Attributes

Though not dependent on actually using `ec2` as the discovery type (it
still requires the `cloud-aws` plugin to be installed), the plugin can
automatically add node attributes relating to EC2 (for example, the
availability zone), which can be used with the awareness allocation
feature. In order to enable it, set `cloud.node.auto_attributes` to
`true` in the settings.
docs/reference/modules/discovery/zen.asciidoc (new file, 145 lines)

[[modules-discovery-zen]]
=== Zen Discovery

Zen discovery is the built in discovery module for elasticsearch and the
default. It provides both multicast and unicast discovery, as well as
being easily extended to support cloud environments.

Zen discovery is integrated with other modules, for example, all
communication between nodes is done using the
<<modules-transport,transport>> module.

It is separated into several sub modules, which are explained below:

[float]
==== Ping

This is the process where a node uses the discovery mechanisms to find
other nodes. There is support for both multicast and unicast based
discovery (they can be used in conjunction as well).

[float]
===== Multicast

Multicast ping discovery of other nodes is done by sending one or more
multicast requests that existing nodes will receive and respond to. It
provides the following settings with the `discovery.zen.ping.multicast`
prefix:

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description
|`group` |The group address to use. Defaults to `224.2.2.4`.

|`port` |The port to use. Defaults to `54328`.

|`ttl` |The ttl of the multicast message. Defaults to `3`.

|`address` |The address to bind to, defaults to `null` which means it
will bind to all available network interfaces.
|=======================================================================

Multicast can be disabled by setting `multicast.enabled` to `false`.

[float]
===== Unicast

Unicast discovery allows discovery to be performed when multicast is not
enabled. It basically requires a list of hosts to use that will act as
gossip routers. It provides the following settings with the
`discovery.zen.ping.unicast` prefix:

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description
|`hosts` |Either an array setting or a comma delimited setting. Each
value is either in the form of `host:port`, or in the form of
`host[port1-port2]`.
|=======================================================================

The unicast discovery uses the <<modules-transport,transport>> module to
perform the discovery.
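
A minimal sketch of a unicast setup might look like this (the host names
are illustrative; list your own gossip routers):

[source,js]
--------------------------------------------------
# Illustrative hosts -- replace with the addresses of your own nodes.
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["host1:9300", "host2:9300", "host3[9300-9400]"]
--------------------------------------------------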

[float]
==== Master Election

As part of the initial ping process a master of the cluster is either
elected or joined to. This is done automatically. The
`discovery.zen.ping_timeout` (which defaults to `3s`) allows the election
to be configured to handle cases of slow or congested networks (higher
values assure less chance of failure). Note, this setting was changed
from 0.15.1 onwards; prior to that it was called
`discovery.zen.initial_ping_timeout`.

Nodes can be excluded from becoming a master by setting `node.master` to
`false`. Note that once a node is a client node (`node.client` set to
`true`), it will not be allowed to become a master (`node.master` is
automatically set to `false`).

The `discovery.zen.minimum_master_nodes` setting controls the minimum
number of master eligible nodes a node should "see" in order to operate
within the cluster. It is recommended to set it to a value higher than 1
when running more than 2 nodes in the cluster.
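
For example, in a cluster of three master eligible nodes one might set
(a sketch, not a universal recommendation):

[source,js]
--------------------------------------------------
# A majority of 3 master eligible nodes (illustrative for a 3 node cluster).
discovery.zen.minimum_master_nodes: 2
--------------------------------------------------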

[float]
==== Fault Detection

There are two fault detection processes running. The first is by the
master, which pings all the other nodes in the cluster to verify that
they are alive. On the other end, each node pings the master to verify
that it is still alive, or whether an election process needs to be
initiated.

The following settings control the fault detection process using the
`discovery.zen.fd` prefix:

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description
|`ping_interval` |How often a node gets pinged. Defaults to `1s`.

|`ping_timeout` |How long to wait for a ping response, defaults to
`30s`.

|`ping_retries` |How many ping failures / timeouts cause a node to be
considered failed. Defaults to `3`.
|=======================================================================

[float]
==== External Multicast

The multicast discovery also supports external multicast requests to
discover nodes. The external client can send a request to the multicast
IP/group and port, in the form of:

[source,js]
--------------------------------------------------
{
    "request" : {
        "cluster_name": "test_cluster"
    }
}
--------------------------------------------------

And the response will be similar to the node info response (with node
level information only, including transport/http addresses, and node
attributes):

[source,js]
--------------------------------------------------
{
    "response" : {
        "cluster_name" : "test_cluster",
        "transport_address" : "...",
        "http_address" : "...",
        "attributes" : {
            "..."
        }
    }
}
--------------------------------------------------

Note that internal multicast discovery can be disabled while external
discovery keeps working: keep `discovery.zen.ping.multicast.enabled` set
to `true` (the default), but set
`discovery.zen.ping.multicast.ping.enabled` to `false`.
docs/reference/modules/gateway.asciidoc (new file, 74 lines)

[[modules-gateway]]
== Gateway

The gateway module allows one to store the state of the cluster meta
data across full cluster restarts. The cluster meta data mainly holds
all the indices created with their respective (index level) settings and
explicit type mappings.

Each time the cluster meta data changes (for example, when an index is
added or deleted), those changes will be persisted using the gateway.
When the cluster first starts up, the state will be read from the
gateway and applied.

The gateway set on the node level will automatically control the index
gateway that will be used. For example, if the `fs` gateway is used,
then automatically, each index created on the node will also use its own
respective index level `fs` gateway. In this case, if an index should
not persist its state, it should be explicitly set to `none` (which is
the only other value it can be set to).

The default gateway used is the <<modules-gateway-local,local>> gateway.

[float]
=== Recovery After Nodes / Time

In many cases, the actual cluster meta data should only be recovered
after specific nodes have started in the cluster, or a timeout has
passed. This is handy when restarting the cluster, and each node's local
index storage still exists to be reused and not recovered from the
gateway (which reduces the time it takes to recover from the gateway).

The `gateway.recover_after_nodes` setting (which accepts a number)
controls after how many data and master eligible nodes within the
cluster recovery will start. The `gateway.recover_after_data_nodes` and
`gateway.recover_after_master_nodes` settings work in a similar fashion,
except they consider only the number of data nodes and only the number
of master nodes respectively. The `gateway.recover_after_time` setting
(which accepts a time value) sets the time to wait until recovery
happens once all `gateway.recover_after...nodes` conditions are met.

The `gateway.expected_nodes` setting allows you to set how many data and
master eligible nodes are expected to be in the cluster; once that
number is met, the `recover_after_time` is ignored and recovery starts.
The `gateway.expected_data_nodes` and `gateway.expected_master_nodes`
settings are also supported. For example setting:

[source,js]
--------------------------------------------------
gateway:
    recover_after_nodes: 1
    recover_after_time: 5m
    expected_nodes: 2
--------------------------------------------------

in an expected 2 node cluster will cause recovery to start 5 minutes
after the first node is up, but once there are 2 nodes in the cluster,
recovery will begin immediately (without waiting).

Note that once the meta data has been recovered from the gateway (which
indices to create, mappings and so on), this setting is no longer
effective until the next full restart of the cluster.

Operations are blocked while the cluster meta data has not been
recovered, in order not to mix them with the actual cluster meta data
that will be recovered once the above conditions have been met.

include::gateway/local.asciidoc[]

include::gateway/fs.asciidoc[]

include::gateway/hadoop.asciidoc[]

include::gateway/s3.asciidoc[]
docs/reference/modules/gateway/fs.asciidoc (new file, 39 lines)

[[modules-gateway-fs]]
=== Shared FS Gateway

*The shared FS gateway is deprecated and will be removed in a future
version. Please use the <<modules-gateway-local,local gateway>>
instead.*

The file system based gateway stores the cluster meta data and indices
in a *shared* file system. Note that since elasticsearch is a
distributed system, the file system should be shared between all the
different nodes. Here is an example config to enable it:

[source,js]
--------------------------------------------------
gateway:
    type: fs
--------------------------------------------------

[float]
==== location

The location where the gateway stores the cluster state can be set using
the `gateway.fs.location` setting. By default, it will be stored under
the `work` directory. Note that the `work` directory is considered a
temporary directory with ElasticSearch (meaning it is safe to `rm -rf`
it), so even though the persistent gateway defaults to a location under
`work`, *it should be changed*.

When explicitly specifying the `gateway.fs.location`, each node will
append its `cluster.name` to the provided location. This means that the
location provided can safely support several clusters.
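
A sketch of such a config (the mount point is purely illustrative):

[source,js]
--------------------------------------------------
gateway:
    type: fs
    fs:
        # Illustrative path -- use a mount that every node shares.
        location: /mnt/shared/elasticsearch-gateway
--------------------------------------------------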

[float]
==== concurrent_streams

The `gateway.fs.concurrent_streams` setting allows you to throttle the
number of streams (per node) opened against the shared gateway when
performing the snapshot operation. It defaults to `5`.
docs/reference/modules/gateway/hadoop.asciidoc (new file, 36 lines)

[[modules-gateway-hadoop]]
=== Hadoop Gateway

*The hadoop gateway is deprecated and will be removed in a future
version. Please use the <<modules-gateway-local,local gateway>>
instead.*

The hadoop (HDFS) based gateway stores the cluster meta data and indices
data in hadoop. Hadoop support is provided as a plugin; installation is
explained https://github.com/elasticsearch/elasticsearch-hadoop[here],
or you can download the hadoop plugin and place it under the `plugins`
directory. Here is an example config to enable it:

[source,js]
--------------------------------------------------
gateway:
    type: hdfs
    hdfs:
        uri: hdfs://myhost:8022
--------------------------------------------------

[float]
==== Settings

The hadoop gateway requires two simple settings. The `gateway.hdfs.uri`
setting controls the URI to connect to the hadoop cluster, for example:
`hdfs://myhost:8022`. The `gateway.hdfs.path` setting controls the path
under which the gateway will store the data.
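
Putting both settings together, a sketch might look like this (the path
shown is only an illustration):

[source,js]
--------------------------------------------------
gateway:
    type: hdfs
    hdfs:
        uri: hdfs://myhost:8022
        # Illustrative path -- any HDFS directory the nodes can write to.
        path: /elasticsearch/gateway
--------------------------------------------------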

[float]
==== concurrent_streams

The `gateway.hdfs.concurrent_streams` setting allows you to throttle the
number of streams (per node) opened against the shared gateway when
performing the snapshot operation. It defaults to `5`.
docs/reference/modules/gateway/local.asciidoc (new file, 31 lines)

[[modules-gateway-local]]
=== Local Gateway

The local gateway allows for recovery of the full cluster state and
indices from the local storage of each node, and does not require a
common node level shared storage.

Note that, unlike the shared gateway types, persistence to the local
gateway is *not* done in an async manner. Once an operation is
performed, the data is there for the local gateway to recover in case of
full cluster failure.

It is important to configure the `gateway.recover_after_nodes` setting
to include most of the nodes expected to be started after a full cluster
restart. This will ensure that the latest cluster state is recovered.
For example:

[source,js]
--------------------------------------------------
gateway:
    recover_after_nodes: 1
    recover_after_time: 5m
    expected_nodes: 2
--------------------------------------------------

Note, to backup/snapshot the full cluster state it is recommended that
the local storage for all nodes be copied (in theory not all are
required, just enough to guarantee that a copy of each shard has been
copied, i.e. depending on the replication settings) while disabling
flush. Shared storage such as S3 can be used to keep the different
nodes' copies in one place, though it does come at a price of more IO.
docs/reference/modules/gateway/s3.asciidoc (new file, 51 lines)

[[modules-gateway-s3]]
=== S3 Gateway

*The S3 gateway is deprecated and will be removed in a future version.
Please use the <<modules-gateway-local,local gateway>> instead.*

The S3 based gateway allows long term, reliable, async persistence of
the cluster state and indices directly to Amazon S3. Here is how it can
be configured:

[source,js]
--------------------------------------------------
cloud:
    aws:
        access_key: AKVAIQBF2RECL7FJWGJQ
        secret_key: vExyMThREXeRMm/b/LRzEB8jWwvzQeXgjqMX+6br


gateway:
    type: s3
    s3:
        bucket: bucket_name
--------------------------------------------------

You'll need to install the `cloud-aws` plugin, by running
`bin/plugin install cloud-aws` before (re)starting elasticsearch.

The following is a list of settings (prefixed with `gateway.s3`) that
can further control the S3 gateway:

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description
|`chunk_size` |Big files are broken down into chunks (to overcome the
AWS 5GB limit and to use concurrent snapshotting). Defaults to `100m`.
|=======================================================================

[float]
==== concurrent_streams

The `gateway.s3.concurrent_streams` setting allows you to throttle the
number of streams (per node) opened against the shared gateway when
performing the snapshot operation. It defaults to `5`.

[float]
==== Region

The `cloud.aws.region` can be set to a region and will automatically use
the relevant settings for both `ec2` and `s3`. The available values are:
`us-east-1`, `us-west-1`, `ap-southeast-1`, `eu-west-1`.
docs/reference/modules/http.asciidoc (new file, 51 lines)

[[modules-http]]
== HTTP

The http module exposes *elasticsearch* APIs over HTTP.

The http mechanism is completely asynchronous in nature, meaning that
there is no blocking thread waiting for a response. The benefit of using
asynchronous communication for HTTP is solving the
http://en.wikipedia.org/wiki/C10k_problem[C10k problem].

When possible, consider using
http://en.wikipedia.org/wiki/Keepalive#HTTP_Keepalive[HTTP keep alive]
when connecting, for better performance, and try to get your favorite
client not to do
http://en.wikipedia.org/wiki/Chunked_transfer_encoding[HTTP chunking].

[float]
=== Settings

The following are the settings that can be configured for HTTP:

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description
|`http.port` |A bind port range. Defaults to `9200-9300`.

|`http.max_content_length` |The max content of an HTTP request. Defaults
to `100mb`.

|`http.max_initial_line_length` |The max length of an HTTP URL. Defaults
to `4kb`.

|`http.compression` |Support for compression when possible (with
Accept-Encoding). Defaults to `false`.

|`http.compression_level` |Defines the compression level to use.
Defaults to `6`.
|=======================================================================

It also uses the common <<modules-network,network settings>>.
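
For example, HTTP compression could be switched on while keeping the
other settings at the defaults listed above (a sketch with illustrative
values):

[source,js]
--------------------------------------------------
# Illustrative values; port and max_content_length repeat the defaults.
http.port: 9200-9300
http.max_content_length: 100mb
http.compression: true
--------------------------------------------------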

[float]
=== Disable HTTP

The http module can be completely disabled and not started by setting
`http.enabled` to `false`. This makes sense on nodes that should not
accept HTTP requests directly, for example when non
<<modules-node,data nodes>> accept the HTTP requests and communicate
with the data nodes using the internal
<<modules-transport,transport>>.
docs/reference/modules/indices.asciidoc (new file, 75 lines)

[[modules-indices]]
== Indices

The indices module allows control over settings that are globally
managed for all indices.

[float]
=== Indexing Buffer

The indexing buffer setting allows control over how much memory will be
allocated for the indexing process. It is a global setting that bubbles
down to all the different shards allocated on a specific node.

The `indices.memory.index_buffer_size` setting accepts either a
percentage or a byte size value. It defaults to `10%`, meaning that
`10%` of the total memory allocated to a node will be used as the
indexing buffer size. This amount is then divided between all the
different shards. Also, if a percentage is used, it is possible to set
`min_index_buffer_size` (defaults to `48mb`) and `max_index_buffer_size`
(unbounded by default).

The `indices.memory.min_shard_index_buffer_size` setting allows you to
set a hard lower limit for the memory allocated per shard for its own
indexing buffer. It defaults to `4mb`.
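
A config sketch of these settings, assuming the `min_index_buffer_size`
knob lives under the same `indices.memory` prefix as the others (the
values shown are simply the documented defaults):

[source,js]
--------------------------------------------------
# Values shown are the documented defaults.
indices.memory.index_buffer_size: 10%
indices.memory.min_index_buffer_size: 48mb
indices.memory.min_shard_index_buffer_size: 4mb
--------------------------------------------------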

[float]
=== TTL interval

You can dynamically set `indices.ttl.interval`, which controls how often
expired documents will be automatically deleted. The default value is
`60s`.

Expired documents are deleted in bulk. You can set
`indices.ttl.bulk_size` to fit your needs. The default value is `10000`.

See also <<mapping-ttl-field>>.

[float]
=== Recovery

The following settings can be set to manage the recovery policy (a
config sketch follows the list):

[horizontal]
`indices.recovery.concurrent_streams`::
    defaults to `3`.

`indices.recovery.file_chunk_size`::
    defaults to `512kb`.

`indices.recovery.translog_ops`::
    defaults to `1000`.

`indices.recovery.translog_size`::
    defaults to `512kb`.

`indices.recovery.compress`::
    defaults to `true`.

`indices.recovery.max_bytes_per_sec`::
    since 0.90.1, defaults to `20mb`.

`indices.recovery.max_size_per_sec`::
    deprecated from 0.90.1. Replaced by `indices.recovery.max_bytes_per_sec`.
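
The sketch below simply restates the documented defaults in
`elasticsearch.yml` form:

[source,js]
--------------------------------------------------
# Documented defaults, restated for illustration.
indices.recovery.concurrent_streams: 3
indices.recovery.file_chunk_size: 512kb
indices.recovery.translog_ops: 1000
indices.recovery.translog_size: 512kb
indices.recovery.compress: true
indices.recovery.max_bytes_per_sec: 20mb
--------------------------------------------------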

[float]
=== Store level throttling

The following settings can be set to control store throttling:

[horizontal]
`indices.store.throttle.type`::
    could be `merge` (default), `not` or `all`. See <<index-modules-store>>.

`indices.store.throttle.max_bytes_per_sec`::
    defaults to `20mb`.
docs/reference/modules/jmx.asciidoc (new file, 34 lines)

[[modules-jmx]]
== JMX

[float]
=== REMOVED AS OF v0.90

Use the stats APIs instead.

The JMX module exposes node information through
http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/[JMX].
JMX can be used by either
http://en.wikipedia.org/wiki/JConsole[jconsole] or
http://en.wikipedia.org/wiki/VisualVM[VisualVM].

Exposed JMX data includes both node level information and information
about the indices and shards instantiated on a specific node. This is a
work in progress with each version exposing more information.

[float]
=== jmx.domain

The domain under which JMX will register can be set using the
`jmx.domain` setting. It defaults to `{elasticsearch}`.

[float]
=== jmx.create_connector

An RMI connector can be started to accept JMX requests. This can be
enabled by setting `jmx.create_connector` to `true`. An RMI connector
does come with its own overhead, so make sure you really need it.

When an RMI connector is created, the `jmx.port` setting provides a port
range setting for the ports the RMI connector can open. By default, it
is set to `9400-9500`.
docs/reference/modules/memcached.asciidoc (new file, 69 lines)

[[modules-memcached]]
== memcached

The memcached module exposes *elasticsearch* APIs over the memcached
protocol (as closely as possible).

It is provided as a plugin called `transport-memcached`, and its
installation is explained
https://github.com/elasticsearch/elasticsearch-transport-memcached[here].
Another option is to download the memcached plugin and place it under
the `plugins` directory.

The memcached protocol supports both the binary and the text protocol,
automatically detecting the correct one to use.

[float]
=== Mapping REST to Memcached Protocol

Memcached commands are mapped to REST and handled by the same generic
REST layer in elasticsearch. Here is a list of the memcached commands
supported:

[float]
==== GET

The memcached `GET` command maps to a REST `GET`. The key used is the
URI (with parameters). The main downside is the fact that the memcached
`GET` does not allow a body in the request (and `SET` does not allow a
result to be returned...). For this reason, most REST APIs (like search)
also accept the "source" as a URI parameter.

[float]
==== SET

The memcached `SET` command maps to a REST `POST`. The key used is the
URI (with parameters), and the body maps to the REST body.

[float]
==== DELETE

The memcached `DELETE` command maps to a REST `DELETE`. The key used is
the URI (with parameters).

[float]
==== QUIT

The memcached `QUIT` command is supported and disconnects the client.

[float]
=== Settings

The following are the settings that can be configured for memcached:

[cols="<,<",options="header",]
|===============================================================
|Setting |Description
|`memcached.port` |A bind port range. Defaults to `11211-11311`.
|===============================================================

It also uses the common <<modules-network,network settings>>.

[float]
=== Disable memcached

The memcached module can be completely disabled and not started by
setting `memcached.enabled` to `false`. By default it is enabled once it
is detected as a plugin.
docs/reference/modules/network.asciidoc (new file, 88 lines)

[[modules-network]]
== Network Settings

There are several modules within a Node that use network based
configuration, for example, the <<modules-transport,transport>> and
<<modules-http,http>> modules. Node level network settings allow you to
set common settings that will be shared among all network based modules
(unless explicitly overridden in each module).

The `network.bind_host` setting controls the host that the different
network components will bind to. By default, the bind host will be
`anyLocalAddress` (typically `0.0.0.0` or `::0`).

The `network.publish_host` setting controls the host the node publishes
itself as within the cluster, so that other nodes will be able to
connect to it. Of course, this can't be the `anyLocalAddress`; by
default, it will be the first non loopback address (if possible), or the
local address.

The `network.host` setting is a simple setting that automatically sets
both `network.bind_host` and `network.publish_host` to the same host
value.

Both settings can be configured with either an explicit host address or
a host name. The settings also accept the logical setting values
explained in the following table:

[cols="<,<",options="header",]
|=======================================================================
|Logical Host Setting Value |Description
|`_local_` |Will be resolved to the local ip address.

|`_non_loopback_` |The first non loopback address.

|`_non_loopback:ipv4_` |The first non loopback IPv4 address.

|`_non_loopback:ipv6_` |The first non loopback IPv6 address.

|`_[networkInterface]_` |Resolves to the ip address of the provided
network interface. For example `_en0_`.

|`_[networkInterface]:ipv4_` |Resolves to the ipv4 address of the
provided network interface. For example `_en0:ipv4_`.

|`_[networkInterface]:ipv6_` |Resolves to the ipv6 address of the
provided network interface. For example `_en0:ipv6_`.
|=======================================================================
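
For example, to bind and publish on the first non loopback address, a
config sketch might be (just one possible choice among the logical
values above):

[source,js]
--------------------------------------------------
# Sets both network.bind_host and network.publish_host.
network.host: _non_loopback_
--------------------------------------------------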

When the `cloud-aws` plugin is installed, the following are also allowed
as valid network host settings:

[cols="<,<",options="header",]
|==================================================================
|EC2 Host Value |Description
|`_ec2:privateIpv4_` |The private IP address (ipv4) of the machine.
|`_ec2:privateDns_` |The private host of the machine.
|`_ec2:publicIpv4_` |The public IP address (ipv4) of the machine.
|`_ec2:publicDns_` |The public host of the machine.
|`_ec2_` |Less verbose option for the private ip address.
|`_ec2:privateIp_` |Less verbose option for the private ip address.
|`_ec2:publicIp_` |Less verbose option for the public ip address.
|==================================================================

[float]
=== TCP Settings

Any component that uses TCP (like the HTTP, Transport and Memcached
modules) shares the following settings:

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description
|`network.tcp.no_delay` |Enable or disable the tcp no delay setting.
Defaults to `true`.

|`network.tcp.keep_alive` |Enable or disable tcp keep alive. Not
explicitly set by default.

|`network.tcp.reuse_address` |Should an address be reused or not.
Defaults to `true` on non-Windows machines.

|`network.tcp.send_buffer_size` |The size of the tcp send buffer (in
size setting format). Not explicitly set by default.

|`network.tcp.receive_buffer_size` |The size of the tcp receive buffer
(in size setting format). Not explicitly set by default.
|=======================================================================
docs/reference/modules/node.asciidoc (new file, 32 lines)

[[modules-node]]
== Node

*elasticsearch* allows you to configure whether a node is allowed to
store data locally or not. Storing data locally basically means that
shards of different indices are allowed to be allocated on that node. By
default, each node is considered to be a data node, and this can be
turned off by setting `node.data` to `false`.

This is a powerful setting that makes it simple to create smart load
balancers that take part in some of the API processing. Let's take an
example:

We can start a whole cluster of data nodes which do not even start an
HTTP transport by setting `http.enabled` to `false`. Such nodes will
communicate with one another using the
<<modules-transport,transport>> module. In front of the cluster we can
start one or more "non data" nodes which will start with HTTP enabled.
All HTTP communication will be performed through these "non data" nodes.
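
A sketch of the two node roles described above, shown as two separate
`elasticsearch.yml` fragments (one per node):

[source,js]
--------------------------------------------------
# Data node: stores data and only speaks the internal transport.
node.data: true
http.enabled: false

# "Non data" node (use on a separate node): accepts HTTP and routes
# requests to the data nodes over the transport module.
# node.data: false
# http.enabled: true
--------------------------------------------------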

The benefit of this setup is, first, the ability to create smart load
balancers. These "non data" nodes are still part of the cluster, and
they redirect operations exactly to the node that holds the relevant
data. The other benefit is the fact that for scatter / gather based
operations (such as search), these nodes will take part in the
processing, since they will start the scatter process and perform the
actual gather processing.

This leaves the data nodes to do the heavy duty work of indexing and
searching, without needing to process HTTP requests (parsing), overload
the network, or perform the gather processing.
docs/reference/modules/plugins.asciidoc (new file, 245 lines)
|
@ -0,0 +1,245 @@
|
|||
[[modules-plugins]]
|
||||
== Plugins
|
||||
|
||||
[float]
|
||||
=== Plugins
|
||||
|
||||
Plugins are a way to enhance the basic elasticsearch functionality in a
|
||||
custom manner. They range from adding custom mapping types, custom
|
||||
analyzers (in a more built in fashion), native scripts, custom discovery
|
||||
and more.
|
||||
|
||||
[float]
|
||||
==== Installing plugins
|
||||
|
||||
Installing plugins can either be done manually by placing them under the
|
||||
`plugins` directory, or using the `plugin` script. Several plugins can
|
||||
be found under the https://github.com/elasticsearch[elasticsearch]
|
||||
organization in GitHub, starting with `elasticsearch-`.
|
||||
|
||||
Starting from 0.90.2, installing plugins typically take the form of
|
||||
`plugin --install <org>/<user/component>/<version>`. The plugins will be
|
||||
automatically downloaded in this case from `download.elasticsearch.org`,
|
||||
and in case they don't exist there, from maven (central and sonatype).
|
||||
|
||||
Note that when the plugin is located in maven central or sonatype
|
||||
repository, `<org>` is the artifact `groupId` and `<user/component>` is
|
||||
the `artifactId`.
|
||||
|
||||
For prior version, the older form is
|
||||
`plugin -install <org>/<user/component>/<version>`
|
||||
|
||||
A plugin can also be installed directly by specifying the URL for it,
|
||||
for example:
|
||||
`bin/plugin --url file://path/to/plugin --install plugin-name` or
|
||||
`bin/plugin -url file://path/to/plugin -install plugin-name` for older
|
||||
version.
|
||||
|
||||
Starting from 0.90.2, for more information about plugins, you can run
|
||||
`bin/plugin -h`.
|
||||
|
||||
[float]
|
||||
==== Site Plugins
|
||||
|
||||
Plugins can have "sites" in them, any plugin that exists under the
|
||||
`plugins` directory with a `_site` directory, its content will be
|
||||
statically served when hitting `/_plugin/[plugin_name]/` url. Those can
|
||||
be added even after the process has started.
|
||||
|
||||
Installed plugins that do not contain any java related content, will
|
||||
automatically be detected as site plugins, and their content will be
|
||||
moved under `_site`.
|
||||
|
||||
The ability to install plugins from Github allows to easily install site
|
||||
plugins hosted there by downloading the actual repo, for example,
|
||||
running:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
# From 0.90.2
|
||||
bin/plugin --install mobz/elasticsearch-head
|
||||
bin/plugin --install lukas-vlcek/bigdesk
|
||||
|
||||
# From a prior version
|
||||
bin/plugin -install mobz/elasticsearch-head
|
||||
bin/plugin -install lukas-vlcek/bigdesk
|
||||
--------------------------------------------------
|
||||
|
||||
Will install both of those site plugins, with `elasticsearch-head`
|
||||
available under `http://localhost:9200/_plugin/head/` and `bigdesk`
|
||||
available under `http://localhost:9200/_plugin/bigdesk/`.
|
||||
|
||||
[float]
|
||||
==== Mandatory Plugins
|
||||
|
||||
If you rely on some plugins, you can define mandatory plugins using the
|
||||
`plugin.mandatory` attribute, for example, here is a sample config:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
plugin.mandatory: mapper-attachments,lang-groovy
|
||||
--------------------------------------------------
|
||||
|
||||
For safety reasons, if a mandatory plugin is not installed, the node
|
||||
will not start.
|
||||
|
||||
[float]
|
||||
==== Installed Plugins
|
||||
|
||||
A list of the currently loaded plugins can be retrieved using the
|
||||
<<cluster-nodes-info,Node Info API>>.
|
||||
|
||||
[float]
|
||||
=== Known Plugins
|
||||
|
||||
[float]
|
||||
==== Analysis Plugins
|
||||
|
||||
* https://github.com/yakaz/elasticsearch-analysis-combo/[Combo Analysis
|
||||
Plugin] (by Olivier Favre, Yakaz)
|
||||
* https://github.com/elasticsearch/elasticsearch-analysis-smartcn[Smart
|
||||
Chinese Analysis Plugin] (by elasticsearch team)
|
||||
* https://github.com/elasticsearch/elasticsearch-analysis-icu[ICU
|
||||
Analysis plugin] (by elasticsearch team)
|
||||
* https://github.com/elasticsearch/elasticsearch-analysis-stempel[Stempel
|
||||
(Polish) Analysis plugin] (by elasticsearch team)
|
||||
* https://github.com/chytreg/elasticsearch-analysis-morfologik[Morfologik
|
||||
(Polish) Analysis plugin] (by chytreg)
|
||||
* https://github.com/medcl/elasticsearch-analysis-ik[IK Analysis Plugin]
|
||||
(by Medcl)
|
||||
* https://github.com/medcl/elasticsearch-analysis-mmseg[Mmseg Analysis
|
||||
Plugin] (by Medcl)
|
||||
* https://github.com/jprante/elasticsearch-analysis-hunspell[Hunspell
|
||||
Analysis Plugin] (by Jörg Prante)
|
||||
* https://github.com/elasticsearch/elasticsearch-analysis-kuromoji[Japanese
|
||||
(Kuromoji) Analysis plugin] (by elasticsearch team).
|
||||
* https://github.com/suguru/elasticsearch-analysis-japanese[Japanese
|
||||
Analysis plugin] (by suguru).
|
||||
* https://github.com/imotov/elasticsearch-analysis-morphology[Russian
|
||||
and English Morphological Analysis Plugin] (by Igor Motov)
|
||||
* https://github.com/medcl/elasticsearch-analysis-pinyin[Pinyin Analysis
|
||||
Plugin] (by Medcl)
|
||||
* https://github.com/medcl/elasticsearch-analysis-string2int[String2Integer
|
||||
Analysis Plugin] (by Medcl)
|
||||
* https://github.com/barminator/elasticsearch-analysis-annotation[Annotation
|
||||
Analysis Plugin] (by Michal Samek)
|
||||
|
||||
[float]
|
||||
==== River Plugins
|
||||
|
||||
* https://github.com/elasticsearch/elasticsearch-river-couchdb[CouchDB
|
||||
River Plugin] (by elasticsearch team)
|
||||
* https://github.com/elasticsearch/elasticsearch-river-wikipedia[Wikipedia
|
||||
River Plugin] (by elasticsearch team)
|
||||
* https://github.com/elasticsearch/elasticsearch-river-twitter[Twitter
|
||||
River Plugin] (by elasticsearch team)
|
||||
* https://github.com/elasticsearch/elasticsearch-river-rabbitmq[RabbitMQ
|
||||
River Plugin] (by elasticsearch team)
|
||||
* https://github.com/domdorn/elasticsearch-river-activemq/[ActiveMQ
|
||||
River Plugin] (by Dominik Dorn)
|
||||
* https://github.com/albogdano/elasticsearch-river-amazonsqs[Amazon SQS
|
||||
River Plugin] (by Alex Bogdanovski)
|
||||
* https://github.com/xxBedy/elasticsearch-river-csv[CSV River Plugin]
|
||||
(by Martin Bednar)
|
||||
* http://www.pilato.fr/dropbox/[Dropbox River Plugin] (by David Pilato)
|
||||
* http://www.pilato.fr/fsriver/[FileSystem River Plugin] (by David
|
||||
Pilato)
|
||||
* https://github.com/sksamuel/elasticsearch-river-hazelcast[Hazelcast
|
||||
River Plugin] (by Steve Samuel)
|
||||
* https://github.com/jprante/elasticsearch-river-jdbc[JDBC River Plugin]
|
||||
(by Jörg Prante)
|
||||
* https://github.com/qotho/elasticsearch-river-jms[JMS River Plugin] (by
|
||||
Steve Sarandos)
|
||||
* https://github.com/tlrx/elasticsearch-river-ldap[LDAP River Plugin]
|
||||
(by Tanguy Leroux)
|
||||
* https://github.com/richardwilly98/elasticsearch-river-mongodb/[MongoDB
|
||||
River Plugin] (by Richard Louapre)
|
||||
* https://github.com/sksamuel/elasticsearch-river-neo4j[Neo4j River
|
||||
Plugin] (by Steve Samuel)
|
||||
* https://github.com/jprante/elasticsearch-river-oai/[Open Archives
|
||||
Initiative (OAI) River Plugin] (by Jörg Prante)
|
||||
* https://github.com/sksamuel/elasticsearch-river-redis[Redis River
|
||||
Plugin] (by Steve Samuel)
|
||||
* http://dadoonet.github.com/rssriver/[RSS River Plugin] (by David
|
||||
Pilato)
|
||||
* https://github.com/adamlofts/elasticsearch-river-sofa[Sofa River
|
||||
Plugin] (by adamlofts)
|
||||
* https://github.com/javanna/elasticsearch-river-solr/[Solr River
|
||||
Plugin] (by Luca Cavanna)
|
||||
* https://github.com/sunnygleason/elasticsearch-river-st9[St9 River
|
||||
Plugin] (by Sunny Gleason)
|
||||
* https://github.com/endgameinc/elasticsearch-river-kafka[Kafka River
|
||||
Plugin] (by Endgame Inc.)
|
||||
* https://github.com/obazoud/elasticsearch-river-git[Git River Plugin] (by Olivier Bazoud)
|
||||
|
||||
[float]
==== Transport Plugins

* https://github.com/elasticsearch/elasticsearch-transport-wares[Servlet transport] (by elasticsearch team)
* https://github.com/elasticsearch/elasticsearch-transport-memcached[Memcached transport plugin] (by elasticsearch team)
* https://github.com/elasticsearch/elasticsearch-transport-thrift[Thrift Transport] (by elasticsearch team)
* https://github.com/tlrx/transport-zeromq[ZeroMQ transport layer plugin] (by Tanguy Leroux)
* https://github.com/sonian/elasticsearch-jetty[Jetty HTTP transport plugin] (by Sonian Inc.)

[float]
==== Scripting Plugins

* https://github.com/elasticsearch/elasticsearch-lang-python[Python language Plugin] (by elasticsearch team)
* https://github.com/elasticsearch/elasticsearch-lang-javascript[JavaScript language Plugin] (by elasticsearch team)
* https://github.com/elasticsearch/elasticsearch-lang-groovy[Groovy lang Plugin] (by elasticsearch team)
* https://github.com/hiredman/elasticsearch-lang-clojure[Clojure Language Plugin] (by Kevin Downey)

[float]
==== Site Plugins

* https://github.com/lukas-vlcek/bigdesk[BigDesk Plugin] (by Lukáš Vlček)
* https://github.com/mobz/elasticsearch-head[Elasticsearch Head Plugin] (by Ben Birch)
* https://github.com/royrusso/elasticsearch-HQ[ElasticSearch HQ] (by Roy Russo)
* https://github.com/karmi/elasticsearch-paramedic[Paramedic Plugin] (by Karel Minařík)
* https://github.com/polyfractal/elasticsearch-segmentspy[SegmentSpy Plugin] (by Zachary Tong)
* https://github.com/polyfractal/elasticsearch-inquisitor[Inquisitor Plugin] (by Zachary Tong)
* https://github.com/andrewvc/elastic-hammer[Hammer Plugin] (by Andrew Cholakian)

[float]
==== Misc Plugins

* https://github.com/elasticsearch/elasticsearch-mapper-attachments[Mapper Attachments Type plugin] (by elasticsearch team)
* https://github.com/elasticsearch/elasticsearch-hadoop[Hadoop Plugin] (by elasticsearch team)
* https://github.com/elasticsearch/elasticsearch-cloud-aws[AWS Cloud Plugin] (by elasticsearch team)
* https://github.com/mattweber/elasticsearch-mocksolrplugin[ElasticSearch Mock Solr Plugin] (by Matt Weber)
* https://github.com/spinscale/elasticsearch-suggest-plugin[Suggester Plugin] (by Alexander Reelsen)
* https://github.com/medcl/elasticsearch-partialupdate[ElasticSearch PartialUpdate Plugin] (by Medcl)
* https://github.com/sonian/elasticsearch-zookeeper[ZooKeeper Discovery Plugin] (by Sonian Inc.)
* https://github.com/derryx/elasticsearch-changes-plugin[ElasticSearch Changes Plugin] (by Thomas Peuss)
* http://tlrx.github.com/elasticsearch-view-plugin[ElasticSearch View Plugin] (by Tanguy Leroux)
* https://github.com/viniciusccarvalho/elasticsearch-newrelic[ElasticSearch New Relic Plugin] (by Vinicius Carvalho)
* https://github.com/endgameinc/elasticsearch-term-plugin[Terms Component Plugin] (by Endgame Inc.)
* https://github.com/carrot2/elasticsearch-carrot2[carrot2 Plugin]: Results clustering with carrot2 (by Dawid Weiss)

242
docs/reference/modules/scripting.asciidoc
Normal file

@@ -0,0 +1,242 @@
[[modules-scripting]]
== Scripting

The scripting module allows scripts to be used in order to evaluate custom
expressions. For example, scripts can be used to return "script fields"
as part of a search request, or to evaluate a custom score for a query,
and so on.

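As an illustration, a search request could use a script field to compute a
value per hit; the `price` field below is purely illustrative:

[source,js]
--------------------------------------------------
{
    "query" : {
        "match_all" : {}
    },
    "script_fields" : {
        "price_with_tax" : {
            "script" : "doc['price'].value * 1.21"
        }
    }
}
--------------------------------------------------
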
The scripting module uses by default http://mvel.codehaus.org/[mvel] as
the scripting language, with some extensions. mvel is used since it is
extremely fast and very simple to use, and in most cases simple
expressions are needed (for example, mathematical equations).

Additional `lang` plugins are provided to allow to execute scripts in
different languages. Currently supported plugins are `lang-javascript`
for JavaScript, `lang-groovy` for Groovy, and `lang-python` for Python.
In all places where a `script` parameter can be used, a `lang` parameter
(on the same level) can be provided to define the language of the
script. The `lang` options are `mvel`, `js`, `groovy`, `python`, and
`native`.

[float]
=== Default Scripting Language

The default scripting language (assuming no `lang` parameter is
provided) is `mvel`. In order to change it, set `script.default_lang`
to the appropriate language.

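For example, a node configuration could switch the default to Python; this
is only a sketch and assumes the `lang-python` plugin is installed:

[source,js]
--------------------------------------------------
script.default_lang: python
--------------------------------------------------
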
[float]
=== Preloaded Scripts

Scripts can always be provided as part of the relevant API, but they can
also be preloaded by placing them under `config/scripts` and then
referencing them by the script name (instead of providing the full
script). This helps reduce the amount of data passed between the client
and the nodes.

The name of the script is derived from the hierarchy of directories it
exists under, and the file name without the lang extension. For example,
a script placed under `config/scripts/group1/group2/test.py` will be
named `group1_group2_test`.

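Such a preloaded script could then be referenced by name instead of by
source; the script field name below is an assumption made for the example:

[source,js]
--------------------------------------------------
{
    "script_fields" : {
        "computed_field" : {
            "script" : "group1_group2_test",
            "lang" : "python"
        }
    }
}
--------------------------------------------------
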
[float]
=== Native (Java) Scripts

Even though `mvel` is pretty fast, it is possible to register native Java
based scripts for faster execution.

In order to allow for such scripts, a `NativeScriptFactory` needs to be
implemented that constructs the script that will be executed. There are
two main types: one that extends `AbstractExecutableScript` and one that
extends `AbstractSearchScript` (probably the one most users will extend,
with additional helper classes in `AbstractLongSearchScript`,
`AbstractDoubleSearchScript`, and `AbstractFloatSearchScript`).

Registering them can either be done by settings, for example:
`script.native.my.type` set to `sample.MyNativeScriptFactory` will
register a script named `my`. Another option is, in a plugin, to access
the `ScriptModule` and call `registerScript` on it.

Executing the script is done by specifying the `lang` as `native`, and
the name of the script as the `script`.

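For example, assuming a factory registered under the name `my` as above, a
request could invoke it like this (a sketch; the surrounding request and
field name are illustrative):

[source,js]
--------------------------------------------------
{
    "script_fields" : {
        "my_field" : {
            "script" : "my",
            "lang" : "native"
        }
    }
}
--------------------------------------------------
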
Note, the scripts need to be in the classpath of elasticsearch. One
simple way to do it is to create a directory under plugins (choose a
descriptive name), and place the jar / classes files there; they will be
loaded automatically.

[float]
=== Score

In all scripts that can be used in facets, the current doc score can be
accessed using `doc.score`.

[float]
=== Document Fields

Most scripting revolves around the use of specific document field data.
The `doc['field_name']` expression can be used to access specific field
data within a document (the document in question is usually derived from
the context in which the script is used). Document fields are very fast
to access since they end up being loaded into memory (all the relevant
field values/tokens are loaded into memory).

The following data can be extracted from a field:

[cols="<,<",options="header",]
|=======================================================================
|Expression |Description
|`doc['field_name'].value` |The native value of the field. For example,
if it is a `short` type, it will be a `short`.

|`doc['field_name'].values` |The native array values of the field. For
example, if it is a `short` type, it will be a `short[]`. Remember, a
field can have several values within a single doc. Returns an empty
array if the field has no values.

|`doc['field_name'].empty` |A boolean indicating if the field has no
values within the doc.

|`doc['field_name'].multiValued` |A boolean indicating that the field
has several values within the corpus.

|`doc['field_name'].lat` |The latitude of a geo point type.

|`doc['field_name'].lon` |The longitude of a geo point type.

|`doc['field_name'].lats` |The latitudes of a geo point type.

|`doc['field_name'].lons` |The longitudes of a geo point type.

|`doc['field_name'].distance(lat, lon)` |The `plane` distance (in miles)
of this geo point field from the provided lat/lon.

|`doc['field_name'].arcDistance(lat, lon)` |The `arc` distance (in
miles) of this geo point field from the provided lat/lon.

|`doc['field_name'].distanceInKm(lat, lon)` |The `plane` distance (in
km) of this geo point field from the provided lat/lon.

|`doc['field_name'].arcDistanceInKm(lat, lon)` |The `arc` distance (in
km) of this geo point field from the provided lat/lon.

|`doc['field_name'].geohashDistance(geohash)` |The distance (in miles)
of this geo point field from the provided geohash.

|`doc['field_name'].geohashDistanceInKm(geohash)` |The distance (in km)
of this geo point field from the provided geohash.
|=======================================================================

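For example, a script field could fall back to a default when a document
has no value for a field; the `popularity` field name is illustrative:

[source,js]
--------------------------------------------------
{
    "script_fields" : {
        "popularity_or_default" : {
            "script" : "doc['popularity'].empty ? 0 : doc['popularity'].value"
        }
    }
}
--------------------------------------------------
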
[float]
=== Stored Fields

Stored fields can also be accessed when executing a script. Note, they
are much slower to access compared with document fields, but are not
loaded into memory. They can simply be accessed using
`_fields['my_field_name'].value` or `_fields['my_field_name'].values`.

[float]
=== Source Field

The source field can also be accessed when executing a script. The
source field is loaded per doc, parsed, and then provided to the script
for evaluation. The `_source` forms the context under which the source
field can be accessed, for example `_source.obj2.obj1.field3`.

[float]
=== mvel Built In Functions

There are several built in functions that can be used within scripts.
They include:

[cols="<,<",options="header",]
|=======================================================================
|Function |Description
|`time()` |The current time in milliseconds.

|`sin(a)` |Returns the trigonometric sine of an angle.

|`cos(a)` |Returns the trigonometric cosine of an angle.

|`tan(a)` |Returns the trigonometric tangent of an angle.

|`asin(a)` |Returns the arc sine of a value.

|`acos(a)` |Returns the arc cosine of a value.

|`atan(a)` |Returns the arc tangent of a value.

|`toRadians(angdeg)` |Converts an angle measured in degrees to an
approximately equivalent angle measured in radians.

|`toDegrees(angrad)` |Converts an angle measured in radians to an
approximately equivalent angle measured in degrees.

|`exp(a)` |Returns Euler's number _e_ raised to the power of a value.

|`log(a)` |Returns the natural logarithm (base _e_) of a value.

|`log10(a)` |Returns the base 10 logarithm of a value.

|`sqrt(a)` |Returns the correctly rounded positive square root of a
value.

|`cbrt(a)` |Returns the cube root of a double value.

|`IEEEremainder(f1, f2)` |Computes the remainder operation on two
arguments as prescribed by the IEEE 754 standard.

|`ceil(a)` |Returns the smallest (closest to negative infinity) value
that is greater than or equal to the argument and is equal to a
mathematical integer.

|`floor(a)` |Returns the largest (closest to positive infinity) value
that is less than or equal to the argument and is equal to a
mathematical integer.

|`rint(a)` |Returns the value that is closest in value to the argument
and is equal to a mathematical integer.

|`atan2(y, x)` |Returns the angle _theta_ from the conversion of
rectangular coordinates (_x_, _y_) to polar coordinates (r, _theta_).

|`pow(a, b)` |Returns the value of the first argument raised to the
power of the second argument.

|`round(a)` |Returns the closest _int_ to the argument.

|`random()` |Returns a random _double_ value.

|`abs(a)` |Returns the absolute value of a value.

|`max(a, b)` |Returns the greater of two values.

|`min(a, b)` |Returns the smaller of two values.

|`ulp(d)` |Returns the size of an ulp of the argument.

|`signum(d)` |Returns the signum function of the argument.

|`sinh(x)` |Returns the hyperbolic sine of a value.

|`cosh(x)` |Returns the hyperbolic cosine of a value.

|`tanh(x)` |Returns the hyperbolic tangent of a value.

|`hypot(x, y)` |Returns sqrt(x^2^ + y^2^) without intermediate overflow
or underflow.
|=======================================================================

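These functions can be combined freely with document field access. As a
sketch, with illustrative `location` and `popularity` fields:

[source,js]
--------------------------------------------------
{
    "script_fields" : {
        "distance_in_km" : {
            "script" : "doc['location'].arcDistanceInKm(40.7, -74.0)"
        },
        "log_popularity" : {
            "script" : "log10(doc['popularity'].value + 1)"
        }
    }
}
--------------------------------------------------
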
[float]
=== Arithmetic precision in MVEL

When dividing two numbers using MVEL based scripts, the engine tries to
be smart and adheres to the default behaviour of Java. This means that
if you divide two integers (you might have configured the fields as
integer in the mapping), the result will also be an integer. So, if a
calculation like `1/num` happens in your scripts and `num` is an integer
with the value of `8`, the result is `0` even though you were expecting
it to be `0.125`. You may need to enforce precision by explicitly using
a double like `1.0/num` in order to get the expected result.

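As a sketch, two script fields can demonstrate the difference, assuming a
`num` field mapped as an integer and holding the value `8`; the first
expression returns `0`, the second `0.125`:

[source,js]
--------------------------------------------------
{
    "script_fields" : {
        "integer_division" : { "script" : "1 / doc['num'].value" },
        "double_division"  : { "script" : "1.0 / doc['num'].value" }
    }
}
--------------------------------------------------
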
120
docs/reference/modules/threadpool.asciidoc
Normal file

@@ -0,0 +1,120 @@
[[modules-threadpool]]
== Thread Pool

A node holds several thread pools in order to improve how threads and
memory consumption are managed within a node. There are several thread
pools, but the important ones include:

[horizontal]
`index`::
    For index/delete operations, defaults to `fixed` type since
    `0.90.0`, size `# of available processors`. (previously type `cached`)

`search`::
    For count/search operations, defaults to `fixed` type since
    `0.90.0`, size `3x # of available processors`. (previously type
    `cached`)

`get`::
    For get operations, defaults to `fixed` type since `0.90.0`,
    size `# of available processors`. (previously type `cached`)

`bulk`::
    For bulk operations, defaults to `fixed` type since `0.90.0`,
    size `# of available processors`. (previously type `cached`)

`warmer`::
    For segment warm-up operations, defaults to `scaling` since
    `0.90.0` with a `5m` keep-alive. (previously type `cached`)

`refresh`::
    For refresh operations, defaults to `scaling` since
    `0.90.0` with a `5m` keep-alive. (previously type `cached`)

Changing a specific thread pool can be done by setting its type and
type-specific parameters, for example, changing the `index` thread pool
to the `blocking` type:

[source,js]
--------------------------------------------------
threadpool:
    index:
        type: blocking
        min: 1
        size: 30
        wait_time: 30s
--------------------------------------------------

NOTE: you can update thread pool settings live using
<<cluster-update-settings>>.

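For example, a live update of the `index` pool sizing might look like the
following sketch; the setting names follow the
`threadpool.<name>.<parameter>` pattern used above:

[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
    "transient" : {
        "threadpool.index.size" : 30,
        "threadpool.index.queue_size" : 1000
    }
}'
--------------------------------------------------
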
[float]
=== Thread pool types

The following are the types of thread pools that can be used and their
respective parameters:

[float]
==== `cache`

The `cache` thread pool is an unbounded thread pool that will spawn a
thread if there are pending requests. Here is an example of how to set
it:

[source,js]
--------------------------------------------------
threadpool:
    index:
        type: cached
--------------------------------------------------

[float]
==== `fixed`

The `fixed` thread pool holds a fixed size of threads to handle the
requests, with a queue (optionally bounded) for pending requests that
have no threads to service them.

The `size` parameter controls the number of threads, and defaults to the
number of cores times 5.

The `queue_size` parameter allows to control the size of the queue of
pending requests that have no threads to execute them. By default, it is
set to `-1` which means it is unbounded. When a request comes in and the
queue is full, the `reject_policy` parameter controls how it will
behave. The default, `abort`, will simply fail the request. Setting it
to `caller` will cause the request to execute on an IO thread, allowing
to throttle the execution on the networking layer.

[source,js]
--------------------------------------------------
threadpool:
    index:
        type: fixed
        size: 30
        queue_size: 1000
        reject_policy: caller
--------------------------------------------------

[float]
==== `blocking`

The `blocking` pool allows configuring the `min` (defaults to `1`) and
`size` (defaults to the number of cores times 5) parameters for the
number of threads.

It also has a backlog queue with a default `queue_size` of `1000`. Once
the queue is full, it will wait for the provided `wait_time` (defaults
to `60s`) on the calling IO thread, and fail if it has not been
executed.

[source,js]
--------------------------------------------------
threadpool:
    index:
        type: blocking
        min: 1
        size: 30
        wait_time: 30s
--------------------------------------------------

25
docs/reference/modules/thrift.asciidoc
Normal file

@@ -0,0 +1,25 @@
[[modules-thrift]]
== Thrift

The thrift transport module allows to expose the REST interface of
elasticsearch using thrift. Thrift should provide better performance
over http. Since thrift provides both the wire protocol and the
transport, it should make using it simpler (though it is lacking in
docs...).

Using thrift requires installing the `transport-thrift` plugin, located
https://github.com/elasticsearch/elasticsearch-transport-thrift[here].

The thrift
https://github.com/elasticsearch/elasticsearch-transport-thrift/blob/master/elasticsearch.thrift[schema]
can be used to generate thrift clients.

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description
|`thrift.port` |The port to bind to. Defaults to `9500-9600`.

|`thrift.frame` |Defaults to `-1`, which means no framing. Set to a
higher value to specify the frame size (like `15mb`).
|=======================================================================

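For example, a node configuration could pin the port and enable framing;
this is only a sketch of how the settings above might be combined:

[source,js]
--------------------------------------------------
thrift.port: 9500
thrift.frame: 15mb
--------------------------------------------------
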
43
docs/reference/modules/transport.asciidoc
Normal file

@@ -0,0 +1,43 @@
[[modules-transport]]
== Transport

The transport module is used for internal communication between nodes
within the cluster. Each call that goes from one node to the other uses
the transport module (for example, when an HTTP GET request is processed
by one node, but should actually be processed by another node that holds
the data).

The transport mechanism is completely asynchronous in nature, meaning
that there is no blocking thread waiting for a response. The benefit of
using asynchronous communication is first solving the
http://en.wikipedia.org/wiki/C10k_problem[C10k problem], as well as
being the ideal solution for scatter (broadcast) / gather operations such
as search in ElasticSearch.

[float]
=== TCP Transport

The TCP transport is an implementation of the transport module using
TCP. It allows for the following settings:

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description
|`transport.tcp.port` |A bind port range. Defaults to `9300-9400`.

|`transport.tcp.connect_timeout` |The socket connect timeout setting (in
time setting format). Defaults to `2s`.

|`transport.tcp.compress` |Set to `true` to enable compression (LZF)
between all nodes. Defaults to `false`.
|=======================================================================

It also uses the common
<<modules-network,network settings>>.

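For example, compression between nodes could be enabled in the node
configuration; the port range and timeout below simply restate the
defaults:

[source,js]
--------------------------------------------------
transport.tcp.port: 9300-9400
transport.tcp.connect_timeout: 2s
transport.tcp.compress: true
--------------------------------------------------
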
[float]
=== Local Transport

This is a handy transport to use when running integration tests within
the JVM. It is automatically enabled when using
`NodeBuilder#local(true)`.