mirror of https://github.com/elastic/elasticsearch.git
[docs] Migrate docs from AsciiDoc to Markdown (#123507)

* delete asciidoc files
* add migrated files
* fix errors
* Disable docs tests
* Clarify release notes page titles
* Revert "Clarify release notes page titles" (reverts commit 8be688648d)
* Comment out external URI images
* Clean up query languages landing pages, link to conceptual docs
* Add .md to url
* Fixes inference processor nesting.

---------

Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
Co-authored-by: Liam Thompson <leemthompo@gmail.com>
Co-authored-by: Martijn Laarman <Mpdreamz@gmail.com>
Co-authored-by: István Zoltán Szabó <szabosteve@gmail.com>

parent 2113a3c606, commit b7e3a1e14b
4082 changed files with 141513 additions and 376367 deletions

@ -0,0 +1,62 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/_other_command_line_parameters.html
---

# Other command line parameters [_other_command_line_parameters]

The `plugin` script supports a number of other command line parameters:

## Silent/verbose mode [_silentverbose_mode]

The `--verbose` parameter outputs more debug information, while the `--silent` parameter turns off all output, including the progress bar. The script may return the following exit codes:

`0`
: everything was OK

`64`
: unknown command or incorrect option parameter

`74`
: IO error

`70`
: any other error
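
For example, an automation script could install a plugin silently and branch on the exit code. This is a minimal sketch; the plugin name and messages are only examples:

```shell
# Install a plugin quietly, then report the outcome based on the exit code.
sudo bin/elasticsearch-plugin install --silent analysis-icu
case $? in
  0)  echo "OK" ;;
  64) echo "unknown command or incorrect option parameter" ;;
  74) echo "IO error" ;;
  *)  echo "other error" ;;
esac
```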

## Batch mode [_batch_mode]

Certain plugins require more privileges than those provided by default in core Elasticsearch. These plugins will list the required privileges and ask the user for confirmation before continuing with installation.

When running the plugin install script from another program (e.g. install automation scripts), the plugin script should detect that it is not being called from the console and skip the confirmation response, automatically granting all requested permissions. If console detection fails, then batch mode can be forced by specifying `-b` or `--batch` as follows:

```shell
sudo bin/elasticsearch-plugin install --batch [pluginname]
```

## Custom config directory [_custom_config_directory]

If your `elasticsearch.yml` config file is in a custom location, you will need to specify the path to the config file when using the `plugin` script. You can do this as follows:

```sh
sudo ES_PATH_CONF=/path/to/conf/dir bin/elasticsearch-plugin install <plugin name>
```

## Proxy settings [_proxy_settings]

To install a plugin via a proxy, you can add the proxy details to the `CLI_JAVA_OPTS` environment variable with the Java settings `http.proxyHost` and `http.proxyPort` (or `https.proxyHost` and `https.proxyPort`):

```shell
sudo CLI_JAVA_OPTS="-Dhttp.proxyHost=host_name -Dhttp.proxyPort=port_number -Dhttps.proxyHost=host_name -Dhttps.proxyPort=https_port_number" bin/elasticsearch-plugin install analysis-icu
```

Or on Windows:

```shell
set CLI_JAVA_OPTS="-Dhttp.proxyHost=host_name -Dhttp.proxyPort=port_number -Dhttps.proxyHost=host_name -Dhttps.proxyPort=https_port_number"
bin\elasticsearch-plugin install analysis-icu
```

14 docs/reference/elasticsearch-plugins/_plugins_directory.md (new file)

@ -0,0 +1,14 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/_plugins_directory.html
---

# Plugins directory [_plugins_directory]

The default location of the `plugins` directory depends on which package you install:

* [Directory layout of `.tar.gz` archives](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-from-archive-on-linux-macos.md#targz-layout)
* [Directory layout of Windows `.zip` archives](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-zip-on-windows.md#windows-layout)
* [Directory layout of Debian package](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-debian-package.md#deb-layout)
* [Directory layout of RPM](docs-content://deploy-manage/deploy/self-managed/install-elasticsearch-with-rpm.md#rpm-layout)

@ -0,0 +1,28 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/_reimplementing_and_extending_the_analyzers.html
---

# Reimplementing and extending the analyzers [_reimplementing_and_extending_the_analyzers]

The `smartcn` analyzer could be reimplemented as a `custom` analyzer that can then be extended and configured as follows:

```console
PUT smartcn_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "rebuilt_smartcn": {
          "tokenizer": "smartcn_tokenizer",
          "filter": [
            "porter_stem",
            "smartcn_stop"
          ]
        }
      }
    }
  }
}
```

@ -0,0 +1,29 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/_reimplementing_and_extending_the_analyzers_2.html
---

# Reimplementing and extending the analyzers [_reimplementing_and_extending_the_analyzers_2]

The `polish` analyzer could be reimplemented as a `custom` analyzer that can then be extended and configured differently as follows:

```console
PUT /stempel_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "rebuilt_stempel": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "polish_stop",
            "polish_stem"
          ]
        }
      }
    }
  }
}
```

@ -0,0 +1,17 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-analyzer.html
---

# ICU analyzer [analysis-icu-analyzer]

The `icu_analyzer` analyzer performs basic normalization, tokenization, and character folding, using the `icu_normalizer` character filter, the `icu_tokenizer` tokenizer, and the `icu_folding` token filter.

The following parameters are accepted:

`method`
: Normalization method. Accepts `nfkc`, `nfc` or `nfkc_cf` (default).

`mode`
: Normalization mode. Accepts `compose` (default) or `decompose`.
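
As a minimal sketch of how these parameters fit together (the index and analyzer names are only examples, and this assumes the ICU plugin is installed), you could configure the analyzer like this:

```console
PUT icu_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_icu_analyzer": {
            "type": "icu_analyzer",
            "method": "nfkc_cf",
            "mode": "compose"
          }
        }
      }
    }
  }
}
```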

@ -0,0 +1,102 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-collation-keyword-field.html
---

# ICU collation keyword field [analysis-icu-collation-keyword-field]

Collations are used for sorting documents in a language-specific word order. The `icu_collation_keyword` field type is available to all indices and encodes the terms directly as bytes in a doc values field and a single indexed token, just like a standard [keyword field](/reference/elasticsearch/mapping-reference/keyword.md).

Defaults to using [DUCET collation](https://www.elastic.co/guide/en/elasticsearch/guide/2.x/sorting-collations.html#uca), which is a best-effort attempt at language-neutral sorting.

Below is an example of how to set up a field for sorting German names in phonebook order:

```console
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "name": {   <1>
        "type": "text",
        "fields": {
          "sort": {  <2>
            "type": "icu_collation_keyword",
            "index": false,
            "language": "de",
            "country": "DE",
            "variant": "@collation=phonebook"
          }
        }
      }
    }
  }
}

GET /my-index-000001/_search <3>
{
  "query": {
    "match": {
      "name": "Fritz"
    }
  },
  "sort": "name.sort"
}
```

1. The `name` field uses the `standard` analyzer, and so supports full text queries.
2. The `name.sort` field is an `icu_collation_keyword` field that preserves the name as a single-token doc value, and applies the German phonebook sort order.
3. An example query which searches the `name` field and sorts on the `name.sort` field.

## Parameters for ICU collation keyword fields [_parameters_for_icu_collation_keyword_fields]

The following parameters are accepted by `icu_collation_keyword` fields:

`doc_values`
: Should the field be stored on disk in a column-stride fashion, so that it can later be used for sorting, aggregations, or scripting? Accepts `true` (default) or `false`.

`index`
: Should the field be searchable? Accepts `true` (default) or `false`.

`null_value`
: Accepts a string value which is substituted for any explicit `null` values. Defaults to `null`, which means the field is treated as missing.

[`ignore_above`](/reference/elasticsearch/mapping-reference/ignore-above.md)
: Strings longer than the `ignore_above` setting will be ignored. Checking is performed on the original string before the collation. The `ignore_above` setting can be updated on existing fields using the [PUT mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-mapping). By default, there is no limit and all values will be indexed.

`store`
: Whether the field value should be stored and retrievable separately from the [`_source`](/reference/elasticsearch/mapping-reference/mapping-source-field.md) field. Accepts `true` or `false` (default).

`fields`
: Multi-fields allow the same string value to be indexed in multiple ways for different purposes, such as one field for search and a multi-field for sorting and aggregations.

## Collation options [_collation_options]

`strength`
: The strength property determines the minimum level of difference considered significant during comparison. Possible values are: `primary`, `secondary`, `tertiary`, `quaternary` or `identical`. See the [ICU collation documentation](https://icu-project.org/apiref/icu4j/com/ibm/icu/text/Collator.html) for a more detailed explanation of each value. Defaults to `tertiary` unless otherwise specified in the collation.

`decomposition`
: Possible values: `no` (default, but collation-dependent) or `canonical`. Setting this decomposition property to `canonical` allows the collator to handle unnormalized text properly, producing the same results as if the text were normalized. If `no` is set, it is the user’s responsibility to ensure that all text is already in the appropriate form before a comparison or before getting a CollationKey. Adjusting decomposition mode allows the user to select between faster and more complete collation behavior. Since a great many of the world’s languages do not require text normalization, most locales set `no` as the default decomposition mode.

The following options are expert only:

`alternate`
: Possible values: `shifted` or `non-ignorable`. Sets the alternate handling for strength `quaternary` to be either shifted or non-ignorable, which boils down to ignoring punctuation and whitespace.

`case_level`
: Possible values: `true` or `false` (default). Whether case-level sorting is required. When strength is set to `primary`, this will ignore accent differences.

`case_first`
: Possible values: `lower` or `upper`. Useful to control which case is sorted first when case is not ignored for strength `tertiary`. The default depends on the collation.

`numeric`
: Possible values: `true` or `false` (default). Whether digits are sorted according to their numeric representation. For example, the value `egg-9` is sorted before the value `egg-21`.

`variable_top`
: Single character or contraction. Controls what is variable for `alternate`.

`hiragana_quaternary_mode`
: Possible values: `true` or `false`. Whether to distinguish between Katakana and Hiragana characters in `quaternary` strength.
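
As an illustrative sketch (the index and field names are hypothetical), enabling the `numeric` option on a field would look like this:

```console
PUT my-index-000002
{
  "mappings": {
    "properties": {
      "filename": {
        "type": "icu_collation_keyword",
        "numeric": true <1>
      }
    }
  }
}
```

1. Sorts the value `egg-9` before `egg-21` when sorting on this field.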

@ -0,0 +1,13 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-collation.html
---

# ICU collation token filter [analysis-icu-collation]

::::{warning}
This token filter has been deprecated since Lucene 5.0. Please use the [ICU collation keyword field](/reference/elasticsearch-plugins/analysis-icu-collation-keyword-field.md) instead.
::::
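
As a sketch of the replacement approach (mirroring the linked page; the index and field names are illustrative), map a field with the `icu_collation_keyword` type rather than applying a collation token filter:

```console
PUT collation_example
{
  "mappings": {
    "properties": {
      "name_sort": {
        "type": "icu_collation_keyword",
        "language": "de"
      }
    }
  }
}
```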

62 docs/reference/elasticsearch-plugins/analysis-icu-folding.md (new file)

@ -0,0 +1,62 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-folding.html
---

# ICU folding token filter [analysis-icu-folding]

Case folding of Unicode characters based on `UTR#30`, like the [ASCII-folding token filter](/reference/data-analysis/text-analysis/analysis-asciifolding-tokenfilter.md) on steroids. It registers itself as the `icu_folding` token filter and is available to all indices:

```console
PUT icu_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "folded": {
            "tokenizer": "icu_tokenizer",
            "filter": [
              "icu_folding"
            ]
          }
        }
      }
    }
  }
}
```

The ICU folding token filter already does Unicode normalization, so there is no need to use a normalization character or token filter as well.

Which letters are folded can be controlled by specifying the `unicode_set_filter` parameter, which accepts a [UnicodeSet](https://icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html).

The following example exempts Swedish characters from folding. It is important to note that both upper and lowercase forms should be specified, and that these filtered characters are not lowercased, which is why we add the `lowercase` filter as well:

```console
PUT icu_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "swedish_analyzer": {
            "tokenizer": "icu_tokenizer",
            "filter": [
              "swedish_folding",
              "lowercase"
            ]
          }
        },
        "filter": {
          "swedish_folding": {
            "type": "icu_folding",
            "unicode_set_filter": "[^åäöÅÄÖ]"
          }
        }
      }
    }
  }
}
```

@ -0,0 +1,50 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-normalization-charfilter.html
---

# ICU normalization character filter [analysis-icu-normalization-charfilter]

Normalizes characters as explained [here](https://unicode-org.github.io/icu/userguide/transforms/normalization/). It registers itself as the `icu_normalizer` character filter, which is available to all indices without any further configuration. The type of normalization can be specified with the `name` parameter, which accepts `nfc`, `nfkc`, and `nfkc_cf` (default). Set the `mode` parameter to `decompose` to convert `nfc` to `nfd` or `nfkc` to `nfkd` respectively.

Which letters are normalized can be controlled by specifying the `unicode_set_filter` parameter, which accepts a [UnicodeSet](https://icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html).

Here are two examples, the default usage and a customized character filter:

```console
PUT icu_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "nfkc_cf_normalized": { <1>
            "tokenizer": "icu_tokenizer",
            "char_filter": [
              "icu_normalizer"
            ]
          },
          "nfd_normalized": { <2>
            "tokenizer": "icu_tokenizer",
            "char_filter": [
              "nfd_normalizer"
            ]
          }
        },
        "char_filter": {
          "nfd_normalizer": {
            "type": "icu_normalizer",
            "name": "nfc",
            "mode": "decompose"
          }
        }
      }
    }
  }
}
```

1. Uses the default `nfkc_cf` normalization.
2. Uses the customized `nfd_normalizer` character filter, which is set to use `nfc` normalization with decomposition.

@ -0,0 +1,51 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-normalization.html
---

# ICU normalization token filter [analysis-icu-normalization]

Normalizes characters as explained [here](https://unicode-org.github.io/icu/userguide/transforms/normalization/). It registers itself as the `icu_normalizer` token filter, which is available to all indices without any further configuration. The type of normalization can be specified with the `name` parameter, which accepts `nfc`, `nfkc`, and `nfkc_cf` (default).

Which letters are normalized can be controlled by specifying the `unicode_set_filter` parameter, which accepts a [UnicodeSet](https://icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html).

You should probably prefer the [normalization character filter](/reference/elasticsearch-plugins/analysis-icu-normalization-charfilter.md).

Here are two examples, the default usage and a customized token filter:

```console
PUT icu_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "nfkc_cf_normalized": { <1>
            "tokenizer": "icu_tokenizer",
            "filter": [
              "icu_normalizer"
            ]
          },
          "nfc_normalized": { <2>
            "tokenizer": "icu_tokenizer",
            "filter": [
              "nfc_normalizer"
            ]
          }
        },
        "filter": {
          "nfc_normalizer": {
            "type": "icu_normalizer",
            "name": "nfc"
          }
        }
      }
    }
  }
}
```

1. Uses the default `nfkc_cf` normalization.
2. Uses the customized `nfc_normalizer` token filter, which is set to use `nfc` normalization.

@ -0,0 +1,92 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-tokenizer.html
---

# ICU tokenizer [analysis-icu-tokenizer]

Tokenizes text into words on word boundaries, as defined in [UAX #29: Unicode Text Segmentation](https://www.unicode.org/reports/tr29/). It behaves much like the [`standard` tokenizer](/reference/data-analysis/text-analysis/analysis-standard-tokenizer.md), but adds better support for some Asian languages by using a dictionary-based approach to identify words in Thai, Lao, Chinese, Japanese, and Korean, and using custom rules to break Myanmar and Khmer text into syllables.

```console
PUT icu_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_icu_analyzer": {
            "tokenizer": "icu_tokenizer"
          }
        }
      }
    }
  }
}
```

## Rules customization [_rules_customization]

::::{warning}
This functionality is marked as experimental in Lucene.
::::

You can customize the `icu_tokenizer` behavior by specifying per-script rule files. See the [RBBI rules syntax reference](http://userguide.icu-project.org/boundaryanalysis#TOC-RBBI-Rules) for a more detailed explanation.

To add ICU tokenizer rules, set the `rule_files` setting, which should contain a comma-separated list of `code:rulefile` pairs in the following format: [four-letter ISO 15924 script code](https://unicode.org/iso15924/iso15924-codes.html), followed by a colon, then a rule file name. Rule files are placed in the `ES_HOME/config` directory.

As a demonstration of how the rule files can be used, save the following user file to `$ES_HOME/config/KeywordTokenizer.rbbi`:

```text
.+ {200};
```

Then create an analyzer to use this rule file as follows:

```console
PUT icu_sample
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "icu_user_file": {
            "type": "icu_tokenizer",
            "rule_files": "Latn:KeywordTokenizer.rbbi"
          }
        },
        "analyzer": {
          "my_analyzer": {
            "type": "custom",
            "tokenizer": "icu_user_file"
          }
        }
      }
    }
  }
}

GET icu_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "Elasticsearch. Wow!"
}
```

The above `analyze` request returns the following:

```console-result
{
  "tokens": [
    {
      "token": "Elasticsearch. Wow!",
      "start_offset": 0,
      "end_offset": 19,
      "type": "<ALPHANUM>",
      "position": 0
    }
  ]
}
```

@ -0,0 +1,65 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-transform.html
---

# ICU transform token filter [analysis-icu-transform]

Transforms are used to process Unicode text in many different ways, such as case mapping, normalization, transliteration, and bidirectional text handling.

You can define which transformation you want to apply with the `id` parameter (defaults to `Null`), and specify text direction with the `dir` parameter, which accepts `forward` (default) for LTR and `reverse` for RTL. Custom rulesets are not yet supported.

For example:

```console
PUT icu_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "latin": {
            "tokenizer": "keyword",
            "filter": [
              "myLatinTransform"
            ]
          }
        },
        "filter": {
          "myLatinTransform": {
            "type": "icu_transform",
            "id": "Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC" <1>
          }
        }
      }
    }
  }
}

GET icu_sample/_analyze
{
  "analyzer": "latin",
  "text": "你好" <2>
}

GET icu_sample/_analyze
{
  "analyzer": "latin",
  "text": "здравствуйте" <3>
}

GET icu_sample/_analyze
{
  "analyzer": "latin",
  "text": "こんにちは" <4>
}
```

1. This transform transliterates characters to Latin, separates accents from their base characters, removes the accents, and then puts the remaining text into an unaccented form.
2. Returns `ni hao`.
3. Returns `zdravstvujte`.
4. Returns `kon'nichiha`.

For more information, see the [ICU user guide on transforms](https://unicode-org.github.io/icu/userguide/transforms/).

56 docs/reference/elasticsearch-plugins/analysis-icu.md (new file)

@ -0,0 +1,56 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu.html
---

# ICU analysis plugin [analysis-icu]

The ICU analysis plugin integrates the Lucene ICU module into {{es}}, adding extended Unicode support using the [ICU](https://icu.unicode.org/) libraries, including better analysis of Asian languages, Unicode normalization, Unicode-aware case folding, collation support, and transliteration.

::::{admonition} ICU analysis and backwards compatibility
:class: important

From time to time, the ICU library receives updates, such as adding new characters and emojis, and improving collation (sort) orders. These changes may or may not affect search and sort orders, depending on which character sets you are using.

While we restrict ICU upgrades to major versions, you may find that an index created in the previous major version will need to be reindexed in order to return correct (and correctly ordered) results, and to take advantage of new characters.
::::

## Installation [analysis-icu-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install analysis-icu
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-icu/analysis-icu-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-icu/analysis-icu-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-icu/analysis-icu-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-icu/analysis-icu-9.0.0-beta1.zip.asc).

## Removal [analysis-icu-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove analysis-icu
```

The node must be stopped before removing the plugin.

@ -0,0 +1,62 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-analyzer.html
---

# kuromoji analyzer [analysis-kuromoji-analyzer]

The `kuromoji` analyzer uses the following analysis chain:

* `CJKWidthCharFilter` from Lucene
* [`kuromoji_tokenizer`](/reference/elasticsearch-plugins/analysis-kuromoji-tokenizer.md)
* [`kuromoji_baseform`](/reference/elasticsearch-plugins/analysis-kuromoji-baseform.md) token filter
* [`kuromoji_part_of_speech`](/reference/elasticsearch-plugins/analysis-kuromoji-speech.md) token filter
* [`ja_stop`](/reference/elasticsearch-plugins/analysis-kuromoji-stop.md) token filter
* [`kuromoji_stemmer`](/reference/elasticsearch-plugins/analysis-kuromoji-stemmer.md) token filter
* [`lowercase`](/reference/data-analysis/text-analysis/analysis-lowercase-tokenfilter.md) token filter

It supports the `mode` and `user_dictionary` settings from [`kuromoji_tokenizer`](/reference/elasticsearch-plugins/analysis-kuromoji-tokenizer.md).

## Normalize full-width characters [kuromoji-analyzer-normalize-full-width-characters]

The `kuromoji_tokenizer` tokenizer uses characters from the MeCab-IPADIC dictionary to split text into tokens. The dictionary includes some full-width characters, such as `ｏ` and `ｆ`. If a text contains full-width characters, the tokenizer can produce unexpected tokens.

For example, the `kuromoji_tokenizer` tokenizer converts the text `Ｃｕｌｔｕｒｅ　ｏｆ　Ｊａｐａｎ` to the tokens `[ culture, o, f, japan ]` instead of `[ culture, of, japan ]`.

To avoid this, add the [`icu_normalizer` character filter](/reference/elasticsearch-plugins/analysis-icu-normalization-charfilter.md) to a custom analyzer based on the `kuromoji` analyzer. The `icu_normalizer` character filter converts full-width characters to their normal equivalents.

First, duplicate the `kuromoji` analyzer to create the basis for a custom analyzer. Then add the `icu_normalizer` character filter to the custom analyzer. For example:

```console
PUT index-00001
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "kuromoji_normalize": { <1>
            "char_filter": [
              "icu_normalizer" <2>
            ],
            "tokenizer": "kuromoji_tokenizer",
            "filter": [
              "kuromoji_baseform",
              "kuromoji_part_of_speech",
              "cjk_width",
              "ja_stop",
              "kuromoji_stemmer",
              "lowercase"
            ]
          }
        }
      }
    }
  }
}
```

1. Creates a new custom analyzer, `kuromoji_normalize`, based on the `kuromoji` analyzer.
2. Adds the `icu_normalizer` character filter to the analyzer.

@ -0,0 +1,49 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-baseform.html
---

# kuromoji_baseform token filter [analysis-kuromoji-baseform]

The `kuromoji_baseform` token filter replaces terms with their BaseFormAttribute. This acts as a lemmatizer for verbs and adjectives. Example:

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": [
              "kuromoji_baseform"
            ]
          }
        }
      }
    }
  }
}

GET kuromoji_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "飲み"
}
```

which responds with:

```console-result
{
  "tokens" : [ {
    "token" : "飲む",
    "start_offset" : 0,
    "end_offset" : 2,
    "type" : "word",
    "position" : 0
  } ]
}
```

@ -0,0 +1,15 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-charfilter.html
---

# kuromoji_iteration_mark character filter [analysis-kuromoji-charfilter]

The `kuromoji_iteration_mark` character filter normalizes Japanese horizontal iteration marks (*odoriji*) to their expanded form. It accepts the following settings:

`normalize_kanji`
: Indicates whether kanji iteration marks should be normalized. Defaults to `true`.

`normalize_kana`
: Indicates whether kana iteration marks should be normalized. Defaults to `true`.
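
A minimal sketch of wiring this character filter into a custom analyzer (the index and analyzer names are only examples) might look like:

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "char_filter": [ "kuromoji_iteration_mark" ]
          }
        }
      }
    }
  }
}
```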

@ -0,0 +1,34 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-completion.html
---

# kuromoji_completion token filter [analysis-kuromoji-completion]

The `kuromoji_completion` token filter adds Japanese romanized tokens to the term attributes along with the original tokens (surface forms).

```console
GET _analyze
{
  "analyzer": "kuromoji_completion",
  "text": "寿司" <1>
}
```

1. Returns `寿司`, `susi` (Kunrei-shiki) and `sushi` (Hepburn-shiki).

The `kuromoji_completion` token filter accepts the following settings:

`mode`
: The romanization mode determines how romanized tokens are generated. It can be set to:

`index`
: Simple romanization. Expected to be used when indexing.

`query`
: Input Method aware romanization. Expected to be used when querying.

Defaults to `index`.
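
As a sketch of defining a query-time variant (the index, analyzer, and filter names are illustrative):

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_suggest_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": [ "my_completion_filter" ]
          }
        },
        "filter": {
          "my_completion_filter": {
            "type": "kuromoji_completion",
            "mode": "query"
          }
        }
      }
    }
  }
}
```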

@ -0,0 +1,67 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-hiragana-uppercase.html
---

# hiragana_uppercase token filter [analysis-kuromoji-hiragana-uppercase]

The `hiragana_uppercase` token filter normalizes small letters (捨て仮名) in hiragana into standard letters. This filter is useful if you want to search against old-style Japanese text such as patents, legal documents, contract policies, etc.

For example:

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": [
              "hiragana_uppercase"
            ]
          }
        }
      }
    }
  }
}

GET kuromoji_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "ちょっとまって"
}
```

Which results in:

```console-result
{
  "tokens": [
    {
      "token": "ちよつと",
      "start_offset": 0,
      "end_offset": 4,
      "type": "word",
      "position": 0
    },
    {
      "token": "まつ",
      "start_offset": 4,
      "end_offset": 6,
      "type": "word",
      "position": 1
    },
    {
      "token": "て",
      "start_offset": 6,
      "end_offset": 7,
      "type": "word",
      "position": 2
    }
  ]
}
```

@ -0,0 +1,53 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-katakana-uppercase.html
---

# katakana_uppercase token filter [analysis-kuromoji-katakana-uppercase]

The `katakana_uppercase` token filter normalizes small letters (捨て仮名) in katakana into standard letters. This filter is useful if you want to search against old-style Japanese text such as patents, legal documents, contract policies, etc.

For example:

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": [
              "katakana_uppercase"
            ]
          }
        }
      }
    }
  }
}

GET kuromoji_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "ストップウォッチ"
}
```

Which results in:

```console-result
{
  "tokens": [
    {
      "token": "ストツプウオツチ",
      "start_offset": 0,
      "end_offset": 8,
      "type": "word",
      "position": 0
    }
  ]
}
```

@ -0,0 +1,49 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-number.html
---

# kuromoji_number token filter [analysis-kuromoji-number]

The `kuromoji_number` token filter normalizes Japanese numbers (kansūji) to regular Arabic decimal numbers in half-width characters. For example:

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": [
              "kuromoji_number"
            ]
          }
        }
      }
    }
  }
}

GET kuromoji_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "一〇〇〇"
}
```

Which results in:

```console-result
{
  "tokens" : [ {
    "token" : "1000",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "word",
    "position" : 0
  } ]
}
```

@ -0,0 +1,62 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-readingform.html
---

# kuromoji_readingform token filter [analysis-kuromoji-readingform]

The `kuromoji_readingform` token filter replaces the token with its reading form in either katakana or romaji. It accepts the following setting:

`use_romaji`
: Whether romaji reading form should be output instead of katakana. Defaults to `false`.

When using the pre-defined `kuromoji_readingform` filter, `use_romaji` is set to `true`. The default when defining a custom `kuromoji_readingform`, however, is `false`. The only reason to use the custom form is if you need the katakana reading form:

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "romaji_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": [ "romaji_readingform" ]
          },
          "katakana_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": [ "katakana_readingform" ]
          }
        },
        "filter": {
          "romaji_readingform": {
            "type": "kuromoji_readingform",
            "use_romaji": true
          },
          "katakana_readingform": {
            "type": "kuromoji_readingform",
            "use_romaji": false
          }
        }
      }
    }
  }
}

GET kuromoji_sample/_analyze
{
  "analyzer": "katakana_analyzer",
  "text": "寿司" <1>
}

GET kuromoji_sample/_analyze
{
  "analyzer": "romaji_analyzer",
  "text": "寿司" <2>
}
```

1. Returns `スシ`.
2. Returns `sushi`.

@ -0,0 +1,69 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-speech.html
---

# kuromoji_part_of_speech token filter [analysis-kuromoji-speech]

The `kuromoji_part_of_speech` token filter removes tokens that match a set of part-of-speech tags. It accepts the following setting:

`stoptags`
: An array of part-of-speech tags that should be removed. It defaults to the `stoptags.txt` file embedded in the `lucene-analyzer-kuromoji.jar`.

For example:

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": [
              "my_posfilter"
            ]
          }
        },
        "filter": {
          "my_posfilter": {
            "type": "kuromoji_part_of_speech",
            "stoptags": [
              "助詞-格助詞-一般",
              "助詞-終助詞"
            ]
          }
        }
      }
    }
  }
}

GET kuromoji_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "寿司がおいしいね"
}
```

Which responds with:

```console-result
{
  "tokens" : [ {
    "token" : "寿司",
    "start_offset" : 0,
    "end_offset" : 2,
    "type" : "word",
    "position" : 0
  }, {
    "token" : "おいしい",
    "start_offset" : 3,
    "end_offset" : 7,
    "type" : "word",
    "position" : 2
  } ]
}
```

@ -0,0 +1,56 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-stemmer.html
---

# kuromoji_stemmer token filter [analysis-kuromoji-stemmer]

The `kuromoji_stemmer` token filter normalizes common katakana spelling variations ending in a long sound character by removing this character (U+30FC). Only full-width katakana characters are supported.

This token filter accepts the following setting:

`minimum_length`
: Katakana words shorter than the `minimum_length` are not stemmed (default is `4`).

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": [
              "my_katakana_stemmer"
            ]
          }
        },
        "filter": {
          "my_katakana_stemmer": {
            "type": "kuromoji_stemmer",
            "minimum_length": 4
          }
        }
      }
    }
  }
}

GET kuromoji_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "コピー" <1>
}

GET kuromoji_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "サーバー" <2>
}
```

1. Returns `コピー`.
2. Returns `サーバ`.

@ -0,0 +1,58 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-stop.html
---

# ja_stop token filter [analysis-kuromoji-stop]

The `ja_stop` token filter filters out Japanese stopwords (`_japanese_`), and any other custom stopwords specified by the user. This filter only supports the predefined `_japanese_` stopwords list. If you want to use a different predefined list, then use the [`stop` token filter](/reference/data-analysis/text-analysis/analysis-stop-tokenfilter.md) instead.

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "analyzer_with_ja_stop": {
            "tokenizer": "kuromoji_tokenizer",
            "filter": [
              "ja_stop"
            ]
          }
        },
        "filter": {
          "ja_stop": {
            "type": "ja_stop",
            "stopwords": [
              "_japanese_",
              "ストップ"
            ]
          }
        }
      }
    }
  }
}

GET kuromoji_sample/_analyze
{
  "analyzer": "analyzer_with_ja_stop",
  "text": "ストップは消える"
}
```

The above request returns:

```console-result
{
  "tokens" : [ {
    "token" : "消える",
    "start_offset" : 5,
    "end_offset" : 8,
    "type" : "word",
    "position" : 2
  } ]
}
```

@ -0,0 +1,165 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-tokenizer.html
---

# kuromoji_tokenizer [analysis-kuromoji-tokenizer]

The `kuromoji_tokenizer` accepts the following settings:

`mode`
: The tokenization mode determines how the tokenizer handles compound and unknown words. It can be set to:

`normal`
: Normal segmentation, no decomposition for compounds. Example output:

```
関西国際空港
アブラカダブラ
```

`search`
: Segmentation geared towards search. This includes a decompounding process for long nouns, also including the full compound token as a synonym. Example output:

```
関西, 関西国際空港, 国際, 空港
アブラカダブラ
```

`extended`
: Extended mode outputs unigrams for unknown words. Example output:

```
関西, 関西国際空港, 国際, 空港
ア, ブ, ラ, カ, ダ, ブ, ラ
```

`discard_punctuation`
: Whether punctuation should be discarded from the output. Defaults to `true`.

`lenient`
: Whether the `user_dictionary` should be deduplicated on the provided text. Defaults to `false`, which causes duplicates to generate an error.

`user_dictionary`
: The Kuromoji tokenizer uses the MeCab-IPADIC dictionary by default. A `user_dictionary` may be appended to the default dictionary. The dictionary should have the following CSV format:

```text
<text>,<token 1> ... <token n>,<reading 1> ... <reading n>,<part-of-speech tag>
```

As a demonstration of how the user dictionary can be used, save the following dictionary to `$ES_HOME/config/userdict_ja.txt`:

```text
東京スカイツリー,東京 スカイツリー,トウキョウ スカイツリー,カスタム名詞
```

You can also inline the rules directly in the tokenizer definition using the `user_dictionary_rules` option:

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "kuromoji_user_dict": {
            "type": "kuromoji_tokenizer",
            "mode": "extended",
            "user_dictionary_rules": ["東京スカイツリー,東京 スカイツリー,トウキョウ スカイツリー,カスタム名詞"]
          }
        },
        "analyzer": {
          "my_analyzer": {
            "type": "custom",
            "tokenizer": "kuromoji_user_dict"
          }
        }
      }
    }
  }
}
```

`nbest_cost`/`nbest_examples`
: Additional expert user parameters `nbest_cost` and `nbest_examples` can be used to include additional tokens that are most likely according to the statistical model. If both parameters are used, the larger of the two resulting values is applied.

`nbest_cost`
: The `nbest_cost` parameter specifies an additional Viterbi cost. The KuromojiTokenizer will include all tokens in Viterbi paths that are within the `nbest_cost` value of the best path.

`nbest_examples`
: The `nbest_examples` parameter can be used to find a `nbest_cost` value based on examples. For example, a value of `/箱根山-箱根/成田空港-成田/` indicates that, in the texts 箱根山 (Mt. Hakone) and 成田空港 (Narita Airport), we would like a cost that gives us 箱根 (Hakone) and 成田 (Narita).
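
As an illustrative sketch (the index and tokenizer names and the cost value are hypothetical), `nbest_cost` is set directly on the tokenizer definition:

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "kuromoji_nbest": {
            "type": "kuromoji_tokenizer",
            "nbest_cost": 1000
          }
        },
        "analyzer": {
          "my_analyzer": {
            "type": "custom",
            "tokenizer": "kuromoji_nbest"
          }
        }
      }
    }
  }
}
```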

To use the user dictionary file saved above, create an analyzer as follows:

```console
PUT kuromoji_sample
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "kuromoji_user_dict": {
            "type": "kuromoji_tokenizer",
            "mode": "extended",
            "discard_punctuation": "false",
            "user_dictionary": "userdict_ja.txt",
            "lenient": "true"
          }
        },
        "analyzer": {
          "my_analyzer": {
            "type": "custom",
            "tokenizer": "kuromoji_user_dict"
          }
        }
      }
    }
  }
}

GET kuromoji_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "東京スカイツリー"
}
```

The above `analyze` request returns the following:

```console-result
{
  "tokens" : [ {
    "token" : "東京",
    "start_offset" : 0,
    "end_offset" : 2,
    "type" : "word",
    "position" : 0
  }, {
    "token" : "スカイツリー",
    "start_offset" : 2,
    "end_offset" : 8,
    "type" : "word",
    "position" : 1
  } ]
}
```

`discard_compound_token`
: Whether original compound tokens should be discarded from the output with `search` mode. Defaults to `false`. Example output with `search` or `extended` mode and this option set to `true`:

```
関西, 国際, 空港
```

::::{note}
If a text contains full-width characters, the `kuromoji_tokenizer` tokenizer can produce unexpected tokens. To avoid this, add the [`icu_normalizer` character filter](/reference/elasticsearch-plugins/analysis-icu-normalization-charfilter.md) to your analyzer. See [Normalize full-width characters](/reference/elasticsearch-plugins/analysis-kuromoji-analyzer.md#kuromoji-analyzer-normalize-full-width-characters).
::::

50 docs/reference/elasticsearch-plugins/analysis-kuromoji.md (new file)

@ -0,0 +1,50 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji.html
---

# Japanese (kuromoji) analysis plugin [analysis-kuromoji]

The Japanese (kuromoji) analysis plugin integrates the Lucene kuromoji analysis module into {{es}}.

## Installation [analysis-kuromoji-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install analysis-kuromoji
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-kuromoji/analysis-kuromoji-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-kuromoji/analysis-kuromoji-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-kuromoji/analysis-kuromoji-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-kuromoji/analysis-kuromoji-9.0.0-beta1.zip.asc).

## Removal [analysis-kuromoji-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove analysis-kuromoji
```

The node must be stopped before removing the plugin.

@ -0,0 +1,16 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-nori-analyzer.html
---

# nori analyzer [analysis-nori-analyzer]

The `nori` analyzer consists of the following tokenizer and token filters:

* [`nori_tokenizer`](/reference/elasticsearch-plugins/analysis-nori-tokenizer.md)
* [`nori_part_of_speech`](/reference/elasticsearch-plugins/analysis-nori-speech.md) token filter
* [`nori_readingform`](/reference/elasticsearch-plugins/analysis-nori-readingform.md) token filter
* [`lowercase`](/reference/data-analysis/text-analysis/analysis-lowercase-tokenfilter.md) token filter

It supports the `decompound_mode` and `user_dictionary` settings from [`nori_tokenizer`](/reference/elasticsearch-plugins/analysis-nori-tokenizer.md) and the `stoptags` setting from [`nori_part_of_speech`](/reference/elasticsearch-plugins/analysis-nori-speech.md).
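
A minimal sketch of configuring those supported settings on the `nori` analyzer (the index and analyzer names are only examples):

```console
PUT nori_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_nori_analyzer": {
            "type": "nori",
            "decompound_mode": "mixed",
            "stoptags": [ "NR" ]
          }
        }
      }
    }
  }
}
```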

96 docs/reference/elasticsearch-plugins/analysis-nori-number.md (new file)

@ -0,0 +1,96 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-nori-number.html
---

# nori_number token filter [analysis-nori-number]

The `nori_number` token filter normalizes Korean numbers to regular Arabic decimal numbers in half-width characters.

Korean numbers are often written using a combination of Hangul and Arabic numbers with various kinds of punctuation. For example, 3.2천 means 3200. This filter does this kind of normalization and allows a search for 3200 to match 3.2천 in text, but can also be used to make range facets based on the normalized numbers and so on.

::::{note}
Notice that this analyzer uses a token composition scheme and relies on punctuation tokens being found in the token stream. Please make sure your `nori_tokenizer` has `discard_punctuation` set to `false`. In case punctuation characters, such as U+FF0E (.), are removed from the token stream, this filter would find the input tokens 3 and 2천 and give the outputs 3 and 2000 instead of 3200, which is likely not the intended result.

If you want to remove punctuation characters from your index that are not part of normalized numbers, add a `stop` token filter with the punctuation you wish to remove after `nori_number` in your analyzer chain.
::::

Below are some examples of normalizations this filter supports. The input is untokenized text and the result is the single term attribute emitted for the input.

* 영영칠 → 7
* 일영영영 → 1000
* 삼천2백2십삼 → 3223
* 일조육백만오천일 → 1000006005001
* 3.2천 → 3200
* 1.2만345.67 → 12345.67
* 4,647.100 → 4647.1
* 15,7 → 157 (be aware of this weakness)

For example:

```console
PUT nori_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "tokenizer_discard_punctuation_false",
            "filter": [
              "part_of_speech_stop_sp", "nori_number"
            ]
          }
        },
        "tokenizer": {
          "tokenizer_discard_punctuation_false": {
            "type": "nori_tokenizer",
            "discard_punctuation": "false"
          }
        },
        "filter": {
          "part_of_speech_stop_sp": {
            "type": "nori_part_of_speech",
            "stoptags": ["SP"]
          }
        }
      }
    }
  }
}

GET nori_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "십만이천오백과 3.2천"
}
```

Which results in:

```console-result
{
  "tokens" : [{
    "token" : "102500",
    "start_offset" : 0,
    "end_offset" : 6,
    "type" : "word",
    "position" : 0
  }, {
    "token" : "과",
    "start_offset" : 6,
    "end_offset" : 7,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "3200",
    "start_offset" : 8,
    "end_offset" : 12,
    "type" : "word",
    "position" : 2
  }]
}
```
|
|||
---
|
||||
mapped_pages:
|
||||
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-nori-readingform.html
|
||||
---
|
||||
|
||||
# nori_readingform token filter [analysis-nori-readingform]
|
||||
|
||||
The `nori_readingform` token filter rewrites tokens written in Hanja to their Hangul form.
|
||||
|
||||
```console
|
||||
PUT nori_sample
|
||||
{
|
||||
"settings": {
|
||||
"index": {
|
||||
"analysis": {
|
||||
"analyzer": {
|
||||
"my_analyzer": {
|
||||
"tokenizer": "nori_tokenizer",
|
||||
"filter": [ "nori_readingform" ]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
GET nori_sample/_analyze
|
||||
{
|
||||
"analyzer": "my_analyzer",
|
||||
"text": "鄕歌" <1>
|
||||
}
|
||||
```
|
||||
|
||||
1. A token written in Hanja: Hyangga
|
||||
|
||||
|
||||
Which responds with:
|
||||
|
||||
```console-result
|
||||
{
|
||||
"tokens" : [ {
|
||||
"token" : "향가", <1>
|
||||
"start_offset" : 0,
|
||||
"end_offset" : 2,
|
||||
"type" : "word",
|
||||
"position" : 0
|
||||
}]
|
||||
}
|
||||
```
|
||||
|
||||
1. The Hanja form is replaced by the Hangul translation.
|
||||
|
||||
|

88 docs/reference/elasticsearch-plugins/analysis-nori-speech.md (new file)

@ -0,0 +1,88 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-nori-speech.html
---

# nori_part_of_speech token filter [analysis-nori-speech]

The `nori_part_of_speech` token filter removes tokens that match a set of part-of-speech tags. The list of supported tags and their meanings can be found in the Lucene documentation: [Part of speech tags](https://lucene.apache.org/core/10_1_0/analysis/nori/org/apache/lucene/analysis/ko/POS.Tag.html).

It accepts the following setting:

`stoptags`
: An array of part-of-speech tags that should be removed.

It defaults to:

```js
"stoptags": [
  "E",
  "IC",
  "J",
  "MAG", "MAJ", "MM",
  "SP", "SSC", "SSO", "SC", "SE",
  "XPN", "XSA", "XSN", "XSV",
  "UNA", "NA", "VSV"
]
```

For example:

```console
PUT nori_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "nori_tokenizer",
            "filter": [
              "my_posfilter"
            ]
          }
        },
        "filter": {
          "my_posfilter": {
            "type": "nori_part_of_speech",
            "stoptags": [
              "NR" <1>
            ]
          }
        }
      }
    }
  }
}

GET nori_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "여섯 용이" <2>
}
```

1. Korean numerals should be removed (`NR`)
2. Six dragons

Which responds with:

```console-result
{
  "tokens" : [ {
    "token" : "용",
    "start_offset" : 3,
    "end_offset" : 4,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "이",
    "start_offset" : 4,
    "end_offset" : 5,
    "type" : "word",
    "position" : 2
  } ]
}
```

268 docs/reference/elasticsearch-plugins/analysis-nori-tokenizer.md Normal file

@ -0,0 +1,268 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-nori-tokenizer.html
---

# nori_tokenizer [analysis-nori-tokenizer]

The `nori_tokenizer` accepts the following settings:

`decompound_mode`
: The decompound mode determines how the tokenizer handles compound tokens. It can be set to:

    `none`
    : No decomposition for compounds. Example output:

        ```
        가거도항
        가곡역
        ```

    `discard`
    : Decomposes compounds and discards the original form (**default**). Example output:

        ```
        가곡역 => 가곡, 역
        ```

    `mixed`
    : Decomposes compounds and keeps the original form. Example output:

        ```
        가곡역 => 가곡역, 가곡, 역
        ```

`discard_punctuation`
: Whether punctuation should be discarded from the output. Defaults to `true`.

`lenient`
: Whether duplicate entries in the `user_dictionary` should be deduplicated. Defaults to `false`, in which case duplicate entries cause an error.

`user_dictionary`
: The Nori tokenizer uses the [mecab-ko-dic dictionary](https://bitbucket.org/eunjeon/mecab-ko-dic) by default. A `user_dictionary` with custom nouns (`NNG`) may be appended to the default dictionary. The dictionary should have the following format:

    ```txt
    <token> [<token 1> ... <token n>]
    ```

    The first token is mandatory and represents the custom noun that should be added to the dictionary. For compound nouns, the custom segmentation can be provided after the first token (`[<token 1> ... <token n>]`). The segmentation of the custom compound nouns is controlled by the `decompound_mode` setting.

As a demonstration of how the user dictionary can be used, save the following dictionary to `$ES_HOME/config/userdict_ko.txt`:

```txt
c++ <1>
C쁠쁠
세종
세종시 세종 시 <2>
```

1. A simple noun
2. A compound noun (`세종시`) followed by its decomposition: `세종` and `시`.

Then create an analyzer as follows:

```console
PUT nori_sample
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "nori_user_dict": {
            "type": "nori_tokenizer",
            "decompound_mode": "mixed",
            "discard_punctuation": "false",
            "user_dictionary": "userdict_ko.txt",
            "lenient": "true"
          }
        },
        "analyzer": {
          "my_analyzer": {
            "type": "custom",
            "tokenizer": "nori_user_dict"
          }
        }
      }
    }
  }
}

GET nori_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "세종시" <1>
}
```

1. Sejong city

The above `analyze` request returns the following:

```console-result
{
  "tokens" : [ {
    "token" : "세종시",
    "start_offset" : 0,
    "end_offset" : 3,
    "type" : "word",
    "position" : 0,
    "positionLength" : 2 <1>
  }, {
    "token" : "세종",
    "start_offset" : 0,
    "end_offset" : 2,
    "type" : "word",
    "position" : 0
  }, {
    "token" : "시",
    "start_offset" : 2,
    "end_offset" : 3,
    "type" : "word",
    "position" : 1
  }]
}
```

1. This is a compound token that spans two positions (`mixed` mode).

`user_dictionary_rules`
: You can also inline the rules directly in the tokenizer definition using the `user_dictionary_rules` option:

```console
PUT nori_sample
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "nori_user_dict": {
            "type": "nori_tokenizer",
            "decompound_mode": "mixed",
            "user_dictionary_rules": ["c++", "C쁠쁠", "세종", "세종시 세종 시"]
          }
        },
        "analyzer": {
          "my_analyzer": {
            "type": "custom",
            "tokenizer": "nori_user_dict"
          }
        }
      }
    }
  }
}
```

The `nori_tokenizer` sets a number of additional attributes per token that are used by token filters to modify the stream. You can view all these additional attributes with the following request:

```console
GET _analyze
{
  "tokenizer": "nori_tokenizer",
  "text": "뿌리가 깊은 나무는", <1>
  "attributes" : ["posType", "leftPOS", "rightPOS", "morphemes", "reading"],
  "explain": true
}
```

1. A tree with deep roots

Which responds with:

```console-result
{
  "detail": {
    "custom_analyzer": true,
    "charfilters": [],
    "tokenizer": {
      "name": "nori_tokenizer",
      "tokens": [
        {
          "token": "뿌리",
          "start_offset": 0,
          "end_offset": 2,
          "type": "word",
          "position": 0,
          "leftPOS": "NNG(General Noun)",
          "morphemes": null,
          "posType": "MORPHEME",
          "reading": null,
          "rightPOS": "NNG(General Noun)"
        },
        {
          "token": "가",
          "start_offset": 2,
          "end_offset": 3,
          "type": "word",
          "position": 1,
          "leftPOS": "JKS(Subject case marker)",
          "morphemes": null,
          "posType": "MORPHEME",
          "reading": null,
          "rightPOS": "JKS(Subject case marker)"
        },
        {
          "token": "깊",
          "start_offset": 4,
          "end_offset": 5,
          "type": "word",
          "position": 2,
          "leftPOS": "VA(Adjective)",
          "morphemes": null,
          "posType": "MORPHEME",
          "reading": null,
          "rightPOS": "VA(Adjective)"
        },
        {
          "token": "은",
          "start_offset": 5,
          "end_offset": 6,
          "type": "word",
          "position": 3,
          "leftPOS": "ETM(Adnominal form transformative ending)",
          "morphemes": null,
          "posType": "MORPHEME",
          "reading": null,
          "rightPOS": "ETM(Adnominal form transformative ending)"
        },
        {
          "token": "나무",
          "start_offset": 7,
          "end_offset": 9,
          "type": "word",
          "position": 4,
          "leftPOS": "NNG(General Noun)",
          "morphemes": null,
          "posType": "MORPHEME",
          "reading": null,
          "rightPOS": "NNG(General Noun)"
        },
        {
          "token": "는",
          "start_offset": 9,
          "end_offset": 10,
          "type": "word",
          "position": 5,
          "leftPOS": "JX(Auxiliary postpositional particle)",
          "morphemes": null,
          "posType": "MORPHEME",
          "reading": null,
          "rightPOS": "JX(Auxiliary postpositional particle)"
        }
      ]
    },
    "tokenfilters": []
  }
}
```

43 docs/reference/elasticsearch-plugins/analysis-nori.md Normal file

@ -0,0 +1,43 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-nori.html
---

# Korean (nori) analysis plugin [analysis-nori]

The Korean (nori) Analysis plugin integrates the Lucene Nori analysis module into Elasticsearch. It uses the [mecab-ko-dic dictionary](https://bitbucket.org/eunjeon/mecab-ko-dic) to perform morphological analysis of Korean texts.

## Installation [analysis-nori-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install analysis-nori
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-nori/analysis-nori-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-nori/analysis-nori-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-nori/analysis-nori-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-nori/analysis-nori-9.0.0-beta1.zip.asc).

## Removal [analysis-nori-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove analysis-nori
```

The node must be stopped before removing the plugin.

@ -0,0 +1,76 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-phonetic-token-filter.html
---

# phonetic token filter [analysis-phonetic-token-filter]

The `phonetic` token filter takes the following settings:

`encoder`
: Which phonetic encoder to use. Accepts `metaphone` (default), `double_metaphone`, `soundex`, `refined_soundex`, `caverphone1`, `caverphone2`, `cologne`, `nysiis`, `koelnerphonetik`, `haasephonetik`, `beider_morse`, `daitch_mokotoff`.

`replace`
: Whether or not the original token should be replaced by the phonetic token. Accepts `true` (default) and `false`. Not supported by `beider_morse` encoding.

```console
PUT phonetic_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "my_metaphone"
            ]
          }
        },
        "filter": {
          "my_metaphone": {
            "type": "phonetic",
            "encoder": "metaphone",
            "replace": false
          }
        }
      }
    }
  }
}

GET phonetic_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "Joe Bloggs" <1>
}
```

1. Returns: `J`, `joe`, `BLKS`, `bloggs`

It is important to note that `"replace": false` can lead to unexpected behavior since the original and the phonetically analyzed version are both kept at the same token position. Some queries handle these stacked tokens in special ways. For example, the fuzzy `match` query does not apply [fuzziness](/reference/elasticsearch/rest-apis/common-options.md#fuzziness) to stacked synonym tokens. This can lead to issues that are difficult to diagnose and reason about. For this reason, it is often beneficial to use separate fields for analysis with and without phonetic filtering. That way searches can be run against both fields with differing boosts and trade-offs (e.g. only run a fuzzy `match` query on the original text field, but not on the phonetic version).
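
For example, a minimal sketch of this two-field approach (the index name, field names, and boost values here are illustrative, not part of the filter's documented behavior):

```console
PUT phonetic_fields_sample
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "my_metaphone": {
            "type": "phonetic",
            "encoder": "metaphone",
            "replace": true
          }
        },
        "analyzer": {
          "my_phonetic_analyzer": {
            "tokenizer": "standard",
            "filter": [ "lowercase", "my_metaphone" ]
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "phonetic": {
            "type": "text",
            "analyzer": "my_phonetic_analyzer" <1>
          }
        }
      }
    }
  }
}

GET phonetic_fields_sample/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "name": { "query": "Jon", "fuzziness": "AUTO", "boost": 2 } } }, <2>
        { "match": { "name.phonetic": "Jon" } }
      ]
    }
  }
}
```

1. Only the `name.phonetic` sub-field is analyzed phonetically; `name` keeps the default analyzer, so the two analyses never share a token position.
2. Fuzziness and a higher boost are applied only to the original field; the phonetic sub-field contributes plain matches.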

## Double metaphone settings [_double_metaphone_settings]

If the `double_metaphone` encoder is used, then this additional setting is supported:

`max_code_len`
: The maximum length of the emitted metaphone token. Defaults to `4`.
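
A minimal sketch of a filter using this setting (the index, filter, and analyzer names are illustrative):

```console
PUT phonetic_dm_sample
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "my_double_metaphone": {
            "type": "phonetic",
            "encoder": "double_metaphone",
            "max_code_len": 6
          }
        },
        "analyzer": {
          "my_dm_analyzer": {
            "tokenizer": "standard",
            "filter": [ "lowercase", "my_double_metaphone" ]
          }
        }
      }
    }
  }
}
```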

## Beider Morse settings [_beider_morse_settings]

If the `beider_morse` encoder is used, then these additional settings are supported:

`rule_type`
: Whether matching should be `exact` or `approx` (default).

`name_type`
: Whether names are `ashkenazi`, `sephardic`, or `generic` (default).

`languageset`
: An array of languages to check. If not specified, then the language will be guessed. Accepts: `any`, `common`, `cyrillic`, `english`, `french`, `german`, `hebrew`, `hungarian`, `polish`, `romanian`, `russian`, `spanish`.

39 docs/reference/elasticsearch-plugins/analysis-phonetic.md Normal file

@ -0,0 +1,39 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-phonetic.html
---

# Phonetic analysis plugin [analysis-phonetic]

The Phonetic Analysis plugin provides token filters which convert tokens to their phonetic representation using Soundex, Metaphone, and a variety of other algorithms.

## Installation [analysis-phonetic-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install analysis-phonetic
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-phonetic/analysis-phonetic-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-phonetic/analysis-phonetic-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-phonetic/analysis-phonetic-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-phonetic/analysis-phonetic-9.0.0-beta1.zip.asc).

## Removal [analysis-phonetic-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove analysis-phonetic
```

The node must be stopped before removing the plugin.

52 docs/reference/elasticsearch-plugins/analysis-plugins.md Normal file

@ -0,0 +1,52 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis.html
---

# Analysis plugins [analysis]

Analysis plugins extend Elasticsearch by adding new analyzers, tokenizers, token filters, or character filters to Elasticsearch.

## Core analysis plugins [_core_analysis_plugins]

The core analysis plugins are:

[ICU](/reference/elasticsearch-plugins/analysis-icu.md)
: Adds extended Unicode support using the [ICU](http://site.icu-project.org/) libraries, including better analysis of Asian languages, Unicode normalization, Unicode-aware case folding, collation support, and transliteration.

[Kuromoji](/reference/elasticsearch-plugins/analysis-kuromoji.md)
: Advanced analysis of Japanese using the [Kuromoji analyzer](https://www.atilika.org/).

[Nori](/reference/elasticsearch-plugins/analysis-nori.md)
: Morphological analysis of Korean using the Lucene Nori analyzer.

[Phonetic](/reference/elasticsearch-plugins/analysis-phonetic.md)
: Analyzes tokens into their phonetic equivalent using Soundex, Metaphone, Caverphone, and other codecs.

[SmartCN](/reference/elasticsearch-plugins/analysis-smartcn.md)
: An analyzer for Chinese or mixed Chinese-English text. This analyzer uses probabilistic knowledge to find the optimal word segmentation for Simplified Chinese text. The text is first broken into sentences, then each sentence is segmented into words.

[Stempel](/reference/elasticsearch-plugins/analysis-stempel.md)
: Provides high quality stemming for Polish.

[Ukrainian](/reference/elasticsearch-plugins/analysis-ukrainian.md)
: Provides stemming for Ukrainian.

## Community contributed analysis plugins [_community_contributed_analysis_plugins]

A number of analysis plugins have been contributed by our community:

* [IK Analysis Plugin](https://github.com/medcl/elasticsearch-analysis-ik) (by Medcl)
* [Pinyin Analysis Plugin](https://github.com/medcl/elasticsearch-analysis-pinyin) (by Medcl)
* [Vietnamese Analysis Plugin](https://github.com/duydo/elasticsearch-analysis-vietnamese) (by Duy Do)
* [STConvert Analysis Plugin](https://github.com/medcl/elasticsearch-analysis-stconvert) (by Medcl)

68 docs/reference/elasticsearch-plugins/analysis-polish-stop.md Normal file

@ -0,0 +1,68 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-polish-stop.html
---

# polish_stop token filter [analysis-polish-stop]

The `polish_stop` token filter filters out Polish stopwords (`_polish_`), and any other custom stopwords specified by the user. This filter only supports the predefined `_polish_` stopwords list. If you want to use a different predefined list, then use the [`stop` token filter](/reference/data-analysis/text-analysis/analysis-stop-tokenfilter.md) instead.

```console
PUT /polish_stop_example
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "analyzer_with_stop": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "polish_stop"
            ]
          }
        },
        "filter": {
          "polish_stop": {
            "type": "polish_stop",
            "stopwords": [
              "_polish_",
              "jeść"
            ]
          }
        }
      }
    }
  }
}

GET polish_stop_example/_analyze
{
  "analyzer": "analyzer_with_stop",
  "text": "Gdzie kucharek sześć, tam nie ma co jeść."
}
```

The above request returns:

```console-result
{
  "tokens" : [
    {
      "token" : "kucharek",
      "start_offset" : 6,
      "end_offset" : 14,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "sześć",
      "start_offset" : 15,
      "end_offset" : 20,
      "type" : "<ALPHANUM>",
      "position" : 2
    }
  ]
}
```

52 docs/reference/elasticsearch-plugins/analysis-smartcn.md Normal file

@ -0,0 +1,52 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-smartcn.html
---

# Smart Chinese analysis plugin [analysis-smartcn]

The Smart Chinese Analysis plugin integrates Lucene’s Smart Chinese analysis module into Elasticsearch.

It provides an analyzer for Chinese or mixed Chinese-English text. This analyzer uses probabilistic knowledge to find the optimal word segmentation for Simplified Chinese text. The text is first broken into sentences, then each sentence is segmented into words.

## Installation [analysis-smartcn-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install analysis-smartcn
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-smartcn/analysis-smartcn-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-smartcn/analysis-smartcn-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-smartcn/analysis-smartcn-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-smartcn/analysis-smartcn-9.0.0-beta1.zip.asc).

## Removal [analysis-smartcn-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove analysis-smartcn
```

The node must be stopped before removing the plugin.

## `smartcn` tokenizer and token filter [analysis-smartcn-tokenizer]

The plugin provides the `smartcn` analyzer, `smartcn_tokenizer` tokenizer, and `smartcn_stop` token filter, which are not configurable.

::::{note}
The `smartcn_word` token filter and `smartcn_sentence` tokenizer have been deprecated.
::::
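
Once the plugin is installed, a quick way to see the analyzer in action is the `_analyze` API. A minimal sketch (the sample text is illustrative):

```console
GET _analyze
{
  "analyzer": "smartcn",
  "text": "我们是 Elastic"
}
```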

377 docs/reference/elasticsearch-plugins/analysis-smartcn_stop.md Normal file

@ -0,0 +1,377 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-smartcn_stop.html
---

# smartcn_stop token filter [analysis-smartcn_stop]

The `smartcn_stop` token filter filters out stopwords defined by the `smartcn` analyzer (`_smartcn_`), and any other custom stopwords specified by the user. This filter only supports the predefined `_smartcn_` stopwords list. If you want to use a different predefined list, then use the [`stop` token filter](/reference/data-analysis/text-analysis/analysis-stop-tokenfilter.md) instead.

```console
PUT smartcn_example
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "smartcn_with_stop": {
            "tokenizer": "smartcn_tokenizer",
            "filter": [
              "porter_stem",
              "my_smartcn_stop"
            ]
          }
        },
        "filter": {
          "my_smartcn_stop": {
            "type": "smartcn_stop",
            "stopwords": [
              "_smartcn_",
              "stack",
              "的"
            ]
          }
        }
      }
    }
  }
}

GET smartcn_example/_analyze
{
  "analyzer": "smartcn_with_stop",
  "text": "哈喽,我们是 Elastic 我们是 Elastic Stack(Elasticsearch、Kibana、Beats 和 Logstash)的开发公司。从股票行情到 Twitter 消息流,从 Apache 日志到 WordPress 博文,我们可以帮助人们体验搜索的强大力量,帮助他们以截然不同的方式探索和分析数据"
}
```

The above request returns:

```console-result
{
  "tokens": [
    {
      "token": "哈",
      "start_offset": 0,
      "end_offset": 1,
      "type": "word",
      "position": 0
    },
    {
      "token": "喽",
      "start_offset": 1,
      "end_offset": 2,
      "type": "word",
      "position": 1
    },
    {
      "token": "我们",
      "start_offset": 3,
      "end_offset": 5,
      "type": "word",
      "position": 3
    },
    {
      "token": "是",
      "start_offset": 5,
      "end_offset": 6,
      "type": "word",
      "position": 4
    },
    {
      "token": "elast",
      "start_offset": 7,
      "end_offset": 14,
      "type": "word",
      "position": 5
    },
    {
      "token": "我们",
      "start_offset": 17,
      "end_offset": 19,
      "type": "word",
      "position": 6
    },
    {
      "token": "是",
      "start_offset": 19,
      "end_offset": 20,
      "type": "word",
      "position": 7
    },
    {
      "token": "elast",
      "start_offset": 21,
      "end_offset": 28,
      "type": "word",
      "position": 8
    },
    {
      "token": "elasticsearch",
      "start_offset": 35,
      "end_offset": 48,
      "type": "word",
      "position": 11
    },
    {
      "token": "kibana",
      "start_offset": 49,
      "end_offset": 55,
      "type": "word",
      "position": 13
    },
    {
      "token": "beat",
      "start_offset": 56,
      "end_offset": 61,
      "type": "word",
      "position": 15
    },
    {
      "token": "和",
      "start_offset": 62,
      "end_offset": 63,
      "type": "word",
      "position": 16
    },
    {
      "token": "logstash",
      "start_offset": 64,
      "end_offset": 72,
      "type": "word",
      "position": 17
    },
    {
      "token": "开发",
      "start_offset": 74,
      "end_offset": 76,
      "type": "word",
      "position": 20
    },
    {
      "token": "公司",
      "start_offset": 76,
      "end_offset": 78,
      "type": "word",
      "position": 21
    },
    {
      "token": "从",
      "start_offset": 79,
      "end_offset": 80,
      "type": "word",
      "position": 23
    },
    {
      "token": "股票",
      "start_offset": 80,
      "end_offset": 82,
      "type": "word",
      "position": 24
    },
    {
      "token": "行情",
      "start_offset": 82,
      "end_offset": 84,
      "type": "word",
      "position": 25
    },
    {
      "token": "到",
      "start_offset": 84,
      "end_offset": 85,
      "type": "word",
      "position": 26
    },
    {
      "token": "twitter",
      "start_offset": 86,
      "end_offset": 93,
      "type": "word",
      "position": 27
    },
    {
      "token": "消息",
      "start_offset": 94,
      "end_offset": 96,
      "type": "word",
      "position": 28
    },
    {
      "token": "流",
      "start_offset": 96,
      "end_offset": 97,
      "type": "word",
      "position": 29
    },
    {
      "token": "从",
      "start_offset": 98,
      "end_offset": 99,
      "type": "word",
      "position": 31
    },
    {
      "token": "apach",
      "start_offset": 100,
      "end_offset": 106,
      "type": "word",
      "position": 32
    },
    {
      "token": "日志",
      "start_offset": 107,
      "end_offset": 109,
      "type": "word",
      "position": 33
    },
    {
      "token": "到",
      "start_offset": 109,
      "end_offset": 110,
      "type": "word",
      "position": 34
    },
    {
      "token": "wordpress",
      "start_offset": 111,
      "end_offset": 120,
      "type": "word",
      "position": 35
    },
    {
      "token": "博",
      "start_offset": 121,
      "end_offset": 122,
      "type": "word",
      "position": 36
    },
    {
      "token": "文",
      "start_offset": 122,
      "end_offset": 123,
      "type": "word",
      "position": 37
    },
    {
      "token": "我们",
      "start_offset": 124,
      "end_offset": 126,
      "type": "word",
      "position": 39
    },
    {
      "token": "可以",
      "start_offset": 126,
      "end_offset": 128,
      "type": "word",
      "position": 40
    },
    {
      "token": "帮助",
      "start_offset": 128,
      "end_offset": 130,
      "type": "word",
      "position": 41
    },
    {
      "token": "人们",
      "start_offset": 130,
      "end_offset": 132,
      "type": "word",
      "position": 42
    },
    {
      "token": "体验",
      "start_offset": 132,
      "end_offset": 134,
      "type": "word",
      "position": 43
    },
    {
      "token": "搜索",
      "start_offset": 134,
      "end_offset": 136,
      "type": "word",
      "position": 44
    },
    {
      "token": "强大",
      "start_offset": 137,
      "end_offset": 139,
      "type": "word",
      "position": 46
    },
    {
      "token": "力量",
      "start_offset": 139,
      "end_offset": 141,
      "type": "word",
      "position": 47
    },
    {
      "token": "帮助",
      "start_offset": 142,
      "end_offset": 144,
      "type": "word",
      "position": 49
    },
    {
      "token": "他们",
      "start_offset": 144,
      "end_offset": 146,
      "type": "word",
      "position": 50
    },
    {
      "token": "以",
      "start_offset": 146,
      "end_offset": 147,
      "type": "word",
      "position": 51
    },
    {
      "token": "截然不同",
      "start_offset": 147,
      "end_offset": 151,
      "type": "word",
      "position": 52
    },
    {
      "token": "方式",
      "start_offset": 152,
      "end_offset": 154,
      "type": "word",
      "position": 54
    },
    {
      "token": "探索",
      "start_offset": 154,
      "end_offset": 156,
      "type": "word",
      "position": 55
    },
    {
      "token": "和",
      "start_offset": 156,
      "end_offset": 157,
      "type": "word",
      "position": 56
    },
    {
      "token": "分析",
      "start_offset": 157,
      "end_offset": 159,
      "type": "word",
      "position": 57
    },
    {
      "token": "数据",
      "start_offset": 159,
      "end_offset": 161,
      "type": "word",
      "position": 58
    }
  ]
}
```

45 docs/reference/elasticsearch-plugins/analysis-stempel.md Normal file

@ -0,0 +1,45 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-stempel.html
---

# Stempel Polish analysis plugin [analysis-stempel]

The Stempel analysis plugin integrates Lucene’s Stempel analysis module for Polish into Elasticsearch.

## Installation [analysis-stempel-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install analysis-stempel
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-stempel/analysis-stempel-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-stempel/analysis-stempel-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-stempel/analysis-stempel-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-stempel/analysis-stempel-9.0.0-beta1.zip.asc).

## Removal [analysis-stempel-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove analysis-stempel
```

The node must be stopped before removing the plugin.

## `stempel` tokenizer and token filters [analysis-stempel-tokenizer]

The plugin provides the `polish` analyzer and the `polish_stem` and `polish_stop` token filters, which are not configurable.
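
Although they are not configurable, the token filters can still be combined with other analysis components. A minimal sketch of a custom analyzer using them (the index and analyzer names are illustrative):

```console
PUT stempel_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_polish_analyzer": {
            "tokenizer": "standard",
            "filter": [ "lowercase", "polish_stop", "polish_stem" ]
          }
        }
      }
    }
  }
}
```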

45 docs/reference/elasticsearch-plugins/analysis-ukrainian.md Normal file

@ -0,0 +1,45 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-ukrainian.html
---

# Ukrainian analysis plugin [analysis-ukrainian]

The Ukrainian analysis plugin integrates Lucene’s UkrainianMorfologikAnalyzer into Elasticsearch.

It provides stemming for Ukrainian using the [Morfologik project](https://github.com/morfologik/morfologik-stemming).

## Installation [analysis-ukrainian-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install analysis-ukrainian
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-ukrainian/analysis-ukrainian-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-ukrainian/analysis-ukrainian-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-ukrainian/analysis-ukrainian-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-ukrainian/analysis-ukrainian-9.0.0-beta1.zip.asc).

## Removal [analysis-ukrainian-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove analysis-ukrainian
```

The node must be stopped before removing the plugin.

## `ukrainian` analyzer [analysis-ukrainian-analyzer]

The plugin provides the `ukrainian` analyzer.
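
A minimal sketch of trying the analyzer with the `_analyze` API (the sample text is illustrative):

```console
GET _analyze
{
  "analyzer": "ukrainian",
  "text": "Українські тексти"
}
```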

@ -0,0 +1,19 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/api.html
---

# API extension plugins [api]

API extension plugins add new functionality to Elasticsearch by adding new APIs or features, usually to do with search or mapping.

## Community contributed API extension plugins [_community_contributed_api_extension_plugins]

A number of plugins have been contributed by our community:

* [carrot2 Plugin](https://github.com/carrot2/elasticsearch-carrot2): Results clustering with [carrot2](https://github.com/carrot2/carrot2) (by Dawid Weiss)
* [Elasticsearch Trigram Accelerated Regular Expression Filter](https://github.com/wikimedia/search-extra) (by Wikimedia Foundation/Nik Everett)
* [Elasticsearch Experimental Highlighter](https://github.com/wikimedia/search-highlighter) (by Wikimedia Foundation/Nik Everett)
* [Entity Resolution Plugin](https://github.com/zentity-io/zentity) ([zentity](https://zentity.io)): Real-time entity resolution with pure Elasticsearch (by Dave Moore)

@ -0,0 +1,35 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/cloud-aws-best-practices.html
---

# Best Practices in AWS [cloud-aws-best-practices]

This section contains some other information about designing and managing an {{es}} cluster on your own AWS infrastructure. If you would prefer to avoid these operational details then you may be interested in a hosted {{es}} installation available on AWS-based infrastructure from [https://www.elastic.co/cloud](https://www.elastic.co/cloud).

## Storage [_storage]

EC2 instances offer a number of different kinds of storage. Please be aware of the following when selecting the storage for your cluster:

* [Instance Store](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html) is recommended for {{es}} clusters as it offers excellent performance and is cheaper than EBS-based storage. {{es}} is designed to work well with this kind of ephemeral storage because it replicates each shard across multiple nodes. If a node fails and its Instance Store is lost then {{es}} will rebuild any lost shards from other copies.
* [EBS-based storage](https://aws.amazon.com/ebs/) may be acceptable for smaller clusters (1-2 nodes). Be sure to use provisioned IOPS to ensure your cluster has satisfactory performance.
* [EFS-based storage](https://aws.amazon.com/efs/) is not recommended or supported as it does not offer satisfactory performance. Historically, shared network filesystems such as EFS have not always offered precisely the behaviour that {{es}} requires of its filesystem, and this has been known to lead to index corruption. Although EFS offers durability, shared storage, and the ability to grow and shrink filesystems dynamically, you can achieve the same benefits using {{es}} directly.

## Choice of AMI [_choice_of_ami]

Prefer the [Amazon Linux 2 AMIs](https://aws.amazon.com/amazon-linux-2/) as these allow you to benefit from the lightweight nature, support, and EC2-specific performance enhancements that these images offer.

## Networking [_networking]

* Smaller instance types have limited network performance, in terms of both [bandwidth and number of connections](https://lab.getbase.com/how-we-discovered-limitations-on-the-aws-tcp-stack/). If networking is a bottleneck, avoid [instance types](https://aws.amazon.com/ec2/instance-types/) with networking labelled as `Moderate` or `Low`.
* It is a good idea to distribute your nodes across multiple [availability zones](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html) and use [shard allocation awareness](docs-content://deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/shard-allocation-awareness.md) to ensure that each shard has copies in more than one availability zone (see the sketch after this list).
* Do not span a cluster across regions. {{es}} expects that node-to-node connections within a cluster are reasonably reliable and offer high bandwidth and low latency, and these properties do not hold for connections between regions. Although an {{es}} cluster will behave correctly when node-to-node connections are unreliable or slow, it is not optimised for this case and its performance may suffer. If you wish to geographically distribute your data, you should provision multiple clusters and use features such as [cross-cluster search](docs-content://solutions/search/cross-cluster-search.md) and [cross-cluster replication](docs-content://deploy-manage/tools/cross-cluster-replication.md).
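
A minimal sketch of enabling zone-based allocation awareness, assuming each node's `elasticsearch.yml` already sets a zone attribute such as `node.attr.zone: us-east-1a`:

```console
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "zone"
  }
}
```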

## Other recommendations [_other_recommendations]

* If you have split your nodes into roles, consider [tagging the EC2 instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_Tags.html) by role to make it easier to filter and view your EC2 instances in the AWS console.
* Consider [enabling termination protection](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/terminating-instances.html#Using_ChangingDisableAPITermination) for all of your data and master-eligible nodes. This will help to prevent accidental termination of these nodes which could temporarily reduce the resilience of the cluster and which could cause a potentially disruptive reallocation of shards.
* If running your cluster using one or more [auto-scaling groups](https://docs.aws.amazon.com/autoscaling/ec2/userguide/AutoScalingGroup.html), consider protecting your data and master-eligible nodes [against termination during scale-in](https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-instance-termination.html#instance-protection-instance). This will help to prevent automatic termination of these nodes which could temporarily reduce the resilience of the cluster and which could cause a potentially disruptive reallocation of shards. If these instances are protected against termination during scale-in then you can use shard allocation filtering to gracefully migrate any data off these nodes before terminating them manually. Refer to [](/reference/elasticsearch/index-settings/shard-allocation.md).

@ -0,0 +1,41 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/cloud-enterprise/current/ece-add-plugins.html
---

# Plugin management (Cloud Enterprise) [ece-add-plugins]

Plugins extend the core functionality of Elasticsearch. Elastic Cloud Enterprise makes it easy to add plugins to your deployment by providing a number of plugins that work with your version of Elasticsearch. One advantage of these plugins is that you generally don’t have to worry about upgrading plugins when upgrading to a new Elasticsearch version, unless there are breaking changes. The plugins are simply upgraded along with the rest of your deployment.

Adding plugins to a deployment is as simple as selecting them from the list of available plugins, but different versions of Elasticsearch support different plugins. Plugins are available for different purposes, such as:

* National language support, phonetic analysis, and extended unicode support
* Ingesting attachments in common formats and ingesting information about the geographic location of IP addresses
* Adding new field datatypes to Elasticsearch

Additional plugins might be available. If a plugin is listed for your version of Elasticsearch, it can be used.

To add plugins when creating a new deployment:

1. [Log into the Cloud UI](docs-content://deploy-manage/deploy/cloud-enterprise/log-into-cloud-ui.md) and select **Create deployment**.
2. Make your initial deployment selections, then select **Customize Deployment**.
3. Beneath the Elasticsearch master node, expand the **Manage plugins and settings** caret.
4. Select the plugins you want.
5. Select **Create deployment**.

The deployment spins up with the plugins installed.

To add plugins to an existing deployment:

1. [Log into the Cloud UI](docs-content://deploy-manage/deploy/cloud-enterprise/log-into-cloud-ui.md).
2. On the **Deployments** page, select your deployment.

    Narrow the list by name, ID, or choose from several other filters. To further define the list, use a combination of filters.

3. From your deployment menu, go to the **Edit** page.
4. Beneath the Elasticsearch master node, expand the **Manage plugins and settings** caret.
5. Select the plugins that you want.
6. Select **Save changes**.

There is no downtime when adding plugins to highly available deployments. The deployment is updated with new nodes that have the plugins installed.

@ -0,0 +1,30 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/cloud/current/ec-adding-elastic-plugins.html
---

# Add plugins provided with Elasticsearch Service [ec-adding-elastic-plugins]

You can use a variety of official plugins that are compatible with your version of {{es}}. When you upgrade to a new {{es}} version, these plugins are simply upgraded with the rest of your deployment.

## Before you begin [ec_before_you_begin_6]

Some restrictions apply when adding plugins. To learn more, check [Restrictions for {{es}} and {{kib}} plugins](cloud://docs/release-notes/cloud-hosted/known-issues.md#ec-restrictions-plugins).

Only Gold, Platinum, Enterprise and Private subscriptions, running version 2.4.6 or later, have access to uploading custom plugins. All subscription levels, including Standard, can upload scripts and dictionaries.

To enable a plugin for a deployment:

1. Log in to the [Elasticsearch Service Console](https://cloud.elastic.co?page=docs&placement=docs-body).
2. Find your deployment on the home page in the Elasticsearch Service card and select **Manage** to access it directly. Or, select **Hosted deployments** to go to the deployments page to view all of your deployments.

    On the deployments page you can narrow your deployments by name, ID, or choose from several other filters. To customize your view, use a combination of filters, or change the format from a grid to a list.

3. From the **Actions** dropdown, select **Edit deployment**.
4. Select **Manage user settings and extensions**.
5. Select the **Extensions** tab.
6. Select the plugins that you want to enable.
7. Select **Back**.
8. Select **Save**. The {{es}} cluster is then updated with new nodes that have the plugin installed.

@ -0,0 +1,31 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/cloud/current/ec-adding-plugins.html
---

# Add plugins and extensions [ec-adding-plugins]

Plugins extend the core functionality of {{es}}. There are many suitable plugins, including:

* Discovery plugins, such as the cloud AWS plugin that allows discovering nodes on EC2 instances.
* Analysis plugins, to provide analyzers targeted at languages other than English.
* Scripting plugins, to provide additional scripting languages.

Plugins can come from different sources: the official ones created or at least maintained by Elastic, community-sourced plugins from other users, and plugins that you provide. Some of the official plugins are always provided with our service, and can be [enabled per deployment](/reference/elasticsearch-plugins/cloud/ec-adding-elastic-plugins.md).

There are two ways to add plugins to a deployment in Elasticsearch Service:

* [Enable one of the official plugins already available in Elasticsearch Service](/reference/elasticsearch-plugins/cloud/ec-adding-elastic-plugins.md).
* [Upload a custom plugin and then enable it per deployment](/reference/elasticsearch-plugins/cloud/ec-custom-bundles.md).

Custom plugins can include the official {{es}} plugins not provided with Elasticsearch Service, any of the community-sourced plugins, or [plugins that you write yourself](/extend/index.md). Uploading custom plugins is available only to Gold, Platinum, and Enterprise subscriptions. For more information, check [Upload custom plugins and bundles](/reference/elasticsearch-plugins/cloud/ec-custom-bundles.md).

To learn more about the official and community-sourced plugins, refer to [{{es}} Plugins and Integrations](/reference/elasticsearch-plugins/index.md).

For a detailed guide with examples of using the Elasticsearch Service API to create, get information about, update, and delete extensions and plugins, check [Managing plugins and extensions through the API](/reference/elasticsearch-plugins/cloud/ec-plugins-guide.md).

Plugins are not supported for {{kib}}. To learn more, check [Restrictions for {{es}} and {{kib}} plugins](cloud://docs/release-notes/cloud-hosted/known-issues.md#ec-restrictions-plugins).

250 docs/reference/elasticsearch-plugins/cloud/ec-custom-bundles.md Normal file

@ -0,0 +1,250 @@

---
mapped_pages:
  - https://www.elastic.co/guide/en/cloud/current/ec-custom-bundles.html
---

# Upload custom plugins and bundles [ec-custom-bundles]

There are several cases where you might need your own files to be made available to your {{es}} cluster’s nodes:

* Your own custom plugins, or third-party plugins that are not amongst the [officially available plugins](/reference/elasticsearch-plugins/plugin-management.md).
* Custom dictionaries, such as synonyms, stop words, compound words, and so on.
* Cluster configuration files, such as an Identity Provider metadata file used when you [secure your clusters with SAML](docs-content://deploy-manage/users-roles/cluster-or-deployment-auth/saml.md).

To facilitate this, we make it possible to upload a ZIP file that contains the files you want to make available. Uploaded files are stored using Amazon’s highly-available S3 service. This is necessary so we do not have to rely on the availability of third-party services, such as the official plugin repository, when provisioning nodes.

Custom plugins and bundles are collectively referred to as extensions.

## Before you begin [ec_before_you_begin_7]

The selected plugins/bundles are downloaded and provided when a node starts. Changing a plugin does not change it for nodes already running it. Refer to [Updating Plugins and Bundles](#ec-update-bundles-and-plugins).

With great power comes great responsibility: your plugins can extend your deployment with new functionality, but also break it. Be careful. We obviously cannot guarantee that your custom code works.

::::{important}
You cannot edit or delete a custom extension after it has been used in a deployment. To remove it from your deployment, you can disable the extension and update your deployment configuration.
::::

Uploaded files cannot be bigger than 20MB for most subscription levels; for Platinum and Enterprise the limit is 8GB.

It is important that plugins and dictionaries that you reference in mappings and configurations are available at all times. For example, if you try to upgrade {{es}} and de-select a dictionary that is referenced in your mapping, the new nodes will be unable to recover the cluster state and function. This is true even if the dictionary is referenced by an empty index you do not actually use.

## Prepare your files for upload [ec-prepare-custom-bundles]

Plugins are uploaded as ZIP files. You need to choose whether your uploaded file should be treated as a *plugin* or as a *bundle*. Bundles are not installed as plugins. If you need to upload both a custom plugin and custom dictionaries, upload them separately.

To prepare your files, create one of the following:

Plugins
: A plugin is a ZIP file that contains a plugin descriptor file and binaries.

    The plugin descriptor file is called either `stable-plugin-descriptor.properties` for plugins built against the stable plugin API, or `plugin-descriptor.properties` for plugins built against the classic plugin API. A plugin ZIP file should only contain one plugin descriptor file.

    {{es}} assumes that the uploaded ZIP file contains binaries. If it finds any source code, it fails with an error message, causing provisioning to fail. Make sure you upload binaries, and not source code.

    ::::{note}
    Plugins larger than 5GB should have the plugin descriptor file at the top of the archive. This order can be achieved by specifying the descriptor file first when creating the ZIP file:

    ```sh
    zip -r name-of-plugin.zip name-of-descriptor-file.properties *
    ```

    ::::

Bundles
: The entire content of a bundle is made available to the node by extracting to the {{es}} container’s `/app/config` directory. This is useful to make custom dictionaries available. Dictionaries should be placed in a `/dictionaries` folder in the root path of your ZIP file.

    Here are some examples of bundles:

    **Script**

    ```text
    $ tree .
    .
    └── scripts
        └── test.js
    ```

    The script `test.js` can be referred to in queries as `"script": "test"`.

    **Dictionary of synonyms**

    ```text
    $ tree .
    .
    └── dictionaries
        └── synonyms.txt
    ```

    The dictionary `synonyms.txt` can be referenced as `synonyms.txt`, or by the full path `/app/config/synonyms.txt`, in the `synonyms_path` setting of the synonym token filter.

    To learn more about analyzing with synonyms, check [Synonym token filter](/reference/data-analysis/text-analysis/analysis-synonym-tokenfilter.md) and [Formatting Synonyms](https://www.elastic.co/guide/en/elasticsearch/guide/2.x/synonym-formats.html).
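
    As a sketch, an analysis setup that picks up the bundled dictionary might look like this (the index, filter, and analyzer names are illustrative):

    ```console
    PUT synonyms_sample
    {
      "settings": {
        "index": {
          "analysis": {
            "filter": {
              "my_synonyms": {
                "type": "synonym",
                "synonyms_path": "synonyms.txt"
              }
            },
            "analyzer": {
              "my_analyzer": {
                "tokenizer": "standard",
                "filter": [ "lowercase", "my_synonyms" ]
              }
            }
          }
        }
      }
    }
    ```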

    **GeoIP database bundle**

    ```text
    $ tree .
    .
    └── ingest-geoip
        └── MyGeoLite2-City.mmdb
    ```

    Note that the file name must end in `-(City|Country|ASN).mmdb`, and it must be a different name than the original file name `GeoLite2-City.mmdb` which already exists in Elasticsearch Service. To use this bundle, you can refer to it in the GeoIP ingest pipeline as `MyGeoLite2-City.mmdb` under `database_file`.
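
    For example, a sketch of an ingest pipeline that uses the bundled database (the pipeline name and source field are illustrative):

    ```console
    PUT _ingest/pipeline/my-geoip-pipeline
    {
      "processors": [
        {
          "geoip": {
            "field": "client_ip",
            "database_file": "MyGeoLite2-City.mmdb"
          }
        }
      ]
    }
    ```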
|
||||
|
||||
|
||||
## Add your extension [ec-add-your-plugin]
|
||||
|
||||
You must upload your files before you can apply them to your cluster configuration:
|
||||
|
||||
1. Log in to the [Elasticsearch Service Console](https://cloud.elastic.co?page=docs&placement=docs-body).
|
||||
2. Find your deployment on the home page in the Elasticsearch Service card and select **Manage** to access it directly. Or, select **Hosted deployments** to go to the deployments page to view all of your deployments.
|
||||
3. Under **Features**, select **Extensions**.
|
||||
4. Select **Upload extension**.
|
||||
5. Complete the extension fields, including the {{es}} version.
|
||||
|
||||
* Plugins must use full version notation down to the patch level, such as `7.10.1`. You cannot use wildcards. This version notation should match the version in your plugin’s plugin descriptor file. For classic plugins, it should also match the target deployment version.
|
||||
* Bundles should specify major or minor versions with wildcards, such as `7.*` or `*`. Wildcards are recommended to ensure the bundle is compatible across all versions of these releases.
|
||||
|
||||
6. Select the extension **Type**.
|
||||
7. Under **Plugin file**, choose the file to upload.
|
||||
8. Select **Create extension**.
|
||||
|
||||
After creating your extension, you can [enable them for existing {{es}} deployments](#ec-update-bundles) or enable them when creating new deployments.
|
||||
|
||||
::::{note}
|
||||
Creating extensions larger than 200MB should be done through the extensions API.
|
||||
|
||||
Refer to [Managing plugins and extensions through the API](/reference/elasticsearch-plugins/cloud/ec-plugins-guide.md) for more details.
|
||||
|
||||
::::
|
||||
|
||||
|
||||
|
||||
## Update your deployment configuration [ec-update-bundles]
|
||||
|
||||
After uploading your files, you can select to enable them when creating a new {{es}} deployment. For existing deployments, you must update your deployment configuration to use the new files:
|
||||
|
||||
1. Log in to the [Elasticsearch Service Console](https://cloud.elastic.co?page=docs&placement=docs-body).
|
||||
2. Find your deployment on the home page in the Elasticsearch Service card and select **Manage** to access it directly. Or, select **Hosted deployments** to go to the deployments page to view all of your deployments.
|
||||
|
||||
On the deployments page you can narrow your deployments by name, ID, or choose from several other filters. To customize your view, use a combination of filters, or change the format from a grid to a list.
|
||||
|
||||
3. From the **Actions** dropdown, select **Edit deployment**.
|
||||
4. Select **Manage user settings and extensions**.
|
||||
5. Select the **Extensions** tab.
|
||||
6. Select the custom extension.
|
||||
7. Select **Back**.
|
||||
8. Select **Save**. The {{es}} cluster is then updated with new nodes that have the plugin installed.

## Update your extension [ec-update-bundles-and-plugins]

While you can update the ZIP file for any plugin or bundle, the updated files are downloaded and made available only when a node is started.

You should be careful when updating an extension. If you update an existing extension with a broken file, all nodes could be affected: a node restart or move could then make even highly available clusters unavailable.

If the extension is not in use by any deployments, you are free to update the files or extension details as much as you like. However, if the extension is in use and you need to update it with a new file, it is recommended to [create a new extension](#ec-add-your-plugin) rather than updating the one that is in use.

With this approach, even if the extension file is faulty, only the one node being updated goes down, which ensures that HA clusters remain available.

This method also lets you try out the extension changes on a test or staging deployment before applying them to a production deployment.

You may delete the old extension after updating the deployment successfully.

To update an extension with a new file version:

1. Prepare a new plugin or bundle.
2. On the **Extensions** page, [upload a new extension](#ec-add-your-plugin).
3. Make your new files available by uploading them.
4. Find your deployment on the home page in the Elasticsearch Service card and select **Manage** to access it directly. Or, select **Hosted deployments** to go to the deployments page to view all of your deployments.

    On the deployments page you can narrow the list of deployments by name or ID, or choose from several other filters. To customize your view, use a combination of filters, or change the format from a grid to a list.

5. From the **Actions** dropdown, select **Edit deployment**.
6. Select **Manage user settings and extensions**.
7. Select the **Extensions** tab.
8. Select the new extension and de-select the old one.
9. Select **Back**.
10. Select **Save**.

## How to use the extensions API [ec-extension-api-usage-guide]

::::{note}
For a full set of examples, check [Managing plugins and extensions through the API](/reference/elasticsearch-plugins/cloud/ec-plugins-guide.md).
::::

If you don’t already have one, create an [API key](docs-content://deploy-manage/api-keys/elastic-cloud-api-keys.md).
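The examples below assume the key is available in an environment variable, for example:

```sh
# Replace with your actual Elasticsearch Service API key.
export EC_API_KEY="YOUR_API_KEY"
```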

There are two ways to use the extensions API to upload a file.

### Method 1: Use HTTP `POST` to create metadata and then upload the file using HTTP `PUT` [ec_method_1_use_http_post_to_create_metadata_and_then_upload_the_file_using_http_put]

Step 1: Create metadata

```sh
curl -XPOST \
    -H "Authorization: ApiKey $EC_API_KEY" \
    -H 'content-type:application/json' \
    https://api.elastic-cloud.com/api/v1/deployments/extensions \
    -d'{
        "name" : "synonyms-v1",
        "description" : "The best synonyms ever",
        "extension_type" : "bundle",
        "version" : "7.*"
    }'
```
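The `PUT` in the next step needs the ID of the extension you just created. As a minimal sketch (assuming `jq` is installed), you can capture it from the `POST` response:

```sh
# Same request as above, capturing the new extension's ID from the JSON response.
extension_id=$(curl -s -XPOST \
    -H "Authorization: ApiKey $EC_API_KEY" \
    -H 'content-type:application/json' \
    https://api.elastic-cloud.com/api/v1/deployments/extensions \
    -d'{"name":"synonyms-v1","description":"The best synonyms ever","extension_type":"bundle","version":"7.*"}' \
    | jq -r '.id')
echo "$extension_id"
```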

Step 2: Upload the file

```sh
curl -XPUT \
    -H "Authorization: ApiKey $EC_API_KEY" \
    "https://api.elastic-cloud.com/api/v1/deployments/extensions/$extension_id" \
    -T /tmp/synonyms.zip
```

If you are using a client that, unlike `curl`, does not natively handle `application/zip` uploads, be sure to use the equivalent of the following with `content-type: multipart/form-data`:

```sh
curl -XPUT \
    -H 'Expect:' \
    -H 'content-type: multipart/form-data' \
    -H "Authorization: ApiKey $EC_API_KEY" \
    "https://api.elastic-cloud.com/api/v1/deployments/extensions/$extension_id" -F "file=@/tmp/synonyms.zip"
```

For example, using the Python `requests` module, the `PUT` request would be as follows:

```python
import requests

files = {'file': open('/tmp/synonyms.zip', 'rb')}
r = requests.put(
    'https://api.elastic-cloud.com/api/v1/deployments/extensions/{}'.format(extension_id),
    files=files,
    headers={'Authorization': 'ApiKey {}'.format(EC_API_KEY)},
)
```

### Method 2: Single step. Use a `download_url` so that the API server downloads the object at the specified URL [ec_method_2_single_step_use_a_download_url_so_that_the_api_server_downloads_the_object_at_the_specified_url]

```sh
curl -XPOST \
    -H "Authorization: ApiKey $EC_API_KEY" \
    -H 'content-type:application/json' \
    https://api.elastic-cloud.com/api/v1/deployments/extensions \
    -d'{
        "name" : "analysis_icu",
        "description" : "Helpful description",
        "extension_type" : "plugin",
        "version" : "7.13.2",
        "download_url": "https://artifacts.elastic.co/downloads/elasticsearch-plugins/analysis-icu/analysis-icu-7.13.2.zip"
    }'
```

Please refer to the [Extensions API reference](https://www.elastic.co/docs/api/doc/cloud/group/endpoint-extensions) for the complete set of HTTP methods and payloads.

498 docs/reference/elasticsearch-plugins/cloud/ec-plugins-guide.md (new file)
@@ -0,0 +1,498 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/cloud/current/ec-plugins-guide.html
---

# Managing plugins and extensions through the API [ec-plugins-guide]

This guide provides a full list of tasks for managing [plugins and extensions](ec-adding-plugins.md) in Elasticsearch Service, using the API.

* [Create an extension](ec-plugins-guide.md#ec-extension-guide-create)
* [Add an extension to a deployment plan](ec-plugins-guide.md#ec-extension-guide-add-plan)
* [Get an extension](ec-plugins-guide.md#ec-extension-guide-get-extension)
* [Update the name of an existing extension](ec-plugins-guide.md#ec-extension-guide-update-name)
* [Update the type of an existing extension](ec-plugins-guide.md#ec-extension-guide-update-type)
* [Update the version of an existing bundle](ec-plugins-guide.md#ec-extension-guide-update-version-bundle)
* [Update the version of an existing plugin](ec-plugins-guide.md#ec-extension-guide-update-version-plugin)
* [Update the file associated with an existing extension](ec-plugins-guide.md#ec-extension-guide-update-file)
* [Upgrade Elasticsearch](ec-plugins-guide.md#ec-extension-guide-upgrade-elasticsearch)
* [Delete an extension](ec-plugins-guide.md#ec-extension-guide-delete)

## Create an extension [ec-extension-guide-create]

There are two methods to create an extension. You can:

1. Stream the file from a publicly-accessible download URL.
2. Upload the file from a local file path.

::::{note}
For plugins larger than 200MB the download URL option **must** be used. Plugins larger than 8GB cannot be uploaded with either method.
::::

These two examples are for the `plugin` extension type. For bundles, change `extension_type` to `bundle`.

For plugins, `version` must match (exactly) the `elasticsearch.version` field defined in the plugin’s `plugin-descriptor.properties` file. Check [Help for plugin authors](/extend/index.md) for details. For plugins larger than 5GB, the `plugin-descriptor.properties` file needs to be at the top of the archive. This ensures that the verification process can detect that it is an Elasticsearch plugin; otherwise the plugin is rejected by the API. You can achieve this ordering by listing the descriptor first when creating the ZIP file: `zip -r name-of-plugin.zip plugin-descriptor.properties *`.
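As a sketch (the plugin archive name is a placeholder), you can create the archive with the descriptor first and then verify the entry order:

```sh
# Package the plugin with plugin-descriptor.properties as the first archive entry.
zip -r my-plugin.zip plugin-descriptor.properties *
# List the archive contents; the descriptor should appear first.
unzip -l my-plugin.zip
```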

For bundles, we recommend setting `version` using wildcard notation that matches the major version of the Elasticsearch deployment. For example, if Elasticsearch is on version 8.4.3, simply set `8.*` as the version. The value `8.*` means that the bundle is compatible with all 8.x versions of Elasticsearch.

$$$ec-extension-guide-create-option1$$$
**Option 1: Stream the file from a publicly-accessible download URL**

```sh
curl -X POST \
    https://api.elastic-cloud.com/api/v1/deployments/extensions \
    -H "Authorization: ApiKey $CLOUD_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
        "download_url" : "https://my_site/custom-plugin-8.4.3.zip",
        "extension_type" : "plugin",
        "name" : "custom-plugin",
        "version" : "8.4.3"
    }'
```

The single POST request creates an extension with the metadata, validates, and streams the file from the `download_url` specified. The accepted protocols for `download_url` are `http` and `https`.

::::{note}
The `download_url` must be directly and publicly accessible. Redirects and authentication are currently not supported, unless the URL itself contains the security credentials or tokens expected by your HTTP service. Otherwise, use Option 2 below to upload the file from a local path.
::::

::::{note}
When the file is larger than 5GB, the request may time out after 2-5 minutes, but streaming will continue on the server. Check the Extensions page in the Cloud UI after 5-10 minutes to make sure that the plugin has been created. A successfully created plugin will contain correct name, type, version, size, and last modified information.
::::

$$$ec-extension-guide-create-option2$$$
**Option 2: Upload the file from a local file path**

This option requires a two-step process. First, create the metadata for the extension:

```sh
curl -X POST \
    https://api.elastic-cloud.com/api/v1/deployments/extensions \
    -H "Authorization: ApiKey $CLOUD_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
        "extension_type": "plugin",
        "name": "custom-plugin",
        "version" : "8.4.3"
    }'
```

```sh
{
    "url": "repo://4226448541",
    "version": "8.4.3",
    "extension_type": "plugin",
    "id": "4226448541",
    "name": "custom-plugin"
}
```

The response returns a `url` you can reference later in the plan (the numeric value in the `url` is the `EXTENSION_ID`). Use this `EXTENSION_ID` in the following PUT call:

```sh
curl -v -X PUT "https://api.elastic-cloud.com/api/v1/deployments/extensions/EXTENSION_ID" \
    -H 'Content-type:application/zip' \
    -H "Authorization: ApiKey $CLOUD_API_KEY" \
    -H 'Expect:' \
    -T "/path_to/custom-plugin-8.4.3.zip"
```

::::{note}
When using curl, always use the `-T` option. DO NOT use `-F` (we have seen inconsistent curl behavior across systems; using `-F` can result in partially uploaded or truncated files).
::::

The above PUT request uploads the file from the local path specified. This request is synchronous. An HTTP 200 response indicates that the file has been successfully uploaded and is ready for use.

```sh
{
    "url": "repo://2286113333",
    "version": "8.4.3",
    "extension_type": "plugin",
    "id": "2286113333",
    "name": "custom-plugin"
}
```

## Add an extension to a deployment plan [ec-extension-guide-add-plan]

Once the extension is created and uploaded, you can add the extension using its `EXTENSION_ID` in an [update deployment API call](https://www.elastic.co/docs/api/doc/cloud/operation/operation-update-deployment).

The following is an example of a GCP plan. Your specific deployment plan will be different. The important parts related to extensions are in the `user_plugins` object.

```sh
{
  "name": "Extensions",
  "prune_orphans": false,
  "resources": {
    "elasticsearch": [
      {
        "region": "gcp-us-central1",
        "ref_id": "main-elasticsearch",
        "plan": {
          "cluster_topology": [
            ...
          ],
          "elasticsearch": {
            "version": "8.4.3",
            "enabled_built_in_plugins": [],
            "user_plugins": [
              {
                "name": "custom-plugin",
                "url": "repo://2286113333",
                "elasticsearch_version": "8.4.3"
              }
            ]
          },
          "deployment_template": {
            "id": "gcp-storage-optimized-v3"
          }
        }
      }
    ]
  }
}
```

You can use the [cat plugins API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-plugins) to confirm that the plugin has been deployed successfully to Elasticsearch.
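For example, here is a sketch of submitting the plan change and then verifying the plugin. The deployment ID, plan file name, {{es}} endpoint, and credentials are assumptions:

```sh
# Submit the updated plan through the update deployment API.
curl -X PUT "https://api.elastic-cloud.com/api/v1/deployments/DEPLOYMENT_ID" \
    -H "Authorization: ApiKey $CLOUD_API_KEY" \
    -H 'Content-Type: application/json' \
    -d @plan-with-extension.json

# Confirm the plugin is running on each node.
curl -u elastic:$ES_PASSWORD "https://ES_ENDPOINT:9243/_cat/plugins?v"
```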

The previous example is for a plugin. For bundles, use the `user_bundles` construct instead.

```sh
"user_bundles": [
    {
        "elasticsearch_version": "8.*",
        "name": "custom-bundle",
        "url": "repo://5886113212"
    }
]
```

## Get an extension [ec-extension-guide-get-extension]

You can use the GET call to retrieve information about an extension.

To list all extensions for the account:

```sh
curl -X GET \
    https://api.elastic-cloud.com/api/v1/deployments/extensions \
    -H 'Content-Type: application/json' \
    -H "Authorization: ApiKey $CLOUD_API_KEY"
```

To get a specific extension:

```sh
curl -X GET \
    https://api.elastic-cloud.com/api/v1/deployments/extensions/EXTENSION_ID \
    -H 'Content-Type: application/json' \
    -H "Authorization: ApiKey $CLOUD_API_KEY"
```

The previous GET calls support an optional `include_deployments` parameter. When set to `true`, the call also returns the deployments that currently have the extension in use:

```sh
curl -X GET \
    "https://api.elastic-cloud.com/api/v1/deployments/extensions/EXTENSION_ID?include_deployments=true" \
    -H 'Content-Type: application/json' \
    -H "Authorization: ApiKey $CLOUD_API_KEY"
```

For example, the previous call returns:

```sh
{
    "name": "custom-plugin",
    "url": "repo://2286113333",
    "extension_type": "plugin",
    "deployments": [
        "f91f3a9360a74e9d8c068cd2698c92ea"
    ],
    "version": "8.4.3",
    "id": "2286113333"
}
```

## Update the name of an existing extension [ec-extension-guide-update-name]

To update the name of an existing extension, simply update the name field without uploading a new file. You do not have to specify the `download_url` when making only metadata changes to an extension.

Example using the [Option 1](ec-plugins-guide.md#ec-extension-guide-create-option1) extension creation method:

```sh
curl -X POST \
    https://api.elastic-cloud.com/api/v1/deployments/extensions/EXTENSION_ID \
    -H "Authorization: ApiKey $CLOUD_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
        "extension_type" : "plugin",
        "name": "custom-plugin-07012020",
        "version" : "8.4.3"
    }'
```

Example using the [Option 2](ec-plugins-guide.md#ec-extension-guide-create-option2) extension creation method:

```sh
curl -X POST \
    https://api.elastic-cloud.com/api/v1/deployments/extensions/EXTENSION_ID \
    -H "Authorization: ApiKey $CLOUD_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
        "extension_type" : "plugin",
        "name": "custom-plugin-07012020",
        "version" : "8.4.3"
    }'
```

Updating the name of an existing extension does not change its `EXTENSION_ID`.

## Update the type of an existing extension [ec-extension-guide-update-type]

Updating `extension_type` has no effect. You cannot change the extension’s type (`plugin` versus `bundle`) after the extension has been created.

## Update the version of an existing bundle [ec-extension-guide-update-version-bundle]

For bundles, we recommend setting `version` using wildcard notation that matches the major version of the Elasticsearch deployment. For example, if Elasticsearch is on version 8.4.3, simply set `8.*` as the version. The value `8.*` means that the bundle is compatible with all 8.x versions of Elasticsearch.

For example, if the bundle was previously uploaded with the version `8.4.2`, simply update the version field. You no longer have to specify the `download_url` when making only metadata changes to a bundle.

Example using the [Option 1](ec-plugins-guide.md#ec-extension-guide-create-option1) extension creation method:

```sh
curl -X POST \
    https://api.elastic-cloud.com/api/v1/deployments/extensions/EXTENSION_ID \
    -H "Authorization: ApiKey $CLOUD_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
        "extension_type" : "bundle",
        "name": "custom-bundle",
        "version" : "8.*"
    }'
```

Example using the [Option 2](ec-plugins-guide.md#ec-extension-guide-create-option2) extension creation method:

```sh
curl -X POST \
    https://api.elastic-cloud.com/api/v1/deployments/extensions/EXTENSION_ID \
    -H "Authorization: ApiKey $CLOUD_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
        "extension_type" : "bundle",
        "name": "custom-bundle",
        "version" : "8.*"
    }'
```

Updating the version of an existing bundle does not change its `EXTENSION_ID`.

## Update the version of an existing plugin [ec-extension-guide-update-version-plugin]

For plugins, `version` must match (exactly) the `elasticsearch.version` field defined in the plugin’s `plugin-descriptor.properties` file. Check [Help for plugin authors](/extend/index.md) for details. If you change the version, the associated plugin file *must* also be updated accordingly.

## Update the file associated with an existing extension [ec-extension-guide-update-file]

You may want to update an uploaded file for an existing extension without performing an Elasticsearch upgrade. If you are updating the extension to prepare for an Elasticsearch upgrade, check the [Upgrade Elasticsearch](ec-plugins-guide.md#ec-extension-guide-upgrade-elasticsearch) scenario later on this page.

This example is for the `plugin` extension type. For bundles, change `extension_type` to `bundle`.

If you used [Option 1](ec-plugins-guide.md#ec-extension-guide-create-option1) to create the extension, simply re-run the POST request with the `download_url` pointing to the location of your updated extension file.

```sh
curl -X POST \
    https://api.elastic-cloud.com/api/v1/deployments/extensions/EXTENSION_ID \
    -H "Authorization: ApiKey $CLOUD_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
        "download_url" : "https://my_site/custom-plugin-8.4.3-10212022.zip",
        "extension_type" : "plugin",
        "name": "custom-plugin-10212022",
        "version" : "8.4.3"
    }'
```

If you used [Option 2](ec-plugins-guide.md#ec-extension-guide-create-option2) to create the extension, simply re-run the PUT request with the `file` parameter pointing to the location of your updated extension file.

```sh
curl -v -X PUT "https://api.elastic-cloud.com/api/v1/deployments/extensions/EXTENSION_ID" \
    -H 'Content-type:application/zip' \
    -H "Authorization: ApiKey $CLOUD_API_KEY" \
    -H 'Expect:' \
    -T "/path_to/custom-plugin-8.4.3-10212022.zip"
```

::::{important}
If you are not making any other plan changes and are simply updating an extension file, you need to issue a no-op plan so that Elasticsearch will make use of the new file. A *no-op* (no operation) plan triggers a rolling restart on the deployment, applying the same (unchanged) plan as the current plan (see the sketch after this note).
::::
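A minimal sketch of a no-op plan, in which the deployment ID and payload file name are assumptions: fetch the current deployment, shape its current plan into an update payload, and resubmit it unchanged.

```sh
# Fetch the current deployment definition.
curl -s -H "Authorization: ApiKey $CLOUD_API_KEY" \
    "https://api.elastic-cloud.com/api/v1/deployments/DEPLOYMENT_ID" > deployment.json

# After shaping deployment.json into a valid update request with the same plan,
# resubmit it to trigger a rolling restart with an unchanged configuration.
curl -X PUT "https://api.elastic-cloud.com/api/v1/deployments/DEPLOYMENT_ID" \
    -H "Authorization: ApiKey $CLOUD_API_KEY" \
    -H 'Content-Type: application/json' \
    -d @update-payload.json
```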

Updating the file of an existing extension or bundle does not change its `EXTENSION_ID`.

## Upgrade Elasticsearch [ec-extension-guide-upgrade-elasticsearch]

When you upgrade Elasticsearch in a deployment, you must ensure that:

* Bundles are on versions that are compatible with the Elasticsearch version that you are upgrading to.
* Plugins match (exactly) the Elasticsearch upgrade version.

**To prepare an existing bundle and update the plan:**

1. **Update the bundle version to be compatible with the Elasticsearch upgrade version.**

    Bundles using wildcard notation for versions (for example, `7.*`, `8.*`) in their extension metadata are compatible with all minor versions of the same Elasticsearch major version. In other words, if you are performing a patch (for example, from `8.4.2` to `8.4.3`) or a minor (for example, `8.3.0` to `8.4.3`) version upgrade of Elasticsearch and you are already using `8.*` as the `version` for the extension, you are ready for the Elasticsearch upgrade and can proceed to Step 2.

    However, if you are using a specific `version` for bundles, or upgrading to a major version, you must update the metadata of the extension to specify the matching Elasticsearch `version` that you are upgrading to, or use the wildcard syntax described in the previous paragraph. For example, if you are upgrading from version 7.x to 8.x, set `version` to `8.*` before the upgrade. Refer to [Update the version of an existing bundle](ec-plugins-guide.md#ec-extension-guide-update-version-bundle).

2. **Update the bundle reference as part of an upgrade plan.**

    Submit a plan change that performs the following operations in a *single* [update deployment API](https://www.elastic.co/docs/api/doc/cloud/operation/operation-update-deployment) call:

    * Upgrade the version of Elasticsearch to the upgrade version (for example, `8.4.3`).
    * Update the reference to the existing bundle to be compatible with the Elasticsearch upgrade version (for example, `8.*`).

    This triggers a rolling upgrade plan change to the later Elasticsearch version and updates the reference to the bundle at the same time.

    The following example shows the upgrade of an Elasticsearch deployment and its bundle. You can also upgrade other deployment resources within the same plan change.

    Update `resources.elasticsearch.plan.elasticsearch.version` and `resources.elasticsearch.plan.cluster_topology.elasticsearch.user_bundles.elasticsearch_version` accordingly.

    ```sh
    {
      "name": "Extensions",
      "prune_orphans": false,
      "resources": {
        "elasticsearch": [
          {
            "region": "gcp-us-central1",
            "ref_id": "main-elasticsearch",
            "plan": {
              "cluster_topology": [
                ...
              ],
              "elasticsearch": {
                "version": "8.4.3",
                "enabled_built_in_plugins": [],
                "user_bundles": [
                  {
                    "elasticsearch_version": "8.*",
                    "name": "custom-bundle",
                    "url": "repo://5886113212"
                  }
                ]
              },
              "deployment_template": {
                "id": "gcp-storage-optimized-v3"
              }
            }
          }
        ]
      }
    }
    ```

**To create a new plugin and update the plan:**

Unlike bundles, plugins *must* match the Elasticsearch version down to the patch level (for example, `8.4.3`). When upgrading Elasticsearch to a new patch, minor, or major version, update the version in the extension metadata and update the extension file. The following example updates an existing plugin and upgrades the Elasticsearch deployment from version 8.3.0 to 8.4.3.

1. **Create a new plugin that matches the Elasticsearch upgrade version.**

    Follow the steps in [Create an extension](ec-plugins-guide.md#ec-extension-guide-create) to create a new extension with a `version` metadata field and the plugin’s `elasticsearch.version` field in `plugin-descriptor.properties` that matches the Elasticsearch upgrade version (for example, `8.4.3`).

2. **Remove the old plugin and add the new plugin to the upgrade plan.**

    Submit a plan change that performs the following operations in a *single* [update deployment API](https://www.elastic.co/docs/api/doc/cloud/operation/operation-update-deployment) call:

    * Upgrade the version of Elasticsearch to the upgrade version (for example, `8.4.3`).
    * Remove the reference to the plugin on the older version (for example, `8.3.0`) from the plan.
    * Add a reference to the new plugin on the upgrade version (for example, `8.4.3`) to the plan.

    This triggers a rolling upgrade plan change to the later Elasticsearch version, removes the reference to the older plugin, and deploys your updated plugin at the same time.

    The following example shows the upgrade of an Elasticsearch deployment and its plugin. You can also upgrade other deployment resources within the same plan change.

    In the deployment plan, update `resources.elasticsearch.plan.elasticsearch.version` and `resources.elasticsearch.plan.cluster_topology.elasticsearch.user_plugins.elasticsearch_version` accordingly.

    ```sh
    {
      "name": "Extensions",
      "prune_orphans": false,
      "resources": {
        "elasticsearch": [
          {
            "region": "gcp-us-central1",
            "ref_id": "main-elasticsearch",
            "plan": {
              "cluster_topology": [
                ...
              ],
              "elasticsearch": {
                "version": "8.4.3",
                "enabled_built_in_plugins": [],
                "user_plugins": [
                  {
                    "elasticsearch_version": "8.4.3",
                    "name": "custom-plugin",
                    "url": "repo://4226448541"
                  }
                ]
              },
              "deployment_template": {
                "id": "gcp-storage-optimized-v3"
              }
            }
          }
        ]
      }
    }
    ```

You can use the [cat plugins API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-plugins) to confirm that the plugin has been upgraded successfully in Elasticsearch.

## Delete an extension [ec-extension-guide-delete]

You can delete an extension by issuing a DELETE request against the `EXTENSION_ID` of interest:

```sh
curl -X DELETE \
    https://api.elastic-cloud.com/api/v1/deployments/extensions/EXTENSION_ID \
    -H "Authorization: ApiKey $CLOUD_API_KEY" \
    -H 'Content-Type: application/json'
```

Only extensions not currently referenced in a deployment plan can be deleted. If you attempt to delete an extension that is in use, you will receive an HTTP 400 Bad Request error like the following, indicating the deployments that are currently using the extension.

```sh
{
    "errors": [
        {
            "message": "Cannot delete extension [EXTENSION_ID]. It is used by deployments [DEPLOYMENT_NAME].",
            "code": "extensions.extension_in_use"
        }
    ]
}
```

To remove an extension reference from a deployment plan, simply update the deployment with the extension reference deleted from the `user_plugins` or `user_bundles` arrays. Check [Add an extension to a deployment plan](ec-plugins-guide.md#ec-extension-guide-add-plan) for where these are specified in the plan.

@@ -0,0 +1,220 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-azure-classic-long.html
---

# Setup process for Azure Discovery [discovery-azure-classic-long]

Here we describe one strategy, which is to hide the Elasticsearch cluster from the outside.

With this strategy, only VMs behind the same virtual port can talk to each other. That means that with this mode, you can use Elasticsearch unicast discovery to build a cluster, using the Azure API to retrieve information about your nodes.

## Prerequisites [discovery-azure-classic-long-prerequisites]

Before starting, you need to have:

* A [Windows Azure account](https://azure.microsoft.com/en-us/)
* OpenSSL that isn’t from MacPorts; specifically, `OpenSSL 1.0.1f 6 Jan 2014` doesn’t seem to create a valid keypair for ssh. For what it’s worth, `OpenSSL 1.0.1c 10 May 2012` on Ubuntu 14.04 LTS is known to work.
* SSH keys and certificate

You should follow [this guide](http://azure.microsoft.com/en-us/documentation/articles/linux-use-ssh-key/) to learn how to create or use existing SSH keys. If you have already done so, you can skip the following.

Here is how to generate SSH keys using `openssl`:

```sh
# You may want to use another dir than /tmp
cd /tmp
openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout azure-private.key -out azure-certificate.pem
chmod 600 azure-private.key azure-certificate.pem
openssl x509 -outform der -in azure-certificate.pem -out azure-certificate.cer
```

Generate a keystore that the plugin will use to authenticate all Azure API calls with a certificate.

```sh
# Generate a keystore (azurekeystore.pkcs12)
# Transform private key to PEM format
openssl pkcs8 -topk8 -nocrypt -in azure-private.key -inform PEM -out azure-pk.pem -outform PEM
# Transform certificate to PEM format
openssl x509 -inform der -in azure-certificate.cer -out azure-cert.pem
cat azure-cert.pem azure-pk.pem > azure.pem.txt
# You MUST enter a password!
openssl pkcs12 -export -in azure.pem.txt -out azurekeystore.pkcs12 -name azure -noiter -nomaciter
```

Upload the `azure-certificate.cer` file both in the Elasticsearch Cloud Service (under `Manage Certificates`) and under `Settings -> Manage Certificates`.

::::{important}
When prompted for a password, you need to enter a non-empty one.
::::

See this [guide](http://www.windowsazure.com/en-us/manage/linux/how-to-guides/ssh-into-linux/) for more details about how to create keys for Azure.

Once done, you need to upload your certificate to Azure:

* Go to the [management console](https://account.windowsazure.com/).
* Sign in using your account.
* Click on `Portal`.
* Go to Settings (bottom of the left list).
* On the bottom bar, click on `Upload` and upload your `azure-certificate.cer` file.

You may want to use the [Windows Azure Command-Line Tool](http://www.windowsazure.com/en-us/develop/nodejs/how-to-guides/command-line-tools/):

* Install [NodeJS](https://github.com/joyent/node/wiki/Installing-Node.js-via-package-manager), for example using homebrew on MacOS X:

    ```sh
    brew install node
    ```

* Install the Azure tools:

    ```sh
    sudo npm install azure-cli -g
    ```

* Download and import your azure settings:

    ```sh
    # This will open a browser and will download a .publishsettings file
    azure account download

    # Import this file (we have downloaded it to /tmp)
    # Note, it will create needed files in ~/.azure. You can remove azure.publishsettings when done.
    azure account import /tmp/azure.publishsettings
    ```

## Creating your first instance [discovery-azure-classic-long-instance]

You need to have a storage account available. Check the [Azure Blob Storage documentation](http://www.windowsazure.com/en-us/develop/net/how-to-guides/blob-storage/#create-account) for more information.

You will need to choose the operating system you want to run on. To get a list of officially available images, run:

```sh
azure vm image list
```

Let’s say we are going to deploy an Ubuntu image on an extra small instance in West Europe:

Azure cluster name
: `azure-elasticsearch-cluster`

Image
: `b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-13_10-amd64-server-20130808-alpha3-en-us-30GB`

VM Name
: `myesnode1`

VM Size
: `extrasmall`

Location
: `West Europe`

Login
: `elasticsearch`

Password
: `password1234!!`

Using the command line:

```sh
azure vm create azure-elasticsearch-cluster \
    b39f27a8b8c64d52b05eac6a62ebad85__Ubuntu-13_10-amd64-server-20130808-alpha3-en-us-30GB \
    --vm-name myesnode1 \
    --location "West Europe" \
    --vm-size extrasmall \
    --ssh 22 \
    --ssh-cert /tmp/azure-certificate.pem \
    elasticsearch password1234\!\!
```

You should see something like:

```text
info:    Executing command vm create
+ Looking up image
+ Looking up cloud service
+ Creating cloud service
+ Retrieving storage accounts
+ Configuring certificate
+ Creating VM
info:    vm create command OK
```

Now, your first instance is started.

::::{admonition} Working with SSH
:class: tip

You need to give the private key and username each time you log on your instance:

```sh
ssh -i ~/.ssh/azure-private.key elasticsearch@myescluster.cloudapp.net
```

But you can also define it once in the `~/.ssh/config` file:

```text
Host *.cloudapp.net
  User elasticsearch
  StrictHostKeyChecking no
  UserKnownHostsFile=/dev/null
  IdentityFile ~/.ssh/azure-private.key
```

::::

Next, you need to install Elasticsearch on your new instance. First, copy your keystore to the instance, then connect to the instance using SSH:

```sh
scp /tmp/azurekeystore.pkcs12 azure-elasticsearch-cluster.cloudapp.net:/home/elasticsearch
ssh azure-elasticsearch-cluster.cloudapp.net
```

Once connected, [install {{es}}](docs-content://deploy-manage/deploy/self-managed/installing-elasticsearch.md).

## Install the Elasticsearch cloud azure plugin [discovery-azure-classic-long-plugin]

```sh
# Install the plugin
sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install discovery-azure-classic

# Configure it
sudo vi /etc/elasticsearch/elasticsearch.yml
```

And add the following lines:

```yaml
# If you don't remember your account id, you may get it with `azure account list`
cloud:
    azure:
        management:
            subscription.id: your_azure_subscription_id
            cloud.service.name: your_azure_cloud_service_name
            keystore:
                path: /home/elasticsearch/azurekeystore.pkcs12
                password: your_password_for_keystore

discovery:
    type: azure

# Recommended (warning: non durable disk)
# path.data: /mnt/resource/elasticsearch/data
```

Start Elasticsearch:

```sh
sudo systemctl start elasticsearch
```

If anything goes wrong, check your logs in `/var/log/elasticsearch`.

@@ -0,0 +1,63 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-azure-classic-scale.html
---

# Scaling out! [discovery-azure-classic-scale]

First, you need to create an image of your previously configured machine. Disconnect from the machine and run the following commands locally:

```sh
# Shutdown the instance
azure vm shutdown myesnode1

# Create an image from this instance (it could take some minutes)
azure vm capture myesnode1 esnode-image --delete

# Note that the previous instance has been deleted (mandatory),
# so you need to create it again; you can create other instances the same way.

azure vm create azure-elasticsearch-cluster \
    esnode-image \
    --vm-name myesnode1 \
    --location "West Europe" \
    --vm-size extrasmall \
    --ssh 22 \
    --ssh-cert /tmp/azure-certificate.pem \
    elasticsearch password1234\!\!
```

::::{tip}
Azure may change the endpoint’s public IP address, and DNS propagation can take some minutes before you can connect again using the name. If needed, you can get the IP address from Azure using:

```sh
# Look at Network `Endpoints 0 Vip`
azure vm show myesnode1
```

::::

Let’s start more instances!

```sh
for x in $(seq 2 10)
do
    echo "Launching azure instance #$x..."
    azure vm create azure-elasticsearch-cluster \
        esnode-image \
        --vm-name myesnode$x \
        --vm-size extrasmall \
        --ssh $((21 + $x)) \
        --ssh-cert /tmp/azure-certificate.pem \
        --connect \
        elasticsearch password1234\!\!
done
```

If you want to remove your running instances:

```sh
azure vm delete myesnode1
```

@@ -0,0 +1,97 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-azure-classic-usage.html
---

# Azure Virtual Machine discovery [discovery-azure-classic-usage]

Azure VM discovery allows you to use the Azure APIs to perform automatic discovery. Here is a simple sample configuration:

```yaml
cloud:
    azure:
        management:
            subscription.id: XXX-XXX-XXX-XXX
            cloud.service.name: es-demo-app
            keystore:
                path: /path/to/azurekeystore.pkcs12
                password: WHATEVER
                type: pkcs12

discovery:
    seed_providers: azure
```

::::{admonition} Binding the network host
:class: important

The keystore file must be placed in a directory accessible by Elasticsearch, like the `config` directory.

It’s important to define `network.host`, as by default it’s bound to `localhost`.

You can use [core network host settings](/reference/elasticsearch/configuration-reference/networking-settings.md). For example `_en0_`.

::::

## How to start (short story) [discovery-azure-classic-short]

* Create Azure instances
* Install Elasticsearch
* Install the Azure plugin
* Modify the `elasticsearch.yml` file
* Start Elasticsearch

## Azure credential API settings [discovery-azure-classic-settings]

The following settings can further control the credential API:

`cloud.azure.management.keystore.path`
: /path/to/keystore

`cloud.azure.management.keystore.type`
: `pkcs12`, `jceks` or `jks`. Defaults to `pkcs12`.

`cloud.azure.management.keystore.password`
: your_password for the keystore

`cloud.azure.management.subscription.id`
: your_azure_subscription_id

`cloud.azure.management.cloud.service.name`
: your_azure_cloud_service_name. This is the cloud service name/DNS but without the `cloudapp.net` part. So if the DNS name is `abc.cloudapp.net` then the `cloud.service.name` to use is just `abc`.

## Advanced settings [discovery-azure-classic-settings-advanced]

The following settings can further control the discovery:

`discovery.azure.host.type`
: Either `public_ip` or `private_ip` (default). Azure discovery will use the one you set to ping other nodes.

`discovery.azure.endpoint.name`
: When using `public_ip`, this setting is used to identify the endpoint name used to forward requests to Elasticsearch (aka transport port name). Defaults to `elasticsearch`. In the Azure management console, you could define an endpoint `elasticsearch` forwarding, for example, requests on the public IP on port 8100 to the virtual machine on port 9300.

`discovery.azure.deployment.name`
: Deployment name, if any. Defaults to the value set with `cloud.azure.management.cloud.service.name`.

`discovery.azure.deployment.slot`
: Either `staging` or `production` (default).

For example:

```yaml
discovery:
    type: azure
    azure:
        host:
            type: private_ip
        endpoint:
            name: elasticsearch
        deployment:
            name: your_azure_cloud_service_name
            slot: production
```

@@ -0,0 +1,48 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-azure-classic.html
---

# Azure Classic discovery plugin [discovery-azure-classic]

The Azure Classic Discovery plugin uses the Azure Classic API to identify the addresses of seed hosts.

::::{admonition} Deprecated in 5.0.0.
:class: warning

This plugin will be removed in a future release.
::::

## Installation [discovery-azure-classic-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install discovery-azure-classic
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/discovery-azure-classic/discovery-azure-classic-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/discovery-azure-classic/discovery-azure-classic-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/discovery-azure-classic/discovery-azure-classic-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/discovery-azure-classic/discovery-azure-classic-9.0.0-beta1.zip.asc).

## Removal [discovery-azure-classic-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove discovery-azure-classic
```

The node must be stopped before removing the plugin.

158 docs/reference/elasticsearch-plugins/discovery-ec2-usage.md (new file)
@@ -0,0 +1,158 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-ec2-usage.html
---

# Using the EC2 discovery plugin [discovery-ec2-usage]

The `discovery-ec2` plugin allows {{es}} to find the master-eligible nodes in a cluster running on AWS EC2 by querying the [AWS API](https://github.com/aws/aws-sdk-java) for the addresses of the EC2 instances running these nodes.

It is normally a good idea to restrict the discovery process to just the master-eligible nodes in the cluster. This plugin allows you to identify these nodes by certain criteria, including their tags, their membership in security groups, and their placement within availability zones. The discovery process will work correctly even if it finds master-ineligible nodes, but master elections will be more efficient if this can be avoided.

The interaction with the AWS API can be authenticated using the [instance role](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html), or else custom credentials can be supplied.

## Enabling EC2 discovery [_enabling_ec2_discovery]

To enable EC2 discovery, configure {{es}} to use the `ec2` seed hosts provider:

```yaml
discovery.seed_providers: ec2
```

## Configuring EC2 discovery [_configuring_ec2_discovery]

EC2 discovery supports a number of settings. Some settings are sensitive and must be stored in the [{{es}} keystore](docs-content://deploy-manage/security/secure-settings.md). For example, to authenticate using a particular access key and secret key, add these keys to the keystore by running the following commands:

```sh
bin/elasticsearch-keystore add discovery.ec2.access_key
bin/elasticsearch-keystore add discovery.ec2.secret_key
```

The available settings for the EC2 discovery plugin are as follows.

`discovery.ec2.access_key` ([Secure](docs-content://deploy-manage/security/secure-settings.md), [reloadable](docs-content://deploy-manage/security/secure-settings.md#reloadable-secure-settings))
: An EC2 access key. If set, you must also set `discovery.ec2.secret_key`. If unset, `discovery-ec2` will instead use the instance role. This setting is sensitive and must be stored in the {{es}} keystore.

`discovery.ec2.secret_key` ([Secure](docs-content://deploy-manage/security/secure-settings.md), [reloadable](docs-content://deploy-manage/security/secure-settings.md#reloadable-secure-settings))
: An EC2 secret key. If set, you must also set `discovery.ec2.access_key`. This setting is sensitive and must be stored in the {{es}} keystore.

`discovery.ec2.session_token` ([Secure](docs-content://deploy-manage/security/secure-settings.md), [reloadable](docs-content://deploy-manage/security/secure-settings.md#reloadable-secure-settings))
: An EC2 session token. If set, you must also set `discovery.ec2.access_key` and `discovery.ec2.secret_key`. This setting is sensitive and must be stored in the {{es}} keystore.

`discovery.ec2.endpoint`
: The EC2 service endpoint to which to connect. See [https://docs.aws.amazon.com/general/latest/gr/rande.html#ec2_region](https://docs.aws.amazon.com/general/latest/gr/rande.html#ec2_region) to find the appropriate endpoint for the region. This setting defaults to `ec2.us-east-1.amazonaws.com`, which is appropriate for clusters running in the `us-east-1` region.

`discovery.ec2.protocol`
: The protocol to use to connect to the EC2 service endpoint, which may be either `http` or `https`. Defaults to `https`.

`discovery.ec2.proxy.host`
: The address or host name of an HTTP proxy through which to connect to EC2. If not set, no proxy is used.

`discovery.ec2.proxy.port`
: When the address of an HTTP proxy is given in `discovery.ec2.proxy.host`, this setting determines the port to use to connect to the proxy. Defaults to `80`.

`discovery.ec2.proxy.scheme`
: The scheme to use when connecting to the EC2 service endpoint through the proxy specified in `discovery.ec2.proxy.host`. Valid values are `http` or `https`. Defaults to `http`.

`discovery.ec2.proxy.username` ([Secure](docs-content://deploy-manage/security/secure-settings.md), [reloadable](docs-content://deploy-manage/security/secure-settings.md#reloadable-secure-settings))
: When the address of an HTTP proxy is given in `discovery.ec2.proxy.host`, this setting determines the username to use to connect to the proxy. When not set, no username is used. This setting is sensitive and must be stored in the {{es}} keystore.

`discovery.ec2.proxy.password` ([Secure](docs-content://deploy-manage/security/secure-settings.md), [reloadable](docs-content://deploy-manage/security/secure-settings.md#reloadable-secure-settings))
: When the address of an HTTP proxy is given in `discovery.ec2.proxy.host`, this setting determines the password to use to connect to the proxy. When not set, no password is used. This setting is sensitive and must be stored in the {{es}} keystore.

`discovery.ec2.read_timeout`
: The socket timeout for connections to EC2, [including the units](/reference/elasticsearch/rest-apis/api-conventions.md#time-units). For example, a value of `60s` specifies a 60-second timeout. Defaults to 50 seconds.

`discovery.ec2.groups`
: A list of the names or IDs of the security groups to use for discovery. The `discovery.ec2.any_group` setting determines the behaviour of this setting. Defaults to an empty list, meaning that security group membership is ignored by EC2 discovery.

`discovery.ec2.any_group`
: Defaults to `true`, meaning that instances belonging to *any* of the security groups specified in `discovery.ec2.groups` will be used for discovery. If set to `false`, only instances that belong to *all* of the security groups specified in `discovery.ec2.groups` will be used for discovery.

`discovery.ec2.host_type`
: Each EC2 instance has a number of different addresses that might be suitable for discovery. This setting allows you to select which of these addresses is used by the discovery process. It can be set to one of `private_ip`, `public_ip`, `private_dns`, `public_dns` or `tag:TAGNAME` where `TAGNAME` refers to the name of a tag. This setting defaults to `private_ip`.

    If you set `discovery.ec2.host_type` to a value of the form `tag:TAGNAME` then the value of the tag `TAGNAME` attached to each instance will be used as that instance’s address for discovery. Instances which do not have this tag set will be ignored by the discovery process.

    For example, if you tag some EC2 instances with a tag named `elasticsearch-host-name` and set `host_type: tag:elasticsearch-host-name`, then the `discovery-ec2` plugin will read each instance’s host name from the value of the `elasticsearch-host-name` tag. [Read more about EC2 Tags](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_Tags.html).

`discovery.ec2.availability_zones`
: A list of the names of the availability zones to use for discovery. The name of an availability zone is the [region code followed by a letter](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html), such as `us-east-1a`. Only instances placed in one of the given availability zones will be used for discovery.

$$$discovery-ec2-filtering$$$

`discovery.ec2.tag.TAGNAME`
: A list of the values of a tag called `TAGNAME` to use for discovery. If set, only instances that are tagged with one of the given values will be used for discovery. For instance, the following settings will only use nodes with a `role` tag set to `master` and an `environment` tag set to either `dev` or `staging`.

    ```yaml
    discovery.ec2.tag.role: master
    discovery.ec2.tag.environment: dev,staging
    ```

::::{note}
The names of tags used for discovery may only contain ASCII letters, numbers, hyphens and underscores. In particular you cannot use tags whose name includes a colon.
::::

`discovery.ec2.node_cache_time`
: Sets the length of time for which the collection of discovered instances is cached. {{es}} waits at least this long between requests for discovery information from the EC2 API. AWS may reject discovery requests if they are made too often, and this would cause discovery to fail. Defaults to `10s`.

All **secure** settings of this plugin are [reloadable](docs-content://deploy-manage/security/secure-settings.md#reloadable-secure-settings), allowing you to update the secure settings for this plugin without needing to restart each node.

## Recommended EC2 permissions [discovery-ec2-permissions]

The `discovery-ec2` plugin works by making a `DescribeInstances` call to the AWS EC2 API. You must configure your AWS account to allow this, which is normally done using an IAM policy. You can create a custom policy via the IAM Management Console. It should look similar to this.

```js
{
  "Statement": [
    {
      "Action": [
        "ec2:DescribeInstances"
      ],
      "Effect": "Allow",
      "Resource": [
        "*"
      ]
    }
  ],
  "Version": "2012-10-17"
}
```
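As a sketch (the policy name and local file name are assumptions), you could also create this policy with the AWS CLI:

```sh
# Create the IAM policy from the JSON document above, saved locally.
aws iam create-policy \
    --policy-name es-discovery-ec2 \
    --policy-document file://describe-instances-policy.json
```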

## Automatic node attributes [discovery-ec2-attributes]

The `discovery-ec2` plugin can automatically set the `aws_availability_zone` node attribute to the availability zone of each node. This node attribute allows you to ensure that each shard has copies allocated redundantly across multiple availability zones by using the [Allocation Awareness](docs-content://deploy-manage/distributed-architecture/shard-allocation-relocation-recovery/shard-allocation-awareness.md) feature.

In order to enable the automatic definition of the `aws_availability_zone` attribute, set `cloud.node.auto_attributes` to `true`. For example:

```yaml
cloud.node.auto_attributes: true
cluster.routing.allocation.awareness.attributes: aws_availability_zone
```
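Once the nodes are running, a quick way to confirm the attribute is the cat nodeattrs API. The endpoint and credentials here are assumptions:

```sh
# Each node should report an aws_availability_zone attribute.
curl -u elastic:$ES_PASSWORD "https://ES_ENDPOINT:9200/_cat/nodeattrs?v"
```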

The `aws_availability_zone` attribute can be automatically set like this when using any discovery type. It is not necessary to set `discovery.seed_providers: ec2`. However, this feature does require that the `discovery-ec2` plugin is installed.

## Binding to the correct address [discovery-ec2-network-host]

It is important to define `network.host` correctly when deploying a cluster on EC2. By default each {{es}} node only binds to `localhost`, which will prevent it from being discovered by nodes running on any other instances.

You can use the [core network host settings](/reference/elasticsearch/configuration-reference/networking-settings.md) to bind each node to the desired address, or you can set `network.host` to one of the following EC2-specific settings provided by the `discovery-ec2` plugin:

| EC2 Host Value | Description |
| --- | --- |
| `_ec2:privateIpv4_` | The private IP address (ipv4) of the machine. |
| `_ec2:privateDns_` | The private host of the machine. |
| `_ec2:publicIpv4_` | The public IP address (ipv4) of the machine. |
| `_ec2:publicDns_` | The public host of the machine. |
| `_ec2:privateIp_` | Equivalent to `_ec2:privateIpv4_`. |
| `_ec2:publicIp_` | Equivalent to `_ec2:publicIpv4_`. |
| `_ec2_` | Equivalent to `_ec2:privateIpv4_`. |

These values are acceptable when using any discovery type. They do not require you to set `discovery.seed_providers: ec2`. However, they do require that the `discovery-ec2` plugin is installed.

42 docs/reference/elasticsearch-plugins/discovery-ec2.md (new file)
@@ -0,0 +1,42 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-ec2.html
---

# EC2 Discovery plugin [discovery-ec2]

The EC2 discovery plugin provides a list of seed addresses to the [discovery process](docs-content://deploy-manage/distributed-architecture/discovery-cluster-formation/discovery-hosts-providers.md) by querying the [AWS API](https://github.com/aws/aws-sdk-java) for a list of EC2 instances matching certain criteria determined by the [plugin settings](/reference/elasticsearch-plugins/discovery-ec2-usage.md).

**If you are looking for a hosted solution of {{es}} on AWS, please visit [https://www.elastic.co/cloud](https://www.elastic.co/cloud).**

## Installation [discovery-ec2-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install discovery-ec2
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/discovery-ec2/discovery-ec2-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/discovery-ec2/discovery-ec2-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/discovery-ec2/discovery-ec2-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/discovery-ec2/discovery-ec2-9.0.0-beta1.zip.asc).

## Removal [discovery-ec2-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove discovery-ec2
```

The node must be stopped before removing the plugin.
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,35 @@
|
|||
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-gce-network-host.html
---

# GCE Network Host [discovery-gce-network-host]

When the `discovery-gce` plugin is installed, the following are also allowed as valid network host settings:

| GCE Host Value | Description |
| --- | --- |
| `_gce:privateIp:X_` | The private IP address of the machine for a given network interface. |
| `_gce:hostname_` | The hostname of the machine. |
| `_gce_` | Same as `_gce:privateIp:0_` (recommended). |

Examples:

```yaml
# get the IP address from network interface 1
network.host: _gce:privateIp:1_
# Using GCE internal hostname
network.host: _gce:hostname_
# shortcut for _gce:privateIp:0_ (recommended)
network.host: _gce_
```


## How to start (short story) [discovery-gce-usage-short]

* Create a Google Compute Engine instance (with `compute-rw` permissions)
* Install Elasticsearch
* Install the Google Compute Engine Cloud plugin
* Modify the `elasticsearch.yml` file
* Start Elasticsearch
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-gce-usage-cloning.html
---

# Cloning your existing machine [discovery-gce-usage-cloning]

In order to build a cluster on many nodes, you can clone your configured instance to new nodes. You won't have to reinstall everything!

First create an image of your running instance and upload it to Google Cloud Storage:

```sh
# Create an image of your current instance
sudo /usr/bin/gcimagebundle -d /dev/sda -o /tmp/

# An image has been created in the `/tmp` directory:
ls /tmp
e4686d7f5bf904a924ae0cfeb58d0827c6d5b966.image.tar.gz

# Upload your image to Google Cloud Storage:
# Create a bucket to hold your image, let's say `esimage`:
gsutil mb gs://esimage

# Copy your image to this bucket:
gsutil cp /tmp/e4686d7f5bf904a924ae0cfeb58d0827c6d5b966.image.tar.gz gs://esimage

# Then add your image to the images collection:
gcloud compute images create elasticsearch-2-0-0 --source-uri gs://esimage/e4686d7f5bf904a924ae0cfeb58d0827c6d5b966.image.tar.gz

# If the previous command did not work for you, logout from your instance
# and launch the same command from your local machine.
```


## Start new instances [discovery-gce-usage-start-new-instances]

Now that you have an image, you can create as many instances as you need:

```sh
# Just change the node name (here myesnode2)
gcloud compute instances create myesnode2 --image elasticsearch-2-0-0 --zone europe-west1-a

# If you want to provide all details directly, you can use:
gcloud compute instances create myesnode2 --image=elasticsearch-2-0-0 \
       --zone europe-west1-a --machine-type f1-micro --scopes=compute-rw
```


## Remove an instance (aka shut it down) [discovery-gce-usage-remove-instance]

You can use the [Google Cloud Console](https://cloud.google.com/console) or the CLI to manage your instances:

```sh
# Stopping and removing instances
gcloud compute instances delete myesnode1 myesnode2 \
       --zone=europe-west1-a

# Consider removing the disks as well if you don't need them anymore
gcloud compute disks delete boot-myesnode1 boot-myesnode2 \
       --zone=europe-west1-a
```
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-gce-usage-long.html
---

# Setting up GCE Discovery [discovery-gce-usage-long]

## Prerequisites [discovery-gce-usage-long-prerequisites]

Before starting, you need:

* Your project ID, e.g. `es-cloud`. Get it from the [Google API Console](https://code.google.com/apis/console/).
* To install the [Google Cloud SDK](https://developers.google.com/cloud/sdk/)

If you have not set it yet, you can define the default project you will work on:

```sh
gcloud config set project es-cloud
```


## Login to Google Cloud [discovery-gce-usage-long-login]

If you haven't already, log in to Google Cloud:

```sh
gcloud auth login
```

This will open your browser. You will be asked to sign in to a Google account and authorize access to the Google Cloud SDK.


## Creating your first instance [discovery-gce-usage-long-first-instance]

```sh
gcloud compute instances create myesnode1 \
       --zone <your-zone> \
       --scopes compute-rw
```

When done, a report like this one should appear:

```text
Created [https://www.googleapis.com/compute/v1/projects/es-cloud-1070/zones/us-central1-f/instances/myesnode1].
NAME      ZONE          MACHINE_TYPE  PREEMPTIBLE INTERNAL_IP   EXTERNAL_IP   STATUS
myesnode1 us-central1-f n1-standard-1             10.240.133.54 104.197.94.25 RUNNING
```

You can now connect to your instance:

```sh
# Connect using the Google Cloud SDK
gcloud compute ssh myesnode1 --zone europe-west1-a

# Or using SSH with the external IP address
ssh -i ~/.ssh/google_compute_engine 192.158.29.199
```

::::{admonition} Service Account Permissions
:class: important

It's important when creating an instance that the correct permissions are set. At a minimum, you must ensure you have:

```text
scopes=compute-rw
```

Failing to set this will result in unauthorized messages when starting Elasticsearch. See [Machine Permissions](/reference/elasticsearch-plugins/discovery-gce-usage-tips.md#discovery-gce-usage-tips-permissions).

::::

Once connected, [install {{es}}](docs-content://deploy-manage/deploy/self-managed/installing-elasticsearch.md).


## Install Elasticsearch discovery gce plugin [discovery-gce-usage-long-install-plugin]

Install the plugin:

```sh
# Use the Plugin Manager to install it
sudo bin/elasticsearch-plugin install discovery-gce
```

Open the `elasticsearch.yml` file:

```sh
sudo vi /etc/elasticsearch/elasticsearch.yml
```

And add the following lines:

```yaml
cloud:
  gce:
    project_id: es-cloud
    zone: europe-west1-a
discovery:
  seed_providers: gce
```

Start Elasticsearch:

```sh
sudo systemctl start elasticsearch
```

If anything goes wrong, check the logs:

```sh
tail -f /var/log/elasticsearch/elasticsearch.log
```

If needed, you can change the log level to `trace` by opening `log4j2.properties`:

```sh
sudo vi /etc/elasticsearch/log4j2.properties
```

and adding the following lines:

```yaml
# discovery
logger.discovery_gce.name = discovery.gce
logger.discovery_gce.level = trace
```
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-gce-usage-port.html
---

# Changing default transport port [discovery-gce-usage-port]

By default, the Elasticsearch GCE plugin assumes that Elasticsearch runs on the default transport port, 9300. You can specify a different port through the Google Compute Engine metadata key `es_port`:

## When creating an instance [discovery-gce-usage-port-create]

Add the `--metadata es_port=9301` option:

```sh
# when creating the first instance
gcloud compute instances create myesnode1 \
       --scopes=compute-rw,storage-full \
       --metadata es_port=9301

# when creating an instance from an image
gcloud compute instances create myesnode2 --image=elasticsearch-1-0-0-RC1 \
       --zone europe-west1-a --machine-type f1-micro --scopes=compute-rw \
       --metadata es_port=9301
```


## On a running instance [discovery-gce-usage-port-run]

```sh
gcloud compute instances add-metadata myesnode1 \
       --zone europe-west1-a \
       --metadata es_port=9301
```
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-gce-usage-tags.html
---

# Filtering by tags [discovery-gce-usage-tags]

GCE discovery can also filter the machines to include in the cluster based on tags, using the `discovery.gce.tags` setting. For example, setting `discovery.gce.tags` to `dev` will include only instances that have a tag set to `dev`. If several tags are set, an instance must have all of those tags set for it to be included.

One practical use for tag filtering is when a GCE cluster contains many nodes that are not master-eligible {{es}} nodes. In this case, tagging the GCE instances that *are* running the master-eligible {{es}} nodes, and then filtering by that tag, will help discovery to run more efficiently.

Add your tag when building the new instance:

```sh
gcloud compute instances create myesnode1 --project=es-cloud \
       --scopes=compute-rw \
       --tags=elasticsearch,dev
```

Then, define it in `elasticsearch.yml`:

```yaml
cloud:
  gce:
    project_id: es-cloud
    zone: europe-west1-a
discovery:
  seed_providers: gce
  gce:
    tags: elasticsearch, dev
```
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-gce-usage-testing.html
---

# Testing GCE [discovery-gce-usage-testing]

Integration tests in this plugin require a working GCE configuration and are therefore disabled by default. To enable tests, prepare a config file `elasticsearch.yml` with the following content:

```yaml
cloud:
  gce:
    project_id: es-cloud
    zone: europe-west1-a
discovery:
  seed_providers: gce
```

Replace `project_id` and `zone` with your settings.

To run the tests:

```sh
mvn -Dtests.gce=true -Dtests.config=/path/to/config/file/elasticsearch.yml clean test
```
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-gce-usage-tips.html
---

# GCE Tips [discovery-gce-usage-tips]

## Store project id locally [discovery-gce-usage-tips-projectid]

If you don't want to repeat the project id each time, you can save it in the local gcloud config:

```sh
gcloud config set project es-cloud
```


## Machine Permissions [discovery-gce-usage-tips-permissions]

If you have created a machine without the correct permissions, you will see `403 unauthorized` error messages. To change the permissions of an existing instance, first stop the instance, then edit it. Scroll down to `Access Scopes` to change the permissions. The other way to alter these permissions is to delete the instance (NOT THE DISK), then create another one with the correct permissions.

Creating machines with gcloud
: Ensure the following flags are set:

    ```text
    --scopes=compute-rw
    ```

Creating with console (web)
: When creating an instance using the web console, scroll down to **Identity and API access**.

    Select a service account with the correct permissions or choose **Compute Engine default service account** and select **Allow default access** for **Access scopes**.

Creating with knife google
: Set the service account scopes when creating the machine:

    ```sh
    knife google server create www1 \
        -m n1-standard-1 \
        -I debian-8 \
        -Z us-central1-a \
        -i ~/.ssh/id_rsa \
        -x jdoe \
        --gce-service-account-scopes https://www.googleapis.com/auth/compute
    ```

    Or, you may use the alias:

    ```sh
    --gce-service-account-scopes compute-rw
    ```
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-gce-usage-zones.html
---

# Using GCE zones [discovery-gce-usage-zones]

`cloud.gce.zone` helps to retrieve instances running in a given zone. It should be one of the [GCE supported zones](https://developers.google.com/compute/docs/zones#available).

GCE discovery supports multiple zones, although you need to be aware of the network latency between zones. To enable discovery across more than one zone, just add your list of zones to the `cloud.gce.zone` setting:

```yaml
cloud:
  gce:
    project_id: <your-google-project-id>
    zone: ["<your-zone1>", "<your-zone2>"]
discovery:
  seed_providers: gce
```
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-gce-usage.html
---

# GCE Virtual Machine discovery [discovery-gce-usage]

Google Compute Engine VM discovery allows you to use the Google APIs to perform automatic discovery of seed hosts. Here is a simple sample configuration:

```yaml
cloud:
  gce:
    project_id: <your-google-project-id>
    zone: <your-zone>
discovery:
  seed_providers: gce
```

The following gce settings (prefixed with `cloud.gce`) are supported:

`project_id`
: Your Google project id. By default the project id will be derived from the instance metadata.

    Note: Deriving the project id from system properties or environment variables (`GOOGLE_CLOUD_PROJECT` or `GCLOUD_PROJECT`) is not supported.

`zone`
: Helps to retrieve instances running in a given zone. It should be one of the [GCE supported zones](https://developers.google.com/compute/docs/zones#available). By default the zone will be derived from the instance metadata. See also [Using GCE zones](/reference/elasticsearch-plugins/discovery-gce-usage-zones.md).

`retry`
: If set to `true`, the client will use the [ExponentialBackOff](https://developers.google.com/api-client-library/java/google-http-java-client/backoff) policy to retry failed HTTP requests. Defaults to `true`.

`max_wait`
: The maximum elapsed time after the client starts to retry. If the time elapsed goes past `max_wait`, the client stops retrying. A negative value means that it will wait indefinitely. Defaults to `0s` (retry indefinitely).

`refresh_interval`
: How long the list of hosts is cached to prevent further requests to the GCE API. `0s` disables caching. A negative value will cause infinite caching. Defaults to `0s`.
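A sketch that combines these settings with illustrative values (the timeouts here are examples, not recommendations):

```yaml
cloud:
  gce:
    project_id: es-cloud
    zone: europe-west1-a
    retry: true
    max_wait: 30s
    refresh_interval: 60s
discovery:
  seed_providers: gce
```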
::::{admonition} Binding the network host
:class: important

It's important to define `network.host` as by default it's bound to `localhost`.

You can use the [core network host settings](/reference/elasticsearch/configuration-reference/networking-settings.md) or the [gce specific host settings](/reference/elasticsearch-plugins/discovery-gce-network-host.md).

::::
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery-gce.html
---

# GCE Discovery plugin [discovery-gce]

The Google Compute Engine Discovery plugin uses the GCE API to identify the addresses of seed hosts.


## Installation [discovery-gce-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install discovery-gce
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/discovery-gce/discovery-gce-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/discovery-gce/discovery-gce-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/discovery-gce/discovery-gce-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/discovery-gce/discovery-gce-9.0.0-beta1.zip.asc).


## Removal [discovery-gce-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove discovery-gce
```

The node must be stopped before removing the plugin.
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/discovery.html
---

# Discovery plugins [discovery]

Discovery plugins extend Elasticsearch by adding new seed hosts providers that can be used to extend the [cluster formation module](docs-content://deploy-manage/distributed-architecture/discovery-cluster-formation.md).


## Core discovery plugins [_core_discovery_plugins]

The core discovery plugins are:

[EC2 discovery](/reference/elasticsearch-plugins/discovery-ec2.md)
: The EC2 discovery plugin uses the [AWS API](https://github.com/aws/aws-sdk-java) to identify the addresses of seed hosts.

[Azure Classic discovery](/reference/elasticsearch-plugins/discovery-azure-classic.md)
: The Azure Classic discovery plugin uses the Azure Classic API to identify the addresses of seed hosts.

[GCE discovery](/reference/elasticsearch-plugins/discovery-gce.md)
: The Google Compute Engine discovery plugin uses the GCE API to identify the addresses of seed hosts.
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/index.html
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/intro.html
---

# Elasticsearch plugins [intro]

Plugins are a way to enhance the core Elasticsearch functionality in a custom manner. They range from adding custom mapping types and custom analyzers to native scripts, custom discovery, and more.

Plugins contain JAR files, but may also contain scripts and config files, and must be installed on every node in the cluster. After installation, each node must be restarted before the plugin becomes visible.

::::{note}
A full cluster restart is required for installing plugins that have custom cluster state metadata. It is still possible to upgrade such plugins with a rolling restart.
::::

This documentation distinguishes two categories of plugins:

Core Plugins
: This category identifies plugins that are part of the Elasticsearch project. Delivered at the same time as Elasticsearch, their version number always matches the version number of Elasticsearch itself. These plugins are maintained by the Elastic team with the appreciated help of amazing community members (for open source plugins). Issues and bug reports can be reported on the [GitHub project page](https://github.com/elastic/elasticsearch).

Community contributed
: This category identifies plugins that are external to the Elasticsearch project. They are provided by individual developers or private companies and have their own licenses as well as their own versioning system. Issues and bug reports can usually be reported on the community plugin's web site.

For advice on writing your own plugin, refer to [*Creating an {{es}} plugin*](/extend/index.md).
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/installation.html
---

# Installing plugins [installation]

The documentation for each plugin usually includes specific installation instructions for that plugin, but below we document the various available options:


## Core Elasticsearch plugins [_core_elasticsearch_plugins]

Core Elasticsearch plugins can be installed as follows:

```shell
sudo bin/elasticsearch-plugin install [plugin_name]
```

For instance, to install the core [ICU plugin](/reference/elasticsearch-plugins/analysis-icu.md), just run the following command:

```shell
sudo bin/elasticsearch-plugin install analysis-icu
```

This command will install the version of the plugin that matches your Elasticsearch version and also show a progress bar while downloading.
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/installing-multiple-plugins.html
---

# Installing multiple plugins [installing-multiple-plugins]

Multiple plugins can be installed in one invocation as follows:

```shell
sudo bin/elasticsearch-plugin install [plugin_id] [plugin_id] ... [plugin_id]
```

Each `plugin_id` can be any valid form for installing a single plugin (e.g., the name of a core plugin, or a custom URL).

For instance, to install the core [ICU plugin](/reference/elasticsearch-plugins/analysis-icu.md), run the following command:

```shell
sudo bin/elasticsearch-plugin install analysis-icu
```

This command will install the versions of the plugins that match your Elasticsearch version. The installation will be treated as a transaction, so that either all the plugins will be installed, or none of the plugins will be installed if any installation fails.
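For example, to install two core analysis plugins in a single transactional invocation (a sketch assuming both plugins are available for your {{es}} version):

```shell
sudo bin/elasticsearch-plugin install analysis-icu analysis-kuromoji
```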
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/integrations.html
---

# Integrations [integrations]

Integrations are not plugins, but are external tools or modules that make it easier to work with Elasticsearch.


## CMS integrations [cms-integrations]


### Supported by the community: [_supported_by_the_community]

* [ElasticPress](https://wordpress.org/plugins/elasticpress/): Elasticsearch WordPress plugin
* [Tiki Wiki CMS Groupware](https://doc.tiki.org/Elasticsearch): Tiki has native support for Elasticsearch. This provides faster and better search (facets, etc.), along with some Natural Language Processing features (e.g. More like this)
* [XWiki Next Generation Wiki](https://extensions.xwiki.org/xwiki/bin/view/Extension/Elastic+Search+Macro/): XWiki has an Elasticsearch and Kibana macro that allows you to run Elasticsearch queries and display the results in XWiki pages using XWiki's scripting language, as well as include Kibana widgets in XWiki pages


### Supported by Elastic: [_supported_by_elastic]

* [Logstash output to Elasticsearch](logstash://docs/reference/plugins-outputs-elasticsearch.md): The Logstash `elasticsearch` output plugin.
* [Elasticsearch input to Logstash](logstash://docs/reference/plugins-inputs-elasticsearch.md): The Logstash `elasticsearch` input plugin.
* [Elasticsearch event filtering in Logstash](logstash://docs/reference/plugins-filters-elasticsearch.md): The Logstash `elasticsearch` filter plugin.
* [Elasticsearch bulk codec](logstash://docs/reference/plugins-codecs-es_bulk.md): The Logstash `es_bulk` plugin decodes the Elasticsearch bulk format into individual events.


### Supported by the community: [_supported_by_the_community_2]

* [Ingest processor template](https://github.com/spinscale/cookiecutter-elasticsearch-ingest-processor): A template for creating new ingest processors.
* [Kafka Standalone Consumer (Indexer)](https://github.com/BigDataDevs/kafka-elasticsearch-consumer): Reads messages from Kafka in batches, processes them (as implemented), and bulk-indexes them into Elasticsearch. Flexible and scalable. More documentation is in the above GitHub repo's wiki.
* [Scrutineer](https://github.com/Aconex/scrutineer): A high performance consistency checker to compare what you've indexed with your source of truth content (e.g. a DB)
* [FS Crawler](https://github.com/dadoonet/fscrawler): The File System (FS) crawler allows you to index documents (PDF, Open Office…) from your local file system and over SSH. (by David Pilato)
* [Elasticsearch Evolution](https://github.com/senacor/elasticsearch-evolution): A library to migrate elasticsearch mappings.
* [PGSync](https://pgsync.com): A tool for syncing data from Postgres to Elasticsearch.


## Deployment [deployment]


### Supported by the community: [_supported_by_the_community_3]

* [Ansible](https://github.com/elastic/ansible-elasticsearch): Ansible playbook for Elasticsearch.
* [Puppet](https://github.com/elastic/puppet-elasticsearch): Elasticsearch puppet module.
* [Chef](https://github.com/elastic/cookbook-elasticsearch): Chef cookbook for Elasticsearch


## Framework integrations [framework-integrations]


### Supported by the community: [_supported_by_the_community_4]

* [Apache Camel Integration](https://camel.apache.org/components/2.x/elasticsearch-component.md): An Apache Camel component to integrate with Elasticsearch
* [Catmandu](https://metacpan.org/pod/Catmandu::Store::ElasticSearch): An Elasticsearch backend for the Catmandu framework.
* [FOSElasticaBundle](https://github.com/FriendsOfSymfony/FOSElasticaBundle): Symfony2 bundle wrapping Elastica.
* [Grails](https://plugins.grails.org/plugin/puneetbehl/elasticsearch): Elasticsearch Grails plugin.
* [Hibernate Search](https://hibernate.org/search/): Integration with Hibernate ORM, from the Hibernate team. Automatic synchronization of write operations, yet exposes full Elasticsearch capabilities for queries. Can return either Elasticsearch native results or re-map queries back into managed entities loaded within transactions from the reference database.
* [Spring Data Elasticsearch](https://github.com/spring-projects/spring-data-elasticsearch): Spring Data implementation for Elasticsearch
* [Spring Elasticsearch](https://github.com/dadoonet/spring-elasticsearch): Spring factory for Elasticsearch
* [Zeebe](https://zeebe.io): An Elasticsearch exporter that acts as a bridge between Zeebe and Elasticsearch
* [Apache Pulsar](https://pulsar.apache.org/docs/en/io-elasticsearch): The Elasticsearch Sink Connector is used to pull messages from Pulsar topics and persist the messages to an index.
* [Micronaut Elasticsearch Integration](https://micronaut-projects.github.io/micronaut-elasticsearch/latest/guide/index.md): Integration of Micronaut with Elasticsearch
* [Apache StreamPipes](https://streampipes.apache.org): StreamPipes is a framework that enables users to work with IoT data sources.
* [Apache MetaModel](https://metamodel.apache.org/): Provides a common interface for discovery, exploration of metadata, and querying of different types of data sources.
* [Micrometer](https://micrometer.io): Vendor-neutral application metrics facade. Think SLF4J, but for metrics.


## Hadoop integrations [hadoop-integrations]


### Supported by Elastic: [_supported_by_elastic_2]

* [es-hadoop](elasticsearch-hadoop://docs/reference/preface.md): Elasticsearch real-time search and analytics natively integrated with Hadoop. Supports Map/Reduce, Cascading, Apache Hive, Apache Pig, Apache Spark and Apache Storm.


### Supported by the community: [_supported_by_the_community_5]

* [Garmadon](https://github.com/criteo/garmadon): Garmadon is a solution for Hadoop cluster realtime introspection.


## Health and Performance Monitoring [monitoring-integrations]


### Supported by the community: [_supported_by_the_community_6]

* [SPM for Elasticsearch](https://sematext.com/spm/index.md): Performance monitoring with live charts showing cluster and node stats, integrated alerts, email reports, etc.
* [Zabbix monitoring template](https://www.zabbix.com/integrations/elasticsearch): Monitor the performance and status of your {{es}} nodes and cluster with Zabbix and receive events information.


## Other integrations [other-integrations]


### Supported by the community: [_supported_by_the_community_7]

* [Wireshark](https://www.wireshark.org/): Protocol dissection for HTTP and the transport protocol
* [ItemsAPI](https://www.itemsapi.com/): Search backend for mobile and web
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/listing-removing-updating.html
---

# Listing, removing and updating installed plugins [listing-removing-updating]


## Listing plugins [_listing_plugins]

A list of the currently loaded plugins can be retrieved with the `list` option:

```shell
sudo bin/elasticsearch-plugin list
```

Alternatively, use the [node-info API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-info) to find out which plugins are installed on each node in the cluster.
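For example, the following minimal request returns the plugin information for every node in the cluster:

```console
GET /_nodes/plugins
```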
## Removing plugins [_removing_plugins]

Plugins can be removed manually, by deleting the appropriate directory under `plugins/`, or using the plugin script:

```shell
sudo bin/elasticsearch-plugin remove [pluginname]
```

After a Java plugin has been removed, you will need to restart the node to complete the removal process.

By default, plugin configuration files (if any) are preserved on disk; this is so that configuration is not lost while upgrading a plugin. If you wish to purge the configuration files while removing a plugin, use `-p` or `--purge`. This option can also be used after a plugin is removed, to remove any lingering configuration files.
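For example, to remove a plugin and purge its configuration files in one step (a sketch using the ICU plugin as the example):

```shell
sudo bin/elasticsearch-plugin remove analysis-icu --purge
```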
## Removing multiple plugins [removing-multiple-plugins]

Multiple plugins can be removed in one invocation as follows:

```shell
sudo bin/elasticsearch-plugin remove [pluginname] [pluginname] ... [pluginname]
```


## Updating plugins [_updating_plugins]

Except for text analysis plugins that are created using the [stable plugin API](/extend/creating-stable-plugins.md), plugins are built for a specific version of {{es}}, and must be reinstalled each time {{es}} is updated.

```shell
sudo bin/elasticsearch-plugin remove [pluginname]
sudo bin/elasticsearch-plugin install [pluginname]
```
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/manage-plugins-using-configuration-file.html
---

# Manage plugins using a configuration file [manage-plugins-using-configuration-file]

::::{admonition} Docker only
:class: important

This feature is only available for [official {{es}} Docker images](https://www.docker.elastic.co/). Other {{es}} distributions will not start with a plugin configuration file.

::::

If you run {{es}} using Docker, you can manage plugins using a declarative configuration file. When {{es}} starts up, it will compare the plugins in the file with those that are currently installed, and add or remove plugins as required. {{es}} will also upgrade official plugins when you upgrade {{es}} itself.

The file is called `elasticsearch-plugins.yml`, and must be placed in the Elasticsearch configuration directory, alongside `elasticsearch.yml`. Here is an example:

```yaml
plugins:
  - id: analysis-icu
  - id: repository-azure
  - id: custom-mapper
    location: https://example.com/archive/custom-mapper-1.0.0.zip
```

This example installs the official `analysis-icu` and `repository-azure` plugins, and one unofficial plugin. Every plugin must provide an `id`. Unofficial plugins must also provide a `location`. This is typically a URL, but Maven coordinates are also supported. The downloaded plugin's name must match the ID in the configuration file.
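For instance, a `location` given as Maven coordinates might look like the following sketch (the `group:artifact:version` values here are hypothetical):

```yaml
plugins:
  - id: custom-mapper
    location: org.example:custom-mapper:1.0.0
```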
While {{es}} will respect the [standard Java proxy system properties](https://docs.oracle.com/javase/8/docs/technotes/guides/net/proxies.html) when downloading plugins, you can also configure an HTTP proxy to use explicitly in the configuration file. For example:

```yaml
plugins:
  - id: custom-mapper
    location: https://example.com/archive/custom-mapper-1.0.0.zip
    proxy: proxy.example.com:8443
```
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/mandatory-plugins.html
---

# Mandatory plugins [mandatory-plugins]

If you rely on some plugins, you can define mandatory plugins by adding the `plugin.mandatory` setting to the `config/elasticsearch.yml` file, for example:

```yaml
plugin.mandatory: analysis-icu,lang-js
```

For safety reasons, a node will not start if it is missing a mandatory plugin.
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-annotated-text-highlighter.html
---

# Using the annotated highlighter [mapper-annotated-text-highlighter]

The `annotated-text` plugin includes a custom highlighter designed to mark up search hits in a way which is respectful of the original markup:

```console
# Example documents
PUT my-index-000001/_doc/1
{
  "my_field": "The cat sat on the [mat](sku3578)"
}

GET my-index-000001/_search
{
  "query": {
    "query_string": {
      "query": "cats"
    }
  },
  "highlight": {
    "fields": {
      "my_field": {
        "type": "annotated", <1>
        "require_field_match": false
      }
    }
  }
}
```

1. The `annotated` highlighter type is designed for use with annotated_text fields


The annotated highlighter is based on the `unified` highlighter and supports the same settings, but does not use the `pre_tags` or `post_tags` parameters. Rather than using HTML-like markup such as `<em>cat</em>`, the annotated highlighter uses the same markdown-like syntax used for annotations and injects a key=value annotation, where `_hit_term` is the key and the matched search term is the value, e.g.

```
The [cat](_hit_term=cat) sat on the [mat](sku3578)
```

The annotated highlighter tries to be respectful of any existing markup in the original text:

* If the search term matches exactly the location of an existing annotation, then the `_hit_term` key is merged into the url-like syntax used in the `(...)` part of the existing annotation.
* However, if the search term overlaps the span of an existing annotation, it would break the markup formatting, so the original annotation is removed in favour of a new annotation with just the search hit information in the results.
* Any non-overlapping annotations in the original text are preserved in highlighter selections.
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-annotated-text-limitations.html
---

# Limitations [mapper-annotated-text-limitations]

The annotated_text field type supports the same mapping settings as the `text` field type, but with the following exceptions:

* No support for `fielddata` or `fielddata_frequency_filter`
* No support for `index_prefixes` or `index_phrases` indexing
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-annotated-text-tips.html
---

# Data modelling tips [mapper-annotated-text-tips]

## Use structured and unstructured fields [_use_structured_and_unstructured_fields]

Annotations are normally a way of weaving structured information into unstructured text for higher-precision search.

`Entity resolution` is a form of document enrichment undertaken by specialist software or people, where references to entities in a document are disambiguated by attaching a canonical ID. The ID is used to resolve any number of aliases or distinguish between people with the same name. The hyperlinks connecting Wikipedia's articles are a good example of resolved entity IDs woven into text.

These IDs can be embedded as annotations in an annotated_text field, but it often makes sense to include them in dedicated structured fields to support discovery via aggregations:

```console
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "my_unstructured_text_field": {
        "type": "annotated_text"
      },
      "my_structured_people_field": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
```

Applications would then typically provide content and discover it as follows:

```console
# Example documents
PUT my-index-000001/_doc/1
{
  "my_unstructured_text_field": "[Shay](%40kimchy) created elasticsearch",
  "my_twitter_handles": ["@kimchy"] <1>
}

GET my-index-000001/_search
{
  "query": {
    "query_string": {
      "query": "elasticsearch OR logstash OR kibana",<2>
      "default_field": "my_unstructured_text_field"
    }
  },
  "aggregations": {
    "top_people": {
      "significant_terms": { <3>
        "field": "my_twitter_handles.keyword"
      }
    }
  }
}
```

1. Note that `my_twitter_handles` contains a list of the annotation values also used in the unstructured text. (Note that the annotated_text syntax requires escaping.) By repeating the annotation values in a structured field, this application has ensured that the tokens discovered in the structured field can be used for search and highlighting in the unstructured field.
2. In this example we search for documents that talk about components of the elastic stack.
3. We use the `my_twitter_handles` field here to discover people who are significantly associated with the elastic stack.


## Avoiding over-matching annotations [_avoiding_over_matching_annotations]

By design, the regular text tokens and the annotation tokens co-exist in the same indexed field, but in rare cases this can lead to some over-matching.

The value of an annotation often denotes a *named entity* (a person, place or company). The tokens for these named entities are inserted untokenized, and differ from typical text tokens because they are normally:

* Mixed case e.g. `Madonna`
* Multiple words e.g. `Jeff Beck`
* Can have punctuation or numbers e.g. `Apple Inc.` or `@kimchy`

This means, for the most part, a search for a named entity in the annotated text field will not have any false positives, e.g. when selecting `Apple Inc.` from an aggregation result you can drill down to highlight uses in the text without "over matching" on any text tokens like the word `apple` in this context:

```
the apple was very juicy
```

However, a problem arises if your named entity happens to be a single term and lower-case, e.g. the company `elastic`. In this case, a search on the annotated text field for the token `elastic` may match a text document such as this:

```
they fired an elastic band
```

To avoid such false matches, users should consider prefixing annotation values to ensure they don't clash with text tokens, e.g.

```
[elastic](Company_elastic) released version 7.0 of the elastic stack today
```
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-annotated-text-usage.html
---

# Using the annotated-text field [mapper-annotated-text-usage]

The `annotated_text` field tokenizes text content in the same way as the more common [`text`](/reference/elasticsearch/mapping-reference/text.md) field (see "limitations" below), but also injects any marked-up annotation tokens directly into the search index:

```console
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "my_field": {
        "type": "annotated_text"
      }
    }
  }
}
```

Such a mapping would allow marked-up text, e.g. Wikipedia articles, to be indexed as both text and structured tokens. The annotations use a markdown-like syntax using URL encoding of one or more values separated by the `&` symbol.

We can use the `_analyze` API to test how an example annotation would be stored as tokens in the search index:

```js
GET my-index-000001/_analyze
{
  "field": "my_field",
  "text":"Investors in [Apple](Apple+Inc.) rejoiced."
}
```

Response:

```js
{
  "tokens": [
    {
      "token": "investors",
      "start_offset": 0,
      "end_offset": 9,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "in",
      "start_offset": 10,
      "end_offset": 12,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "Apple Inc.", <1>
      "start_offset": 13,
      "end_offset": 18,
      "type": "annotation",
      "position": 2
    },
    {
      "token": "apple",
      "start_offset": 13,
      "end_offset": 18,
      "type": "<ALPHANUM>",
      "position": 2
    },
    {
      "token": "rejoiced",
      "start_offset": 19,
      "end_offset": 27,
      "type": "<ALPHANUM>",
      "position": 3
    }
  ]
}
```

1. Note the whole annotation token `Apple Inc.` is placed, unchanged, as a single token in the token stream, at the same position (position 2) as the text token (`apple`) it annotates.


We can now perform searches for annotations using regular `term` queries that don't tokenize the provided search values. Annotations are a more precise way of matching, as can be seen in this example where a search for `Beck` will not match `Jeff Beck`:

```console
# Example documents
PUT my-index-000001/_doc/1
{
  "my_field": "[Beck](Beck) announced a new tour"<1>
}

PUT my-index-000001/_doc/2
{
  "my_field": "[Jeff Beck](Jeff+Beck&Guitarist) plays a strat"<2>
}

# Example search
GET my-index-000001/_search
{
  "query": {
    "term": {
      "my_field": "Beck" <3>
    }
  }
}
```

1. As well as tokenising the plain text into single words e.g. `beck`, here we inject the single token value `Beck` at the same position as `beck` in the token stream.
2. Note annotations can inject multiple tokens at the same position - here we inject both the very specific value `Jeff Beck` and the broader term `Guitarist`. This enables broader positional queries e.g. finding mentions of a `Guitarist` near to `strat`.
3. A benefit of searching with these carefully defined annotation tokens is that a query for `Beck` will not match document 2, which contains the tokens `jeff`, `beck` and `Jeff Beck`.


::::{warning}
Any use of `=` signs in annotation values, e.g. `[Prince](person=Prince)`, will cause the document to be rejected with a parse failure. In future we hope to have a use for the equals signs, so we will actively reject documents that contain this today.
::::


## Synthetic `_source` [annotated-text-synthetic-source]

::::{important}
Synthetic `_source` is Generally Available only for TSDB indices (indices that have `index.mode` set to `time_series`). For other indices, synthetic `_source` is in technical preview. Features in technical preview may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
::::

If using a sub-`keyword` field, then the values are sorted in the same way as a `keyword` field's values are sorted. By default, that means sorted with duplicates removed. So:

$$$synthetic-source-text-example-default$$$

```console
PUT idx
{
  "settings": {
    "index": {
      "mapping": {
        "source": {
          "mode": "synthetic"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "text": {
        "type": "annotated_text",
        "fields": {
          "raw": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
PUT idx/_doc/1
{
  "text": [
    "the quick brown fox",
    "the quick brown fox",
    "jumped over the lazy dog"
  ]
}
```

Will become:

```console-result
{
  "text": [
    "jumped over the lazy dog",
    "the quick brown fox"
  ]
}
```

::::{note}
Reordering text fields can have an effect on [phrase](/reference/query-languages/query-dsl-match-query-phrase.md) and [span](/reference/query-languages/span-queries.md) queries. See the discussion about [`position_increment_gap`](/reference/elasticsearch/mapping-reference/position-increment-gap.md) for more detail. You can avoid this by making sure the `slop` parameter on the phrase queries is lower than the `position_increment_gap`. This is the default.
::::

If the `annotated_text` field sets `store` to true, then order and duplicates are preserved.

$$$synthetic-source-text-example-stored$$$

```console
PUT idx
{
  "settings": {
    "index": {
      "mapping": {
        "source": {
          "mode": "synthetic"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "text": { "type": "annotated_text", "store": true }
    }
  }
}
PUT idx/_doc/1
{
  "text": [
    "the quick brown fox",
    "the quick brown fox",
    "jumped over the lazy dog"
  ]
}
```

Will become:

```console-result
{
  "text": [
    "the quick brown fox",
    "the quick brown fox",
    "jumped over the lazy dog"
  ]
}
```
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-annotated-text.html
---

# Mapper annotated text plugin [mapper-annotated-text]

::::{warning}
This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
::::

The mapper-annotated-text plugin provides the ability to index text that is a combination of free-text and special markup that is typically used to identify items of interest such as people or organisations (see NER or Named Entity Recognition tools).

The Elasticsearch markup allows one or more additional tokens to be injected, unchanged, into the token stream at the same position as the underlying text it annotates.


## Installation [mapper-annotated-text-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install mapper-annotated-text
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/mapper-annotated-text/mapper-annotated-text-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/mapper-annotated-text/mapper-annotated-text-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/mapper-annotated-text/mapper-annotated-text-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/mapper-annotated-text/mapper-annotated-text-9.0.0-beta1.zip.asc).


## Removal [mapper-annotated-text-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove mapper-annotated-text
```

The node must be stopped before removing the plugin.
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-murmur3-usage.html
---

# Using the murmur3 field [mapper-murmur3-usage]

The `murmur3` field is typically used within a multi-field, so that both the original value and its hash are stored in the index:

```console
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "my_field": {
        "type": "keyword",
        "fields": {
          "hash": {
            "type": "murmur3"
          }
        }
      }
    }
  }
}
```

Such a mapping would allow you to refer to `my_field.hash` in order to get hashes of the values of the `my_field` field. This is only useful in order to run `cardinality` aggregations:

```console
# Example documents
PUT my-index-000001/_doc/1
{
  "my_field": "This is a document"
}

PUT my-index-000001/_doc/2
{
  "my_field": "This is another document"
}

GET my-index-000001/_search
{
  "aggs": {
    "my_field_cardinality": {
      "cardinality": {
        "field": "my_field.hash" <1>
      }
    }
  }
}
```

1. Counting unique values on the `my_field.hash` field


Running a `cardinality` aggregation on the `my_field` field directly would yield the same result; however, using `my_field.hash` instead might result in a speed-up if the field has a high cardinality. On the other hand, it is discouraged to use the `murmur3` field on numeric fields and on string fields that are not almost unique, as the use of a `murmur3` field is unlikely to bring significant speed-ups, while increasing the amount of disk space required to store the index.
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-murmur3.html
---

# Mapper murmur3 plugin [mapper-murmur3]

The mapper-murmur3 plugin provides the ability to compute hashes of field values at index time and store them in the index. This can sometimes be helpful when running cardinality aggregations on high-cardinality and large string fields.

## Installation [mapper-murmur3-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install mapper-murmur3
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/mapper-murmur3/mapper-murmur3-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/mapper-murmur3/mapper-murmur3-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/mapper-murmur3/mapper-murmur3-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/mapper-murmur3/mapper-murmur3-9.0.0-beta1.zip.asc).

## Removal [mapper-murmur3-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove mapper-murmur3
```

The node must be stopped before removing the plugin.

26 docs/reference/elasticsearch-plugins/mapper-plugins.md Normal file
@ -0,0 +1,26 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper.html
---

# Mapper plugins [mapper]

Mapper plugins allow new field data types to be added to Elasticsearch.

## Core mapper plugins [_core_mapper_plugins]

The core mapper plugins are:

[Mapper size plugin](/reference/elasticsearch-plugins/mapper-size.md)
: The mapper-size plugin provides the `_size` metadata field which, when enabled, indexes the size in bytes of the original [`_source`](/reference/elasticsearch/mapping-reference/mapping-source-field.md) field.

[Mapper murmur3 plugin](/reference/elasticsearch-plugins/mapper-murmur3.md)
: The mapper-murmur3 plugin allows hashes to be computed at index time and stored in the index for later use with the `cardinality` aggregation.

[Mapper annotated text plugin](/reference/elasticsearch-plugins/mapper-annotated-text.md)
: The annotated text plugin provides the ability to index text that is a combination of free text and special markup, typically used to identify items of interest such as people or organisations (see NER, or Named Entity Recognition, tools).

82 docs/reference/elasticsearch-plugins/mapper-size-usage.md Normal file
@ -0,0 +1,82 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-size-usage.html
---

# Using the _size field [mapper-size-usage]

In order to enable the `_size` field, set the mapping as follows:

```console
PUT my-index-000001
{
  "mappings": {
    "_size": {
      "enabled": true
    }
  }
}
```

The value of the `_size` field is accessible in queries, aggregations, scripts, and when sorting. It can be retrieved using the [fields parameter](/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#search-fields-param):

```console
# Example documents
PUT my-index-000001/_doc/1
{
  "text": "This is a document"
}

PUT my-index-000001/_doc/2
{
  "text": "This is another document"
}

GET my-index-000001/_search
{
  "query": {
    "range": {
      "_size": { <1>
        "gt": 10
      }
    }
  },
  "aggs": {
    "sizes": {
      "terms": {
        "field": "_size", <2>
        "size": 10
      }
    }
  },
  "sort": [
    {
      "_size": { <3>
        "order": "desc"
      }
    }
  ],
  "fields": ["_size"], <4>
  "script_fields": {
    "size": {
      "script": "doc['_size']" <5>
    }
  }
}
```

1. Querying on the `_size` field
2. Aggregating on the `_size` field
3. Sorting on the `_size` field
4. Uses the `fields` parameter to return the `_size` field in the search response.
5. Uses a [script field](/reference/elasticsearch/rest-apis/retrieve-selected-fields.md#script-fields) to return the `_size` field in the search response.

::::{admonition} Using `_size` in {{kib}}
:class: note

To use the `_size` field in {{kib}}, update the `metaFields` setting and add `_size` to the list of meta fields. `metaFields` can be configured in {{kib}} from the Advanced Settings page in Management.
::::

39 docs/reference/elasticsearch-plugins/mapper-size.md Normal file
@ -0,0 +1,39 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-size.html
---

# Mapper size plugin [mapper-size]

The mapper-size plugin provides the `_size` metadata field which, when enabled, indexes the size in bytes of the original [`_source`](/reference/elasticsearch/mapping-reference/mapping-source-field.md) field.

## Installation [mapper-size-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install mapper-size
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/mapper-size/mapper-size-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/mapper-size/mapper-size-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/mapper-size/mapper-size-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/mapper-size/mapper-size-9.0.0-beta1.zip.asc).

## Removal [mapper-size-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove mapper-size
```

The node must be stopped before removing the plugin.

@ -0,0 +1,55 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/plugin-management-custom-url.html
---

# Custom URL or file system [plugin-management-custom-url]

A plugin can also be downloaded directly from a custom location by specifying the URL:

```shell
sudo bin/elasticsearch-plugin install [url] <1>
```

1. Must be a valid URL. The plugin name is determined from the plugin’s descriptor.

Unix
: To install a plugin from your local file system at `/path/to/plugin.zip`, you could run:

```shell
sudo bin/elasticsearch-plugin install file:///path/to/plugin.zip
```

Windows
: To install a plugin from your local file system at `C:\path\to\plugin.zip`, you could run:

```shell
bin\elasticsearch-plugin install file:///C:/path/to/plugin.zip
```

::::{note}
Any path that contains spaces must be wrapped in quotes!
::::
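
For example, a sketch of installing from a local path that contains a space (the path itself is illustrative):

```shell
sudo bin/elasticsearch-plugin install "file:///path/to/my plugins/plugin.zip"
```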

::::{note}
If you are installing a plugin from the filesystem, the plugin distribution must not be contained in the `plugins` directory of the node that you are installing the plugin to, or the installation will fail.
::::

HTTP
: To install a plugin from an HTTP URL:

```shell
sudo bin/elasticsearch-plugin install https://some.domain/path/to/plugin.zip
```

The plugin script will refuse to talk to an HTTPS URL with an untrusted certificate. To use a self-signed HTTPS certificate, you will need to add the CA certificate to a local Java truststore and pass its location to the script as follows:

```shell
sudo CLI_JAVA_OPTS="-Djavax.net.ssl.trustStore=/path/to/trustStore.jks" bin/elasticsearch-plugin install https://host/plugin.zip
```

11 docs/reference/elasticsearch-plugins/plugin-management.md Normal file
@ -0,0 +1,11 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/plugin-management.html
- https://www.elastic.co/guide/en/cloud/current/ec-adding-plugins.html
- https://www.elastic.co/guide/en/cloud-enterprise/current/ece-add-plugins.html
---

# Plugin management

% The inventory is not clear about which of the mapped pages should be source material
% for this page vs. added as separate pages.

@ -0,0 +1,61 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-hdfs-config.html
---

# Configuration properties [repository-hdfs-config]

Once installed, define the configuration for the `hdfs` repository through the [REST API](docs-content://deploy-manage/tools/snapshot-and-restore.md):

```console
PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode:8020/",
    "path": "elasticsearch/repositories/my_hdfs_repository",
    "conf.dfs.client.read.shortcircuit": "true"
  }
}
```

The following settings are supported:

`uri`
: The URI address for HDFS. Example: `hdfs://<host>:<port>/`. (Required)

`path`
: The file path within the filesystem where data is stored/loaded. Example: `path/to/file`. (Required)

`load_defaults`
: Whether to load the default Hadoop configuration. (Enabled by default)

`conf.<key>`
: Inlined configuration parameter to be added to the Hadoop configuration. (Optional) Only client-oriented properties from the Hadoop [core](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml) and [hdfs](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml) configuration files will be recognized by the plugin.

`compress`
: Whether to compress the metadata. (Enabled by default)

`max_restore_bytes_per_sec`
: Throttles the per-node restore rate. Defaults to unlimited. Note that restores are also throttled through [recovery settings](/reference/elasticsearch/configuration-reference/index-recovery-settings.md).

`max_snapshot_bytes_per_sec`
: Throttles the per-node snapshot rate. Defaults to `40mb` per second. Note that if the [recovery settings for managed services](/reference/elasticsearch/configuration-reference/index-recovery-settings.md) are set, then it defaults to unlimited, and the rate is additionally throttled through [recovery settings](/reference/elasticsearch/configuration-reference/index-recovery-settings.md).

`readonly`
: Makes the repository read-only. Defaults to `false`.

`chunk_size`
: Overrides the chunk size. (Disabled by default)

`security.principal`
: Kerberos principal to use when connecting to a secured HDFS cluster. If you are using a service principal for your Elasticsearch node, you may use the `_HOST` pattern in the principal name and the plugin will replace the pattern with the hostname of the node at runtime (see [Creating the secure repository](/reference/elasticsearch-plugins/repository-hdfs-security.md#repository-hdfs-security-runtime)).

`replication_factor`
: The replication factor for all new HDFS files created by this repository. Must be greater than or equal to the `dfs.replication.min` and less than or equal to the `dfs.replication.max` HDFS options. Defaults to the HDFS cluster setting.
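
As an illustration of how these settings combine, a repository that tunes compression, chunking, and the snapshot rate might be registered as follows (the repository name and values are hypothetical):

```console
PUT _snapshot/my_tuned_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode:8020/",
    "path": "elasticsearch/repositories/my_tuned_hdfs_repository",
    "compress": "true",
    "chunk_size": "10mb",
    "max_snapshot_bytes_per_sec": "100mb"
  }
}
```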

## A note on HDFS availability [repository-hdfs-availability]

When you initialize a repository, its settings are persisted in the cluster state. When a node comes online, it will attempt to initialize all repositories for which it has settings. If your cluster has an HDFS repository configured, then all nodes in the cluster must be able to reach HDFS when starting. If not, then the node will fail to initialize the repository at startup and the repository will be unusable. If this happens, you will need to remove and re-add the repository or restart the offending node.
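
For example, assuming a repository named `my_hdfs_repository`, removing and re-adding it might look like this:

```console
DELETE _snapshot/my_hdfs_repository

PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode:8020/",
    "path": "elasticsearch/repositories/my_hdfs_repository"
  }
}
```
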
@ -0,0 +1,82 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-hdfs-security.html
---

# Hadoop security [repository-hdfs-security]

The HDFS repository plugin integrates seamlessly with Hadoop’s authentication model. The following authentication methods are supported by the plugin:

`simple`
: Also means "no security" and is enabled by default. Uses information from the underlying operating system account running Elasticsearch to inform Hadoop of the name of the current user. Hadoop makes no attempt to verify this information.

`kerberos`
: Authenticates to Hadoop through the use of a Kerberos principal and keytab. Interfacing with HDFS clusters secured with Kerberos requires a few additional steps to enable (see [Principals and keytabs](#repository-hdfs-security-keytabs) and [Creating the secure repository](#repository-hdfs-security-runtime) for more information).

## Principals and keytabs [repository-hdfs-security-keytabs]

Before attempting to connect to a secured HDFS cluster, provision the Kerberos principals and keytabs that the Elasticsearch nodes will use for authenticating to Kerberos. For maximum security, and to avoid tripping Kerberos’s replay protection, you should create a service principal per node, following the pattern of `elasticsearch/hostname@REALM`.

::::{warning}
In some cases, if the same principal is authenticating from multiple clients at once, services may reject authentication for those principals under the assumption that they could be replay attacks. If you are running the plugin in production with multiple nodes, you should use a unique service principal for each node.
::::

On each Elasticsearch node, place the appropriate keytab file in the node’s configuration location under the `repository-hdfs` directory using the name `krb5.keytab`:

```bash
$> cd elasticsearch/config
$> ls
elasticsearch.yml  jvm.options  log4j2.properties  repository-hdfs/  scripts/
$> cd repository-hdfs
$> ls
krb5.keytab
```

::::{note}
Make sure you have the correct keytabs! If you are using a service principal per node (like `elasticsearch/hostname@REALM`) then each node will need its own unique keytab file for the principal assigned to that host!
::::

## Creating the secure repository [repository-hdfs-security-runtime]

Once your keytab files are in place and your cluster is started, creating a secured HDFS repository is simple. Just add the name of the principal that you will be authenticating as in the repository settings under the `security.principal` option:

```console
PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode:8020/",
    "path": "/user/elasticsearch/repositories/my_hdfs_repository",
    "security.principal": "elasticsearch@REALM"
  }
}
```

If you are using different service principals for each node, you can use the `_HOST` pattern in your principal name. Elasticsearch will automatically replace the pattern with the hostname of the node at runtime:

```console
PUT _snapshot/my_hdfs_repository
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode:8020/",
    "path": "/user/elasticsearch/repositories/my_hdfs_repository",
    "security.principal": "elasticsearch/_HOST@REALM"
  }
}
```

## Authorization [repository-hdfs-security-authorization]

Once Elasticsearch is connected and authenticated to HDFS, HDFS will infer a username to use for authorizing file access for the client. By default, it picks this username from the primary part of the Kerberos principal used to authenticate to the service. For example, in the case of a principal like `elasticsearch@REALM` or `elasticsearch/hostname@REALM`, the username that HDFS extracts for file access checks will be `elasticsearch`.

::::{note}
The repository plugin makes no assumptions about what Elasticsearch’s principal name is. The main fragment of the Kerberos principal is not required to be `elasticsearch`. If you have a principal or service name that works better for you or your organization, feel free to use it instead!
::::

@ -0,0 +1,14 @@

---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-hdfs-usage.html
---

# Getting started with HDFS [repository-hdfs-usage]

The HDFS snapshot/restore plugin is built against the latest Apache Hadoop 2.x (currently 2.7.1). If the distro you are using is not protocol-compatible with Apache Hadoop, consider replacing the Hadoop libraries inside the plugin folder with your own (you might have to adjust the security permissions required); a sketch of this swap is shown below.

Even if Hadoop is already installed on the Elasticsearch nodes, for security reasons the required libraries need to be placed under the plugin folder. Note that in most cases, if the distro is compatible, one simply needs to configure the repository with the appropriate Hadoop configuration files (see below).

Windows Users
: Using Apache Hadoop on Windows is problematic and thus it is not recommended. For those *really* wanting to use it, make sure you place the elusive `winutils.exe` under the plugin folder and point the `HADOOP_HOME` environment variable to it; this should minimize the amount of permissions Hadoop requires (though one would still have to add some more).
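
A minimal sketch of the library swap described above, assuming a default plugin location of `/usr/share/elasticsearch/plugins/repository-hdfs` (the paths and jar names are illustrative assumptions, not prescriptive):

```sh
# Hypothetical example: replace the bundled Hadoop client jars with your distro's
cd /usr/share/elasticsearch/plugins/repository-hdfs
mkdir -p /tmp/hadoop-jars-backup
mv hadoop-*.jar /tmp/hadoop-jars-backup/   # keep the originals for rollback
cp /path/to/your/hadoop/share/hadoop/common/hadoop-common-*.jar .
cp /path/to/your/hadoop/share/hadoop/hdfs/hadoop-hdfs-client-*.jar .
```
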
41 docs/reference/elasticsearch-plugins/repository-hdfs.md Normal file
@ -0,0 +1,41 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-hdfs.html
---

# Hadoop HDFS repository plugin [repository-hdfs]

The HDFS repository plugin adds support for using the HDFS File System as a repository for [Snapshot/Restore](docs-content://deploy-manage/tools/snapshot-and-restore.md).

## Installation [repository-hdfs-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install repository-hdfs
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-9.0.0-beta1.zip.asc).

## Removal [repository-hdfs-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove repository-hdfs
```

The node must be stopped before removing the plugin.

@ -0,0 +1,30 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository.html
---

# Snapshot/restore repository plugins [repository]

Repository plugins extend the [Snapshot/Restore](docs-content://deploy-manage/tools/snapshot-and-restore.md) functionality in Elasticsearch by adding repositories backed by the cloud or by distributed file systems:

## Official repository plugins [_official_repository_plugins]

::::{note}
Support for S3, GCS and Azure repositories is now bundled in {{es}} by default.
::::

The official repository plugins are:

[HDFS Repository](/reference/elasticsearch-plugins/repository-hdfs.md)
: The Hadoop HDFS Repository plugin adds support for using HDFS as a repository.

## Community contributed repository plugins [_community_contributed_repository_plugins]

The following plugin has been contributed by our community:

* [Openstack Swift](https://github.com/BigDataBoutique/elasticsearch-repository-swift) (by Wikimedia Foundation and BigData Boutique)

18 docs/reference/elasticsearch-plugins/store-plugins.md Normal file
@ -0,0 +1,18 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/store.html
---

# Store plugins [store]

Store plugins offer alternatives to the default Lucene stores.

## Core store plugins [_core_store_plugins]

The core store plugins are:

[Store SMB](/reference/elasticsearch-plugins/store-smb.md)
: The Store SMB plugin works around a bug in Windows SMB and Java on Windows.

41 docs/reference/elasticsearch-plugins/store-smb-usage.md Normal file
@ -0,0 +1,41 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/store-smb-usage.html
---

# Working around a bug in Windows SMB and Java on Windows [store-smb-usage]

When using a shared file system based on the SMB protocol (like Azure File Service) to store indices, Lucene opens index segment files with a write-only flag. This is the *correct* way to open the files, as they will only be used for writes, and it allows different FS implementations to optimize for it. Sadly, on Windows with SMB, this disables the cache manager, causing writes to be slow. This has been described in [LUCENE-6176](https://issues.apache.org/jira/browse/LUCENE-6176), but it affects each and every Java program out there! This needs to be fixed outside of Elasticsearch and/or Lucene, either in Windows or in OpenJDK. For now, we provide support for opening the files with a read flag, but this should be considered experimental; the correct place to fix it is in OpenJDK or Windows.

The Store SMB plugin provides the following storage types optimized for SMB:

`smb_mmap_fs`
: An SMB-specific implementation of the default [mmap fs](/reference/elasticsearch/index-settings/store.md#mmapfs)

`smb_simple_fs`
: Deprecated in 7.15: `smb_simple_fs` is deprecated and will be removed in 8.0. Use `smb_nio_fs` or other file systems instead.

`smb_nio_fs`
: An SMB-specific implementation of the default [nio fs](/reference/elasticsearch/index-settings/store.md#niofs)

To use one of these specific storage types, you need to install the Store SMB plugin and restart the node. Then configure Elasticsearch to set the storage type you want.

This can be configured for all indices by adding this to the `elasticsearch.yml` file:

```yaml
index.store.type: smb_nio_fs
```

Note that the setting applies only to newly created indices.

It can also be set on a per-index basis at index creation time:

```console
PUT my-index-000001
{
  "settings": {
    "index.store.type": "smb_mmap_fs"
  }
}
```

39 docs/reference/elasticsearch-plugins/store-smb.md Normal file
@ -0,0 +1,39 @@
---
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/plugins/current/store-smb.html
---

# Store SMB plugin [store-smb]

The Store SMB plugin works around a bug in Windows SMB and Java on Windows.

## Installation [store-smb-install]

::::{warning}
Version 9.0.0-beta1 of the Elastic Stack has not yet been released. The plugin might not be available.
::::

This plugin can be installed using the plugin manager:

```sh
sudo bin/elasticsearch-plugin install store-smb
```

The plugin must be installed on every node in the cluster, and each node must be restarted after installation.

You can download this plugin for [offline install](/reference/elasticsearch-plugins/plugin-management-custom-url.md) from [https://artifacts.elastic.co/downloads/elasticsearch-plugins/store-smb/store-smb-9.0.0-beta1.zip](https://artifacts.elastic.co/downloads/elasticsearch-plugins/store-smb/store-smb-9.0.0-beta1.zip). To verify the `.zip` file, use the [SHA hash](https://artifacts.elastic.co/downloads/elasticsearch-plugins/store-smb/store-smb-9.0.0-beta1.zip.sha512) or [ASC key](https://artifacts.elastic.co/downloads/elasticsearch-plugins/store-smb/store-smb-9.0.0-beta1.zip.asc).

## Removal [store-smb-remove]

The plugin can be removed with the following command:

```sh
sudo bin/elasticsearch-plugin remove store-smb
```

The node must be stopped before removing the plugin.

113 docs/reference/elasticsearch-plugins/toc.yml Normal file
@ -0,0 +1,113 @@
toc:
  - file: index.md
  - file: plugin-management.md
    children:
      - file: installation.md
      - file: plugin-management-custom-url.md
      - file: installing-multiple-plugins.md
      - file: mandatory-plugins.md
      - file: listing-removing-updating.md
      - file: _other_command_line_parameters.md
      - file: _plugins_directory.md
      - file: manage-plugins-using-configuration-file.md
      - file: cloud/ec-adding-plugins.md
        children:
          - file: cloud/ec-adding-elastic-plugins.md
          - file: cloud/ec-custom-bundles.md
          - file: cloud/ec-plugins-guide.md
      - file: cloud-enterprise/ece-add-plugins.md
  - file: api-extension-plugins.md
  - file: analysis-plugins.md
    children:
      - file: analysis-icu.md
        children:
          - file: analysis-icu-analyzer.md
          - file: analysis-icu-normalization-charfilter.md
          - file: analysis-icu-tokenizer.md
          - file: analysis-icu-normalization.md
          - file: analysis-icu-folding.md
          - file: analysis-icu-collation.md
          - file: analysis-icu-collation-keyword-field.md
          - file: analysis-icu-transform.md
      - file: analysis-kuromoji.md
        children:
          - file: analysis-kuromoji-analyzer.md
          - file: analysis-kuromoji-charfilter.md
          - file: analysis-kuromoji-tokenizer.md
          - file: analysis-kuromoji-baseform.md
          - file: analysis-kuromoji-speech.md
          - file: analysis-kuromoji-readingform.md
          - file: analysis-kuromoji-stemmer.md
          - file: analysis-kuromoji-stop.md
          - file: analysis-kuromoji-number.md
          - file: analysis-kuromoji-hiragana-uppercase.md
          - file: analysis-kuromoji-katakana-uppercase.md
          - file: analysis-kuromoji-completion.md
      - file: analysis-nori.md
        children:
          - file: analysis-nori-analyzer.md
          - file: analysis-nori-tokenizer.md
          - file: analysis-nori-speech.md
          - file: analysis-nori-readingform.md
          - file: analysis-nori-number.md
      - file: analysis-phonetic.md
        children:
          - file: analysis-phonetic-token-filter.md
      - file: analysis-smartcn.md
        children:
          - file: _reimplementing_and_extending_the_analyzers.md
          - file: analysis-smartcn_stop.md
      - file: analysis-stempel.md
        children:
          - file: _reimplementing_and_extending_the_analyzers_2.md
          - file: analysis-polish-stop.md
      - file: analysis-ukrainian.md
  - file: discovery-plugins.md
    children:
      - file: discovery-ec2.md
        children:
          - file: discovery-ec2-usage.md
          - file: cloud-aws-best-practices.md
      - file: discovery-azure-classic.md
        children:
          - file: discovery-azure-classic-usage.md
          - file: discovery-azure-classic-long.md
          - file: discovery-azure-classic-scale.md
      - file: discovery-gce.md
        children:
          - file: discovery-gce-usage.md
          - file: discovery-gce-network-host.md
          - file: discovery-gce-usage-long.md
          - file: discovery-gce-usage-cloning.md
          - file: discovery-gce-usage-zones.md
          - file: discovery-gce-usage-tags.md
          - file: discovery-gce-usage-port.md
          - file: discovery-gce-usage-tips.md
          - file: discovery-gce-usage-testing.md
  - file: mapper-plugins.md
    children:
      - file: mapper-size.md
        children:
          - file: mapper-size-usage.md
      - file: mapper-murmur3.md
        children:
          - file: mapper-murmur3-usage.md
      - file: mapper-annotated-text.md
        children:
          - file: mapper-annotated-text-usage.md
          - file: mapper-annotated-text-tips.md
          - file: mapper-annotated-text-highlighter.md
          - file: mapper-annotated-text-limitations.md
  - file: snapshotrestore-repository-plugins.md
    children:
      - file: repository-hdfs.md
        children:
          - file: repository-hdfs-usage.md
          - file: repository-hdfs-config.md
          - file: repository-hdfs-security.md
  - file: store-plugins.md
    children:
      - file: store-smb.md
        children:
          - file: store-smb-usage.md
  - file: integrations.md