mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-06-28 17:34:17 -04:00
73 lines
1.9 KiB
Text
[[mapper-murmur3]]
=== Mapper murmur3 plugin

The mapper-murmur3 plugin computes hashes of field values at index time and
stores them in the index. This can sometimes be helpful when running
`cardinality` aggregations on high-cardinality and large string fields.

:plugin_name: mapper-murmur3
include::install_remove.asciidoc[]

[[mapper-murmur3-usage]]
==== Using the `murmur3` field

The `murmur3` field is typically used within a multi-field, so that both the
original value and its hash are stored in the index:

[source,console]
--------------------------
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "my_field": {
        "type": "keyword",
        "fields": {
          "hash": {
            "type": "murmur3"
          }
        }
      }
    }
  }
}
--------------------------

Such a mapping lets you refer to `my_field.hash` to get hashes of the values
of the `my_field` field. This is only useful for running `cardinality`
aggregations:

[source,console]
--------------------------
# Example documents
PUT my-index-000001/_doc/1
{
  "my_field": "This is a document"
}

PUT my-index-000001/_doc/2
{
  "my_field": "This is another document"
}

GET my-index-000001/_search
{
  "aggs": {
    "my_field_cardinality": {
      "cardinality": {
        "field": "my_field.hash" <1>
      }
    }
  }
}
--------------------------

<1> Counting unique values on the `my_field.hash` field

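
As a rough illustration of what the multi-field mapping and the aggregation
above do together, the following Python sketch stores a hash next to each value
at index time and then counts distinct entries at search time. It is a toy
model, not Elasticsearch code: `stand_in_hash`, `index_documents`, and
`cardinality` are hypothetical helpers, BLAKE2b stands in for murmur3 (which
Python's standard library does not provide), and an exact set replaces the
approximate counting that the real `cardinality` aggregation performs.

```python
import hashlib


def stand_in_hash(value: str) -> int:
    """Return a 64-bit hash of `value`.

    Assumption: BLAKE2b truncated to 8 bytes stands in for murmur3 here,
    because Python's standard library has no murmur3 implementation.
    """
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def index_documents(values):
    """Index time: keep each original value together with its hash."""
    return [{"my_field": v, "my_field.hash": stand_in_hash(v)} for v in values]


def cardinality(docs, field):
    """Search time: count the distinct entries of `field` (exact, via a set)."""
    return len({doc[field] for doc in docs})


docs = index_documents([
    "This is a document",
    "This is another document",
    "This is a document",  # duplicate value, so it adds nothing to the count
])

print(cardinality(docs, "my_field"))       # distinct original values
print(cardinality(docs, "my_field.hash"))  # distinct precomputed hashes
```

Both calls print `2`, which mirrors the point that the hash sub-field and the
original field describe the same set of distinct values.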

Running a `cardinality` aggregation on the `my_field` field directly would
yield the same result. However, using `my_field.hash` instead might result in
a speed-up if the field has a high cardinality. On the other hand, using the
`murmur3` field is discouraged on numeric fields and on string fields that are
not almost unique: it is unlikely to bring significant speed-ups, and it
increases the amount of disk space required to store the index.
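
The trade-off described above can be made concrete with a deliberately
simplified cost model. The sketch below assumes, purely for illustration, that
without the `murmur3` field each unique value is hashed once per search, and
that with it one 8-byte hash is stored per document with no compression. These
figures and the helper names are assumptions, not measured Elasticsearch
behavior.

```python
def precomputed_hash_tradeoff(num_docs: int, num_unique: int, hash_bytes: int = 8):
    """Return (hash computations avoided per search, extra bytes stored).

    Assumptions of this model, not measured Elasticsearch behavior:
    - without the murmur3 field, each unique value is hashed once per search;
    - with it, one hash of `hash_bytes` bytes is stored per document,
      ignoring any compression.
    """
    if not 0 <= num_unique <= num_docs:
        raise ValueError("num_unique must be between 0 and num_docs")
    hashes_avoided = num_unique
    extra_bytes = num_docs * hash_bytes
    return hashes_avoided, extra_bytes


def bytes_per_avoided_hash(num_docs: int, num_unique: int) -> float:
    """Extra storage paid for each hash computation saved at search time."""
    avoided, extra = precomputed_hash_tradeoff(num_docs, num_unique)
    if avoided == 0:
        return float("inf")  # nothing is saved, so any storage cost is wasted
    return extra / avoided


# Almost-unique field: nearly every stored hash saves one hash computation.
print(bytes_per_avoided_hash(1_000_000, 990_000))

# Low-cardinality field: the same storage buys only 100 saved computations.
print(bytes_per_avoided_hash(1_000_000, 100))
```

Under these assumptions, the almost-unique field pays roughly 8 bytes per
avoided hash computation, while the low-cardinality field pays 80,000 bytes for
each one, which is the imbalance the guidance above warns about.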