elasticsearch/docs/plugins/mapper-murmur3.asciidoc

[[mapper-murmur3]]
=== Mapper murmur3 plugin

The mapper-murmur3 plugin provides the ability to compute hash of field values
at index-time and store them in the index. This can sometimes be helpful when
running cardinality aggregations on high-cardinality and large string fields.

:plugin_name: mapper-murmur3
include::install_remove.asciidoc[]

[[mapper-murmur3-usage]]
==== Using the `murmur3` field

The `murmur3` is typically used within a multi-field, so that both the original
value and its hash are stored in the index:

[source,console]
--------------------------
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "my_field": {
        "type": "keyword",
        "fields": {
          "hash": {
            "type": "murmur3"
          }
        }
      }
    }
  }
}
--------------------------

Such a mapping would allow to refer to `my_field.hash` in order to get hashes
of the values of the `my_field` field. This is only useful in order to run
`cardinality` aggregations:

[source,console]
--------------------------
# Example documents
PUT my-index-000001/_doc/1
{
  "my_field": "This is a document"
}

PUT my-index-000001/_doc/2
{
  "my_field": "This is another document"
}

GET my-index-000001/_search
{
  "aggs": {
    "my_field_cardinality": {
      "cardinality": {
        "field": "my_field.hash" <1>
      }
    }
  }
}
--------------------------

<1> Counting unique values on the `my_field.hash` field

Running a `cardinality` aggregation on the `my_field` field directly would
yield the same result, however using `my_field.hash` instead might result in
a speed-up if the field has a high-cardinality. On the other hand, it is
discouraged to use the `murmur3` field on numeric fields and string fields
that are not almost unique as the use of a `murmur3` field is unlikely to
bring significant speed-ups, while increasing the amount of disk space required
to store the index.