elasticsearch/docs/reference/mapping/params/subobjects.asciidoc
Kostas Krikellas 8539876663
[8.x] Apply auto-flattening to subobjects: auto (#113584)
* Apply auto-flattening to `subobjects: auto` (#112092)

* Introduce mode `subobjects=auto` for objects

* Update docs/changelog/110524.yaml

* compilation error

* tests and fixes

* refactor

* spotless

* more tests

* fix nested objects

* fix test

* update fetch test

* add QA coverage

* update tests

* update tests

* update tests

* Apply auto-flattening to `subobjects: auto`

* Update docs/changelog/112092.yaml

* sync

* dont flatten subobjects auto

* refine test

* fix path for nested flattened objects and dynamic

* document `subobjects: auto`

* Apply suggestions from code review

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>

* comment updates

* restore indentation in comment

* update comment

* update comment

* update comment

* update comment

* rename isFlattenable

* add test for dynamic template

* fix copy_to and noop dynamic updates

* tests

* update comment

* fix tests

* update cluster feature in yaml test

* address comments

---------

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>
(cherry picked from commit fffe8844e9)

# Conflicts:
#	modules/dot-prefix-validation/build.gradle
#	rest-api-spec/build.gradle

* Update build.gradle
2024-09-26 20:17:11 +10:00

371 lines
11 KiB
Text

[[subobjects]]
=== `subobjects`
When indexing a document or updating mappings, Elasticsearch accepts fields that contain dots in their names,
which get expanded to their corresponding object structure. For instance, the field `metrics.time.max`
is mapped as a `max` leaf field with a parent `time` object, belonging to its parent `metrics` object.
The described default behaviour is reasonable for most scenarios, but causes problems in certain situations
where for instance a field `metrics.time` holds a value too, which is common when indexing metrics data.
A document holding a value for both `metrics.time.max` and `metrics.time` gets rejected given that `time`
would need to be a leaf field to hold a value as well as an object to hold the `max` sub-field.
The `subobjects: false` setting, which can be applied only to the top-level mapping definition and
to <<object,`object`>> fields, disables the ability for an object to hold further subobjects and makes it possible
to store documents where field names contain dots and share common prefixes. From the example above, if the object
container `metrics` has `subobjects` set to `false`, it can hold values for both `time` and `time.max` directly
without the need for any intermediate object, as dots in field names are preserved.
[source,console]
--------------------------------------------------
PUT my-index-000001
{
"mappings": {
"properties": {
"metrics": {
"type": "object",
"subobjects": false, <1>
"properties": {
"time": { "type": "long" },
"time.min": { "type": "long" },
"time.max": { "type": "long" }
}
}
}
}
}
PUT my-index-000001/_doc/metric_1
{
"metrics.time" : 100, <2>
"metrics.time.min" : 10,
"metrics.time.max" : 900
}
PUT my-index-000001/_doc/metric_2
{
"metrics" : {
"time" : 100, <3>
"time.min" : 10,
"time.max" : 900
}
}
GET my-index-000001/_mapping
--------------------------------------------------
[source,console-result]
--------------------------------------------------
{
"my-index-000001" : {
"mappings" : {
"properties" : {
"metrics" : {
"subobjects" : false,
"properties" : {
"time" : {
"type" : "long"
},
"time.min" : { <4>
"type" : "long"
},
"time.max" : {
"type" : "long"
}
}
}
}
}
}
}
--------------------------------------------------
<1> The `metrics` field cannot hold other objects.
<2> Sample document holding flat paths
<3> Sample document holding an object (configured to not hold subobjects) and its leaf sub-fields
<4> The resulting mapping where dots in field names were preserved
The entire mapping may be configured to not support subobjects as well, in which case the document can
only ever hold leaf sub-fields:
[source,console]
--------------------------------------------------
PUT my-index-000001
{
"mappings": {
"subobjects": false <1>
}
}
PUT my-index-000001/_doc/metric_1
{
"time" : "100ms", <2>
"time.min" : "10ms",
"time.max" : "900ms"
}
--------------------------------------------------
<1> The entire mapping is configured to not support objects.
<2> The document does not support objects
Setting `subobjects: false` disallows the definition of <<object,`object`>> and <<nested,`nested`>> sub-fields, which
can be too restrictive in cases where it's desirable to have <<nested,`nested`>> objects or sub-objects with specific
behavior (e.g. with `enabled:false`). In this case, it's possible to set `subobjects: auto`, which
<<auto-flattening, auto-flattens>> whenever possible and falls back to creating an object mapper otherwise (instead of
rejecting the mapping as `subobjects: false` does). For instance:
[source,console]
--------------------------------------------------
PUT my-index-000002
{
"mappings": {
"properties": {
"metrics": {
"type": "object",
"subobjects": "auto", <1>
"properties": {
"inner": {
"type": "object",
"enabled": false
},
"nested": {
"type": "nested"
}
}
}
}
}
}
PUT my-index-000002/_doc/metric_1
{
"metrics.time" : 100, <2>
"metrics.time.min" : 10,
"metrics.time.max" : 900
}
PUT my-index-000002/_doc/metric_2
{
"metrics" : { <3>
"time" : 100,
"time.min" : 10,
"time.max" : 900,
"inner": {
"foo": "bar",
"path.to.some.field": "baz"
},
"nested": [
{ "id": 10 },
{ "id": 1 }
]
}
}
GET my-index-000002/_mapping
--------------------------------------------------
[source,console-result]
--------------------------------------------------
{
"my-index-000002" : {
"mappings" : {
"properties" : {
"metrics" : {
"subobjects" : auto,
"properties" : {
"inner": { <4>
"type": "object",
"enabled": false
},
"nested": {
"type": "nested",
"properties" : {
"id" : {
"type" : "long"
}
}
},
"time" : {
"type" : "long"
},
"time.min" : {
"type" : "long"
},
"time.max" : {
"type" : "long"
}
}
}
}
}
}
}
--------------------------------------------------
<1> The `metrics` field can only hold statically defined objects, namely `inner` and `nested`.
<2> Sample document holding flat paths
<3> Sample document holding an object (configured with sub-objects) and its leaf sub-fields
<4> The resulting mapping where dots in field names (`time.min`, `time_max`), as well as the
statically-defined sub-objects `inner` and `nested`, were preserved
The `subobjects` setting for existing fields and the top-level mapping definition cannot be updated.
[[auto-flattening]]
==== Auto-flattening object mappings
It is generally recommended to define the properties of an object that is configured with `subobjects: false` or
`subobjects: auto` with dotted field names (as shown in the first example). However, it is also possible to define
these properties as sub-objects in the mappings. In that case, the mapping will be automatically flattened before
it is stored. This makes it easier to re-use existing mappings without having to re-write them.
Note that auto-flattening does not apply if any of the following <<mapping-params, mapping parameters>> are set
on object mappings that are defined under an object configured with `subobjects: false` or `subobjects: auto`:
* The <<enabled, `enabled`>> mapping parameter is `false`.
* The <<dynamic, `dynamic`>> mapping parameter contradicts the implicit or explicit value of the parent.
For example, when `dynamic` is set to `false` in the root of the mapping, object mappers that set `dynamic` to `true`
can't be auto-flattened.
* The <<subobjects, `subobjects`>> mapping parameter is set to `auto` or `true` explicitly.
If such a sub-object is detected, the behavior depends on the `subobjects` value:
* `subobjects: false` is not compatible, so a mapping error is returned during mapping construction.
* `subobjects: auto` reverts to adding the object to the mapping, bypassing auto-flattening for it. Still, any
intermediate objects will be auto-flattened if applicable (i.e. the object name gets directly attached under the parent
object with `subobjects: auto`). Auto-flattening can be applied within sub-objects, if they are configured with
`subobjects: auto` too.
Auto-flattening example with `subobjects: false`:
[source,console]
--------------------------------------------------
PUT my-index-000003
{
"mappings": {
"properties": {
"metrics": {
"subobjects": false,
"properties": {
"time": {
"type": "object", <1>
"properties": {
"min": { "type": "long" }, <2>
"max": { "type": "long" }
}
}
}
}
}
}
}
GET my-index-000003/_mapping
--------------------------------------------------
[source,console-result]
--------------------------------------------------
{
"my-index-000003" : {
"mappings" : {
"properties" : {
"metrics" : {
"subobjects" : false,
"properties" : {
"time.min" : { <3>
"type" : "long"
},
"time.max" : {
"type" : "long"
}
}
}
}
}
}
}
--------------------------------------------------
<1> The metrics object can contain further object mappings that will be auto-flattened.
Object mappings at this level must not set certain mapping parameters as explained above.
<2> This field will be auto-flattened to `time.min` before the mapping is stored.
<3> The auto-flattened `time.min` field can be inspected by looking at the index mapping.
Auto-flattening example with `subobjects: auto`:
[source,console]
--------------------------------------------------
PUT my-index-000004
{
"mappings": {
"properties": {
"metrics": {
"subobjects": "auto",
"properties": {
"time": {
"type": "object", <1>
"properties": {
"min": { "type": "long" } <2>
}
},
"to": {
"type": "object",
"properties": {
"inner_metrics": { <3>
"type": "object",
"subobjects": "auto",
"properties": {
"time": {
"type": "object",
"properties": {
"max": { "type": "long" } <4>
}
}
}
}
}
}
}
}
}
}
}
GET my-index-000004/_mapping
--------------------------------------------------
[source,console-result]
--------------------------------------------------
{
"my-index-000004" : {
"mappings" : {
"properties" : {
"metrics" : {
"subobjects" : "auto",
"properties" : {
"time.min" : { <5>
"type" : "long"
},
"to.inner_metrics" : { <6>
"subobjects" : "auto",
"properties" : {
"time.max" : { <7>
"type" : "long"
}
}
}
}
}
}
}
}
}
--------------------------------------------------
<1> The metrics object can contain further object mappings that may be auto-flattened, depending on their mapping
parameters as explained above.
<2> This field will be auto-flattened to `time.min` before the mapping is stored.
<3> This object has param `subobjects: auto` so it can't be auto-flattened. Its parent does qualify for auto-flattening,
so it becomes `to.inner_metrics` before the mapping is stored.
<4> This field will be auto-flattened to `time.max` before the mapping is stored.
<5> The auto-flattened `time.min` field can be inspected by looking at the index mapping.
<6> The inner object `to.inner_metrics` can be inspected by looking at the index mapping.
<7> The auto-flattened `time.max` field can be inspected by looking at the index mapping.