mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-04-25 23:57:20 -04:00
* Apply auto-flattening to `subobjects: auto` (#112092)
* Introduce mode `subobjects=auto` for objects
* Update docs/changelog/110524.yaml
* compilation error
* tests and fixes
* refactor
* spotless
* more tests
* fix nested objects
* fix test
* update fetch test
* add QA coverage
* update tests
* update tests
* update tests
* Apply auto-flattening to `subobjects: auto`
* Update docs/changelog/112092.yaml
* sync
* dont flatten subobjects auto
* refine test
* fix path for nested flattened objects and dynamic
* document `subobjects: auto`
* Apply suggestions from code review
Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>
* comment updates
* restore indentation in comment
* update comment
* update comment
* update comment
* update comment
* rename isFlattenable
* add test for dynamic template
* fix copy_to and noop dynamic updates
* tests
* update comment
* fix tests
* update cluster feature in yaml test
* address comments
---------
Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>
(cherry picked from commit fffe8844e9
)
# Conflicts:
# modules/dot-prefix-validation/build.gradle
# rest-api-spec/build.gradle
* Update build.gradle
371 lines
11 KiB
Text
371 lines
11 KiB
Text
[[subobjects]]
|
|
=== `subobjects`
|
|
|
|
When indexing a document or updating mappings, Elasticsearch accepts fields that contain dots in their names,
|
|
which get expanded to their corresponding object structure. For instance, the field `metrics.time.max`
|
|
is mapped as a `max` leaf field with a parent `time` object, belonging to its parent `metrics` object.
|
|
|
|
The described default behaviour is reasonable for most scenarios, but causes problems in certain situations
|
|
where for instance a field `metrics.time` holds a value too, which is common when indexing metrics data.
|
|
A document holding a value for both `metrics.time.max` and `metrics.time` gets rejected given that `time`
|
|
would need to be a leaf field to hold a value as well as an object to hold the `max` sub-field.
|
|
|
|
The `subobjects: false` setting, which can be applied only to the top-level mapping definition and
|
|
to <<object,`object`>> fields, disables the ability for an object to hold further subobjects and makes it possible
|
|
to store documents where field names contain dots and share common prefixes. From the example above, if the object
|
|
container `metrics` has `subobjects` set to `false`, it can hold values for both `time` and `time.max` directly
|
|
without the need for any intermediate object, as dots in field names are preserved.
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
PUT my-index-000001
|
|
{
|
|
"mappings": {
|
|
"properties": {
|
|
"metrics": {
|
|
"type": "object",
|
|
"subobjects": false, <1>
|
|
"properties": {
|
|
"time": { "type": "long" },
|
|
"time.min": { "type": "long" },
|
|
"time.max": { "type": "long" }
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
PUT my-index-000001/_doc/metric_1
|
|
{
|
|
"metrics.time" : 100, <2>
|
|
"metrics.time.min" : 10,
|
|
"metrics.time.max" : 900
|
|
}
|
|
|
|
PUT my-index-000001/_doc/metric_2
|
|
{
|
|
"metrics" : {
|
|
"time" : 100, <3>
|
|
"time.min" : 10,
|
|
"time.max" : 900
|
|
}
|
|
}
|
|
|
|
GET my-index-000001/_mapping
|
|
--------------------------------------------------
|
|
|
|
[source,console-result]
|
|
--------------------------------------------------
|
|
{
|
|
"my-index-000001" : {
|
|
"mappings" : {
|
|
"properties" : {
|
|
"metrics" : {
|
|
"subobjects" : false,
|
|
"properties" : {
|
|
"time" : {
|
|
"type" : "long"
|
|
},
|
|
"time.min" : { <4>
|
|
"type" : "long"
|
|
},
|
|
"time.max" : {
|
|
"type" : "long"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
<1> The `metrics` field cannot hold other objects.
|
|
<2> Sample document holding flat paths
|
|
<3> Sample document holding an object (configured to not hold subobjects) and its leaf sub-fields
|
|
<4> The resulting mapping where dots in field names were preserved
|
|
|
|
The entire mapping may be configured to not support subobjects as well, in which case the document can
|
|
only ever hold leaf sub-fields:
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
PUT my-index-000001
|
|
{
|
|
"mappings": {
|
|
"subobjects": false <1>
|
|
}
|
|
}
|
|
|
|
PUT my-index-000001/_doc/metric_1
|
|
{
|
|
"time" : "100ms", <2>
|
|
"time.min" : "10ms",
|
|
"time.max" : "900ms"
|
|
}
|
|
|
|
--------------------------------------------------
|
|
|
|
<1> The entire mapping is configured to not support objects.
|
|
<2> The document does not support objects
|
|
|
|
Setting `subobjects: false` disallows the definition of <<object,`object`>> and <<nested,`nested`>> sub-fields, which
|
|
can be too restrictive in cases where it's desirable to have <<nested,`nested`>> objects or sub-objects with specific
|
|
behavior (e.g. with `enabled:false`). In this case, it's possible to set `subobjects: auto`, which
|
|
<<auto-flattening, auto-flattens>> whenever possible and falls back to creating an object mapper otherwise (instead of
|
|
rejecting the mapping as `subobjects: false` does). For instance:
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
PUT my-index-000002
|
|
{
|
|
"mappings": {
|
|
"properties": {
|
|
"metrics": {
|
|
"type": "object",
|
|
"subobjects": "auto", <1>
|
|
"properties": {
|
|
"inner": {
|
|
"type": "object",
|
|
"enabled": false
|
|
},
|
|
"nested": {
|
|
"type": "nested"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
|
|
PUT my-index-000002/_doc/metric_1
|
|
{
|
|
"metrics.time" : 100, <2>
|
|
"metrics.time.min" : 10,
|
|
"metrics.time.max" : 900
|
|
}
|
|
|
|
PUT my-index-000002/_doc/metric_2
|
|
{
|
|
"metrics" : { <3>
|
|
"time" : 100,
|
|
"time.min" : 10,
|
|
"time.max" : 900,
|
|
"inner": {
|
|
"foo": "bar",
|
|
"path.to.some.field": "baz"
|
|
},
|
|
"nested": [
|
|
{ "id": 10 },
|
|
{ "id": 1 }
|
|
]
|
|
}
|
|
}
|
|
|
|
GET my-index-000002/_mapping
|
|
--------------------------------------------------
|
|
|
|
[source,console-result]
|
|
--------------------------------------------------
|
|
{
|
|
"my-index-000002" : {
|
|
"mappings" : {
|
|
"properties" : {
|
|
"metrics" : {
|
|
"subobjects" : auto,
|
|
"properties" : {
|
|
"inner": { <4>
|
|
"type": "object",
|
|
"enabled": false
|
|
},
|
|
"nested": {
|
|
"type": "nested",
|
|
"properties" : {
|
|
"id" : {
|
|
"type" : "long"
|
|
}
|
|
}
|
|
},
|
|
"time" : {
|
|
"type" : "long"
|
|
},
|
|
"time.min" : {
|
|
"type" : "long"
|
|
},
|
|
"time.max" : {
|
|
"type" : "long"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
<1> The `metrics` field can only hold statically defined objects, namely `inner` and `nested`.
|
|
<2> Sample document holding flat paths
|
|
<3> Sample document holding an object (configured with sub-objects) and its leaf sub-fields
|
|
<4> The resulting mapping where dots in field names (`time.min`, `time_max`), as well as the
|
|
statically-defined sub-objects `inner` and `nested`, were preserved
|
|
|
|
The `subobjects` setting for existing fields and the top-level mapping definition cannot be updated.
|
|
|
|
[[auto-flattening]]
|
|
==== Auto-flattening object mappings
|
|
|
|
It is generally recommended to define the properties of an object that is configured with `subobjects: false` or
|
|
`subobjects: auto` with dotted field names (as shown in the first example). However, it is also possible to define
|
|
these properties as sub-objects in the mappings. In that case, the mapping will be automatically flattened before
|
|
it is stored. This makes it easier to re-use existing mappings without having to re-write them.
|
|
|
|
Note that auto-flattening does not apply if any of the following <<mapping-params, mapping parameters>> are set
|
|
on object mappings that are defined under an object configured with `subobjects: false` or `subobjects: auto`:
|
|
|
|
* The <<enabled, `enabled`>> mapping parameter is `false`.
|
|
* The <<dynamic, `dynamic`>> mapping parameter contradicts the implicit or explicit value of the parent.
|
|
For example, when `dynamic` is set to `false` in the root of the mapping, object mappers that set `dynamic` to `true`
|
|
can't be auto-flattened.
|
|
* The <<subobjects, `subobjects`>> mapping parameter is set to `auto` or `true` explicitly.
|
|
|
|
If such a sub-object is detected, the behavior depends on the `subobjects` value:
|
|
|
|
* `subobjects: false` is not compatible, so a mapping error is returned during mapping construction.
|
|
* `subobjects: auto` reverts to adding the object to the mapping, bypassing auto-flattening for it. Still, any
|
|
intermediate objects will be auto-flattened if applicable (i.e. the object name gets directly attached under the parent
|
|
object with `subobjects: auto`). Auto-flattening can be applied within sub-objects, if they are configured with
|
|
`subobjects: auto` too.
|
|
|
|
Auto-flattening example with `subobjects: false`:
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
PUT my-index-000003
|
|
{
|
|
"mappings": {
|
|
"properties": {
|
|
"metrics": {
|
|
"subobjects": false,
|
|
"properties": {
|
|
"time": {
|
|
"type": "object", <1>
|
|
"properties": {
|
|
"min": { "type": "long" }, <2>
|
|
"max": { "type": "long" }
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
GET my-index-000003/_mapping
|
|
--------------------------------------------------
|
|
|
|
[source,console-result]
|
|
--------------------------------------------------
|
|
{
|
|
"my-index-000003" : {
|
|
"mappings" : {
|
|
"properties" : {
|
|
"metrics" : {
|
|
"subobjects" : false,
|
|
"properties" : {
|
|
"time.min" : { <3>
|
|
"type" : "long"
|
|
},
|
|
"time.max" : {
|
|
"type" : "long"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
<1> The metrics object can contain further object mappings that will be auto-flattened.
|
|
Object mappings at this level must not set certain mapping parameters as explained above.
|
|
<2> This field will be auto-flattened to `time.min` before the mapping is stored.
|
|
<3> The auto-flattened `time.min` field can be inspected by looking at the index mapping.
|
|
|
|
Auto-flattening example with `subobjects: auto`:
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
PUT my-index-000004
|
|
{
|
|
"mappings": {
|
|
"properties": {
|
|
"metrics": {
|
|
"subobjects": "auto",
|
|
"properties": {
|
|
"time": {
|
|
"type": "object", <1>
|
|
"properties": {
|
|
"min": { "type": "long" } <2>
|
|
}
|
|
},
|
|
"to": {
|
|
"type": "object",
|
|
"properties": {
|
|
"inner_metrics": { <3>
|
|
"type": "object",
|
|
"subobjects": "auto",
|
|
"properties": {
|
|
"time": {
|
|
"type": "object",
|
|
"properties": {
|
|
"max": { "type": "long" } <4>
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
GET my-index-000004/_mapping
|
|
--------------------------------------------------
|
|
|
|
[source,console-result]
|
|
--------------------------------------------------
|
|
{
|
|
"my-index-000004" : {
|
|
"mappings" : {
|
|
"properties" : {
|
|
"metrics" : {
|
|
"subobjects" : "auto",
|
|
"properties" : {
|
|
"time.min" : { <5>
|
|
"type" : "long"
|
|
},
|
|
"to.inner_metrics" : { <6>
|
|
"subobjects" : "auto",
|
|
"properties" : {
|
|
"time.max" : { <7>
|
|
"type" : "long"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
<1> The metrics object can contain further object mappings that may be auto-flattened, depending on their mapping
|
|
parameters as explained above.
|
|
<2> This field will be auto-flattened to `time.min` before the mapping is stored.
|
|
<3> This object has param `subobjects: auto` so it can't be auto-flattened. Its parent does qualify for auto-flattening,
|
|
so it becomes `to.inner_metrics` before the mapping is stored.
|
|
<4> This field will be auto-flattened to `time.max` before the mapping is stored.
|
|
<5> The auto-flattened `time.min` field can be inspected by looking at the index mapping.
|
|
<6> The inner object `to.inner_metrics` can be inspected by looking at the index mapping.
|
|
<7> The auto-flattened `time.max` field can be inspected by looking at the index mapping.
|