[[synthetic-source]] ==== Synthetic `_source` preview:[] Though very handy to have around, the source field takes up a significant amount of space on disk. Instead of storing source documents on disk exactly as you send them, Elasticsearch can reconstruct source content on the fly upon retrieval. Enable this by setting `mode: synthetic` in `_source`: [source,console,id=enable-synthetic-source-example] ---- PUT idx { "mappings": { "_source": { "mode": "synthetic" } } } ---- // TESTSETUP While this on the fly reconstruction is *generally* slower than saving the source documents verbatim and loading them at query time, it saves a lot of storage space. There are a couple of restrictions to be aware of: * When you retrieve synthetic `_source` content it undergoes minor <> compared to the original JSON. * Synthetic `_source` can be used with indices that contain only these field types: ** <> ** <> ** <> ** <> ** <> ** <> ** <> ** <> ** <> ** <> ** <> ** <> ** <> ** <> ** <> ** <> (with a `keyword` sub-field) ** <> Runtime fields cannot, at this stage, use synthetic `_source`. [[synthetic-source-modifications]] ===== Synthetic source modifications When synthetic `_source` is enabled, retrieved documents undergo some modifications compared to the original JSON. [[synthetic-source-modifications-leaf-arrays]] ====== Arrays moved to leaf fields Synthetic `_source` arrays are moved to leaves. For example: [source,console,id=synthetic-source-leaf-arrays-example] ---- PUT idx/_doc/1 { "foo": [ { "bar": 1 }, { "bar": 2 } ] } ---- // TEST[s/$/\nGET idx\/_doc\/1?filter_path=_source\n/] Will become: [source,console-result] ---- { "foo": { "bar": [1, 2] } } ---- // TEST[s/^/{"_source":/ s/\n$/}/] [[synthetic-source-modifications-field-names]] ====== Fields named as they are mapped Synthetic source names fields as they are named in the mapping. When used with <>, fields with dots (`.`) in their names are, by default, interpreted as multiple objects, while dots in field names are preserved within objects that have <> disabled. For example: [source,console,id=synthetic-source-objecty-example] ---- PUT idx/_doc/1 { "foo.bar.baz": 1 } ---- // TEST[s/$/\nGET idx\/_doc\/1?filter_path=_source\n/] Will become: [source,console-result] ---- { "foo": { "bar": { "baz": 1 } } } ---- // TEST[s/^/{"_source":/ s/\n$/}/] [[synthetic-source-modifications-alphabetical]] ====== Alphabetical sorting Synthetic `_source` fields are sorted alphabetically. The https://www.rfc-editor.org/rfc/rfc7159.html[JSON RFC] defines objects as "an unordered collection of zero or more name/value pairs" so applications shouldn't care but without synthetic `_source` the original ordering is preserved and some applications may, counter to the spec, do something with that ordering.