[role="xpack"] [[find-field-structure]] = Find field structure API Finds the structure of a field in an Elasticsearch index. [discrete] [[find-field-structure-request]] == {api-request-title} `GET _text_structure/find_field_structure` [discrete] [[find-field-structure-prereqs]] == {api-prereq-title} * If the {es} {security-features} are enabled, you must have `monitor_text_structure` or `monitor` cluster privileges to use this API. See <>. [discrete] [[find-field-structure-desc]] == {api-description-title} This API provides a starting point for extracting further information from log messages already ingested into {es}. For example, if you have ingested data into a very simple index that has just `@timestamp` and `message` fields, you can use this API to see what common structure exists in the `message` field. The response from the API contains: * Sample messages. * Statistics that reveal the most common values for all fields detected within the text and basic numeric statistics for numeric fields. * Information about the structure of the text, which is useful when you write ingest configurations to index it or similarly formatted text. * Appropriate mappings for an {es} index, which you could use to ingest the text. All this information can be calculated by the structure finder with no guidance. However, you can optionally override some of the decisions about the text structure by specifying one or more query parameters. Details of the output can be seen in the <>. If the structure finder produces unexpected results, specify the `explain` query parameter and an `explanation` will appear in the response. It helps determine why the returned structure was chosen. [discrete] [[find-field-structure-query-parms]] == {api-query-parms-title} `index`:: (Required, string) The name of the index containing the field. `field`:: (Required, string) The name of the field that's analyzed. include::{es-ref-dir}/text-structure/apis/find-structure-shared.asciidoc[tag=param-column-names] include::{es-ref-dir}/text-structure/apis/find-structure-shared.asciidoc[tag=param-delimiter] `documents_to_sample`:: (Optional, unsigned integer) The number of documents to include in the structural analysis. The minimum is 2; the default is 1000. include::{es-ref-dir}/text-structure/apis/find-structure-shared.asciidoc[tag=param-explain] include::{es-ref-dir}/text-structure/apis/find-structure-shared.asciidoc[tag=param-format] include::{es-ref-dir}/text-structure/apis/find-structure-shared.asciidoc[tag=param-grok-pattern] include::{es-ref-dir}/text-structure/apis/find-structure-shared.asciidoc[tag=param-ecs-compatibility] include::{es-ref-dir}/text-structure/apis/find-structure-shared.asciidoc[tag=param-quote] include::{es-ref-dir}/text-structure/apis/find-structure-shared.asciidoc[tag=param-should-trim-fields] include::{es-ref-dir}/text-structure/apis/find-structure-shared.asciidoc[tag=param-timeout] include::{es-ref-dir}/text-structure/apis/find-structure-shared.asciidoc[tag=param-timestamp-field] include::{es-ref-dir}/text-structure/apis/find-structure-shared.asciidoc[tag=param-timestamp-format] [discrete] [[find-field-structure-examples]] == {api-examples-title} [discrete] [[find-field-structure-example]] === Analyzing Elasticsearch log files Suppose you have a list of {es} log messages in an index. You can analyze them with the `find_field_structure` endpoint as follows: [source,console] ---- POST _bulk?refresh=true {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:36,256][INFO ][o.a.l.u.VectorUtilPanamaProvider] [laptop] Java vector incubator API enabled; uses preferredBitSize=128"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:41,038][INFO ][o.e.p.PluginsService ] [laptop] loaded module [repository-url]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:41,042][INFO ][o.e.p.PluginsService ] [laptop] loaded module [rest-root]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:41,043][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-core]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:41,043][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-redact]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:41,043][INFO ][o.e.p.PluginsService ] [laptop] loaded module [ingest-user-agent]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-monitoring]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [repository-s3]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-analytics]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-ent-search]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-autoscaling]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [lang-painless]]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:41,059][INFO ][o.e.p.PluginsService ] [laptop] loaded module [lang-expression]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:41,059][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-eql]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:43,291][INFO ][o.e.e.NodeEnvironment ] [laptop] heap size [16gb], compressed ordinary object pointers [true]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:46,098][INFO ][o.e.x.s.Security ] [laptop] Security is enabled"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:47,227][INFO ][o.e.x.p.ProfilingPlugin ] [laptop] Profiling is enabled"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:47,259][INFO ][o.e.x.p.ProfilingPlugin ] [laptop] profiling index templates will not be installed or reinstalled"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:47,755][INFO ][o.e.i.r.RecoverySettings ] [laptop] using rate limit [40mb] with [default=40mb, read=0b, write=0b, max=0b]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:47,787][INFO ][o.e.d.DiscoveryModule ] [laptop] using discovery type [multi-node] and seed hosts providers [settings]"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:49,188][INFO ][o.e.n.Node ] [laptop] initialized"} {"index":{"_index":"test-logs"}} {"message":"[2024-03-05T10:52:49,199][INFO ][o.e.n.Node ] [laptop] starting ..."} GET _text_structure/find_field_structure?index=test-logs&field=message ---- // TEST If the request does not encounter errors, you receive the following result: [source,console-result] ---- { "num_lines_analyzed" : 22, "num_messages_analyzed" : 22, "sample_start" : "[2024-03-05T10:52:36,256][INFO ][o.a.l.u.VectorUtilPanamaProvider] [laptop] Java vector incubator API enabled; uses preferredBitSize=128\n[2024-03-05T10:52:41,038][INFO ][o.e.p.PluginsService ] [laptop] loaded module [repository-url]\n", <3> "charset" : "UTF-8", "format" : "semi_structured_text", "multiline_start_pattern" : "^\\[\\b\\d{4}-\\d{2}-\\d{2}[T ]\\d{2}:\\d{2}", "grok_pattern" : "\\[%{TIMESTAMP_ISO8601:timestamp}\\]\\[%{LOGLEVEL:loglevel} \\]\\[.*", "ecs_compatibility" : "disabled", "timestamp_field" : "timestamp", "joda_timestamp_formats" : [ "ISO8601" ], "java_timestamp_formats" : [ "ISO8601" ], "need_client_timezone" : true, "mappings" : { "properties" : { "@timestamp" : { "type" : "date" }, "loglevel" : { "type" : "keyword" }, "message" : { "type" : "text" } } }, "ingest_pipeline" : { "description" : "Ingest pipeline created by text structure finder", "processors" : [ { "grok" : { "field" : "message", "patterns" : [ "\\[%{TIMESTAMP_ISO8601:timestamp}\\]\\[%{LOGLEVEL:loglevel} \\]\\[.*" ], "ecs_compatibility" : "disabled" } }, { "date" : { "field" : "timestamp", "timezone" : "{{ event.timezone }}", "formats" : [ "ISO8601" ] } }, { "remove" : { "field" : "timestamp" } } ] }, "field_stats" : { "loglevel" : { "count" : 22, "cardinality" : 1, "top_hits" : [ { "value" : "INFO", "count" : 22 } ] }, "message" : { "count" : 22, "cardinality" : 22, "top_hits" : [ { "value" : "[2024-03-05T10:52:36,256][INFO ][o.a.l.u.VectorUtilPanamaProvider] [laptop] Java vector incubator API enabled; uses preferredBitSize=128", "count" : 1 }, { "value" : "[2024-03-05T10:52:41,038][INFO ][o.e.p.PluginsService ] [laptop] loaded module [repository-url]", "count" : 1 }, { "value" : "[2024-03-05T10:52:41,042][INFO ][o.e.p.PluginsService ] [laptop] loaded module [rest-root]", "count" : 1 }, { "value" : "[2024-03-05T10:52:41,043][INFO ][o.e.p.PluginsService ] [laptop] loaded module [ingest-user-agent]", "count" : 1 }, { "value" : "[2024-03-05T10:52:41,043][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-core]", "count" : 1 }, { "value" : "[2024-03-05T10:52:41,043][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-redact]", "count" : 1 }, { "value" : "[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [lang-painless]]", "count" : 1 }, { "value" : "[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [repository-s3]", "count" : 1 }, { "value" : "[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-analytics]", "count" : 1 }, { "value" : "[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-autoscaling]", "count" : 1 } ] }, "timestamp" : { "count" : 22, "cardinality" : 14, "earliest" : "2024-03-05T10:52:36,256", "latest" : "2024-03-05T10:52:49,199", "top_hits" : [ { "value" : "2024-03-05T10:52:41,044", "count" : 6 }, { "value" : "2024-03-05T10:52:41,043", "count" : 3 }, { "value" : "2024-03-05T10:52:41,059", "count" : 2 }, { "value" : "2024-03-05T10:52:36,256", "count" : 1 }, { "value" : "2024-03-05T10:52:41,038", "count" : 1 }, { "value" : "2024-03-05T10:52:41,042", "count" : 1 }, { "value" : "2024-03-05T10:52:43,291", "count" : 1 }, { "value" : "2024-03-05T10:52:46,098", "count" : 1 }, { "value" : "2024-03-05T10:52:47,227", "count" : 1 }, { "value" : "2024-03-05T10:52:47,259", "count" : 1 } ] } } } ---- // TESTRESPONSE[s/"sample_start" : ".*",/"sample_start" : "$body.sample_start",/] // The substitution is because the text is pre-processed by the test harness, // so the fields may get reordered in the JSON the endpoint sees For a detailed description of the response format, or for additional examples on ingesting delimited text (such as CSV) or newline-delimited JSON, refer to the <>.