elasticsearch/docs/reference/ingest/processors/script.asciidoc
Stuart Tettemer d42211c431
Ingest: IngestDocument requires non-null version (#87665)
Changes the type of the version parameter in `IngestDocument` from
`Long` to `long` and moves it to the third argument, so all required
values occur before nullable arguments.

The `IngestService` expects a non-null version for a document and will
throw an `NullPointerException` if one is not provided.

Related: #87309
2022-06-15 07:50:45 -05:00

161 lines
3.9 KiB
Text

[[script-processor]]
=== Script processor
++++
<titleabbrev>Script</titleabbrev>
++++
Runs an inline or stored <<modules-scripting,script>> on incoming documents. The
script runs in the {painless}/painless-ingest-processor-context.html[`ingest`]
context.
The script processor uses the <<scripts-and-search-speed,script cache>> to avoid
recompiling the script for each incoming document. To improve performance,
ensure the script cache is properly sized before using a script processor in
production.
[[script-options]]
.Script options
[options="header"]
|======
| Name | Required | Default | Description
| `lang` | no | "painless" | <<scripting-available-languages,Script language>>.
| `id` | no | - | ID of a <<create-stored-script-api,stored script>>.
If no `source` is specified, this parameter is required.
| `source` | no | - | Inline script.
If no `id` is specified, this parameter is required.
| `params` | no | - | Object containing parameters for the script.
include::common-options.asciidoc[]
|======
[discrete]
[[script-processor-access-source-fields]]
==== Access source fields
The script processor parses each incoming document's JSON source fields into a
set of maps, lists, and primitives. To access these fields with a Painless
script, use the
{painless}/painless-operators-reference.html#map-access-operator[map access
operator]: `ctx['my-field']`. You can also use the shorthand `ctx.<my-field>`
syntax.
NOTE: The script processor does not support the `ctx['_source']['my-field']` or
`ctx._source.<my-field>` syntaxes.
The following processor uses a Painless script to extract the `tags` field from
the `env` source field.
[source,console]
----
POST _ingest/pipeline/_simulate
{
"pipeline": {
"processors": [
{
"script": {
"description": "Extract 'tags' from 'env' field",
"lang": "painless",
"source": """
String[] envSplit = ctx['env'].splitOnToken(params['delimiter']);
ArrayList tags = new ArrayList();
tags.add(envSplit[params['position']].trim());
ctx['tags'] = tags;
""",
"params": {
"delimiter": "-",
"position": 1
}
}
}
]
},
"docs": [
{
"_source": {
"env": "es01-prod"
}
}
]
}
----
The processor produces:
[source,console-result]
----
{
"docs": [
{
"doc": {
...
"_source": {
"env": "es01-prod",
"tags": [
"prod"
]
}
}
}
]
}
----
// TESTRESPONSE[s/\.\.\./"_index":"_index","_id":"_id","_version":"-3","_ingest":{"timestamp":$body.docs.0.doc._ingest.timestamp},/]
[discrete]
[[script-processor-access-metadata-fields]]
==== Access metadata fields
You can also use a script processor to access metadata fields. The following
processor uses a Painless script to set an incoming document's `_index`.
[source,console]
----
POST _ingest/pipeline/_simulate
{
"pipeline": {
"processors": [
{
"script": {
"description": "Set index based on `lang` field and `dataset` param",
"lang": "painless",
"source": """
ctx['_index'] = ctx['lang'] + '-' + params['dataset'];
""",
"params": {
"dataset": "catalog"
}
}
}
]
},
"docs": [
{
"_index": "generic-index",
"_source": {
"lang": "fr"
}
}
]
}
----
The processor changes the document's `_index` to `fr-catalog` from
`generic-index`.
[source,console-result]
----
{
"docs": [
{
"doc": {
...
"_index": "fr-catalog",
"_source": {
"lang": "fr"
}
}
}
]
}
----
// TESTRESPONSE[s/\.\.\./"_id":"_id","_version":"-3","_ingest":{"timestamp":$body.docs.0.doc._ingest.timestamp},/]