[DOCS] Add performance warning for scripts (#59890)

This commit is contained in:
James Rodewig 2020-07-20 14:04:35 -04:00 committed by GitHub
parent 9b155d9201
commit 8a57800f1b
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
6 changed files with 175 additions and 156 deletions

View file

@ -252,6 +252,166 @@ changed by setting `script.max_size_in_bytes` setting to increase that soft
limit, but if scripts are really large then a
<<modules-scripting-engine,native script engine>> should be considered.
[[scripts-and-search-speed]]
=== Scripts and search speed
Scripts can't make use of {es}'s index structures or related optimizations. This
can sometimes result in slower search speeds.
If you often use scripts to transform indexed data, you can speed up search by
making these changes during ingest instead. However, that often means slower
index speeds.
.*Example*
[%collapsible]
=====
An index, `my_test_scores`, contains two `long` fields:
* `math_score`
* `verbal_score`
When running searches, users often use a script to sort results by the sum of
these two field's values.
[source,console]
----
GET /my_test_scores/_search
{
"query": {
"term": {
"grad_year": "2099"
}
},
"sort": [
{
"_script": {
"type": "number",
"script": {
"source": "doc['math_score'].value + doc['verbal_score'].value"
},
"order": "desc"
}
}
]
}
----
// TEST[s/^/PUT my_test_scores\n/]
To speed up search, you can perform this calculation during ingest and index the
sum to a field instead.
First, <<indices-put-mapping,add a new field>>, `total_score`, to the index. The
`total_score` field will contain sum of the `math_score` and `verbal_score`
field values.
[source,console]
----
PUT /my_test_scores/_mapping
{
"properties": {
"total_score": {
"type": "long"
}
}
}
----
// TEST[continued]
Next, use an <<ingest,ingest pipeline>> containing the
<<script-processor,`script`>> processor to calculate the sum of `math_score` and
`verbal_score` and index it in the `total_score` field.
[source,console]
----
PUT _ingest/pipeline/my_test_scores_pipeline
{
"description": "Calculates the total test score",
"processors": [
{
"script": {
"source": "ctx.total_score = (ctx.math_score + ctx.verbal_score)"
}
}
]
}
----
// TEST[continued]
To update existing data, use this pipeline to <<docs-reindex,reindex>> any
documents from `my_test_scores` to a new index, `my_test_scores_2`.
[source,console]
----
POST /_reindex
{
"source": {
"index": "my_test_scores"
},
"dest": {
"index": "my_test_scores_2",
"pipeline": "my_test_scores_pipeline"
}
}
----
// TEST[continued]
Continue using the pipeline to index any new documents to `my_test_scores_2`.
[source,console]
----
POST /my_test_scores_2/_doc/?pipeline=my_test_scores_pipeline
{
"student": "kimchy",
"grad_year": "2099",
"math_score": 800,
"verbal_score": 800
}
----
// TEST[continued]
These changes may slow indexing but allow for faster searches. Users can now
sort searches made on `my_test_scores_2` using the `total_score` field instead
of using a script.
[source,console]
----
GET /my_test_scores_2/_search
{
"query": {
"term": {
"grad_year": "2099"
}
},
"sort": [
{
"total_score": {
"order": "desc"
}
}
]
}
----
// TEST[continued]
////
[source,console]
----
DELETE /_ingest/pipeline/my_test_scores_pipeline
----
// TEST[continued]
[source,console-result]
----
{
"acknowledged": true
}
----
////
=====
We recommend testing and benchmarking any indexing changes before deploying them
in production.
[float]
[[modules-scripting-errors]]
=== Script errors