elasticsearch/docs/reference/docs/update.asciidoc
Clinton Gormley 64581d66c9 Tidied up the update docs
Closes #9459
2015-06-19 17:29:11 +02:00

258 lines
7.6 KiB
Text

[[docs-update]]
== Update API
The update API allows to update a document based on a script provided.
The operation gets the document (collocated with the shard) from the
index, runs the script (with optional script language and parameters),
and index back the result (also allows to delete, or ignore the
operation). It uses versioning to make sure no updates have happened
during the "get" and "reindex".
Note, this operation still means full reindex of the document, it just
removes some network roundtrips and reduces chances of version conflicts
between the get and the index. The `_source` field need to be enabled
for this feature to work.
For example, lets index a simple doc:
[source,js]
--------------------------------------------------
curl -XPUT localhost:9200/test/type1/1 -d '{
"counter" : 1,
"tags" : ["red"]
}'
--------------------------------------------------
[float]
=== Scripted updates
Now, we can execute a script that would increment the counter:
[source,js]
--------------------------------------------------
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
"script" : {
"inline": "ctx._source.counter += count",
"params" : {
"count" : 4
}
}
}'
--------------------------------------------------
We can add a tag to the list of tags (note, if the tag exists, it
will still add it, since its a list):
[source,js]
--------------------------------------------------
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
"script" : {
"inline": "ctx._source.tags += tag",
"params" : {
"tag" : "blue"
}
}
}'
--------------------------------------------------
In addition to `_source`, the following variables are available through
the `ctx` map: `_index`, `_type`, `_id`, `_version`, `_routing`,
`_parent`, `_timestamp`, `_ttl`.
We can also add a new field to the document:
[source,js]
--------------------------------------------------
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
"script" : "ctx._source.name_of_new_field = \"value_of_new_field\""
}'
--------------------------------------------------
Or remove a field from the document:
[source,js]
--------------------------------------------------
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
"script" : "ctx._source.remove(\"name_of_field\")"
}'
--------------------------------------------------
And, we can even change the operation that is executed. This example deletes
the doc if the `tags` field contain `blue`, otherwise it does nothing
(`noop`):
[source,js]
--------------------------------------------------
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
"script" : {
"inline": "ctx._source.tags.contains(tag) ? ctx.op = \"delete\" : ctx.op = \"none\"",
"params" : {
"tag" : "blue"
}
}
}'
--------------------------------------------------
[float]
=== Updates with a partial document
The update API also support passing a partial document,
which will be merged into the existing document (simple recursive merge,
inner merging of objects, replacing core "keys/values" and arrays). For
example:
[source,js]
--------------------------------------------------
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
"doc" : {
"name" : "new_name"
}
}'
--------------------------------------------------
If both `doc` and `script` is specified, then `doc` is ignored. Best is
to put your field pairs of the partial document in the script itself.
[float]
=== `detect_noop`
By default if `doc` is specified then the document is always updated even
if the merging process doesn't cause any changes. Specifying `detect_noop`
as `true` will cause Elasticsearch to check if there are changes and, if
there aren't, turn the update request into a noop. For example:
[source,js]
--------------------------------------------------
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
"doc" : {
"name" : "new_name"
},
"detect_noop": true
}'
--------------------------------------------------
If `name` was `new_name` before the request was sent then the entire update
request is ignored.
[[upserts]]
[float]
=== Upserts
If the document does not already exist, the contents of the `upsert` element
will be inserted as a new document. If the document does exist, then the
`script` will be executed instead:
[source,js]
--------------------------------------------------
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
"script" : {
"inline": "ctx._source.counter += count",
"params" : {
"count" : 4
}
},
"upsert" : {
"counter" : 1
}
}'
--------------------------------------------------
[float]
==== `scripted_upsert`
If you would like your script to run regardless of whether the document exists
or not -- i.e. the script handles initializing the document instead of the
`upsert` element -- then set `scripted_upsert` to `true`:
[source,js]
--------------------------------------------------
curl -XPOST 'localhost:9200/sessions/session/dh3sgudg8gsrgl/_update' -d '{
"scripted_upsert":true,
"script" : {
"id": "my_web_session_summariser",
"params" : {
"pageViewEvent" : {
"url":"foo.com/bar",
"response":404,
"time":"2014-01-01 12:32"
}
}
},
"upsert" : {}
}'
--------------------------------------------------
[float]
==== `doc_as_upsert`
Instead of sending a partial `doc` plus an `upsert` doc, setting
`doc_as_upsert` to `true` will use the contents of `doc` as the `upsert`
value:
[source,js]
--------------------------------------------------
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
"doc" : {
"name" : "new_name"
},
"doc_as_upsert" : true
}'
--------------------------------------------------
[float]
=== Parameters
The update operation supports the following query-string parameters:
[horizontal]
`retry_on_conflict`::
In between the get and indexing phases of the update, it is possible that
another process might have already updated the same document. By default, the
update will fail with a version conflict exception. The `retry_on_conflict`
parameter controls how many times to retry the update before finally throwing
an exception.
`routing`::
Routing is used to route the update request to the right shard and sets the
routing for the upsert request if the document being updated doesn't exist.
Can't be used to update the routing of an existing document.
`parent`::
Parent is used to route the update request to the right shard and sets the
parent for the upsert request if the document being updated doesn't exist.
Can't be used to update the `parent` of an existing document.
`timeout`::
Timeout waiting for a shard to become available.
`consistency`::
The write consistency of the index/delete operation.
`refresh`::
Refresh the relevant primary and replica shards (not the whole index)
immediately after the operation occurs, so that the updated document appears
in search results immediately.
`fields`::
Return the relevant fields from the updated document. Specify `_source` to
return the full updated source.
`version` & `version_type`::
The Update API uses the Elasticsearch's versioning support internally to make
sure the document doesn't change during the update. You can use the `version`
parameter to specify that the document should only be updated if it's version
matches the one specified. By setting version type to `force` you can force
the new version of the document after update (use with care! with `force`
there is no guarantee the document didn't change).Version types `external` &
`external_gte` are not supported.