[DOCS] Adds section about how to use ingest timestamp to sync a transform (#87650)

Co-authored-by: Lisa Cawley <lcawley@elastic.co>
István Zoltán Szabó 2022-06-15 15:00:44 +02:00 committed by GitHub
parent d42211c431
commit d48e1a2488
2 changed files with 52 additions and 3 deletions

@ -271,7 +271,7 @@ include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=sync-time-delay]
include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=sync-time-field]
+
--
-TIP: In general, its a good idea to use a field that contains the
+TIP: It is strongly recommended to use a field that contains the
<<access-ingest-metadata,ingest timestamp>>. If you use a different field,
you might need to set the `delay` such that it accounts for data transmission
delays.

@ -40,8 +40,8 @@ The {transform} applies changes related to either new or changed entities or
time buckets to the destination index. The set of changes can be paginated. The
{transform} performs a composite aggregation similarly to the batch {transform}
operation, however it also injects query filters based on the previous step to
-reduce the amount of work. After all changes have been applied, the checkpoint is
-complete.
+reduce the amount of work. After all changes have been applied, the checkpoint
+is complete.
--
This checkpoint process involves both search and indexing activity on the
@ -55,6 +55,55 @@ TIP: If the cluster experiences unsuitable performance degradation due to the
{transform}, stop the {transform} and refer to <<transform-performance>>.
[discrete]
[[sync-field-ingest-timestamp]]
== Using the ingest timestamp for syncing the {transform}
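In most cases, it is strongly recommended to use the ingest timestamp of the
source indices for syncing the {transform}. This is the optimal way for
{transforms} to identify new changes, because the ingest timestamp reflects
when a document was indexed rather than when the event occurred, so
late-arriving data is less likely to be missed. If your data source follows the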
{ecs-ref}/ecs-reference.html[ECS standard], you might already have an
{ecs-ref}/ecs-event.html#field-event-ingested[`event.ingested`] field. In this
case, use `event.ingested` as the `sync`.`time`.`field` property of your
{transform}.
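A minimal sketch of such a configuration follows. The {transform} ID, the index
names, and the pivot fields (`my-transform`, `my-source-index`,
`my-dest-index`, `host.name`, `@timestamp`) are hypothetical placeholders, and
the `delay` value is only illustrative; adapt them to your data.
[source,console]
----------------------------------
# Assumption: all names below are placeholders, not taken from a real cluster.
PUT _transform/my-transform
{
  "source": { "index": "my-source-index" },
  "dest": { "index": "my-dest-index" },
  "pivot": {
    "group_by": {
      "host": { "terms": { "field": "host.name" } }
    },
    "aggregations": {
      "latest_event": { "max": { "field": "@timestamp" } }
    }
  },
  "sync": {
    "time": {
      "field": "event.ingested",
      "delay": "60s"
    }
  }
}
----------------------------------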
If you don't have an `event.ingested` field or it isn't populated, you can set it
by using an ingest pipeline. Create an ingest pipeline either using the
<<put-pipeline-api, ingest pipeline API>> (like the example below) or via {kib}
under **Stack Management > Ingest Pipelines**. Use a
<<set-processor,`set` processor>> to set the field and associate it with the
value of the ingest timestamp.
[source,console]
----------------------------------
PUT _ingest/pipeline/set_ingest_time
{
  "description": "Set ingest timestamp.",
  "processors": [
    {
      "set": {
        "field": "event.ingested",
        "value": "{{{_ingest.timestamp}}}"
      }
    }
  ]
}
----------------------------------
After you create the ingest pipeline, apply it to the source indices of your
{transform}. The pipeline adds the `event.ingested` field, set to the ingest
timestamp, to every document that it processes. Configure the
`sync`.`time`.`field` property of your {transform} to use the field by using the
<<put-transform,create {transform} API>> for new {transforms} or the
<<update-transform, update {transform} API>> for existing {transforms}. The
{transform} then uses the `event.ingested` field for syncing.
Refer to <<add-pipeline-to-indexing-request>> and <<ingest>> to learn more about
how to use an ingest pipeline.
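The steps above might look like the following sketch, assuming a hypothetical
source index named `my-source-index` and an existing {transform} named
`my-transform`. Setting `index.default_pipeline` is only one way to apply the
pipeline, and it affects only documents indexed after the setting is in place.
[source,console]
----------------------------------
# Assumption: `my-source-index` is a placeholder for your source index.
PUT my-source-index/_settings
{
  "index.default_pipeline": "set_ingest_time"
}

# Assumption: `my-transform` is a placeholder for an existing transform ID.
POST _transform/my-transform/_update
{
  "sync": {
    "time": {
      "field": "event.ingested"
    }
  }
}
----------------------------------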
[discrete]
[[ml-transform-checkpoint-heuristics]]
== Change detection heuristics