mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-04-25 07:37:19 -04:00
[DOCS] Adds section about how to use ingest timestamp to sync a transform (#87650)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
parent
d42211c431
commit
d48e1a2488
2 changed files with 52 additions and 3 deletions
|
@ -271,7 +271,7 @@ include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=sync-time-delay]
|
||||||
include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=sync-time-field]
|
include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=sync-time-field]
|
||||||
+
|
+
|
||||||
--
|
--
|
||||||
TIP: In general, it’s a good idea to use a field that contains the
|
TIP: It is strongly recommended to use a field that contains the
|
||||||
<<access-ingest-metadata,ingest timestamp>>. If you use a different field,
|
<<access-ingest-metadata,ingest timestamp>>. If you use a different field,
|
||||||
you might need to set the `delay` such that it accounts for data transmission
|
you might need to set the `delay` such that it accounts for data transmission
|
||||||
delays.
|
delays.
|
||||||
|
|
|
@ -40,8 +40,8 @@ The {transform} applies changes related to either new or changed entities or
|
||||||
time buckets to the destination index. The set of changes can be paginated. The
|
time buckets to the destination index. The set of changes can be paginated. The
|
||||||
{transform} performs a composite aggregation similarly to the batch {transform}
|
{transform} performs a composite aggregation similarly to the batch {transform}
|
||||||
operation, however it also injects query filters based on the previous step to
|
operation, however it also injects query filters based on the previous step to
|
||||||
reduce the amount of work. After all changes have been applied, the checkpoint is
|
reduce the amount of work. After all changes have been applied, the checkpoint
|
||||||
complete.
|
is complete.
|
||||||
--
|
--
|
||||||
|
|
||||||
This checkpoint process involves both search and indexing activity on the
|
This checkpoint process involves both search and indexing activity on the
|
||||||
|
@ -55,6 +55,55 @@ TIP: If the cluster experiences unsuitable performance degradation due to the
|
||||||
{transform}, stop the {transform} and refer to <<transform-performance>>.
|
{transform}, stop the {transform} and refer to <<transform-performance>>.
|
||||||
|
|
||||||
|
|
||||||
|
[discrete]
|
||||||
|
[[sync-field-ingest-timestamp]]
|
||||||
|
== Using the ingest timestamp for syncing the {transform}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
In most cases, it is strongly recommended to use the ingest timestamp of the
|
||||||
|
source indices for syncing the {transform}. This is the optimal way for
|
||||||
|
{transforms} to be able to identify new changes. If your data source follows the
|
||||||
|
{ecs-ref}/ecs-reference.html[ECS standard], you might already have an
|
||||||
|
{ecs-ref}/ecs-event.html#field-event-ingested[`event.ingested`] field. In this
|
||||||
|
case, use `event.ingested` as the `sync`.`time`.`field` property of your
|
||||||
|
{transform}.
|
||||||
|
|
||||||
|
If you don't have an `event.ingested` field or it isn't populated, you can set it
|
||||||
|
by using an ingest pipeline. Create an ingest pipeline either using the
|
||||||
|
<<put-pipeline-api, ingest pipeline API>> (like the example below) or via {kib}
|
||||||
|
under **Stack Management > Ingest Pipelines**. Use a
|
||||||
|
<<set-processor,`set` processor>> to set the field and associate it with the
|
||||||
|
value of the ingest timestamp.
|
||||||
|
|
||||||
|
[source,console]
|
||||||
|
----------------------------------
|
||||||
|
PUT _ingest/pipeline/set_ingest_time
|
||||||
|
{
|
||||||
|
"description": "Set ingest timestamp.",
|
||||||
|
"processors": [
|
||||||
|
{
|
||||||
|
"set": {
|
||||||
|
"field": "event.ingested",
|
||||||
|
"value": "{{{_ingest.timestamp}}}"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
----------------------------------
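
Before applying the pipeline, you might want to check that it sets the field as
expected. The following request is a minimal sketch that uses the simulate
pipeline API with a single made-up document; the `message` field and its value
are placeholders, not part of any real data set.

[source,console]
----------------------------------
POST _ingest/pipeline/set_ingest_time/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "placeholder test document" <1>
      }
    }
  ]
}
----------------------------------
<1> Hypothetical field and value, used only for illustration.

If the pipeline works, the simulated document in the response should include an
`event.ingested` value next to the original `message` field.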
|
||||||
|
|
||||||
|
After you create the ingest pipeline, apply it to the source indices of your
|
||||||
|
{transform}. The pipeline adds the field `event.ingested` to every document with
|
||||||
|
the value of the ingest timestamp. Configure the `sync`.`time`.`field` property
|
||||||
|
of your {transform} to use the field by using the
|
||||||
|
<<put-transform,create {transform} API>> for new {transforms} or the
|
||||||
|
<<update-transform, update {transform} API>> for existing {transforms}. The
|
||||||
|
`event.ingested` field is used for syncing the {transform}.
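
The two steps above can be sketched as follows. This is an illustration under
stated assumptions, not the only way to do it: `my-source-index` and
`my-transform` are hypothetical names, and setting the pipeline as the index
default is just one of the ways to apply it.

[source,console]
----------------------------------
PUT my-source-index/_settings <1>
{
  "index.default_pipeline": "set_ingest_time"
}

POST _transform/my-transform/_update <2>
{
  "sync": {
    "time": {
      "field": "event.ingested"
    }
  }
}
----------------------------------
<1> Hypothetical source index. With `index.default_pipeline` set, indexing
requests that don't specify a pipeline run through `set_ingest_time`.
<2> Hypothetical {transform} ID. The update changes only the `sync`
configuration.

A default pipeline only affects documents indexed after the setting is applied.
Documents that are already in the source index do not get an `event.ingested`
value unless they are indexed again.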
|
||||||
|
|
||||||
|
Refer to <<add-pipeline-to-indexing-request>> and <<ingest>> to learn more about
|
||||||
|
how to use an ingest pipeline.
|
||||||
|
|
||||||
|
|
||||||
[discrete]
|
[discrete]
|
||||||
[[ml-transform-checkpoint-heuristics]]
|
[[ml-transform-checkpoint-heuristics]]
|
||||||
== Change detection heuristics
|
== Change detection heuristics
|
||||||
|
|