mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-04-27 00:27:25 -04:00
n larger clusters with complicated datafeed requirements, being able to preview only a specific window of time is important. Previously, datafeed previews would always start at 0 (or from the beginning of the data). This causes issues if the index pattern contains indices on slower hardware, but when the datafeed is actually started, the "start" time is set to more recent data (and thus on faster hardware). Additionally, when _preview is unbounded (as before), it attempts to only preview indices that are NOT frozen or cold. This is done through a query against the _tier field. Meaning, it only effects newer indices that actually have that field set.
230 lines
6.7 KiB
Text
230 lines
6.7 KiB
Text
[role="xpack"]
|
|
[[ml-preview-datafeed]]
|
|
= Preview {dfeeds} API
|
|
|
|
[subs="attributes"]
|
|
++++
|
|
<titleabbrev>Preview {dfeeds}</titleabbrev>
|
|
++++
|
|
|
|
Previews a {dfeed}.
|
|
|
|
[[ml-preview-datafeed-request]]
|
|
== {api-request-title}
|
|
|
|
`GET _ml/datafeeds/<datafeed_id>/_preview` +
|
|
|
|
`POST _ml/datafeeds/<datafeed_id>/_preview` +
|
|
|
|
`GET _ml/datafeeds/_preview` +
|
|
|
|
`POST _ml/datafeeds/_preview`
|
|
|
|
[[ml-preview-datafeed-prereqs]]
|
|
== {api-prereq-title}
|
|
|
|
Requires the following privileges:
|
|
|
|
* cluster: `manage_ml` (the `machine_learning_admin` built-in role grants this
|
|
privilege)
|
|
* source index configured in the {dfeed}: `read`.
|
|
|
|
[[ml-preview-datafeed-desc]]
|
|
== {api-description-title}
|
|
|
|
The preview {dfeeds} API returns the first "page" of search results from a
|
|
{dfeed}. You can preview an existing {dfeed} or provide configuration details
|
|
for the {dfeed} and {anomaly-job} in the API. The preview shows the structure of
|
|
the data that will be passed to the anomaly detection engine.
|
|
|
|
IMPORTANT: When {es} {security-features} are enabled, the {dfeed} query is
|
|
previewed using the credentials of the user calling the preview {dfeed} API.
|
|
When the {dfeed} is started it runs the query using the roles of the last user
|
|
to create or update it. If the two sets of roles differ then the preview may
|
|
not accurately reflect what the {dfeed} will return when started. To avoid
|
|
such problems, the same user that creates or updates the {dfeed} should preview
|
|
it to ensure it is returning the expected data. Alternatively, use
|
|
<<http-clients-secondary-authorization,secondary authorization headers>> to
|
|
supply the credentials.
|
|
|
|
[[ml-preview-datafeed-path-parms]]
|
|
== {api-path-parms-title}
|
|
|
|
`<datafeed_id>`::
|
|
(Optional, string)
|
|
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=datafeed-id]
|
|
+
|
|
NOTE: If you provide the `<datafeed_id>` as a path parameter, you cannot
|
|
provide {dfeed} or {anomaly-job} configuration details in the request body.
|
|
|
|
[[ml-preview-datafeed-query-parms]]
|
|
== {api-query-parms-title}
|
|
|
|
`end`::
|
|
(Optional, string) The time that the {dfeed} preview should end. The preview may not go to the end of the provided value
|
|
as only the first page of results are returned. The time can be specified by using one of the following formats:
|
|
+
|
|
--
|
|
* ISO 8601 format with milliseconds, for example `2017-01-22T06:00:00.000Z`
|
|
* ISO 8601 format without milliseconds, for example `2017-01-22T06:00:00+00:00`
|
|
* Milliseconds since the epoch, for example `1485061200000`
|
|
|
|
Date-time arguments using either of the ISO 8601 formats must have a time zone
|
|
designator, where `Z` is accepted as an abbreviation for UTC time.
|
|
|
|
NOTE: When a URL is expected (for example, in browsers), the `+` used in time
|
|
zone designators must be encoded as `%2B`.
|
|
|
|
This value is exclusive.
|
|
--
|
|
|
|
`start`::
|
|
(Optional, string) The time that the {dfeed} preview should begin, which can be
|
|
specified by using the same formats as the `end` parameter. This value is
|
|
inclusive.
|
|
|
|
NOTE: If you don't provide either the `start` or `end` parameter, the {dfeed} preview will search over the entire
|
|
time of data but exclude data within `cold` or `frozen` <<data-tiers, data tiers>>.
|
|
|
|
[[ml-preview-datafeed-request-body]]
|
|
== {api-request-body-title}
|
|
|
|
`datafeed_config`::
|
|
(Optional, object) The {dfeed} definition to preview. For valid definitions, see
|
|
the <<ml-put-datafeed-request-body,create {dfeeds} API>>.
|
|
|
|
`job_config`::
|
|
(Optional, object) The configuration details for the {anomaly-job} that is
|
|
associated with the {dfeed}. If the `datafeed_config` object does not include a
|
|
`job_id` that references an existing {anomaly-job}, you must supply this
|
|
`job_config` object. If you include both a `job_id` and a `job_config`, the
|
|
latter information is used. You cannot specify a `job_config` object unless you also supply a `datafeed_config` object. For valid definitions, see the
|
|
<<ml-put-job-request-body,create {anomaly-jobs} API>>.
|
|
|
|
[[ml-preview-datafeed-example]]
|
|
== {api-examples-title}
|
|
|
|
This is an example of providing the ID of an existing {dfeed}:
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
GET _ml/datafeeds/datafeed-high_sum_total_sales/_preview
|
|
--------------------------------------------------
|
|
// TEST[skip:set up Kibana sample data]
|
|
|
|
The data that is returned for this example is as follows:
|
|
|
|
[source,console-result]
|
|
----
|
|
[
|
|
{
|
|
"order_date" : 1574294659000,
|
|
"category.keyword" : "Men's Clothing",
|
|
"customer_full_name.keyword" : "Sultan Al Benson",
|
|
"taxful_total_price" : 35.96875
|
|
},
|
|
{
|
|
"order_date" : 1574294918000,
|
|
"category.keyword" : [
|
|
"Women's Accessories",
|
|
"Women's Clothing"
|
|
],
|
|
"customer_full_name.keyword" : "Pia Webb",
|
|
"taxful_total_price" : 83.0
|
|
},
|
|
{
|
|
"order_date" : 1574295782000,
|
|
"category.keyword" : [
|
|
"Women's Accessories",
|
|
"Women's Shoes"
|
|
],
|
|
"customer_full_name.keyword" : "Brigitte Graham",
|
|
"taxful_total_price" : 72.0
|
|
}
|
|
]
|
|
----
|
|
|
|
The following example provides {dfeed} and {anomaly-job} configuration
|
|
details in the API:
|
|
|
|
[source,console]
|
|
--------------------------------------------------
|
|
POST _ml/datafeeds/_preview
|
|
{
|
|
"datafeed_config": {
|
|
"indices" : [
|
|
"kibana_sample_data_ecommerce"
|
|
],
|
|
"query" : {
|
|
"bool" : {
|
|
"filter" : [
|
|
{
|
|
"term" : {
|
|
"_index" : "kibana_sample_data_ecommerce"
|
|
}
|
|
}
|
|
]
|
|
}
|
|
},
|
|
"scroll_size" : 1000
|
|
},
|
|
"job_config": {
|
|
"description" : "Find customers spending an unusually high amount in an hour",
|
|
"analysis_config" : {
|
|
"bucket_span" : "1h",
|
|
"detectors" : [
|
|
{
|
|
"detector_description" : "High total sales",
|
|
"function" : "high_sum",
|
|
"field_name" : "taxful_total_price",
|
|
"over_field_name" : "customer_full_name.keyword"
|
|
}
|
|
],
|
|
"influencers" : [
|
|
"customer_full_name.keyword",
|
|
"category.keyword"
|
|
]
|
|
},
|
|
"analysis_limits" : {
|
|
"model_memory_limit" : "10mb"
|
|
},
|
|
"data_description" : {
|
|
"time_field" : "order_date",
|
|
"time_format" : "epoch_ms"
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// TEST[skip:set up Kibana sample data]
|
|
|
|
The data that is returned for this example is as follows:
|
|
|
|
[source,console-result]
|
|
----
|
|
[
|
|
{
|
|
"order_date" : 1574294659000,
|
|
"category.keyword" : "Men's Clothing",
|
|
"customer_full_name.keyword" : "Sultan Al Benson",
|
|
"taxful_total_price" : 35.96875
|
|
},
|
|
{
|
|
"order_date" : 1574294918000,
|
|
"category.keyword" : [
|
|
"Women's Accessories",
|
|
"Women's Clothing"
|
|
],
|
|
"customer_full_name.keyword" : "Pia Webb",
|
|
"taxful_total_price" : 83.0
|
|
},
|
|
{
|
|
"order_date" : 1574295782000,
|
|
"category.keyword" : [
|
|
"Women's Accessories",
|
|
"Women's Shoes"
|
|
],
|
|
"customer_full_name.keyword" : "Brigitte Graham",
|
|
"taxful_total_price" : 72.0
|
|
}
|
|
]
|
|
----
|