elasticsearch/docs/reference/ml/anomaly-detection/apis/preview-datafeed.asciidoc
Benjamin Trent b796632582
[ML] Allow datafeed and job configs for datafeed preview API (#70836)
Previously, a datafeed and job must already exist for the `_preview` API to work.

With this change, users can get an accurate preview of the data that will be sent to the anomaly detection job
without creating either of them. 

closes https://github.com/elastic/elasticsearch/issues/70264
2021-03-26 12:52:23 -04:00

200 lines
5.6 KiB
Text

[role="xpack"]
[testenv="platinum"]
[[ml-preview-datafeed]]
= Preview {dfeeds} API
[subs="attributes"]
++++
<titleabbrev>Preview {dfeeds}</titleabbrev>
++++
Previews a {dfeed}.
[[ml-preview-datafeed-request]]
== {api-request-title}
`GET _ml/datafeeds/<datafeed_id>/_preview` +
`POST _ml/datafeeds/<datafeed_id>/_preview` +
`GET _ml/datafeeds/_preview` +
`POST _ml/datafeeds/_preview`
[[ml-preview-datafeed-prereqs]]
== {api-prereq-title}
* If {es} {security-features} are enabled, you must have `monitor_ml`, `monitor`,
`manage_ml`, or `manage` cluster privileges to use this API. See
<<security-privileges>> and {ml-docs-setup-privileges}.
[[ml-preview-datafeed-desc]]
== {api-description-title}
The preview {dfeeds} API returns the first "page" of search results from a
{dfeed}. You can preview an existing {dfeed} or provide configuration details
for the {dfeed} and {anomaly-job} in the API. The preview shows the structure of
the data that will be passed to the anomaly detection engine.
IMPORTANT: When {es} {security-features} are enabled, the {dfeed} query is
previewed using the credentials of the user calling the preview {dfeed} API.
When the {dfeed} is started it runs the query using the roles of the last user
to create or update it. If the two sets of roles differ then the preview may
not accurately reflect what the {dfeed} will return when started. To avoid
such problems, the same user that creates or updates the {dfeed} should preview
it to ensure it is returning the expected data. Alternatively, use
<<http-clients-secondary-authorization,secondary authorization headers>> to
supply the credentials.
[[ml-preview-datafeed-path-parms]]
== {api-path-parms-title}
`<datafeed_id>`::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=datafeed-id]
+
NOTE: If you provide the `<datafeed_id>` as a path parameter, you cannot
provide {dfeed} or {anomaly-job} configuration details in the request body.
[[ml-preview-datafeed-request-body]]
== {api-request-body-title}
`datafeed_config`::
(Optional, object) The {dfeed} definition to preview. For valid definitions, see
the <<ml-put-datafeed-request-body,create {dfeeds} API>>.
`job_config`::
(Optional, object) The configuration details for the {anomaly-job} that is
associated with the {dfeed}. If the `datafeed_config` object does not include a
`job_id` that references an existing {anomaly-job}, you must supply this
`job_config` object. If you include both a `job_id` and a `job_config`, the
latter information is used. You cannot specify a `job_config` object unless you also supply a `datafeed_config` object. For valid definitions, see the
<<ml-put-job-request-body,create {anomaly-jobs} API>>.
[[ml-preview-datafeed-example]]
== {api-examples-title}
This is an example of providing the ID of an existing {dfeed}:
[source,console]
--------------------------------------------------
GET _ml/datafeeds/datafeed-high_sum_total_sales/_preview
--------------------------------------------------
// TEST[skip:set up Kibana sample data]
The data that is returned for this example is as follows:
[source,console-result]
----
[
{
"order_date" : 1574294659000,
"category.keyword" : "Men's Clothing",
"customer_full_name.keyword" : "Sultan Al Benson",
"taxful_total_price" : 35.96875
},
{
"order_date" : 1574294918000,
"category.keyword" : [
"Women's Accessories",
"Women's Clothing"
],
"customer_full_name.keyword" : "Pia Webb",
"taxful_total_price" : 83.0
},
{
"order_date" : 1574295782000,
"category.keyword" : [
"Women's Accessories",
"Women's Shoes"
],
"customer_full_name.keyword" : "Brigitte Graham",
"taxful_total_price" : 72.0
}
]
----
The following example provides {dfeed} and {anomaly-job} configuration
details in the API:
[source,console]
--------------------------------------------------
POST _ml/datafeeds/_preview
{
"datafeed_config": {
"indices" : [
"kibana_sample_data_ecommerce"
],
"query" : {
"bool" : {
"filter" : [
{
"term" : {
"_index" : "kibana_sample_data_ecommerce"
}
}
]
}
},
"scroll_size" : 1000
},
"job_config": {
"description" : "Find customers spending an unusually high amount in an hour",
"analysis_config" : {
"bucket_span" : "1h",
"detectors" : [
{
"detector_description" : "High total sales",
"function" : "high_sum",
"field_name" : "taxful_total_price",
"over_field_name" : "customer_full_name.keyword"
}
],
"influencers" : [
"customer_full_name.keyword",
"category.keyword"
]
},
"analysis_limits" : {
"model_memory_limit" : "10mb"
},
"data_description" : {
"time_field" : "order_date",
"time_format" : "epoch_ms"
}
}
}
--------------------------------------------------
// TEST[skip:set up Kibana sample data]
The data that is returned for this example is as follows:
[source,console-result]
----
[
{
"order_date" : 1574294659000,
"category.keyword" : "Men's Clothing",
"customer_full_name.keyword" : "Sultan Al Benson",
"taxful_total_price" : 35.96875
},
{
"order_date" : 1574294918000,
"category.keyword" : [
"Women's Accessories",
"Women's Clothing"
],
"customer_full_name.keyword" : "Pia Webb",
"taxful_total_price" : 83.0
},
{
"order_date" : 1574295782000,
"category.keyword" : [
"Women's Accessories",
"Women's Shoes"
],
"customer_full_name.keyword" : "Brigitte Graham",
"taxful_total_price" : 72.0
}
]
----