navigation_title | mapped_pages | |
---|---|---|
{{infer-cap}} bucket |
|
{{infer-cap}} bucket aggregation [search-aggregations-pipeline-inference-bucket-aggregation]
A parent pipeline aggregation which loads a pre-trained model and performs {{infer}} on the collated result fields from the parent bucket aggregation.
To use the {{infer}} bucket aggregation, you need to have the same security privileges that are required for using the get trained models API.
Syntax [inference-bucket-agg-syntax]
An `inference` aggregation looks like this in isolation:
{
  "inference": {
    "model_id": "a_model_for_inference", <1>
    "inference_config": { <2>
      "regression_config": {
        "num_top_feature_importance_values": 2
      }
    },
    "buckets_path": {
      "avg_cost": "avg_agg", <3>
      "max_cost": "max_agg"
    }
  }
}
- The unique identifier or alias for the trained model.
- The optional inference config, which overrides the model’s default settings.
- Maps the value of `avg_agg` to the model’s input field `avg_cost`.
$$$inference-bucket-params$$$
Parameter Name | Description | Required | Default Value |
---|---|---|---|
`model_id` | The ID or alias for the trained model. | Required | - |
`inference_config` | Contains the inference type and its options. There are two types: `regression` and `classification`. | Optional | - |
`buckets_path` | Defines the paths to the input aggregations and maps the aggregation names to the field names expected by the model. See `buckets_path` Syntax for more details. | Required | - |
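
As a rough illustration of the parameter table above, the following Python sketch assembles an `inference` aggregation body and rejects a call that omits either required parameter. The helper name `build_inference_agg` is hypothetical and is not part of any Elasticsearch client. It only produces the JSON body and does not contact a cluster.

```python
import json


def build_inference_agg(model_id, buckets_path, inference_config=None):
    """Return the body of an `inference` pipeline aggregation as a dict.

    Hypothetical helper for illustration; it only mirrors the parameter table.
    """
    if not model_id:
        raise ValueError("model_id is required")
    if not buckets_path:
        raise ValueError("buckets_path is required")

    body = {"model_id": model_id, "buckets_path": dict(buckets_path)}
    if inference_config is not None:
        # Optional: leave it out to keep the model's default settings.
        body["inference_config"] = inference_config
    return {"inference": body}


agg = build_inference_agg(
    model_id="a_model_for_inference",
    buckets_path={"avg_cost": "avg_agg", "max_cost": "max_agg"},
    inference_config={"regression_config": {"num_top_feature_importance_values": 2}},
)
print(json.dumps(agg, indent=2))
```

The printed body has the same structure as the syntax example, so it can be placed under an aggregation name inside the `aggs` of a parent bucket aggregation.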
Configuration options for {{infer}} models [_configuration_options_for_infer_models]
The `inference_config` setting is optional and usually isn’t required, as the pre-trained models come equipped with sensible defaults. In the context of aggregations, some options can be overridden for each of the two types of model.
Configuration options for {{regression}} models [inference-agg-regression-opt]
num_top_feature_importance_values
- (Optional, integer) Specifies the maximum number of {{feat-imp}} values per document. By default, it is zero and no {{feat-imp}} calculation occurs.
Configuration options for {{classification}} models [inference-agg-classification-opt]
num_top_classes
- (Optional, integer) Specifies the number of top class predictions to return. Defaults to 0.
num_top_feature_importance_values
- (Optional, integer) Specifies the maximum number of {{feat-imp}} values per document. Defaults to 0 which means no {{feat-imp}} calculation occurs.
prediction_field_type
- (Optional, string) Specifies the type of the predicted field to write. Valid values are: `string`, `number`, `boolean`. When `boolean` is provided, `1.0` is transformed to `true` and `0.0` to `false`.
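
To make the classification options concrete, here is a minimal Python sketch that collects the options set by the caller and checks `prediction_field_type` against the three valid values listed above. The function name `classification_options` and the client-side check are assumptions for illustration; Elasticsearch performs its own validation when the request is sent. The wrapper key that nests these options under `inference_config` is deliberately left out, because this page only shows the regression form.

```python
VALID_PREDICTION_FIELD_TYPES = {"string", "number", "boolean"}


def classification_options(num_top_classes=None,
                           num_top_feature_importance_values=None,
                           prediction_field_type=None):
    """Collect only the classification options that were set explicitly.

    Hypothetical helper: options left as None are omitted, so the
    documented defaults apply.
    """
    options = {}
    if num_top_classes is not None:
        options["num_top_classes"] = int(num_top_classes)
    if num_top_feature_importance_values is not None:
        options["num_top_feature_importance_values"] = int(
            num_top_feature_importance_values
        )
    if prediction_field_type is not None:
        if prediction_field_type not in VALID_PREDICTION_FIELD_TYPES:
            raise ValueError(
                f"invalid prediction_field_type: {prediction_field_type!r}"
            )
        options["prediction_field_type"] = prediction_field_type
    return options


options = classification_options(num_top_classes=2, prediction_field_type="boolean")
print(options)
```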
Example [inference-bucket-agg-example]
The following snippet aggregates a web log by `client_ip` and extracts a number of features via metric and bucket sub-aggregations as input to the {{infer}} aggregation configured with a model trained to identify suspicious client IPs:
GET kibana_sample_data_logs/_search
{
  "size": 0,
  "aggs": {
    "client_ip": { <1>
      "composite": {
        "sources": [
          {
            "client_ip": {
              "terms": {
                "field": "clientip"
              }
            }
          }
        ]
      },
      "aggs": { <2>
        "url_dc": {
          "cardinality": {
            "field": "url.keyword"
          }
        },
        "bytes_sum": {
          "sum": {
            "field": "bytes"
          }
        },
        "geo_src_dc": {
          "cardinality": {
            "field": "geo.src"
          }
        },
        "geo_dest_dc": {
          "cardinality": {
            "field": "geo.dest"
          }
        },
        "responses_total": {
          "value_count": {
            "field": "timestamp"
          }
        },
        "success": {
          "filter": {
            "term": {
              "response": "200"
            }
          }
        },
        "error404": {
          "filter": {
            "term": {
              "response": "404"
            }
          }
        },
        "error503": {
          "filter": {
            "term": {
              "response": "503"
            }
          }
        },
        "malicious_client_ip": { <3>
          "inference": {
            "model_id": "malicious_clients_model",
            "buckets_path": {
              "response_count": "responses_total",
              "url_dc": "url_dc",
              "bytes_sum": "bytes_sum",
              "geo_src_dc": "geo_src_dc",
              "geo_dest_dc": "geo_dest_dc",
              "success": "success._count",
              "error404": "error404._count",
              "error503": "error503._count"
            }
          }
        }
      }
    }
  }
}
- A composite bucket aggregation that aggregates the data by `client_ip`.
- A series of metric and bucket sub-aggregations.
- {{infer-cap}} bucket aggregation that specifies the trained model and maps the aggregation names to the model’s input fields.
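
Because every `buckets_path` entry in the example points at a sibling sub-aggregation, a quick client-side check can catch a misspelled or missing name before the request is sent. The sketch below is one possible way to do that in Python. The function `unresolved_bucket_paths` is hypothetical, and it deliberately simplifies path parsing: it checks only the name before the first `>`, `.` or `[`. The `sub_aggs` dict is a trimmed copy of the example in which `error404` is left undefined on purpose.

```python
import re


def unresolved_bucket_paths(sub_aggs, inference_name):
    """Return buckets_path entries whose first element is not a sibling aggregation.

    Simplified on purpose: only the name before the first `>`, `.` or `[`
    is compared against the sibling aggregation names.
    """
    buckets_path = sub_aggs[inference_name]["inference"]["buckets_path"]
    siblings = set(sub_aggs) - {inference_name}
    missing = {}
    for model_field, path in buckets_path.items():
        first = re.split(r"[>.\[]", path, maxsplit=1)[0]
        if first not in siblings:
            missing[model_field] = path
    return missing


# Trimmed copy of the example's sub-aggregations; `error404` is not defined here.
sub_aggs = {
    "responses_total": {"value_count": {"field": "timestamp"}},
    "success": {"filter": {"term": {"response": "200"}}},
    "malicious_client_ip": {
        "inference": {
            "model_id": "malicious_clients_model",
            "buckets_path": {
                "response_count": "responses_total",
                "success": "success._count",
                "error404": "error404._count",
            },
        }
    },
}
print(unresolved_bucket_paths(sub_aggs, "malicious_client_ip"))
# → {'error404': 'error404._count'}
```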