[role="xpack"]
[testenv="basic"]
[[inference-processor]]
=== {infer-cap} Processor

Uses a pre-trained {dfanalytics} model to infer against the data that is being
ingested in the pipeline.

[[inference-options]]
.{infer-cap} Options
[options="header"]
|======
| Name | Required | Default | Description
| `model_id` | yes | - | (String) The ID of the model to load and infer against.
| `target_field` | no | `ml.inference.<processor_tag>` | (String) Field added to incoming documents to contain results objects.
| `field_map` | yes | - | (Object) Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.
| `inference_config` | yes | - | (Object) Contains the inference type and its options. There are two types: <<inference-processor-regression-opt,`regression`>> and <<inference-processor-classification-opt,`classification`>>.
include::common-options.asciidoc[]
|======

[source,js]
--------------------------------------------------
{
  "inference": {
    "model_id": "flight_delay_regression-1571767128603",
    "target_field": "FlightDelayMin_prediction_infer",
    "field_map": {},
    "inference_config": { "regression": {} }
  }
}
--------------------------------------------------
// NOTCONSOLE
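
A processor like the one above is usually registered as part of an ingest
pipeline. The following sketch shows one way to do that with the create or
update pipeline API; the pipeline name `flight-delay-predictions` is
illustrative:

[source,js]
--------------------------------------------------
PUT _ingest/pipeline/flight-delay-predictions
{
  "description": "Adds flight delay predictions to incoming documents",
  "processors": [
    {
      "inference": {
        "model_id": "flight_delay_regression-1571767128603",
        "target_field": "FlightDelayMin_prediction_infer",
        "field_map": {},
        "inference_config": { "regression": {} }
      }
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE

Documents indexed with the `pipeline=flight-delay-predictions` request
parameter then receive the prediction in the configured `target_field`.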

[discrete]
[[inference-processor-regression-opt]]
==== {regression-cap} configuration options

Regression configuration for inference.

`results_field`::
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=inference-config-regression-results-field]

`num_top_feature_importance_values`::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=inference-config-regression-num-top-feature-importance-values]
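
For example, a regression configuration that also reports {feat-imp} for the
two most influential features could look like the following sketch (the values
are illustrative):

[source,js]
--------------------------------------------------
{
  "inference_config": {
    "regression": {
      "results_field": "my_regression",
      "num_top_feature_importance_values": 2
    }
  }
}
--------------------------------------------------
// NOTCONSOLE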

[discrete]
[[inference-processor-classification-opt]]
==== {classification-cap} configuration options

Classification configuration for inference.

`num_top_classes`::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=inference-config-classification-num-top-classes]

`num_top_feature_importance_values`::
(Optional, integer)
include::{docdir}/ml/ml-shared.asciidoc[tag=inference-config-classification-num-top-feature-importance-values]

`results_field`::
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=inference-config-classification-results-field]

`top_classes_results_field`::
(Optional, string)
include::{docdir}/ml/ml-shared.asciidoc[tag=inference-config-classification-top-classes-results-field]
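
{feat-imp-cap} can be requested for classification in the same way as for
regression, for example (the value is illustrative):

[source,js]
--------------------------------------------------
{
  "inference_config": {
    "classification": {
      "num_top_feature_importance_values": 2
    }
  }
}
--------------------------------------------------
// NOTCONSOLE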

[discrete]
[[inference-processor-config-example]]
==== `inference_config` examples

[source,js]
--------------------------------------------------
{
  "inference_config": {
    "regression": {
      "results_field": "my_regression"
    }
  }
}
--------------------------------------------------
// NOTCONSOLE

This configuration specifies a `regression` inference and the results are
written to the `my_regression` field contained in the `target_field` results
object.

[source,js]
--------------------------------------------------
{
  "inference_config": {
    "classification": {
      "num_top_classes": 2,
      "results_field": "prediction",
      "top_classes_results_field": "probabilities"
    }
  }
}
--------------------------------------------------
// NOTCONSOLE

This configuration specifies a `classification` inference. The number of
categories for which the predicted probabilities are reported is 2
(`num_top_classes`). The result is written to the `prediction` field and the top
classes to the `probabilities` field. Both fields are contained in the
`target_field` results object.
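
For a document processed with this configuration, the results object written
under the `target_field` might look similar to the following sketch. The class
names, probability values, and the exact set of keys are illustrative and can
differ between models and versions:

[source,js]
--------------------------------------------------
{
  "ml": {
    "inference": {
      "prediction": "delayed",
      "probabilities": [
        {
          "class_name": "delayed",
          "class_probability": 0.87,
          "class_score": 0.87
        },
        {
          "class_name": "on-time",
          "class_probability": 0.13,
          "class_score": 0.13
        }
      ]
    }
  }
}
--------------------------------------------------
// NOTCONSOLE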

[discrete]
[[inference-processor-feature-importance]]
==== {feat-imp-cap} object mapping

To take full advantage of aggregating and searching for
{ml-docs}/dfa-classification.html#dfa-classification-feature-importance[{feat-imp}],
update your index mapping of the {feat-imp} result field as shown below:

[source,js]
--------------------------------------------------
"ml.inference.feature_importance": {
"type": "nested",
"dynamic": true,
"properties": {
"feature_name": {
"type": "keyword"
},
"importance": {
"type": "double"
}
}
}
--------------------------------------------------
// NOTCONSOLE
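
One way to add this mapping to an existing destination index is the put mapping
API. The index name `my-destination-index` is illustrative and the sketch
assumes the default `ml.inference` target field without a processor tag:

[source,js]
--------------------------------------------------
PUT my-destination-index/_mapping
{
  "properties": {
    "ml": {
      "properties": {
        "inference": {
          "properties": {
            "feature_importance": {
              "type": "nested",
              "dynamic": true,
              "properties": {
                "feature_name": {
                  "type": "keyword"
                },
                "importance": {
                  "type": "double"
                }
              }
            }
          }
        }
      }
    }
  }
}
--------------------------------------------------
// NOTCONSOLE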

The mapping field name for {feat-imp} is compounded as follows:

`<ml.inference.target_field>`.`<inference.tag>`.`feature_importance`

If `inference.tag` is not provided in the processor definition, it is not part
of the field path. The `<ml.inference.target_field>` defaults to `ml.inference`.

For example, if you provide a tag `foo` in the definition as you can see below:

[source,js]
--------------------------------------------------
{
  "tag": "foo",
  ...
}
--------------------------------------------------
// NOTCONSOLE

The {feat-imp} value is written to the `ml.inference.foo.feature_importance`
field.

You can also specify a target field as follows:

[source,js]
--------------------------------------------------
{
  "tag": "foo",
  "target_field": "my_field"
}
--------------------------------------------------
// NOTCONSOLE

In this case, {feat-imp} is exposed in the
`my_field.foo.feature_importance` field.