[role="xpack"]
[[put-trained-models]]
= Create trained models API
[subs="attributes"]
++++
<titleabbrev>Create trained models</titleabbrev>
++++

Creates a trained model.

WARNING: Models created in version 7.8.0 are not backwards compatible
         with older node versions. If in a mixed cluster environment,
         all nodes must be at least 7.8.0 to use a model stored by
         a 7.8.0 node.


[[ml-put-trained-models-request]]
== {api-request-title}

`PUT _ml/trained_models/<model_id>`


[[ml-put-trained-models-prereq]]
== {api-prereq-title}

Requires the `manage_ml` cluster privilege. This privilege is included in the
`machine_learning_admin` built-in role.


[[ml-put-trained-models-desc]]
== {api-description-title}

The create trained model API enables you to supply a trained model that is not
created by {dfanalytics}.

[[ml-put-trained-models-path-params]]
== {api-path-parms-title}

`<model_id>`::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

[[ml-put-trained-models-query-params]]
== {api-query-parms-title}

`defer_definition_decompression`::
(Optional, boolean)
If set to `true` and a `compressed_definition` is provided, the request defers
definition decompression and skips relevant validations.
This deferral is useful for systems or users that know a good byte size estimate for their
model and know that their model is valid and likely won't fail during inference.


[role="child_attributes"]
[[ml-put-trained-models-request-body]]
== {api-request-body-title}

`compressed_definition`::
(Required, string)
The compressed (GZipped and Base64 encoded) {infer} definition of the model.
If `compressed_definition` is specified, then `definition` cannot be specified.

//Begin definition
`definition`::
(Required, object)
The {infer} definition for the model. If `definition` is specified, then
`compressed_definition` cannot be specified.
+
.Properties of `definition`
[%collapsible%open]
====
//Begin preprocessors
`preprocessors`::
(Optional, object)
Collection of preprocessors. See <<ml-put-trained-models-preprocessor-example>>.
+
.Properties of `preprocessors`
[%collapsible%open]
=====
//Begin frequency encoding
`frequency_encoding`::
(Required, object)
Defines a frequency encoding for a field.
+
.Properties of `frequency_encoding`
[%collapsible%open]
======
`feature_name`::
(Required, string)
The name of the resulting feature.

`field`::
(Required, string)
The field name to encode.

`frequency_map`::
(Required, object map of string:double)
Object that maps the field value to the frequency encoded value.

`custom`::
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=custom-preprocessor]

======
//End frequency encoding

//Begin one hot encoding
`one_hot_encoding`::
(Required, object)
Defines a one hot encoding map for a field.
+
.Properties of `one_hot_encoding`
[%collapsible%open]
======
`field`::
(Required, string)
The field name to encode.

`hot_map`::
(Required, object map of strings)
String map of "field_value: one_hot_column_name".

`custom`::
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=custom-preprocessor]

======
//End one hot encoding

//Begin target mean encoding
`target_mean_encoding`::
(Required, object)
Defines a target mean encoding for a field.
+
.Properties of `target_mean_encoding`
[%collapsible%open]
======
`default_value`:::
(Required, double)
The feature value if the field value is not in the `target_map`.

`feature_name`:::
(Required, string)
The name of the resulting feature.

`field`:::
(Required, string)
The field name to encode.

`target_map`:::
(Required, object map of string:double)
Object that maps the field value to the target mean value.

`custom`::
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=custom-preprocessor]

======
//End target mean encoding
=====
//End preprocessors

//Begin trained model
`trained_model`::
(Required, object)
The definition of the trained model.
+
.Properties of `trained_model`
[%collapsible%open]
=====
//Begin tree
`tree`::
(Required, object)
The definition for a binary decision tree.
+
.Properties of `tree`
[%collapsible%open]
======
`classification_labels`:::
(Optional, string) An array of classification labels (used for
`classification`).

`feature_names`:::
(Required, string)
Features expected by the tree, in their expected order.

`target_type`:::
(Required, string)
String indicating the model target type; `regression` or `classification`.

`tree_structure`:::
(Required, object)
An array of `tree_node` objects. The nodes must be in ordinal order by their
`tree_node.node_index` value.
======
//End tree

//Begin tree node
`tree_node`::
(Required, object)
The definition of a node in a tree.
+
--
There are two major types of nodes: leaf nodes and not-leaf nodes.

* Leaf nodes only need `node_index` and `leaf_value` defined.
* All other nodes need `split_feature`, `left_child`, `right_child`,
  `threshold`, `decision_type`, and `default_left` defined.
--
+
.Properties of `tree_node`
[%collapsible%open]
======
`decision_type`::
(Optional, string)
Indicates the positive value (in other words, when to choose the left node)
decision type. Supported `lt`, `lte`, `gt`, `gte`. Defaults to `lte`.

`default_left`::
(Optional, Boolean)
Indicates whether to default to the left when the feature is missing. Defaults
to `true`.

`leaf_value`::
(Optional, double)
The leaf value of the of the node, if the value is a leaf (in other words, no
children).

`left_child`::
(Optional, integer)
The index of the left child.

`node_index`::
(Integer)
The index of the current node.

`right_child`::
(Optional, integer)
The index of the right child.

`split_feature`::
(Optional, integer)
The index of the feature value in the feature array.

`split_gain`::
(Optional, double) The information gain from the split.

`threshold`::
(Optional, double)
The decision threshold with which to compare the feature value.
======
//End tree node

//Begin ensemble
`ensemble`::
(Optional, object)
The definition for an ensemble model. See <<ml-put-trained-models-model-example>>.
+
.Properties of `ensemble`
[%collapsible%open]
======
//Begin aggregate output
`aggregate_output`::
(Required, object)
An aggregated output object that defines how to aggregate the outputs of the
`trained_models`. Supported objects are `weighted_mode`, `weighted_sum`, and
`logistic_regression`. See <<ml-put-trained-models-aggregated-output-example>>.
+
.Properties of `aggregate_output`
[%collapsible%open]
=======
//Begin logistic regression
`logistic_regression`::
(Optional, object)
This `aggregated_output` type works with binary classification (classification
for values [0, 1]). It multiplies the outputs (in the case of the `ensemble`
model, the inference model values) by the supplied `weights`. The resulting
vector is summed and passed to a
{wikipedia}/Sigmoid_function[`sigmoid` function]. The result
of the `sigmoid` function is considered the probability of class 1 (`P_1`),
consequently, the probability of class 0 is `1 - P_1`. The class with the
highest probability (either 0 or 1) is then returned. For more information about
logistic regression, see
{wikipedia}/Logistic_regression[this wiki article].
+
.Properties of `logistic_regression`
[%collapsible%open]
========
`weights`:::
(Required, double)
The weights to multiply by the input values (the inference values of the trained
models).
========
//End logistic regression

//Begin weighted sum
`weighted_sum`::
(Optional, object)
This `aggregated_output` type works with regression. The weighted sum of the
input values.
+
.Properties of `weighted_sum`
[%collapsible%open]
========
`weights`:::
(Required, double)
The weights to multiply by the input values (the inference values of the trained
models).
========
//End weighted sum

//Begin weighted mode
`weighted_mode`::
(Optional, object)
This `aggregated_output` type works with regression or classification. It takes
a weighted vote of the input values. The most common input value (taking the
weights into account) is returned.
+
.Properties of `weighted_mode`
[%collapsible%open]
========
`weights`:::
(Required, double)
The weights to multiply by the input values (the inference values of the trained
models).
========
//End weighted mode

//Begin exponent
`exponent`::
(Optional, object)
This `aggregated_output` type works with regression. It takes a weighted sum of
the input values and passes the result to an exponent function
(`e^x` where `x` is the sum of the weighted values).
+
.Properties of `exponent`
[%collapsible%open]
========
`weights`:::
(Required, double)
The weights to multiply by the input values (the inference values of the trained
models).
========
//End exponent
=======
//End aggregate output

`classification_labels`::
(Optional, string)
An array of classification labels.

`feature_names`::
(Optional, string)
Features expected by the ensemble, in their expected order.

`target_type`::
(Required, string)
String indicating the model target type; `regression` or `classification.`

`trained_models`::
(Required, object)
An array of `trained_model` objects. Supported trained models are `tree` and
`ensemble`.
======
//End ensemble

=====
//End trained model

====
//End definition

`description`::
(Optional, string)
A human-readable description of the {infer} trained model.

`estimated_heap_memory_usage_bytes`::
(Optional, integer) deprecated:[7.16.0,Replaced by `model_size_bytes`]

`estimated_operations`::
(Optional, integer)
The estimated number of operations to use the trained model during inference.
This property is supported only if `defer_definition_decompression` is `true` or
the model definition is not supplied.

//Begin inference_config
`inference_config`::
(Required, object)
The default configuration for inference. This can be: `regression`,
`classification`, `fill_mask`, `ner`, `question_answering`,
`text_classification`, `text_embedding` or `zero_shot_classification`.
If `regression` or `classification`, it must match the `target_type` of the
underlying `definition.trained_model`. If `fill_mask`, `ner`,
`question_answering`, `text_classification`, or `text_embedding`; the
`model_type` must be `pytorch`.
+
.Properties of `inference_config`
[%collapsible%open]
====
`classification`:::
(Optional, object)
Classification configuration for inference.
+
.Properties of classification inference
[%collapsible%open]
=====
`num_top_classes`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-classification-num-top-classes]

`num_top_feature_importance_values`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-classification-num-top-feature-importance-values]

`prediction_field_type`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-classification-prediction-field-type]

`results_field`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-results-field]

`top_classes_results_field`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-classification-top-classes-results-field]
=====

`fill_mask`:::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-fill-mask]
+
.Properties of fill_mask inference
[%collapsible%open]
=====
`num_top_classes`::::
(Optional, integer)
Number of top predicted tokens to return for replacing the mask token. Defaults to `0`.

`results_field`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-results-field]

`tokenization`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization]
+
.Properties of tokenization
[%collapsible%open]
======
`bert`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert]
+
.Properties of bert
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert-with-special-tokens]
=======
`roberta`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
+
.Properties of roberta
[%collapsible%open]
=======
`add_prefix_space`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-add-prefix-space]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-with-special-tokens]
=======
`mpnet`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet]
+
.Properties of mpnet
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet-with-special-tokens]
=======
======
=====

`ner`:::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-ner]
+
.Properties of ner inference
[%collapsible%open]
=====
`classification_labels`::::
(Optional, string)
An array of classification labels. NER only supports Inside-Outside-Beginning
labels (IOB) and only persons, organizations, locations, and miscellaneous.
Example: ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC",
"I-MISC"]

`results_field`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-results-field]

`tokenization`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization]
+
.Properties of tokenization
[%collapsible%open]
======
`bert`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert]
+
.Properties of bert
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert-with-special-tokens]
=======
`roberta`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
+
.Properties of roberta
[%collapsible%open]
=======
`add_prefix_space`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-add-prefix-space]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-with-special-tokens]
=======
`mpnet`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet]
+
.Properties of mpnet
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet-with-special-tokens]
=======
======
=====

`pass_through`:::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-pass-through]
+
.Properties of pass_through inference
[%collapsible%open]
=====
`results_field`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-results-field]

`tokenization`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization]
+
.Properties of tokenization
[%collapsible%open]
======
`bert`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert]
+
.Properties of bert
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert-with-special-tokens]
=======
`roberta`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
+
.Properties of roberta
[%collapsible%open]
=======
`add_prefix_space`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-add-prefix-space]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-with-special-tokens]
=======
`mpnet`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet]
+
.Properties of mpnet
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet-with-special-tokens]
=======
======
=====

`question_answering`:::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-question-answering]
+
.Properties of question_answering inference
[%collapsible%open]
=====
`max_answer_length`::::
(Optional, integer)
The maximum amount of words in the answer. Defaults to `15`.

`results_field`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-results-field]

`tokenization`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization]
+
Recommended to set `max_sentence_length` to `386` with `128` of `span` and set
`truncate` to `none`.
+
.Properties of tokenization
[%collapsible%open]
======
`bert`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert]
+
.Properties of bert
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`span`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-span]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert-with-special-tokens]
=======
`roberta`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
+
.Properties of roberta
[%collapsible%open]
=======
`add_prefix_space`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-add-prefix-space]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`span`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-span]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-with-special-tokens]
=======
`mpnet`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet]
+
.Properties of mpnet
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`span`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-span]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet-with-special-tokens]
=======
======
=====

`regression`:::
(Optional, object)
Regression configuration for inference.
+
.Properties of regression inference
[%collapsible%open]
=====
`num_top_feature_importance_values`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-regression-num-top-feature-importance-values]

`results_field`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-results-field]
=====

`text_classification`:::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-text-classification]
+
.Properties of text_classification inference
[%collapsible%open]
=====
`classification_labels`::::
(Optional, string) An array of classification labels.

`num_top_classes`::::
(Optional, integer)
Specifies the number of top class predictions to return. Defaults to all classes (-1).

`results_field`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-results-field]

`tokenization`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization]
+
.Properties of tokenization
[%collapsible%open]
======
`bert`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert]
+
.Properties of bert
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`span`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-span]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert-with-special-tokens]
=======
`roberta`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
+
.Properties of roberta
[%collapsible%open]
=======
`add_prefix_space`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-add-prefix-space]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`span`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-span]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-with-special-tokens]
=======
`mpnet`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet]
+
.Properties of mpnet
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet-with-special-tokens]
=======
======
=====
`text_embedding`:::
(Object, optional)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-text-embedding]
+
.Properties of text_embedding inference
[%collapsible%open]
=====
`results_field`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-results-field]

`tokenization`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization]
+
.Properties of tokenization
[%collapsible%open]
======
`bert`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert]
+
.Properties of bert
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert-with-special-tokens]
=======
`roberta`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
+
.Properties of roberta
[%collapsible%open]
=======
`add_prefix_space`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-add-prefix-space]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-with-special-tokens]
=======
`mpnet`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet]
+
.Properties of mpnet
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet-with-special-tokens]
=======
======
=====
`text_similarity`::::
(Object, optional)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-text-similarity]
+
.Properties of text_similarity inference
[%collapsible%open]
=====
`span_score_combination_function`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-text-similarity-span-score-func]

`tokenization`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization]
+
.Properties of tokenization
[%collapsible%open]
======
`bert`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert]
+
.Properties of bert
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`span`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-span]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert-with-special-tokens]
=======
`roberta`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
+
.Properties of roberta
[%collapsible%open]
=======
`add_prefix_space`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-add-prefix-space]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`span`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-span]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-with-special-tokens]
=======
`mpnet`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet]
+
.Properties of mpnet
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`span`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-span]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet-with-special-tokens]
=======
======
=====
`zero_shot_classification`:::
(Object, optional)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-zero-shot-classification]
+
.Properties of zero_shot_classification inference
[%collapsible%open]
=====
`classification_labels`::::
(Required, array)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-zero-shot-classification-classification-labels]

`hypothesis_template`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-zero-shot-classification-hypothesis-template]

`labels`::::
(Optional, array)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-zero-shot-classification-labels]

`multi_label`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-zero-shot-classification-multi-label]

`results_field`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-results-field]

`tokenization`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization]
+
.Properties of tokenization
[%collapsible%open]
======
`bert`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert]
+
.Properties of bert
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-bert-with-special-tokens]
=======
`roberta`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta]
+
.Properties of roberta
[%collapsible%open]
=======
`add_prefix_space`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-add-prefix-space]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-roberta-with-special-tokens]
=======
`mpnet`::::
(Optional, object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet]
+
.Properties of mpnet
[%collapsible%open]
=======
`do_lower_case`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-do-lower-case]

`max_sequence_length`::::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-max-sequence-length]

`truncate`::::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-truncate]

`with_special_tokens`::::
(Optional, boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=inference-config-nlp-tokenization-mpnet-with-special-tokens]
=======
======
=====
====
//End of inference_config

//Begin input
`input`::
(Required, object)
The input field names for the model definition.
+
.Properties of `input`
[%collapsible%open]
====
`field_names`:::
(Required, string)
An array of input field names for the model.
====
//End input

// Begin location
`location`::
(Optional, object)
The model definition location. If the `definition` or `compressed_definition`
are not specified, the `location` is required.
+
.Properties of `location`
[%collapsible%open]
====
`index`:::
(Required, object)
Indicates that the model definition is stored in an index. This object must be
empty as the index for storing model definitions is configured automatically.
====
// End location

`metadata`::
(Optional, object)
An object map that contains metadata about the model.

`model_size_bytes`::
(Optional, integer)
The estimated memory usage in bytes to keep the trained model in memory. This
property is supported only if `defer_definition_decompression` is `true` or the
model definition is not supplied.

`model_type`::
(Optional, string)
The created model type. By default the model type is `tree_ensemble`.
Appropriate types are:
+
--
* `tree_ensemble`: The model definition is an ensemble model of decision trees.
* `lang_ident`: A special type reserved for language identification models.
* `pytorch`: The stored definition is a PyTorch (specifically a TorchScript) model. Currently only
NLP models are supported. For more information, refer to {ml-docs}/ml-nlp.html[{nlp-cap}].
--

`tags`::
(Optional, string)
An array of tags to organize the model.


[[ml-put-trained-models-example]]
== {api-examples-title}

[[ml-put-trained-models-preprocessor-example]]
=== Preprocessor examples

The example below shows a `frequency_encoding` preprocessor object:

[source,js]
----------------------------------
{
   "frequency_encoding":{
      "field":"FlightDelayType",
      "feature_name":"FlightDelayType_frequency",
      "frequency_map":{
         "Carrier Delay":0.6007414737092798,
         "NAS Delay":0.6007414737092798,
         "Weather Delay":0.024573576178086153,
         "Security Delay":0.02476631010889467,
         "No Delay":0.6007414737092798,
         "Late Aircraft Delay":0.6007414737092798
      }
   }
}
----------------------------------
//NOTCONSOLE


The next example shows a `one_hot_encoding` preprocessor object:

[source,js]
----------------------------------
{
   "one_hot_encoding":{
      "field":"FlightDelayType",
      "hot_map":{
         "Carrier Delay":"FlightDelayType_Carrier Delay",
         "NAS Delay":"FlightDelayType_NAS Delay",
         "No Delay":"FlightDelayType_No Delay",
         "Late Aircraft Delay":"FlightDelayType_Late Aircraft Delay"
      }
   }
}
----------------------------------
//NOTCONSOLE


This example shows a `target_mean_encoding` preprocessor object:

[source,js]
----------------------------------
{
   "target_mean_encoding":{
      "field":"FlightDelayType",
      "feature_name":"FlightDelayType_targetmean",
      "target_map":{
         "Carrier Delay":39.97465788139886,
         "NAS Delay":39.97465788139886,
         "Security Delay":203.171206225681,
         "Weather Delay":187.64705882352948,
         "No Delay":39.97465788139886,
         "Late Aircraft Delay":39.97465788139886
      },
      "default_value":158.17995752420433
   }
}
----------------------------------
//NOTCONSOLE


[[ml-put-trained-models-model-example]]
=== Model examples

The first example shows a `trained_model` object:

[source,js]
----------------------------------
{
   "tree":{
      "feature_names":[
         "DistanceKilometers",
         "FlightTimeMin",
         "FlightDelayType_NAS Delay",
         "Origin_targetmean",
         "DestRegion_targetmean",
         "DestCityName_targetmean",
         "OriginAirportID_targetmean",
         "OriginCityName_frequency",
         "DistanceMiles",
         "FlightDelayType_Late Aircraft Delay"
      ],
      "tree_structure":[
         {
            "decision_type":"lt",
            "threshold":9069.33437193022,
            "split_feature":0,
            "split_gain":4112.094574306927,
            "node_index":0,
            "default_left":true,
            "left_child":1,
            "right_child":2
         },
         ...
         {
            "node_index":9,
            "leaf_value":-27.68987349695448
         },
         ...
      ],
      "target_type":"regression"
   }
}
----------------------------------
//NOTCONSOLE


The following example shows an `ensemble` model object:

[source,js]
----------------------------------
"ensemble":{
   "feature_names":[
      ...
   ],
   "trained_models":[
      {
         "tree":{
            "feature_names":[],
            "tree_structure":[
               {
                  "decision_type":"lte",
                  "node_index":0,
                  "leaf_value":47.64069875778043,
                  "default_left":false
               }
            ],
            "target_type":"regression"
         }
      },
      ...
   ],
   "aggregate_output":{
      "weighted_sum":{
         "weights":[
            ...
         ]
      }
   },
   "target_type":"regression"
}
----------------------------------
//NOTCONSOLE


[[ml-put-trained-models-aggregated-output-example]]
=== Aggregated output example

Example of a `logistic_regression` object:

[source,js]
----------------------------------
"aggregate_output" : {
  "logistic_regression" : {
    "weights" : [2.0, 1.0, .5, -1.0, 5.0, 1.0, 1.0]
  }
}
----------------------------------
//NOTCONSOLE


Example of a `weighted_sum` object:

[source,js]
----------------------------------
"aggregate_output" : {
  "weighted_sum" : {
    "weights" : [1.0, -1.0, .5, 1.0, 5.0]
  }
}
----------------------------------
//NOTCONSOLE


Example of a `weighted_mode` object:

[source,js]
----------------------------------
"aggregate_output" : {
  "weighted_mode" : {
    "weights" : [1.0, 1.0, 1.0, 1.0, 1.0]
  }
}
----------------------------------
//NOTCONSOLE


Example of an `exponent` object:

[source,js]
----------------------------------
"aggregate_output" : {
  "exponent" : {
    "weights" : [1.0, 1.0, 1.0, 1.0, 1.0]
  }
}
----------------------------------
//NOTCONSOLE


[[ml-put-trained-models-json-schema]]
=== Trained models JSON schema

For the full JSON schema of trained models,
https://github.com/elastic/ml-json-schemas[click here].