[role="xpack"]
[[get-trained-models-stats]]
= Get trained models statistics API
[subs="attributes"]
++++
Get trained models stats
++++
Retrieves usage information for trained models.
[[ml-get-trained-models-stats-request]]
== {api-request-title}
`GET _ml/trained_models/_stats` +
`GET _ml/trained_models/_all/_stats` +
`GET _ml/trained_models/<model_id>/_stats` +
`GET _ml/trained_models/<model_id>,<model_id_2>/_stats` +
`GET _ml/trained_models/<model_id_pattern*>,<model_id_2>/_stats`

[[ml-get-trained-models-stats-prereq]]
== {api-prereq-title}

Requires the `monitor_ml` cluster privilege. This privilege is included in the
`machine_learning_user` built-in role.
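
For example, if you prefer a dedicated role over the built-in one, a minimal
sketch might look like the following (the role name is illustrative):

[source,console]
----
POST /_security/role/trained_models_viewer
{
  "cluster": [ "monitor_ml" ]
}
----
// TEST[skip:illustrative role definition]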

[[ml-get-trained-models-stats-desc]]
== {api-description-title}

You can get usage information for multiple trained models in a single API
request by using a comma-separated list of model IDs or a wildcard expression.
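
For example, the following request combines a wildcard expression with a
specific model ID (both IDs are hypothetical):

[source,console]
----
GET _ml/trained_models/flight-delay*,regression-job-one-1574775307356/_stats
----
// TEST[skip:hypothetical model IDs]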

[[ml-get-trained-models-stats-path-params]]
== {api-path-parms-title}

`<model_id>`::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id-or-alias]

[[ml-get-trained-models-stats-query-params]]
== {api-query-parms-title}

`allow_no_match`::
(Optional, Boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=allow-no-match-models]

`from`::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=from-models]

`size`::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=size-models]
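
For example, the following sketch pages through the stats, skipping the first
five matching models and returning stats for the next ten:

[source,console]
----
GET _ml/trained_models/_all/_stats?from=5&size=10
----
// TEST[skip:TBD]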
[role="child_attributes"]
[[ml-get-trained-models-stats-results]]
== {api-response-body-title}
`count`::
(integer)
The total number of trained model statistics that matched the requested ID
patterns. This value can be higher than the number of items in the
`trained_model_stats` array because the size of the array is restricted by the
supplied `size` parameter.
`trained_model_stats`::
(array)
An array of trained model statistics, which are sorted by the `model_id` value
in ascending order.
+
.Properties of trained model stats
[%collapsible%open]
====
`deployment_stats`:::
(list)
A collection of deployment stats if one of the provided `model_id` values
is deployed.
+
.Properties of deployment stats
[%collapsible%open]
=====
`allocation_status`:::
(object)
The detailed allocation status given the deployment configuration.
+
.Properties of allocation stats
[%collapsible%open]
======
`allocation_count`:::
(integer)
The current number of nodes where the model is allocated.
`cache_size`:::
(<<byte-units,byte value>>)
The inference cache size (in memory outside the JVM heap) per node for the model.
`state`:::
(string)
The detailed allocation state related to the nodes.
+
--
* `starting`: Allocations are being attempted but no node currently has the model allocated.
* `started`: At least one node has the model allocated.
* `fully_allocated`: The deployment is fully allocated and satisfies the `target_allocation_count`.
--
`target_allocation_count`:::
(integer)
The desired number of nodes for model allocation.
======
`error_count`:::
(integer)
The sum of `error_count` for all nodes in the deployment.
`inference_count`:::
(integer)
The sum of `inference_count` for all nodes in the deployment.
`model_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]
`nodes`:::
(array of objects)
The deployment stats for each node that currently has the model allocated.
+
.Properties of node stats
[%collapsible%open]
======
`average_inference_time_ms`:::
(double)
The average time for each inference call to complete on this node.
The average is calculated over the lifetime of the deployment.
`average_inference_time_ms_last_minute`:::
(double)
The average time for each inference call to complete on this node
in the last minute.
`error_count`:::
(integer)
The number of errors when evaluating the trained model.
`inference_cache_hit_count`:::
(integer)
The total number of inference calls made against this node for this
model that were served from the inference cache.
`inference_cache_hit_count_last_minute`:::
(integer)
The number of inference calls made against this node for this model
in the last minute that were served from the inference cache.
`inference_count`:::
(integer)
The total number of inference calls made against this node for this model.
`last_access`:::
(long)
The epoch timestamp of the last inference call for the model on this node.
`node`:::
(object)
Information pertaining to the node.
+
.Properties of node
[%collapsible%open]
========
`attributes`:::
(object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-attributes]
`ephemeral_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-ephemeral-id]
`id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-id]
`name`:::
(string) The node name.
`transport_address`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-transport-address]
========
`number_of_allocations`:::
(integer)
The number of allocations assigned to this node.
`number_of_pending_requests`:::
(integer)
The number of inference requests queued to be processed.
`peak_throughput_per_minute`:::
(integer)
The peak number of requests processed in a 1 minute period.
`routing_state`:::
(object)
The current routing state and reason for the current routing state for this allocation.
+
.Properties of routing_state
[%collapsible%open]
========
`reason`:::
(string)
The reason for the current state. Usually only populated when the `routing_state` is `failed`.
`routing_state`:::
(string)
The current routing state.
+
--
* `starting`: The model is attempting to allocate on this node; inference calls are not yet accepted.
* `started`: The model is allocated and ready to accept inference requests.
* `stopping`: The model is being deallocated from this node.
* `stopped`: The model is fully deallocated from this node.
* `failed`: The allocation attempt failed, see `reason` field for the potential cause.
--
========
`rejected_execution_count`:::
(integer)
The number of inference requests that were not processed because the
queue was full.
`start_time`:::
(long)
The epoch timestamp when the allocation started.
`threads_per_allocation`:::
(integer)
The number of threads for each allocation during inference.
This value is limited by the number of hardware threads on the node;
it might therefore differ from the `threads_per_allocation` value in the
<<start-trained-model-deployment>> API.
`timeout_count`:::
(integer)
The number of inference requests that timed out before being processed.
`throughput_last_minute`:::
(integer)
The number of requests processed in the last 1 minute.
======
`number_of_allocations`:::
(integer)
The requested number of allocations for the trained model deployment.
`peak_throughput_per_minute`:::
(integer)
The peak number of requests processed in a 1 minute period for
all nodes in the deployment. This is calculated as the sum of
each node's `peak_throughput_per_minute` value.
`rejected_execution_count`:::
(integer)
The sum of `rejected_execution_count` for all nodes in the deployment.
Individual nodes reject an inference request if the inference queue is full.
The queue size is controlled by the `queue_capacity` setting in the
<<start-trained-model-deployment>> API.
`reason`:::
(string)
The reason for the current deployment state.
Usually only populated when the model is not deployed to a node.
`start_time`:::
(long)
The epoch timestamp when the deployment started.
`state`:::
(string)
The overall state of the deployment. The values may be:
+
--
* `starting`: The deployment has recently started but is not yet usable as the model is not allocated on any nodes.
* `started`: The deployment is usable as at least one node has the model allocated.
* `stopping`: The deployment is preparing to stop and deallocate the model from the relevant nodes.
--
`threads_per_allocation`:::
(integer)
The number of threads per allocation used by the inference process.
`timeout_count`:::
(integer)
The sum of `timeout_count` for all nodes in the deployment.
`queue_capacity`:::
(integer)
The number of inference requests that may be queued before new requests are
rejected.
=====
`inference_stats`:::
(object)
A collection of inference stats fields.
+
.Properties of inference stats
[%collapsible%open]
=====
`missing_all_fields_count`:::
(integer)
The number of inference calls where all the training features for the model
were missing.
`inference_count`:::
(integer)
The total number of times the model has been called for inference.
This is across all inference contexts, including all pipelines.
`cache_miss_count`:::
(integer)
The number of times the model was loaded for inference and was not retrieved
from the cache. If this number is close to the `inference_count`, then the cache
is not being appropriately used. This can be solved by increasing the cache size
or its time-to-live (TTL). See <<general-ml-settings>> for the appropriate
settings.
`failure_count`:::
(integer)
The number of failures when using the model for inference.
`timestamp`:::
(<<time-units,time units>>)
The time when the statistics were last updated.
=====
`ingest`:::
(object)
A collection of ingest stats for the model across all nodes. The values are
summations of the individual node statistics. The format matches the `ingest`
section in the <<cluster-nodes-stats,nodes stats>> API.
`model_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]
`model_size_stats`:::
(object)
A collection of model size stats fields.
+
.Properties of model size stats
[%collapsible%open]
=====
`model_size_bytes`:::
(integer)
The size of the model in bytes.
`required_native_memory_bytes`:::
(integer)
The amount of memory required to load the model in bytes.
=====
`pipeline_count`:::
(integer)
The number of ingest pipelines that currently refer to the model.
====
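
To make the nesting above concrete, here is an abridged sketch of a response
for a deployed model. The model ID and all values are illustrative, and most
sibling fields are omitted:

[source,js]
----
{
  "trained_model_stats": [
    {
      "model_id": "my-deployed-model",
      "deployment_stats": {
        "model_id": "my-deployed-model",
        "state": "started",
        "allocation_status": {
          "allocation_count": 2,
          "target_allocation_count": 2,
          "state": "fully_allocated"
        },
        "nodes": [
          {
            "routing_state": { "routing_state": "started" },
            "inference_count": 7,
            "average_inference_time_ms": 42.0
          }
        ]
      }
    }
  ]
}
----
// NOTCONSOLE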

[[ml-get-trained-models-stats-response-codes]]
== {api-response-codes-title}

`404` (Missing resources)::
If `allow_no_match` is `false`, this code indicates that there are no
resources that match the request or only partial matches for the request.
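
For example, the following request uses a pattern that matches no models (the
pattern is hypothetical), so it returns a `404` rather than an empty result:

[source,console]
----
GET _ml/trained_models/nonexistent-model*/_stats?allow_no_match=false
----
// TEST[skip:hypothetical model ID pattern]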

[[ml-get-trained-models-stats-example]]
== {api-examples-title}

The following example gets usage information for all the trained models:

[source,console]
--------------------------------------------------
GET _ml/trained_models/_stats
--------------------------------------------------
// TEST[skip:TBD]

The API returns the following results:

[source,console-result]
----
{
  "count": 2,
  "trained_model_stats": [
    {
      "model_id": "flight-delay-prediction-1574775339910",
      "pipeline_count": 0,
      "inference_stats": {
        "failure_count": 0,
        "inference_count": 4,
        "cache_miss_count": 3,
        "missing_all_fields_count": 0,
        "timestamp": 1592399986979
      }
    },
    {
      "model_id": "regression-job-one-1574775307356",
      "pipeline_count": 1,
      "inference_stats": {
        "failure_count": 0,
        "inference_count": 178,
        "cache_miss_count": 3,
        "missing_all_fields_count": 0,
        "timestamp": 1592399986979
      },
      "ingest": {
        "total": {
          "count": 178,
          "time_in_millis": 8,
          "current": 0,
          "failed": 0
        },
        "pipelines": {
          "flight-delay": {
            "count": 178,
            "time_in_millis": 8,
            "current": 0,
            "failed": 0,
            "processors": [
              {
                "inference": {
                  "type": "inference",
                  "stats": {
                    "count": 178,
                    "time_in_millis": 7,
                    "current": 0,
                    "failed": 0
                  }
                }
              }
            ]
          }
        }
      }
    }
  ]
}
----
// NOTCONSOLE