[role="xpack"]
[[get-trained-models-stats]]
= Get trained models statistics API
[subs="attributes"]
++++
<titleabbrev>Get trained models stats</titleabbrev>
++++

Retrieves usage information for trained models.

[[ml-get-trained-models-stats-request]]
== {api-request-title}

`GET _ml/trained_models/_stats` +
`GET _ml/trained_models/_all/_stats` +
`GET _ml/trained_models/<model_id>/_stats` +
`GET _ml/trained_models/<model_id>,<model_id_2>/_stats` +
`GET _ml/trained_models/<model_id_pattern*>,<model_id_2>/_stats`

[[ml-get-trained-models-stats-prereq]]
== {api-prereq-title}

Requires the `monitor_ml` cluster privilege. This privilege is included in the
`machine_learning_user` built-in role.

[[ml-get-trained-models-stats-desc]]
== {api-description-title}

You can get usage information for multiple trained models in a single API
request by using a comma-separated list of model IDs or a wildcard expression.
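For example, the following sketch combines a wildcard expression with an
explicitly named model (the model IDs are illustrative; they match the ones
used in the example at the end of this page):

[source,console]
--------------------------------------------------
GET _ml/trained_models/flight-delay*,regression-job-one-1574775307356/_stats
--------------------------------------------------
// TEST[skip:TBD]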
[[ml-get-trained-models-stats-path-params]]
== {api-path-parms-title}

`<model_id>`::
(Optional, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id-or-alias]

[[ml-get-trained-models-stats-query-params]]
== {api-query-parms-title}

`allow_no_match`::
(Optional, Boolean)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=allow-no-match-models]

`from`::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=from-models]

`size`::
(Optional, integer)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=size-models]
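The `from` and `size` parameters page through the statistics when many models
exist. For example, a sketch of fetching the third page of 25 results (the
values are arbitrary):

[source,console]
--------------------------------------------------
GET _ml/trained_models/_all/_stats?from=50&size=25
--------------------------------------------------
// TEST[skip:TBD]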
[role="child_attributes"]
[[ml-get-trained-models-stats-results]]
== {api-response-body-title}

`count`::
(integer)
The total number of trained model statistics that matched the requested ID
patterns. This value can be higher than the number of items in the
`trained_model_stats` array because the size of the array is restricted by
the supplied `size` parameter.

`trained_model_stats`::
(array)
An array of trained model statistics, which are sorted by the `model_id`
value in ascending order.
+
.Properties of trained model stats
[%collapsible%open]
====
`deployment_stats`:::
(list)
A collection of deployment stats if one of the provided `model_id` values is
deployed.
+
.Properties of deployment stats
[%collapsible%open]
=====
`allocation_status`:::
(object)
The detailed allocation status given the deployment configuration.
+
.Properties of allocation stats
[%collapsible%open]
======
`allocation_count`:::
(integer)
The current number of nodes where the model is allocated.

`cache_size`:::
(<<byte-units,byte value>>)
The inference cache size (in memory outside the JVM heap) per node for the
model.

`state`:::
(string)
The detailed allocation state related to the nodes.
+
--
* `starting`: Allocations are being attempted but no node currently has the
model allocated.
* `started`: At least one node has the model allocated.
* `fully_allocated`: The deployment is fully allocated and satisfies the
`target_allocation_count`.
--

`target_allocation_count`:::
(integer)
The desired number of nodes for model allocation.
======

`error_count`:::
(integer)
The sum of `error_count` for all nodes in the deployment.

`inference_count`:::
(integer)
The sum of `inference_count` for all nodes in the deployment.

`model_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

`nodes`:::
(array of objects)
The deployment stats for each node that currently has the model allocated.
+
.Properties of node stats
[%collapsible%open]
======
`average_inference_time_ms`:::
(double)
The average time for each inference call to complete on this node. The
average is calculated over the lifetime of the deployment.

`average_inference_time_ms_last_minute`:::
(double)
The average time for each inference call to complete on this node in the last
minute.

`error_count`:::
(integer)
The number of errors when evaluating the trained model.

`inference_cache_hit_count`:::
(integer)
The total number of inference calls made against this node for this model
that were served from the inference cache.

`inference_cache_hit_count_last_minute`:::
(integer)
The number of inference calls made against this node for this model in the
last minute that were served from the inference cache.

`inference_count`:::
(integer)
The total number of inference calls made against this node for this model.

`last_access`:::
(long)
The epoch timestamp of the last inference call for the model on this node.

`node`:::
(object)
Information pertaining to the node.
+
.Properties of node
[%collapsible%open]
========
`attributes`:::
(object)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-attributes]

`ephemeral_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-ephemeral-id]

`id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-id]

`name`:::
(string)
The node name.

`transport_address`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=node-transport-address]
========

`number_of_allocations`:::
(integer)
The number of allocations assigned to this node.

`number_of_pending_requests`:::
(integer)
The number of inference requests queued to be processed.

`peak_throughput_per_minute`:::
(integer)
The peak number of requests processed in a 1 minute period.

`routing_state`:::
(object)
The current routing state and the reason for the current routing state for
this allocation.
+
.Properties of routing_state
[%collapsible%open]
========
`reason`:::
(string)
The reason for the current state. Usually only populated when the
`routing_state` is `failed`.

`routing_state`:::
(string)
The current routing state.
+
--
* `starting`: The model is attempting to allocate on this node; inference
calls are not yet accepted.
* `started`: The model is allocated and ready to accept inference requests.
* `stopping`: The model is being deallocated from this node.
* `stopped`: The model is fully deallocated from this node.
* `failed`: The allocation attempt failed; see the `reason` field for the
potential cause.
--
========

`rejected_execution_count`:::
(integer)
The number of inference requests that were not processed because the queue
was full.

`start_time`:::
(long)
The epoch timestamp when the allocation started.

`threads_per_allocation`:::
(integer)
The number of threads used by each allocation during inference. This value is
limited by the number of hardware threads on the node; it might therefore
differ from the `threads_per_allocation` value in the
<<start-trained-model-deployment>> API.

`timeout_count`:::
(integer)
The number of inference requests that timed out before being processed.

`throughput_last_minute`:::
(integer)
The number of requests processed in the last 1 minute.
======

`number_of_allocations`:::
(integer)
The requested number of allocations for the trained model deployment.

`peak_throughput_per_minute`:::
(integer)
The peak number of requests processed in a 1 minute period for all nodes in
the deployment. This is calculated as the sum of each node's
`peak_throughput_per_minute` value.

`rejected_execution_count`:::
(integer)
The sum of `rejected_execution_count` for all nodes in the deployment.
Individual nodes reject an inference request if the inference queue is full.
The queue size is controlled by the `queue_capacity` setting in the
<<start-trained-model-deployment>> API.

`reason`:::
(string)
The reason for the current deployment state. Usually only populated when the
model is not deployed to a node.

`start_time`:::
(long)
The epoch timestamp when the deployment started.

`state`:::
(string)
The overall state of the deployment. The values may be:
+
--
* `starting`: The deployment has recently started but is not yet usable as
the model is not allocated on any nodes.
* `started`: The deployment is usable as at least one node has the model
allocated.
* `stopping`: The deployment is preparing to stop and deallocate the model
from the relevant nodes.
--

`threads_per_allocation`:::
(integer)
The number of threads per allocation used by the inference process.

`timeout_count`:::
(integer)
The sum of `timeout_count` for all nodes in the deployment.

`queue_capacity`:::
(integer)
The number of inference requests that may be queued before new requests are
rejected.
=====

`inference_stats`:::
(object)
A collection of inference stats fields.
+
.Properties of inference stats
[%collapsible%open]
=====
`missing_all_fields_count`:::
(integer)
The number of inference calls where all the training features for the model
were missing.

`inference_count`:::
(integer)
The total number of times the model has been called for inference. This is
across all inference contexts, including all pipelines.

`cache_miss_count`:::
(integer)
The number of times the model was loaded for inference and was not retrieved
from the cache. If this number is close to the `inference_count`, the cache
is not being appropriately used. This can be solved by increasing the cache
size or its time-to-live (TTL). See <<general-ml-settings>> for the
appropriate settings; a configuration sketch appears at the end of this
section.

`failure_count`:::
(integer)
The number of failures when using the model for inference.

`timestamp`:::
(<<time-units,time units>>)
The time when the statistics were last updated.
=====

`ingest`:::
(object)
A collection of ingest stats for the model across all nodes. The values are
summations of the individual node statistics. The format matches the `ingest`
section in <<cluster-nodes-stats>>.

`model_id`:::
(string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]

`model_size_stats`:::
(object)
A collection of model size stats fields.
+
.Properties of model size stats
[%collapsible%open]
=====
`model_size_bytes`:::
(integer)
The size of the model in bytes.

`required_native_memory_bytes`:::
(integer)
The amount of memory required to load the model, in bytes.
=====

`pipeline_count`:::
(integer)
The number of ingest pipelines that currently refer to the model.
====
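As noted for `cache_miss_count`, a high miss rate can sometimes be addressed
by tuning the inference model cache settings described in
<<general-ml-settings>>. A minimal `elasticsearch.yml` sketch; the values are
illustrative, not recommendations:

[source,yaml]
--------------------------------------------------
# Static node settings; changing them requires a node restart.
xpack.ml.inference_model.cache_size: 10%    # byte value or share of the JVM heap for the cache
xpack.ml.inference_model.time_to_live: 10m  # how long an unused model stays cached
--------------------------------------------------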
[[ml-get-trained-models-stats-response-codes]]
== {api-response-codes-title}

`404` (Missing resources)::
If `allow_no_match` is `false`, this code indicates that there are no
resources that match the request or only partial matches for the request.

[[ml-get-trained-models-stats-example]]
== {api-examples-title}

The following example gets usage information for all the trained models:

[source,console]
--------------------------------------------------
GET _ml/trained_models/_stats
--------------------------------------------------
// TEST[skip:TBD]

The API returns the following results:

[source,console-result]
----
{
  "count": 2,
  "trained_model_stats": [
    {
      "model_id": "flight-delay-prediction-1574775339910",
      "pipeline_count": 0,
      "inference_stats": {
        "failure_count": 0,
        "inference_count": 4,
        "cache_miss_count": 3,
        "missing_all_fields_count": 0,
        "timestamp": 1592399986979
      }
    },
    {
      "model_id": "regression-job-one-1574775307356",
      "pipeline_count": 1,
      "inference_stats": {
        "failure_count": 0,
        "inference_count": 178,
        "cache_miss_count": 3,
        "missing_all_fields_count": 0,
        "timestamp": 1592399986979
      },
      "ingest": {
        "total": {
          "count": 178,
          "time_in_millis": 8,
          "current": 0,
          "failed": 0
        },
        "pipelines": {
          "flight-delay": {
            "count": 178,
            "time_in_millis": 8,
            "current": 0,
            "failed": 0,
            "processors": [
              {
                "inference": {
                  "type": "inference",
                  "stats": {
                    "count": 178,
                    "time_in_millis": 7,
                    "current": 0,
                    "failed": 0
                  }
                }
              }
            ]
          }
        }
      }
    }
  ]
}
----
// NOTCONSOLE
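To verify that an ID pattern matches at least one model, set `allow_no_match`
to `false`; if nothing matches, the API then returns a `404` instead of an
empty result. A sketch using a deliberately non-matching, hypothetical
pattern:

[source,console]
--------------------------------------------------
GET _ml/trained_models/no-such-model*/_stats?allow_no_match=false
--------------------------------------------------
// TEST[skip:TBD]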