elasticsearch/docs/reference/inference/post-inference.asciidoc

[role="xpack"]
[[post-inference-api]]
=== Perform inference API

experimental[]

Performs an inference task on an input text by using an {infer} model.

IMPORTANT: The {infer} APIs enable you to use certain services, such as ELSER,
OpenAI, or Hugging Face, in your cluster. This is not the same feature that you
can use on an ML node with custom {ml} models. If you want to train and use your
own model, use the <<ml-df-trained-models-apis>>.


[discrete]
[[post-inference-api-request]]
==== {api-request-title}

`POST /_inference/<inference_id>`

`POST /_inference/<task_type>/<inference_id>`


[discrete]
[[post-inference-api-prereqs]]
==== {api-prereq-title}

* Requires the `monitor_inference` <<privileges-list-cluster,cluster privilege>>
(the built-in `inference_admin` and `inference_user` roles grant this privilege)

[discrete]
[[post-inference-api-desc]]
==== {api-description-title}

The perform {infer} API enables you to use {ml} models to perform specific tasks
on data that you provide as an input. The API returns a response with the
results of the tasks. The {infer} model you use can perform one specific task
that has been defined when the model was created with the <<put-inference-api>>.


[discrete]
[[post-inference-api-path-params]]
==== {api-path-parms-title}

`<inference_id>`::
(Required, string)
The unique identifier of the {infer} endpoint.


`<task_type>`::
(Optional, string)
The type of {infer} task that the model performs.


[discrete]
[[post-inference-api-request-body]]
== {api-request-body-title}

`input`::
(Required, array of strings)
The text on which you want to perform the {infer} task.
`input` can be a single string or an array.
[NOTE]
====
Inference endpoints for the `completion` task type currently only support a single string as input.
====


[discrete]
[[post-inference-api-example]]
==== {api-examples-title}

The following example performs sparse embedding on the example sentence.


[source,console]
------------------------------------------------------------
POST _inference/sparse_embedding/my-elser-model
{
  "input": "The sky above the port was the color of television tuned to a dead channel."
}
------------------------------------------------------------
// TEST[skip:TBD]


The API returns the following response:


[source,console-result]
------------------------------------------------------------
{
  "sparse_embedding": [
    {
      "port": 2.1259406,
      "sky": 1.7073475,
      "color": 1.6922266,
      "dead": 1.6247464,
      "television": 1.3525393,
      "above": 1.2425821,
      "tuned": 1.1440028,
      "colors": 1.1218185,
      "tv": 1.0111054,
      "ports": 1.0067928,
      "poem": 1.0042328,
      "channel": 0.99471164,
      "tune": 0.96235967,
      "scene": 0.9020516,
      (...)
    },
    (...)
  ]
}
------------------------------------------------------------
// NOTCONSOLE


The next example performs a completion on the example question.


[source,console]
------------------------------------------------------------
POST _inference/completion/openai_chat_completions
{
  "input": "What is Elastic?"
}
------------------------------------------------------------
// TEST[skip:TBD]


The API returns the following response:


[source,console-result]
------------------------------------------------------------
{
  "completion": [
    {
      "result": "Elastic is a company that provides a range of software solutions for search, logging, security, and analytics. Their flagship product is Elasticsearch, an open-source, distributed search engine that allows users to search, analyze, and visualize large volumes of data in real-time. Elastic also offers products such as Kibana, a data visualization tool, and Logstash, a log management and pipeline tool, as well as various other tools and solutions for data analysis and management."
    }
  ]
}
------------------------------------------------------------
// NOTCONSOLE