mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-04-25 07:37:19 -04:00
234 lines
5.7 KiB
Text
234 lines
5.7 KiB
Text
[role="xpack"]
|
|
[[post-inference-api]]
|
|
=== Perform inference API
|
|
|
|
experimental[]
|
|
|
|
Performs an inference task on an input text by using an {infer} endpoint.
|
|
|
|
IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
|
|
{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, or
|
|
Hugging Face. For built-in models and models uploaded though Eland, the {infer}
|
|
APIs offer an alternative way to use and manage trained models. However, if you
|
|
do not plan to use the {infer} APIs to use these models or if you want to use
|
|
non-NLP models, use the <<ml-df-trained-models-apis>>.
|
|
|
|
|
|
[discrete]
|
|
[[post-inference-api-request]]
|
|
==== {api-request-title}
|
|
|
|
`POST /_inference/<inference_id>`
|
|
|
|
`POST /_inference/<task_type>/<inference_id>`
|
|
|
|
|
|
[discrete]
|
|
[[post-inference-api-prereqs]]
|
|
==== {api-prereq-title}
|
|
|
|
* Requires the `monitor_inference` <<privileges-list-cluster,cluster privilege>>
|
|
(the built-in `inference_admin` and `inference_user` roles grant this privilege)
|
|
|
|
[discrete]
|
|
[[post-inference-api-desc]]
|
|
==== {api-description-title}
|
|
|
|
The perform {infer} API enables you to use {ml} models to perform specific tasks
|
|
on data that you provide as an input. The API returns a response with the
|
|
results of the tasks. The {infer} endpoint you use can perform one specific task
|
|
that has been defined when the endpoint was created with the
|
|
<<put-inference-api>>.
|
|
|
|
|
|
[discrete]
|
|
[[post-inference-api-path-params]]
|
|
==== {api-path-parms-title}
|
|
|
|
`<inference_id>`::
|
|
(Required, string)
|
|
The unique identifier of the {infer} endpoint.
|
|
|
|
|
|
`<task_type>`::
|
|
(Optional, string)
|
|
The type of {infer} task that the model performs.
|
|
|
|
|
|
[discrete]
|
|
[[post-inference-api-query-params]]
|
|
==== {api-query-parms-title}
|
|
|
|
`timeout`::
|
|
(Optional, timeout)
|
|
Controls the amount of time to wait for the inference to complete. Defaults to 30
|
|
seconds.
|
|
|
|
[discrete]
|
|
[[post-inference-api-request-body]]
|
|
==== {api-request-body-title}
|
|
|
|
`input`::
|
|
(Required, string or array of strings)
|
|
The text on which you want to perform the {infer} task.
|
|
`input` can be a single string or an array.
|
|
+
|
|
--
|
|
[NOTE]
|
|
====
|
|
Inference endpoints for the `completion` task type currently only support a
|
|
single string as input.
|
|
====
|
|
--
|
|
|
|
`query`::
|
|
(Required, string)
|
|
Only for `rerank` {infer} endpoints. The search query text.
|
|
|
|
|
|
[discrete]
|
|
[[post-inference-api-example]]
|
|
==== {api-examples-title}
|
|
|
|
|
|
[discrete]
|
|
[[inference-example-completion]]
|
|
===== Completion example
|
|
|
|
The following example performs a completion on the example question.
|
|
|
|
|
|
[source,console]
|
|
------------------------------------------------------------
|
|
POST _inference/completion/openai_chat_completions
|
|
{
|
|
"input": "What is Elastic?"
|
|
}
|
|
------------------------------------------------------------
|
|
// TEST[skip:TBD]
|
|
|
|
|
|
The API returns the following response:
|
|
|
|
|
|
[source,console-result]
|
|
------------------------------------------------------------
|
|
{
|
|
"completion": [
|
|
{
|
|
"result": "Elastic is a company that provides a range of software solutions for search, logging, security, and analytics. Their flagship product is Elasticsearch, an open-source, distributed search engine that allows users to search, analyze, and visualize large volumes of data in real-time. Elastic also offers products such as Kibana, a data visualization tool, and Logstash, a log management and pipeline tool, as well as various other tools and solutions for data analysis and management."
|
|
}
|
|
]
|
|
}
|
|
------------------------------------------------------------
|
|
// NOTCONSOLE
|
|
|
|
[discrete]
|
|
[[inference-example-rerank]]
|
|
===== Rerank example
|
|
|
|
The following example performs reranking on the example input.
|
|
|
|
[source,console]
|
|
------------------------------------------------------------
|
|
POST _inference/rerank/cohere_rerank
|
|
{
|
|
"input": ["luke", "like", "leia", "chewy","r2d2", "star", "wars"],
|
|
"query": "star wars main character"
|
|
}
|
|
------------------------------------------------------------
|
|
// TEST[skip:TBD]
|
|
|
|
The API returns the following response:
|
|
|
|
|
|
[source,console-result]
|
|
------------------------------------------------------------
|
|
{
|
|
"rerank": [
|
|
{
|
|
"index": "2",
|
|
"relevance_score": "0.011597361",
|
|
"text": "leia"
|
|
},
|
|
{
|
|
"index": "0",
|
|
"relevance_score": "0.006338922",
|
|
"text": "luke"
|
|
},
|
|
{
|
|
"index": "5",
|
|
"relevance_score": "0.0016166499",
|
|
"text": "star"
|
|
},
|
|
{
|
|
"index": "4",
|
|
"relevance_score": "0.0011695103",
|
|
"text": "r2d2"
|
|
},
|
|
{
|
|
"index": "1",
|
|
"relevance_score": "5.614787E-4",
|
|
"text": "like"
|
|
},
|
|
{
|
|
"index": "6",
|
|
"relevance_score": "3.7850367E-4",
|
|
"text": "wars"
|
|
},
|
|
{
|
|
"index": "3",
|
|
"relevance_score": "1.2508839E-5",
|
|
"text": "chewy"
|
|
}
|
|
]
|
|
}
|
|
------------------------------------------------------------
|
|
|
|
|
|
[discrete]
|
|
[[inference-example-sparse]]
|
|
===== Sparse embedding example
|
|
|
|
The following example performs sparse embedding on the example sentence.
|
|
|
|
|
|
[source,console]
|
|
------------------------------------------------------------
|
|
POST _inference/sparse_embedding/my-elser-model
|
|
{
|
|
"input": "The sky above the port was the color of television tuned to a dead channel."
|
|
}
|
|
------------------------------------------------------------
|
|
// TEST[skip:TBD]
|
|
|
|
|
|
The API returns the following response:
|
|
|
|
|
|
[source,console-result]
|
|
------------------------------------------------------------
|
|
{
|
|
"sparse_embedding": [
|
|
{
|
|
"port": 2.1259406,
|
|
"sky": 1.7073475,
|
|
"color": 1.6922266,
|
|
"dead": 1.6247464,
|
|
"television": 1.3525393,
|
|
"above": 1.2425821,
|
|
"tuned": 1.1440028,
|
|
"colors": 1.1218185,
|
|
"tv": 1.0111054,
|
|
"ports": 1.0067928,
|
|
"poem": 1.0042328,
|
|
"channel": 0.99471164,
|
|
"tune": 0.96235967,
|
|
"scene": 0.9020516,
|
|
(...)
|
|
},
|
|
(...)
|
|
]
|
|
}
|
|
------------------------------------------------------------
|
|
// NOTCONSOLE
|