elasticsearch/docs/reference/inference/inference-apis.asciidoc
István Zoltán Szabó a73e972777
[DOCS] Adds stream inference API docs (#115333) (#115623)
Co-authored-by: Pat Whelan <pat.whelan@elastic.co>
2024-10-25 18:40:40 +11:00

75 lines
3.6 KiB
Text
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

[role="xpack"]
[[inference-apis]]
== {infer-cap} APIs
IMPORTANT: The {infer} APIs enable you to use certain services, such as built-in
{ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure,
Google AI Studio or Hugging Face. For built-in models and models uploaded
through Eland, the {infer} APIs offer an alternative way to use and manage
trained models. However, if you do not plan to use the {infer} APIs to use these
models or if you want to use non-NLP models, use the
<<ml-df-trained-models-apis>>.
The {infer} APIs enable you to create {infer} endpoints and use {ml} models of
different providers - such as Amazon Bedrock, Anthropic, Azure AI Studio,
Cohere, Google AI, Mistral, OpenAI, or HuggingFace - as a service. Use
the following APIs to manage {infer} models and perform {infer}:
* <<delete-inference-api>>
* <<get-inference-api>>
* <<post-inference-api>>
* <<put-inference-api>>
* <<stream-inference-api>>
* <<update-inference-api>>
[[inference-landscape]]
.A representation of the Elastic inference landscape
image::images/inference-landscape.jpg[A representation of the Elastic inference landscape,align="center"]
An {infer} endpoint enables you to use the corresponding {ml} model without
manual deployment and apply it to your data at ingestion time through
<<semantic-search-semantic-text, semantic text>>.
Choose a model from your provider or use ELSER a retrieval model trained by
Elastic , then create an {infer} endpoint by the <<put-inference-api>>.
Now use <<semantic-search-semantic-text, semantic text>> to perform
<<semantic-search, semantic search>> on your data.
[discrete]
[[default-enpoints]]
=== Default {infer} endpoints
Your {es} deployment contains some preconfigured {infer} endpoints that makes it easier for you to use them when defining `semantic_text` fields or {infer} processors.
The following list contains the default {infer} endpoints listed by `inference_id`:
* `.elser-2-elasticsearch`: uses the {ml-docs}/ml-nlp-elser.html[ELSER] built-in trained model for `sparse_embedding` tasks (recommended for English language texts)
* `.multilingual-e5-small-elasticsearch`: uses the {ml-docs}/ml-nlp-e5.html[E5] built-in trained model for `text_embedding` tasks (recommended for non-English language texts)
Use the `inference_id` of the endpoint in a <<semantic-text,`semantic_text`>> field definition or when creating an <<inference-processor,{infer} processor>>.
The API call will automatically download and deploy the model which might take a couple of minutes.
Default {infer} enpoints have {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations] enabled.
For these models, the minimum number of allocations is `0`.
If there is no {infer} activity that uses the endpoint, the number of allocations will scale down to `0` automatically after 15 minutes.
include::delete-inference.asciidoc[]
include::get-inference.asciidoc[]
include::post-inference.asciidoc[]
include::put-inference.asciidoc[]
include::stream-inference.asciidoc[]
include::update-inference.asciidoc[]
include::service-alibabacloud-ai-search.asciidoc[]
include::service-amazon-bedrock.asciidoc[]
include::service-anthropic.asciidoc[]
include::service-azure-ai-studio.asciidoc[]
include::service-azure-openai.asciidoc[]
include::service-cohere.asciidoc[]
include::service-elasticsearch.asciidoc[]
include::service-elser.asciidoc[]
include::service-google-ai-studio.asciidoc[]
include::service-google-vertex-ai.asciidoc[]
include::service-hugging-face.asciidoc[]
include::service-mistral.asciidoc[]
include::service-openai.asciidoc[]
include::service-watsonx-ai.asciidoc[]