// tag::cohere[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/cohere_embeddings <1>
{
"service": "cohere",
"service_settings": {
"api_key": "<api_key>", <2>
"model_id": "embed-english-v3.0", <3>
"embedding_type": "byte"
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which
is the unique identifier of the {infer} endpoint, is `cohere_embeddings`.
<2> The API key of your Cohere account. You can find your API keys in your
Cohere dashboard under the
https://dashboard.cohere.com/api-keys[API keys section]. You need to provide
your API key only once. The <<get-inference-api>> does not return your API
key.
<3> The name of the embedding model to use. You can find the list of Cohere
embedding models https://docs.cohere.com/reference/embed[here].

NOTE: When using this model, the recommended similarity measure to use in the
`dense_vector` field mapping is `dot_product`. In the case of Cohere models, the
embeddings are normalized to unit length, in which case the `dot_product` and
the `cosine` measures are equivalent.
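
The following is a minimal sketch of a destination index mapping that follows
this recommendation; the index name `cohere-embeddings`, the field name
`content_embedding`, and the `dims` value of 1024 (the dimension of
`embed-english-v3.0` embeddings) are illustrative assumptions. The
`element_type` of `byte` matches the `embedding_type` in the service settings
above.

[source,console]
------------------------------------------------------------
PUT cohere-embeddings
{
  "mappings": {
    "properties": {
      "content_embedding": {
        "type": "dense_vector",
        "dims": 1024,
        "element_type": "byte",
        "similarity": "dot_product"
      }
    }
  }
}
------------------------------------------------------------
// TEST[skip:TBD]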
// end::cohere[]
// tag::elser[]
[source,console]
------------------------------------------------------------
PUT _inference/sparse_embedding/elser_embeddings <1>
{
"service": "elser",
"service_settings": {
"num_allocations": 1,
"num_threads": 1
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `sparse_embedding` in the path and the `inference_id`, which
is the unique identifier of the {infer} endpoint, is `elser_embeddings`.

You don't need to download and deploy the ELSER model upfront; the API request
above will download the model if it's not downloaded yet and then deploy it.

[NOTE]
====
You might see a 502 bad gateway error in the response when using the {kib} Console.
This error usually just reflects a timeout while the model downloads in the background.
You can check the download progress in the {ml-app} UI.
If using the Python client, you can set the `timeout` parameter to a higher value.
====
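
After the endpoint is created, you can check its configuration and confirm that
the model is deployed with the <<get-inference-api>>, for example:

[source,console]
------------------------------------------------------------
GET _inference/sparse_embedding/elser_embeddings
------------------------------------------------------------
// TEST[skip:TBD]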
// end::elser[]
// tag::hugging-face[]
First, you need to create a new {infer} endpoint on
https://ui.endpoints.huggingface.co/[the Hugging Face endpoint page] to get an
endpoint URL. Select the model `all-mpnet-base-v2` on the new endpoint creation
page, then select the `Sentence Embeddings` task under the Advanced
configuration section. Create the endpoint. After the endpoint initialization
has finished, copy the URL; you need it in the following {infer} API call.

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/hugging_face_embeddings <1>
{
"service": "hugging_face",
"service_settings": {
"api_key": "<access_token>", <2>
"url": "<url_endpoint>" <3>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which
is the unique identifier of the {infer} endpoint, is `hugging_face_embeddings`.
<2> A valid Hugging Face access token. You can find it on the
https://huggingface.co/settings/tokens[settings page of your account].
<3> The {infer} endpoint URL you created on Hugging Face.
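
To verify that the endpoint works, you can send it a test {infer} request; the
input text below is an arbitrary example:

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/hugging_face_embeddings
{
  "input": "The quick brown fox jumps over the lazy dog"
}
------------------------------------------------------------
// TEST[skip:TBD]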
// end::hugging-face[]
// tag::openai[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/openai_embeddings <1>
{
"service": "openai",
"service_settings": {
"api_key": "<api_key>", <2>
"model_id": "text-embedding-ada-002" <3>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which
is the unique identifier of the {infer} endpoint, is `openai_embeddings`.
<2> The API key of your OpenAI account. You can find your OpenAI API keys in
your OpenAI account under the
https://platform.openai.com/api-keys[API keys section]. You need to provide
your API key only once. The <<get-inference-api>> does not return your API
key.
<3> The name of the embedding model to use. You can find the list of OpenAI
embedding models
https://platform.openai.com/docs/guides/embeddings/embedding-models[here].

NOTE: When using this model, the recommended similarity measure to use in the
`dense_vector` field mapping is `dot_product`. In the case of OpenAI models, the
embeddings are normalized to unit length, in which case the `dot_product` and
the `cosine` measures are equivalent.
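
The following is a minimal sketch of a destination index mapping that follows
this recommendation; the index name `openai-embeddings` and the field name
`content_embedding` are illustrative assumptions, and the `dims` value of 1536
corresponds to `text-embedding-ada-002`:

[source,console]
------------------------------------------------------------
PUT openai-embeddings
{
  "mappings": {
    "properties": {
      "content_embedding": {
        "type": "dense_vector",
        "dims": 1536,
        "similarity": "dot_product"
      }
    }
  }
}
------------------------------------------------------------
// TEST[skip:TBD]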
// end::openai[]
// tag::azure-openai[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/azure_openai_embeddings <1>
{
"service": "azureopenai",
"service_settings": {
"api_key": "<api_key>", <2>
"resource_name": "<resource_name>", <3>
"deployment_id": "<deployment_id>", <4>
"api_version": "2024-02-01"
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `azure_openai_embeddings`.
<2> The API key for accessing your Azure OpenAI services.
Alternatively, you can provide an `entra_id` instead of an `api_key` here.
The <<get-inference-api>> does not return this information.
<3> The name of your Azure resource.
<4> The ID of your deployed model.

NOTE: It may take a few minutes for your model's deployment to become available
after it is created. If you try to create the model as above and receive a
`404` error message, wait a few minutes and try again.
Also, when using this model, the recommended similarity measure to use in the
`dense_vector` field mapping is `dot_product`.
In the case of Azure OpenAI models, the embeddings are normalized to unit
length, in which case the `dot_product` and the `cosine` measures are equivalent.
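
Once the deployment is available, you can sanity-check the endpoint with a test
{infer} request; the input text is an arbitrary example:

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/azure_openai_embeddings
{
  "input": "What is Elasticsearch?"
}
------------------------------------------------------------
// TEST[skip:TBD]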
// end::azure-openai[]
// tag::azure-ai-studio[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/azure_ai_studio_embeddings <1>
{
"service": "azureaistudio",
"service_settings": {
"api_key": "<api_key>", <2>
"target": "<target_uri>", <3>
"provider": "<provider>", <4>
"endpoint_type": "<endpoint_type>" <5>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `azure_ai_studio_embeddings`.
<2> The API key for accessing your Azure AI Studio deployed model. You can find this on your model deployment's overview page.
<3> The target URI for accessing your Azure AI Studio deployed model. You can find this on your model deployment's overview page.
<4> The model provider, such as `cohere` or `openai`.
<5> The deployed endpoint type. This can be `token` (for "pay as you go" deployments) or `realtime` (for real-time deployment endpoints).

NOTE: It may take a few minutes for your model's deployment to become available
after it is created. If you try to create the model as above and receive a
`404` error message, wait a few minutes and try again.
Also, when using this model, the recommended similarity measure to use in the
`dense_vector` field mapping is `dot_product`.
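
A minimal sketch of a destination index mapping that follows this
recommendation is shown below; the index and field names are illustrative
assumptions, and `dims` is omitted because the dimension depends on the
deployed model (recent {es} versions infer it from the first indexed
document):

[source,console]
------------------------------------------------------------
PUT azure-ai-studio-embeddings
{
  "mappings": {
    "properties": {
      "content_embedding": {
        "type": "dense_vector",
        "similarity": "dot_product"
      }
    }
  }
}
------------------------------------------------------------
// TEST[skip:TBD]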
// end::azure-ai-studio[]
// tag::google-vertex-ai[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/google_vertex_ai_embeddings <1>
{
"service": "googlevertexai",
"service_settings": {
"service_account_json": "<service_account_json>", <2>
"model_id": "text-embedding-004", <3>
"location": "<location>", <4>
"project_id": "<project_id>" <5>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` per the path. `google_vertex_ai_embeddings` is the unique identifier of the {infer} endpoint (its `inference_id`).
<2> A valid service account in JSON format for the Google Vertex AI API.
<3> For the list of the available models, refer to the https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api[Text embeddings API] page.
<4> The name of the location to use for the {infer} task. Refer to https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations[Generative AI on Vertex AI locations] for available locations.
<5> The name of the project to use for the {infer} task.
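
To verify that the endpoint works, you can send it a test {infer} request; the
input text is an arbitrary example:

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/google_vertex_ai_embeddings
{
  "input": "What is a text embedding?"
}
------------------------------------------------------------
// TEST[skip:TBD]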
// end::google-vertex-ai[]
// tag::mistral[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/mistral_embeddings <1>
{
"service": "mistral",
"service_settings": {
"api_key": "<api_key>", <2>
"model": "<model_id>" <3>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `mistral_embeddings`.
<2> The API key for accessing the Mistral API. You can find this on your Mistral account's API Keys page.
<3> The Mistral embeddings model name, for example `mistral-embed`.
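
After the endpoint is created, you can check its configuration with the
<<get-inference-api>>:

[source,console]
------------------------------------------------------------
GET _inference/text_embedding/mistral_embeddings
------------------------------------------------------------
// TEST[skip:TBD]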
// end::mistral[]
// tag::amazon-bedrock[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/amazon_bedrock_embeddings <1>
{
"service": "amazonbedrock",
"service_settings": {
"access_key": "<aws_access_key>", <2>
"secret_key": "<aws_secret_key>", <3>
"region": "<region>", <4>
"provider": "<provider>", <5>
"model": "<model_id>" <6>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `amazon_bedrock_embeddings`.
<2> You can find the access key on the AWS IAM management page for the user account that accesses Amazon Bedrock.
<3> The secret key should be the paired key for the specified access key.
<4> Specify the region that your model is hosted in.
<5> Specify the model provider.
<6> The model ID or ARN of the model to use.
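
You can confirm the endpoint is working with a test {infer} request; the input
text is an arbitrary example:

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/amazon_bedrock_embeddings
{
  "input": "The quick brown fox jumps over the lazy dog"
}
------------------------------------------------------------
// TEST[skip:TBD]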
// end::amazon-bedrock[]
// tag::alibabacloud-ai-search[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/alibabacloud_ai_search_embeddings <1>
{
"service": "alibabacloud-ai-search",
"service_settings": {
"api_key": "<api_key>", <2>
"service_id": "<service_id>", <3>
"host": "<host>", <4>
"workspace": "<workspace>" <5>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `alibabacloud_ai_search_embeddings`.
<2> The API key for accessing the AlibabaCloud AI Search API. You can find your API keys in
your AlibabaCloud account under the
https://opensearch.console.aliyun.com/cn-shanghai/rag/api-key[API keys section]. You need to provide
your API key only once. The <<get-inference-api>> does not return your API
key.
<3> The AlibabaCloud AI Search embeddings model name, for example `ops-text-embedding-zh-001`.
<4> The name of your AlibabaCloud AI Search host address.
<5> The name of your AlibabaCloud AI Search workspace.
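
After the endpoint is created, you can check its configuration with the
<<get-inference-api>>:

[source,console]
------------------------------------------------------------
GET _inference/text_embedding/alibabacloud_ai_search_embeddings
------------------------------------------------------------
// TEST[skip:TBD]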
// end::alibabacloud-ai-search[]