// tag::cohere[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/cohere_embeddings <1>
{
"service": "cohere",
"service_settings": {
"api_key": "<api_key>", <2>
"model_id": "embed-english-v3.0", <3>
"embedding_type": "byte"
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which
is the unique identifier of the {infer} endpoint, is `cohere_embeddings`.
<2> The API key of your Cohere account. You can find your API keys in your
Cohere dashboard under the
https://dashboard.cohere.com/api-keys[API keys section]. You need to provide
your API key only once. The <<get-inference-api>> does not return your API
key.
<3> The name of the embedding model to use. You can find the list of Cohere
embedding models https://docs.cohere.com/reference/embed[here].

NOTE: When using this model, the recommended similarity measure to use in the
`dense_vector` field mapping is `dot_product`. In the case of Cohere models, the
embeddings are normalized to unit length, in which case the `dot_product` and
the `cosine` measures are equivalent.
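
The following is a minimal sketch of a destination index mapping that follows
this recommendation; the index name `cohere-embeddings`, the field name
`content_embedding`, and the `dims` value of 1024 (the dimension of
`embed-english-v3.0` embeddings) are illustrative assumptions. The
`element_type` of `byte` matches the `embedding_type` in the service settings
above.

[source,console]
------------------------------------------------------------
PUT cohere-embeddings
{
  "mappings": {
    "properties": {
      "content_embedding": {
        "type": "dense_vector",
        "dims": 1024,
        "element_type": "byte",
        "similarity": "dot_product"
      }
    }
  }
}
------------------------------------------------------------
// TEST[skip:TBD]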
// end::cohere[]
// tag::elser[]
[source,console]
------------------------------------------------------------
PUT _inference/sparse_embedding/elser_embeddings <1>
{
"service": "elser",
"service_settings": {
"num_allocations": 1,
"num_threads": 1
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `sparse_embedding` in the path and the `inference_id`, which
is the unique identifier of the {infer} endpoint, is `elser_embeddings`.

You don't need to download and deploy the ELSER model upfront; the API request
above will download the model if it's not downloaded yet and then deploy it.

[NOTE]
====
You might see a 502 bad gateway error in the response when using the {kib} Console.
This error usually just reflects a timeout while the model downloads in the background.
You can check the download progress in the {ml-app} UI.
If using the Python client, you can set the `timeout` parameter to a higher value.
====
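
After the endpoint is created, you can check its configuration and confirm that
the model is deployed with the <<get-inference-api>>, for example:

[source,console]
------------------------------------------------------------
GET _inference/sparse_embedding/elser_embeddings
------------------------------------------------------------
// TEST[skip:TBD]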
// end::elser[]
// tag::hugging-face[]
First, you need to create a new {infer} endpoint on
https://ui.endpoints.huggingface.co/[the Hugging Face endpoint page] to get an
endpoint URL. Select the model `all-mpnet-base-v2` on the new endpoint creation
page, then select the `Sentence Embeddings` task under the Advanced
configuration section. Create the endpoint. After the endpoint initialization
has finished, copy the URL; you need it in the following {infer} API call.

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/hugging_face_embeddings <1>
{
"service": "hugging_face",
"service_settings": {
"api_key": "<access_token>", <2>
"url": "<url_endpoint>" <3>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which
is the unique identifier of the {infer} endpoint, is `hugging_face_embeddings`.
<2> A valid Hugging Face access token. You can find it on the
https://huggingface.co/settings/tokens[settings page of your account].
<3> The {infer} endpoint URL you created on Hugging Face.
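
To verify that the endpoint works, you can send it a test {infer} request; the
input text below is an arbitrary example:

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/hugging_face_embeddings
{
  "input": "The quick brown fox jumps over the lazy dog"
}
------------------------------------------------------------
// TEST[skip:TBD]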
// end::hugging-face[]
// tag::openai[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/openai_embeddings <1>
{
"service": "openai",
"service_settings": {
"api_key": "<api_key>", <2>
"model_id": "text-embedding-ada-002" <3>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which
is the unique identifier of the {infer} endpoint, is `openai_embeddings`.
<2> The API key of your OpenAI account. You can find your OpenAI API keys in
your OpenAI account under the
https://platform.openai.com/api-keys[API keys section]. You need to provide
your API key only once. The <<get-inference-api>> does not return your API
key.
<3> The name of the embedding model to use. You can find the list of OpenAI
embedding models
https://platform.openai.com/docs/guides/embeddings/embedding-models[here].

NOTE: When using this model, the recommended similarity measure to use in the
`dense_vector` field mapping is `dot_product`. In the case of OpenAI models, the
embeddings are normalized to unit length, in which case the `dot_product` and
the `cosine` measures are equivalent.
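
The following is a minimal sketch of a destination index mapping that follows
this recommendation; the index name `openai-embeddings` and the field name
`content_embedding` are illustrative assumptions, and the `dims` value of 1536
corresponds to `text-embedding-ada-002`:

[source,console]
------------------------------------------------------------
PUT openai-embeddings
{
  "mappings": {
    "properties": {
      "content_embedding": {
        "type": "dense_vector",
        "dims": 1536,
        "similarity": "dot_product"
      }
    }
  }
}
------------------------------------------------------------
// TEST[skip:TBD]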
// end::openai[]
// tag::azure-openai[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/azure_openai_embeddings <1>
{
"service": "azureopenai",
"service_settings": {
"api_key": "<api_key>", <2>
"resource_name": "<resource_name>", <3>
"deployment_id": "<deployment_id>", <4>
"api_version": "2024-02-01"
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `azure_openai_embeddings`.
<2> The API key for accessing your Azure OpenAI services.
Alternatively, you can provide an `entra_id` instead of an `api_key` here.
The <<get-inference-api>> does not return this information.
<3> The name of your Azure resource.
<4> The ID of your deployed model.

NOTE: It may take a few minutes for your model's deployment to become available
after it is created. If you try to create the model as above and receive a
`404` error message, wait a few minutes and try again.
Also, when using this model, the recommended similarity measure to use in the
`dense_vector` field mapping is `dot_product`.
In the case of Azure OpenAI models, the embeddings are normalized to unit
length, in which case the `dot_product` and the `cosine` measures are equivalent.
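
Once the deployment is available, you can sanity-check the endpoint with a test
{infer} request; the input text is an arbitrary example:

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/azure_openai_embeddings
{
  "input": "What is Elasticsearch?"
}
------------------------------------------------------------
// TEST[skip:TBD]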
// end::azure-openai[]
// tag::azure-ai-studio[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/azure_ai_studio_embeddings <1>
{
"service": "azureaistudio",
"service_settings": {
"api_key": "<api_key>", <2>
"target": "<target_uri>", <3>
"provider": "<provider>", <4>
"endpoint_type": "<endpoint_type>" <5>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `azure_ai_studio_embeddings`.
<2> The API key for accessing your Azure AI Studio deployed model. You can find this on your model deployment's overview page.
<3> The target URI for accessing your Azure AI Studio deployed model. You can find this on your model deployment's overview page.
<4> The model provider, such as `cohere` or `openai`.
<5> The deployed endpoint type. This can be `token` (for "pay as you go" deployments) or `realtime` (for real-time deployment endpoints).

NOTE: It may take a few minutes for your model's deployment to become available
after it is created. If you try to create the model as above and receive a
`404` error message, wait a few minutes and try again.
Also, when using this model, the recommended similarity measure to use in the
`dense_vector` field mapping is `dot_product`.
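
A minimal sketch of a destination index mapping that follows this
recommendation is shown below; the index and field names are illustrative
assumptions, and `dims` is omitted because the dimension depends on the
deployed model (recent {es} versions infer it from the first indexed
document):

[source,console]
------------------------------------------------------------
PUT azure-ai-studio-embeddings
{
  "mappings": {
    "properties": {
      "content_embedding": {
        "type": "dense_vector",
        "similarity": "dot_product"
      }
    }
  }
}
------------------------------------------------------------
// TEST[skip:TBD]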
// end::azure-ai-studio[]
// tag::google-vertex-ai[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/google_vertex_ai_embeddings <1>
{
"service": "googlevertexai",
"service_settings": {
"service_account_json": "<service_account_json>", <2>
"model_id": "text-embedding-004", <3>
"location": "<location>", <4>
"project_id": "<project_id>" <5>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` per the path. `google_vertex_ai_embeddings` is the unique identifier of the {infer} endpoint (its `inference_id`).
<2> A valid service account in JSON format for the Google Vertex AI API.
<3> For the list of the available models, refer to the https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api[Text embeddings API] page.
<4> The name of the location to use for the {infer} task. Refer to https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations[Generative AI on Vertex AI locations] for available locations.
<5> The name of the project to use for the {infer} task.
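
To verify that the endpoint works, you can send it a test {infer} request; the
input text is an arbitrary example:

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/google_vertex_ai_embeddings
{
  "input": "What is a text embedding?"
}
------------------------------------------------------------
// TEST[skip:TBD]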
// end::google-vertex-ai[]
// tag::mistral[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/mistral_embeddings <1>
{
"service": "mistral",
"service_settings": {
"api_key": "<api_key>", <2>
"model": "<model_id>" <3>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `mistral_embeddings`.
<2> The API key for accessing the Mistral API. You can find this on your Mistral account's API Keys page.
<3> The Mistral embeddings model name, for example `mistral-embed`.
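
After the endpoint is created, you can check its configuration with the
<<get-inference-api>>:

[source,console]
------------------------------------------------------------
GET _inference/text_embedding/mistral_embeddings
------------------------------------------------------------
// TEST[skip:TBD]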
// end::mistral[]
// tag::amazon-bedrock[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/amazon_bedrock_embeddings <1>
{
"service": "amazonbedrock",
"service_settings": {
"access_key": "<aws_access_key>", <2>
"secret_key": "<aws_secret_key>", <3>
"region": "<region>", <4>
"provider": "<provider>", <5>
"model": "<model_id>" <6>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `amazon_bedrock_embeddings`.
<2> You can find the access key on the AWS IAM management page for the user account that accesses Amazon Bedrock.
<3> The secret key should be the paired key for the specified access key.
<4> Specify the region that your model is hosted in.
<5> Specify the model provider.
<6> The model ID or ARN of the model to use.
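
You can confirm the endpoint is working with a test {infer} request; the input
text is an arbitrary example:

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/amazon_bedrock_embeddings
{
  "input": "The quick brown fox jumps over the lazy dog"
}
------------------------------------------------------------
// TEST[skip:TBD]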
// end::amazon-bedrock[]
// tag::alibabacloud-ai-search[]
[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/alibabacloud_ai_search_embeddings <1>
{
"service": "alibabacloud-ai-search",
"service_settings": {
"api_key": "<api_key>", <2>
"service_id": "<service_id>", <3>
"host": "<host>", <4>
"workspace": "<workspace>" <5>
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `alibabacloud_ai_search_embeddings`.
<2> The API key for accessing the AlibabaCloud AI Search API. You can find your API keys in
your AlibabaCloud account under the
https://opensearch.console.aliyun.com/cn-shanghai/rag/api-key[API keys section]. You need to provide
your API key only once. The <<get-inference-api>> does not return your API
key.
<3> The AlibabaCloud AI Search embeddings model name, for example `ops-text-embedding-zh-001`.
<4> The name of your AlibabaCloud AI Search host address.
<5> The name of your AlibabaCloud AI Search workspace.
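
After the endpoint is created, you can check its configuration with the
<<get-inference-api>>:

[source,console]
------------------------------------------------------------
GET _inference/text_embedding/alibabacloud_ai_search_embeddings
------------------------------------------------------------
// TEST[skip:TBD]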
// end::alibabacloud-ai-search[]