// tag::cohere[]

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/cohere_embeddings <1>
{
  "service": "cohere",
  "service_settings": {
    "api_key": "", <2>
    "model_id": "embed-english-v3.0", <3>
    "embedding_type": "byte"
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `cohere_embeddings`.
<2> The API key of your Cohere account. You can find your API keys in your Cohere dashboard under the https://dashboard.cohere.com/api-keys[API keys section]. You need to provide your API key only once. The <> does not return your API key.
<3> The name of the embedding model to use. You can find the list of Cohere embedding models https://docs.cohere.com/reference/embed[here].

NOTE: When using this model, the recommended similarity measure to use in the `dense_vector` field mapping is `dot_product`. In the case of Cohere models, the embeddings are normalized to unit length, in which case the `dot_product` and the `cosine` measures are equivalent.

// end::cohere[]

// tag::elser[]

[source,console]
------------------------------------------------------------
PUT _inference/sparse_embedding/elser_embeddings <1>
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `sparse_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `elser_embeddings`.

You don't need to download and deploy the ELSER model upfront; the API request above downloads the model if it isn't downloaded yet and then deploys it.

[NOTE]
====
You might see a 502 bad gateway error in the response when using the {kib} Console. This error usually just reflects a timeout while the model downloads in the background. You can check the download progress in the {ml-app} UI. If using the Python client, you can set the `timeout` parameter to a higher value.
====

// end::elser[]

// tag::hugging-face[]

First, you need to create a new {infer} endpoint on https://ui.endpoints.huggingface.co/[the Hugging Face endpoint page] to get an endpoint URL. Select the model `all-mpnet-base-v2` on the new endpoint creation page, then select the `Sentence Embeddings` task under the Advanced configuration section. Create the endpoint. Copy the URL after the endpoint initialization has finished; you need it in the following {infer} API call.

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/hugging_face_embeddings <1>
{
  "service": "hugging_face",
  "service_settings": {
    "api_key": "", <2>
    "url": "" <3>
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `hugging_face_embeddings`.
<2> A valid Hugging Face access token. You can find it on the https://huggingface.co/settings/tokens[settings page of your account].
<3> The {infer} endpoint URL you created on Hugging Face.
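
Once the endpoint is created, you can send it a quick test request to confirm it returns embeddings. A minimal sketch (the input text is only an example):

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/hugging_face_embeddings
{
  "input": "The quick brown fox jumps over the lazy dog"
}
------------------------------------------------------------
// TEST[skip:TBD]
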
// end::hugging-face[]

// tag::openai[]

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/openai_embeddings <1>
{
  "service": "openai",
  "service_settings": {
    "api_key": "", <2>
    "model_id": "text-embedding-ada-002" <3>
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `openai_embeddings`.
<2> The API key of your OpenAI account. You can find your OpenAI API keys in your OpenAI account under the https://platform.openai.com/api-keys[API keys section]. You need to provide your API key only once. The <> does not return your API key.
<3> The name of the embedding model to use. You can find the list of OpenAI embedding models https://platform.openai.com/docs/guides/embeddings/embedding-models[here].

NOTE: When using this model, the recommended similarity measure to use in the `dense_vector` field mapping is `dot_product`. In the case of OpenAI models, the embeddings are normalized to unit length, in which case the `dot_product` and the `cosine` measures are equivalent.

// end::openai[]

// tag::azure-openai[]

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/azure_openai_embeddings <1>
{
  "service": "azureopenai",
  "service_settings": {
    "api_key": "", <2>
    "resource_name": "", <3>
    "deployment_id": "", <4>
    "api_version": "2024-02-01"
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `azure_openai_embeddings`.
<2> The API key for accessing your Azure OpenAI services. Alternatively, you can provide an `entra_id` instead of an `api_key` here. The <> does not return this information.
<3> The name of your Azure resource.
<4> The ID of your deployed model.

NOTE: It may take a few minutes for your model's deployment to become available after it is created. If you try to create the model as above and receive a `404` error message, wait a few minutes and try again. Also, when using this model, the recommended similarity measure to use in the `dense_vector` field mapping is `dot_product`. In the case of Azure OpenAI models, the embeddings are normalized to unit length, in which case the `dot_product` and the `cosine` measures are equivalent.

// end::azure-openai[]

// tag::azure-ai-studio[]

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/azure_ai_studio_embeddings <1>
{
  "service": "azureaistudio",
  "service_settings": {
    "api_key": "", <2>
    "target": "", <3>
    "provider": "", <4>
    "endpoint_type": "" <5>
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `azure_ai_studio_embeddings`.
<2> The API key for accessing your Azure AI Studio deployed model. You can find this on your model deployment's overview page.
<3> The target URI for accessing your Azure AI Studio deployed model. You can find this on your model deployment's overview page.
<4> The model provider, such as `cohere` or `openai`.
<5> The deployed endpoint type. This can be `token` (for "pay as you go" deployments) or `realtime` for real-time deployment endpoints.
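
Once the endpoint is in place, you can verify it with a short {infer} request. A minimal sketch (the input string is only an example):

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/azure_ai_studio_embeddings
{
  "input": "A sample sentence to embed"
}
------------------------------------------------------------
// TEST[skip:TBD]
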
NOTE: It may take a few minutes for your model's deployment to become available after it is created. If you try to create the model as above and receive a `404` error message, wait a few minutes and try again. Also, when using this model, the recommended similarity measure to use in the `dense_vector` field mapping is `dot_product`.

// end::azure-ai-studio[]

// tag::google-vertex-ai[]

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/google_vertex_ai_embeddings <1>
{
  "service": "googlevertexai",
  "service_settings": {
    "service_account_json": "", <2>
    "model_id": "text-embedding-004", <3>
    "location": "", <4>
    "project_id": "" <5>
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` per the path. `google_vertex_ai_embeddings` is the unique identifier of the {infer} endpoint (its `inference_id`).
<2> A valid service account in JSON format for the Google Vertex AI API.
<3> For a list of the available models, refer to the https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api[Text embeddings API] page.
<4> The name of the location to use for the {infer} task. Refer to https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations[Generative AI on Vertex AI locations] for available locations.
<5> The name of the project to use for the {infer} task.

// end::google-vertex-ai[]

// tag::mistral[]

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/mistral_embeddings <1>
{
  "service": "mistral",
  "service_settings": {
    "api_key": "", <2>
    "model": "" <3>
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `mistral_embeddings`.
<2> The API key for accessing the Mistral API. You can find this on your Mistral account's API Keys page.
<3> The Mistral embeddings model name, for example `mistral-embed`.

// end::mistral[]

// tag::amazon-bedrock[]

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/amazon_bedrock_embeddings <1>
{
  "service": "amazonbedrock",
  "service_settings": {
    "access_key": "", <2>
    "secret_key": "", <3>
    "region": "", <4>
    "provider": "", <5>
    "model": "" <6>
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `amazon_bedrock_embeddings`.
<2> The access key can be found on your AWS IAM management page for the user account used to access Amazon Bedrock.
<3> The secret key should be the paired key for the specified access key.
<4> Specify the region that your model is hosted in.
<5> Specify the model provider.
<6> The model ID or ARN of the model to use.
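
You can confirm that the endpoint works with a short test request. A minimal sketch (the input string is only an example):

[source,console]
------------------------------------------------------------
POST _inference/text_embedding/amazon_bedrock_embeddings
{
  "input": "A sample sentence to embed"
}
------------------------------------------------------------
// TEST[skip:TBD]
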
// end::amazon-bedrock[]

// tag::alibabacloud-ai-search[]

[source,console]
------------------------------------------------------------
PUT _inference/text_embedding/alibabacloud_ai_search_embeddings <1>
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "api_key": "", <2>
    "service_id": "", <3>
    "host": "", <4>
    "workspace": "" <5>
  }
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `text_embedding` in the path and the `inference_id`, which is the unique identifier of the {infer} endpoint, is `alibabacloud_ai_search_embeddings`.
<2> The API key for accessing the AlibabaCloud AI Search API. You can find your API keys in your AlibabaCloud account under the https://opensearch.console.aliyun.com/cn-shanghai/rag/api-key[API keys section]. You need to provide your API key only once. The <> does not return your API key.
<3> The AlibabaCloud AI Search embeddings model name, for example `ops-text-embedding-zh-001`.
<4> Your AlibabaCloud AI Search host address.
<5> The name of your AlibabaCloud AI Search workspace.

// end::alibabacloud-ai-search[]