diff --git a/docs/reference/inference/chat-completion-inference.asciidoc b/docs/reference/inference/chat-completion-inference.asciidoc
index 1d7d05b0f7d8..88699cca67af 100644
--- a/docs/reference/inference/chat-completion-inference.asciidoc
+++ b/docs/reference/inference/chat-completion-inference.asciidoc
@@ -13,9 +13,9 @@ However, if you do not plan to use the {infer} APIs to use these models or if yo
 [[chat-completion-inference-api-request]]
 ==== {api-request-title}
 
-`POST /_inference//_unified`
+`POST /_inference//_stream`
 
-`POST /_inference/chat_completion//_unified`
+`POST /_inference/chat_completion//_stream`
 
 
 [discrete]
@@ -37,7 +37,7 @@ It only works with the `chat_completion` task type for `openai` and `elastic` {i
 
 [NOTE]
 ====
-* The `chat_completion` task type is only available within the _unified API and only supports streaming.
+* The `chat_completion` task type is only available within the _stream API and only supports streaming.
 * The Chat completion {infer} API and the Stream {infer} API differ in their response structure and capabilities.
 The Chat completion {infer} API provides more comprehensive customization options through more fields and function calling support.
 If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
diff --git a/docs/reference/inference/elastic-infer-service.asciidoc b/docs/reference/inference/elastic-infer-service.asciidoc
index 24ae7e20deec..94f2f1992db7 100644
--- a/docs/reference/inference/elastic-infer-service.asciidoc
+++ b/docs/reference/inference/elastic-infer-service.asciidoc
@@ -39,7 +39,7 @@ Available task types:
 
 [NOTE]
 ====
-The `chat_completion` task type only supports streaming and only through the `_unified` API.
+The `chat_completion` task type only supports streaming and only through the `_stream` API.
 
 include::inference-shared.asciidoc[tag=chat-completion-docs]
 ====
diff --git a/docs/reference/inference/service-openai.asciidoc b/docs/reference/inference/service-openai.asciidoc
index 511632736a35..d2c0dd460f9e 100644
--- a/docs/reference/inference/service-openai.asciidoc
+++ b/docs/reference/inference/service-openai.asciidoc
@@ -38,7 +38,7 @@ Available task types:
 
 [NOTE]
 ====
-The `chat_completion` task type only supports streaming and only through the `_unified` API.
+The `chat_completion` task type only supports streaming and only through the `_stream` API.
 
 include::inference-shared.asciidoc[tag=chat-completion-docs]
 ====
diff --git a/docs/reference/search/search-your-data/cohere-es.asciidoc b/docs/reference/search/search-your-data/cohere-es.asciidoc
index 3029cfd9f098..cea17a4ed9a8 100644
--- a/docs/reference/search/search-your-data/cohere-es.asciidoc
+++ b/docs/reference/search/search-your-data/cohere-es.asciidoc
@@ -297,7 +297,7 @@ Rerank the results using the new {infer} endpoint.
 [source,py]
 --------------------------------------------------
 # Pass the query and the search results to the service
-response = client.inference.inference(
+response = client.inference.rerank(
     inference_id="cohere_rerank",
     body={
         "query": query,
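The final hunk renames the Python client call from the generic `client.inference.inference(...)` to the task-specific `client.inference.rerank(...)`. A minimal sketch of how the corrected call is assembled, assuming the `client`, `query`, and search hits from the surrounding `cohere-es` tutorial; only `"query"` is visible in the hunk context, so the `"input"` list of candidate texts and the `build_rerank_body` helper are illustrative assumptions, not part of the patch:

```python
def build_rerank_body(query, hits):
    """Assemble the request body passed to client.inference.rerank.

    Only "query" appears in the diff context above; the "input" list of
    candidate documents is an assumed shape for the elided rest of the body.
    """
    return {
        "query": query,
        # Assumed: the rerank candidates are the text of each retrieved hit.
        "input": [hit["_source"]["text"] for hit in hits],
    }

# Usage against a connected Elasticsearch Python client (not run here):
# response = client.inference.rerank(
#     inference_id="cohere_rerank",
#     body=build_rerank_body(query, search_hits),
# )
```

The point of the rename is that the client exposes per-task methods, so the reranking tutorial should call the rerank-specific method rather than the generic inference entry point.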