diff --git a/docs/reference/inference/chat-completion-inference.asciidoc b/docs/reference/inference/chat-completion-inference.asciidoc
index 1d7d05b0f7d8..88699cca67af 100644
--- a/docs/reference/inference/chat-completion-inference.asciidoc
+++ b/docs/reference/inference/chat-completion-inference.asciidoc
@@ -13,9 +13,9 @@ However, if you do not plan to use the {infer} APIs to use these models or if yo
 [[chat-completion-inference-api-request]]
 ==== {api-request-title}
 
-`POST /_inference//_unified`
+`POST /_inference//_stream`
 
-`POST /_inference/chat_completion//_unified`
+`POST /_inference/chat_completion//_stream`
 
 
 [discrete]
@@ -37,7 +37,7 @@ It only works with the `chat_completion` task type for `openai` and `elastic` {i
 
 [NOTE]
 ====
-* The `chat_completion` task type is only available within the _unified API and only supports streaming.
+* The `chat_completion` task type is only available within the _stream API and only supports streaming.
 * The Chat completion {infer} API and the Stream {infer} API differ in their response structure and capabilities.
 The Chat completion {infer} API provides more comprehensive customization options through more fields and function calling support.
 If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
diff --git a/docs/reference/inference/elastic-infer-service.asciidoc b/docs/reference/inference/elastic-infer-service.asciidoc
index 24ae7e20deec..94f2f1992db7 100644
--- a/docs/reference/inference/elastic-infer-service.asciidoc
+++ b/docs/reference/inference/elastic-infer-service.asciidoc
@@ -39,7 +39,7 @@ Available task types:
 
 [NOTE]
 ====
-The `chat_completion` task type only supports streaming and only through the `_unified` API.
+The `chat_completion` task type only supports streaming and only through the `_stream` API.
 
 include::inference-shared.asciidoc[tag=chat-completion-docs]
 ====
diff --git a/docs/reference/inference/service-openai.asciidoc b/docs/reference/inference/service-openai.asciidoc
index 511632736a35..d2c0dd460f9e 100644
--- a/docs/reference/inference/service-openai.asciidoc
+++ b/docs/reference/inference/service-openai.asciidoc
@@ -38,7 +38,7 @@ Available task types:
 
 [NOTE]
 ====
-The `chat_completion` task type only supports streaming and only through the `_unified` API.
+The `chat_completion` task type only supports streaming and only through the `_stream` API.
 
 include::inference-shared.asciidoc[tag=chat-completion-docs]
 ====
diff --git a/docs/reference/search/search-your-data/cohere-es.asciidoc b/docs/reference/search/search-your-data/cohere-es.asciidoc
index 3029cfd9f098..cea17a4ed9a8 100644
--- a/docs/reference/search/search-your-data/cohere-es.asciidoc
+++ b/docs/reference/search/search-your-data/cohere-es.asciidoc
@@ -297,7 +297,7 @@ Rerank the results using the new {infer} endpoint.
 [source,py]
 --------------------------------------------------
 # Pass the query and the search results to the service
-response = client.inference.inference(
+response = client.inference.rerank(
     inference_id="cohere_rerank",
     body={
         "query": query,
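The final hunk renames the Python client call from the generic `client.inference.inference(...)` to the task-specific `client.inference.rerank(...)`. A minimal sketch of how the corrected call is assembled, assuming the `client`, `query`, and search hits from the surrounding `cohere-es` tutorial; only `"query"` is visible in the hunk context, so the `"input"` list of candidate texts and the `build_rerank_body` helper are illustrative assumptions, not part of the patch:

```python
def build_rerank_body(query, hits):
    """Assemble the request body passed to client.inference.rerank.

    Only "query" appears in the diff context above; the "input" list of
    candidate documents is an assumed shape for the elided rest of the body.
    """
    return {
        "query": query,
        # Assumed: the rerank candidates are the text of each retrieved hit.
        "input": [hit["_source"]["text"] for hit in hits],
    }

# Usage against a connected Elasticsearch Python client (not run here):
# response = client.inference.rerank(
#     inference_id="cohere_rerank",
#     body=build_rerank_body(query, search_hits),
# )
```

The point of the rename is that the client exposes per-task methods, so the reranking tutorial should call the rerank-specific method rather than the generic inference entry point.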