[Inference API] Add completion task type docs (#106876)

2025-06-28 17:34:17 -04:00 · 2024-04-02 13:34:46 +02:00 · 2024-04-02 13:34:46 +02:00 · e56dcee078
commit e56dcee078
parent 7f17effb4f
2 changed files with 67 additions and 4 deletions
--- a/docs/reference/inference/post-inference.asciidoc
+++ b/docs/reference/inference/post-inference.asciidoc
@@ -33,8 +33,8 @@ own model, use the <<ml-df-trained-models-apis>>.
 ==== {api-description-title}

 The perform {infer} API enables you to use {ml} models to perform specific tasks
-on data that you provide as an input. The API returns a response with the 
-resutls of the tasks. The {infer} model you use can perform one specific task
+on data that you provide as an input. The API returns a response with the
+results of the tasks. The {infer} model you use can perform one specific task
 that has been defined when the model was created with the <<put-inference-api>>.


@@ -60,6 +60,10 @@ The type of {infer} task that the model performs.
 (Required, array of strings)
 The text on which you want to perform the {infer} task.
 `input` can be a single string or an array.
+[NOTE]
+====
+Inference endpoints for the `completion` task type currently support only a single string as input.
+====


 [discrete]
@@ -108,3 +112,32 @@ The API returns the following response:
 }
 ------------------------------------------------------------
 // NOTCONSOLE
+
+
+The next example performs a completion on an example question.
+
+
+[source,console]
+------------------------------------------------------------
+POST _inference/completion/openai_chat_completions
+{
+  "input": "What is Elastic?"
+}
+------------------------------------------------------------
+// TEST[skip:TBD]
+
+
+The API returns the following response:
+
+
+[source,console-result]
+------------------------------------------------------------
+{
+  "completion": [
+    {
+      "result": "Elastic is a company that provides a range of software solutions for search, logging, security, and analytics. Their flagship product is Elasticsearch, an open-source, distributed search engine that allows users to search, analyze, and visualize large volumes of data in real-time. Elastic also offers products such as Kibana, a data visualization tool, and Logstash, a log management and pipeline tool, as well as various other tools and solutions for data analysis and management."
+    }
+  ]
+}
+------------------------------------------------------------
+// NOTCONSOLE
--- a/docs/reference/inference/put-inference.asciidoc
+++ b/docs/reference/inference/put-inference.asciidoc
@@ -58,7 +58,8 @@ The unique identifier of the {infer} endpoint.
 (Required, string)
 The type of the {infer} task that the model will perform. Available task types:
 * `sparse_embedding`,
-* `text_embedding`.
+* `text_embedding`,
+* `completion`.


 [discrete]
@@ -101,7 +102,7 @@ the same name and the updated API key.
 (Optional, string)
 Specifies the types of embeddings you want to get back. Defaults to `float`.
 Valid values are:
-  * `byte`: use it for signed int8 embeddings (this is a synonym of `int8`). 
+  * `byte`: use it for signed int8 embeddings (this is a synonym of `int8`).
  * `float`: use it for the default float embeddings.
  * `int8`: use it for signed int8 embeddings.

@@ -232,6 +233,18 @@ maximum token length. Defaults to `END`. Valid values are:
 the input is discarded.
 * `END`: when the input exceeds the maximum input token length the end of
 the input is discarded.
+
+`user`:::
+(Optional, string)
+For `openai` service only. Specifies the user issuing the request, which can be used for abuse detection.
+=====
+
+.`task_settings` for the `completion` task type
+[%collapsible%closed]
+=====
+`user`:::
+(Optional, string)
+For `openai` service only. Specifies the user issuing the request, which can be used for abuse detection.
 =====


@@ -402,3 +415,20 @@ PUT _inference/text_embedding/openai_embeddings
 }
 ------------------------------------------------------------
 // TEST[skip:TBD]
+
+The next example shows how to create an {infer} endpoint called
+`openai_completion` to perform a `completion` task.
+
+[source,console]
+------------------------------------------------------------
+PUT _inference/completion/openai_completion
+{
+    "service": "openai",
+    "service_settings": {
+        "api_key": "<api_key>",
+        "model_id": "gpt-3.5-turbo"
+    }
+}
+------------------------------------------------------------
+// TEST[skip:TBD]
+
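
As a rough illustration of the documented behavior, the following Python sketch builds the request for a `completion` endpoint and reads the result back out of the response. It is not part of this change: the helper names `build_completion_request` and `extract_completions` are hypothetical, no HTTP call is made, and the response shape is assumed to match the example response shown in the diff above.

```python
import json
from typing import List, Tuple


def build_completion_request(endpoint_id: str, text: str) -> Tuple[str, str]:
    """Return the request path and JSON body for a completion call.

    Hypothetical helper: it mirrors the documented note that completion
    endpoints currently accept a single string, not an array, as input.
    """
    if not isinstance(text, str):
        raise TypeError("completion endpoints currently accept a single string as input")
    path = "_inference/completion/" + endpoint_id
    body = json.dumps({"input": text})
    return path, body


def extract_completions(response_body: str) -> List[str]:
    """Collect the `result` strings from a completion response.

    Assumes the shape {"completion": [{"result": "..."}]} taken from the
    example response in the docs.
    """
    payload = json.loads(response_body)
    return [item["result"] for item in payload["completion"]]
```

The path and body returned by `build_completion_request` could then be sent as a `POST` with any HTTP client, and the raw response text passed to `extract_completions`.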