[[infer-service-openai]] === OpenAI {infer} service Creates an {infer} endpoint to perform an {infer} task with the `openai` service. [discrete] [[infer-service-openai-api-request]] ==== {api-request-title} `PUT /_inference//` [discrete] [[infer-service-openai-api-path-params]] ==== {api-path-parms-title} ``:: (Required, string) include::inference-shared.asciidoc[tag=inference-id] ``:: (Required, string) include::inference-shared.asciidoc[tag=task-type] + -- Available task types: * `completion`, * `text_embedding`. -- [discrete] [[infer-service-openai-api-request-body]] ==== {api-request-body-title} `service`:: (Required, string) The type of service supported for the specified task type. In this case, `openai`. `service_settings`:: (Required, object) include::inference-shared.asciidoc[tag=service-settings] + -- These settings are specific to the `openai` service. -- `api_key`::: (Required, string) A valid API key of your OpenAI account. You can find your OpenAI API keys in your OpenAI account under the https://platform.openai.com/api-keys[API keys section]. + -- include::inference-shared.asciidoc[tag=api-key-admonition] -- `model_id`::: (Required, string) The name of the model to use for the {infer} task. Refer to the https://platform.openai.com/docs/guides/embeddings/what-are-embeddings[OpenAI documentation] for the list of available text embedding models. `organization_id`::: (Optional, string) The unique identifier of your organization. You can find the Organization ID in your OpenAI account under https://platform.openai.com/account/organization[**Settings** > **Organizations**]. `url`::: (Optional, string) The URL endpoint to use for the requests. Can be changed for testing purposes. Defaults to `https://api.openai.com/v1/embeddings`. `rate_limit`::: (Optional, object) The `openai` service sets a default number of requests allowed per minute depending on the task type. For `text_embedding` it is set to `3000`. For `completion` it is set to `500`. This helps to minimize the number of rate limit errors returned from OpenAI. To modify this, set the `requests_per_minute` setting of this object in your service settings: + -- include::inference-shared.asciidoc[tag=request-per-minute-example] More information about the rate limits for OpenAI can be found in your https://platform.openai.com/account/limits[Account limits]. -- `task_settings`:: (Optional, object) include::inference-shared.asciidoc[tag=task-settings] + .`task_settings` for the `completion` task type [%collapsible%closed] ===== `user`::: (Optional, string) Specifies the user issuing the request, which can be used for abuse detection. ===== + .`task_settings` for the `text_embedding` task type [%collapsible%closed] ===== `user`::: (optional, string) Specifies the user issuing the request, which can be used for abuse detection. ===== [discrete] [[inference-example-openai]] ==== OpenAI service example The following example shows how to create an {infer} endpoint called `openai-embeddings` to perform a `text_embedding` task type. [source,console] ------------------------------------------------------------ PUT _inference/text_embedding/openai-embeddings { "service": "openai", "service_settings": { "api_key": "", "model_id": "text-embedding-ada-002" } } ------------------------------------------------------------ // TEST[skip:TBD] The next example shows how to create an {infer} endpoint called `openai-completion` to perform a `completion` task type. [source,console] ------------------------------------------------------------ PUT _inference/completion/openai-completion { "service": "openai", "service_settings": { "api_key": "", "model_id": "gpt-3.5-turbo" } } ------------------------------------------------------------ // TEST[skip:TBD]