[role="xpack"]
[[put-inference-api]]
=== Create {infer} API

experimental[]

Creates an {infer} endpoint to perform an {infer} task.

[IMPORTANT]
====
* The {infer} APIs enable you to use certain services, such as built-in {ml} models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic or Hugging Face.
* For built-in models and models uploaded through Eland, the {infer} APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the {infer} APIs to use these models or if you want to use non-NLP models, use the <>.
====

[discrete]
[[put-inference-api-request]]
==== {api-request-title}

`PUT /_inference//`

[discrete]
[[put-inference-api-prereqs]]
==== {api-prereq-title}

* Requires the `manage_inference` <> (the built-in `inference_admin` role grants this privilege)

[discrete]
[[put-inference-api-path-params]]
==== {api-path-parms-title}

``::
(Required, string)
include::inference-shared.asciidoc[tag=inference-id]

``::
(Required, string)
include::inference-shared.asciidoc[tag=task-type]
+
--
Refer to the service list in the <> for the available task types.
--

[discrete]
[[put-inference-api-desc]]
==== {api-description-title}

The create {infer} API enables you to create an {infer} endpoint and configure a {ml} model to perform a specific {infer} task.

The following services are available through the {infer} API.
You can find the available task types next to the service name.
Click the links to review the configuration details of the services:

* <> (`rerank`, `sparse_embedding`, `text_embedding`)
* <> (`completion`, `text_embedding`)
* <> (`completion`)
* <> (`completion`, `text_embedding`)
* <> (`completion`, `text_embedding`)
* <> (`completion`, `rerank`, `text_embedding`)
* <> (`rerank`, `sparse_embedding`, `text_embedding` - this service is for built-in models and models uploaded through Eland)
* <> (`sparse_embedding`)
* <> (`completion`, `text_embedding`)
* <> (`rerank`, `text_embedding`)
* <> (`text_embedding`)
* <> (`text_embedding`)
* <> (`completion`, `text_embedding`)

The {es} and ELSER services run on a {ml} node in your {es} cluster.
The rest of the services connect to external providers.
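
As a rough illustration of how a client might assemble a create request, the following Python sketch builds the method, path, and JSON body without sending anything.
It rests on several assumptions that this section does not spell out: that the path places the task type before the endpoint ID, that the body carries `service` and `service_settings` keys, and that `num_allocations` and `num_threads` are valid ELSER settings.
The helper `build_create_request` and the endpoint ID `my-elser-endpoint` are hypothetical names used only for this example.
Check the service pages linked above before relying on any of these details.

```python
import json
from urllib.parse import quote

# Task types that appear in the service list above.
KNOWN_TASK_TYPES = {"completion", "rerank", "sparse_embedding", "text_embedding"}


def build_create_request(task_type, inference_id, service, service_settings):
    """Return (method, path, body) for a create-endpoint call.

    Assumption: the path is /_inference/{task_type}/{inference_id} and the
    body uses `service` and `service_settings` keys. Verify both against
    the configuration page of the service you use.
    """
    if task_type not in KNOWN_TASK_TYPES:
        raise ValueError(f"unsupported task type: {task_type!r}")
    if not inference_id:
        raise ValueError("inference_id must not be empty")
    # Percent-encode both path segments so unusual IDs cannot break the URL.
    path = f"/_inference/{quote(task_type, safe='')}/{quote(inference_id, safe='')}"
    body = json.dumps({"service": service, "service_settings": service_settings})
    return "PUT", path, body


method, path, body = build_create_request(
    "sparse_embedding",
    "my-elser-endpoint",  # hypothetical endpoint ID
    "elser",
    {"num_allocations": 1, "num_threads": 1},  # illustrative settings only
)
print(method, path)
print(body)
```

Validating the task type on the client side is a convenience rather than a requirement: the set above only mirrors the task types named in this section and does not check which types a particular service supports.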