[[infer-service-anthropic]] === Anthropic {infer} integration .New API reference [sidebar] -- For the most up-to-date API details, refer to {api-es}/group/endpoint-inference[{infer-cap} APIs]. -- Creates an {infer} endpoint to perform an {infer} task with the `anthropic` service. [discrete] [[infer-service-anthropic-api-request]] ==== {api-request-title} `PUT /_inference//` [discrete] [[infer-service-anthropic-api-path-params]] ==== {api-path-parms-title} ``:: (Required, string) include::inference-shared.asciidoc[tag=inference-id] ``:: (Required, string) include::inference-shared.asciidoc[tag=task-type] + -- Available task types: * `completion` -- [discrete] [[infer-service-anthropic-api-request-body]] ==== {api-request-body-title} `chunking_settings`:: (Optional, object) include::inference-shared.asciidoc[tag=chunking-settings] `max_chunk_size`::: (Optional, integer) include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size] `overlap`::: (Optional, integer) include::inference-shared.asciidoc[tag=chunking-settings-overlap] `sentence_overlap`::: (Optional, integer) include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap] `strategy`::: (Optional, string) include::inference-shared.asciidoc[tag=chunking-settings-strategy] `service`:: (Required, string) The type of service supported for the specified task type. In this case, `anthropic`. `service_settings`:: (Required, object) include::inference-shared.asciidoc[tag=service-settings] + -- These settings are specific to the `anthropic` service. -- `api_key`::: (Required, string) A valid API key for the Anthropic API. `model_id`::: (Required, string) The name of the model to use for the {infer} task. You can find the supported models at https://docs.anthropic.com/en/docs/about-claude/models#model-names[Anthropic models]. `rate_limit`::: (Optional, object) By default, the `anthropic` service sets the number of requests allowed per minute to `50`. This helps to minimize the number of rate limit errors returned from Anthropic. To modify this, set the `requests_per_minute` setting of this object in your service settings: + -- include::inference-shared.asciidoc[tag=request-per-minute-example] -- `task_settings`:: (Required, object) include::inference-shared.asciidoc[tag=task-settings] + .`task_settings` for the `completion` task type [%collapsible%closed] ===== `max_tokens`::: (Required, integer) The maximum number of tokens to generate before stopping. `temperature`::: (Optional, float) The amount of randomness injected into the response. + For more details about the supported range, see the https://docs.anthropic.com/en/api/messages[Anthropic messages API]. `top_k`::: (Optional, integer) Specifies to only sample from the top K options for each subsequent token. + Recommended for advanced use cases only. You usually only need to use `temperature`. + For more details, see the https://docs.anthropic.com/en/api/messages[Anthropic messages API]. `top_p`::: (Optional, float) Specifies to use Anthropic's nucleus sampling. + In nucleus sampling, Anthropic computes the cumulative distribution over all the options for each subsequent token in decreasing probability order and cut it off once it reaches a particular probability specified by `top_p`. You should either alter `temperature` or `top_p`, but not both. + Recommended for advanced use cases only. You usually only need to use `temperature`. + For more details, see the https://docs.anthropic.com/en/api/messages[Anthropic messages API]. ===== [discrete] [[inference-example-anthropic]] ==== Anthropic service example The following example shows how to create an {infer} endpoint called `anthropic_completion` to perform a `completion` task type. [source,console] ------------------------------------------------------------ PUT _inference/completion/anthropic_completion { "service": "anthropic", "service_settings": { "api_key": "", "model_id": "" }, "task_settings": { "max_tokens": 1024 } } ------------------------------------------------------------ // TEST[skip:TBD]