[[infer-service-google-vertex-ai]] === Google Vertex AI {infer} integration .New API reference [sidebar] -- For the most up-to-date API details, refer to {api-es}/group/endpoint-inference[{infer-cap} APIs]. -- Creates an {infer} endpoint to perform an {infer} task with the `googlevertexai` service. [discrete] [[infer-service-google-vertex-ai-api-request]] ==== {api-request-title} `PUT /_inference//` [discrete] [[infer-service-google-vertex-ai-path-params]] ==== {api-path-parms-title} ``:: (Required, string) include::inference-shared.asciidoc[tag=inference-id] ``:: (Required, string) include::inference-shared.asciidoc[tag=task-type] + -- Available task types: * `rerank` * `text_embedding`. -- [discrete] [[infer-service-google-vertex-ai-api-request-body]] ==== {api-request-body-title} `chunking_settings`:: (Optional, object) include::inference-shared.asciidoc[tag=chunking-settings] `max_chunk_size`::: (Optional, integer) include::inference-shared.asciidoc[tag=chunking-settings-max-chunking-size] `overlap`::: (Optional, integer) include::inference-shared.asciidoc[tag=chunking-settings-overlap] `sentence_overlap`::: (Optional, integer) include::inference-shared.asciidoc[tag=chunking-settings-sentence-overlap] `strategy`::: (Optional, string) include::inference-shared.asciidoc[tag=chunking-settings-strategy] `service`:: (Required, string) The type of service supported for the specified task type. In this case, `googlevertexai`. `service_settings`:: (Required, object) include::inference-shared.asciidoc[tag=service-settings] + -- These settings are specific to the `googlevertexai` service. -- `service_account_json`::: (Required, string) A valid service account in json format for the Google Vertex AI API. `model_id`::: (Required, string) The name of the model to use for the {infer} task. You can find the supported models at https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api[Text embeddings API]. `location`::: (Required, string) The name of the location to use for the {infer} task. You find the supported locations at https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations[Generative AI on Vertex AI locations]. `project_id`::: (Required, string) The name of the project to use for the {infer} task. `rate_limit`::: (Optional, object) By default, the `googlevertexai` service sets the number of requests allowed per minute to `30.000`. This helps to minimize the number of rate limit errors returned from Google Vertex AI. To modify this, set the `requests_per_minute` setting of this object in your service settings: + -- include::inference-shared.asciidoc[tag=request-per-minute-example] More information about the rate limits for Google Vertex AI can be found in the https://cloud.google.com/vertex-ai/docs/quotas[Google Vertex AI Quotas docs]. -- `task_settings`:: (Optional, object) include::inference-shared.asciidoc[tag=task-settings] + .`task_settings` for the `rerank` task type [%collapsible%closed] ===== `top_n`::: (optional, boolean) Specifies the number of the top n documents, which should be returned. ===== + .`task_settings` for the `text_embedding` task type [%collapsible%closed] ===== `auto_truncate`::: (optional, boolean) Specifies if the API truncates inputs longer than the maximum token length automatically. ===== [discrete] [[inference-example-google-vertex-ai]] ==== Google Vertex AI service example The following example shows how to create an {infer} endpoint called `google_vertex_ai_embeddings` to perform a `text_embedding` task type. [source,console] ------------------------------------------------------------ PUT _inference/text_embedding/google_vertex_ai_embeddings { "service": "googlevertexai", "service_settings": { "service_account_json": "", "model_id": "", "location": "", "project_id": "" } } ------------------------------------------------------------ // TEST[skip:TBD] The next example shows how to create an {infer} endpoint called `google_vertex_ai_rerank` to perform a `rerank` task type. [source,console] ------------------------------------------------------------ PUT _inference/rerank/google_vertex_ai_rerank { "service": "googlevertexai", "service_settings": { "service_account_json": "", "project_id": "" } } ------------------------------------------------------------ // TEST[skip:TBD]