### Gemini connector: makes `maxOutputTokens` an optional, configurable request parameter
This PR updates the Gemini connector to make the `maxOutputTokens` request parameter optional and configurable. Previously, it was included in all requests with a non-configurable value of `8192`.
- `maxOutputTokens` is no longer included in requests to Gemini models by default
- It may optionally be included in requests by passing the new `maxOutputTokens` parameter to the Gemini connector's `invokeAI`, `invokeAIRaw`, and `invokeStream` APIs
The driver for this change was that some requests to newer models, e.g. Gemini 2.5 Pro, were failing because all requests from the Gemini connector were previously sent with the value `8192` [here](f3115c6746/x-pack/platform/plugins/shared/stack_connectors/server/connector_types/gemini/gemini.ts (L397)).
The previous, non-configurable limit was less than the output token limit of newer Gemini models, as illustrated by the following table:
| Model | Supported `maxOutputTokens` range |
|--------------------|------------------------------------|
| Gemini 1.5 Flash | 1 (inclusive) to 8193 (exclusive) |
| Gemini 1.5 Pro | 1 (inclusive) to 8193 (exclusive) |
| Gemini 1.5 Pro 002 | 1 (inclusive) to 32769 (exclusive) |
| Gemini 2.5 Pro | 1 (inclusive) to 65536 (exclusive) |
When newer Gemini models generated a response with more than `8192` output tokens, the API response from Gemini would:
- Not contain the generated output (because the `maxOutputTokens` limit was reached)
- Include a finish reason of `MAX_TOKENS`,
- Fail response validation in the connector, because the `usageMetadata` response metadata was missing the required `candidatesTokenCount` property, per the _Details_ below.
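The failing check can be illustrated with a simplified stand-in for the connector's response validation (a sketch; the real connector uses Kibana's schema utilities, and the interface below is an assumption):

```typescript
// Simplified shape of the usageMetadata block returned by Gemini.
// candidatesTokenCount is absent when finishReason is MAX_TOKENS.
interface UsageMetadata {
  promptTokenCount: number;
  candidatesTokenCount?: number;
  totalTokenCount: number;
}

// Stand-in for the schema check that previously rejected MAX_TOKENS
// responses: returns the validation error message, or null when valid.
function validateUsageMetadata(usage: UsageMetadata): string | null {
  if (typeof usage.candidatesTokenCount !== 'number') {
    return '[usageMetadata.candidatesTokenCount]: expected value of type [number] but got [undefined]';
  }
  return null;
}
```

A response truncated at the limit omits `candidatesTokenCount`, so the check above returns the error message seen in the traces below.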
#### Details
Attack discovery requests to Gemini 2.5 Pro were failing with errors like the following:
```
ActionsClientLlm: action result status is error: an error occurred while running the action - Response validation failed (Error: [usageMetadata.candidatesTokenCount]: expected value of type [number] but got [undefined])
```
A full example error trace appears in the _Details_ below (click to expand):
<details>
<summary>Full error trace</summary>

```
ActionsClientLlm: action result status is error: an error occurred while running the action - Response validation failed (Error: [usageMetadata.candidatesTokenCount]: expected value of type [number] but got [undefined])
Response validation failed (Error: [usageMetadata.candidatesTokenCount]: expected value of type [number] but got [undefined]): ActionsClientLlm: action result status is error: an error occurred while running the action - Response validation failed (Error: [usageMetadata.candidatesTokenCount]: expected value of type [number] but got [undefined])
at ActionsClientLlm._call (/Users/$USER/git/kibana/x-pack/platform/packages/shared/kbn-langchain/server/language_models/llm.ts:112:21)
at async ActionsClientLlm._generate (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:375:29)
at async ActionsClientLlm._generateUncached (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:176:26)
at async ActionsClientLlm.invoke (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:34:24)
at async RunnableSequence.invoke (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/runnables/base.cjs:1291:27)
at async RunnableCallable.generate [as func] (/Users/$USER/git/kibana/x-pack/solutions/security/plugins/elastic_assistant/server/lib/attack_discovery/graphs/default_attack_discovery_graph/nodes/generate/index.ts:61:27)
at async RunnableCallable.invoke (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/utils.cjs:82:27)
at async RunnableSequence.invoke (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/runnables/base.cjs:1285:33)
at async _runWithRetry (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/retry.cjs:70:22)
at async PregelRunner._executeTasksWithRetry (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/runner.cjs:211:33)
at async PregelRunner.tick (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/runner.cjs:48:40)
at async CompiledStateGraph._runLoop (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/index.cjs:1018:17)
at async createAndRunLoop (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/index.cjs:907:17)
at ActionsClientLlm._generateUncached (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:176:37)
at ActionsClientLlm._generate (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:374:20)
at ActionsClientLlm._generateUncached (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:176:37)
at async ActionsClientLlm.invoke (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:34:24)
at async RunnableSequence.invoke (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/runnables/base.cjs:1291:27)
at async RunnableCallable.generate [as func] (/Users/$USER/git/kibana/x-pack/solutions/security/plugins/elastic_assistant/server/lib/attack_discovery/graphs/default_attack_discovery_graph/nodes/generate/index.ts:61:27)
at async RunnableCallable.invoke (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/utils.cjs:82:27)
at async RunnableSequence.invoke (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/runnables/base.cjs:1285:33)
at async _runWithRetry (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/retry.cjs:70:22)
at async PregelRunner._executeTasksWithRetry (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/runner.cjs:211:33)
at async PregelRunner.tick (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/runner.cjs:48:40)
at async CompiledStateGraph._runLoop (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/index.cjs:1018:17)
at async createAndRunLoop (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/index.cjs:907:17)
at ActionsClientLlm._generateUncached (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:142:51)
at CallbackManager.handleLLMStart (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/callbacks/manager.cjs:464:25)
at ActionsClientLlm._generateUncached (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:142:51)
at ActionsClientLlm._generateUncached (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:136:73)
at ActionsClientLlm._generateUncached (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:136:73)
at ActionsClientLlm._generateUncached (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:129:28)
at ActionsClientLlm.generate (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:281:25)
at ActionsClientLlm.generatePrompt (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:97:21)
at ActionsClientLlm.invoke (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/language_models/llms.cjs:34:35)
at RunnableSequence.invoke (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/runnables/base.cjs:1291:43)
at async RunnableCallable.generate [as func] (/Users/$USER/git/kibana/x-pack/solutions/security/plugins/elastic_assistant/server/lib/attack_discovery/graphs/default_attack_discovery_graph/nodes/generate/index.ts:61:27)
at async RunnableCallable.invoke (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/utils.cjs:82:27)
at async RunnableSequence.invoke (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/runnables/base.cjs:1285:33)
at async _runWithRetry (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/retry.cjs:70:22)
at async PregelRunner._executeTasksWithRetry (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/runner.cjs:211:33)
at async PregelRunner.tick (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/runner.cjs:48:40)
at async CompiledStateGraph._runLoop (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/index.cjs:1018:17)
at async createAndRunLoop (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/pregel/index.cjs:907:17)
at RunnableSequence.invoke (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/runnables/base.cjs:1285:70)
at raceWithSignal (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/utils/signal.cjs:4:30)
at RunnableSequence.invoke (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/runnables/base.cjs:1285:70)
at async RunnableCallable.generate [as func] (/Users/$USER/git/kibana/x-pack/solutions/security/plugins/elastic_assistant/server/lib/attack_discovery/graphs/default_attack_discovery_graph/nodes/generate/index.ts:61:27)
at async RunnableCallable.invoke (/Users/$USER/git/kibana/node_modules/@langchain/langgraph/dist/utils.cjs:82:27)
at async RunnableSequence.invoke (/Users/$USER/git/kibana/node_modules/@langchain/core/dist/runnables/base.cjs:1285:33)
```
</details>
The root cause of the validation error above was that models like Gemini 2.5 Pro were generating responses whose output tokens reached the `maxOutputTokens` limit sent in the request.
When this happens, the response from the model:
- Includes a `finishReason` of `MAX_TOKENS`
- Omits the `candidatesTokenCount` property from `usageMetadata`, which causes the validation error
as illustrated by the following (partial) response:
```json
{
"candidates": [
{
"content": {
"role": "model",
"parts": [
{
"text": ""
}
]
},
"finishReason": "MAX_TOKENS",
// ...
}
],
"usageMetadata": {
"promptTokenCount": 82088,
"totalTokenCount": 90280,
"trafficType": "ON_DEMAND",
"promptTokensDetails": [
{
"modality": "TEXT",
"tokenCount": 82088
}
],
"thoughtsTokenCount": 8192
},
"modelVersion": "gemini-2.5-pro-preview-03-25",
"createTime": "2025-05-06T18:05:44.475008Z",
"responseId": "eE8aaID_HJXj-O4PifyReA"
}
```
To resolve this issue, the `maxOutputTokens` request parameter is now only sent when it is explicitly specified via the Gemini connector's `invokeAI`, `invokeAIRaw`, and `invokeStream` APIs.
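The fix can be sketched as a conditional spread when building the request's generation config (a simplified shape; the real connector's payload is richer, and the function name here is an assumption):

```typescript
// Simplified generation config sent to Gemini.
interface GenerationConfig {
  temperature: number;
  maxOutputTokens?: number;
}

// Build the config: maxOutputTokens is only present in the serialized
// request when the caller explicitly passes it.
function buildGenerationConfig(
  temperature: number,
  maxOutputTokens?: number
): GenerationConfig {
  return {
    temperature,
    // Conditional spread: omit the key entirely when undefined
    ...(maxOutputTokens !== undefined ? { maxOutputTokens } : {}),
  };
}
```

When the argument is omitted, the key is absent from the request body, so Gemini applies its own model-specific default instead of a hard-coded `8192`.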
#### Alternatives considered
All of the Gemini models tested fail with errors like the following when a request specifies a `maxOutputTokens` value outside their model-specific supported range:
```
Status code: 400. Message: API Error: INVALID_ARGUMENT: Unable to submit request because it has a maxOutputTokens value of 4096000 but the supported range is from 1 (inclusive) to 65536 (exclusive). Update the value and try again.: ActionsClientLlm: action result status is error: an error occurred while running the action - Status code: 400. Message: API Error: INVALID_ARGUMENT: Unable to submit request because it has a maxOutputTokens value of 4096000 but the supported range is from 1 (inclusive) to 65536 (exclusive). Update the value and try again.
```
For example, passing a `maxOutputTokens` value of `32768` works for `Gemini 1.5 Pro 002` and `Gemini 2.5 Pro`, but fails for `Gemini 1.5 Flash` and `Gemini 1.5 Pro`:
| Model | Supported `maxOutputTokens` range |
|--------------------|------------------------------------|
| Gemini 1.5 Flash | 1 (inclusive) to 8193 (exclusive) |
| Gemini 1.5 Pro | 1 (inclusive) to 8193 (exclusive) |
| Gemini 1.5 Pro 002 | 1 (inclusive) to 32769 (exclusive) |
| Gemini 2.5 Pro | 1 (inclusive) to 65536 (exclusive) |
As an alternative to removing `maxOutputTokens` from requests, we could have made it a required parameter to the `invokeAI`, `invokeAIRaw`, and `invokeStream` APIs, but that would require callers of these APIs to know the correct model-specific value to provide.
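The rejected alternative would force every caller to maintain something like the following lookup table (hypothetical; limits are the highest valid values implied by the exclusive upper bounds in the table above):

```typescript
// Hypothetical model-to-limit map callers would need to maintain.
const MAX_OUTPUT_TOKEN_LIMITS: Record<string, number> = {
  'gemini-1.5-flash': 8192,
  'gemini-1.5-pro': 8192,
  'gemini-1.5-pro-002': 32768,
  'gemini-2.5-pro': 65535,
};

// Callers would have to clamp requested values per model before every
// request; unknown models would still risk a 400 INVALID_ARGUMENT.
function clampMaxOutputTokens(model: string, requested: number): number {
  const limit = MAX_OUTPUT_TOKEN_LIMITS[model];
  return limit === undefined ? requested : Math.min(requested, limit);
}
```

Keeping such a table current as new Gemini models ship is exactly the burden this PR avoids by omitting the parameter and letting the model apply its own default.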
#### Desk testing
This PR was desk tested:
- Interactively in Attack discovery with the following models:
| Model |
|--------------------|
| Gemini 1.5 Flash |
| Gemini 1.5 Pro |
| Gemini 1.5 Pro 002 |
| Gemini 2.5 Pro |
- Interactively in Attack discovery with non-Gemini models, e.g. a subset of the GPT and Claude family of models (to test for unintended side effects)
- Via evaluations using `Eval AD: All Scenarios` for the Gemini connectors above (and with the Claude / OpenAI models for regression testing)
- Interactively via the security assistant, to answer requests in tool-calling and non-tool-calling scenarios