kibana/dev_docs
Garrett Spong e57663a0cf
[Security Assistant] Adds BuildKite pipeline for running Security GenAI Evaluations weekly (#215254)
## Summary

Introduces a new `security_solution/gen_ai_evals.yml` BuildKite pipeline
for automatically running our Assistant and Attack Discovery evaluation
suites weekly.

### To run locally:
Ensure you are authenticated with vault for LLM + LangSmith creds:

> See [internal
docs](https://github.com/elastic/infra/blob/master/docs/vault/README.md#login-with-your-okta)
for setup/login instructions.

Fetch Connectors and LangSmith creds:

> [!NOTE]
> In discussion with @elastic/kibana-operations it was preferred to use
the ci-prod secrets vault, so we cannot self-manage the secrets. To test
this locally though, you can grab the secrets and follow the
instructions in this [paste
bin](https://p.elstc.co/paste/q7k+zYOc#PN0kasw11u2J0XWC2Ls5PMNWreKzKTpgWA1wtsPzeH+).

```
cd x-pack/test/security_solution_api_integration
node scripts/genai/vault/retrieve_secrets.js
```


Navigate to the API integration directory, load the env vars, and start
the server:
```
cd x-pack/test/security_solution_api_integration
export KIBANA_SECURITY_TESTING_AI_CONNECTORS=$(base64 -w 0 < scripts/genai/vault/connector_config.json) && export KIBANA_SECURITY_TESTING_LANGSMITH_KEY=$(base64 -w 0 < scripts/genai/vault/langsmith_key.txt)
yarn genai_evals:server:ess
```

Then in another terminal, load vars and run the tests:
```
cd x-pack/test/security_solution_api_integration
export KIBANA_SECURITY_TESTING_AI_CONNECTORS=$(base64 -w 0 < scripts/genai/vault/connector_config.json) && export KIBANA_SECURITY_TESTING_LANGSMITH_KEY=$(base64 -w 0 < scripts/genai/vault/langsmith_key.txt)
yarn genai_evals:runner:ess
```
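
The `export` lines above pass each secret file to the tests as a single-line base64 string. As a quick sanity check, the encoding can be confirmed to round-trip before starting the server. This is a minimal sketch, assuming GNU coreutils `base64` (the `-w 0` flag is GNU-specific); `sample.json` is a hypothetical stand-in for `connector_config.json`, so no real secrets are touched:

```shell
# Hypothetical round-trip check using a throwaway file, not the real secrets.
tmp=$(mktemp -d)
printf '{"example":"value"}' > "$tmp/sample.json"

# Encode exactly as the export lines do: one line, no wrapping.
encoded=$(base64 -w 0 < "$tmp/sample.json")

# Decode again and compare against the original file contents.
decoded=$(printf '%s' "$encoded" | base64 -d)
if [ "$decoded" = "$(cat "$tmp/sample.json")" ]; then
  echo "round-trip ok"
else
  echo "round-trip failed" >&2
fi

rm -rf "$tmp"
```

If the decoded value differs from the file, the most likely cause is a `base64` variant that wraps its output or does not accept `-w 0`.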

### To manually run on BuildKite:
Navigate to
[BuildKite](https://buildkite.com/elastic?filter=ftr-security-solution-gen-ai-evaluations)
and run the `ftr-security-solution-gen-ai-evaluations` pipeline.

### To manually run on BuildKite for a specific PR:
In `.buildkite/ftr_security_stateful_configs.yml`, temporarily move the
`genai/evaluations/trial_license_complete_tier/configs/ess.config.ts`
line down to the `enabled` section. I will see whether this can be done
without requiring a commit. @elastic/kibana-operations, is it possible to
set a BuildKite env var that can be read in FTR tests when a specific
GitHub label is added to the PR? I.e., can I create a
`SecurityGenAI:Run Evals` label that, when added, runs this suite as part
of the build?

> [!NOTE]
> Currently the connectors secrets only include `gpt-4o` and
`gpt-4o-mini`. Waiting on finalized list w/ credentials from @jamesspi
and @peluja1012 and then we can have ops update using the scripts
included in this PR.

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Patryk Kopycinski <patryk.kopycinski@elastic.co>
2025-04-24 11:46:57 -06:00
assets Developer documentation for designing feature privileges (#166716) 2023-09-27 13:43:55 +02:00
contributing [Docs] Added dependency evaluation process (#216494) 2025-04-15 14:54:55 +02:00
getting_started SKA: Update broken references and URLs (#206836) 2025-01-28 03:32:48 +00:00
key_concepts [Authz] Added allOf and anyOf nested conditions (#215516) 2025-04-03 14:28:17 +02:00
lens [Lens] fit line charts by default (#196184) 2024-10-21 15:05:02 +02:00
operations [EuiProvider / Functional tests] Check for EuiProvider Dev Warning (#189018) 2024-08-26 15:08:32 -05:00
shared_ux [CoreRenderingService] Add dev docs (#218630) 2025-04-24 00:08:38 +02:00
tutorials [Security Assistant] Adds BuildKite pipeline for running Security GenAI Evaluations weekly (#215254) 2025-04-24 11:46:57 -06:00
api_welcome.mdx SKA: Update broken references and URLs (#206836) 2025-01-28 03:32:48 +00:00
kibana_server_core_components.mdx Clean up dev docs (#124271) 2022-02-03 10:09:10 -05:00
nav-kibana-dev.docnav.json Revert "[ResponseOps] Document creating task-manager serverless monitoring assets - adding to kibana dev docs navigation" (#211030) 2025-02-13 18:09:06 +01:00