[Security Assistant] Adds BuildKite pipeline for running Security GenAI Evaluations weekly (#215254)

## Summary

Introduces a new `security_solution/gen_ai_evals.yml` BuildKite pipeline
for automatically running our Assistant and Attack Discovery evaluation
suites weekly.

### To Run Locally:
Ensure you are authenticated with vault for LLM + LangSmith creds:

> See [internal
docs](https://github.com/elastic/infra/blob/master/docs/vault/README.md#login-with-your-okta)
for setup/login instructions.

Fetch Connectors and LangSmith creds:

> [!NOTE]
> In discussion with @elastic/kibana-operations it was preferred to use
the ci-prod secrets vault, so we cannot self-manage the secrets. To test
this locally though, you can grab the secrets and follow the
instructions in this [paste
bin](https://p.elstc.co/paste/q7k+zYOc#PN0kasw11u2J0XWC2Ls5PMNWreKzKTpgWA1wtsPzeH+).

```
cd x-pack/test/security_solution_api_integration
node scripts/genai/vault/retrieve_secrets.js  
```


Navigate to api integration directory, load the env vars, and start
server:
```
cd x-pack/test/security_solution_api_integration
export KIBANA_SECURITY_TESTING_AI_CONNECTORS=$(base64 -w 0 < scripts/genai/vault/connector_config.json) && export KIBANA_SECURITY_TESTING_LANGSMITH_KEY=$(base64 -w 0 < scripts/genai/vault/langsmith_key.txt)
yarn genai_evals:server:ess
```

Then in another terminal, load vars and run the tests:
```
cd x-pack/test/security_solution_api_integration
export KIBANA_SECURITY_TESTING_AI_CONNECTORS=$(base64 -w 0 < scripts/genai/vault/connector_config.json) && export KIBANA_SECURITY_TESTING_LANGSMITH_KEY=$(base64 -w 0 < scripts/genai/vault/langsmith_key.txt)
yarn genai_evals🏃ess
```

### To manually run on BuildKite:
Navigate to
[BuildKite](https://buildkite.com/elastic?filter=ftr-security-solution-gen-ai-evaluations)
and run `ftr-security-solution-gen-ai-evaluations` pipeline.

### To manually run on BuildKite for specific PR:
In `.buildkite/ftr_security_stateful_configs.yml`, temporarily move the
`genai/evaluations/trial_license_complete_tier/configs/ess.config.ts`
line down to the `enabled` section. Will see if we can do this without
requiring a commit. @elastic/kibana-operations is it possible to set a
buildkite env var that can be read in FTR tests when a specific GitHub
label is added to the PR? I.e. can I create a `SecurityGenAI:Run Evals`
label that when added will run this suite as part of the build?

> [!NOTE]
> Currently the connectors secrets only include `gpt-4o` and
`gpt-4o-mini`. Waiting on finalized list w/ credentials from @jamesspi
and @peluja1012 and then we can have ops update using the scripts
included in this PR.

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Patryk Kopycinski <patryk.kopycinski@elastic.co>
This commit is contained in:
Garrett Spong 2025-04-24 11:46:57 -06:00 committed by GitHub
parent 91b0988c2c
commit e57663a0cf
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
58 changed files with 9635 additions and 198 deletions

View file

@ -110,3 +110,7 @@ Create or update a serverless Security project on Elastic Cloud QA.
#### `ci:project-persist-deployment`
Prevents an existing deployment from being shutdown due to inactivity.
#### `ci:security-genai-run-evals`
Run evaluations for the GenAI security evaluation suite.