kibana

mirror of https://github.com/elastic/kibana.git synced 2025-06-28 11:05:39 -04:00


Garrett Spong e9a8909fad [Security Assistant] Simplifies Security Gen AI Evaluation secret management (#219885)

## Summary

Simplifies secret management for running the Security Gen AI Evaluations. See the updated README.md for full details. Changes include:

* Consolidation of multiple vault keys into a single `KIBANA_SECURITY_GEN_AI_CONFIG` key, which contains all connectors, LangSmith creds, and now a way to specify `evaluatorConnectorId`.
* Added `vault` params to both `retrieve_secrets.js` and `upload_secrets.js` for specifying the vault. Defaults to `sieam-team` secrets.elastic.co for ease of use by developers.
* Introduces a `get_commands.js` script for fetching commands to hand off to Kibana Ops for updating, or for specifying config overrides when manually running BuildKite pipelines.
* Deleted `export_env_secrets.js`, as it couldn't be used for setting env vars locally for the dev testing experience.
* Updated `connectors` as per team discussion to include GPT-4.1, Claude 3.5/3.7, and Gemini 2.5 Pro. This was a config change made by Kibana Ops, so no code change is present, but you can confirm it by running `retrieve_secrets.js`.

And finally, a much more detailed `README.md` covering testing locally, on PRs and CI, and the process for updating secrets. See the full [README.md](https://github.com/spong/kibana/blob/ci-eval-tweaks/x-pack/test/security_solution_api_integration/test_suites/genai/evaluations/README.md).

Example LangSmith Runs:

* `ES|QL Generation Regression Suite`: [Run 298372](261dcc59-fbe7-4397-a662-ff94042f666c)
* `Alerts RAG Regression (Episodes 1-8)`: [Run 298372](bd5bba1d-97aa-4512-bce7-b09aa943c651)
* `Assistant Eval: Custom Knowledge`: [Run 298372](2d5f7c18-4bf4-4cdb-97a1-16e39a865cab)
* `Eval AD: All Scenarios`: [Run 300138](4690ee16-9df5-416c-8bf0-b62bc2f2aba9/compare?selectedSessions=6d44134b-6492-4f2d-9b28-6d4a82a0e9ae&baseline=undefined)

Note: there is currently a timing bug with Alerts/KB entries being cleaned up before the server is complete, so you may see poor evals for `Alerts RAG Regression (Episodes 1-8)` and `Assistant Eval: Custom Knowledge` until that is fixed. I'll address this in a follow-up PR since it is unrelated to this change-set.

2025-05-09 11:01:36 -06:00
..
genai/vault	[Security Assistant] Simplifies Security Gen AI Evaluation secret management (#219885)	2025-05-09 11:01:36 -06:00
api_configs.json	[Security][Serverless] Add Product types in FTR API Integration tests. (#184309)	2024-06-20 17:30:35 +03:00
index.js	[EDR Workflows] MKI API tests (#187560)	2024-07-12 14:41:41 +02:00
mki_api_ftr_execution.ts	[Security Solution][Serverless] Logging - Fix explore test issue (#195941)	2024-10-15 12:16:31 +03:00
mki_start_api_ftr_execution.js	[Security][Serverless] Add Product types in FTR API Integration tests. (#184309)	2024-06-20 17:30:35 +03:00