[Investigate App] add MVP evaluation framework for AI root cause analysis integration (#204634)

## Summary

Extends the Observability AI Assistant's evaluation framework to create the first set of tests for evaluating the performance of the Investigation App's AI root cause analysis integration.

To execute the tests, please consult the [README](https://github.com/elastic/kibana/pull/204634/files#diff-4823a154e593051126d3d5822c88d72e89d07f41b8c07a5a69d18281c50b09adR1). Note the prerequisites and the Kibana & Elasticsearch configuration.

## Further evolution

This PR is the first MVP of the evaluation framework. A (somewhat light) [meta issue](https://github.com/elastic/kibana/issues/205670) exists for our continued work on this project, and will be added to over time.

## Test data and fixture architecture

Logs, metrics, and traces are indexed to [edge-rca](https://studious-disco-k66oojq.pages.github.io/edge-rca/). Observability engineers can [create an oblt-cli cluster](https://studious-disco-k66oojq.pages.github.io/user-guide/cluster-create-ccs/) configured for cross cluster search against edge-rca as the remote cluster.

When creating new testing fixtures, engineers use their oblt-cli cluster to create rules against the remote cluster data. Once alerts are triggered in a failure scenario, the engineer can choose to archive the alert data to use as a test fixture.

Test fixtures are added to the `investigate_app/scripts/load/fixtures` directory for use in tests.

When executing tests, the fixtures are loaded into the engineer's oblt-cli cluster, configured for cross cluster search against edge-rca. The local alert fixture and the remote demo data are used together to replay root cause analysis and execute the test evaluations.

## Implementation

Creates a new directory, `scripts`, to house scripts related to setting up and running these tests. Here's what each directory does:

## scripts/evaluate

1. Extends the evaluation script from `observability_ai_assistant_app/scripts/evaluation` by creating a [custom Kibana client](https://github.com/elastic/kibana/pull/204634/files#diff-ae05b2a20168ea08f452297fc1bd59310c69ac3ea4651da1f65cd9fa93bb8fe9R1) with RCA specific methods. The custom client is [passed to the Observability AI Assistant's `runEvaluations`](https://github.com/elastic/kibana/pull/204634/files#diff-0f2d3662c01df8fbe7d1f19704fa071cbd6232fb5f732b313e8ba99012925d0bR14) script and [invoked instead of the default Kibana Client](https://github.com/elastic/kibana/pull/204634/files#diff-98509a357e86ea5c5931b1b46abc72f76e5304439430358eee845f9ad57f63f1R54).
2. Defines a single, MVP test in `index.spec.ts`. This test finds a specific alert fixture designated for that test, creates an investigation for that alert with a specified time range, and calls the root cause analysis API. Once the report is received back from the API, a prompt is created for the evaluation framework with details of the report. The evaluation framework then judges how well the root cause analysis API performed against specified criteria.

## scripts/archive

1. Used when creating new test fixtures, this script archives observability alerts data for use as a fixture in a feature test.

## scripts/load

1. Loads created testing fixtures before running the test.

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Dario Gieselaar <d.gieselaar@gmail.com>
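
For illustration only, here is a minimal sketch of the prompt-building step of the MVP test: events returned by the root cause analysis API are sorted by tool name, investigated entities are counted, and the result is folded into the text handed to the evaluator. The `SketchEvent` shape, `buildEvaluationPrompt`, and `sampleEvents` are hypothetical simplifications introduced here; the real test works with the `RootCauseAnalysisEvent` types from `@kbn/observability-ai-server/root_cause_analysis`.

```typescript
// Hypothetical, simplified event shape (an assumption for this sketch only).
interface SketchEvent {
  name: 'investigateEntity' | 'observe' | 'error' | 'endProcessAndWriteReport';
  serviceName?: string; // present on investigateEntity events
  report?: string; // present on the final report event
}

// Count how many times each service was investigated.
function countEntities(events: SketchEvent[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const event of events) {
    if (event.name === 'investigateEntity' && event.serviceName) {
      counts[event.serviceName] = (counts[event.serviceName] || 0) + 1;
    }
  }
  return counts;
}

// Build the text that an evaluating LLM would be asked to judge.
function buildEvaluationPrompt(events: SketchEvent[]): string {
  const reports = events.filter((event) => event.name === 'endProcessAndWriteReport');
  if (reports.length !== 1) {
    throw new Error(`Expected exactly one final report, found ${reports.length}`);
  }
  const errorCount = events.filter((event) => event.name === 'error').length;
  const counts = countEntities(events);
  const entityLines = Object.keys(counts)
    .map((name) => `- ${name} (analyzed ${counts[name]} times)`)
    .join('\n');
  return [
    'The following entities were analyzed during the investigation.',
    entityLines,
    `Errors encountered: ${errorCount}`,
    'Here is the report:',
    reports[0].report || '',
  ].join('\n');
}

// Made-up sample data, not taken from the real fixture.
const sampleEvents: SketchEvent[] = [
  { name: 'investigateEntity', serviceName: 'cartservice' },
  { name: 'investigateEntity', serviceName: 'frontend' },
  { name: 'investigateEntity', serviceName: 'cartservice' },
  { name: 'endProcessAndWriteReport', report: 'cartservice failed to start.' },
];

console.log(buildEvaluationPrompt(sampleEvents));
```

With the sample data, `cartservice` is counted twice, which is the kind of signal the "Analyzes each entity only once" criterion is meant to catch.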
2025-06-27 10:40:07 -04:00 · 2025-01-17 12:16:10 -05:00 · 2025-01-17 12:16:10 -05:00 · 5ab8a52187
commit 5ab8a52187
parent 61c2d18e5c
24 changed files with 33910 additions and 41 deletions
--- a/.gitignore
+++ b/.gitignore
@ -137,6 +137,7 @@ src/platform/packages/**/package-map.json
 /packages/kbn-synthetic-package-map/
 **/.synthetics/
 **/.journeys/
+**/.rca/
 x-pack/test/security_api_integration/plugins/audit_log/audit.log

 # ignore FTR temp directory
--- a/x-pack/solutions/observability/packages/utils_server/entities/get_data_streams_for_entity.ts
+++ b/x-pack/solutions/observability/packages/utils_server/entities/get_data_streams_for_entity.ts
@ -54,7 +54,20 @@ export async function getDataStreamsForEntity({
  });

  const dataStreams = uniq(
-    compact(await resolveIndexResponse.indices.flatMap((idx) => idx.data_stream))
+    compact([
+      /* Check both data streams and indices.
+       * The response body shape differs depending on the request. Example:
+       * GET _resolve/index/logs-*-default* will return data in the `data_streams` key.
+       * GET _resolve/index/.ds-logs-*-default* will return data in the `indices` key */
+      ...resolveIndexResponse.indices.flatMap((idx) => {
+        const remoteCluster = idx.name.includes(':') ? idx.name.split(':')[0] : null;
+        if (remoteCluster) {
+          return `${remoteCluster}:${idx.data_stream}`;
+        }
+        return idx.data_stream;
+      }),
+      ...resolveIndexResponse.data_streams.map((ds) => ds.name),
+    ])
  );

  return {
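
The hunk above merges data stream names from both keys of the resolve-index response and keeps the remote cluster prefix for cross cluster search. A standalone sketch of that idea follows, assuming a trimmed-down response shape; `ResolveIndexSketch` and `collectDataStreams` are hypothetical names, and skipping backing indices that have no `data_stream` is an assumption of this sketch rather than behavior taken from the diff.

```typescript
// Hypothetical, trimmed-down shape of a `_resolve/index` response (assumption).
interface ResolveIndexSketch {
  indices: Array<{ name: string; data_stream?: string }>;
  data_streams: Array<{ name: string }>;
}

// Collect unique data stream names from both response keys,
// prefixing with the remote cluster alias when the index name carries one.
function collectDataStreams(response: ResolveIndexSketch): string[] {
  const names: string[] = [];
  for (const idx of response.indices) {
    // Assumption: indices without a backing data stream are ignored.
    if (!idx.data_stream) continue;
    const separator = idx.name.indexOf(':');
    names.push(
      separator === -1
        ? idx.data_stream
        : `${idx.name.slice(0, separator)}:${idx.data_stream}`
    );
  }
  for (const ds of response.data_streams) {
    names.push(ds.name);
  }
  // De-duplicate while preserving first-seen order.
  return names.filter((name, i) => names.indexOf(name) === i);
}

// Made-up example input covering local, remote, and plain indices.
const resolved = collectDataStreams({
  indices: [
    { name: 'remote:.ds-logs-foo-default-000001', data_stream: 'logs-foo-default' },
    { name: '.ds-logs-bar-default-000001', data_stream: 'logs-bar-default' },
    { name: 'plain-index' },
  ],
  data_streams: [{ name: 'logs-bar-default' }, { name: 'remote:logs-foo-default' }],
});

console.log(resolved); // [ 'remote:logs-foo-default', 'logs-bar-default' ]
```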
--- a/x-pack/solutions/observability/plugins/investigate_app/common/rca/llm_context.ts
+++ b/x-pack/solutions/observability/plugins/investigate_app/common/rca/llm_context.ts
@ -0,0 +1,36 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+
+import { EcsFieldsResponse } from '@kbn/rule-registry-plugin/common';
+import {
+  ALERT_FLAPPING_HISTORY,
+  ALERT_RULE_EXECUTION_TIMESTAMP,
+  ALERT_RULE_EXECUTION_UUID,
+  EVENT_ACTION,
+  EVENT_KIND,
+} from '@kbn/rule-registry-plugin/common/technical_rule_data_field_names';
+import { omit } from 'lodash';
+
+export function sanitizeAlert(alert: EcsFieldsResponse) {
+  return omit(
+    alert,
+    ALERT_RULE_EXECUTION_TIMESTAMP,
+    '_index',
+    ALERT_FLAPPING_HISTORY,
+    EVENT_ACTION,
+    EVENT_KIND,
+    ALERT_RULE_EXECUTION_UUID,
+    '@timestamp'
+  );
+}
+
+export function getRCAContext(alert: EcsFieldsResponse, serviceName: string) {
+  return `The user is investigating an alert for the ${serviceName} service,
+    and wants to find the root cause. Here is the alert:
+  
+    ${JSON.stringify(sanitizeAlert(alert))}`;
+}
--- a/x-pack/solutions/observability/plugins/investigate_app/public/pages/details/components/assistant_hypothesis/assistant_hypothesis.tsx
+++ b/x-pack/solutions/observability/plugins/investigate_app/public/pages/details/components/assistant_hypothesis/assistant_hypothesis.tsx
@ -8,19 +8,12 @@
 import { i18n } from '@kbn/i18n';
 import type { RootCauseAnalysisEvent } from '@kbn/observability-ai-server/root_cause_analysis';
 import { EcsFieldsResponse } from '@kbn/rule-registry-plugin/common';
-import {
-  ALERT_FLAPPING_HISTORY,
-  ALERT_RULE_EXECUTION_TIMESTAMP,
-  ALERT_RULE_EXECUTION_UUID,
-  EVENT_ACTION,
-  EVENT_KIND,
-} from '@kbn/rule-registry-plugin/common/technical_rule_data_field_names';
 import { isRequestAbortedError } from '@kbn/server-route-repository-client';
-import { omit } from 'lodash';
 import React, { useEffect, useRef, useState } from 'react';
 import { useKibana } from '../../../../hooks/use_kibana';
 import { useUpdateInvestigation } from '../../../../hooks/use_update_investigation';
 import { useInvestigation } from '../../contexts/investigation_context';
+import { getRCAContext } from '../../../../../common/rca/llm_context';

 export interface InvestigationContextualInsight {
  key: string;
@ -90,10 +83,7 @@ export function AssistantHypothesis() {
          body: {
            investigationId: investigation!.id,
            connectorId,
-            context: `The user is investigating an alert for the ${serviceName} service,
-            and wants to find the root cause. Here is the alert:
-
-            ${JSON.stringify(sanitizeAlert(nonNullishAlert))}`,
+            context: getRCAContext(nonNullishAlert, nonNullishServiceName),
            rangeFrom,
            rangeTo,
            serviceName: nonNullishServiceName,
@ -190,16 +180,3 @@ export function AssistantHypothesis() {
    />
  );
 }
-
-function sanitizeAlert(alert: EcsFieldsResponse) {
-  return omit(
-    alert,
-    ALERT_RULE_EXECUTION_TIMESTAMP,
-    '_index',
-    ALERT_FLAPPING_HISTORY,
-    EVENT_ACTION,
-    EVENT_KIND,
-    ALERT_RULE_EXECUTION_UUID,
-    '@timestamp'
-  );
-}
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/archive/README.md
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/archive/README.md
@ -0,0 +1,35 @@
+# Investigation RCA Evaluation Framework
+
+## Overview
+
+This tool is developed for our team working on the Elastic Observability platform, specifically focusing on evaluating the Investigation RCA AI Integration. It simplifies archiving data critical for evaluating the Investigation UI and its integration with large language models (LLMs).
+
+## Setup requirements
+
+- An Elasticsearch instance
+
+You'll need an instance configured with cross cluster search for the [edge-rca](https://studious-disco-k66oojq.pages.github.io/edge-rca/) cluster. To create one, utilize [oblt-cli](https://studious-disco-k66oojq.pages.github.io/user-guide/cluster-create-ccs/) and select `edge-rca` as the remote cluster.
+
+## Running archive
+
+Run the tool using:
+
+`$ node x-pack/solutions/observability/plugins/investigate_app/scripts/archive/index.js --kibana http://admin:[YOUR_CLUSTER_PASSWORD]@localhost:5601`
+
+This will archive the observability alerts index to use as fixtures within the tests.
+
+Archived data will automatically be saved at the root of the kibana project in the `.rca/archives` folder.
+
+## Creating a test fixture
+
+To create a test fixture, create a new folder in `x-pack/solutions/observability/plugins/investigate_app/scripts/load/fixtures` with the `data.json.gz` file and the `mappings.json` file. The fixture will now be loaded when running `$ node x-pack/solutions/observability/plugins/investigate_app/scripts/load/index.js`
+
+### Configuration
+
+#### Kibana and Elasticsearch
+
+By default, the tool will look for a Kibana instance running locally (at `http://localhost:5601`, which is the default address for running Kibana in development mode). It will also attempt to read the Kibana config file for the Elasticsearch address & credentials. If you want to override these settings, use `--kibana` and `--es`. Only basic auth is supported, e.g. `--kibana http://username:password@localhost:5601`. If you want to use a specific space, use `--spaceId`
+
+#### filePath
+
+Use `--filePath` to specify a custom file path to store your archived data. By default, data is stored at `.rca/archives`
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/archive/archive.ts
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/archive/archive.ts
@ -0,0 +1,53 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+
+import { spawnSync } from 'child_process';
+import { run } from '@kbn/dev-cli-runner';
+import yargs from 'yargs';
+import { getServiceUrls } from '@kbn/observability-ai-assistant-app-plugin/scripts/evaluation/get_service_urls';
+import { options } from './cli';
+
+async function archiveAllRelevantData({ filePath, esUrl }: { filePath: string; esUrl: string }) {
+  spawnSync(
+    'node',
+    ['scripts/es_archiver', 'save', `${filePath}/alerts`, '.internal.alerts-*', '--es-url', esUrl],
+    {
+      stdio: 'inherit',
+    }
+  );
+}
+
+function archiveData() {
+  yargs(process.argv.slice(2))
+    .command('*', 'Archive RCA data', async () => {
+      const argv = await options(yargs);
+      run(
+        async ({ log }) => {
+          const serviceUrls = await getServiceUrls({
+            log,
+            elasticsearch: argv.elasticsearch,
+            kibana: argv.kibana,
+          });
+          await archiveAllRelevantData({
+            esUrl: serviceUrls.esUrl,
+            filePath: argv.filePath,
+          });
+        },
+        {
+          log: {
+            defaultLevel: argv.logLevel as any,
+          },
+          flags: {
+            allowUnexpected: true,
+          },
+        }
+      );
+    })
+    .parse();
+}
+
+archiveData();
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/archive/cli.ts
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/archive/cli.ts
@ -0,0 +1,54 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+import * as inquirer from 'inquirer';
+import * as fs from 'fs';
+import { Argv } from 'yargs';
+import {
+  elasticsearchOption,
+  kibanaOption,
+} from '@kbn/observability-ai-assistant-app-plugin/scripts/evaluation/cli';
+
+function getISOStringWithoutMicroseconds(): string {
+  const now = new Date();
+  const isoString = now.toISOString();
+  return isoString.split('.')[0] + 'Z';
+}
+
+export async function options(y: Argv) {
+  const argv = y
+    .option('filePath', {
+      string: true as const,
+      describe: 'file path to store the archived data',
+      default: `./.rca/archives/${getISOStringWithoutMicroseconds()}`,
+    })
+    .option('kibana', kibanaOption)
+    .option('elasticsearch', elasticsearchOption)
+    .option('logLevel', {
+      describe: 'Log level',
+      default: 'info',
+    }).argv;
+
+  if (
+    fs.existsSync(`${argv.filePath}/data.json.gz`) ||
+    fs.existsSync(`${argv.filePath}/mappings.json`)
+  ) {
+    const { confirmOverwrite } = await inquirer.prompt([
+      {
+        type: 'confirm',
+        name: 'confirmOverwrite',
+        message: `Archived data already exists at path: ${argv.filePath}. Do you want to overwrite it?`,
+        default: false,
+      },
+    ]);
+
+    if (!confirmOverwrite) {
+      process.exit(1);
+    }
+  }
+
+  return argv;
+}
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/archive/index.js
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/archive/index.js
@ -0,0 +1,10 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+
+require('@kbn/babel-register').install();
+
+require('./archive');
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/evaluate/.eslintrc.json
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/evaluate/.eslintrc.json
@ -0,0 +1,17 @@
+{
+  "overrides": [
+    {
+      "files": [
+        "**/*.spec.ts"
+      ],
+      "rules": {
+        "@kbn/imports/require_import": [
+          "error",
+          "@kbn/ambient-ftr-types"
+        ],
+        "@typescript-eslint/triple-slash-reference": "off",
+        "spaced-comment": "off"
+      }
+    }
+  ]
+}
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/evaluate/README.md
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/evaluate/README.md
@ -0,0 +1,53 @@
+# Investigation RCA Evaluation Framework
+
+## Overview
+
+This tool is developed for our team working on the Elastic Observability platform, specifically focusing on evaluating the Investigation RCA AI Integration. It simplifies scripting and evaluating various scenarios with the Large Language Model (LLM) integration.
+
+## Setup requirements
+
+- An Elasticsearch instance configured with cross cluster search pointing to the edge-rca cluster
+- A Kibana instance
+- At least one .gen-ai connector set up
+
+## Running evaluations
+
+### Prerequisites
+
+#### Elasticsearch instance
+
+You'll need an instance configured with cross cluster search for the [edge-rca](https://studious-disco-k66oojq.pages.github.io/edge-rca/) cluster. To create one, utilize [oblt-cli](https://studious-disco-k66oojq.pages.github.io/user-guide/cluster-create-ccs/) and select `edge-rca` as the remote cluster.
+
+Once your cluster is created, paste the provided yml config into your `kibana.dev.yml` file.
+
+#### Fixture data
+
+To load the fixtures needed for the tests, first run:
+
+`$ node x-pack/solutions/observability/plugins/investigate_app/scripts/load/index.js --kibana http://admin:[YOUR_CLUSTER_PASSWORD]@localhost:5601`
+
+### Executing tests
+
+Run the tool using:
+
+`$ node x-pack/solutions/observability/plugins/observability_ai_assistant_app/scripts/evaluation/index.js --files=x-pack/solutions/observability/plugins/investigate_app/scripts/evaluate/scenarios/rca/index.spec.ts --kibana http://admin:[YOUR_CLUSTER_PASSWORD]@localhost:5601`
+
+This will evaluate all existing scenarios, and write the evaluation results to the terminal.
+
+### Configuration
+
+#### Kibana and Elasticsearch
+
+By default, the tool will look for a Kibana instance running locally (at `http://localhost:5601`, which is the default address for running Kibana in development mode). It will also attempt to read the Kibana config file for the Elasticsearch address & credentials. If you want to override these settings, use `--kibana` and `--es`. Only basic auth is supported, e.g. `--kibana http://username:password@localhost:5601`. If you want to use a specific space, use `--spaceId`
+
+#### Connector
+
+Use `--connectorId` to specify a `.gen-ai` or `.bedrock` connector to use. If none are given, it will prompt you to select a connector based on the ones that are available. If only a single supported connector is found, it will be used without prompting.
+
+#### Persisting conversations
+
+By default, completed conversations are not persisted. If you do want to persist them, for instance for reviewing purposes, set the `--persist` flag to store them. This will also generate a clickable link in the output of the evaluation that takes you to the conversation.
+
+If you want to clear conversations on startup, use the `--clear` flag. This only works when `--persist` is enabled. If `--spaceId` is set, only conversations for the current space will be cleared.
+
+When storing conversations, the name of the scenario is used as a title. Set the `--autoTitle` flag to have the LLM generate a title for you.
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/evaluate/rca_client.ts
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/evaluate/rca_client.ts
@ -0,0 +1,174 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+import { Readable } from 'stream';
+import { AxiosResponse } from 'axios';
+import { v4 as uuidv4 } from 'uuid';
+import datemath from '@kbn/datemath';
+import { ToolingLog } from '@kbn/tooling-log';
+import { CreateInvestigationResponse } from '@kbn/investigation-shared';
+import type { EcsFieldsResponse } from '@kbn/rule-registry-plugin/common';
+import { httpResponseIntoObservable } from '@kbn/sse-utils-client';
+import { defer, lastValueFrom, toArray } from 'rxjs';
+import { KibanaClient } from '@kbn/observability-ai-assistant-app-plugin/scripts/evaluation/kibana_client';
+import type { RootCauseAnalysisEvent } from '@kbn/observability-ai-server/root_cause_analysis';
+import { getRCAContext } from '../../common/rca/llm_context';
+
+export class RCAClient {
+  constructor(protected readonly kibanaClient: KibanaClient, protected readonly log: ToolingLog) {}
+
+  async getAlert(alertId: string): Promise<EcsFieldsResponse> {
+    const response = await this.kibanaClient.callKibana<EcsFieldsResponse>('get', {
+      pathname: '/internal/rac/alerts',
+      query: {
+        id: alertId,
+      },
+    });
+    return response.data;
+  }
+
+  async getTimeRange({
+    fromOffset = 'now-15m',
+    toOffset = 'now+15m',
+    alert,
+  }: {
+    fromOffset: string;
+    toOffset: string;
+    alert: EcsFieldsResponse;
+  }) {
+    const alertStart = alert['kibana.alert.start'] as string | undefined;
+    if (!alertStart) {
+      throw new Error(
+        'Alert start time is missing from the alert data. Please double check your alert fixture.'
+      );
+    }
+    const from = datemath.parse(fromOffset, { forceNow: new Date(alertStart) })?.valueOf()!;
+    const to = datemath.parse(toOffset, { forceNow: new Date(alertStart) })?.valueOf()!;
+    return {
+      from,
+      to,
+    };
+  }
+
+  async createInvestigation({
+    alertId,
+    from,
+    to,
+  }: {
+    alertId: string;
+    from: number;
+    to: number;
+  }): Promise<string> {
+    const body = {
+      id: uuidv4(),
+      title: 'Investigate Custom threshold breached',
+      params: {
+        timeRange: {
+          from,
+          to,
+        },
+      },
+      tags: [],
+      origin: {
+        type: 'alert',
+        id: alertId,
+      },
+      externalIncidentUrl: null,
+    };
+
+    const response = await this.kibanaClient.callKibana<CreateInvestigationResponse>(
+      'post',
+      {
+        pathname: '/api/observability/investigations',
+      },
+      body
+    );
+
+    return response.data.id;
+  }
+
+  async deleteInvestigation({ investigationId }: { investigationId: string }): Promise<void> {
+    await this.kibanaClient.callKibana('delete', {
+      pathname: `/api/observability/investigations/${investigationId}`,
+    });
+  }
+
+  async rootCauseAnalysis({
+    connectorId,
+    investigationId,
+    from,
+    to,
+    alert,
+  }: {
+    connectorId: string;
+    investigationId: string;
+    from: string;
+    to: string;
+    alert?: EcsFieldsResponse;
+  }) {
+    this.log.debug(`Calling root cause analysis API`);
+    const that = this;
+    const serviceName = alert?.['service.name'] as string | undefined;
+    if (!alert) {
+      throw new Error(
+        'Alert not found. Please ensure you have loaded test fixture data prior to running tests.'
+      );
+    }
+    if (!serviceName) {
+      throw new Error(
+        'Service name is missing from the alert data. Please double check your alert fixture.'
+      );
+    }
+    const context = getRCAContext(alert, serviceName);
+    const body = {
+      investigationId,
+      connectorId,
+      context,
+      rangeFrom: from,
+      rangeTo: to,
+      serviceName: 'controller',
+      completeInBackground: false,
+    };
+
+    const chat$ = defer(async () => {
+      const response: AxiosResponse<Readable> = await this.kibanaClient.callKibana(
+        'post',
+        {
+          pathname: '/internal/observability/investigation/root_cause_analysis',
+        },
+        body,
+        { responseType: 'stream', timeout: NaN }
+      );
+
+      return {
+        response: {
+          body: new ReadableStream<Uint8Array>({
+            start(controller) {
+              response.data.on('data', (chunk: Buffer) => {
+                that.log.info(`Analyzing root cause...`);
+                controller.enqueue(chunk);
+              });
+
+              response.data.on('end', () => {
+                that.log.info(`Root cause analysis completed`);
+                controller.close();
+              });
+
+              response.data.on('error', (err: Error) => {
+                that.log.error(`Error while analyzing root cause: ${err}`);
+                controller.error(err);
+              });
+            },
+          }),
+        },
+      };
+    }).pipe(httpResponseIntoObservable(), toArray());
+
+    const events = await lastValueFrom(chat$);
+
+    return events.map((event) => event.event) as RootCauseAnalysisEvent[];
+  }
+}
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/evaluate/scenarios/rca/index.spec.ts
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/evaluate/scenarios/rca/index.spec.ts
@ -0,0 +1,150 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+
+/// <reference types="@kbn/ambient-ftr-types"/>
+
+import type {
+  RootCauseAnalysisEvent,
+  EndProcessToolMessage,
+  InvestigateEntityToolMessage,
+  ObservationToolMessage,
+  ToolErrorMessage,
+} from '@kbn/observability-ai-server/root_cause_analysis';
+import {
+  chatClient,
+  kibanaClient,
+  logger,
+} from '@kbn/observability-ai-assistant-app-plugin/scripts/evaluation/services';
+import { RCAClient } from '../../rca_client';
+
+type ToolCallMessage =
+  | EndProcessToolMessage
+  | InvestigateEntityToolMessage
+  | ObservationToolMessage
+  | ToolErrorMessage;
+
+const ALERT_FIXTURE_ID = '0265d890-8d8d-4c7e-a5bd-a3951f79574e';
+
+describe('Root cause analysis', () => {
+  const investigations: string[] = [];
+  const rcaChatClient = new RCAClient(kibanaClient, logger);
+  function countEntities(entities: InvestigateEntityToolMessage[]) {
+    const entityCount: Record<string, number> = {};
+    entities.forEach((entity) => {
+      const name = entity.response.entity['service.name'];
+      entityCount[name] = (entityCount[name] || 0) + 1;
+    });
+    return entityCount;
+  }
+
+  function categorizeEvents(events: RootCauseAnalysisEvent[]) {
+    const report: EndProcessToolMessage[] = [];
+    const observations: ObservationToolMessage[] = [];
+    const errors: ToolErrorMessage[] = [];
+    const entities: InvestigateEntityToolMessage[] = [];
+    const other: RootCauseAnalysisEvent[] = [];
+    const toolCallEvents = events.filter((event): event is ToolCallMessage => {
+      const maybeToolEvent = event as EndProcessToolMessage;
+      return (
+        maybeToolEvent?.name === 'endProcessAndWriteReport' ||
+        maybeToolEvent?.name === 'observe' ||
+        maybeToolEvent?.name === 'error' ||
+        maybeToolEvent?.name === 'investigateEntity'
+      );
+    });
+    toolCallEvents.forEach((event) => {
+      if (event.name) {
+        switch (event.name) {
+          case 'endProcessAndWriteReport':
+            report.push(event as EndProcessToolMessage);
+            break;
+          case 'observe':
+            observations.push(event as ObservationToolMessage);
+            break;
+          case 'error':
+            errors.push(event as ToolErrorMessage);
+            break;
+          case 'investigateEntity':
+            entities.push(event as InvestigateEntityToolMessage);
+            break;
+          default:
+            other.push(event);
+        }
+      }
+    });
+    if (report.length > 1) {
+      throw new Error('More than one final report found');
+    }
+    if (report.length === 0) {
+      throw new Error('No final report found');
+    }
+    return { report: report[0], observations, errors, entities, other };
+  }
+
+  it('can accurately pinpoint the root cause of cartservice bad entrypoint failure', async () => {
+    const alert = await rcaChatClient.getAlert(ALERT_FIXTURE_ID);
+    const connectorId = chatClient.getConnectorId();
+    const { from, to } = await rcaChatClient.getTimeRange({
+      fromOffset: 'now-15m',
+      toOffset: 'now+15m',
+      alert,
+    });
+    const investigationId = await rcaChatClient.createInvestigation({
+      alertId: ALERT_FIXTURE_ID,
+      from,
+      to,
+    });
+    investigations.push(investigationId);
+    const events = await rcaChatClient.rootCauseAnalysis({
+      investigationId,
+      from: new Date(from).toISOString(),
+      to: new Date(to).toISOString(),
+      alert,
+      connectorId,
+    });
+    const { report, entities, errors } = categorizeEvents(events);
+    const prompt = `
+    An investigation was performed by the Observability AI Assistant to identify the root cause of an alert for the controller service. Here is the alert:         
+    
+    ${JSON.stringify(alert)}
+
+    The following entities were analyzed during the investigation.
+    ${Object.entries(countEntities(entities))
+      .map(([name, count]) => {
+        return `    - ${name} (analyzed ${count} times)`;
+      })
+      .join('\n')}
+
+    During the course of the investigation, the Observability AI Assistant encountered ${
+      errors.length
+    } errors when attempting to analyze the entities${
+      errors.length
+        ? '. These errors were failures to retrieve data from the entities and do not reflect issues in the system being evaluated'
+        : ''
+    }.
+
+    A report was written by the Observability AI Assistant detailing issues throughout the system, including the controller service and its dependencies. The report includes a hypothesis about the underlying root cause of the system failure. Here is the report:
+
+    ${report.response.report}
+    `;
+
+    const conversation = await chatClient.complete({ messages: prompt });
+
+    await chatClient.evaluate(conversation, [
+      'Effectively reflects the actual root cause in the report. The actual root cause of the system failure was a misconfiguration related to the `cartservice`. A bad container entrypoint was configured for the cart service, causing it to fail to start',
+      'Analyzes the cartservice during the course of the investigation.',
+      'Analyzes each entity only once.',
+      'The Observability AI Assistant encountered 0 errors when attempting to analyze the system failure.',
+    ]);
+  });
+
+  after(async () => {
+    for (const investigationId of investigations) {
+      await rcaChatClient.deleteInvestigation({ investigationId });
+    }
+  });
+});
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/load/README.md
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/load/README.md
@ -0,0 +1,25 @@
+# Investigation RCA Evaluation Framework
+
+## Overview
+
+This tool is developed for our team working on the Elastic Observability platform, specifically focusing on evaluating the Investigation RCA AI Integration. It simplifies loading the archived fixture data needed to evaluate the Investigation UI and its integration with large language models (LLMs).
+
+## Setup requirements
+
+- An Elasticsearch instance
+
+You'll need an instance configured with cross cluster search for the [edge-rca](https://studious-disco-k66oojq.pages.github.io/edge-rca/) cluster. To create one, utilize [oblt-cli](https://studious-disco-k66oojq.pages.github.io/user-guide/cluster-create-ccs/) and select `edge-rca` as the remote cluster.
+
+## Running load
+
+Run the tool using:
+
+`$ node x-pack/solutions/observability/plugins/investigate_app/scripts/load/index.js --kibana http://admin:[YOUR_CLUSTER_PASSWORD]@localhost:5601`
+
+This will load all fixtures located in `./fixtures`.
+
+### Configuration
+
+#### Kibana and Elasticsearch
+
+By default, the tool will look for a Kibana instance running locally (at `http://localhost:5601`, which is the default address for running Kibana in development mode). It will also attempt to read the Kibana config file for the Elasticsearch address & credentials. If you want to override these settings, use `--kibana` and `--es`. Only basic auth is supported, e.g. `--kibana http://username:password@localhost:5601`. If you want to use a specific space, use `--spaceId`
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/load/cli.ts
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/load/cli.ts
@ -0,0 +1,23 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+import { Argv } from 'yargs';
+import {
+  elasticsearchOption,
+  kibanaOption,
+} from '@kbn/observability-ai-assistant-app-plugin/scripts/evaluation/cli';
+
+export async function options(y: Argv) {
+  const argv = y
+    .option('kibana', kibanaOption)
+    .option('elasticsearch', elasticsearchOption)
+    .option('logLevel', {
+      describe: 'Log level',
+      default: 'info',
+    }).argv;
+
+  return argv;
+}
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/load/fixtures/custom_threshold_alerts/data.json.gz
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/load/fixtures/custom_threshold_alerts/data.json.gz
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/load/fixtures/custom_threshold_alerts/mappings.json
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/load/fixtures/custom_threshold_alerts/mappings.json
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/load/index.js
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/load/index.js
@ -0,0 +1,17 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+
+require('@kbn/babel-register').install();
+
+require('./load');
--- a/x-pack/solutions/observability/plugins/investigate_app/scripts/load/load.ts
+++ b/x-pack/solutions/observability/plugins/investigate_app/scripts/load/load.ts
@ -0,0 +1,110 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+import axios from 'axios';
+import { spawnSync } from 'child_process';
+import { run } from '@kbn/dev-cli-runner';
+import { ToolingLog } from '@kbn/tooling-log';
+import { getServiceUrls } from '@kbn/observability-ai-assistant-app-plugin/scripts/evaluation/get_service_urls';
+import yargs from 'yargs';
+import fs from 'fs';
+import path from 'path';
+import { options } from './cli';
+
+async function loadFixtureData({
+  esUrl,
+  kibanaUrl,
+  log,
+}: {
+  esUrl: string;
+  kibanaUrl: string;
+  log: ToolingLog;
+}) {
+  const directory = `${__dirname}/fixtures`;
+  const directories = getDirectories({ filePath: directory, log });
+  await axios.post(
+    `${kibanaUrl}/internal/kibana/settings`,
+    {
+      changes: {
+        'observability:logSources': [
+          'remote_cluster:logs-*-*',
+          'remote_cluster:logs-*',
+          'remote_cluster:filebeat-*',
+        ],
+      },
+    },
+    {
+      headers: {
+        'kbn-xsrf': 'foo',
+        'x-elastic-internal-origin': 'observability-ai-assistant',
+      },
+    }
+  );
+  log.info('Log sources updated');
+  directories.forEach((dir) => {
+    spawnSync(
+      'node',
+      [
+        'scripts/es_archiver',
+        'load',
+        `${directory}/${dir}`,
+        '--es-url',
+        esUrl,
+        '--kibana-url',
+        kibanaUrl,
+      ],
+      {
+        stdio: 'inherit',
+      }
+    );
+  });
+}
+
+function getDirectories({ filePath, log }: { filePath: string; log: ToolingLog }): string[] {
+  try {
+    const items = fs.readdirSync(filePath);
+    const folders = items.filter((item) => {
+      const itemPath = path.join(filePath, item);
+      return fs.statSync(itemPath).isDirectory();
+    });
+    return folders;
+  } catch (error) {
+    log.error(`Error reading directory: ${error.message}`);
+    return [];
+  }
+}
+
+function loadData() {
+  yargs(process.argv.slice(2))
+    .command('*', 'Load RCA data', async () => {
+      const argv = await options(yargs);
+      run(
+        async ({ log }) => {
+          const serviceUrls = await getServiceUrls({
+            log,
+            elasticsearch: argv.elasticsearch,
+            kibana: argv.kibana,
+          });
+          await loadFixtureData({
+            esUrl: serviceUrls.esUrl,
+            kibanaUrl: serviceUrls.kibanaUrl,
+            log,
+          });
+        },
+        {
+          log: {
+            defaultLevel: argv.logLevel as any,
+          },
+          flags: {
+            allowUnexpected: true,
+          },
+        }
+      );
+    })
+    .parse();
+}
+
+loadData();
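`spawnSync` does not throw when the child process fails; it reports the outcome on the returned object. A loader that shells out to `scripts/es_archiver` may therefore want to inspect that result. The following is a minimal sketch under that assumption; `runOrThrow` is a hypothetical wrapper, not code from this PR.

```typescript
import { spawnSync } from 'child_process';

// Hypothetical wrapper (not in the PR): run a command synchronously and
// throw if it could not be started or did not exit with status 0.
function runOrThrow(command: string, args: string[]): void {
  const result = spawnSync(command, args, { stdio: 'inherit' });
  if (result.error) {
    // Set when the process could not be spawned, e.g. executable not found.
    throw result.error;
  }
  if (result.status !== 0) {
    // status is null when the process was terminated by a signal.
    throw new Error(`${command} exited with status ${result.status}`);
  }
}

// Succeeds: the child process exits with status 0.
runOrThrow(process.execPath, ['-e', 'process.exit(0)']);
```

Calling such a wrapper inside the `forEach` over fixture directories would stop the load at the first failing archive instead of silently continuing.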
--- a/x-pack/solutions/observability/plugins/investigate_app/server/services/create_investigation.ts
+++ b/x-pack/solutions/observability/plugins/investigate_app/server/services/create_investigation.ts
@ -4,7 +4,6 @@
 * 2.0; you may not use this file except in compliance with the Elastic License
 * 2.0.
 */
-
 import { CreateInvestigationParams, CreateInvestigationResponse } from '@kbn/investigation-shared';
 import type { AuthenticatedUser } from '@kbn/core-security-common';
 import { InvestigationRepository } from './investigation_repository';
@ -23,7 +22,7 @@ export async function createInvestigation(
    ...params,
    updatedAt: now,
    createdAt: now,
-    createdBy: user.profile_uid!,
+    createdBy: user.profile_uid || user.username,
    status: 'triage',
    notes: [],
    items: [],
--- a/x-pack/solutions/observability/plugins/investigate_app/tsconfig.json
+++ b/x-pack/solutions/observability/plugins/investigate_app/tsconfig.json
@ -10,6 +10,7 @@
    "typings/**/*",
    "public/**/*.json",
    "server/**/*",
+    "scripts/**/*",
    ".storybook/**/*"
  ],
  "exclude": [
@ -80,5 +81,10 @@
    "@kbn/utility-types-jest",
    "@kbn/visualization-utils",
    "@kbn/zod",
+    "@kbn/babel-register",
+    "@kbn/tooling-log",
+    "@kbn/dev-cli-runner",
+    "@kbn/datemath",
+    "@kbn/sse-utils-client",
  ],
 }
--- a/x-pack/solutions/observability/plugins/observability_ai_assistant_app/scripts/evaluation/README.md
+++ b/x-pack/solutions/observability/plugins/observability_ai_assistant_app/scripts/evaluation/README.md
@ -24,7 +24,9 @@ This tool is developed for our team working on the Elastic Observability platfor
 This will evaluate all existing scenarios, and write the evaluation results to the terminal.

 #### To run the evaluation using a hosted deployment:
+
 - Add the credentials of Elasticsearch to `kibana.dev.yml` as follows:
+
 ```
 elasticsearch.hosts: https://<hosted-url>:<port>
 elasticsearch.username: <username>
@ -32,6 +34,7 @@ elasticsearch.password: <password>
 elasticsearch.ssl.verificationMode: none
 elasticsearch.ignoreVersionMismatch: true
 ```
+
 - Start Kibana
 - Run this command to start evaluating: `node x-pack/solutions/observability/plugins/observability_ai_assistant_app/scripts/evaluation/index.js --kibana http://<username>:<password>@localhost:5601`

@ -41,6 +44,7 @@ E.g.: `node x-pack/solutions/observability/plugins/observability_ai_assistant_ap
 The `--kibana` and `--es` flags override the default credentials. Only basic auth is supported.

 ## Other (optional) configuration flags
+
 - `--connectorId` - Specify a generative AI connector to use. If none are given, it will prompt you to select a connector based on the ones that are available. If only a single supported connector is found, it will be used without prompting.
 - `--evaluateWith`: The connector ID to evaluate with. Leave empty to use the same connector, use "other" to get a selection menu.
 - `--spaceId` - Specify the space ID if you want to use a specific space.
--- a/x-pack/solutions/observability/plugins/observability_ai_assistant_app/scripts/evaluation/cli.ts
+++ b/x-pack/solutions/observability/plugins/observability_ai_assistant_app/scripts/evaluation/cli.ts
@ -25,8 +25,8 @@ export const elasticsearchOption = {
  describe: 'Where Elasticsearch is running',
  string: true as const,
  default: format({
-    ...parse(config['elasticsearch.hosts']),
-    auth: `${config['elasticsearch.username']}:${config['elasticsearch.password']}`,
+    ...parse(config.elasticsearch.hosts || 'http://localhost:9200'),
+    auth: `${config.elasticsearch.username}:${config.elasticsearch.password}`,
  }),
 };

--- a/x-pack/solutions/observability/plugins/observability_ai_assistant_app/scripts/evaluation/kibana_client.ts
+++ b/x-pack/solutions/observability/plugins/observability_ai_assistant_app/scripts/evaluation/kibana_client.ts
@ -25,7 +25,7 @@ import { throwSerializedChatCompletionErrors } from '@kbn/observability-ai-assis
 import { Message, MessageRole } from '@kbn/observability-ai-assistant-plugin/common';
 import { streamIntoObservable } from '@kbn/observability-ai-assistant-plugin/server';
 import { ToolingLog } from '@kbn/tooling-log';
-import axios, { AxiosInstance, AxiosResponse, isAxiosError } from 'axios';
+import axios, { AxiosInstance, AxiosResponse, isAxiosError, AxiosRequestConfig } from 'axios';
 import { omit, pick, remove } from 'lodash';
 import pRetry from 'p-retry';
 import {
@ -81,6 +81,7 @@ export interface ChatClient {
  ) => Promise<EvaluationResult>;
  getResults: () => EvaluationResult[];
  onResult: (cb: (result: EvaluationResult) => void) => () => void;
+  getConnectorId: () => string;
 }

 export class KibanaClient {
@ -93,6 +94,7 @@ export class KibanaClient {
    this.axios = axios.create({
      headers: {
        'kbn-xsrf': 'foo',
+        'x-elastic-internal-origin': 'kibana',
      },
    });
  }
@ -118,17 +120,15 @@ export class KibanaClient {
  callKibana<T>(
    method: string,
    props: { query?: UrlObject['query']; pathname: string; ignoreSpaceId?: boolean },
-    data?: any
+    data?: any,
+    axiosParams: Partial<AxiosRequestConfig> = {}
  ) {
    const url = this.getUrl(props);
    return this.axios<T>({
      method,
      url,
      ...(method.toLowerCase() === 'delete' && !data ? {} : { data: data || {} }),
-      headers: {
-        'kbn-xsrf': 'true',
-        'x-elastic-internal-origin': 'Kibana',
-      },
+      ...axiosParams,
    }).catch((error) => {
      if (isAxiosError(error)) {
        const interestingPartsOfError = {
@ -635,6 +635,7 @@ export class KibanaClient {
        onResultCallbacks.push({ callback, unregister });
        return unregister;
      },
+      getConnectorId: () => connectorId,
    };
  }

--- a/x-pack/solutions/observability/plugins/observability_ai_assistant_app/scripts/evaluation/read_kibana_config.ts
+++ b/x-pack/solutions/observability/plugins/observability_ai_assistant_app/scripts/evaluation/read_kibana_config.ts
@ -9,6 +9,7 @@ import path from 'path';
 import fs from 'fs';
 import yaml from 'js-yaml';
 import { identity, pickBy } from 'lodash';
+import { unflattenObject } from '@kbn/observability-utils-common/object/unflatten_object';

 export type KibanaConfig = ReturnType<typeof readKibanaConfig>;

@ -35,10 +36,14 @@ export const readKibanaConfig = () => {
  };

  return {
-    'elasticsearch.hosts': 'http://localhost:9200',
-    'elasticsearch.username': 'elastic',
-    'elasticsearch.password': 'changeme',
-    ...loadedKibanaConfig,
-    ...cliEsCredentials,
+    elasticsearch: {
+      hosts: 'http://localhost:9200',
+      username: 'elastic',
+      password: 'changeme',
+    },
+    ...unflattenObject({
+      ...loadedKibanaConfig,
+      ...cliEsCredentials,
+    }),
  };
 };
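The change above moves from flat dotted keys to a nested `elasticsearch` object. One detail worth keeping in mind is that object spread is shallow. The sketch below illustrates this with a hypothetical `unflatten` stand-in; it assumes, without having checked the implementation, that `unflattenObject` similarly turns `{ 'a.b': 1 }` into `{ a: { b: 1 } }`. If that assumption holds, a loaded config that sets only `elasticsearch.hosts` would replace the whole default `elasticsearch` object, dropping the default username and password.

```typescript
type Nested = { [key: string]: unknown };

// Hypothetical stand-in for an unflatten helper:
// turns { 'a.b': 1 } into { a: { b: 1 } }.
function unflatten(flat: Record<string, unknown>): Nested {
  const result: Nested = {};
  for (const [dottedKey, value] of Object.entries(flat)) {
    const parts = dottedKey.split('.');
    const leaf = parts.pop() as string;
    let cursor = result;
    for (const part of parts) {
      if (typeof cursor[part] !== 'object' || cursor[part] === null) {
        cursor[part] = {};
      }
      cursor = cursor[part] as Nested;
    }
    cursor[leaf] = value;
  }
  return result;
}

const defaults = {
  elasticsearch: { hosts: 'http://localhost:9200', username: 'elastic', password: 'changeme' },
};
// Example value only; stands in for a kibana.dev.yml that sets just the host.
const loaded = unflatten({ 'elasticsearch.hosts': 'https://example.test:9243' });

// Shallow spread: the loaded `elasticsearch` object replaces the default one.
const shallow: Nested = { ...defaults, ...loaded };
const shallowEs = shallow.elasticsearch as Nested;

// Merging one level deeper keeps defaults for keys the loaded config omits.
const merged = {
  elasticsearch: { ...defaults.elasticsearch, ...(loaded.elasticsearch as Nested) },
};

console.log(shallowEs.username); // undefined
console.log(merged.elasticsearch.username); // elastic
```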