New Integration Assistant plugin (#184296)

## Summary

This is a PR to add a new backend plugin (the frontend will be done in
a separate [PR](https://github.com/elastic/kibana/pull/184546)).

The purpose of the plugin is to provide a set of API routes used to run
a variety of GenAI workflows that generate new integrations based on
the provided inputs.

It reuses the existing GenAI connectors for LLM communication, and
provides a set of APIs to create the ECS mapping, Categorization, and
Related Fields results, plus an API to generate the actual integration
package zip, which is forwarded to the UI component.

### Planned follow-up changes:

As the PR is getting far too large, some planned changes will be added
in much smaller follow-ups. These mostly cover improved try/catch
handling for certain routes, debug/error log entries where relevant
(especially for the API endpoints themselves), and some more unit and
end-to-end tests.

- OpenAPI spec for the API will be handled in a separate PR
- All the missing unit tests will be added in a follow-up PR

### Testing

The `integration_assistant` plugin will be disabled by default while
it's being implemented, so we can iterate and merge partial PRs without
interfering with releases. This config setting works as our feature
flag:


See `x-pack/plugins/integration_assistant/server/config.ts` (lines 11-13, at commit 6aefd4ff7b):
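
For reference, the relevant excerpt from that file (the full file appears later in this diff):

```ts
export const configSchema = schema.object({
  enabled: schema.boolean({ defaultValue: false }),
});
```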

To test it, add this to your _kibana.dev.yml_:
```yaml
xpack.integration_assistant.enabled: true
```
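
With the flag enabled, the routes can be exercised directly. A minimal smoke test of the ECS mapping endpoint might look like the sketch below (the host, credentials, and sample values are illustrative, and a configured GenAI connector is assumed; requires Node 18+ for global `fetch`):

```ts
// Sketch only - host, credentials, and samples are illustrative.
// The `kbn-xsrf` header is required for Kibana API calls.
const res = await fetch('http://localhost:5601/api/integration_assistant/ecs', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'kbn-xsrf': 'true',
    Authorization: `Basic ${Buffer.from('elastic:changeme').toString('base64')}`,
  },
  body: JSON.stringify({
    packageName: 'mysql_enterprise',
    dataStreamName: 'audit',
    rawSamples: ['{ "timestamp": "2020-10-19 19:31:31", "event": "status" }'],
  }),
});
const { results } = await res.json(); // { mapping, pipeline }
```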

### Checklist

Delete any items that are not applicable to this PR.

- [x]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

### Risk Matrix

Delete this section if it is not applicable to this PR.

Before closing this PR, invite QA, stakeholders, and other developers to
identify risks that should be tested prior to the change/feature
release.

When forming the risk matrix, consider some of the following examples
and how they may potentially impact the change:

| Risk | Probability | Severity | Mitigation/Notes |
|------|-------------|----------|-------------------|
| Multiple Spaces—unexpected behavior in non-default Kibana Space. | Low | High | Integration tests will verify that all features are still supported in non-default Kibana Space and when user switches between spaces. |
| Multiple nodes—Elasticsearch polling might have race conditions when multiple Kibana nodes are polling for the same tasks. | High | Low | Tasks are idempotent, so executing them multiple times will not result in logical error, but will degrade performance. To test for this case we add plenty of unit tests around this logic and document manual testing procedure. |
| Code should gracefully handle cases when feature X or plugin Y are disabled. | Medium | High | Unit tests will verify that any feature flag or plugin combination still results in our service operational. |
| [See more potential risk examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx) | | | |


### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: Patryk Kopycinski <contact@patrykkopycinski.com>
Co-authored-by: Sergi Massaneda <sergi.massaneda@elastic.co>
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Bharat Pasupula <saibharatchandra.pasupula@elastic.co>
Co-authored-by: Bharat Pasupula <123897612+bhapas@users.noreply.github.com>
Commit 9ed2865838 (parent 5000201d56), authored by Marius Iversen on 2024-06-14 00:48:36 +02:00 and committed via GitHub.
143 changed files with 12386 additions and 165 deletions.

@@ -1004,6 +1004,29 @@ module.exports = {
},
},
/**
* Integration assistant overrides
*/
{
// front end and common typescript and javascript files only
files: [
'x-pack/plugins/integration_assistant/public/**/*.{js,mjs,ts,tsx}',
'x-pack/plugins/integration_assistant/common/**/*.{js,mjs,ts,tsx}',
],
rules: {
'import/no-nodejs-modules': 'error',
'no-duplicate-imports': 'off',
'@typescript-eslint/no-duplicate-imports': 'error',
'no-restricted-imports': [
'error',
{
// prevents UI code from importing server side code and then webpack including it when doing builds
patterns: ['**/server/*'],
},
],
},
},
/**
* ML overrides
*/
@@ -1068,6 +1091,7 @@ module.exports = {
files: [
'x-pack/plugins/ecs_data_quality_dashboard/**/*.{ts,tsx}',
'x-pack/plugins/elastic_assistant/**/*.{ts,tsx}',
'x-pack/plugins/integration_assistant/**/*.{ts,tsx}',
'x-pack/packages/kbn-elastic-assistant/**/*.{ts,tsx}',
'x-pack/packages/kbn-elastic-assistant-common/**/*.{ts,tsx}',
'x-pack/packages/kbn-langchain/**/*.{ts,tsx}',
@@ -1082,6 +1106,7 @@ module.exports = {
excludedFiles: [
'x-pack/plugins/ecs_data_quality_dashboard/**/*.{test,mock,test_helper}.{ts,tsx}',
'x-pack/plugins/elastic_assistant/**/*.{test,mock,test_helper}.{ts,tsx}',
'x-pack/plugins/integration_assistant/**/*.{test,mock,test_helper}.{ts,tsx}',
'x-pack/packages/kbn-elastic-assistant/**/*.{test,mock,test_helper}.{ts,tsx}',
'x-pack/packages/kbn-elastic-assistant-common/**/*.{test,mock,test_helper}.{ts,tsx}',
'x-pack/packages/kbn-langchain/**/*.{test,mock,test_helper}.{ts,tsx}',
@@ -1102,6 +1127,7 @@ module.exports = {
files: [
'x-pack/plugins/ecs_data_quality_dashboard/**/*.{ts,tsx}',
'x-pack/plugins/elastic_assistant/**/*.{ts,tsx}',
'x-pack/plugins/integration_assistant/**/*.{ts,tsx}',
'x-pack/packages/kbn-elastic-assistant/**/*.{ts,tsx}',
'x-pack/packages/kbn-elastic-assistant-common/**/*.{ts,tsx}',
'x-pack/packages/kbn-langchain/**/*.{ts,tsx}',
@@ -1141,6 +1167,7 @@ module.exports = {
files: [
'x-pack/plugins/ecs_data_quality_dashboard/**/*.{js,mjs,ts,tsx}',
'x-pack/plugins/elastic_assistant/**/*.{js,mjs,ts,tsx}',
'x-pack/plugins/integration_assistant/**/*.{js,mjs,ts,tsx}',
'x-pack/packages/kbn-elastic-assistant/**/*.{js,mjs,ts,tsx}',
'x-pack/packages/kbn-elastic-assistant-common/**/*.{js,mjs,ts,tsx}',
'x-pack/packages/kbn-langchain/**/*.{js,mjs,ts,tsx}',

.github/CODEOWNERS

@@ -503,6 +503,7 @@ x-pack/plugins/observability_solution/infra @elastic/obs-ux-logs-team @elastic/o
x-pack/plugins/ingest_pipelines @elastic/kibana-management
src/plugins/input_control_vis @elastic/kibana-presentation
src/plugins/inspector @elastic/kibana-presentation
x-pack/plugins/integration_assistant @elastic/security-solution
src/plugins/interactive_setup @elastic/kibana-security
test/interactive_setup_api_integration/plugins/test_endpoints @elastic/kibana-security
packages/kbn-interpreter @elastic/kibana-visualizations

@@ -638,6 +638,10 @@ the infrastructure monitoring use-case within Kibana.
|The ingest_pipelines plugin provides Kibana support for Elasticsearch's ingest pipelines.
|{kib-repo}blob/{branch}/x-pack/plugins/integration_assistant/README.md[integrationAssistant]
|Team owner: Security Integrations Scalability
|{kib-repo}blob/{branch}/x-pack/plugins/observability_solution/investigate/README.md[investigate]
|undefined

@@ -80,7 +80,7 @@
"resolutions": {
"**/@bazel/typescript/protobufjs": "6.11.4",
"**/@hello-pangea/dnd": "16.6.0",
"**/@langchain/core": "0.1.53",
"**/@langchain/core": "0.2.3",
"**/@types/node": "20.10.5",
"**/@typescript-eslint/utils": "5.62.0",
"**/chokidar": "^3.5.3",
@@ -540,6 +540,7 @@
"@kbn/ingest-pipelines-plugin": "link:x-pack/plugins/ingest_pipelines",
"@kbn/input-control-vis-plugin": "link:src/plugins/input_control_vis",
"@kbn/inspector-plugin": "link:src/plugins/inspector",
"@kbn/integration-assistant-plugin": "link:x-pack/plugins/integration_assistant",
"@kbn/interactive-setup-plugin": "link:src/plugins/interactive_setup",
"@kbn/interactive-setup-test-endpoints-plugin": "link:test/interactive_setup_api_integration/plugins/test_endpoints",
"@kbn/interpreter": "link:packages/kbn-interpreter",
@@ -927,9 +928,10 @@
"@kbn/watcher-plugin": "link:x-pack/plugins/watcher",
"@kbn/xstate-utils": "link:packages/kbn-xstate-utils",
"@kbn/zod-helpers": "link:packages/kbn-zod-helpers",
"@langchain/community": "^0.0.44",
"@langchain/core": "^0.1.53",
"@langchain/openai": "^0.0.25",
"@langchain/community": "^0.2.4",
"@langchain/core": "0.2.3",
"@langchain/langgraph": "^0.0.23",
"@langchain/openai": "^0.0.34",
"@langtrase/trace-attributes": "^3.0.8",
"@langtrase/typescript-sdk": "^2.2.1",
"@launchdarkly/node-server-sdk": "^9.4.5",
@@ -952,10 +954,10 @@
"@paralleldrive/cuid2": "^2.2.2",
"@reduxjs/toolkit": "1.9.7",
"@slack/webhook": "^7.0.1",
"@smithy/eventstream-codec": "^2.0.12",
"@smithy/eventstream-serde-node": "^2.1.1",
"@smithy/types": "^2.9.1",
"@smithy/util-utf8": "^2.0.0",
"@smithy/eventstream-codec": "^3.0.0",
"@smithy/eventstream-serde-node": "^3.0.0",
"@smithy/types": "^3.0.0",
"@smithy/util-utf8": "^3.0.0",
"@tanstack/react-query": "^4.29.12",
"@tanstack/react-query-devtools": "^4.29.12",
"@turf/along": "6.0.1",
@@ -1067,9 +1069,10 @@
"jsonwebtoken": "^9.0.2",
"jsts": "^1.6.2",
"kea": "^2.6.0",
"langchain": "^0.1.30",
"langsmith": "^0.1.14",
"langchain": "0.2.3",
"langsmith": "^0.1.30",
"launchdarkly-js-client-sdk": "^3.3.0",
"launchdarkly-node-server-sdk": "^7.0.3",
"load-json-file": "^6.2.0",
"lodash": "^4.17.21",
"lru-cache": "^4.1.5",
@@ -1092,6 +1095,7 @@
"node-forge": "^1.3.1",
"nodemailer": "^6.9.9",
"normalize-path": "^3.0.0",
"nunjucks": "^3.2.4",
"object-hash": "^1.3.1",
"object-path-immutable": "^3.1.1",
"openai": "^4.24.1",
@@ -1504,6 +1508,7 @@
"@types/node-forge": "^1.3.10",
"@types/nodemailer": "^6.4.0",
"@types/normalize-path": "^3.0.0",
"@types/nunjucks": "^3.2.6",
"@types/object-hash": "^1.3.0",
"@types/opn": "^5.1.0",
"@types/ora": "^1.3.5",

@@ -1000,6 +1000,8 @@
"@kbn/input-control-vis-plugin/*": ["src/plugins/input_control_vis/*"],
"@kbn/inspector-plugin": ["src/plugins/inspector"],
"@kbn/inspector-plugin/*": ["src/plugins/inspector/*"],
"@kbn/integration-assistant-plugin": ["x-pack/plugins/integration_assistant"],
"@kbn/integration-assistant-plugin/*": ["x-pack/plugins/integration_assistant/*"],
"@kbn/interactive-setup-plugin": ["src/plugins/interactive_setup"],
"@kbn/interactive-setup-plugin/*": ["src/plugins/interactive_setup/*"],
"@kbn/interactive-setup-test-endpoints-plugin": ["test/interactive_setup_api_integration/plugins/test_endpoints"],

@@ -27,6 +27,7 @@ export interface ActionsClientChatOpenAIParams {
streaming?: boolean;
traceId?: string;
maxRetries?: number;
maxTokens?: number;
model?: string;
temperature?: number;
signal?: AbortSignal;
@@ -75,9 +76,11 @@ export class ActionsClientChatOpenAI extends ChatOpenAI {
streaming = true,
temperature,
timeout,
maxTokens,
}: ActionsClientChatOpenAIParams) {
super({
maxRetries,
maxTokens,
streaming,
// matters only for the LangSmith logs (Metadata > Invocation Params), which are misleading if this is not set
modelName: model ?? DEFAULT_OPEN_AI_MODEL,

@@ -35,6 +35,7 @@ export interface CustomChatModelInput extends BaseChatModelParams {
temperature?: number;
request: KibanaRequest;
streaming: boolean;
maxTokens?: number;
}
export class ActionsClientSimpleChatModel extends SimpleChatModel {
@@ -44,6 +45,7 @@ export class ActionsClientSimpleChatModel extends SimpleChatModel {
#request: KibanaRequest;
#traceId: string;
#signal?: AbortSignal;
#maxTokens?: number;
llmType: string;
streaming: boolean;
model?: string;
@@ -59,6 +61,7 @@ export class ActionsClientSimpleChatModel extends SimpleChatModel {
temperature,
signal,
streaming,
maxTokens,
}: CustomChatModelInput) {
super({});
@@ -68,6 +71,7 @@ export class ActionsClientSimpleChatModel extends SimpleChatModel {
this.#logger = logger;
this.#signal = signal;
this.#request = request;
this.#maxTokens = maxTokens;
this.llmType = llmType ?? 'ActionsClientSimpleChatModel';
this.model = model;
this.temperature = temperature;
@@ -95,7 +99,7 @@ export class ActionsClientSimpleChatModel extends SimpleChatModel {
throw new Error('No messages provided.');
}
const formattedMessages = [];
if (messages.length === 2) {
if (messages.length >= 2) {
messages.forEach((message, i) => {
if (typeof message.content !== 'string') {
throw new Error('Multimodal messages are not supported.');
@@ -121,6 +125,7 @@
subActionParams: {
model: this.model,
messages: formattedMessages,
maxTokens: this.#maxTokens,
...getDefaultArguments(this.llmType, this.temperature, options.stop),
},
},

@@ -212,6 +212,19 @@ Object {
"presence": "optional",
},
"keys": Object {
"maxTokens": Object {
"flags": Object {
"default": [Function],
"error": [Function],
"presence": "optional",
},
"metas": Array [
Object {
"x-oas-optional": true,
},
],
"type": "number",
},
"messages": Object {
"flags": Object {
"error": [Function],
@@ -399,6 +412,19 @@
"presence": "optional",
},
"keys": Object {
"maxTokens": Object {
"flags": Object {
"default": [Function],
"error": [Function],
"presence": "optional",
},
"metas": Array [
Object {
"x-oas-optional": true,
},
],
"type": "number",
},
"messages": Object {
"flags": Object {
"error": [Function],

@@ -0,0 +1,66 @@
# Integration Assistant
## Overview
Team owner: Security Integrations Scalability
This is a new Kibana plugin created to help users automatically generate integration packages based on provided log samples and other relevant information.
## Features
Exposes the following APIs, which can be consumed by any frontend plugin:
- ECS Mapping API
- Categorization API
- Related Fields API
- Build Integration API
- Optional Test Pipeline API (used to update pipeline results when a user changes the ingest pipeline in the UI).
## Development
### Backend
#### Overview
The backend part of the plugin uses LangGraph extensively to parse the provided log samples and generate the integration package.
One LangGraph instance is created that includes one or more `nodes`, where each node represents a step in the integration package generation process.
Each node links to a specific function, usually a `handler` defined in its own file under each graph folder, that is executed when the node is reached.
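As an illustration of this wiring, here is a sketch only (not code from this PR); it assumes the `@langchain/langgraph` 0.0.x `StateGraph` API added to `package.json` in this change, and a preconfigured chat model:
```ts
import { StateGraph } from '@langchain/langgraph';
import { handleCategorization } from './categorization';
import { handleErrors } from './errors';

// Channels mirror a subset of the graph state (CategorizationState here);
// each channel keeps the last value written by a node.
const channels = {
  currentProcessors: { value: (_old: unknown, next: unknown) => next, default: () => [] },
  currentPipeline: { value: (_old: unknown, next: unknown) => next, default: () => ({}) },
  lastExecutedChain: { value: (_old: unknown, next: unknown) => next, default: () => '' },
};

export function getCategorizationGraph(model: any) {
  const workflow = new StateGraph({ channels });
  // Each node is one step in the generation process, backed by its handler file.
  workflow.addNode('categorization', (state: any) => handleCategorization(state, model));
  workflow.addNode('errors', (state: any) => handleErrors(state, model));
  workflow.addEdge('categorization', 'errors');
  workflow.setEntryPoint('categorization');
  workflow.setFinishPoint('errors');
  return workflow.compile(); // the compiled graph is what a route handler invokes
}
```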
#### Structure
**Graphs**
The graph components are split into logical parts and are placed in separate folders for each graph under the `./server/graphs` directory.
Each graph folder needs to contain at least one `graph.ts`, which exports a function that returns the compiled graph object.
Each exported graph function is then linked up to one or more API routes.
**Routes**
All routes are defined under `./server/routes`, each in its own file, and then registered in `./server/routes/register_routes.ts`.
**Integration Builder**
The integration builder is the last step in the expected API flow (ECS Mapping -> Categorization -> Related Fields -> Integration Builder).
Given the provided package and data stream details, an optional logo, and a list of sample logs, the API builds out the entire folder structure and files required for the integration package, archives it, and returns it as a `Buffer`.
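For example, a consumer might call the build route and write the returned archive to disk, roughly as sketched below (field values are illustrative, authentication is omitted, and Node 18+ is assumed for global `fetch`):
```ts
import { writeFile } from 'node:fs/promises';

// Sketch of consuming the Build Integration API; payload shape follows the
// Integration/DataStream types in ./common, but the values are illustrative.
const res = await fetch('http://localhost:5601/api/integration_assistant/build', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'kbn-xsrf': 'true' },
  body: JSON.stringify({
    integration: {
      name: 'mysql_enterprise',
      title: 'MySQL Enterprise',
      description: 'Collect audit logs from MySQL Enterprise.',
      dataStreams: [
        {
          name: 'audit',
          title: 'Audit logs',
          description: 'Audit log data stream',
          inputTypes: ['filestream'],
          rawSamples: ['{"event": "status"}'],
          pipeline: {}, // output of the ECS/Categorization/Related steps
          docs: [],
        },
      ],
    },
  }),
});
// The route returns the archived package, which can be written out as a zip.
await writeFile('mysql_enterprise.zip', Buffer.from(await res.arrayBuffer()));
```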
**Templates**
Currently the templates are stored as `nunjucks` files, since they were converted from `jinja2` templates, which use the exact same format. Longer term, this will most likely be switched to Kibana's forked Handlebars templating engine.
The templates are stored in the `./server/templates` directory and are used to generate the integration package files while running the Integration Builder API.
One template (`pipeline.yml.njk`) is used by the ECS Mapping API to generate the boilerplate ingest pipeline structure we want to use for all generated integrations.
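Rendering works roughly as follows with the `nunjucks` dependency added in this PR (the template directory and variable names here are assumptions, not taken from the diff):
```ts
import { join } from 'node:path';
import nunjucks from 'nunjucks';

// Load templates from the plugin's templates directory (path is illustrative).
nunjucks.configure(join(__dirname, 'templates'), { autoescape: false });

// Render the ingest pipeline boilerplate; variable names are illustrative.
const pipelineYaml = nunjucks.render('pipeline.yml.njk', {
  package_name: 'mysql_enterprise',
  data_stream_name: 'audit',
});
console.log(pipelineYaml);
```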
## Tests
All mocks/fixtures are placed in the plugin's top-level `./__jest__` directory. If many mocks/fixtures are required, try to split them into separate files.
Tests can be run with:
```bash
node scripts/jest x-pack/plugins/integration_assistant/ --coverage
```

@@ -0,0 +1,289 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type { Pipeline } from '../../common';
export const categorizationInitialPipeline: Pipeline = {
description: 'Pipeline to process mysql_enterprise audit logs',
processors: [
{
set: {
field: 'ecs.version',
value: '8.11.0',
},
},
{
rename: {
field: 'message',
target_field: 'event.original',
ignore_missing: true,
if: 'ctx.event?.original == null',
},
},
{
remove: {
field: 'event.original',
tag: 'remove_original_event',
if: 'ctx?.tags == null || !(ctx.tags.contains("preserve_original_event"))',
ignore_failure: true,
ignore_missing: true,
},
},
],
};
export const categorizationExpectedResults = {
docs: [
{
key: 'value',
anotherKey: 'anotherValue',
},
],
pipeline: {
description: 'Pipeline to process mysql_enterprise audit logs',
processors: [
{
set: {
field: 'ecs.version',
value: '8.11.0',
},
},
{
append: {
field: 'event.type',
value: ['change'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
{
append: {
field: 'event.category',
value: ['database'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
{
rename: {
field: 'message',
target_field: 'event.original',
ignore_missing: true,
if: 'ctx.event?.original == null',
},
},
{
remove: {
field: 'event.original',
tag: 'remove_original_event',
if: 'ctx?.tags == null || !(ctx.tags.contains("preserve_original_event"))',
ignore_failure: true,
ignore_missing: true,
},
},
],
},
};
export const categorizationInitialMockedResponse = [
{
append: {
field: 'event.type',
value: ['creation'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
{
append: {
field: 'event.category',
value: ['database'],
if: "ctx.mysql_enterprise.audit.general_data.sql_command == 'create_db'",
},
},
];
export const categorizationErrorMockedResponse = [
{
append: {
field: 'event.type',
value: ['creation'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
{
append: {
field: 'event.category',
value: ['database'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
];
export const categorizationInvalidMockedResponse = [
{
append: {
field: 'event.type',
value: ['change'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
{
append: {
field: 'event.category',
value: ['database'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
];
export const categorizationReviewMockedResponse = [
{
append: {
field: 'event.type',
value: ['change'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
{
append: {
field: 'event.category',
value: ['database'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
];
export const testPipelineError: { pipelineResults: object[]; errors: object[] } = {
pipelineResults: [],
errors: [{ error: 'Sample error message 1' }, { error: 'Sample error message 2' }],
};
export const testPipelineValidResult: { pipelineResults: object[]; errors: object[] } = {
pipelineResults: [{ key: 'value', anotherKey: 'anotherValue' }],
errors: [],
};
export const testPipelineInvalidEcs: { pipelineResults: object[]; errors: object[] } = {
pipelineResults: [
{ event: { type: ['database'], category: ['creation'] }, anotherKey: 'anotherValue' },
],
errors: [],
};
export const categorizationTestState = {
rawSamples: ['{"test1": "test1"}'],
samples: ['{ "test1": "test1" }'],
formattedSamples: '{"test1": "test1"}',
ecsTypes: 'testtypes',
ecsCategories: 'testcategories',
exAnswer: 'testanswer',
lastExecutedChain: 'testchain',
packageName: 'testpackage',
dataStreamName: 'testdatastream',
errors: { test: 'testerror' },
pipelineResults: [{ test: 'testresult' }],
finalized: false,
reviewed: false,
currentPipeline: { test: 'testpipeline' },
currentProcessors: [
{
append: {
field: 'event.type',
value: ['creation'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
{
append: {
field: 'event.category',
value: ['database'],
if: "ctx.mysql_enterprise.audit.general_data.sql_command == 'create_db'",
},
},
],
invalidCategorization: { test: 'testinvalid' },
initialPipeline: categorizationInitialPipeline,
results: { test: 'testresults' },
};
export const categorizationMockProcessors = [
{
append: {
field: 'event.type',
value: ['creation'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
{
append: {
field: 'event.category',
value: ['database'],
if: "ctx.mysql_enterprise.audit.general_data.sql_command == 'create_db'",
},
},
];
export const categorizationExpectedHandlerResponse = {
currentPipeline: {
description: 'Pipeline to process mysql_enterprise audit logs',
processors: [
{
set: {
field: 'ecs.version',
value: '8.11.0',
},
},
{
append: {
field: 'event.type',
value: ['creation'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
{
append: {
field: 'event.category',
value: ['database'],
if: "ctx.mysql_enterprise.audit.general_data.sql_command == 'create_db'",
},
},
{
rename: {
field: 'message',
target_field: 'event.original',
ignore_missing: true,
if: 'ctx.event?.original == null',
},
},
{
remove: {
field: 'event.original',
tag: 'remove_original_event',
if: 'ctx?.tags == null || !(ctx.tags.contains("preserve_original_event"))',
ignore_failure: true,
ignore_missing: true,
},
},
],
},
currentProcessors: [
{
append: {
field: 'event.type',
value: ['creation'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
{
append: {
field: 'event.category',
value: ['database'],
if: "ctx.mysql_enterprise.audit.general_data.sql_command == 'create_db'",
},
},
],
reviewed: false,
lastExecutedChain: 'error',
};

@@ -0,0 +1,448 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
export const ecsMappingExpectedResults = {
mapping: {
mysql_enterprise: {
audit: {
test_array: null,
timestamp: {
target: '@timestamp',
confidence: 0.99,
type: 'date',
date_formats: ['yyyy-MM-dd HH:mm:ss'],
},
id: null,
class: null,
cpu_usage: {
target: 'host.cpu.usage',
confidence: 0.99,
type: 'number',
date_formats: [],
},
bytes: {
target: 'network.bytes',
confidence: 0.99,
type: 'number',
date_formats: [],
},
account: {
user: {
target: 'user.name',
type: 'string',
date_formats: [],
confidence: 1,
},
ip: {
target: 'source.ip',
type: 'string',
date_formats: [],
confidence: 1,
},
},
event: {
target: 'event.action',
confidence: 0.8,
type: 'string',
date_formats: [],
},
},
},
},
pipeline: {
description: 'Pipeline to process mysql_enterprise audit logs',
processors: [
{
set: {
field: 'ecs.version',
tag: 'set_ecs_version',
value: '8.11.0',
},
},
{
rename: {
field: 'message',
target_field: 'event.original',
tag: 'rename_message',
ignore_missing: true,
if: 'ctx.event?.original == null',
},
},
{
remove: {
field: 'message',
ignore_missing: true,
tag: 'remove_message',
if: 'ctx.event?.original != null',
},
},
{
json: {
field: 'event.original',
tag: 'json_original',
target_field: 'mysql_enterprise.audit',
},
},
{
date: {
field: 'mysql_enterprise.audit.timestamp',
target_field: '@timestamp',
formats: ['yyyy-MM-dd HH:mm:ss'],
if: 'ctx.mysql_enterprise?.audit?.timestamp != null',
},
},
{
rename: {
field: 'mysql_enterprise.audit.cpu_usage',
target_field: 'host.cpu.usage',
ignore_missing: true,
},
},
{
rename: {
field: 'mysql_enterprise.audit.bytes',
target_field: 'network.bytes',
ignore_missing: true,
},
},
{
rename: {
field: 'mysql_enterprise.audit.account.user',
target_field: 'user.name',
ignore_missing: true,
},
},
{
convert: {
field: 'mysql_enterprise.audit.account.ip',
target_field: 'source.ip',
ignore_missing: true,
ignore_failure: true,
type: 'ip',
},
},
{
rename: {
field: 'mysql_enterprise.audit.event',
target_field: 'event.action',
ignore_missing: true,
},
},
{
script: {
description: 'Drops null/empty values recursively.',
tag: 'script_drop_null_empty_values',
lang: 'painless',
source:
'boolean dropEmptyFields(Object object) {\n if (object == null || object == "") {\n return true;\n } else if (object instanceof Map) {\n ((Map) object).values().removeIf(value -> dropEmptyFields(value));\n return (((Map) object).size() == 0);\n } else if (object instanceof List) {\n ((List) object).removeIf(value -> dropEmptyFields(value));\n return (((List) object).length == 0);\n }\n return false;\n}\ndropEmptyFields(ctx);\n',
},
},
{
geoip: {
field: 'source.ip',
tag: 'geoip_source_ip',
target_field: 'source.geo',
ignore_missing: true,
},
},
{
geoip: {
ignore_missing: true,
database_file: 'GeoLite2-ASN.mmdb',
field: 'source.ip',
tag: 'geoip_source_asn',
target_field: 'source.as',
properties: ['asn', 'organization_name'],
},
},
{
rename: {
field: 'source.as.asn',
tag: 'rename_source_as_asn',
target_field: 'source.as.number',
ignore_missing: true,
},
},
{
rename: {
field: 'source.as.organization_name',
tag: 'rename_source_as_organization_name',
target_field: 'source.as.organization.name',
ignore_missing: true,
},
},
{
geoip: {
field: 'destination.ip',
tag: 'geoip_destination_ip',
target_field: 'destination.geo',
ignore_missing: true,
},
},
{
geoip: {
database_file: 'GeoLite2-ASN.mmdb',
field: 'destination.ip',
tag: 'geoip_destination_asn',
target_field: 'destination.as',
properties: ['asn', 'organization_name'],
ignore_missing: true,
},
},
{
rename: {
field: 'destination.as.asn',
tag: 'rename_destination_as_asn',
target_field: 'destination.as.number',
ignore_missing: true,
},
},
{
rename: {
field: 'destination.as.organization_name',
tag: 'rename_destination_as_organization_name',
target_field: 'destination.as.organization.name',
ignore_missing: true,
},
},
{
remove: {
field: ['mysql_enterprise.audit.account.ip'],
ignore_missing: true,
tag: 'remove_fields',
},
},
{
remove: {
field: 'event.original',
tag: 'remove_original_event',
if: 'ctx?.tags == null || !(ctx.tags.contains("preserve_original_event"))',
ignore_failure: true,
ignore_missing: true,
},
},
],
on_failure: [
{
append: {
field: 'error.message',
value:
'Processor {{{_ingest.on_failure_processor_type}}} with tag {{{_ingest.on_failure_processor_tag}}} in pipeline {{{_ingest.on_failure_pipeline}}} failed with message: {{{_ingest.on_failure_message}}}',
},
},
{
set: {
field: 'event.kind',
value: 'pipeline_error',
},
},
],
},
};
export const ecsInitialMappingMockedResponse = {
mysql_enterprise: {
audit: {
test_array: null,
timestamp: {
target: 'event.action',
confidence: 0.99,
type: 'string',
date_formats: ['yyyy-MM-dd HH:mm:ss'],
},
class: null,
id: {
target: 'file.code_signature.trusted',
confidence: 0.99,
type: 'boolean',
date_formats: [],
},
cpu_usage: {
target: 'host.cpu.usage',
confidence: 0.99,
type: 'number',
date_formats: [],
},
bytes: {
target: 'network.bytes',
confidence: 0.99,
type: 'number',
date_formats: [],
},
account: {
user: {
target: 'user.name',
type: 'string',
date_formats: [],
confidence: 1.0,
},
ip: {
target: 'source.ip',
type: 'string',
date_formats: [],
confidence: 1.0,
},
},
event: {
target: 'event.action',
confidence: 0.8,
type: 'string',
date_formats: [],
},
},
},
};
export const ecsDuplicateMockedResponse = {
mysql_enterprise: {
audit: {
test_array: null,
timestamp: {
target: '@timestamp',
confidence: 0.99,
type: 'date',
date_formats: ['yyyy-MM-dd HH:mm:ss'],
},
id: null,
bytes: {
target: 'network.bytes',
confidence: 0.99,
type: 'number',
date_formats: [],
},
account: {
user: {
target: 'user.name',
type: 'string',
date_formats: [],
confidence: 1.0,
},
ip: {
target: 'source.ip',
type: 'string',
date_formats: [],
confidence: 1.0,
},
},
},
},
};
export const ecsMissingKeysMockedResponse = {
mysql_enterprise: {
audit: {
test_array: null,
timestamp: {
target: '@timestamp',
confidence: 0.99,
type: 'date',
date_formats: ['yyyy-MM-dd HH:mm:ss'],
},
id: null,
class: null,
cpu_usage: {
target: 'host.cpu.usage',
confidence: 0.99,
type: 'number',
date_formats: [],
},
bytes: {
target: 'network.bytes',
confidence: 0.99,
type: 'number',
date_formats: [],
},
account: {
user: {
target: 'user.name',
type: 'string',
date_formats: [],
confidence: 1.0,
},
ip: {
target: 'source.ip',
type: 'string',
date_formats: [],
confidence: 1.0,
},
},
event: {
target: 'invalid.ecs.field',
confidence: 0.8,
type: 'string',
date_formats: [],
},
},
},
};
export const ecsInvalidMappingMockedResponse = {
mysql_enterprise: {
audit: {
test_array: null,
timestamp: {
target: '@timestamp',
confidence: 0.99,
type: 'date',
date_formats: ['yyyy-MM-dd HH:mm:ss'],
},
id: null,
class: null,
cpu_usage: {
target: 'host.cpu.usage',
confidence: 0.99,
type: 'number',
date_formats: [],
},
bytes: {
target: 'network.bytes',
confidence: 0.99,
type: 'number',
date_formats: [],
},
account: {
user: {
target: 'user.name',
type: 'string',
date_formats: [],
confidence: 1.0,
},
ip: {
target: 'source.ip',
type: 'string',
date_formats: [],
confidence: 1.0,
},
},
event: {
target: 'event.action',
confidence: 0.8,
type: 'string',
date_formats: [],
},
},
},
};
export const ecsTestState = {
ecs: 'teststring',
exAnswer: 'testanswer',
finalized: false,
currentPipeline: { test: 'testpipeline' },
duplicateFields: [],
missingKeys: [],
invalidEcsFields: [],
results: { test: 'testresults' },
logFormat: 'testlogformat',
ecsVersion: 'testversion',
currentMapping: { test1: 'test1' },
lastExecutedChain: 'testchain',
rawSamples: ['{"test1": "test1"}'],
samples: ['{ "test1": "test1" }'],
packageName: 'testpackage',
dataStreamName: 'testdatastream',
formattedSamples: '{"test1": "test1"}',
};

@@ -0,0 +1,54 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type { Pipeline } from '../../common';
const currentPipelineMock: Pipeline = {
description: 'Pipeline to process mysql_enterprise audit logs',
processors: [
{
set: {
field: 'ecs.version',
value: '8.11.0',
},
},
{
rename: {
field: 'message',
target_field: 'event.original',
ignore_missing: true,
if: 'ctx.event?.original == null',
},
},
{
remove: {
field: 'event.original',
tag: 'remove_original_event',
if: 'ctx?.tags == null || !(ctx.tags.contains("preserve_original_event"))',
ignore_failure: true,
ignore_missing: true,
},
},
],
};
export const mockedRequest = {
rawSamples: [
'{ "timestamp": "2020-10-19 19:31:31", "cpu_usage": 0.1, "class": "general", "event": "status", "test_array": ["test1", "test2"]}',
'{ "timestamp": "2020-10-19 19:32:10", "cpu_usage": 0.2, "class": "connection", "event": "disconnect", "bytes": 16, "account": { "user": "audit_test_user2", "ip": "10.10.10.10" }}',
],
packageName: 'mysql_enterprise',
dataStreamName: 'audit',
};
export const mockedRequestWithPipeline = {
rawSamples: [
'{ "timestamp": "2020-10-19 19:31:31", "cpu_usage": 0.1, "class": "general", "event": "status", "test_array": ["test1", "test2"]}',
'{ "timestamp": "2020-10-19 19:32:10", "cpu_usage": 0.2, "class": "connection", "event": "disconnect", "bytes": 16, "account": { "user": "audit_test_user2", "ip": "10.10.10.10" }}',
],
packageName: 'mysql_enterprise',
dataStreamName: 'audit',
currentPipeline: currentPipelineMock,
};

@@ -0,0 +1,277 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type { Pipeline } from '../../common';
export const relatedInitialPipeline: Pipeline = {
description: 'Pipeline to process mysql_enterprise audit logs',
processors: [
{
set: {
field: 'ecs.version',
value: '8.11.0',
},
},
{
rename: {
field: 'message',
target_field: 'event.original',
ignore_missing: true,
if: 'ctx.event?.original == null',
},
},
{
remove: {
field: 'event.original',
tag: 'remove_original_event',
if: 'ctx?.tags == null || !(ctx.tags.contains("preserve_original_event"))',
ignore_failure: true,
ignore_missing: true,
},
},
],
};
export const relatedExpectedResults = {
docs: [
{
key: 'value',
anotherKey: 'anotherValue',
},
],
pipeline: {
description: 'Pipeline to process mysql_enterprise audit logs',
processors: [
{
set: {
field: 'ecs.version',
value: '8.11.0',
},
},
{
append: {
field: 'related.ip',
value: ['{{{source.ip}}}'],
allow_duplicates: false,
if: 'ctx.source?.ip != null',
},
},
{
append: {
field: 'related.ip',
value: ['{{{destination.ip}}}'],
allow_duplicates: false,
if: 'ctx.destination?.ip != null',
},
},
{
rename: {
field: 'message',
target_field: 'event.original',
ignore_missing: true,
if: 'ctx.event?.original == null',
},
},
{
remove: {
field: 'event.original',
tag: 'remove_original_event',
if: 'ctx?.tags == null || !(ctx.tags.contains("preserve_original_event"))',
ignore_failure: true,
ignore_missing: true,
},
},
],
},
};
export const relatedInitialMockedResponse = [
{
append: {
field: 'related.ip',
value: ['{{{source.ip}?.split(":")[0]}}'],
allow_duplicates: false,
if: 'ctx.source?.ip != null',
},
},
{
append: {
field: 'related.ip',
value: ['{{{destination.ip}}}'],
allow_duplicates: false,
if: 'ctx.destination?.ip != null',
},
},
];
export const relatedErrorMockedResponse = [
{
append: {
field: 'related.ip',
value: ['{{{source.ip}}}'],
allow_duplicates: false,
if: 'ctx.source?.ip != null',
},
},
{
append: {
field: 'related.ip',
value: ['{{{destination.ip}}}'],
allow_duplicates: false,
if: 'ctx.destination?.ip != null',
},
},
];
export const relatedReviewMockedResponse = [
{
append: {
field: 'related.ip',
value: ['{{{source.ip}}}'],
allow_duplicates: false,
if: 'ctx.source?.ip != null',
},
},
{
append: {
field: 'related.ip',
value: ['{{{destination.ip}}}'],
allow_duplicates: false,
if: 'ctx.destination?.ip != null',
},
},
];
export const testPipelineError: { pipelineResults: object[]; errors: object[] } = {
pipelineResults: [],
errors: [{ error: 'Sample error message 1' }, { error: 'Sample error message 2' }],
};
export const testPipelineValidResult: { pipelineResults: object[]; errors: object[] } = {
pipelineResults: [{ key: 'value', anotherKey: 'anotherValue' }],
errors: [],
};
export const relatedTestState = {
rawSamples: ['{"test1": "test1"}'],
samples: ['{ "test1": "test1" }'],
formattedSamples: '{"test1": "test1"}',
ecs: 'testtypes',
exAnswer: 'testanswer',
packageName: 'testpackage',
dataStreamName: 'testdatastream',
errors: { test: 'testerror' },
pipelineResults: [{ test: 'testresult' }],
finalized: false,
reviewed: false,
currentPipeline: { test: 'testpipeline' },
currentProcessors: [
{
append: {
field: 'related.ip',
value: ['{{{source.ip}?.split(":")[0]}}'],
allow_duplicates: false,
if: 'ctx.source?.ip != null',
},
},
{
append: {
field: 'related.ip',
value: ['{{{destination.ip}}}'],
allow_duplicates: false,
if: 'ctx.destination?.ip != null',
},
},
],
initialPipeline: relatedInitialPipeline,
results: { test: 'testresults' },
lastExecutedChain: 'testchain',
};
export const relatedMockProcessors = [
{
append: {
field: 'related.ip',
value: ['{{{source.ip}?.split(":")[0]}}'],
allow_duplicates: false,
if: 'ctx.source?.ip != null',
},
},
{
append: {
field: 'related.ip',
value: ['{{{destination.ip}}}'],
allow_duplicates: false,
if: 'ctx.destination?.ip != null',
},
},
];
export const relatedExpectedHandlerResponse = {
currentPipeline: {
description: 'Pipeline to process mysql_enterprise audit logs',
processors: [
{
set: {
field: 'ecs.version',
value: '8.11.0',
},
},
{
append: {
field: 'related.ip',
value: ['{{{source.ip}?.split(":")[0]}}'],
allow_duplicates: false,
if: 'ctx.source?.ip != null',
},
},
{
append: {
field: 'related.ip',
value: ['{{{destination.ip}}}'],
allow_duplicates: false,
if: 'ctx.destination?.ip != null',
},
},
{
rename: {
field: 'message',
target_field: 'event.original',
ignore_missing: true,
if: 'ctx.event?.original == null',
},
},
{
remove: {
field: 'event.original',
tag: 'remove_original_event',
if: 'ctx?.tags == null || !(ctx.tags.contains("preserve_original_event"))',
ignore_failure: true,
ignore_missing: true,
},
},
],
},
currentProcessors: [
{
append: {
field: 'event.type',
value: ['creation'],
if: "ctx.mysql_enterprise?.audit?.general_data?.sql_command == 'create_db'",
},
},
{
append: {
field: 'event.category',
value: ['database'],
if: "ctx.mysql_enterprise.audit.general_data.sql_command == 'create_db'",
},
},
],
reviewed: false,
lastExecutedChain: 'error',
};

@@ -0,0 +1,20 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
// Plugin information
export const PLUGIN_ID = 'integrationAssistant';
// Public App Routes
export const INTEGRATION_ASSISTANT_APP_ROUTE = '/app/integration_assistant';
// Server API Routes
export const INTEGRATION_ASSISTANT_BASE_PATH = '/api/integration_assistant';
export const ECS_GRAPH_PATH = `${INTEGRATION_ASSISTANT_BASE_PATH}/ecs`;
export const CATEGORIZATION_GRAPH_PATH = `${INTEGRATION_ASSISTANT_BASE_PATH}/categorization`;
export const RELATED_GRAPH_PATH = `${INTEGRATION_ASSISTANT_BASE_PATH}/related`;
export const INTEGRATION_BUILDER_PATH = `${INTEGRATION_ASSISTANT_BASE_PATH}/build`;
export const TEST_PIPELINE_PATH = `${INTEGRATION_ASSISTANT_BASE_PATH}/pipeline`;

(File diff suppressed because it is too large.)

@@ -0,0 +1,35 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
export type {
BuildIntegrationApiRequest,
EcsMappingApiRequest,
CategorizationApiRequest,
RelatedApiRequest,
CategorizationApiResponse,
RelatedApiResponse,
EcsMappingApiResponse,
Pipeline,
ESProcessorItem,
ESProcessorOptions,
DataStream,
Integration,
InputTypes,
TestPipelineApiRequest,
TestPipelineApiResponse,
} from './types';
export {
PLUGIN_ID,
INTEGRATION_ASSISTANT_APP_ROUTE,
ECS_GRAPH_PATH,
CATEGORIZATION_GRAPH_PATH,
RELATED_GRAPH_PATH,
TEST_PIPELINE_PATH,
INTEGRATION_BUILDER_PATH,
INTEGRATION_ASSISTANT_BASE_PATH,
} from './constants';

@@ -0,0 +1,119 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
export interface ESProcessorOptions {
on_failure?: ESProcessorItem[];
ignore_failure?: boolean;
ignore_missing?: boolean;
if?: string;
tag?: string;
[key: string]: unknown;
}
export interface ESProcessorItem {
[processorName: string]: ESProcessorOptions;
}
export interface Pipeline {
name?: string;
description?: string;
version?: number;
processors: ESProcessorItem[];
on_failure?: ESProcessorItem[];
}
export enum InputTypes {
Cloudwatch = 'aws-cloudwatch',
S3 = 'aws-s3',
AzureBlobStorage = 'azure-blob-storage',
EventHub = 'azure-eventhub',
Cloudfoundry = 'cloudfoundry',
FileStream = 'filestream',
PubSub = 'gcp-pubsub',
GoogleCloudStorage = 'gcs',
HTTPListener = 'http_endpoint',
Journald = 'journald',
Kafka = 'kafka',
TCP = 'tcp',
UDP = 'udp',
}
export interface DataStream {
name: string;
title: string;
description: string;
inputTypes: InputTypes[];
rawSamples: string[];
pipeline: object;
docs: object[];
}
export interface Integration {
name: string;
title: string;
description: string;
dataStreams: DataStream[];
logo?: string;
}
// Server Request Schemas
export interface BuildIntegrationApiRequest {
integration: Integration;
}
export interface EcsMappingApiRequest {
packageName: string;
dataStreamName: string;
rawSamples: string[];
mapping?: object;
}
export interface CategorizationApiRequest {
packageName: string;
dataStreamName: string;
rawSamples: string[];
currentPipeline: object;
}
export interface RelatedApiRequest {
packageName: string;
dataStreamName: string;
rawSamples: string[];
currentPipeline: object;
}
export interface TestPipelineApiRequest {
rawSamples: string[];
currentPipeline: Pipeline;
}
// Server Response Schemas
export interface CategorizationApiResponse {
results: {
pipeline: object;
docs: object[];
};
}
export interface RelatedApiResponse {
results: {
pipeline: object;
docs: object[];
};
}
export interface EcsMappingApiResponse {
results: {
mapping: object;
pipeline: object;
};
}
export interface TestPipelineApiResponse {
pipelineResults: object[];
errors?: object[];
}

@@ -0,0 +1,21 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
module.exports = {
preset: '@kbn/test',
rootDir: '../../..',
roots: ['<rootDir>/x-pack/plugins/integration_assistant'],
coverageDirectory: '<rootDir>/target/kibana-coverage/jest/x-pack/plugins/integration_assistant',
coverageReporters: ['text', 'html'],
collectCoverageFrom: [
'<rootDir>/x-pack/plugins/integration_assistant/{common,server}/**/*.{ts,tsx}',
'!<rootDir>/x-pack/plugins/integration_assistant/{__jest__}/**/*',
'!<rootDir>/x-pack/plugins/integration_assistant/*.test.{ts,tsx}',
'!<rootDir>/x-pack/plugins/integration_assistant/*.config.ts',
],
setupFiles: ['jest-canvas-mock'],
};

@@ -0,0 +1,15 @@
{
"type": "plugin",
"id": "@kbn/integration-assistant-plugin",
"owner": "@elastic/security-solution",
"description": "A simple example of how to use core's routing services test",
"plugin": {
"id": "integrationAssistant",
"server": true,
"browser": false,
"configPath": ["xpack", "integration_assistant"],
"requiredPlugins": ["actions", "licensing", "management", "features", "share", "fileUpload"],
"optionalPlugins": ["security", "usageCollection", "console"],
"extraPublicDirs": ["common"]
}
}

@@ -0,0 +1,18 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { schema, type TypeOf } from '@kbn/config-schema';
import type { PluginConfigDescriptor } from '@kbn/core/server';
export const configSchema = schema.object({
enabled: schema.boolean({ defaultValue: false }),
});
export type ServerlessSecuritySchema = TypeOf<typeof configSchema>;
export const config: PluginConfigDescriptor<ServerlessSecuritySchema> = {
schema: configSchema,
};

@@ -0,0 +1,10 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
export const ROUTE_HANDLER_TIMEOUT = 10 * 60 * 1000; // 10 * 60 seconds = 10 minutes
export const LANG_CHAIN_TIMEOUT = ROUTE_HANDLER_TIMEOUT - 10_000; // 9 minutes 50 seconds
export const CONNECTOR_TIMEOUT = LANG_CHAIN_TIMEOUT - 10_000; // 9 minutes 40 seconds

@@ -0,0 +1,35 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { FakeLLM } from '@langchain/core/utils/testing';
import { handleCategorization } from './categorization';
import type { CategorizationState } from '../../types';
import {
categorizationTestState,
categorizationMockProcessors,
categorizationExpectedHandlerResponse,
} from '../../../__jest__/fixtures/categorization';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: JSON.stringify(categorizationMockProcessors, null, 2),
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
const testState: CategorizationState = categorizationTestState;
describe('Testing categorization handler', () => {
it('handleCategorization()', async () => {
const response = await handleCategorization(testState, mockLlm);
expect(response.currentPipeline).toStrictEqual(
categorizationExpectedHandlerResponse.currentPipeline
);
expect(response.lastExecutedChain).toBe('categorization');
});
});

@@ -0,0 +1,39 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { JsonOutputParser } from '@langchain/core/output_parsers';
import type { ESProcessorItem, Pipeline } from '../../../common';
import type { CategorizationState } from '../../types';
import { combineProcessors } from '../../util/processors';
import { CATEGORIZATION_MAIN_PROMPT } from './prompts';
export async function handleCategorization(
state: CategorizationState,
model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
) {
const categorizationMainPrompt = CATEGORIZATION_MAIN_PROMPT;
const outputParser = new JsonOutputParser();
const categorizationMainGraph = categorizationMainPrompt.pipe(model).pipe(outputParser);
const currentProcessors = (await categorizationMainGraph.invoke({
pipeline_results: JSON.stringify(state.pipelineResults, null, 2),
ex_answer: state?.exAnswer,
ecs_categories: state?.ecsCategories,
ecs_types: state?.ecsTypes,
})) as ESProcessorItem[];
const currentPipeline = combineProcessors(state.initialPipeline as Pipeline, currentProcessors);
return {
currentPipeline,
currentProcessors,
lastExecutedChain: 'categorization',
};
}

@@ -0,0 +1,242 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
export const ECS_CATEGORIES = {
api: 'Covers events from API calls, including those from OS and network protocols. Allowed event.type combinations: access, admin, allowed, change, creation, deletion, denied, end, info, start, user',
authentication:
'Focuses on login and credential verification processes. Allowed event.type combinations: start, end, info',
configuration:
'Deals with application, process, or system settings changes. Allowed event.type combinations: access, change, creation, deletion, info',
database:
'Relates to data storage systems, such as SQL or Elasticsearch. Allowed event.type combinations: access, change, info, error',
driver:
'Involves OS device driver activities. Allowed event.type combinations: change, end, info, start',
email: 'Covers events from email messages and protocols. Allowed event.type combinations: info',
file: 'Related to file creation, access, and deletion. Allowed event.type combinations: access, change, creation, deletion, info',
host: 'Provides information about hosts, excluding activity on them. Allowed event.type combinations: access, change, end, info, start',
iam: 'Concerns users, groups, and administration events. Allowed event.type combinations: admin, change, creation, deletion, group, info, user',
intrusion_detection:
'Detects intrusions from IDS/IPS systems. Allowed event.type combinations: allowed, denied, info',
library:
'Refers to the loading of libraries into processes. Allowed event.type combinations: start',
malware: 'Focuses on malware detection events and alerts. Allowed event.type combinations: info',
network:
'Captures all network-related activities. Allowed event.type combinations: access, allowed, connection, denied, end, info, protocol, start',
package:
'Concerns software packages on hosts. Allowed event.type combinations: access, change, deletion, info, installation, start',
process:
'Addresses process-specific details. Allowed event.type combinations: access, change, end, info, start',
registry:
'Focuses on Windows registry settings. Allowed event.type combinations: access, change, creation, deletion',
session:
'Relates to persistent connections to hosts/services. Allowed event.type combinations: start, end, info',
threat:
"Describes threat actors' intentions and behaviors. Allowed event.type combinations: indicator",
vulnerability: 'Pertain to vulnerability scan outcomes. Allowed event.type combinations: info',
web: 'Concerns web server access events. access, error, Allowed event.type combinations: info',
};
export const ECS_TYPES = {
access: 'Used to indicate something was accessed. Examples include accessing databases or files.',
admin:
'Pertains to events related to admin objects, like administrative changes in IAM not tied to specific users or groups.',
allowed:
'Indicates that a certain action or event was permitted, like firewall connections that were permitted.',
change:
'Used for events indicating that something has changed, such as modifications in files or processes.',
connection:
'Mainly for network-related events, capturing details sufficient for flow or connection analysis, like Netflow or IPFIX events.',
creation: 'Denotes that something was created. A typical example is file creation.',
deletion: 'Indicates that something was removed or deleted, for instance, file deletions.',
denied:
'Refers to events where something was denied or blocked, such as a network connection that was blocked by a firewall.',
end: 'Suggests that something has concluded or ended, like a process.',
error:
'Used for events that describe errors, but not errors during event ingestion. For instance, database errors.',
group:
'Pertains to group-related events within categories, like creation or modification of user groups in IAM.',
indicator:
'Represents events that contain indicators of compromise (IOCs), commonly associated with threat detection.',
info: "Denotes purely informational events that don't imply a state change or an action. For example, system information logs.",
installation: 'Indicates that something was installed, typically software or packages.',
protocol:
'Used for events containing detailed protocol analysis, beyond just naming the protocol, especially in network events.',
start: 'Signals the commencement of something, such as a process.',
user: 'Relates to user-centric events within categories, like user creation or deletion in IAM.',
};
export const EVENT_TYPES = [
'access',
'admin',
'allowed',
'change',
'connection',
'creation',
'deletion',
'denied',
'end',
'error',
'group',
'indicator',
'info',
'installation',
'protocol',
'start',
'user',
];
export const EVENT_CATEGORIES = [
'api',
'authentication',
'configuration',
'database',
'driver',
'email',
'file',
'host',
'iam',
'intrusion_detection',
'library',
'malware',
'network',
'package',
'process',
'registry',
'session',
'threat',
'vulnerability',
'web',
];
export type EventCategories =
| 'api'
| 'authentication'
| 'configuration'
| 'database'
| 'driver'
| 'email'
| 'file'
| 'host'
| 'iam'
| 'intrusion_detection'
| 'library'
| 'network'
| 'package'
| 'process'
| 'registry'
| 'session'
| 'threat'
| 'user'
| 'vulnerability'
| 'web';
export const ECS_EVENT_TYPES_PER_CATEGORY: {
[key in EventCategories]: string[];
} = {
api: [
'access',
'admin',
'allowed',
'change',
'creation',
'deletion',
'denied',
'end',
'info',
'start',
'user',
],
authentication: ['start', 'end', 'info'],
configuration: ['access', 'change', 'creation', 'deletion', 'info'],
database: ['access', 'change', 'info', 'error'],
driver: ['change', 'end', 'info', 'start'],
email: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
file: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
host: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
iam: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
intrusion_detection: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
library: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
network: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
package: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
process: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
registry: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
session: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
threat: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
user: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
vulnerability: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
web: ['access', 'change', 'creation', 'deletion', 'info', 'start'],
};
export const CATEGORIZATION_EXAMPLE_PROCESSORS = `
If condition that determines if ctx.checkpoint?.operation is not of a specific value:
{
"append": {
"field": "event.category",
"value": "network",
"allow_duplicates": false,
"if": "ctx.checkpoint?.operation != 'Log In'"
}
}
If condition that determines if ctx.checkpoint?.operation is of a specific value:
{
"append": {
"field": "event.category",
"value": "authentication",
"allow_duplicates": false,
"if": "ctx.checkpoint?.operation == 'Log In'"
}
}
Appending multiple values when either the value Accept or Allow is found in ctx.checkpoint?.rule_action:
{
"append": {
"field": "event.type",
"value": [
"allowed",
"connection"
],
"allow_duplicates": false,
"if": "['Accept', 'Allow'].contains(ctx.checkpoint?.rule_action)"
}
}
`;
export const CATEGORIZATION_EXAMPLE_ANSWER = [
{ append: { field: 'event.type', value: ['access'] } },
{
append: {
field: 'event.type',
value: ['allowed', 'connection'],
allow_duplicates: false,
if: "['Accept', 'Allow'].contains(ctx.checkpoint?.rule_action)",
},
},
{
append: {
field: 'event.category',
value: ['network'],
allow_duplicates: false,
if: "['Accept', 'Allow'].contains(ctx.checkpoint?.rule_action)",
},
},
{
append: {
field: 'event.type',
value: ['start'],
allow_duplicates: false,
if: "ctx.checkpoint?.operation == 'Log In'",
},
},
{
append: {
field: 'event.category',
value: ['authentication'],
allow_duplicates: false,
if: "ctx.checkpoint?.operation == 'Log In'",
},
},
];

@@ -0,0 +1,35 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { FakeLLM } from '@langchain/core/utils/testing';
import { handleErrors } from './errors';
import type { CategorizationState } from '../../types';
import {
categorizationTestState,
categorizationMockProcessors,
categorizationExpectedHandlerResponse,
} from '../../../__jest__/fixtures/categorization';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: JSON.stringify(categorizationMockProcessors, null, 2),
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
const testState: CategorizationState = categorizationTestState;
describe('Testing categorization handler', () => {
it('handleErrors()', async () => {
const response = await handleErrors(testState, mockLlm);
expect(response.currentPipeline).toStrictEqual(
categorizationExpectedHandlerResponse.currentPipeline
);
expect(response.lastExecutedChain).toBe('error');
});
});

@@ -0,0 +1,42 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { JsonOutputParser } from '@langchain/core/output_parsers';
import type { ESProcessorItem, Pipeline } from '../../../common';
import type { CategorizationState } from '../../types';
import { combineProcessors } from '../../util/processors';
import { CATEGORIZATION_ERROR_PROMPT } from './prompts';
export async function handleErrors(
state: CategorizationState,
model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
) {
const categorizationErrorPrompt = CATEGORIZATION_ERROR_PROMPT;
const outputParser = new JsonOutputParser();
const categorizationErrorGraph = categorizationErrorPrompt.pipe(model).pipe(outputParser);
const currentProcessors = (await categorizationErrorGraph.invoke({
current_processors: JSON.stringify(state.currentProcessors, null, 2),
ex_answer: state.exAnswer,
errors: JSON.stringify(state.errors, null, 2),
package_name: state.packageName,
data_stream_name: state.dataStreamName,
})) as ESProcessorItem[];
const currentPipeline = combineProcessors(state.initialPipeline as Pipeline, currentProcessors);
return {
currentPipeline,
currentProcessors,
reviewed: false,
lastExecutedChain: 'error',
};
}
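
Every LLM-backed node handler in this plugin follows the same shape as `handleErrors` above: a `ChatPromptTemplate` piped into the model and then into a `JsonOutputParser`, which is also why each prompt insists on a fenced JSON answer. A rough sketch of that round trip, with a hypothetical model response:

```ts
import { JsonOutputParser } from '@langchain/core/output_parsers';

async function demo() {
  // Hypothetical model output following the prompts' "enclosed with 3 backticks" guideline.
  const fence = '`'.repeat(3);
  const modelOutput = `Please find the updated processors below:\n${fence}json\n[{ "append": { "field": "event.type", "value": ["start"] } }]\n${fence}`;
  // The parser extracts the JSON payload from the markdown fence and returns plain objects.
  const processors = await new JsonOutputParser().parse(modelOutput);
  console.log(processors); // [ { append: { field: 'event.type', value: ['start'] } } ]
}
void demo();
```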

@@ -0,0 +1,137 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type { IScopedClusterClient } from '@kbn/core/server';
import { FakeLLM } from '@langchain/core/utils/testing';
import { getCategorizationGraph } from './graph';
import {
categorizationExpectedResults,
categorizationErrorMockedResponse,
categorizationInitialMockedResponse,
categorizationInvalidMockedResponse,
categorizationReviewMockedResponse,
categorizationInitialPipeline,
testPipelineError,
testPipelineValidResult,
testPipelineInvalidEcs,
} from '../../../__jest__/fixtures/categorization';
import { mockedRequestWithPipeline } from '../../../__jest__/fixtures';
import { handleReview } from './review';
import { handleCategorization } from './categorization';
import { handleErrors } from './errors';
import { handleInvalidCategorization } from './invalid';
import { testPipeline, combineProcessors } from '../../util';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: "I'll callback later.",
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
jest.mock('./errors');
jest.mock('./review');
jest.mock('./categorization');
jest.mock('./invalid');
jest.mock('../../util/pipeline', () => ({
testPipeline: jest.fn(),
}));
describe('runCategorizationGraph', () => {
const mockClient = {
asCurrentUser: {
ingest: {
simulate: jest.fn(),
},
},
} as unknown as IScopedClusterClient;
beforeEach(() => {
// Mocked responses for each node that requires an LLM API call/response.
const mockInvokeCategorization = jest
.fn()
.mockResolvedValue(categorizationInitialMockedResponse);
const mockInvokeError = jest.fn().mockResolvedValue(categorizationErrorMockedResponse);
const mockInvokeInvalid = jest.fn().mockResolvedValue(categorizationInvalidMockedResponse);
const mockInvokeReview = jest.fn().mockResolvedValue(categorizationReviewMockedResponse);
// We do not care about ES in these tests; the mock is only there to prevent errors.
// After the initial handler runs, the testPipeline mock returns the expected error, routing the graph to the error handler.
(handleCategorization as jest.Mock).mockImplementation(async () => ({
currentPipeline: categorizationInitialPipeline,
currentProcessors: await mockInvokeCategorization(),
reviewed: false,
finalized: false,
lastExecutedChain: 'categorization',
}));
// The error handler resolves it, though the response includes an invalid categorization
(handleErrors as jest.Mock).mockImplementation(async () => ({
currentPipeline: categorizationInitialPipeline,
currentProcessors: await mockInvokeError(),
reviewed: false,
finalized: false,
lastExecutedChain: 'error',
}));
// Invalid categorization is resolved and returned correctly, which routes it to a review
(handleInvalidCategorization as jest.Mock).mockImplementation(async () => ({
currentPipeline: categorizationInitialPipeline,
currentProcessors: await mockInvokeInvalid(),
reviewed: false,
finalized: false,
lastExecutedChain: 'invalidCategorization',
}));
// After the review it should route to modelOutput and finish.
(handleReview as jest.Mock).mockImplementation(async () => {
const currentProcessors = await mockInvokeReview();
const currentPipeline = combineProcessors(categorizationInitialPipeline, currentProcessors);
return {
currentProcessors,
currentPipeline,
reviewed: true,
finalized: false,
lastExecutedChain: 'review',
};
});
});
it('Ensures that the graph compiles', async () => {
try {
await getCategorizationGraph(mockClient, mockLlm);
} catch (error) {
fail(`getCategorizationGraph threw an error: ${error}`);
}
});
it('Runs the whole graph, with mocked outputs from the LLM.', async () => {
const categorizationGraph = await getCategorizationGraph(mockClient, mockLlm);
(testPipeline as jest.Mock)
.mockResolvedValueOnce(testPipelineValidResult)
.mockResolvedValueOnce(testPipelineError)
.mockResolvedValueOnce(testPipelineInvalidEcs)
.mockResolvedValueOnce(testPipelineValidResult)
.mockResolvedValueOnce(testPipelineValidResult)
.mockResolvedValueOnce(testPipelineValidResult);
const response = await categorizationGraph.invoke(mockedRequestWithPipeline);
expect(response.results).toStrictEqual(categorizationExpectedResults);
// Check if the functions were called
expect(handleCategorization).toHaveBeenCalled();
expect(handleErrors).toHaveBeenCalled();
expect(handleInvalidCategorization).toHaveBeenCalled();
expect(handleReview).toHaveBeenCalled();
});
});

@@ -0,0 +1,193 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type { IScopedClusterClient } from '@kbn/core-elasticsearch-server';
import type { StateGraphArgs } from '@langchain/langgraph';
import { StateGraph, END, START } from '@langchain/langgraph';
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import type { CategorizationState } from '../../types';
import { modifySamples, formatSamples } from '../../util/samples';
import { handleCategorization } from './categorization';
import { handleValidatePipeline } from '../../util/graph';
import { handleCategorizationValidation } from './validate';
import { handleInvalidCategorization } from './invalid';
import { handleErrors } from './errors';
import { handleReview } from './review';
import { CATEGORIZATION_EXAMPLE_ANSWER, ECS_CATEGORIES, ECS_TYPES } from './constants';
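// Note: every channel below uses a "latest update wins" reducer: a node's returned
// value (y) replaces the previous state (x), and the old value is kept when a node
// does not return the field.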
const graphState: StateGraphArgs<CategorizationState>['channels'] = {
lastExecutedChain: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
rawSamples: {
value: (x: string[], y?: string[]) => y ?? x,
default: () => [],
},
samples: {
value: (x: string[], y?: string[]) => y ?? x,
default: () => [],
},
formattedSamples: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
ecsTypes: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
ecsCategories: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
exAnswer: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
packageName: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
dataStreamName: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
finalized: {
value: (x: boolean, y?: boolean) => y ?? x,
default: () => false,
},
reviewed: {
value: (x: boolean, y?: boolean) => y ?? x,
default: () => false,
},
errors: {
value: (x: object, y?: object) => y ?? x,
default: () => ({}),
},
pipelineResults: {
value: (x: object[], y?: object[]) => y ?? x,
default: () => [{}],
},
currentPipeline: {
value: (x: object, y?: object) => y ?? x,
default: () => ({}),
},
currentProcessors: {
value: (x: object[], y?: object[]) => y ?? x,
default: () => [],
},
invalidCategorization: {
value: (x: object, y?: object) => y ?? x,
default: () => ({}),
},
initialPipeline: {
value: (x: object, y?: object) => y ?? x,
default: () => ({}),
},
results: {
value: (x: object, y?: object) => y ?? x,
default: () => ({}),
},
};
function modelInput(state: CategorizationState): Partial<CategorizationState> {
const samples = modifySamples(state);
const formattedSamples = formatSamples(samples);
const initialPipeline = JSON.parse(JSON.stringify(state.currentPipeline));
return {
exAnswer: JSON.stringify(CATEGORIZATION_EXAMPLE_ANSWER, null, 2),
ecsCategories: JSON.stringify(ECS_CATEGORIES, null, 2),
ecsTypes: JSON.stringify(ECS_TYPES, null, 2),
samples,
formattedSamples,
initialPipeline,
finalized: false,
reviewed: false,
lastExecutedChain: 'modelInput',
};
}
function modelOutput(state: CategorizationState): Partial<CategorizationState> {
return {
finalized: true,
lastExecutedChain: 'modelOutput',
results: {
docs: state.pipelineResults,
pipeline: state.currentPipeline,
},
};
}
function validationRouter(state: CategorizationState): string {
if (Object.keys(state.currentProcessors).length === 0) {
return 'categorization';
}
return 'validateCategorization';
}
function chainRouter(state: CategorizationState): string {
if (Object.keys(state.errors).length > 0) {
return 'errors';
}
if (Object.keys(state.invalidCategorization).length > 0) {
return 'invalidCategorization';
}
if (!state.reviewed) {
return 'review';
}
if (!state.finalized) {
return 'modelOutput';
}
return END;
}
export async function getCategorizationGraph(
client: IScopedClusterClient,
model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
) {
const workflow = new StateGraph({
channels: graphState,
})
.addNode('modelInput', modelInput)
.addNode('modelOutput', modelOutput)
.addNode('handleCategorization', (state: CategorizationState) =>
handleCategorization(state, model)
)
.addNode('handleValidatePipeline', (state: CategorizationState) =>
handleValidatePipeline(state, client)
)
.addNode('handleCategorizationValidation', handleCategorizationValidation)
.addNode('handleInvalidCategorization', (state: CategorizationState) =>
handleInvalidCategorization(state, model)
)
.addNode('handleErrors', (state: CategorizationState) => handleErrors(state, model))
.addNode('handleReview', (state: CategorizationState) => handleReview(state, model))
.addEdge(START, 'modelInput')
.addEdge('modelOutput', END)
.addEdge('modelInput', 'handleValidatePipeline')
.addEdge('handleCategorization', 'handleValidatePipeline')
.addEdge('handleInvalidCategorization', 'handleValidatePipeline')
.addEdge('handleErrors', 'handleValidatePipeline')
.addEdge('handleReview', 'handleValidatePipeline')
.addConditionalEdges('handleValidatePipeline', validationRouter, {
categorization: 'handleCategorization',
validateCategorization: 'handleCategorizationValidation',
})
.addConditionalEdges('handleCategorizationValidation', chainRouter, {
modelOutput: 'modelOutput',
errors: 'handleErrors',
invalidCategorization: 'handleInvalidCategorization',
review: 'handleReview',
});
const compiledCategorizationGraph = workflow.compile();
return compiledCategorizationGraph;
}
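
For reference, a minimal sketch of how a caller might drive the compiled graph; the client and model would come from the request context and GenAI connector, and the package, data stream, and sample values are placeholders:

```ts
import type { IScopedClusterClient } from '@kbn/core-elasticsearch-server';
import type { ActionsClientSimpleChatModel } from '@kbn/langchain/server/language_models';
import { getCategorizationGraph } from './graph';

// Hypothetical driver; input keys follow the graphState channels above.
export async function runCategorization(
  client: IScopedClusterClient,
  model: ActionsClientSimpleChatModel,
  rawSamples: string[],
  currentPipeline: object
) {
  const graph = await getCategorizationGraph(client, model);
  const output = await graph.invoke({
    packageName: 'mypackage', // placeholder
    dataStreamName: 'mydatastream', // placeholder
    rawSamples,
    currentPipeline, // pipeline produced by the earlier ECS mapping step
  });
  return output.results; // { docs, pipeline } once modelOutput has run
}
```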

@@ -0,0 +1,7 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
export { getCategorizationGraph } from './graph';

@@ -0,0 +1,35 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { FakeLLM } from '@langchain/core/utils/testing';
import { handleInvalidCategorization } from './invalid';
import type { CategorizationState } from '../../types';
import {
categorizationTestState,
categorizationMockProcessors,
categorizationExpectedHandlerResponse,
} from '../../../__jest__/fixtures/categorization';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: JSON.stringify(categorizationMockProcessors, null, 2),
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
const testState: CategorizationState = categorizationTestState;
describe('Testing categorization handler', () => {
it('handleInvalidCategorization()', async () => {
const response = await handleInvalidCategorization(testState, mockLlm);
expect(response.currentPipeline).toStrictEqual(
categorizationExpectedHandlerResponse.currentPipeline
);
expect(response.lastExecutedChain).toBe('invalidCategorization');
});
});

@@ -0,0 +1,42 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { JsonOutputParser } from '@langchain/core/output_parsers';
import type { ESProcessorItem, Pipeline } from '../../../common';
import type { CategorizationState } from '../../types';
import { combineProcessors } from '../../util/processors';
import { ECS_EVENT_TYPES_PER_CATEGORY } from './constants';
import { CATEGORIZATION_VALIDATION_PROMPT } from './prompts';
export async function handleInvalidCategorization(
state: CategorizationState,
model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
) {
const categorizationInvalidPrompt = CATEGORIZATION_VALIDATION_PROMPT;
const outputParser = new JsonOutputParser();
const categorizationInvalidGraph = categorizationInvalidPrompt.pipe(model).pipe(outputParser);
const currentProcessors = (await categorizationInvalidGraph.invoke({
current_processors: JSON.stringify(state.currentProcessors, null, 2),
invalid_categorization: JSON.stringify(state.invalidCategorization, null, 2),
ex_answer: state.exAnswer,
compatible_types: JSON.stringify(ECS_EVENT_TYPES_PER_CATEGORY, null, 2),
})) as ESProcessorItem[];
const currentPipeline = combineProcessors(state.initialPipeline as Pipeline, currentProcessors);
return {
currentPipeline,
currentProcessors,
reviewed: false,
lastExecutedChain: 'invalidCategorization',
};
}

@@ -0,0 +1,204 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { ChatPromptTemplate } from '@langchain/core/prompts';
export const CATEGORIZATION_MAIN_PROMPT = ChatPromptTemplate.fromMessages([
[
'system',
`You are a helpful, expert assistant on Elasticsearch Ingest Pipelines, focusing on providing append processors that can be used to enrich samples with all relevant event.type and event.category values.
Here is some context for you to reference for your task, read it carefully as you will get questions about it later:
<context>
<ecs>
Event Category (event.category):
Purpose: It is the second level in the ECS category hierarchy, representing the primary category or "big bucket" for event classification.
Type: It's a keyword type and can have multiple values (list).
Relationship: Works alongside event.type, which acts as a subcategory.
Allowed categories and their descriptions:
{ecs_categories}
Event Type (event.type):
Purpose: It is the third level in the ECS category hierarchy, represents a categorization "sub-bucket".
Type: It's a keyword type and can have multiple values (list).
Relationship: Works alongside event.category, which acts as the parent category.
Allowed types and their descriptions:
{ecs_types}
</ecs>
</context>`,
],
[
'human',
`Please help me by providing all relevant append processors for any detected event.category and event.type combinations that would fit the below pipeline results as an array of JSON objects.
<pipeline_results>
{pipeline_results}
</pipeline_results>
Go through each of the pipeline results above step by step and do the following to add all relevant event.type and event.category combinations.
1. Try to understand what is unique about each pipeline result, what sort of event.category and event.type combinations fit best, and whether there are any unique values for each result.
2. For each combination of event.category and event.type that you find, add a new append processor to your array of JSON objects.
3. If only certain results are relevant to the event.category and event.type combination, add an if condition similar to the above example processors, that describes what value or field needs to be available for this categorization to take place. The if condition should be inside the processor object.
4. Always check if the combination of event.category and event.type is common in the ecs context above.
5. Always make sure the value for event.category and event.type is strictly from the allowed categories and allowed types in the ecs context above.
6. The value argument for the append processor is an array of one or more types and categories.
You ALWAYS follow these guidelines when writing your response:
<guidelines>
- You can add as many append processors as you need to cover all the unique combinations that you detected.
- If conditions should always use a ? character when accessing nested fields, in case the field might not always be available, see example processors above.
- When an if condition is not needed the argument should not be used for the processor object.
- When using a range based if condition like > 0, you first need to check that the field is not null, for example: ctx.somefield?.production != null && ctx.somefield?.production > 0
- Do not respond with anything except the array of processors as valid JSON objects enclosed with 3 backticks (\`), see example response below.
</guidelines>
Example response format:
<example>
A: Please find the Categorization processors below:
\`\`\`json
{ex_answer}
\`\`\`
</example>`,
],
['ai', 'Please find the Categorization processors below:'],
]);
export const CATEGORIZATION_REVIEW_PROMPT = ChatPromptTemplate.fromMessages([
[
'system',
`You are a helpful, expert assistant on Elasticsearch Ingest Pipelines, focusing on adding improvements to the provided array of processors and reviewing the current results.
Here is some context that you can reference for your task, read it carefully as you will get questions about it later:
<context>
<current_processors>
{current_processors}
</current_processors>
<compatibility_matrix>
{compatibility_matrix}
</compatibility_matrix>
</context>`,
],
[
'human',
`Testing my current pipeline returned the following results:
<pipeline_results>
{pipeline_results}
</pipeline_results>
Please review the pipeline results and the array of current processors, making sure to identify all the possible event.type and event.category combinations that would match each pipeline result document. If any event.type or event.category is missing from any of the pipeline results, add them by updating the array of current processors and return the whole updated array of processors.
For each pipeline result you review step by step, remember the below steps:
1. Check if each of the pipeline results has at least one event.category and event.type added to it. If not, try to correlate the results with the current processors and see whether a new append processor with a matching if condition should be added to the list, or whether any of the if conditions should be modified because they do not match what is in the results.
2. If the results have at least one event.category and event.type value, see if more of them could match; if so, they can be added to the relevant append processor that added the initial values.
3. When adding more values to event.type and event.category, keep in mind the compatibility_matrix in the context to make sure only compatible event.type and event.category pairs are created.
4. Ensure that all append processors have allow_duplicates: false, as seen in the example response.
You ALWAYS follow these guidelines when writing your response:
<guidelines>
- You can use as many append processors as you need to add all relevant ECS categories and types combinations.
- If conditions should always use a ? character when accessing nested fields, in case the field might not always be available, see example processors above.
- When an if condition is not needed the argument should not be used for the processor object.
- If no updates are needed, respond with the initially provided current processors.
- Each append processor needs to have the allow_duplicates: false argument, as shown in the example response below.
- Do not respond with anything except the updated array of processors as a valid JSON object enclosed with 3 backticks (\`), see example response below.
</guidelines>
Example response format:
<example>
A: Please find the updated ECS categorization append processors below:
\`\`\`
{ex_answer}
\`\`\`
</example>`,
],
['ai', 'Please find the updated ECS categorization append processors below:'],
]);
export const CATEGORIZATION_VALIDATION_PROMPT = ChatPromptTemplate.fromMessages([
[
'system',
`You are a helpful, expert assistant on Elasticsearch Ingest Pipelines, focusing on resolving errors and issues with append processors used for categorization.
Here is some context that you can reference for your task, read it carefully as you will get questions about it later:
<context>
<current_processors>
{current_processors}
</current_processors>
<compatible_types>
{compatible_types}
</compatible_types>
<errors>
{invalid_categorization}
</errors>
</context>`,
],
[
'human',
`Please go through each error above, carefully review the provided current processors, and resolve the most likely cause of the supplied error by returning an updated version of the current_processors.
Follow these steps to help resolve the current ingest pipeline issues:
1. Try to fix all related errors before responding.
2. Apply all fixes to the provided array of current append processors.
3. If you do not know how to fix an error, then continue to the next and return the complete updated array of current append processors.
You ALWAYS follow these guidelines when writing your response:
<guidelines>
- If the error complains about having event.type or event.category not in the allowed values, fix the corresponding append processors to use the allowed values mentioned in the error.
- If the error is about event.type not being compatible with any event.category, please refer to the 'compatible_types' in the context to fix the corresponding append processors to use a valid combination of event.type and event.category.
- Do not respond with anything except the complete updated array of processors as a valid JSON object enclosed with 3 backticks (\`), see example response below.
</guidelines>
Example response format:
<example>
A: Please find the updated ECS categorization append processors below:
\`\`\`json
{ex_answer}
\`\`\`
</example>`,
],
['ai', 'Please find the updated ECS categorization append processors below:'],
]);
export const CATEGORIZATION_ERROR_PROMPT = ChatPromptTemplate.fromMessages([
[
'system',
`You are a helpful, expert assistant on Elasticsearch Ingest Pipelines, focusing on resolving errors and issues with append processors used for categorization.
Here is some context that you can reference for your task, read it carefully as you will get questions about it later:
<context>
<current_processors>
{current_processors}
</current_processors>
<errors>
{errors}
</errors>
</context>`,
],
[
'human',
`Please go through each error above, carefully review the provided current processors, and resolve the most likely cause of the supplied error by returning an updated version of the current_processors.
Follow these steps to help resolve the current ingest pipeline issues:
1. Try to fix all related errors before responding.
2. Apply all fixes to the provided array of current append processors.
3. If you do not know how to fix an error, then continue to the next and return the complete updated array of current append processors.
You ALWAYS follow these guidelines when writing your response:
<guidelines>
- When checking for the existence of multiple values in a single variable, use this format: "if": "['value1', 'value2'].contains(ctx.{package_name}?.{data_stream_name}?.field)"
- If conditions should never be in a format like "if": "true". If one exists in the current array of append processors, remove only the redundant if condition.
- If the error complains that it is a null pointer exception, always ensure the if conditions use a ? when accessing nested fields. For example ctx.field1?.nestedfield1?.nestedfield2.
- If the error complains about having values not in the list of allowed values, fix the corresponding append processors to use the allowed values as mentioned in the error.
- Do not respond with anything except the complete updated array of processors as a valid JSON object enclosed with 3 backticks (\`), see example response below.
</guidelines>
Example response format:
<example>
A: Please find the updated ECS categorization append processors below:
\`\`\`json
{ex_answer}
\`\`\`
</example>`,
],
['ai', 'Please find the updated ECS categorization append processors below:'],
]);

@@ -0,0 +1,35 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { FakeLLM } from '@langchain/core/utils/testing';
import { handleReview } from './review';
import type { CategorizationState } from '../../types';
import {
categorizationTestState,
categorizationMockProcessors,
categorizationExpectedHandlerResponse,
} from '../../../__jest__/fixtures/categorization';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: JSON.stringify(categorizationMockProcessors, null, 2),
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
const testState: CategorizationState = categorizationTestState;
describe('Testing categorization handler', () => {
it('handleReview()', async () => {
const response = await handleReview(testState, mockLlm);
expect(response.currentPipeline).toStrictEqual(
categorizationExpectedHandlerResponse.currentPipeline
);
expect(response.lastExecutedChain).toBe('review');
});
});

@@ -0,0 +1,43 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { JsonOutputParser } from '@langchain/core/output_parsers';
import { CATEGORIZATION_REVIEW_PROMPT } from './prompts';
import type { ESProcessorItem, Pipeline } from '../../../common';
import type { CategorizationState } from '../../types';
import { combineProcessors } from '../../util/processors';
import { ECS_EVENT_TYPES_PER_CATEGORY } from './constants';
export async function handleReview(
state: CategorizationState,
model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
) {
const categorizationReviewPrompt = CATEGORIZATION_REVIEW_PROMPT;
const outputParser = new JsonOutputParser();
const categorizationReview = categorizationReviewPrompt.pipe(model).pipe(outputParser);
const currentProcessors = (await categorizationReview.invoke({
current_processors: JSON.stringify(state.currentProcessors, null, 2),
pipeline_results: JSON.stringify(state.pipelineResults, null, 2),
ex_answer: state?.exAnswer,
package_name: state?.packageName,
compatibility_matrix: JSON.stringify(ECS_EVENT_TYPES_PER_CATEGORY, null, 2),
})) as ESProcessorItem[];
const currentPipeline = combineProcessors(state.initialPipeline as Pipeline, currentProcessors);
return {
currentPipeline,
currentProcessors,
reviewed: true,
lastExecutedChain: 'review',
};
}

@@ -0,0 +1,128 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type { CategorizationState } from '../../types';
import { ECS_EVENT_TYPES_PER_CATEGORY, EVENT_CATEGORIES, EVENT_TYPES } from './constants';
import type { EventCategories } from './constants';
interface Event {
type?: string[];
category?: string[];
}
interface PipelineResult {
event?: Event;
}
interface CategorizationError {
error: string;
}
export function handleCategorizationValidation(state: CategorizationState): {
invalidCategorization: CategorizationError[];
lastExecutedChain: string;
} {
const errors: CategorizationError[] = [];
const pipelineResults = state.pipelineResults as PipelineResult[];
// Loops through the pipeline results to find invalid categories and types
for (const doc of pipelineResults) {
let types: string[] = [];
let categories: string[] = [];
if (doc?.event?.type) {
types = doc.event.type;
}
if (doc?.event?.category) {
categories = doc.event.category;
}
const invalidCategories = findInvalidCategories(categories);
const invalidTypes = findInvalidTypes(types);
if (invalidCategories.length > 0) {
errors.push(createErrorMessage('event.category', invalidCategories, EVENT_CATEGORIES));
}
if (invalidTypes.length > 0) {
errors.push(createErrorMessage('event.type', invalidTypes, EVENT_TYPES));
}
// Compatibility check is done only on valid categories and types
const validCategories = categories.filter((x) => !invalidCategories.includes(x));
const validTypes = types.filter((x) => !invalidTypes.includes(x));
const compatibleErrors = getTypeCategoryIncompatibleError(validCategories, validTypes);
for (const ce of compatibleErrors) {
errors.push(ce);
}
}
return {
invalidCategorization: errors,
lastExecutedChain: 'handleCategorizationValidation',
};
}
function createErrorMessage(
field: string,
errorList: string[],
allowedValues: string[]
): CategorizationError {
return {
error: `field ${field}'s values (${errorList.join(
', '
)}) is not one of the allowed values (${allowedValues.join(', ')})`,
};
}
function findInvalidCategories(categories: string[]): string[] {
const invalidCategories: string[] = [];
for (const c of categories) {
if (!EVENT_CATEGORIES.includes(c)) {
invalidCategories.push(c);
}
}
return invalidCategories;
}
function findInvalidTypes(types: string[]): string[] {
const invalidTypes: string[] = [];
for (const t of types) {
if (!EVENT_TYPES.includes(t)) {
invalidTypes.push(t);
}
}
return invalidTypes;
}
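// Flags event.type values that are not listed under any of the document's valid
// event.category values in the ECS compatibility matrix.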
function getTypeCategoryIncompatibleError(
categories: string[],
types: string[]
): CategorizationError[] {
const errors: CategorizationError[] = [];
let unmatchedTypes = new Set(types);
const matchCategories = new Set(categories);
let categoryExists = false;
for (const c of matchCategories) {
if (c in ECS_EVENT_TYPES_PER_CATEGORY) {
categoryExists = true;
const matchTypes = new Set(ECS_EVENT_TYPES_PER_CATEGORY[c as EventCategories]);
unmatchedTypes = new Set([...unmatchedTypes].filter((x) => !matchTypes.has(x)));
}
}
if (categoryExists && unmatchedTypes.size > 0) {
errors.push({
error: `event.type (${[...unmatchedTypes].join(
', '
)}) not compatible with any of the event.category (${[...matchCategories].join(', ')})`,
});
}
return errors;
}
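
As a concrete illustration of the validation above, assuming (as in ECS) that the authentication category only allows types such as start, end, and info, an otherwise-valid but incompatible pair produces a single incompatibility error:

```ts
import type { CategorizationState } from '../../types';
import { handleCategorizationValidation } from './validate';

// Hypothetical input: both values are valid on their own, but not together.
const state = {
  pipelineResults: [{ event: { category: ['authentication'], type: ['connection'] } }],
} as unknown as CategorizationState;
const { invalidCategorization } = handleCategorizationValidation(state);
// invalidCategorization[0].error:
// "event.type (connection) not compatible with any of the event.category (authentication)"
```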

File diff suppressed because it is too large.

@@ -0,0 +1,29 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { FakeLLM } from '@langchain/core/utils/testing';
import { handleDuplicates } from './duplicates';
import type { EcsMappingState } from '../../types';
import { ecsTestState } from '../../../__jest__/fixtures/ecs_mapping';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: '{ "message": "ll callback later."}',
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
const testState: EcsMappingState = ecsTestState;
describe('Testing ecs handler', () => {
it('handleDuplicates()', async () => {
const response = await handleDuplicates(testState, mockLlm);
expect(response.currentMapping).toStrictEqual({ message: 'll callback later.' });
expect(response.lastExecutedChain).toBe('duplicateFields');
});
});

@@ -0,0 +1,31 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { JsonOutputParser } from '@langchain/core/output_parsers';
import type { EcsMappingState } from '../../types';
import { ECS_DUPLICATES_PROMPT } from './prompts';
export async function handleDuplicates(
state: EcsMappingState,
model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
) {
const ecsDuplicatesPrompt = ECS_DUPLICATES_PROMPT;
const outputParser = new JsonOutputParser();
const ecsDuplicatesGraph = ecsDuplicatesPrompt.pipe(model).pipe(outputParser);
const currentMapping = await ecsDuplicatesGraph.invoke({
ecs: state.ecs,
current_mapping: JSON.stringify(state.currentMapping, null, 2),
ex_answer: state.exAnswer,
duplicate_fields: state.duplicateFields,
});
return { currentMapping, lastExecutedChain: 'duplicateFields' };
}

@@ -0,0 +1,92 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { FakeLLM } from '@langchain/core/utils/testing';
import { getEcsGraph } from './graph';
import {
ecsInitialMappingMockedResponse,
ecsDuplicateMockedResponse,
ecsInvalidMappingMockedResponse,
ecsMissingKeysMockedResponse,
ecsMappingExpectedResults,
} from '../../../__jest__/fixtures/ecs_mapping';
import { mockedRequest } from '../../../__jest__/fixtures';
import { handleEcsMapping } from './mapping';
import { handleDuplicates } from './duplicates';
import { handleMissingKeys } from './missing';
import { handleInvalidEcs } from './invalid';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: "I'll callback later.",
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
jest.mock('./mapping');
jest.mock('./duplicates');
jest.mock('./missing');
jest.mock('./invalid');
describe('EcsGraph', () => {
describe('Compiling and Running', () => {
beforeEach(() => {
// Mocked responses for each node that requires an LLM API call/response.
const mockInvokeMapping = jest.fn().mockResolvedValue(ecsInitialMappingMockedResponse);
const mockInvokeDuplicates = jest.fn().mockResolvedValue(ecsDuplicateMockedResponse);
const mockInvokeMissingKeys = jest.fn().mockResolvedValue(ecsMissingKeysMockedResponse);
const mockInvokeInvalidEcs = jest.fn().mockResolvedValue(ecsInvalidMappingMockedResponse);
// Returns the initial response, with one duplicate field, to trigger the next step.
(handleEcsMapping as jest.Mock).mockImplementation(async () => ({
currentMapping: await mockInvokeMapping(),
lastExecutedChain: 'ecsMapping',
}));
// Returns the response with the duplicate field removed, but missing one to trigger the next step.
(handleDuplicates as jest.Mock).mockImplementation(async () => ({
currentMapping: await mockInvokeDuplicates(),
lastExecutedChain: 'duplicateFields',
}));
// Returns the response with the missing field added, but invalid ECS field to trigger the next step.
(handleMissingKeys as jest.Mock).mockImplementation(async () => ({
currentMapping: await mockInvokeMissingKeys(),
lastExecutedChain: 'missingKeys',
}));
// Returns the response with the invalid ECS field fixed, which finishes the chain.
(handleInvalidEcs as jest.Mock).mockImplementation(async () => ({
currentMapping: await mockInvokeInvalidEcs(),
lastExecutedChain: 'invalidEcs',
}));
});
it('Ensures that the graph compiles', async () => {
// When getEcsGraph runs, langgraph compiles the graph; compilation will throw if the graph has any issues.
// Common issues include, for example, a node that has no next step, or an infinite loop between nodes.
try {
await getEcsGraph(mockLlm);
} catch (error) {
fail(`getEcsGraph threw an error: ${error}`);
}
});
it('Runs the whole graph, with mocked outputs from the LLM.', async () => {
// The mocked outputs are specifically crafted to trigger ALL different conditions, allowing us to test the whole graph.
// This is why we have all the expects ensuring each function was called.
const ecsGraph = await getEcsGraph(mockLlm);
const response = await ecsGraph.invoke(mockedRequest);
expect(response.results).toStrictEqual(ecsMappingExpectedResults);
// Check if the functions were called
expect(handleEcsMapping).toHaveBeenCalled();
expect(handleDuplicates).toHaveBeenCalled();
expect(handleMissingKeys).toHaveBeenCalled();
expect(handleInvalidEcs).toHaveBeenCalled();
});
});
});

@@ -0,0 +1,174 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type { StateGraphArgs } from '@langchain/langgraph';
import { StateGraph, END, START } from '@langchain/langgraph';
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { ECS_EXAMPLE_ANSWER, ECS_FIELDS } from './constants';
import { modifySamples, mergeSamples } from '../../util/samples';
import { createPipeline } from './pipeline';
import { handleEcsMapping } from './mapping';
import { handleDuplicates } from './duplicates';
import { handleMissingKeys } from './missing';
import { handleInvalidEcs } from './invalid';
import { handleValidateMappings } from './validate';
import type { EcsMappingState } from '../../types';
const graphState: StateGraphArgs<EcsMappingState>['channels'] = {
ecs: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
lastExecutedChain: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
rawSamples: {
value: (x: string[], y?: string[]) => y ?? x,
default: () => [],
},
samples: {
value: (x: string[], y?: string[]) => y ?? x,
default: () => [],
},
formattedSamples: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
exAnswer: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
packageName: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
dataStreamName: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
finalized: {
value: (x: boolean, y?: boolean) => y ?? x,
default: () => false,
},
currentMapping: {
value: (x: object, y?: object) => y ?? x,
default: () => ({}),
},
currentPipeline: {
value: (x: object, y?: object) => y ?? x,
default: () => ({}),
},
duplicateFields: {
value: (x: string[], y?: string[]) => y ?? x,
default: () => [],
},
missingKeys: {
value: (x: string[], y?: string[]) => y ?? x,
default: () => [],
},
invalidEcsFields: {
value: (x: string[], y?: string[]) => y ?? x,
default: () => [],
},
results: {
value: (x: object, y?: object) => y ?? x,
default: () => ({}),
},
logFormat: {
value: (x: string, y?: string) => y ?? x,
default: () => 'json',
},
ecsVersion: {
value: (x: string, y?: string) => y ?? x,
default: () => '8.11.0',
},
};
function modelInput(state: EcsMappingState): Partial<EcsMappingState> {
const samples = modifySamples(state);
const formattedSamples = mergeSamples(samples);
return {
exAnswer: JSON.stringify(ECS_EXAMPLE_ANSWER, null, 2),
ecs: JSON.stringify(ECS_FIELDS, null, 2),
samples,
finalized: false,
formattedSamples,
lastExecutedChain: 'modelInput',
};
}
function modelOutput(state: EcsMappingState): Partial<EcsMappingState> {
const currentPipeline = createPipeline(state);
return {
finalized: true,
lastExecutedChain: 'modelOutput',
results: {
mapping: state.currentMapping,
pipeline: currentPipeline,
},
};
}
function inputRouter(state: EcsMappingState): string {
if (Object.keys(state.currentMapping).length === 0) {
return 'ecsMapping';
}
return 'modelOutput';
}
function chainRouter(state: EcsMappingState): string {
if (Object.keys(state.duplicateFields).length > 0) {
return 'duplicateFields';
}
if (Object.keys(state.missingKeys).length > 0) {
return 'missingKeys';
}
if (Object.keys(state.invalidEcsFields).length > 0) {
return 'invalidEcsFields';
}
if (!state.finalized) {
return 'modelOutput';
}
return END;
}
export async function getEcsGraph(model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel) {
const workflow = new StateGraph({
channels: graphState,
})
.addNode('modelInput', modelInput)
.addNode('modelOutput', modelOutput)
.addNode('handleEcsMapping', (state: EcsMappingState) => handleEcsMapping(state, model))
.addNode('handleValidation', handleValidateMappings)
.addNode('handleDuplicates', (state: EcsMappingState) => handleDuplicates(state, model))
.addNode('handleMissingKeys', (state: EcsMappingState) => handleMissingKeys(state, model))
.addNode('handleInvalidEcs', (state: EcsMappingState) => handleInvalidEcs(state, model))
.addEdge(START, 'modelInput')
.addEdge('modelOutput', END)
.addEdge('handleEcsMapping', 'handleValidation')
.addEdge('handleDuplicates', 'handleValidation')
.addEdge('handleMissingKeys', 'handleValidation')
.addEdge('handleInvalidEcs', 'handleValidation')
.addConditionalEdges('modelInput', inputRouter, {
ecsMapping: 'handleEcsMapping',
modelOutput: 'modelOutput',
})
.addConditionalEdges('handleValidation', chainRouter, {
duplicateFields: 'handleDuplicates',
missingKeys: 'handleMissingKeys',
invalidEcsFields: 'handleInvalidEcs',
modelOutput: 'modelOutput',
});
const compiledEcsGraph = workflow.compile();
return compiledEcsGraph;
}
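
A similar minimal sketch for the ECS graph; unlike the categorization graph it needs no Elasticsearch client, only the connector-backed model, and the sample values are again placeholders:

```ts
import type { ActionsClientSimpleChatModel } from '@kbn/langchain/server/language_models';
import { getEcsGraph } from './graph';

// Hypothetical driver; input keys follow the graphState channels above.
export async function runEcsMapping(model: ActionsClientSimpleChatModel, rawSamples: string[]) {
  const ecsGraph = await getEcsGraph(model);
  const { results } = await ecsGraph.invoke({
    packageName: 'mypackage', // placeholder
    dataStreamName: 'mydatastream', // placeholder
    rawSamples,
  });
  return results; // { mapping, pipeline } once modelOutput has run
}
```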

@@ -0,0 +1,7 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
export { getEcsGraph } from './graph';

@@ -0,0 +1,29 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { FakeLLM } from '@langchain/core/utils/testing';
import { handleInvalidEcs } from './invalid';
import type { EcsMappingState } from '../../types';
import { ecsTestState } from '../../../__jest__/fixtures/ecs_mapping';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: '{ "message": "ll callback later."}',
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
const testState: EcsMappingState = ecsTestState;
describe('Testing ecs handlers', () => {
it('handleInvalidEcs()', async () => {
const response = await handleInvalidEcs(testState, mockLlm);
expect(response.currentMapping).toStrictEqual({ message: 'll callback later.' });
expect(response.lastExecutedChain).toBe('invalidEcs');
});
});

@@ -0,0 +1,32 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { JsonOutputParser } from '@langchain/core/output_parsers';
import type { EcsMappingState } from '../../types';
import { ECS_INVALID_PROMPT } from './prompts';
export async function handleInvalidEcs(
state: EcsMappingState,
model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
) {
const ecsInvalidEcsPrompt = ECS_INVALID_PROMPT;
const outputParser = new JsonOutputParser();
const ecsInvalidEcsGraph = ecsInvalidEcsPrompt.pipe(model).pipe(outputParser);
const currentMapping = await ecsInvalidEcsGraph.invoke({
ecs: state.ecs,
current_mapping: JSON.stringify(state.currentMapping, null, 2),
ex_answer: state.exAnswer,
formatted_samples: state.formattedSamples,
invalid_ecs_fields: state.invalidEcsFields,
});
return { currentMapping, lastExecutedChain: 'invalidEcs' };
}

@@ -0,0 +1,29 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { FakeLLM } from '@langchain/core/utils/testing';
import { handleEcsMapping } from './mapping';
import type { EcsMappingState } from '../../types';
import { ecsTestState } from '../../../__jest__/fixtures/ecs_mapping';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: '{ "message": "ll callback later."}',
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
const testState: EcsMappingState = ecsTestState;
describe('Testing ecs handler', () => {
it('handleEcsMapping()', async () => {
const response = await handleEcsMapping(testState, mockLlm);
expect(response.currentMapping).toStrictEqual({ message: 'll callback later.' });
expect(response.lastExecutedChain).toBe('ecsMapping');
});
});

@@ -0,0 +1,32 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { JsonOutputParser } from '@langchain/core/output_parsers';
import type { EcsMappingState } from '../../types';
import { ECS_MAIN_PROMPT } from './prompts';
export async function handleEcsMapping(
state: EcsMappingState,
model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
) {
const ecsMainPrompt = ECS_MAIN_PROMPT;
const outputParser = new JsonOutputParser();
const ecsMainGraph = ecsMainPrompt.pipe(model).pipe(outputParser);
const currentMapping = await ecsMainGraph.invoke({
ecs: state.ecs,
formatted_samples: state.formattedSamples,
package_name: state.packageName,
data_stream_name: state.dataStreamName,
ex_answer: state.exAnswer,
});
return { currentMapping, lastExecutedChain: 'ecsMapping' };
}

@@ -0,0 +1,29 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { FakeLLM } from '@langchain/core/utils/testing';
import { handleMissingKeys } from './missing';
import type { EcsMappingState } from '../../types';
import { ecsTestState } from '../../../__jest__/fixtures/ecs_mapping';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: '{ "message": "ll callback later."}',
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
const testState: EcsMappingState = ecsTestState;
describe('Testing ecs handler', () => {
it('handleMissingKeys()', async () => {
const response = await handleMissingKeys(testState, mockLlm);
expect(response.currentMapping).toStrictEqual({ message: 'll callback later.' });
expect(response.lastExecutedChain).toBe('missingKeys');
});
});

@@ -0,0 +1,32 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { JsonOutputParser } from '@langchain/core/output_parsers';
import type { EcsMappingState } from '../../types';
import { ECS_MISSING_KEYS_PROMPT } from './prompts';
export async function handleMissingKeys(
state: EcsMappingState,
model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
) {
const ecsMissingPrompt = ECS_MISSING_KEYS_PROMPT;
const outputParser = new JsonOutputParser();
const ecsMissingGraph = ecsMissingPrompt.pipe(model).pipe(outputParser);
const currentMapping = await ecsMissingGraph.invoke({
ecs: state.ecs,
current_mapping: JSON.stringify(state.currentMapping, null, 2),
ex_answer: state.exAnswer,
formatted_samples: state.formattedSamples,
missing_keys: state?.missingKeys,
});
return { currentMapping, lastExecutedChain: 'missingKeys' };
}

@@ -0,0 +1,177 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
/* eslint-disable @typescript-eslint/no-explicit-any */
import { load } from 'js-yaml';
import { Environment, FileSystemLoader } from 'nunjucks';
import { join as joinPath } from 'path';
import type { EcsMappingState } from '../../types';
import { ECS_TYPES } from './constants';
interface IngestPipeline {
[key: string]: unknown;
}
interface ECSField {
target: string;
confidence: number;
date_formats: string[];
type: string;
}
function generateProcessor(
currentPath: string,
ecsField: ECSField,
expectedEcsType: string,
sampleValue: unknown
): object {
if (needsTypeConversion(sampleValue, expectedEcsType)) {
return {
convert: {
field: currentPath,
target_field: ecsField.target,
type: getConvertProcessorType(expectedEcsType),
ignore_missing: true,
},
};
}
if (ecsField.type === 'date') {
return {
date: {
field: currentPath,
target_field: ecsField.target,
formats: ecsField.date_formats,
if: currentPath.replace(/\./g, '?.'),
},
};
}
return {
rename: {
field: currentPath,
target_field: ecsField.target,
ignore_missing: true,
},
};
}
function getSampleValue(key: string, samples: Record<string, any>): unknown {
const keyList = key.split('.');
let value: any = samples;
for (const k of keyList) {
if (value === undefined || value === null) {
return null;
}
value = value[k];
}
return value;
}
function getEcsType(ecsField: ECSField, ecsTypes: Record<string, string>): string {
const ecsTarget = ecsField.target;
return ecsTypes[ecsTarget];
}
function getConvertProcessorType(expectedEcsType: string): string {
if (expectedEcsType === 'long') {
return 'long';
}
if (['scaled_float', 'float'].includes(expectedEcsType)) {
return 'float';
}
if (expectedEcsType === 'ip') {
return 'ip';
}
if (expectedEcsType === 'boolean') {
return 'boolean';
}
return 'string';
}
function needsTypeConversion(sample: unknown, expected: string): boolean {
if (sample === null || sample === undefined) {
return false;
}
if (expected === 'ip') {
return true;
}
if (expected === 'boolean' && typeof sample !== 'boolean') {
return true;
}
if (['long', 'float', 'scaled_float'].includes(expected) && typeof sample !== 'number') {
return true;
}
if (
['keyword', 'wildcard', 'match_only_text', 'constant_keyword'].includes(expected) &&
!(typeof sample === 'string' || Array.isArray(sample))
) {
return true;
}
// If types are anything but the above, we return false. Example types:
// "nested", "flattened", "object", "geopoint", "date"
return false;
}
function generateProcessors(ecsMapping: object, samples: object, basePath: string = ''): object[] {
const ecsTypes = ECS_TYPES;
const valueFieldKeys = new Set(['target', 'confidence', 'date_formats', 'type']);
const results: object[] = [];
for (const [key, value] of Object.entries(ecsMapping)) {
const currentPath = basePath ? `${basePath}.${key}` : key;
if (value !== null && typeof value === 'object' && value?.target !== null) {
const valueKeys = new Set(Object.keys(value));
if ([...valueFieldKeys].every((k) => valueKeys.has(k))) {
const processor = generateProcessor(
currentPath,
value as ECSField,
getEcsType(value as ECSField, ecsTypes),
getSampleValue(currentPath, samples)
);
results.push(processor);
} else {
results.push(...generateProcessors(value, samples, currentPath));
}
}
}
return results;
}
export function createPipeline(state: EcsMappingState): IngestPipeline {
const samples = JSON.parse(state.formattedSamples);
const processors = generateProcessors(state.currentMapping, samples);
// Retrieve all source field names from convert processors to populate single remove processor:
const fieldsToRemove = processors
.map((p: any) => p.convert?.field)
.filter((f: unknown) => f != null);
const mappedValues = {
processors,
ecs_version: state.ecsVersion,
package_name: state.packageName,
data_stream_name: state.dataStreamName,
log_format: state.logFormat,
fields_to_remove: fieldsToRemove,
};
const templatesPath = joinPath(__dirname, '../../templates');
const env = new Environment(new FileSystemLoader(templatesPath), {
autoescape: false,
});
env.addFilter('startswith', function (str, prefix) {
return str.startsWith(prefix);
});
const template = env.getTemplate('pipeline.yml.njk');
const renderedTemplate = template.render(mappedValues);
const ingestPipeline = load(renderedTemplate) as IngestPipeline;
return ingestPipeline;
}
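
To make the recursion in generateProcessors concrete, here is a hypothetical mapping fragment and the processors it would yield, assuming ECS_TYPES maps source.port to long and user.name to keyword:

```ts
// Hypothetical ECSField leaf nodes under the package/data-stream parents.
const currentMapping = {
  mypackage: {
    mydatastream: {
      src_port: { target: 'source.port', confidence: 0.95, date_formats: [], type: 'number' },
      user: { target: 'user.name', confidence: 0.9, date_formats: [], type: 'string' },
    },
  },
};
// Given a sample {"mypackage": {"mydatastream": {"src_port": "443", "user": "alice"}}}:
// "443" is a string while source.port expects a long, so a convert processor is emitted:
//   { convert: { field: 'mypackage.mydatastream.src_port', target_field: 'source.port', type: 'long', ignore_missing: true } }
// "alice" already matches the keyword type, so the field is simply renamed:
//   { rename: { field: 'mypackage.mydatastream.user', target_field: 'user.name', ignore_missing: true } }
```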

@@ -0,0 +1,182 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { ChatPromptTemplate } from '@langchain/core/prompts';
export const ECS_MAIN_PROMPT = ChatPromptTemplate.fromMessages([
[
'system',
`You are a helpful, expert assistant in Elastic Common Schema (ECS), focusing only on helping users with translating their provided combined samples to Elastic Common Schema (ECS).
Here is some context for you to reference for your task, read it carefully as you will get questions about it later:
<context>
<ecs>
{ecs}
</ecs>
<combined_sample>
{formatted_samples}
</combined_sample>
</context>`,
],
[
'human',
`Look at the combined sample from {package_name} {data_stream_name} provided above. The combined sample is a JSON object that includes all unique fields from the log samples sent by {package_name} {data_stream_name}.
Go through each value step by step and modify it with the following process:
1. Check if the name of each key and its current value matches the description and use case of any of the above ECS fields.
2. If one or more relevant ECS field is found, pick the one you are most confident about.
3. If no relevant ECS field is found, the value should just be replaced with "null" rather than a new object.
4. Only if a relevant ECS field is found replace the value with a new object that has the keys "target", "confidence", "date_format" and "type".
5. The object key "target" should be set to be the full path of the ECS field name you think it matches. Set the object key "type" to be either "string", "boolean", "number" or "date" depending on what was detected as the example value.
6. If the type "date" is used, then set date_format to be an array of one or more of the equivalent Java date formats that fit the example value. If the type is not date, then date_format should be set to an empty array [].
7. For each key that you set a target ECS field, also score the confidence you have in that the target field is correct, use a float between 0.0 and 1.0 and set the value in the nested "confidence" key.
8. When you want to use an ECS field as a value for a target, but another field already has the same ECS field as its target, try to find another fitting ECS field. If none is found then the one you are least confident about should have the object replaced with null.
9. If you are not confident for a specific field, you should always set the value to null.
10. These {package_name} log samples are based on source and destination type data; prioritize these over other related ECS fields like host.* and observer.*.
You ALWAYS follow these guidelines when writing your response:
<guidelines>
- Never use \`event.category\` or \`event.type\` as target ECS fields.
- The target key should never have a null value; if no matching target ECS field is found, the whole key value should be set to null.
- Never use the same ECS target multiple times. If no other field that you are confident in is found, it should always be null.
- All keys should be under the {package_name} {data_stream_name} parent fields, same as the original combined sample above.
- All target key values should be ECS field names only from the above ECS fields provided as context.
- All original keys from the combined sample object need to be in your response.
- Only when a target value is set should type, date_format and confidence be filled out. If there is no target value, the value should simply be null.
- Do not respond with anything except the ECS mapping JSON object enclosed with 3 backticks (\`), see example response below.
</guidelines>
Example response format:
<example_response>
A: Please find the JSON object below:
\`\`\`json
{ex_answer}
\`\`\`
</example_response>"`,
],
['ai', 'Please find the JSON object below:'],
]);
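// For illustration only (example values assumed, not taken from this file), the main
// prompt above asks each mapped source field to be replaced with an object shaped like:
// { "target": "source.ip", "confidence": 0.95, "type": "string", "date_format": [] }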
export const ECS_INVALID_PROMPT = ChatPromptTemplate.fromMessages([
[
'system',
`You are a helpful, expert assistant in Elastic Common Schema (ECS); you help review and resolve incorrect field mappings.
Here is some context for you to reference for your task, read it carefully as you will get questions about it later:
<context>
<ecs>
{ecs}
</ecs>
<formatted_samples>
{formatted_samples}
</formatted_samples>
<current_mapping>
{current_mapping}
</current_mapping>
</context>`,
],
[
'human',
`The following fields are mapped incorrectly in the current mapping; please help me resolve this:
<invalid_ecs_fields>
{invalid_ecs_fields}
</invalid_ecs_fields>
To resolve the invalid ECS fields, go through each key and value defined in the invalid fields, modify the current mapping step by step, and ensure it follows these guidelines:
<guidelines>
- Update the provided current mapping object; the value should be the corresponding Elastic Common Schema field name. If no good or valid match is found, the value should always be null.
- Do not respond with anything except the updated current mapping JSON object enclosed with 3 backticks (\`). See example response below.
</guidelines>
Example response format:
<example>
A: Please find the JSON object below:
\`\`\`json
{ex_answer}
\`\`\`
</example>`,
],
['ai', 'Please find the JSON object below:'],
]);
export const ECS_MISSING_KEYS_PROMPT = ChatPromptTemplate.fromMessages([
[
'system',
`You are a helpful, expert assistant in Elastic Common Schema (ECS); you help review and resolve missing fields in the current mapping.
Here is some context for you to reference for your task, read it carefully as you will get questions about it later:
<context>
<ecs>
{ecs}
</ecs>
<samples>
{formatted_samples}
</samples>
<current_mapping>
{current_mapping}
</current_mapping>
</context>`,
],
[
'human',
`The following keys are missing from the current mapping:
<missing_keys>
{missing_keys}
</missing_keys>
Help resolve the issue by adding the missing keys: look up example values from the formatted samples, go through each missing key step by step, and resolve it by following these guidelines:
<guidelines>
- Update the provided current mapping object with all the missing keys; the value should be the corresponding Elastic Common Schema field name. If no good match is found, the value should always be null.
- Do not respond with anything except the updated current mapping JSON object enclosed with 3 backticks (\`). See example response below.
</guidelines>
Example response format:
<example>
A: Please find the JSON object below:
\`\`\`json
{ex_answer}
\`\`\`
</example>`,
],
['ai', 'Please find the JSON object below:'],
]);
export const ECS_DUPLICATES_PROMPT = ChatPromptTemplate.fromMessages([
[
'system',
`You are a helpful, expert assistant in Elastic Common Schema (ECS); you help review and resolve incorrect duplicate fields in the current mapping.
Here is some context for you to reference for your task, read it carefully as you will get questions about it later:
<context>
<ecs>
{ecs}
</ecs>
<current_mapping>
{current_mapping}
</current_mapping>
</context>`,
],
[
'human',
`The following duplicate fields are mapped to the same ECS fields in the current mapping; please help me resolve this:
<duplicate_fields>
{duplicate_fields}
</duplicate_fields>
To resolve the duplicate mappings, go through each key and value defined in the duplicate fields, modify the current mapping step by step, and ensure they follow these guidelines:
<guidelines>
- Multiple keys should not have the same value (the ECS field it will be mapped to). If multiple keys do have the same value, always choose the best match for the ECS field and change the other duplicates' values to null.
- Do not respond with anything except the updated current mapping JSON object enclosed with 3 backticks (\`). See example response below.
</guidelines>
Example response format:
<example>
A: Please find the JSON object below:
\`\`\`json
{ex_answer}
\`\`\`
</example>`,
],
['ai', 'Please find the JSON object below:'],
]);

View file

@ -0,0 +1,156 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
/* eslint-disable @typescript-eslint/no-explicit-any */
import { ECS_FULL } from '../../../common/ecs';
import type { EcsMappingState } from '../../types';
const valueFieldKeys = new Set(['target', 'confidence', 'date_formats', 'type']);
type AnyObject = Record<string, any>;
function extractKeys(data: AnyObject, prefix: string = ''): Set<string> {
const keys = new Set<string>();
for (const [key, value] of Object.entries(data)) {
const fullKey = prefix ? `${prefix}.${key}` : key;
if (Array.isArray(value)) {
// Directly add the key for arrays without iterating over elements
keys.add(fullKey);
} else if (typeof value === 'object' && value !== null) {
const valueKeys = new Set(Object.keys(value));
if ([...valueFieldKeys].every((k) => valueKeys.has(k))) {
keys.add(fullKey);
} else {
// Recursively extract keys if the current value is a nested object
for (const nestedKey of extractKeys(value, fullKey)) {
keys.add(nestedKey);
}
}
} else {
// Add the key if the value is not an object or is null
keys.add(fullKey);
}
}
return keys;
}
function findMissingFields(formattedSamples: string, ecsMapping: AnyObject): string[] {
const combinedSamples = JSON.parse(formattedSamples);
const uniqueKeysFromSamples = extractKeys(combinedSamples);
const ecsResponseKeys = extractKeys(ecsMapping);
const missingKeys = [...uniqueKeysFromSamples].filter((key) => !ecsResponseKeys.has(key));
return missingKeys;
}
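// Illustration with assumed inputs (not from this change):
// extractKeys({ a: { b: 1, c: [2, 3] } }) returns Set { 'a.b', 'a.c' }, and
// findMissingFields('{"a":{"b":1,"c":[2]}}', { a: { b: null } }) returns ['a.c'].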
function processMapping(path: string[], value: any, output: Record<string, string[][]>): void {
if (typeof value === 'object' && value !== null) {
if (!Array.isArray(value)) {
// If the value is an object containing all of the per-field value keys (target, confidence, date_formats, type), the accumulated path is the full path of the source field.
const valueKeys = new Set(Object.keys(value));
if ([...valueFieldKeys].every((k) => valueKeys.has(k))) {
if (value.target !== null) {
if (!output[value.target]) {
output[value.target] = [];
}
output[value.target].push(path);
}
} else {
// Regular dictionary, continue traversing
for (const [k, v] of Object.entries(value)) {
processMapping([...path, k], v, output);
}
}
} else {
// If the value is an array, iterate through items and process them
for (const item of value) {
if (typeof item === 'object' && item !== null) {
processMapping(path, item, output);
}
}
}
} else if (value !== null) {
// Direct value, accumulate path
if (!output[value]) {
output[value] = [];
}
output[value].push(path);
}
}
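// Illustration with an assumed mapping (not from this change): given
// { event: { name: { target: 'event.action', confidence: 0.9, date_formats: [], type: 'string' } } },
// processMapping([], mapping, output) leaves output = { 'event.action': [['event', 'name']] }.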
function getValueFromPath(obj: AnyObject, path: string[]): unknown {
return path.reduce((acc, key) => (acc && acc[key] !== undefined ? acc[key] : null), obj);
}
function findDuplicateFields(samples: string[], ecsMapping: AnyObject): string[] {
const parsedSamples = samples.map((sample) => JSON.parse(sample));
const results: string[] = [];
const output: Record<string, string[][]> = {};
// Get all keys for each target ECS mapping field
processMapping([], ecsMapping, output);
// Filter out any ECS field that does not have multiple source fields mapped to it
const filteredOutput = Object.fromEntries(
Object.entries(output).filter(([_, paths]) => paths.length > 1 && _ !== null)
);
// For each entry, value is the ECS field and paths is the array of source field paths mapped to it
for (const [value, paths] of Object.entries(filteredOutput)) {
// For each log sample, check whether more than one of the mapped source fields exists in the same sample
for (const sample of parsedSamples) {
const foundPaths = paths.filter((path) => getValueFromPath(sample, path) !== null);
if (foundPaths.length > 1) {
const matchingFields = foundPaths.map((p) => p.join('.'));
results.push(
`One or more samples have matching fields for ECS field '${value}': ${matchingFields.join(
', '
)}`
);
break;
}
}
}
return results;
}
// Function to find invalid ECS fields
function findInvalidEcsFields(ecsMapping: AnyObject): string[] {
const results: string[] = [];
const output: Record<string, string[][]> = {};
const ecsDict = ECS_FULL;
processMapping([], ecsMapping, output);
const filteredOutput = Object.fromEntries(
Object.entries(output).filter(([key, _]) => key !== null)
);
for (const [ecsValue, paths] of Object.entries(filteredOutput)) {
if (!Object.prototype.hasOwnProperty.call(ecsDict, ecsValue)) {
const field = paths.map((p) => p.join('.'));
results.push(`Invalid ECS field mapping identified for ${ecsValue} : ${field.join(', ')}`);
}
}
return results;
}
export function handleValidateMappings(state: EcsMappingState): AnyObject {
const missingKeys = findMissingFields(state?.formattedSamples, state?.currentMapping);
const duplicateFields = findDuplicateFields(state?.samples, state?.currentMapping);
const invalidEcsFields = findInvalidEcsFields(state?.currentMapping);
return {
missingKeys,
duplicateFields,
invalidEcsFields,
lastExecutedChain: 'validateMappings',
};
}

View file

@ -0,0 +1,59 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
export const RELATED_ECS_FIELDS = {
'related.hash': {
type: 'keyword',
description: 'All the hashes seen in the docs',
note: 'this field should contain an array of values',
},
'related.hosts': {
type: 'keyword',
description: 'All hostnames or other host identifiers seen in the docs',
note: 'this field should contain an array of values',
},
'related.ip': {
type: 'keyword',
description: 'All of the IPs seen in the docs',
note: 'this field should contain an array of values',
},
'related.user': {
type: 'keyword',
description: 'All the user names or other user identifiers seen in the docs',
note: 'this field should contain an array of values',
},
};
export const RELATED_EXAMPLE_ANSWER = [
{
append: {
field: 'related.ip',
value: ['{{{source.ip}}}'],
allow_duplicates: 'false',
},
},
{
append: {
field: 'related.user',
value: ['{{{server.user.name}}}'],
allow_duplicates: 'false',
},
},
{
append: {
field: 'related.hosts',
value: ['{{{client.domain}}}'],
allow_duplicates: 'false',
},
},
{
append: {
field: 'related.hash',
value: ['{{{file.hash.sha1}}}'],
allow_duplicates: 'false',
},
},
];

View file

@ -0,0 +1,33 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { FakeLLM } from '@langchain/core/utils/testing';
import { handleErrors } from './errors';
import type { RelatedState } from '../../types';
import {
relatedTestState,
relatedMockProcessors,
relatedExpectedHandlerResponse,
} from '../../../__jest__/fixtures/related';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: JSON.stringify(relatedMockProcessors, null, 2),
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
const testState: RelatedState = relatedTestState;
describe('Testing related handler', () => {
it('handleErrors()', async () => {
const response = await handleErrors(testState, mockLlm);
expect(response.currentPipeline).toStrictEqual(relatedExpectedHandlerResponse.currentPipeline);
expect(response.lastExecutedChain).toBe('error');
});
});

View file

@ -0,0 +1,40 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { JsonOutputParser } from '@langchain/core/output_parsers';
import type { ESProcessorItem, Pipeline } from '../../../common';
import type { RelatedState } from '../../types';
import { combineProcessors } from '../../util/processors';
import { RELATED_ERROR_PROMPT } from './prompts';
export async function handleErrors(
state: RelatedState,
model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
) {
const relatedErrorPrompt = RELATED_ERROR_PROMPT;
const outputParser = new JsonOutputParser();
const relatedErrorGraph = relatedErrorPrompt.pipe(model).pipe(outputParser);
const currentProcessors = (await relatedErrorGraph.invoke({
current_processors: JSON.stringify(state.currentProcessors, null, 2),
ex_answer: state.exAnswer,
errors: JSON.stringify(state.errors, null, 2),
package_name: state.packageName,
data_stream_name: state.dataStreamName,
})) as ESProcessorItem[];
const currentPipeline = combineProcessors(state.initialPipeline as Pipeline, currentProcessors);
return {
currentPipeline,
currentProcessors,
reviewed: false,
lastExecutedChain: 'error',
};
}

View file

@ -0,0 +1,118 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type { IScopedClusterClient } from '@kbn/core/server';
import { FakeLLM } from '@langchain/core/utils/testing';
import { getRelatedGraph } from './graph';
import {
relatedExpectedResults,
relatedErrorMockedResponse,
relatedInitialMockedResponse,
relatedReviewMockedResponse,
relatedInitialPipeline,
testPipelineError,
testPipelineValidResult,
} from '../../../__jest__/fixtures/related';
import { mockedRequestWithPipeline } from '../../../__jest__/fixtures';
import { handleReview } from './review';
import { handleRelated } from './related';
import { handleErrors } from './errors';
import { testPipeline, combineProcessors } from '../../util';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: "I'll callback later.",
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
jest.mock('./errors');
jest.mock('./review');
jest.mock('./related');
jest.mock('../../util/pipeline', () => ({
testPipeline: jest.fn(),
}));
describe('runRelatedGraph', () => {
const mockClient = {
asCurrentUser: {
indices: {
getMapping: jest.fn(),
},
},
} as unknown as IScopedClusterClient;
beforeEach(() => {
// Mocked responses for each node that requires an LLM API call/response.
const mockInvokeRelated = jest.fn().mockResolvedValue(relatedInitialMockedResponse);
const mockInvokeError = jest.fn().mockResolvedValue(relatedErrorMockedResponse);
const mockInvokeReview = jest.fn().mockResolvedValue(relatedReviewMockedResponse);
// After this runs, the testPipeline mock returns the expected error, routing the graph to the error handler.
(handleRelated as jest.Mock).mockImplementation(async () => ({
currentPipeline: relatedInitialPipeline,
currentProcessors: await mockInvokeRelated(),
reviewed: false,
finalized: false,
lastExecutedChain: 'related',
}));
// Error pipeline returns the correct response to trigger a review.
(handleErrors as jest.Mock).mockImplementation(async () => ({
currentPipeline: relatedInitialPipeline,
currentProcessors: await mockInvokeError(),
reviewed: false,
finalized: false,
lastExecutedChain: 'error',
}));
// After the review it should route to modelOutput and finish.
(handleReview as jest.Mock).mockImplementation(async () => {
const currentProcessors = await mockInvokeReview();
const currentPipeline = combineProcessors(relatedInitialPipeline, currentProcessors);
return {
currentProcessors,
currentPipeline,
reviewed: true,
finalized: false,
lastExecutedChain: 'review',
};
});
});
it('Ensures that the graph compiles', async () => {
try {
await getRelatedGraph(mockClient, mockLlm);
} catch (error) {
// noop
}
});
it('Runs the whole graph, with mocked outputs from the LLM.', async () => {
const relatedGraph = await getRelatedGraph(mockClient, mockLlm);
(testPipeline as jest.Mock)
.mockResolvedValueOnce(testPipelineValidResult)
.mockResolvedValueOnce(testPipelineError)
.mockResolvedValueOnce(testPipelineValidResult)
.mockResolvedValueOnce(testPipelineValidResult)
.mockResolvedValueOnce(testPipelineValidResult);
let response;
try {
response = await relatedGraph.invoke(mockedRequestWithPipeline);
} catch (e) {
// noop
}
expect(response.results).toStrictEqual(relatedExpectedResults);
// Check if the functions were called
expect(handleRelated).toHaveBeenCalled();
expect(handleErrors).toHaveBeenCalled();
expect(handleReview).toHaveBeenCalled();
});
});

View file

@ -0,0 +1,171 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type { IScopedClusterClient } from '@kbn/core-elasticsearch-server';
import type { StateGraphArgs } from '@langchain/langgraph';
import { StateGraph, END, START } from '@langchain/langgraph';
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import type { RelatedState } from '../../types';
import { modifySamples, formatSamples } from '../../util/samples';
import { handleValidatePipeline } from '../../util/graph';
import { handleRelated } from './related';
import { handleErrors } from './errors';
import { handleReview } from './review';
import { RELATED_ECS_FIELDS, RELATED_EXAMPLE_ANSWER } from './constants';
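// Each channel keeps its previous value (x) unless a node returns an update (y).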
const graphState: StateGraphArgs<RelatedState>['channels'] = {
lastExecutedChain: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
rawSamples: {
value: (x: string[], y?: string[]) => y ?? x,
default: () => [],
},
samples: {
value: (x: string[], y?: string[]) => y ?? x,
default: () => [],
},
formattedSamples: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
ecs: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
exAnswer: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
packageName: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
dataStreamName: {
value: (x: string, y?: string) => y ?? x,
default: () => '',
},
finalized: {
value: (x: boolean, y?: boolean) => y ?? x,
default: () => false,
},
reviewed: {
value: (x: boolean, y?: boolean) => y ?? x,
default: () => false,
},
errors: {
value: (x: object, y?: object) => y ?? x,
default: () => ({}),
},
pipelineResults: {
value: (x: object[], y?: object[]) => y ?? x,
default: () => [],
},
currentPipeline: {
value: (x: object, y?: object) => y ?? x,
default: () => ({}),
},
currentProcessors: {
value: (x: object[], y?: object[]) => y ?? x,
default: () => [],
},
initialPipeline: {
value: (x: object, y?: object) => y ?? x,
default: () => ({}),
},
results: {
value: (x: object, y?: object) => y ?? x,
default: () => ({}),
},
};
function modelInput(state: RelatedState): Partial<RelatedState> {
const samples = modifySamples(state);
const formattedSamples = formatSamples(samples);
const initialPipeline = JSON.parse(JSON.stringify(state.currentPipeline));
return {
exAnswer: JSON.stringify(RELATED_EXAMPLE_ANSWER, null, 2),
ecs: JSON.stringify(RELATED_ECS_FIELDS, null, 2),
samples,
formattedSamples,
initialPipeline,
finalized: false,
reviewed: false,
lastExecutedChain: 'modelInput',
};
}
function modelOutput(state: RelatedState): Partial<RelatedState> {
return {
finalized: true,
lastExecutedChain: 'modelOutput',
results: {
docs: state.pipelineResults,
pipeline: state.currentPipeline,
},
};
}
function inputRouter(state: RelatedState): string {
if (Object.keys(state.pipelineResults).length === 0) {
return 'validatePipeline';
}
return 'related';
}
function chainRouter(state: RelatedState): string {
if (Object.keys(state.currentProcessors).length === 0) {
return 'related';
}
if (Object.keys(state.errors).length > 0) {
return 'errors';
}
if (!state.reviewed) {
return 'review';
}
if (!state.finalized) {
return 'modelOutput';
}
return END;
}
export async function getRelatedGraph(
client: IScopedClusterClient,
model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
) {
const workflow = new StateGraph({ channels: graphState })
.addNode('modelInput', modelInput)
.addNode('modelOutput', modelOutput)
.addNode('handleRelated', (state: RelatedState) => handleRelated(state, model))
.addNode('handleValidatePipeline', (state: RelatedState) =>
handleValidatePipeline(state, client)
)
.addNode('handleErrors', (state: RelatedState) => handleErrors(state, model))
.addNode('handleReview', (state: RelatedState) => handleReview(state, model))
.addEdge(START, 'modelInput')
.addEdge('modelOutput', END)
.addEdge('handleRelated', 'handleValidatePipeline')
.addEdge('handleErrors', 'handleValidatePipeline')
.addEdge('handleReview', 'handleValidatePipeline')
.addConditionalEdges('modelInput', inputRouter, {
related: 'handleRelated',
validatePipeline: 'handleValidatePipeline',
})
.addConditionalEdges('handleValidatePipeline', chainRouter, {
related: 'handleRelated',
errors: 'handleErrors',
review: 'handleReview',
modelOutput: 'modelOutput',
});
const compiledRelatedGraph = workflow.compile();
return compiledRelatedGraph;
}
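
A minimal usage sketch (the `client` and `model` values are assumptions; the invoke payload mirrors what the route handlers later in this diff pass in):

```ts
// client: IScopedClusterClient, model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
const relatedGraph = await getRelatedGraph(client, model);
const results = await relatedGraph.invoke({
  packageName: 'mypackage',
  dataStreamName: 'mydatastream',
  rawSamples: ['{"source":{"ip":"192.0.2.1"}}'],
  currentPipeline: { processors: [] },
});
```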

View file

@ -0,0 +1,7 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
export { getRelatedGraph } from './graph';

View file

@ -0,0 +1,142 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { ChatPromptTemplate } from '@langchain/core/prompts';
export const RELATED_MAIN_PROMPT = ChatPromptTemplate.fromMessages([
[
'system',
`You are a helpful, expert assistant on Elasticsearch Ingest Pipelines, focusing on providing append processors that can be used to enrich samples with all relevant related.ip, related.hash, related.user and related.host fields.
Here is some context for you to reference for your task, read it carefully as you will get questions about it later:
<context>
<ecs>
{ecs}
</ecs>
</context>`,
],
[
'human',
`Please help me by providing all relevant append processors for any detected related.ip, related.hash, related.user and related.host fields that would fit the below pipeline results as an array of JSON objects.
<pipeline_results>
{pipeline_results}
</pipeline_results>
Go through each of the pipeline results above step by step and do the following to add all relevant related.ip, related.hash, related.user and related.host fields.
1. Try to understand what is unique about each pipeline result, which related.ip, related.hash, related.user and related.host fields fit best, and whether there are any unique values for each result.
2. For each of the related.ip, related.hash, related.user and related.host fields that you find, add a new append processor to your array of JSON objects.
3. If only certain results are relevant to the related.ip, related.hash, related.user and related.host fields, add an if condition, similar to the above example processors, that describes which value or field needs to be available for this categorization to take place. The if condition should be inside the processor object.
4. Always check if the related.ip, related.hash, related.user and related.host fields are common in the ECS context above.
5. The value argument for the append processor shall consist of one field.
You ALWAYS follow these guidelines when writing your response:
<guidelines>
- You can add as many append processors as you need to cover all the fields that you detected.
- If conditions should always use a ? character when accessing nested fields, in case the field might not always be available; see the example processors above.
- When an if condition is not needed, the argument should not be used for the processor object.
- Do not respond with anything except the array of processors as valid JSON objects enclosed with 3 backticks (\`), see example response below.
</guidelines>
Example response format:
<example>
A: Please find the Related processors below:
\`\`\`json
{ex_answer}
\`\`\`
</example>`,
],
['ai', 'Please find the Related processors below:'],
]);
export const RELATED_ERROR_PROMPT = ChatPromptTemplate.fromMessages([
[
'system',
`You are a helpful, expert assistant on Elasticsearch Ingest Pipelines, focusing on resolving errors and issues with append processors used for related field categorization.
Here is some context that you can reference for your task, read it carefully as you will get questions about it later:
<context>
<current_processors>
{current_processors}
</current_processors>
<errors>
{errors}
</errors>
</context>`,
],
[
'human',
`Please go through each error above, carefully review the provided current processors, and resolve the most likely cause of the supplied error by returning an updated version of the current_processors.
Follow these steps to help resolve the current ingest pipeline issues:
1. Try to fix all related errors before responding.
2. Apply all fixes to the provided array of current append processors.
3. If you do not know how to fix an error, then continue to the next and return the complete updated array of current append processors.
You ALWAYS follow these guidelines when writing your response:
<guidelines>
- When checking for the existence of multiple values in a single variable, use this format: "if": "['value1', 'value2'].contains(ctx.{package_name}?.{data_stream_name}?.field)"
- If conditions should never be in a format like "if": "true". If one exists in the current array of append processors, remove only the redundant if condition.
- If the error complains that it is a null pointer exception, always ensure the if conditions use a ? when accessing nested fields. For example ctx.field1?.nestedfield1?.nestedfield2.
- Never use "split" in template values, only use the field name inside the triple brackets. If the error mentions "Improperly closed variable in query-template", then check each "value" field for any special characters and remove them.
- Do not respond with anything except the complete updated array of processors as a valid JSON object enclosed with 3 backticks (\`), see example response below.
</guidelines>
Example response format:
<example>
A: Please find the updated ECS related append processors below:
\`\`\`json
{ex_answer}
\`\`\`
</example>`,
],
['ai', 'Please find the updated ECS related append processors below:'],
]);
export const RELATED_REVIEW_PROMPT = ChatPromptTemplate.fromMessages([
[
'system',
`You are a helpful, expert assistant on Elasticsearch Ingest Pipelines, focusing on adding improvements to the provided array of processors and reviewing the current results.
Here is some context that you can reference for your task, read it carefully as you will get questions about it later:
<context>
<current_processors>
{current_processors}
</current_processors>
</context>`,
],
[
'human',
`Testing my current pipeline returned me with the below pipeline results:
<pipeline_results>
{pipeline_results}
</pipeline_results>
`Please review the pipeline results and the array of current processors, ensuring to identify all the related.ip, related.user, related.hash and related.host fields that would match each pipeline result document. If any related.ip, related.user, related.hash or related.host fields are missing from any of the pipeline results, add them by updating the array of current processors and return the whole updated array of processors.
For each pipeline result you review step by step, remember the below steps:
1. Check each of the pipeline results to see if the field/value matches related.ip, related.user, related.hash or related.host. If not, then try to correlate the results with the current processors and see if either a new append processor should be added to the list with a matching if condition, or if any of the if conditions should be modified because they do not match what is in the results.
2. If the results have a related.ip, related.user, related.hash or related.host value, see if more of them could match; if so, they could be added to the relevant append processor which added the initial values.
3. Ensure that all append processors have allow_duplicates: false, as seen in the example response.
You ALWAYS follow these guidelines when writing your response:
<guidelines>
- You can use as many append processors as you need to add all relevant ECS category and type combinations.
- If conditions should always use a ? character when accessing nested fields, in case the field might not always be available; see the example processors above.
- When an if condition is not needed, the argument should not be used for the processor object.
- If no updates are needed, respond with the initially provided current processors.
- Each append processor needs to have the allow_duplicates: false argument, as shown in the below example response.
- Do not respond with anything except the updated array of processors as a valid JSON object enclosed with 3 backticks (\`), see example response below.
</guidelines>
Example response format:
<example>
A: Please find the updated ECS related append processors below:
\`\`\`json
{ex_answer}
\`\`\`
</example>`,
],
['ai', 'Please find the updated ECS related append processors below:'],
]);

View file

@ -0,0 +1,33 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { FakeLLM } from '@langchain/core/utils/testing';
import { handleRelated } from './related';
import type { RelatedState } from '../../types';
import {
relatedTestState,
relatedMockProcessors,
relatedExpectedHandlerResponse,
} from '../../../__jest__/fixtures/related';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: JSON.stringify(relatedMockProcessors, null, 2),
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
const testState: RelatedState = relatedTestState;
describe('Testing related handler', () => {
it('handleRelated()', async () => {
const response = await handleRelated(testState, mockLlm);
expect(response.currentPipeline).toStrictEqual(relatedExpectedHandlerResponse.currentPipeline);
expect(response.lastExecutedChain).toBe('related');
});
});

View file

@ -0,0 +1,39 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { JsonOutputParser } from '@langchain/core/output_parsers';
import type { ESProcessorItem, Pipeline } from '../../../common';
import type { RelatedState } from '../../types';
import { combineProcessors } from '../../util/processors';
import { RELATED_MAIN_PROMPT } from './prompts';
export async function handleRelated(
state: RelatedState,
model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
) {
const relatedMainPrompt = RELATED_MAIN_PROMPT;
const outputParser = new JsonOutputParser();
const relatedMainGraph = relatedMainPrompt.pipe(model).pipe(outputParser);
const currentProcessors = (await relatedMainGraph.invoke({
pipeline_results: JSON.stringify(state.pipelineResults, null, 2),
ex_answer: state.exAnswer,
ecs: state.ecs,
})) as ESProcessorItem[];
const currentPipeline = combineProcessors(state.initialPipeline as Pipeline, currentProcessors);
return {
currentPipeline,
currentProcessors,
reviewed: false,
lastExecutedChain: 'related',
};
}

View file

@ -0,0 +1,33 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { FakeLLM } from '@langchain/core/utils/testing';
import { handleReview } from './review';
import type { RelatedState } from '../../types';
import {
relatedTestState,
relatedMockProcessors,
relatedExpectedHandlerResponse,
} from '../../../__jest__/fixtures/related';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
const mockLlm = new FakeLLM({
response: JSON.stringify(relatedMockProcessors, null, 2),
}) as unknown as ActionsClientChatOpenAI | ActionsClientSimpleChatModel;
const testState: RelatedState = relatedTestState;
describe('Testing related handler', () => {
it('handleReview()', async () => {
const response = await handleReview(testState, mockLlm);
expect(response.currentPipeline).toStrictEqual(relatedExpectedHandlerResponse.currentPipeline);
expect(response.lastExecutedChain).toBe('review');
});
});

View file

@ -0,0 +1,39 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { JsonOutputParser } from '@langchain/core/output_parsers';
import type { ESProcessorItem, Pipeline } from '../../../common';
import type { RelatedState } from '../../types';
import { combineProcessors } from '../../util/processors';
import { RELATED_REVIEW_PROMPT } from './prompts';
export async function handleReview(
state: RelatedState,
model: ActionsClientChatOpenAI | ActionsClientSimpleChatModel
) {
const relatedReviewPrompt = RELATED_REVIEW_PROMPT;
const outputParser = new JsonOutputParser();
const relatedReviewGraph = relatedReviewPrompt.pipe(model).pipe(outputParser);
const currentProcessors = (await relatedReviewGraph.invoke({
current_processors: JSON.stringify(state.currentProcessors, null, 2),
ex_answer: state.exAnswer,
pipeline_results: JSON.stringify(state.pipelineResults, null, 2),
})) as ESProcessorItem[];
const currentPipeline = combineProcessors(state.initialPipeline as Pipeline, currentProcessors);
return {
currentPipeline,
currentProcessors,
reviewed: true,
lastExecutedChain: 'review',
};
}

View file

@ -0,0 +1,17 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type { PluginInitializerContext } from '@kbn/core/server';
export { config } from './config';
export async function plugin(initializerContext: PluginInitializerContext) {
const { IntegrationAssistantPlugin } = await import('./plugin');
return new IntegrationAssistantPlugin(initializerContext);
}
export type { IntegrationAssistantPluginSetup, IntegrationAssistantPluginStart } from './types';

View file

@ -0,0 +1,33 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { join as joinPath } from 'path';
import type { InputTypes } from '../../common';
import { ensureDirSync, createSync, readSync } from '../util';
export function createAgentInput(specificDataStreamDir: string, inputTypes: InputTypes[]): void {
const agentDir = joinPath(specificDataStreamDir, 'agent', 'stream');
const agentTemplatesDir = joinPath(__dirname, '../templates/agent');
ensureDirSync(agentDir);
// Load common options that exist for all .yml.hbs files, to be merged with each specific input file
const commonFilePath = joinPath(agentTemplatesDir, 'common.yml.hbs');
const commonFile = readSync(commonFilePath);
for (const inputType of inputTypes) {
const inputTypeFilePath = joinPath(
agentTemplatesDir,
`${inputType.replaceAll('-', '_')}.yml.hbs`
);
const inputTypeFile = readSync(inputTypeFilePath);
const combinedContents = `${inputTypeFile}\n${commonFile}`;
const destinationFilePath = joinPath(agentDir, `${inputType}.yml.hbs`);
createSync(destinationFilePath, combinedContents);
}
}
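
A minimal usage sketch (the directory and input type values are assumptions for illustration):

```ts
// Writes agent/stream/<input_type>.yml.hbs for each input type by concatenating the
// input-specific template with common.yml.hbs.
createAgentInput('/tmp/my_integration-0.1.0/data_stream/events', [
  'filestream',
  'tcp',
] as InputTypes[]);
```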

View file

@ -0,0 +1,142 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import AdmZip from 'adm-zip';
import nunjucks from 'nunjucks';
import { tmpdir } from 'os';
import { join as joinPath } from 'path';
import type { DataStream, Integration } from '../../common';
import { copySync, createSync, ensureDirSync, generateUniqueId } from '../util';
import { createAgentInput } from './agent';
import { createDatastream } from './data_stream';
import { createFieldMapping } from './fields';
import { createPipeline } from './pipeline';
export async function buildPackage(integration: Integration): Promise<Buffer> {
const templateDir = joinPath(__dirname, '../templates');
const agentTemplates = joinPath(templateDir, 'agent');
const manifestTemplates = joinPath(templateDir, 'manifest');
const systemTestTemplates = joinPath(templateDir, 'system_tests');
nunjucks.configure([templateDir, agentTemplates, manifestTemplates, systemTestTemplates], {
autoescape: false,
});
const tmpDir = joinPath(tmpdir(), `integration-assistant-${generateUniqueId()}`);
const packageDir = createDirectories(tmpDir, integration);
const dataStreamsDir = joinPath(packageDir, 'data_stream');
for (const dataStream of integration.dataStreams) {
const dataStreamName = dataStream.name;
const specificDataStreamDir = joinPath(dataStreamsDir, dataStreamName);
createDatastream(integration.name, specificDataStreamDir, dataStream);
createAgentInput(specificDataStreamDir, dataStream.inputTypes);
createPipeline(specificDataStreamDir, dataStream.pipeline);
createFieldMapping(integration.name, dataStreamName, specificDataStreamDir, dataStream.docs);
}
const tmpPackageDir = joinPath(tmpDir, `${integration.name}-0.1.0`);
const zipBuffer = await createZipArchive(tmpPackageDir);
return zipBuffer;
}
function createDirectories(tmpDir: string, integration: Integration): string {
const packageDir = joinPath(tmpDir, `${integration.name}-0.1.0`);
ensureDirSync(tmpDir);
ensureDirSync(packageDir);
createPackage(packageDir, integration);
return packageDir;
}
function createPackage(packageDir: string, integration: Integration): void {
createReadme(packageDir, integration);
createChangelog(packageDir);
createBuildFile(packageDir);
createPackageManifest(packageDir, integration);
// Skipping creation of system tests temporarily for custom package generation
// createPackageSystemTests(packageDir, integration);
createLogo(packageDir, integration);
}
function createLogo(packageDir: string, integration: Integration): void {
const logoDir = joinPath(packageDir, 'img');
ensureDirSync(logoDir);
if (integration?.logo !== undefined) {
const buffer = Buffer.from(integration.logo, 'base64');
createSync(joinPath(logoDir, 'logo.svg'), buffer);
} else {
const imgTemplateDir = joinPath(__dirname, '../templates/img');
copySync(joinPath(imgTemplateDir, 'logo.svg'), joinPath(logoDir, 'logo.svg'));
}
}
function createBuildFile(packageDir: string): void {
const buildFile = nunjucks.render('build.yml.njk', { ecs_version: '8.11.0' });
const buildDir = joinPath(packageDir, '_dev/build');
ensureDirSync(buildDir);
createSync(joinPath(buildDir, 'build.yml'), buildFile);
}
function createChangelog(packageDir: string): void {
const changelogTemplate = nunjucks.render('changelog.yml.njk', {
initial_version: '0.1.0',
});
createSync(joinPath(packageDir, 'changelog.yml'), changelogTemplate);
}
function createReadme(packageDir: string, integration: Integration) {
const readmeDirPath = joinPath(packageDir, '_dev/build/docs/');
ensureDirSync(readmeDirPath);
const readmeTemplate = nunjucks.render('readme.md.njk', {
package_name: integration.name,
data_streams: integration.dataStreams,
});
createSync(joinPath(readmeDirPath, 'README.md'), readmeTemplate);
}
async function createZipArchive(tmpPackageDir: string): Promise<Buffer> {
const zip = new AdmZip();
zip.addLocalFolder(tmpPackageDir);
const buffer = zip.toBuffer();
return buffer;
}
function createPackageManifest(packageDir: string, integration: Integration): void {
const uniqueInputs: { [key: string]: { type: string; title: string; description: string } } = {};
integration.dataStreams.forEach((dataStream: DataStream) => {
dataStream.inputTypes.forEach((inputType: string) => {
if (!uniqueInputs[inputType]) {
uniqueInputs[inputType] = {
type: inputType,
title: dataStream.title,
description: dataStream.description,
};
}
});
});
const uniqueInputsList = Object.values(uniqueInputs);
const packageManifest = nunjucks.render('package_manifest.yml.njk', {
format_version: '3.1.4',
package_title: integration.title,
package_name: integration.name,
package_version: '0.1.0',
package_description: integration.description,
package_owner: '@elastic/custom-integrations',
min_version: '^8.13.0',
inputs: uniqueInputsList,
});
createSync(joinPath(packageDir, 'manifest.yml'), packageManifest);
}
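
A minimal usage sketch of the builder entry point; the `Integration` value below mirrors the route schema later in this diff, and all field values are assumptions:

```ts
const zipBuffer = await buildPackage({
  name: 'my_integration',
  title: 'My Integration',
  description: 'Generated by the Integration Assistant',
  dataStreams: [
    {
      name: 'events',
      title: 'Events',
      description: 'Example event stream',
      inputTypes: ['filestream'],
      rawSamples: ['{"message":"hello"}'],
      pipeline: { processors: [] },
      docs: [],
    },
  ],
});
// zipBuffer holds the generated <name>-0.1.0 package as a zip archive.
```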

View file

@ -0,0 +1,122 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import nunjucks from 'nunjucks';
import { join as joinPath } from 'path';
import type { DataStream } from '../../common';
import { copySync, createSync, ensureDirSync, listDirSync } from '../util';
export function createDatastream(
packageName: string,
specificDataStreamDir: string,
dataStream: DataStream
): void {
const dataStreamName = dataStream.name;
const pipelineDir = joinPath(specificDataStreamDir, 'elasticsearch', 'ingest_pipeline');
const title = dataStream.title;
const description = dataStream.description;
ensureDirSync(specificDataStreamDir);
createDataStreamFolders(specificDataStreamDir, pipelineDir);
createPipelineTests(specificDataStreamDir, dataStream.rawSamples, packageName, dataStreamName);
const dataStreams: string[] = [];
for (const inputType of dataStream.inputTypes) {
const mappedValues = {
data_stream_title: title,
data_stream_description: description,
package_name: packageName,
data_stream_name: dataStreamName,
};
const dataStreamManifest = nunjucks.render(
`${inputType.replaceAll('-', '_')}_manifest.yml.njk`,
mappedValues
);
const commonManifest = nunjucks.render('common_manifest.yml.njk', mappedValues);
const combinedManifest = `${dataStreamManifest}\n${commonManifest}`;
dataStreams.push(combinedManifest);
// We comment this out for now, as it's not really needed for custom integrations
/* createDataStreamSystemTests(
specificDataStreamDir,
inputType,
mappedValues,
packageName,
dataStreamName
);
*/
}
const finalManifest = nunjucks.render('data_stream.yml.njk', {
title,
data_streams: dataStreams,
});
createSync(joinPath(specificDataStreamDir, 'manifest.yml'), finalManifest);
}
function createDataStreamFolders(specificDataStreamDir: string, pipelineDir: string): void {
const dataStreamTemplatesDir = joinPath(__dirname, '../templates/data_stream');
const items = listDirSync(dataStreamTemplatesDir);
for (const item of items) {
const s = joinPath(dataStreamTemplatesDir, item);
const d = joinPath(specificDataStreamDir, item);
copySync(s, d);
}
ensureDirSync(pipelineDir);
}
function createPipelineTests(
specificDataStreamDir: string,
rawSamples: string[],
packageName: string,
dataStreamName: string
): void {
const pipelineTestTemplatesDir = joinPath(__dirname, '../templates/pipeline_tests');
const pipelineTestsDir = joinPath(specificDataStreamDir, '_dev/test/pipeline');
ensureDirSync(pipelineTestsDir);
const items = listDirSync(pipelineTestTemplatesDir);
for (const item of items) {
const s = joinPath(pipelineTestTemplatesDir, item);
const d = joinPath(pipelineTestsDir, item.replaceAll('_', '-'));
copySync(s, d);
}
const formattedPackageName = packageName.replace(/_/g, '-');
const formattedDataStreamName = dataStreamName.replace(/_/g, '-');
const testFileName = joinPath(
pipelineTestsDir,
`test-${formattedPackageName}-${formattedDataStreamName}.log`
);
createSync(testFileName, rawSamples.join('\n'));
}
// We are skipping this one for now, as it's not really needed for custom integrations
/* function createDataStreamSystemTests(
specificDataStreamDir: string,
inputType: string,
mappedValues: Record<string, string>,
packageName: string,
dataStreamName: string
): void {
const systemTestTemplatesDir = joinPath(__dirname, '../templates/system_tests');
nunjucks.configure({ autoescape: true });
const env = new nunjucks.Environment(new nunjucks.FileSystemLoader(systemTestTemplatesDir));
mappedValues.package_name = packageName.replace(/_/g, '-');
mappedValues.data_stream_name = dataStreamName.replace(/_/g, '-');
const systemTestFolder = joinPath(specificDataStreamDir, '_dev/test/system');
fs.mkdirSync(systemTestFolder, { recursive: true });
const systemTestTemplate = env.getTemplate(`test_${inputType.replaceAll('-', '_')}_config.yml.njk`);
const systemTestRendered = systemTestTemplate.render(mappedValues);
const systemTestFileName = joinPath(systemTestFolder, `test-${inputType}-config.yml`);
fs.writeFileSync(systemTestFileName, systemTestRendered, 'utf-8');
}*/

View file

@ -0,0 +1,53 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { join as joinPath } from 'path';
import nunjucks from 'nunjucks';
import type { Integration } from '../../common';
import { ensureDirSync, createSync } from '../util';
export function createPackageSystemTests(integrationDir: string, integration: Integration) {
const systemTestsDockerDir = joinPath(integrationDir, '_dev/deploy/docker/');
const systemTestsSamplesDir = joinPath(systemTestsDockerDir, 'sample_logs');
ensureDirSync(systemTestsSamplesDir);
const streamVersion = '0.13.0';
const dockerComposeVersion = '2.3';
const dockerServices: string[] = [];
for (const stream of integration.dataStreams) {
const packageName = integration.name.replace(/_/g, '-');
const dataStreamName = stream.name.replace(/_/g, '-');
const systemTestFileName = joinPath(
systemTestsSamplesDir,
`test-${packageName}-${dataStreamName}.log`
);
const rawSamplesContent = stream.rawSamples.join('\n');
createSync(systemTestFileName, rawSamplesContent);
for (const inputType of stream.inputTypes) {
const mappedValues = {
package_name: packageName,
data_stream_name: dataStreamName,
stream_version: streamVersion,
};
const renderedService = nunjucks.render(
`service_${inputType.replaceAll('_', '-')}.njk`,
mappedValues
);
dockerServices.push(renderedService);
}
}
const renderedDockerCompose = nunjucks.render('docker_compose.yml.njk', {
services: dockerServices.join('\n'),
docker_compose_version: dockerComposeVersion,
});
const dockerComposeFileName = joinPath(systemTestsDockerDir, 'docker-compose.yml');
createSync(dockerComposeFileName, renderedDockerCompose);
}

View file

@ -0,0 +1,40 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import nunjucks from 'nunjucks';
import { createSync, generateFields, mergeSamples } from '../util';
export function createFieldMapping(
packageName: string,
dataStreamName: string,
specificDataStreamDir: string,
docs: object[]
): void {
createBaseFields(specificDataStreamDir, packageName, dataStreamName);
createCustomFields(specificDataStreamDir, docs);
}
function createBaseFields(
specificDataStreamDir: string,
packageName: string,
dataStreamName: string
): void {
const datasetName = `${packageName}.${dataStreamName}`;
const baseFields = nunjucks.render('base_fields.yml.njk', {
module: packageName,
dataset: datasetName,
});
createSync(`${specificDataStreamDir}/base-fields.yml`, baseFields);
}
function createCustomFields(specificDataStreamDir: string, pipelineResults: object[]): void {
const mergedResults = mergeSamples(pipelineResults);
const fieldKeys = generateFields(mergedResults);
createSync(`${specificDataStreamDir}/fields/fields.yml`, fieldKeys);
}

View file

@ -0,0 +1,8 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
export { buildPackage } from './build_integration';

View file

@ -0,0 +1,15 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { join as joinPath } from 'path';
import yaml from 'js-yaml';
import { createSync } from '../util';
export function createPipeline(specificDataStreamDir: string, pipeline: object): void {
const filePath = joinPath(specificDataStreamDir, 'elasticsearch/ingest_pipeline/default.yml');
const yamlContent = `---\n${yaml.dump(pipeline, { sortKeys: false })}`;
createSync(filePath, yamlContent);
}

View file

@ -0,0 +1,63 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type {
Plugin,
PluginInitializerContext,
CoreSetup,
CoreStart,
Logger,
CustomRequestHandlerContext,
} from '@kbn/core/server';
import type { PluginStartContract as ActionsPluginsStart } from '@kbn/actions-plugin/server/plugin';
import { registerRoutes } from './routes';
import type { IntegrationAssistantPluginSetup, IntegrationAssistantPluginStart } from './types';
export type IntegrationAssistantRouteHandlerContext = CustomRequestHandlerContext<{
integrationAssistant: {
getStartServices: CoreSetup<{
actions: ActionsPluginsStart;
}>['getStartServices'];
logger: Logger;
};
}>;
export class IntegrationAssistantPlugin
implements Plugin<IntegrationAssistantPluginSetup, IntegrationAssistantPluginStart>
{
private readonly logger: Logger;
constructor(initializerContext: PluginInitializerContext) {
this.logger = initializerContext.logger.get();
}
public setup(
core: CoreSetup<{
actions: ActionsPluginsStart;
}>
) {
core.http.registerRouteHandlerContext<
IntegrationAssistantRouteHandlerContext,
'integrationAssistant'
>('integrationAssistant', () => ({
getStartServices: core.getStartServices,
logger: this.logger,
}));
const router = core.http.createRouter<IntegrationAssistantRouteHandlerContext>();
this.logger.debug('integrationAssistant api: Setup');
registerRoutes(router);
return {};
}
public start(core: CoreStart) {
this.logger.debug('integrationAssistant api: Started');
return {};
}
public stop() {}
}

View file

@ -0,0 +1,72 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { schema } from '@kbn/config-schema';
import type { IRouter } from '@kbn/core/server';
import type { BuildIntegrationApiRequest } from '../../common';
import { INTEGRATION_BUILDER_PATH } from '../../common';
import { buildPackage } from '../integration_builder';
import type { IntegrationAssistantRouteHandlerContext } from '../plugin';
export function registerIntegrationBuilderRoutes(
router: IRouter<IntegrationAssistantRouteHandlerContext>
) {
router.versioned
.post({
path: INTEGRATION_BUILDER_PATH,
access: 'internal',
})
.addVersion(
{
version: '1',
validate: {
request: {
body: schema.object({
integration: schema.object({
name: schema.string(),
title: schema.string(),
description: schema.string(),
logo: schema.maybe(schema.string()),
dataStreams: schema.arrayOf(
schema.object({
name: schema.string(),
title: schema.string(),
description: schema.string(),
inputTypes: schema.arrayOf(schema.string()),
rawSamples: schema.arrayOf(schema.string()),
pipeline: schema.object({
name: schema.maybe(schema.string()),
description: schema.maybe(schema.string()),
version: schema.maybe(schema.number()),
processors: schema.arrayOf(
schema.recordOf(schema.string(), schema.object({}, { unknowns: 'allow' }))
),
on_failure: schema.maybe(
schema.arrayOf(
schema.recordOf(schema.string(), schema.object({}, { unknowns: 'allow' }))
)
),
}),
docs: schema.arrayOf(schema.object({}, { unknowns: 'allow' })),
})
),
}),
}),
},
},
},
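// Build the integration package zip from the validated request body; failures map to a 500.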
async (_, request, response) => {
const { integration } = request.body as BuildIntegrationApiRequest;
try {
const zippedIntegration = await buildPackage(integration);
return response.custom({ statusCode: 200, body: zippedIntegration });
} catch (e) {
return response.customError({ statusCode: 500, body: e });
}
}
);
}

View file

@ -0,0 +1,99 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { schema } from '@kbn/config-schema';
import type { IRouter } from '@kbn/core/server';
import { getRequestAbortedSignal } from '@kbn/data-plugin/server';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import type { CategorizationApiRequest, CategorizationApiResponse } from '../../common';
import { CATEGORIZATION_GRAPH_PATH } from '../../common';
import { ROUTE_HANDLER_TIMEOUT } from '../constants';
import { getCategorizationGraph } from '../graphs/categorization';
import type { IntegrationAssistantRouteHandlerContext } from '../plugin';
export function registerCategorizationRoutes(
router: IRouter<IntegrationAssistantRouteHandlerContext>
) {
router.versioned
.post({
path: CATEGORIZATION_GRAPH_PATH,
access: 'internal',
options: {
timeout: {
idleSocket: ROUTE_HANDLER_TIMEOUT,
},
},
})
.addVersion(
{
version: '1',
validate: {
request: {
body: schema.object({
packageName: schema.string(),
dataStreamName: schema.string(),
rawSamples: schema.arrayOf(schema.string()),
currentPipeline: schema.any(),
connectorId: schema.maybe(schema.string()),
model: schema.maybe(schema.string()),
region: schema.maybe(schema.string()),
}),
},
},
},
async (context, req, res) => {
const { packageName, dataStreamName, rawSamples, currentPipeline } =
req.body as CategorizationApiRequest;
const services = await context.resolve(['core']);
const { client } = services.core.elasticsearch;
const { getStartServices, logger } = await context.integrationAssistant;
const [, { actions: actionsPlugin }] = await getStartServices();
const actionsClient = await actionsPlugin.getActionsClientWithRequest(req);
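// Use the connector from the request if provided; otherwise fall back to the first Bedrock connector found.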
const connector = req.body.connectorId
? await actionsClient.get({ id: req.body.connectorId })
: (await actionsClient.getAll()).filter(
(connectorItem) => connectorItem.actionTypeId === '.bedrock'
)[0];
const abortSignal = getRequestAbortedSignal(req.events.aborted$);
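        // OpenAI connectors use the chat-completions client; all other connector types
        // (currently Bedrock) use the simple chat model wrapper.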
const isOpenAI = connector.actionTypeId === '.gen-ai';
const llmClass = isOpenAI ? ActionsClientChatOpenAI : ActionsClientSimpleChatModel;
const model = new llmClass({
actions: actionsPlugin,
connectorId: connector.id,
request: req,
logger,
llmType: isOpenAI ? 'openai' : 'bedrock',
model: req.body.model || connector.config?.defaultModel,
temperature: 0.05,
maxTokens: 4096,
signal: abortSignal,
streaming: false,
});
const graph = await getCategorizationGraph(client, model);
let results = { results: { docs: {}, pipeline: {} } };
try {
results = (await graph.invoke({
packageName,
dataStreamName,
rawSamples,
currentPipeline,
})) as CategorizationApiResponse;
} catch (e) {
return res.badRequest({ body: e });
}
return res.ok({ body: results });
}
);
}

@@ -0,0 +1,103 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { schema } from '@kbn/config-schema';
import type { IRouter } from '@kbn/core/server';
import { getRequestAbortedSignal } from '@kbn/data-plugin/server';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { ECS_GRAPH_PATH } from '../../common';
import type { EcsMappingApiRequest, EcsMappingApiResponse } from '../../common/types';
import { ROUTE_HANDLER_TIMEOUT } from '../constants';
import { getEcsGraph } from '../graphs/ecs';
import type { IntegrationAssistantRouteHandlerContext } from '../plugin';
export function registerEcsRoutes(router: IRouter<IntegrationAssistantRouteHandlerContext>) {
router.versioned
.post({
path: ECS_GRAPH_PATH,
access: 'internal',
options: {
timeout: {
idleSocket: ROUTE_HANDLER_TIMEOUT,
},
},
})
.addVersion(
{
version: '1',
validate: {
request: {
body: schema.object({
packageName: schema.string(),
dataStreamName: schema.string(),
rawSamples: schema.arrayOf(schema.string()),
// TODO: This is a single nested object of any key or shape, any better schema?
mapping: schema.maybe(schema.any()),
connectorId: schema.maybe(schema.string()),
region: schema.maybe(schema.string()),
model: schema.maybe(schema.string()),
}),
},
},
},
async (context, req, res) => {
const { packageName, dataStreamName, rawSamples, mapping } =
req.body as EcsMappingApiRequest;
const { getStartServices, logger } = await context.integrationAssistant;
const [, { actions: actionsPlugin }] = await getStartServices();
const actionsClient = await actionsPlugin.getActionsClientWithRequest(req);
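        // Use the connector from the request when one is provided; otherwise fall back
        // to the first available Bedrock connector.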
const connector = req.body.connectorId
? await actionsClient.get({ id: req.body.connectorId })
: (await actionsClient.getAll()).filter(
(connectorItem) => connectorItem.actionTypeId === '.bedrock'
)[0];
const abortSignal = getRequestAbortedSignal(req.events.aborted$);
const isOpenAI = connector.actionTypeId === '.gen-ai';
const llmClass = isOpenAI ? ActionsClientChatOpenAI : ActionsClientSimpleChatModel;
const model = new llmClass({
actions: actionsPlugin,
connectorId: connector.id,
request: req,
logger,
llmType: isOpenAI ? 'openai' : 'bedrock',
model: req.body.model || connector.config?.defaultModel,
temperature: 0.05,
maxTokens: 4096,
signal: abortSignal,
streaming: false,
});
const graph = await getEcsGraph(model);
let results = { results: { mapping: {}, pipeline: {} } };
try {
if (req.body?.mapping) {
results = (await graph.invoke({
packageName,
dataStreamName,
rawSamples,
mapping,
})) as EcsMappingApiResponse;
          } else {
            results = (await graph.invoke({
              packageName,
              dataStreamName,
              rawSamples,
            })) as EcsMappingApiResponse;
          }
} catch (e) {
return res.badRequest({ body: e });
}
return res.ok({ body: results });
}
);
}
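
For illustration only (the UI lands in a separate PR), a browser-side caller could invoke this versioned route roughly as follows. This is a sketch, not code from this PR: `http` is assumed to be Kibana's client-side `HttpSetup`, and the payload fields mirror the request schema above.

```typescript
// Hypothetical sketch: calling the ECS graph route from the browser.
// Assumes `http` is Kibana's client-side HttpSetup, and that ECS_GRAPH_PATH and
// EcsMappingApiResponse are imported from this plugin's common/ directory.
const ecsResponse = await http.post<EcsMappingApiResponse>(ECS_GRAPH_PATH, {
  version: '1', // must match the version registered with addVersion above
  body: JSON.stringify({
    packageName: 'my_package',
    dataStreamName: 'my_data_stream',
    rawSamples: ['{"message":"sample log line"}'],
  }),
});
```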

@@ -0,0 +1,8 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
export { registerRoutes } from './register_routes';

@@ -0,0 +1,60 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { schema } from '@kbn/config-schema';
import type { IRouter } from '@kbn/core/server';
import { TEST_PIPELINE_PATH } from '../../common';
import type { TestPipelineApiRequest, TestPipelineApiResponse } from '../../common/types';
import { ROUTE_HANDLER_TIMEOUT } from '../constants';
import type { IntegrationAssistantRouteHandlerContext } from '../plugin';
import { testPipeline } from '../util/pipeline';
export function registerPipelineRoutes(router: IRouter<IntegrationAssistantRouteHandlerContext>) {
router.versioned
.post({
path: TEST_PIPELINE_PATH,
access: 'internal',
options: {
timeout: {
idleSocket: ROUTE_HANDLER_TIMEOUT,
},
},
})
.addVersion(
{
version: '1',
validate: {
request: {
body: schema.object({
            currentPipeline: schema.any(),
rawSamples: schema.arrayOf(schema.string()),
}),
},
},
},
async (context, req, res) => {
const { rawSamples, currentPipeline } = req.body as TestPipelineApiRequest;
const services = await context.resolve(['core']);
const { client } = services.core.elasticsearch;
let results: TestPipelineApiResponse = { pipelineResults: [], errors: [] };
try {
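          // Simulate the pipeline against the raw samples; any processor errors are
          // reported back to the caller as a 400.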
results = (await testPipeline(
rawSamples,
currentPipeline,
client
)) as TestPipelineApiResponse;
if (results?.errors && results.errors.length > 0) {
return res.badRequest({ body: JSON.stringify(results.errors) });
}
} catch (e) {
return res.badRequest({ body: e });
}
return res.ok({ body: results });
}
);
}

@@ -0,0 +1,22 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import type { IRouter } from '@kbn/core/server';
import { registerEcsRoutes } from './ecs_routes';
import { registerIntegrationBuilderRoutes } from './build_integration_routes';
import { registerCategorizationRoutes } from './categorization_routes';
import { registerRelatedRoutes } from './related_routes';
import { registerPipelineRoutes } from './pipeline_routes';
import type { IntegrationAssistantRouteHandlerContext } from '../plugin';
export function registerRoutes(router: IRouter<IntegrationAssistantRouteHandlerContext>) {
registerEcsRoutes(router);
registerIntegrationBuilderRoutes(router);
registerCategorizationRoutes(router);
registerRelatedRoutes(router);
registerPipelineRoutes(router);
}
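
For context, a minimal sketch of how `registerRoutes` is typically wired into the server plugin's `setup` lifecycle, assuming the standard Kibana plugin pattern (the class shape here is illustrative, not part of this diff):

```typescript
// Hypothetical wiring sketch following the standard Kibana server plugin pattern.
import type { CoreSetup, Plugin } from '@kbn/core/server';
import { registerRoutes } from './routes';
import type { IntegrationAssistantRouteHandlerContext } from './plugin';

export class IntegrationAssistantServerPlugin implements Plugin {
  public setup(core: CoreSetup) {
    // Type the router with the custom handler context that every route above expects.
    const router = core.http.createRouter<IntegrationAssistantRouteHandlerContext>();
    registerRoutes(router);
  }

  public start() {}
}
```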

@@ -0,0 +1,98 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { schema } from '@kbn/config-schema';
import type { IRouter } from '@kbn/core/server';
import { getRequestAbortedSignal } from '@kbn/data-plugin/server';
import {
ActionsClientChatOpenAI,
ActionsClientSimpleChatModel,
} from '@kbn/langchain/server/language_models';
import { RELATED_GRAPH_PATH } from '../../common';
import type { RelatedApiRequest, RelatedApiResponse } from '../../common/types';
import { ROUTE_HANDLER_TIMEOUT } from '../constants';
import { getRelatedGraph } from '../graphs/related';
import type { IntegrationAssistantRouteHandlerContext } from '../plugin';
export function registerRelatedRoutes(router: IRouter<IntegrationAssistantRouteHandlerContext>) {
router.versioned
.post({
path: RELATED_GRAPH_PATH,
access: 'internal',
options: {
timeout: {
idleSocket: ROUTE_HANDLER_TIMEOUT,
},
},
})
.addVersion(
{
version: '1',
validate: {
request: {
body: schema.object({
packageName: schema.string(),
dataStreamName: schema.string(),
rawSamples: schema.arrayOf(schema.string()),
// TODO: This is a single nested object of any key or shape, any better schema?
currentPipeline: schema.maybe(schema.any()),
connectorId: schema.maybe(schema.string()),
region: schema.maybe(schema.string()),
model: schema.maybe(schema.string()),
}),
},
},
},
async (context, req, res) => {
const { packageName, dataStreamName, rawSamples, currentPipeline } =
req.body as RelatedApiRequest;
const services = await context.resolve(['core']);
const { client } = services.core.elasticsearch;
const { getStartServices, logger } = await context.integrationAssistant;
const [, { actions: actionsPlugin }] = await getStartServices();
const actionsClient = await actionsPlugin.getActionsClientWithRequest(req);
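        // Use the connector from the request when one is provided; otherwise fall back
        // to the first available Bedrock connector.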
const connector = req.body.connectorId
? await actionsClient.get({ id: req.body.connectorId })
: (await actionsClient.getAll()).filter(
(connectorItem) => connectorItem.actionTypeId === '.bedrock'
)[0];
const isOpenAI = connector.actionTypeId === '.gen-ai';
const llmClass = isOpenAI ? ActionsClientChatOpenAI : ActionsClientSimpleChatModel;
const abortSignal = getRequestAbortedSignal(req.events.aborted$);
const model = new llmClass({
actions: actionsPlugin,
connectorId: connector.id,
request: req,
logger,
llmType: isOpenAI ? 'openai' : 'bedrock',
model: req.body.model || connector.config?.defaultModel,
temperature: 0.05,
maxTokens: 4096,
signal: abortSignal,
streaming: false,
});
const graph = await getRelatedGraph(client, model);
let results = { results: { docs: {}, pipeline: {} } };
try {
results = (await graph.invoke({
packageName,
dataStreamName,
rawSamples,
currentPipeline,
})) as RelatedApiResponse;
} catch (e) {
return res.badRequest({ body: e });
}
return res.ok({ body: results });
}
);
}

@@ -0,0 +1,76 @@
{{#unless log_group_name}}
{{#unless log_group_name_prefix}}
{{#if log_group_arn }}
log_group_arn: {{ log_group_arn }}
{{/if}}
{{/unless}}
{{/unless}}
{{#unless log_group_arn}}
{{#unless log_group_name}}
{{#if log_group_name_prefix }}
log_group_name_prefix: {{ log_group_name_prefix }}
{{/if}}
{{/unless}}
{{/unless}}
{{#unless log_group_arn}}
{{#unless log_group_name_prefix}}
{{#if log_group_name }}
log_group_name: {{ log_group_name }}
{{/if}}
{{/unless}}
{{/unless}}
{{#unless log_group_arn}}
region_name: {{ region_name }}
{{/unless}}
{{#unless log_stream_prefix}}
{{#if log_streams }}
log_streams: {{ log_streams }}
{{/if}}
{{/unless}}
{{#unless log_streams}}
{{#if log_stream_prefix }}
log_stream_prefix: {{ log_stream_prefix }}
{{/if}}
{{/unless}}
{{#if start_position }}
start_position: {{ start_position }}
{{/if}}
{{#if scan_frequency }}
scan_frequency: {{ scan_frequency }}
{{/if}}
{{#if api_sleep }}
api_sleep: {{ api_sleep }}
{{/if}}
{{#if api_timeout}}
api_timeout: {{api_timeout}}
{{/if}}
{{#if latency }}
latency: {{ latency }}
{{/if}}
{{#if number_of_workers }}
number_of_workers: {{ number_of_workers }}
{{/if}}
{{#if credential_profile_name}}
credential_profile_name: {{credential_profile_name}}
{{/if}}
{{#if shared_credential_file}}
shared_credential_file: {{shared_credential_file}}
{{/if}}
{{#if default_region}}
default_region: {{default_region}}
{{/if}}
{{#if access_key_id}}
access_key_id: {{access_key_id}}
{{/if}}
{{#if secret_access_key}}
secret_access_key: {{secret_access_key}}
{{/if}}
{{#if session_token}}
session_token: {{session_token}}
{{/if}}
{{#if role_arn}}
role_arn: {{role_arn}}
{{/if}}
{{#if proxy_url }}
proxy_url: {{proxy_url}}
{{/if}}

@@ -0,0 +1,130 @@
{{! start SQS queue }}
{{#unless bucket_arn}}
{{#unless non_aws_bucket_name}}
{{#if queue_url }}
queue_url: {{ queue_url }}
{{/if}}
{{/unless}}
{{/unless}}
{{! end SQS queue }}
{{#unless queue_url}}{{! start S3 bucket polling }}
{{!
When using an S3 bucket, you can specify only one of the following options:
- An AWS bucket ARN
- A non-AWS bucket name
}}
{{! shared S3 bucket polling options }}
{{#if number_of_workers }}
number_of_workers: {{ number_of_workers }}
{{/if}}
{{#if bucket_list_prefix }}
bucket_list_prefix: {{ bucket_list_prefix }}
{{/if}}
{{#if bucket_list_interval }}
bucket_list_interval: {{ bucket_list_interval }}
{{/if}}
{{! AWS S3 bucket ARN options }}
{{#unless non_aws_bucket_name}}
{{#if bucket_arn }}
bucket_arn: {{ bucket_arn }}
{{/if}}
{{/unless}}{{! end AWS S3 bucket ARN options }}
{{! non-AWS S3 bucket ARN options }}
{{#unless bucket_arn}}
{{#if non_aws_bucket_name }}
non_aws_bucket_name: {{ non_aws_bucket_name }}
{{/if}}
{{/unless}}{{! end non-AWS S3 bucket ARN options }}
{{/unless}}{{! end S3 bucket polling }}
{{#if buffer_size }}
buffer_size: {{ buffer_size }}
{{/if}}
{{#if content_type }}
content_type: {{ content_type }}
{{/if}}
{{#if encoding }}
encoding: {{ encoding }}
{{/if}}
{{#if expand_event_list_from_field }}
expand_event_list_from_field: {{ expand_event_list_from_field }}
{{/if}}
{{#if fips_enabled }}
fips_enabled: {{ fips_enabled }}
{{/if}}
{{#if include_s3_metadata }}
include_s3_metadata: {{ include_s3_metadata }}
{{/if}}
{{#if max_bytes }}
max_bytes: {{ max_bytes }}
{{/if}}
{{#if max_number_of_messages }}
max_number_of_messages: {{ max_number_of_messages }}
{{/if}}
{{#if path_style }}
path_style: {{ path_style }}
{{/if}}
{{#if provider }}
provider: {{ provider }}
{{/if}}
{{#if sqs.max_receive_count }}
sqs.max_receive_count: {{ sqs.max_receive_count }}
{{/if}}
{{#if sqs.wait_time }}
sqs.wait_time: {{ sqs.wait_time }}
{{/if}}
{{#if file_selectors}}
file_selectors:
{{file_selectors}}
{{/if}}
{{#if credential_profile_name}}
credential_profile_name: {{credential_profile_name}}
{{/if}}
{{#if shared_credential_file}}
shared_credential_file: {{shared_credential_file}}
{{/if}}
{{#if visibility_timeout}}
visibility_timeout: {{visibility_timeout}}
{{/if}}
{{#if api_timeout}}
api_timeout: {{api_timeout}}
{{/if}}
{{#if endpoint}}
endpoint: {{endpoint}}
{{/if}}
{{#if default_region}}
default_region: {{default_region}}
{{/if}}
{{#if access_key_id}}
access_key_id: {{access_key_id}}
{{/if}}
{{#if secret_access_key}}
secret_access_key: {{secret_access_key}}
{{/if}}
{{#if session_token}}
session_token: {{session_token}}
{{/if}}
{{#if role_arn}}
role_arn: {{role_arn}}
{{/if}}
{{#if fips_enabled}}
fips_enabled: {{fips_enabled}}
{{/if}}
{{#if proxy_url }}
proxy_url: {{proxy_url}}
{{/if}}
{{#if parsers}}
parsers:
{{parsers}}
{{/if}}

@@ -0,0 +1,35 @@
{{#if account_name}}
account_name: {{account_name}}
{{/if}}
{{#if service_account_key}}
auth.shared_credentials.account_key: {{service_account_key}}
{{/if}}
{{#if service_account_uri}}
auth.connection_string.uri: {{service_account_uri}}
{{/if}}
{{#if storage_url}}
storage_url: {{storage_url}}
{{/if}}
{{#if number_of_workers}}
max_workers: {{number_of_workers}}
{{/if}}
{{#if poll}}
poll: {{poll}}
{{/if}}
{{#if poll_interval}}
poll_interval: {{poll_interval}}
{{/if}}
{{#if containers}}
containers:
{{containers}}
{{/if}}
{{#if file_selectors}}
file_selectors:
{{file_selectors}}
{{/if}}
{{#if timestamp_epoch}}
timestamp_epoch: {{timestamp_epoch}}
{{/if}}
{{#if expand_event_list_from_field}}
expand_event_list_from_field: {{expand_event_list_from_field}}
{{/if}}

@@ -0,0 +1,28 @@
{{#if eventhub}}
eventhub: {{eventhub}}
{{/if}}
{{#if consumer_group}}
consumer_group: {{consumer_group}}
{{/if}}
{{#if connection_string}}
connection_string: {{connection_string}}
{{/if}}
{{#if storage_account}}
storage_account: {{storage_account}}
{{/if}}
{{#if storage_account_key}}
storage_account_key: {{storage_account_key}}
{{/if}}
{{#if storage_account_container}}
storage_account_container: {{storage_account_container}}
{{/if}}
{{#if resource_manager_endpoint}}
resource_manager_endpoint: {{resource_manager_endpoint}}
{{/if}}
sanitize_options:
{{#if sanitize_newlines}}
- NEW_LINES
{{/if}}
{{#if sanitize_singlequotes}}
- SINGLE_QUOTES
{{/if}}

@@ -0,0 +1,24 @@
{{#if api_address}}
api_address: {{api_address}}
{{/if}}
{{#if doppler_address}}
doppler_address: {{doppler_address}}
{{/if}}
{{#if uaa_address}}
uaa_address: {{uaa_address}}
{{/if}}
{{#if rlp_address}}
rlp_address: {{rlp_address}}
{{/if}}
{{#if client_id}}
client_id: {{client_id}}
{{/if}}
{{#if client_secret}}
client_secret: {{client_secret}}
{{/if}}
{{#if version}}
version: {{version}}
{{/if}}
{{#if shard_id}}
shard_id: {{shard_id}}
{{/if}}

@@ -0,0 +1,14 @@
tags:
{{#if preserve_original_event}}
- preserve_original_event
{{/if}}
{{#each tags as |tag|}}
- {{tag}}
{{/each}}
{{#contains "forwarded" tags}}
publisher_pipeline.disable_host: true
{{/contains}}
{{#if processors}}
processors:
{{processors}}
{{/if}}

@@ -0,0 +1,13 @@
paths:
{{#each paths as |path|}}
- {{path}}
{{/each}}
{{#if exclude_files}}
prospector.scanner.exclude_files:
{{#each exclude_files as |pattern f|}}
- {{pattern}}
{{/each}}
{{/if}}
{{#if custom}}
{{custom}}
{{/if}}

@@ -0,0 +1,27 @@
{{#if project_id}}
project_id: {{project_id}}
{{/if}}
{{#if topic}}
topic: {{topic}}
{{/if}}
{{#if subscription_name}}
subscription.name: {{subscription_name}}
{{/if}}
{{#if subscription_create}}
subscription.create: {{subscription_create}}
{{/if}}
{{#if subscription_num_goroutines}}
subscription.num_goroutines: {{subscription_num_goroutines}}
{{/if}}
{{#if subscription_max_outstanding_messages}}
subscription.max_outstanding_messages: {{subscription_max_outstanding_messages}}
{{/if}}
{{#if credentials_file}}
credentials_file: {{credentials_file}}
{{/if}}
{{#if credentials_json}}
credentials_json: '{{credentials_json}}'
{{/if}}
{{#if alternative_host}}
alternative_host: {{alternative_host}}
{{/if}}

@@ -0,0 +1,35 @@
{{#if project_id}}
project_id: {{project_id}}
{{/if}}
{{#if alternative_host}}
alternative_host: {{alternative_host}}
{{/if}}
{{#if service_account_key}}
auth.credentials_json.account_key: {{service_account_key}}
{{/if}}
{{#if service_account_file}}
auth.credentials_file.path: {{service_account_file}}
{{/if}}
{{#if number_of_workers}}
max_workers: {{number_of_workers}}
{{/if}}
{{#if poll}}
poll: {{poll}}
{{/if}}
{{#if poll_interval}}
poll_interval: {{poll_interval}}
{{/if}}
{{#if bucket_timeout}}
bucket_timeout: {{bucket_timeout}}
{{/if}}
{{#if buckets}}
buckets:
{{buckets}}
{{/if}}
{{#if file_selectors}}
file_selectors:
{{file_selectors}}
{{/if}}
{{#if timestamp_epoch}}
timestamp_epoch: {{timestamp_epoch}}
{{/if}}

@@ -0,0 +1,57 @@
{{#if listen_address}}
listen_address: {{listen_address}}
{{/if}}
{{#if listen_port}}
listen_port: {{listen_port}}
{{/if}}
{{#if prefix}}
prefix: {{prefix}}
{{/if}}
{{#if preserve_original_event}}
preserve_original_event: {{preserve_original_event}}
{{/if}}
{{#if basic_auth}}
basic_auth: {{basic_auth}}
{{/if}}
{{#if username}}
username: {{username}}
{{/if}}
{{#if password}}
password: {{password}}
{{/if}}
{{#if secret_header}}
secret.header: {{secret_header}}
{{/if}}
{{#if secret_value}}
secret.value: {{secret_value}}
{{/if}}
{{#if hmac_header}}
hmac.header: {{hmac_header}}
{{/if}}
{{#if hmac_key}}
hmac.key: {{hmac_key}}
{{/if}}
{{#if hmac_type}}
hmac.type: {{hmac_type}}
{{/if}}
{{#if hmac_prefix}}
hmac.prefix: {{hmac_prefix}}
{{/if}}
{{#if content_type}}
content_type: {{content_type}}
{{/if}}
{{#if response_code}}
response_code: {{response_code}}
{{/if}}
{{#if response_body}}
response_body: '{{response_body}}'
{{/if}}
{{#if url}}
url: {{url}}
{{/if}}
{{#if include_headers}}
include_headers:
{{#each include_headers as |header|}}
- {{header}}
{{/each}}
{{/if}}

@@ -0,0 +1,44 @@
condition: ${host.platform} == 'linux'
{{#if paths}}
paths:
{{#each paths as |path i|}}
- {{path}}
{{/each}}
{{/if}}
{{#if backoff}}
backoff: {{backoff}}
{{/if}}
{{#if max_backoff}}
max_backoff: {{max_backoff}}
{{/if}}
{{#if seek}}
seek: {{seek}}
{{/if}}
{{#if cursor_seek_fallback}}
cursor_seek_fallback: {{cursor_seek_fallback}}
{{/if}}
{{#if since}}
since: {{since}}
{{/if}}
{{#if units}}
units: {{units}}
{{/if}}
{{#if syslog_identifiers}}
syslog_identifiers:
{{#each syslog_identifiers as |identifier i|}}
- {{identifier}}
{{/each}}
{{/if}}
{{#if transports}}
transports:
{{#each transports as |transport i|}}
- {{transport}}
{{/each}}
{{/if}}
{{#if include_matches}}
include_matches:
{{#each include_matches as |match i|}}
- {{match}}
{{/each}}
{{/if}}

@@ -0,0 +1,100 @@
{{#if hosts}}
hosts:
{{#each hosts as |host i|}}
- {{host}}
{{/each}}
{{/if}}
{{#if topics}}
topics:
{{#each topics as |topic i|}}
- {{topic}}
{{/each}}
{{/if}}
{{#if group_id}}
group_id: {{group_id}}
{{/if}}
{{#if client_id}}
client_id: {{client_id}}
{{/if}}
{{#if username}}
username: {{username}}
{{/if}}
{{#if password}}
password: {{password}}
{{/if}}
{{#if version}}
version: {{version}}
{{/if}}
{{#if initial_offset}}
initial_offset: {{initial_offset}}
{{/if}}
{{#if connect_backoff}}
connect_backoff: {{connect_backoff}}
{{/if}}
{{#if consume_backoff}}
consume_backoff: {{consume_backoff}}
{{/if}}
{{#if max_wait_time}}
max_wait_time: {{max_wait_time}}
{{/if}}
{{#if wait_close}}
wait_close: {{wait_close}}
{{/if}}
{{#if isolation_level}}
isolation_level: {{isolation_level}}
{{/if}}
{{#if expand_event_list_from_field}}
expand_event_list_from_field: {{expand_event_list_from_field}}
{{/if}}
{{#if fetch_min}}
fetch.min: {{fetch_min}}
{{/if}}
{{#if fetch_default}}
fetch.default: {{fetch_default}}
{{/if}}
{{#if fetch_max}}
fetch.max: {{fetch_max}}
{{/if}}
{{#if rebalance_strategy}}
rebalance.strategy: {{rebalance_strategy}}
{{/if}}
{{#if rebalance_timeout}}
rebalance.timeout: {{rebalance_timeout}}
{{/if}}
{{#if rebalance_max_retries}}
rebalance.max_retries: {{rebalance_max_retries}}
{{/if}}
{{#if rebalance_retry_backoff}}
rebalance.retry_backoff: {{rebalance_retry_backoff}}
{{/if}}
{{#if parsers}}
parsers:
{{parsers}}
{{/if}}
{{#if kerberos_enabled}}
kerberos.enabled: {{kerberos_enabled}}
{{/if}}
{{#if kerberos_auth_type}}
kerberos.auth_type: {{kerberos_auth_type}}
{{/if}}
{{#if kerberos_config_path}}
kerberos.config_path: {{kerberos_config_path}}
{{/if}}
{{#if kerberos_username}}
kerberos.username: {{kerberos_username}}
{{/if}}
{{#if kerberos_password}}
kerberos.password: {{kerberos_password}}
{{/if}}
{{#if kerberos_keytab}}
kerberos.keytab: {{kerberos_keytab}}
{{/if}}
{{#if kerberos_service_name}}
kerberos.service_name: {{kerberos_service_name}}
{{/if}}
{{#if kerberos_realm}}
kerberos.realm: {{kerberos_realm}}
{{/if}}
{{#if kerberos_enable_krb5_fast}}
kerberos.enable_krb5_fast: {{kerberos_enable_krb5_fast}}
{{/if}}

@@ -0,0 +1,13 @@
paths:
{{#each paths as |path i|}}
- {{path}}
{{/each}}
{{#if exclude_files}}
exclude_files:
{{#each exclude_files as |file f|}}
- {{file}}
{{/each}}
{{/if}}
{{#if custom}}
{{custom}}
{{/if}}

@@ -0,0 +1,19 @@
host: {{listen_address}}:{{listen_port}}
{{#if max_message_size}}
max_message_size: {{max_message_size}}
{{/if}}
{{#if framing}}
framing: {{framing}}
{{/if}}
{{#if line_delimiter}}
line_delimiter: {{line_delimiter}}
{{/if}}
{{#if max_connections}}
max_connections: {{max_connections}}
{{/if}}
{{#if timeout}}
timeout: {{timeout}}
{{/if}}
{{#if keep_null}}
keep_null: {{keep_null}}
{{/if}}

@@ -0,0 +1,10 @@
host: {{listen_address}}:{{listen_port}}
{{#if max_message_size}}
max_message_size: {{max_message_size}}
{{/if}}
{{#if timeout}}
timeout: {{timeout}}
{{/if}}
{{#if keep_null}}
keep_null: {{keep_null}}
{{/if}}

@@ -0,0 +1,20 @@
- name: data_stream.type
type: constant_keyword
description: Data stream type.
- name: data_stream.dataset
type: constant_keyword
description: Data stream dataset name.
- name: data_stream.namespace
type: constant_keyword
description: Data stream namespace.
- name: event.module
type: constant_keyword
description: Event module
value: {{ module }}
- name: event.dataset
type: constant_keyword
description: Event dataset
value: {{ dataset }}
- name: "@timestamp"
type: date
description: Event timestamp.

@@ -0,0 +1,3 @@
dependencies:
ecs:
reference: "git@{{ ecs_version }}"

@@ -0,0 +1,6 @@
# newer versions go on top
- version: {{ initial_version }}
changes:
- description: Initial Version
type: enhancement
link: https://github.com/elastic/integrations/pull/xxxx

@@ -0,0 +1,44 @@
- name: cloud
title: Cloud
group: 2
description: Fields related to the cloud or infrastructure the events are coming from.
footnote: 'Examples: If Metricbeat is running on an EC2 host and fetches data from its host, the cloud info contains the data about this machine. If Metricbeat runs on a remote machine outside the cloud and fetches data from a service running in the cloud, the field contains cloud data from the machine the service is running on.'
type: group
fields:
- name: image.id
type: keyword
description: Image ID for the cloud instance.
- name: container
title: Container
group: 2
description: 'Container fields are used for meta information about the specific container that is the source of information.
These fields help correlate data based on containers from any runtime.'
type: group
fields:
- name: labels
level: extended
type: object
object_type: keyword
description: Image labels.
- name: host
title: Host
group: 2
description: 'A host is defined as a general computing instance.
ECS host.* fields should be populated with details about the host on which the event happened, or from which the measurement was taken. Host types include hardware, virtual machines, Docker containers, and Kubernetes nodes.'
type: group
fields:
- name: containerized
type: boolean
description: >
If the host is a container.
- name: os.build
type: keyword
example: "18D109"
description: >
OS build information.
- name: os.codename
type: keyword
example: "stretch"
description: >
OS codename, if any.

@@ -0,0 +1,30 @@
- name: input.type
type: keyword
description: Type of Filebeat input.
- name: log.flags
type: keyword
description: Flags for the log file.
- name: log.offset
type: long
description: Offset of the entry in the log file.
- name: log.file
type: group
fields:
- name: device_id
type: keyword
description: ID of the device containing the filesystem where the file resides.
- name: fingerprint
type: keyword
description: The sha256 fingerprint identity of the file when fingerprinting is enabled.
- name: inode
type: keyword
description: Inode number of the log file.
- name: idxhi
type: keyword
description: The high-order part of a unique identifier that is associated with a file. (Windows-only)
- name: idxlo
type: keyword
description: The low-order part of a unique identifier that is associated with a file. (Windows-only)
- name: vol
type: keyword
description: The serial number of the volume that contains a file. (Windows-only)

@@ -0,0 +1,4 @@
<svg width="32" height="32" viewBox="0 0 32 32" fill="none" xmlns="http://www.w3.org/2000/svg">
<path fill-rule="evenodd" clip-rule="evenodd" d="M17 13H8V15H17V13ZM24 18H8V20H24V18ZM8 23H24V25H8V23Z" fill="#017D73"/>
<path d="M21.41 0H5C3.34315 0 2 1.34315 2 3V29C2 30.6569 3.34315 32 5 32H27C28.6569 32 30 30.6569 30 29V8.59L21.41 0ZM22 3.41L26.59 8H22V3.41ZM27 30H5C4.44772 30 4 29.5523 4 29V3C4 2.44772 4.44772 2 5 2H20V10H28V29C28 29.5523 27.5523 30 27 30Z" fill="#343741"/>
</svg>


@@ -0,0 +1,92 @@
- input: aws-cloudwatch
template_path: aws-cloudwatch.yml.hbs
title: {{ data_stream_title }}
description: {{ data_stream_description }}
vars:
- name: log_group_arn
type: text
title: Log Group ARN
multi: false
required: false
show_user: true
description: ARN of the log group to collect logs from.
- name: start_position
type: text
title: Start Position
multi: false
required: false
default: beginning
show_user: true
    description: Allows the user to specify whether this input should read log files from the beginning or from the end.
- name: log_group_name
type: text
title: Log Group Name
multi: false
required: false
show_user: false
description: Name of the log group to collect logs from. `region_name` is required when `log_group_name` is given.
- name: log_group_name_prefix
type: text
title: Log Group Name Prefix
multi: false
required: false
show_user: false
description: The prefix for a group of log group names. `region_name` is required when `log_group_name_prefix` is given. `log_group_name` and `log_group_name_prefix` cannot be given at the same time.
- name: region_name
type: text
title: Region Name
multi: false
required: false
show_user: false
description: Region that the specified log group or log group prefix belongs to.
- name: log_streams
type: text
title: Log Streams
multi: true
required: false
show_user: false
    description: A list of log stream names that Filebeat collects log events from.
- name: log_stream_prefix
type: text
title: Log Stream Prefix
multi: false
required: false
show_user: false
description: A string to filter the results to include only log events from log streams that have names starting with this prefix.
- name: scan_frequency
type: text
title: Scan Frequency
multi: false
required: false
show_user: false
default: 1m
description: This config parameter sets how often Filebeat checks for new log events from the specified log group.
  - name: api_timeout
type: text
title: API Timeout
multi: false
required: false
show_user: false
default: 120s
    description: The maximum duration an AWS API call can take. If a call exceeds the timeout, it is interrupted.
- name: api_sleep
type: text
title: API Sleep
multi: false
required: false
show_user: false
default: 200ms
description: This is used to sleep between AWS FilterLogEvents API calls inside the same collection period. `FilterLogEvents` API has a quota of 5 transactions per second (TPS)/account/Region. This value should only be adjusted when there are multiple Filebeats or multiple Filebeat inputs collecting logs from the same region and AWS account.
- name: latency
type: text
title: Latency
multi: false
required: false
show_user: false
description: "The amount of time required for the logs to be available to CloudWatch Logs. Sample values, `1m` or `5m` — see Golang [time.ParseDuration](https://pkg.go.dev/time#ParseDuration) for more details. Latency translates the query's time range to consider the CloudWatch Logs latency. Example: `5m` means that the integration will query CloudWatch to search for logs available 5 minutes ago."
- name: number_of_workers
type: integer
title: Number of workers
required: false
show_user: false
description: The number of workers assigned to reading from log groups. Each worker will read log events from one of the log groups matching `log_group_name_prefix`. For example, if `log_group_name_prefix` matches five log groups, then `number_of_workers` should be set to `5`. The default value is `1`.

@@ -0,0 +1,177 @@
- input: aws-s3
template_path: aws-s3.yml.hbs
title: {{ data_stream_title }}
description: {{ data_stream_description }}
vars:
- name: bucket_arn
type: text
title: Bucket ARN
multi: false
required: false
show_user: true
description: ARN of the AWS S3 bucket that will be polled for list operation. (Required when `queue_url` and `non_aws_bucket_name` are not set).
- name: queue_url
type: text
title: Queue URL
multi: false
required: false
show_user: true
description: URL of the AWS SQS queue that messages will be received from.
- name: number_of_workers
type: integer
title: Number of Workers
multi: false
required: false
default: 1
show_user: true
description: Number of workers that will process the S3 objects listed. (Required when `bucket_arn` is set).
- name: parsers
type: yaml
title: Parsers
description: >-
This option expects a list of parsers that the payload has to go through. For more information see [Parsers](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-aws-s3.html#input-aws-s3-parsers)
required: false
show_user: true
multi: false
default: |
#- multiline:
# pattern: "^<Event"
# negate: true
# match: after
- name: api_timeout
type: text
title: API Timeout
multi: false
required: false
show_user: false
    description: The maximum duration an AWS API call can take. The maximum is half of the visibility timeout value.
- name: bucket_list_interval
type: text
title: Bucket List Interval
multi: false
required: false
show_user: false
default: 120s
description: Time interval for polling listing of the S3 bucket.
- name: bucket_list_prefix
type: text
title: Bucket List Prefix
multi: false
required: false
show_user: false
description: Prefix to apply for the list request to the S3 bucket.
- name: buffer_size
type: text
title: Buffer Size
multi: false
required: false
show_user: false
description: The size in bytes of the buffer that each harvester uses when fetching a file. This only applies to non-JSON logs.
- name: content_type
type: text
title: Content Type
multi: false
required: false
show_user: false
description: >-
      A standard MIME type describing the format of the object data. This can be set to override the MIME type that was given to the object when it was uploaded. For example, application/json.
- name: encoding
type: text
title: Encoding
multi: false
required: false
show_user: false
description: The file encoding to use for reading data that contains international characters. This only applies to non-JSON logs.
- name: expand_event_list_from_field
type: text
title: Expand Event List from Field
multi: false
required: false
show_user: false
description: >-
If the fileset using this input expects to receive multiple messages bundled under a specific field then the config option expand_event_list_from_field value can be assigned the name of the field. This setting will be able to split the messages under the group value into separate events. For example, CloudTrail logs are in JSON format and events are found under the JSON object "Records".
- name: file_selectors
type: yaml
title: File Selectors
multi: true
required: false
show_user: false
description: >-
      If the SQS queue will have events that correspond to files that this integration shouldn't process, file_selectors can be used to limit the files that are downloaded. This is a list of selectors which are made up of regex and expand_event_list_from_field options. The regex should match the S3 object key in the SQS message, and the optional expand_event_list_from_field is the same as the global setting. If file_selectors is given, then any global expand_event_list_from_field value is ignored in favor of the ones specified in the file_selectors. Regex syntax is the same as the Go language. Files that don't match one of the regexes won't be processed. content_type, parsers, include_s3_metadata, max_bytes, buffer_size, and encoding may also be set for each file selector.
- name: fips_enabled
type: bool
title: Enable S3 FIPS
default: false
multi: false
required: false
show_user: false
description: Enabling this option changes the service name from `s3` to `s3-fips` for connecting to the correct service endpoint.
- name: include_s3_metadata
type: text
title: Include S3 Metadata
multi: true
required: false
show_user: false
description: >-
      This input can include S3 object metadata in the generated events for use in follow-on processing. You must specify the list of keys to include. By default none are included. If the key exists in the S3 response then it will be included in the event as aws.s3.metadata.<key> where the key name has been normalized to all lowercase.
- name: max_bytes
type: text
title: Max Bytes
default: 10MiB
multi: false
required: false
show_user: false
description: The maximum number of bytes that a single log message can have. All bytes after max_bytes are discarded and not sent. This setting is especially useful for multiline log messages, which can get large. This only applies to non-JSON logs.
- name: max_number_of_messages
type: integer
title: Maximum Concurrent SQS Messages
    description: The maximum number of SQS messages that can be in flight at any time.
default: 5
required: false
show_user: false
- name: non_aws_bucket_name
type: text
title: Non AWS Bucket Name
multi: false
required: false
show_user: false
description: Name of the S3 bucket that will be polled for list operation. Required for 3rd party S3 compatible services. (Required when queue_url and bucket_arn are not set).
- name: path_style
type: text
title: Path Style
multi: false
required: false
show_user: false
description: >-
Enabling this option sets the bucket name as a path in the API call instead of a subdomain. When enabled https://<bucket-name>.s3.<region>.<provider>.com becomes https://s3.<region>.<provider>.com/<bucket-name>. This is only supported with 3rd party S3 providers. AWS does not support path style.
- name: provider
type: text
title: Provider Name
multi: false
required: false
show_user: false
description: Name of the 3rd party S3 bucket provider like backblaze or GCP.
- name: sqs.max_receive_count
type: integer
title: SQS Message Maximum Receive Count
multi: false
required: false
show_user: false
default: 5
    description: The maximum number of times an SQS message should be received (retried) before deleting it. This feature prevents poison-pill messages (messages that can be received but can't be processed) from consuming resources.
- name: sqs.wait_time
type: text
title: SQS Maximum Wait Time
multi: false
required: false
show_user: false
default: 20s
description: >-
The maximum duration that an SQS `ReceiveMessage` call should wait for a message to arrive in the queue before returning. The maximum value is `20s`.
- name: visibility_timeout
type: text
title: Visibility Timeout
multi: false
required: false
show_user: false
description: The duration that the received messages are hidden from subsequent retrieve requests after being retrieved by a ReceiveMessage request. The maximum is 12 hours.

Some files were not shown because too many files have changed in this diff.