Your window into the Elastic Stack
Chris Cowan 96dc2a5104
[SLO] Add dependencies for Burn Rate rule suppression (#177078)
## 🍒  Summary

This PR adds a rule dependency feature to the SLO Burn Rate rule to
enable rule suppression when one of the dependencies meets the
suppression criteria.

### 📟 Use case

When you add a rule dependency to your SLO Burn Rate rule, you will also
choose which action groups you want to suppress on. For example, if you
have rule `A` which depends on rule `B` and you want to suppress the
actions of rule `A` when rule `B` is triggering `Critical` or `High`,
you'd add rule `B` and pick the action groups `Critical` and `High`.
When rule `B` is triggering either of those action groups, ALL of the
actions for rule `A` will be suppressed.

When an action is suppressed, we trigger a `Suppressed` action
group and set the context variable `{context.suppressedAction}` to the
action that would have been triggered if the rule weren't suppressed. This
allows users to create an "action" for `Suppressed` alerts so they
can still send a notification without waking up the team for a
`Critical` or `High` severity alert.
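
As a rough sketch of the behavior described above (not the actual rule executor code), the swap to the `Suppressed` action group might look like the following. The `ScheduledAction` type and the `scheduleAction` function are hypothetical names made up for this illustration, and only the `Critical` and `High` groups named in this description are modeled.

```typescript
// Hypothetical types for illustration; these are NOT Kibana's alerting types.
type Severity = 'Critical' | 'High';

interface ScheduledAction {
  actionGroup: Severity | 'Suppressed';
  context: { suppressedAction?: Severity };
}

// When suppressed, emit the `Suppressed` group and remember the original
// action group in `context.suppressedAction`; otherwise pass it through.
function scheduleAction(triggered: Severity, suppressed: boolean): ScheduledAction {
  if (!suppressed) {
    return { actionGroup: triggered, context: {} };
  }
  return { actionGroup: 'Suppressed', context: { suppressedAction: triggered } };
}

console.log(scheduleAction('Critical', true));
// { actionGroup: 'Suppressed', context: { suppressedAction: 'Critical' } }
```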

If you have 2 rules that use a group-by, the suppression
happens on the intersection of the `slo.instanceId` values. For example, imagine
we have an Nginx Proxy in front of a Node.js web service and we've
created an availability SLO based on `status_code < 500` for both,
grouped by `url.domain`. When the Node.js app responds with a `500`, the
Nginx Proxy's SLO will start to degrade because of the Node.js service.
The admins for the Nginx Proxy would like to receive alerts only if the
Node.js web service is "healthy", so they've listed the Node.js burn
rate rule as a dependency to suppress on `Critical` or `High` burn
rates.

When one of the domains, `you-got.mail`, starts to throw `500`s and the
burn rate becomes `High`, the rule will suppress the alert for the
`you-got.mail` Nginx Proxy instance. If one of the other domains,
`box.mail`, started throwing `502`s from Nginx because of a
misconfiguration, the alert would trigger normally because the
`box.mail` instance of the rule dependency for the Node.js web service
is still healthy (or at least not triggering `Critical` or `High`).

The suppression between group-by SLOs and non-group-by SLOs works like
this:

- When an SLO with a group-by depends on an SLO without a group-by, all the
instances of the group-by SLO will be suppressed.
- When an SLO without a group-by depends on an SLO with a group-by, the
non-grouped SLO will be suppressed if ANY of the instances of the
group-by SLO are triggering the "suppress on" action groups.
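
The matching rules above, together with the intersection rule for two group-by SLOs, can be sketched as a single predicate. This is a minimal illustration under assumptions: `DependencyState` and `isSuppressed` are hypothetical names, and the dependency is assumed to have already been evaluated into the set of instanceIds currently triggering a "suppress on" action group.

```typescript
// Hypothetical shape for illustration; not Kibana's internal types.
interface DependencyState {
  grouped: boolean; // does the dependency SLO use a group-by?
  // instanceIds currently triggering one of the "suppress on" action groups
  triggeringInstanceIds: Set<string>;
}

function isSuppressed(
  primaryInstanceId: string,
  primaryGrouped: boolean,
  dep: DependencyState
): boolean {
  // The dependency is not triggering any "suppress on" group: nothing to suppress.
  if (dep.triggeringInstanceIds.size === 0) return false;
  // Non-grouped dependency: every instance of the primary SLO is suppressed.
  if (!dep.grouped) return true;
  // Non-grouped primary, grouped dependency: ANY triggering instance suppresses.
  if (!primaryGrouped) return true;
  // Both grouped: suppress only on the intersection of instanceIds.
  return dep.triggeringInstanceIds.has(primaryInstanceId);
}

// The Nginx Proxy / Node.js example: only `you-got.mail` is degraded.
const nodeJsDependency: DependencyState = {
  grouped: true,
  triggeringInstanceIds: new Set(['you-got.mail']),
};
console.log(isSuppressed('you-got.mail', true, nodeJsDependency)); // true
console.log(isSuppressed('box.mail', true, nodeJsDependency)); // false
```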

### 💻 Screenshots

Adding a rule dependency for MongoDB to a Node.js web app

<img width="764" alt="image"
src="da2fd411-2a8e-4433-a505-2c4111e115be">

In this scenario, Nginx proxies to the Admin Console, which reads data from
MongoDB. A network outage on the connection between MongoDB and the Admin
Console causes the MongoDB rule to trigger a `Critical` action group, which
suppresses the `Critical` action for the Admin Console. The Admin Console
rule also goes `Critical`, which in turn suppresses the rule for the
Nginx Proxy.

<img width="1784" alt="image"
src="2db75993-8912-4769-83f8-240de811a92f">

### ⚙️ How it works

- Execute the primary rule and evaluate whether it should trigger any actions.
- If the primary rule is triggering, execute each of the dependencies
(in the same process using the same function) and suppress as follows:
  - For a group-by SLO that depends on another SLO with a group-by, we
    suppress the intersection between the instanceIds.
  - For a group-by SLO that depends on a non-group-by SLO, we suppress all
    the instanceIds.
  - For a non-group-by SLO that depends on a group-by SLO, we suppress if
    ANY instanceId matches. (not recommended)
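
The ordering of the steps above (dependencies are evaluated only once the primary rule is triggering) might be sketched like this. The `Alert` and `Dependency` shapes and the `collectSuppressingInstanceIds` function are hypothetical and exist only for this illustration; the real rule executor is asynchronous and considerably more involved.

```typescript
// Hypothetical shapes for illustration only.
interface Alert {
  instanceId: string;
  actionGroup: string;
}

interface Dependency {
  evaluate: () => Alert[]; // runs the dependency rule's evaluation
  suppressOn: string[]; // action groups that cause suppression
}

// Returns the dependency instanceIds that meet the suppression criteria.
// Dependencies are evaluated only when the primary rule is triggering.
function collectSuppressingInstanceIds(
  primaryAlerts: Alert[],
  dependencies: Dependency[]
): Set<string> {
  const suppressing = new Set<string>();
  if (primaryAlerts.length === 0) {
    return suppressing; // primary is not triggering, so skip the extra work
  }
  for (const dep of dependencies) {
    for (const alert of dep.evaluate()) {
      if (dep.suppressOn.includes(alert.actionGroup)) {
        suppressing.add(alert.instanceId);
      }
    }
  }
  return suppressing;
}
```

How the returned instanceIds are matched against the primary rule's instances then depends on which of the three group-by combinations applies.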

### 🔬 How to test

- Add the following lines to your `config/kibana.dev.yaml`:
  - `server.basePath: '/kibana'`
  - `server.publicBaseUrl: 'http://localhost:5601/kibana'`
- Start with the following command: `node x-pack/scripts/data_forge.js
--events-per-cycle 50 --lookback now-1d --dataset fake_stack
--install-kibana-assets --kibana-url http://localhost:5601/kibana
--event-template good`
- Wait till the log message says `info Waiting 60000ms`
- Create 2 SLOs:
  - "Admin Console Availability" using the "Custom Query" SLI with the
    `Admin Console` DataView; set the "Good query" to
    `http.response.status_code < 500` and the "Total query" to
    `http.response.status_code: *`, using a rolling `7d` time window
  - "MongoDB Availability" using the "Custom Query" SLI with the
    `Heartbeat` DataView; set the "Good query" to `event.outcome: "success"`
    and the "Total query" to `event.outcome: *`, using a rolling `7d`
    time window
- You should have 2 burn rate rules that were created by default
- Open the "Admin Console Availability Burn Rate rule" and add the
"MongoDB Availability Burn Rate rule" as the dependency with `Critical`
and `High` action groups to "Suppress on".
- Save the rule
- Stop the first `data_forge.js` command
- Start `node x-pack/scripts/data_forge.js --events-per-cycle 50
--lookback now --dataset fake_stack --install-kibana-assets --kibana-url
http://localhost:5601/kibana --event-template bad`

Once the Burn Rate rules go `Critical`, the reason message for the "MongoDB
Availability Burn Rate rule" should start with `CRITICAL:...` and the
reason message for the "Admin Console Availability Burn Rate rule"
should start with `SUPPRESSED - CRITICAL: ...`

Fixes #173653

---------

Co-authored-by: Panagiota Mitsopoulou <giota85@gmail.com>
Co-authored-by: Dominique Clarke <doclarke71@gmail.com>
Co-authored-by: Kevin Delemme <kdelemme@gmail.com>
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Dominique Clarke <dominique.clarke@elastic.co>
2024-04-16 06:25:50 -04:00
.buildkite build: remove requirement to clone open-source repo (#180715) 2024-04-15 15:10:46 -05:00
.github [CODEOWNERS] add owners for all svl FTR tests (#180802) 2024-04-16 11:15:02 +02:00
api_docs [api-docs] 2024-04-16 Daily api_docs build (#180862) 2024-04-16 05:20:20 +00:00
config [Search] Introduced Notebooks view for console (#180400) 2024-04-15 11:10:28 -05:00
dev_docs [Docs] SO migration on serverless tutorial notes (#179261) 2024-03-27 00:37:37 +01:00
docs [DOCS] Adding to release notes (#180490) 2024-04-16 09:59:29 +01:00
examples [embeddable rebuild] change onFetchContextChanged from accepting callback to returning an observable (#180410) 2024-04-11 13:45:22 -06:00
kbn_pm Replace deprecated node-sass with sass #2 (#173942) 2023-12-28 10:35:17 -06:00
legacy_rfcs rename @elastic/* packages to @kbn/* (#138957) 2022-08-18 08:54:42 -07:00
licenses build: remove requirement to clone open-source repo (#180715) 2024-04-15 15:10:46 -05:00
packages [Security Solution] [AI Insights] AI Insights (#180611) 2024-04-16 11:34:15 +02:00
plugins
scripts Fix SAML provider incorrectly added to Docker SSL (#178555) 2024-03-12 12:40:11 -07:00
src [Security Solution][Detection Engine] fixes warning toasts on exception flyout (#180800) 2024-04-16 11:18:51 +01:00
test [Security Solution] Add guided tour to document details flyout (#180318) 2024-04-15 12:42:50 -07:00
typings Remove legacy kibana react code editor (#171047) 2024-01-05 14:35:09 +01:00
x-pack [SLO] Add dependencies for Burn Rate rule suppression (#177078) 2024-04-16 06:25:50 -04:00
.backportrc.json chore(NA): adds 8.14 into backportrc (#176936) 2024-02-14 19:48:14 +00:00
.bazelignore Remove references to deleted .ci folder (#177168) 2024-02-20 19:54:21 +01:00
.bazeliskversion chore(NA): upgrade bazelisk into v1.11.0 (#125070) 2022-02-09 20:43:57 +00:00
.bazelrc chore(NA): use new and more performant BuildBuddy servers (#130350) 2022-04-18 02:01:38 +01:00
.bazelrc.common Transpile packages on demand, validate all TS projects (#146212) 2022-12-22 19:00:29 -06:00
.bazelversion chore(NA): revert bazel upgrade for v5.2.0 (#135096) 2022-06-24 03:57:21 +01:00
.browserslistrc
.editorconfig .editorconfig MDX files should follow the same rules as MD (#96942) 2021-04-13 11:40:42 -04:00
.eslintignore [ES|QL] New @kbn/esql-services package (#179029) 2024-03-27 14:39:48 +01:00
.eslintrc.js [ESLint i18n] Add FormattedMessage start with the right ID (#180048) 2024-04-05 18:17:01 +02:00
.gitattributes
.gitignore [Moving] Move APM and APM_Data_Access folders into /x-pack/observability_solution/ (#177433) 2024-02-23 09:56:21 -07:00
.i18nrc.json Share Modal (#179037) 2024-04-04 09:06:14 -07:00
.node-version Upgrade Node.js to v20.12.2 (#180522) 2024-04-11 08:56:38 -05:00
.npmrc [npmrc] Fix puppeteer_skip_download configuration (#177673) 2024-02-22 18:59:01 -07:00
.nvmrc Upgrade Node.js to v20.12.2 (#180522) 2024-04-11 08:56:38 -05:00
.prettierignore
.prettierrc
.puppeteerrc Add .puppeteerrc (#179847) 2024-04-03 09:14:39 -05:00
.stylelintignore chore(NA): stop grouping bazel out symlink folders (#96066) 2021-04-01 14:16:14 -05:00
.stylelintrc Bump stylelint to ^14 (#136693) 2022-07-20 10:11:00 -05:00
.telemetryrc.json [Telemetry] Fix telemetry-tools TS parser for packages (#149819) 2023-01-31 04:09:09 +03:00
.yarnrc chore(NA): manage npm dependencies within bazel (#92864) 2021-03-03 12:37:20 -05:00
BUILD.bazel Transpile packages on demand, validate all TS projects (#146212) 2022-12-22 19:00:29 -06:00
catalog-info.yaml [BK] Add template for pipeline defs (#180189) 2024-04-08 11:21:28 +02:00
CODE_OF_CONDUCT.md
CONTRIBUTING.md Update doc slugs to improve analytic tracking, move to appropriate folders (#113630) 2021-10-04 13:36:45 -04:00
FAQ.md Fix small typos in the root md files (#134609) 2022-06-23 09:36:11 -05:00
fleet_packages.json [main] Sync bundled packages with Package Storage (#179927) 2024-04-03 09:01:30 -07:00
github_checks_reporter.json
kibana.d.ts fix all violations 2022-04-16 01:37:30 -05:00
LICENSE.txt
nav-kibana-dev.docnav.json lens config builder docs (#177993) 2024-03-12 10:22:42 +01:00
NOTICE.txt Copy assets from appropriate directory for kbn-monaco (#178669) 2024-03-21 16:29:20 +01:00
package.json [Obs AI Assistant] ai assistant system connector (#179980) 2024-04-15 22:22:06 +02:00
preinstall_check.js Always throw error objects - never strings (#171498) 2023-11-20 09:23:16 -05:00
README.md [README] Update version Compatibility with Elasticsearch (#116040) 2022-01-10 10:31:21 -05:00
renovate.json Upgrade nodemailer dependency 6.6.2 -> 6.9.9 (#176487) 2024-03-19 17:00:55 +01:00
RISK_MATRIX.mdx Add "Risk Matrix" section to the PR template (#100649) 2021-06-02 14:43:47 +02:00
SECURITY.md
sonar-project.properties [ci] Run sonarqube daily (#173961) 2024-01-03 15:43:29 -06:00
STYLEGUIDE.mdx [styleguide] update path to scss theme (#140742) 2022-09-15 10:41:14 -04:00
tsconfig.base.json [Search] Introduced Notebooks view for console (#180400) 2024-04-15 11:10:28 -05:00
tsconfig.browser.json
tsconfig.browser_bazel.json [build_ts_refs] improve caches, allow building a subset of projects (#107981) 2021-08-10 22:12:45 -07:00
tsconfig.json Transpile packages on demand, validate all TS projects (#146212) 2022-12-22 19:00:29 -06:00
TYPESCRIPT.md Fix small typos in the root md files (#134609) 2022-06-23 09:36:11 -05:00
versions.json chore(NA): update versions after v7.17.21 bump (#180298) 2024-04-10 20:47:43 +01:00
WORKSPACE.bazel Upgrade Node.js to v20.12.2 (#180522) 2024-04-11 08:56:38 -05:00
yarn.lock [Obs AI Assistant] ai assistant system connector (#179980) 2024-04-15 22:22:06 +02:00

Kibana

Kibana is your window into the Elastic Stack. Specifically, it's a browser-based analytics and search dashboard for Elasticsearch.

Getting Started

If you just want to try Kibana out, check out the Elastic Stack Getting Started Page to give it a whirl.

If you're interested in diving a bit deeper and getting a taste of Kibana's capabilities, head over to the Kibana Getting Started Page.

Using a Kibana Release

If you want to use a Kibana release in production, give it a test run, or just play around:

Building and Running Kibana, and/or Contributing Code

You might want to build Kibana locally to contribute some code, test out the latest features, or try out an open PR:

Documentation

Visit Elastic.co for the full Kibana documentation.

For information about building the documentation, see the README in elastic/docs.

Version Compatibility with Elasticsearch

Ideally, you should be running Elasticsearch and Kibana with matching version numbers. If Elasticsearch has an older major or minor number, or a newer major number, than Kibana, then Kibana will fail to run. If Elasticsearch has a newer minor number than Kibana, or a patch number that differs from Kibana's, then the Kibana server will log a warning.

Note: The version numbers below are only examples, meant to illustrate the relationships between different types of version numbers.

| Situation | Example Kibana version | Example ES version | Outcome |
| --- | --- | --- | --- |
| Versions are the same. | 7.15.1 | 7.15.1 | 💚 OK |
| ES patch number is newer. | 7.15.0 | 7.15.1 | ⚠️ Logged warning |
| ES minor number is newer. | 7.14.2 | 7.15.0 | ⚠️ Logged warning |
| ES major number is newer. | 7.15.1 | 8.0.0 | 🚫 Fatal error |
| ES patch number is older. | 7.15.1 | 7.15.0 | ⚠️ Logged warning |
| ES minor number is older. | 7.15.1 | 7.14.2 | 🚫 Fatal error |
| ES major number is older. | 8.0.0 | 7.15.1 | 🚫 Fatal error |
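
The outcomes in the table above can be expressed as a small function. This is only a sketch derived from the table rows, not Kibana's actual startup check; `parseVersion` and `checkCompatibility` are hypothetical names, and the parser assumes plain `major.minor.patch` strings with no pre-release suffix.

```typescript
type Outcome = 'ok' | 'warning' | 'fatal';

// Parses "major.minor.patch"; rejects anything else (including suffixes).
function parseVersion(version: string): [number, number, number] {
  const parts = version.split('.').map(Number);
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`Invalid version: ${version}`);
  }
  return [parts[0], parts[1], parts[2]];
}

function checkCompatibility(kibana: string, es: string): Outcome {
  const [kMajor, kMinor, kPatch] = parseVersion(kibana);
  const [eMajor, eMinor, ePatch] = parseVersion(es);
  if (eMajor !== kMajor) return 'fatal'; // newer or older major
  if (eMinor < kMinor) return 'fatal'; // older minor
  if (eMinor > kMinor) return 'warning'; // newer minor
  return ePatch === kPatch ? 'ok' : 'warning'; // any patch difference
}

console.log(checkCompatibility('7.15.1', '7.15.1')); // ok
console.log(checkCompatibility('7.14.2', '7.15.0')); // warning
console.log(checkCompatibility('8.0.0', '7.15.1')); // fatal
```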

Questions? Problems? Suggestions?

  • If you've found a bug or want to request a feature, please create a GitHub Issue. Please check to make sure someone else hasn't already created an issue for the same topic.
  • Need help using Kibana? Ask away on our Kibana Discuss Forum and a fellow community member or Elastic engineer will be glad to help you out.