[role="xpack"]
[[create-and-manage-rules]]
== Create and manage rules

The *{stack-manage-app}* > *{rules-ui}* UI provides a cross-app view of alerting.
Different {kib} apps like {observability-guide}/create-alerts.html[*{observability}*],
{security-guide}/prebuilt-rules.html[*Security*], <<geo-alerting,*Maps*>> and
<<xpack-ml,*{ml-app}*>> can offer their own rules. *{rules-ui}* provides a
central place to:

* <<create-edit-rules,Create and edit>> rules
* <<controlling-rules,Manage rules>> including enabling/disabling, muting/unmuting, and deleting
* Drill down to <<rule-details,rule details>>

[role="screenshot"]
image:images/rules-ui.png[Example rule listing in {rules-ui}]
// NOTE: This is an autogenerated screenshot. Do not edit it directly.

For more information on alerting concepts and the types of rules and connectors
available, go to <<alerting-getting-started>>.

[float]
=== Required permissions

Access to rules is granted based on your {alert-features} privileges. For
more information, go to <<alerting-security>>.

[float]
[[create-edit-rules]]
=== Create and edit rules

Some rules must be created within the context of a {kib} app like
<<metrics-app,Metrics>>, <<xpack-apm,APM>>, or <<uptime-app,Uptime>>, but others
are generic. To create a generic rule, click *Create rule* in *{rules-ui}*.
A flyout opens that guides you through selecting a rule type and configuring
its conditions and actions.

After a rule is created, you can open the action menu (…) and select *Edit rule*
to reopen the flyout and change the rule properties.

[float]
[[defining-rules-type-conditions]]
==== Rule type and conditions

Depending on the {kib} app and context, you might be prompted to choose the type of rule to create. Some apps will preselect the type of rule for you.

Each rule type provides its own way of defining the conditions to detect, but an expression formed by a series of clauses is a common pattern. For example, in an index threshold rule, the `WHEN` clause enables you to select an aggregation operation to apply to a numeric field.

[role="screenshot"]
image::images/rule-flyout-rule-conditions.png[UI for defining rule conditions on an index threshold rule,500]

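As a conceptual sketch of how a clause-based expression can be evaluated, the following Python models a threshold condition as an aggregation, a comparator, and a threshold applied to the numeric values in a time window. The function, table, and parameter names are hypothetical and chosen for illustration; this is not how {kib} implements index threshold rules.

```python
from statistics import mean
from typing import Callable, Dict, Sequence

# Hypothetical lookup tables, for illustration only.
AGGREGATIONS: Dict[str, Callable[[Sequence[float]], float]] = {
    "avg": mean,
    "sum": sum,
    "min": min,
    "max": max,
}

COMPARATORS: Dict[str, Callable[[float, float], bool]] = {
    ">": lambda value, threshold: value > threshold,
    ">=": lambda value, threshold: value >= threshold,
    "<": lambda value, threshold: value < threshold,
    "<=": lambda value, threshold: value <= threshold,
}


def condition_met(
    values: Sequence[float], when: str, comparator: str, threshold: float
) -> bool:
    """Evaluate: WHEN <when>(field) IS <comparator> <threshold>."""
    if not values:
        # Assumption: with no data in the window, the condition is not met.
        return False
    aggregated = AGGREGATIONS[when](values)
    return COMPARATORS[comparator](aggregated, threshold)


# CPU usage samples collected during one time window.
samples = [0.95, 0.97, 0.85]
print(condition_met(samples, when="avg", comparator=">", threshold=0.9))
print(condition_met(samples, when="min", comparator=">", threshold=0.9))
```

With these samples, the average exceeds 0.9 but the minimum does not, so the choice of aggregation in the `WHEN` clause changes whether the condition is detected.
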
All rules must have a check interval, which defines how often to evaluate the rule conditions. Checks are queued; they run as close to the defined interval as capacity allows.

For details on what types of rules are available and how to configure them, refer to <<rule-types>>.

[float]
[[defining-rules-actions-details]]
==== Actions

You can add one or more actions to your rule to generate notifications when its
conditions are met and when they are no longer met.

Each action uses a connector, which provides connection information for a {kib} service or third-party integration, depending on where you want to send the notifications. If no connectors exist, click **Add connector** to create one.

After you select a connector, set the action frequency. If the rule type supports alert summaries, you can choose to create a summary of alerts on each check interval or on a custom interval. For example, if you create a metrics threshold rule, you can send email notifications that summarize the new, ongoing, and recovered alerts each day:

[role="screenshot"]
image::images/rule-flyout-action-summary.png[UI for defining an alert summary action in a rule,500]

TIP: If you choose a custom action interval, it cannot be shorter than the rule's check interval.

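The constraint in the tip above can be expressed as a simple check. This sketch assumes that intervals are written as a whole number followed by a unit (`s`, `m`, `h`, or `d`), such as `1m` or `1d`. The helper names are hypothetical and are not part of {kib}.

```python
import re

# Seconds per supported unit (assumed notation: "30s", "5m", "1h", "1d").
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def interval_seconds(interval: str) -> int:
    """Convert an interval such as "5m" to a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", interval)
    if match is None:
        raise ValueError(f"Unsupported interval: {interval!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


def is_valid_action_interval(check_interval: str, action_interval: str) -> bool:
    """A custom action interval must not be shorter than the check interval."""
    return interval_seconds(action_interval) >= interval_seconds(check_interval)


print(is_valid_action_interval(check_interval="1m", action_interval="1d"))
print(is_valid_action_interval(check_interval="5m", action_interval="1m"))
```

The first call describes a rule that is checked every minute and sends a daily summary, which satisfies the constraint. The second call does not, because the action would need to run more often than the rule is checked.
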
Alternatively, you can set the action frequency so that the action runs for each alert. If the rule type does not support alert summaries, this is your only available option. You must choose when the action runs (for example, at each check interval, only when the alert status changes, or at a custom action interval). You must also choose an action group, which affects whether the action runs (for example, the action runs when the issue is detected or when it is recovered). Each rule type has a specific set of valid action groups.

[role="screenshot"]
image::images/rule-flyout-action-details.png[UI for defining an email action,500]

Each connector enables different action properties. For example, an email connector enables you to set the recipients, the subject, and a message body in Markdown format. For more information about connectors, refer to <<action-types>>.

[[alerting-concepts-suppressing-duplicate-notifications]]
[TIP]
==============================================
If you are not using alert summaries, actions are triggered per alert and a rule can end up generating a large number of actions. Consider the following example, where a rule monitors three servers every minute for CPU usage > 0.9 and is set to notify `On check intervals`:

* Minute 1: server X123 > 0.9. _One email_ is sent for server X123.
* Minute 2: X123 and Y456 > 0.9. _Two emails_ are sent, one for X123 and one for Y456.
* Minute 3: X123, Y456, Z789 > 0.9. _Three emails_ are sent, one for each of X123, Y456, Z789.

In this example, three emails are sent for server X123 in the span of 3 minutes for the same rule. Often, it's desirable to suppress these re-notifications. If
you set the rule notify setting to `On custom action intervals` with an interval of 5 minutes, you reduce noise by getting emails only every 5 minutes for
servers that continue to exceed the threshold:

* Minute 1: server X123 > 0.9. _One email_ is sent for server X123.
* Minute 2: X123 and Y456 > 0.9. _One email_ is sent for Y456.
* Minute 3: X123, Y456, Z789 > 0.9. _One email_ is sent for Z789.

To get notified only once when a server exceeds the threshold, you can set the rule notify setting to `On status changes`.
==============================================

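The three notify settings in the tip above can be compared with a small simulation. This is a simplified model, not the {kib} implementation: the function name and the mode strings (`check_interval`, `custom_interval`, `status_change`) are hypothetical stand-ins for the UI options, and recovery notifications are ignored.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple


def emails_per_check(
    checks: Iterable[Tuple[int, Set[str]]],
    notify_when: str,
    interval_minutes: Optional[int] = None,
) -> List[List[str]]:
    """Return, for each check, the servers that an email is sent for."""
    last_sent: Dict[str, int] = {}  # server -> minute of its last email
    previously_active: Set[str] = set()
    emails: List[List[str]] = []
    for minute, active in checks:
        sent_now: List[str] = []
        for server in sorted(active):
            if notify_when == "check_interval":
                notify = True
            elif notify_when == "status_change":
                notify = server not in previously_active
            elif notify_when == "custom_interval":
                if interval_minutes is None:
                    raise ValueError("custom_interval requires interval_minutes")
                last = last_sent.get(server)
                notify = last is None or minute - last >= interval_minutes
            else:
                raise ValueError(f"Unknown notify setting: {notify_when!r}")
            if notify:
                last_sent[server] = minute
                sent_now.append(server)
        emails.append(sent_now)
        previously_active = set(active)
    return emails


# (minute, servers whose CPU usage is above 0.9 at that check)
CHECKS = [
    (1, {"X123"}),
    (2, {"X123", "Y456"}),
    (3, {"X123", "Y456", "Z789"}),
]

for mode in ("check_interval", "custom_interval", "status_change"):
    print(mode, emails_per_check(CHECKS, mode, interval_minutes=5))
```

Over these three checks, the custom interval and status change settings send the same emails. They diverge once a server stays above the threshold for longer than the custom interval: in this model, a further check at minute 6 sends another email for X123 under the custom interval setting, but none under the status change setting.
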
[float]
[[defining-rules-actions-variables]]
==== Action variables

You can pass rule values to an action at the time a condition is detected.
To view the list of variables available for your rule, click the "add rule variable" button:

[role="screenshot"]
image::images/rule-flyout-action-variables.png[Passing rule values to an action,500]

For more information about common action variables, refer to <<rule-action-variables>>.

[float]
[[controlling-rules]]
=== Snooze and disable rules

The rule listing enables you to quickly snooze, disable, enable, or delete
individual rules. For example, you can change the state of a rule:

[role="screenshot"]
image:images/individual-enable-disable.png[Use the rule status dropdown to enable or disable an individual rule]
// NOTE: This is an autogenerated screenshot. Do not edit it directly.

When you snooze a rule, the rule checks continue to run on schedule, but
alerts do not trigger any actions. You can snooze for a specified period of
time, indefinitely, or schedule single or recurring downtimes:

[role="screenshot"]
image:images/snooze-panel.png[Snooze notifications for a rule]
// NOTE: This is an autogenerated screenshot. Do not edit it directly.

When a rule is in a `snoozed` state, you can cancel or change the duration of
this state.

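The snooze behavior described in this section can be summarized as a check on the current time. The following is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, and scheduled or recurring downtimes are not modeled.

```python
from datetime import datetime
from typing import Optional


def actions_suppressed(
    now: datetime,
    snoozed_until: Optional[datetime] = None,
    snoozed_indefinitely: bool = False,
) -> bool:
    """Return True while a snooze is in effect.

    The rule check itself still runs; only the actions are skipped.
    """
    if snoozed_indefinitely:
        return True
    return snoozed_until is not None and now < snoozed_until


snooze_end = datetime(2025, 1, 1, 14, 0)
print(actions_suppressed(datetime(2025, 1, 1, 12, 0), snoozed_until=snooze_end))
print(actions_suppressed(datetime(2025, 1, 1, 15, 0), snoozed_until=snooze_end))
```

In this model, canceling a snooze corresponds to clearing `snoozed_until`, and changing its duration corresponds to setting a new end time.
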
[float]
=== Rule status

A rule can have one of the following statuses:

`failed`:: The rule ran with errors.
`succeeded`:: The rule ran without errors.
`warning`:: The rule ran with some non-critical errors.

[float]
[[importing-and-exporting-rules]]
=== Import and export rules

To import and export rules, use <<managing-saved-objects,Saved Objects>>.

[NOTE]
==============================================
Some rule types cannot be exported through this interface:

**Security rules** can be imported and exported using the {security-guide}/rules-ui-management.html#import-export-rules-ui[Security UI].

**Stack monitoring rules** are <<kibana-alerts, automatically created>> for you and therefore cannot be managed in *Saved Objects*.
==============================================

Rules are disabled on export. After a successful import, you are prompted to re-enable them.

[role="screenshot"]
image::images/rules-imported-banner.png[Rules import banner,500]

[float]
[[rule-details]]
=== Drill down to rule details

Select a rule name from the rule listing to access the *Rule details* page, which tells you about the state of the rule and provides granular control over the actions it is taking.

[role="screenshot"]
image::images/rule-details-alerts-active.png[Rule details page with active alerts]

In this example, the rule detects when a site serves more than a threshold number of bytes in a 24-hour period. Four sites are above the threshold. Each occurrence of the detected condition is called an alert, and the alert name, status, time of detection, and duration of the condition are shown in this view. Alerts come and go from the list depending on whether the rule conditions are met.

When an alert is created, it generates actions. If the conditions that caused the alert persist, the actions run again according to the rule notification settings. There are two common alert statuses:

`active`:: The conditions for the rule are met, and actions should be generated according to the notification settings.
`recovered`:: The conditions for the rule are no longer met, and recovery actions should be generated.

You can suppress future actions for a specific alert by turning on the *Mute* toggle. If a muted alert no longer meets the rule conditions, it stays in the list to avoid generating actions if the conditions recur. You can also disable a rule, which stops it from running checks and clears any alerts it was tracking. You may want to disable rules that are not currently needed to reduce the load on {kib} and {es}.

[role="screenshot"]
image::images/rule-details-disabling.png[Use the disable toggle to turn off rule checks and clear alerts tracked]
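
The `active` and `recovered` statuses and the effect of the *Mute* toggle can be sketched as a timeline for a single alert. This is a simplified model with hypothetical names, not the {kib} implementation. It assumes that actions run on every check while the alert is active, that one recovery action runs when the conditions stop being met, and that muting suppresses both.

```python
from typing import Iterable, List, Optional, Tuple


def alert_timeline(
    condition_met: Iterable[bool], muted: bool = False
) -> List[Tuple[Optional[str], bool]]:
    """Return (status, action_runs) for each rule check of one alert."""
    timeline: List[Tuple[Optional[str], bool]] = []
    was_active = False
    for met in condition_met:
        if met:
            # Conditions are met: the alert is active.
            timeline.append(("active", not muted))
            was_active = True
        elif was_active:
            # Conditions were met on the previous check but no longer are.
            timeline.append(("recovered", not muted))
            was_active = False
        else:
            # No alert is tracked for this check.
            timeline.append((None, False))
    return timeline


checks = [True, True, False, False]
print(alert_timeline(checks))
print(alert_timeline(checks, muted=True))
```

With `muted=True`, the statuses are unchanged but no actions run. Disabling the rule would go further in this model: no checks would run at all, so the timeline would be empty.
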