[[xpack-ml]]
= {ml-cap}
:frontmatter-tags-products: [ml]
:frontmatter-tags-content-type: [overview]
:frontmatter-tags-user-goals: [analyze]
[partintro]
--
As data sets increase in size and complexity, the human effort required to
inspect dashboards or maintain rules for spotting infrastructure problems,
cyber attacks, or business issues becomes impractical. Elastic {ml-features}
such as {anomaly-detect} and {oldetection} make it easier to notice suspicious
activities with minimal human interference.
{kib} includes a free *{data-viz}* to learn more about your data. In particular,
if your data is stored in {es} and contains a time field, you can use the
*{data-viz}* to identify possible fields for {anomaly-detect}:
[role="screenshot"]
image::user/ml/images/ml-data-visualizer-sample.png[{data-viz} for sample flight data]
You can also upload a CSV, NDJSON, or log file. The *{data-viz}*
identifies the file format and field mappings. You can then optionally import
that data into an {es} index. To change the default file size limit, see
<<kibana-general-settings, fileUpload:maxFileSize advanced settings>>.
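
If you want to script a similar inspection of sample data, the {es} find
structure API analyzes text in much the same way and reports the detected
format and field mappings. The following is a deliberately tiny sketch with
made-up NDJSON lines; a real request would normally include a much larger
sample:

[source,console]
----
POST _text_structure/find_structure
{"message": "user logged in", "@timestamp": "2023-08-01T10:00:00Z", "response_time": 132}
{"message": "user logged out", "@timestamp": "2023-08-01T10:05:00Z", "response_time": 98}
{"message": "user logged in", "@timestamp": "2023-08-01T10:07:00Z", "response_time": 110}
----
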
If {stack-security-features} are enabled, users must have the necessary
privileges to use {ml-features}. Refer to
{ml-docs}/setup.html#setup-privileges[Set up {ml-features}].
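
For example, a simple way to grant a user broad access to the {ml-features} is
to assign the built-in `machine_learning_admin` role together with a role that
grants access to {ml-app} in {kib}. This is a minimal sketch only; the user
name, password, and exact role choices depend on your security setup:

[source,console]
----
POST /_security/user/ml_analyst
{
  "password": "a-long-placeholder-password",
  "roles": [ "machine_learning_admin", "kibana_admin" ]
}
----
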
NOTE: There are limitations in {ml-features} that affect {kib}. For more
information, refer to {ml-docs}/ml-limitations.html[{ml-cap}].
[discrete]
[[data-comparison-view]]
== Data comparison
preview::[]
You can find the data comparison view in **{ml-app}** > *{data-viz}* in {kib}.
The data comparison view shows you the differences in each field for two
different time ranges in a given {data-source}. The view helps you to visualize
the changes in your data over time and enables you to understand its behavior
better.
[role="screenshot"]
image::user/ml/images/ml-data-comparison.png[Data comparison view for {kib} Sample Data Flights]
Select a {data-source} that you want to analyze, then set a time range for the
reference data and for the comparison data in the histogram chart that appears.
You can adjust both time ranges by moving the respective brushes. When you have
finished setting the time ranges, click *Run analysis*.
You can decide whether you want to see all the fields in the {data-source} or
only the ones that contain drifted data. The analysis results table displays
the fields, their types, whether drift is detected, the p-value that indicates
how significant the detected change is, the reference and comparison
distributions, and the comparison chart. You can expand the results for a
particular field by clicking the arrow icon at the beginning of the field's row.
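
The comparison itself runs from the UI, but you can get an intuition for what
is being compared by aggregating the same field over the two time windows
yourself. The following sketch contrasts the distribution of
`DistanceKilometers` in the flights sample data for two adjacent weeks; the
field and time ranges are illustrative only, and this is not the exact analysis
the view performs:

[source,console]
----
GET kibana_sample_data_flights/_search
{
  "size": 0,
  "aggs": {
    "reference": {
      "filter": { "range": { "timestamp": { "gte": "now-14d", "lt": "now-7d" } } },
      "aggs": { "distance": { "percentiles": { "field": "DistanceKilometers" } } }
    },
    "comparison": {
      "filter": { "range": { "timestamp": { "gte": "now-7d", "lt": "now" } } },
      "aggs": { "distance": { "percentiles": { "field": "DistanceKilometers" } } }
    }
  }
}
----
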
--
[[xpack-ml-anomalies]]
== {anomaly-detect-cap}
:frontmatter-tags-products: [ml]
:frontmatter-tags-content-type: [overview]
:frontmatter-tags-user-goals: [analyze]
The Elastic {ml} {anomaly-detect} feature automatically models the normal
behavior of your time series data — learning trends, periodicity, and more — in
real time to identify anomalies, streamline root cause analysis, and reduce
false positives. {anomaly-detect-cap} runs in and scales with {es}, and
includes an intuitive UI on the {kib} *Machine Learning* page for creating
{anomaly-jobs} and understanding results.
If you have a license that includes the {ml-features}, you can
create {anomaly-jobs} and manage jobs and {dfeeds} from the *Job Management*
pane:
[role="screenshot"]
image::user/ml/images/ml-job-management.png[Job Management]
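
You can also create jobs and {dfeeds} with the {ml} APIs. The following minimal
sketch creates a job that models the mean of a response time metric and a
{dfeed} that feeds it; the job ID, field, and index pattern are placeholders:

[source,console]
----
PUT _ml/anomaly_detectors/response-times
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      { "function": "mean", "field_name": "response_time" }
    ]
  },
  "data_description": { "time_field": "@timestamp" }
}

PUT _ml/datafeeds/datafeed-response-times
{
  "job_id": "response-times",
  "indices": [ "my-logs-*" ]
}
----
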
You can use the *Settings* pane to create and edit calendars and the
filters that are used in custom rules:
[role="screenshot"]
image::user/ml/images/ml-settings.png[Calendar Management]
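
Calendars and filters can likewise be managed with the {ml} APIs. A brief
sketch with placeholder names:

[source,console]
----
PUT _ml/calendars/planned-maintenance
{
  "job_ids": [ "response-times" ]
}

PUT _ml/filters/safe_domains
{
  "description": "Domains to exclude with a custom rule",
  "items": [ "elastic.co", "wikipedia.org" ]
}
----
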
The *Anomaly Explorer* and *Single Metric Viewer* display the results of your
{anomaly-jobs}. For example:
[role="screenshot"]
image::user/ml/images/ml-single-metric-viewer.png[Single Metric Viewer]
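
The results shown in these viewers can also be retrieved with the get records
API, for example to list the highest-scoring anomalies of a job. The job ID is
a placeholder:

[source,console]
----
GET _ml/anomaly_detectors/response-times/results/records
{
  "sort": "record_score",
  "desc": true,
  "page": { "size": 5 }
}
----
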
You can optionally add annotations by drag-selecting a period of time in
the *Single Metric Viewer* and adding a description. For example, you can add an
explanation for anomalies in that time period or provide notes about what is
occurring in your operational environment at that time:
[role="screenshot"]
image::user/ml/images/ml-annotations-list.png[Single Metric Viewer with annotations]
In some circumstances, annotations are also added automatically. For example, if
the {anomaly-job} detects that there is missing data, it annotates the affected
time period. For more information, see
{ml-docs}/ml-delayed-data-detection.html[Handling delayed data]. The
*Job Management* pane shows the full list of annotations for each job.
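
Delayed data checks are configured per {dfeed}. As a minimal sketch, the
following request widens the check window of an existing {dfeed} (the {dfeed}
ID is a placeholder):

[source,console]
----
POST _ml/datafeeds/datafeed-response-times/_update
{
  "delayed_data_check_config": {
    "enabled": true,
    "check_window": "2h"
  }
}
----
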
NOTE: The {kib} {ml-features} use pop-ups. You must configure your web
browser so that it does not block pop-up windows, or create an exception for
your {kib} URL.
For more information about the {anomaly-detect} feature, see
https://www.elastic.co/what-is/elastic-stack-machine-learning[{ml-cap} in the {stack}]
and {ml-docs}/ml-ad-overview.html[{ml-cap} {anomaly-detect}].
[[xpack-ml-dfanalytics]]
== {dfanalytics-cap}
:frontmatter-tags-products: [ml]
:frontmatter-tags-content-type: [overview]
:frontmatter-tags-user-goals: [analyze]
The Elastic {ml} {dfanalytics} feature enables you to analyze your data using
{classification}, {oldetection}, and {regression} algorithms and generate new
indices that contain the results alongside your source data.
If you have a license that includes the {ml-features}, you can create
{dfanalytics-jobs} and view their results on the *Data Frame Analytics* page in
{kib}. For example:
[role="screenshot"]
image::user/ml/images/classification.png[{classification-cap} results in {kib}]
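
You can also create {dfanalytics-jobs} with the APIs. The following minimal
sketch trains a {classification} model that predicts the `Cancelled` field of
the flights sample data; the job and destination index names are placeholders:

[source,console]
----
PUT _ml/data_frame/analytics/flights-cancellation
{
  "source": { "index": "kibana_sample_data_flights" },
  "dest": { "index": "flights-cancellation-results" },
  "analysis": {
    "classification": { "dependent_variable": "Cancelled" }
  }
}

POST _ml/data_frame/analytics/flights-cancellation/_start
----
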
For more information about the {dfanalytics} feature, see
{ml-docs}/ml-dfanalytics.html[{ml-cap} {dfanalytics}].
[[xpack-ml-aiops]]
== AIOps Labs
:frontmatter-tags-products: [ml]
:frontmatter-tags-content-type: [overview]
:frontmatter-tags-user-goals: [analyze]
AIOps Labs is a part of {ml-app} in {kib}. It provides features that use
advanced statistical methods to help you interpret your data and its behavior.
[discrete]
[[log-rate-analysis]]
=== Log rate analysis
preview::[]
Log rate analysis uses advanced statistical methods to identify the reasons for
increases or decreases in log rates. It makes it easy to find and investigate
the causes of unusual spikes or drops by using the analysis workflow view.
Examine the histogram chart of the log rates for a given {data-source} and find
the reason behind a particular change, even when it is hidden in millions of
log events across multiple fields and values.
You can find log rate analysis under **{ml-app}** > **AIOps Labs** where
you can select the {data-source} or saved search that you want to analyze.
[role="screenshot"]
image::user/ml/images/ml-log-rate-analysis-before.png[Log event histogram chart]
Select a spike or drop in the log event histogram chart to start the analysis. It
identifies statistically significant field-value combinations that contribute to
the spike or drop and displays them in a table. You can optionally choose to summarize
the results into groups. The table also shows an indicator of the level of
impact and a sparkline showing the shape of the impact in the chart. Hovering
over a row displays the impact on the histogram chart in more detail. You can
inspect a field in **Discover**, further investigate in **Log pattern analysis**,
or copy the table row information as a query filter to the clipboard by
selecting the corresponding option under the **Actions** column. You can also
pin a table row by clicking it and then moving the cursor to the histogram
chart. The chart displays a tooltip with the exact count values for the pinned
field, which enables closer investigation.
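
The analysis runs entirely from this view, but if you are curious about the
underlying idea, comparing a deviation window against the rest of your data
with a `significant_terms` aggregation surfaces unusually frequent field values
in a loosely similar way. This is only a rough analogy, not the query the UI
runs, and the index, field, and time range are placeholders:

[source,console]
----
GET my-logs-*/_search
{
  "size": 0,
  "query": {
    "range": { "@timestamp": { "gte": "2023-08-01T10:00:00Z", "lt": "2023-08-01T11:00:00Z" } }
  },
  "aggs": {
    "unusual_values": {
      "significant_terms": { "field": "my_keyword_field" }
    }
  }
}
----
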
Brushes in the chart show the baseline time range and the deviation in the
analyzed data. You can move the brushes to redefine both the baseline and the
deviation and rerun the analysis with the modified values.
[role="screenshot"]
image::user/ml/images/ml-log-rate-analysis.png[Log rate spike explained]
[discrete]
[[log-pattern-analysis]]
=== Log pattern analysis
preview::[]
// The following intro is used on the `run-pattern-analysis-discover` page.
//tag::log-pattern-analysis-intro[]
Log pattern analysis helps you to find patterns in unstructured log messages and
makes it easier to examine your data. It performs categorization analysis on a
selected field of a {data-source}, creates categories based on the data and
displays them together with a chart that shows the distribution of each category
and an example document that matches the category.
//end::log-pattern-analysis-intro[]
You can find log pattern analysis under **{ml-app}** > **AIOps Labs** where you
can select the {data-source} or saved search that you want to analyze, or in
**Discover** as an available action for any text field.
[role="screenshot"]
image::user/ml/images/ml-log-pattern-analysis.png[Log pattern analysis UI]
Select a field for categorization and optionally apply any filters that you
want, then start the analysis. The analysis uses the same algorithms as a {ml}
categorization job. The results of the analysis are shown in a table that makes
it possible to open **Discover** and show or filter out the given category
there, which helps you to further examine your log messages.
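
If you want to experiment with this kind of categorization outside the UI, the
`categorize_text` aggregation in {es} groups messages in a comparable way. A
minimal sketch; the index and field names are placeholders:

[source,console]
----
GET my-logs-*/_search
{
  "size": 0,
  "aggs": {
    "message_categories": {
      "categorize_text": { "field": "message" }
    }
  }
}
----
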
[discrete]
[[change-point-detection]]
=== Change point detection
preview::[]
Change point detection uses the
{ref}/search-aggregations-change-point-aggregation.html[change point aggregation]
to detect distribution changes, trend changes, and other statistically
significant change points in a metric of your time series data.
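
The aggregation can also be called directly. The following minimal sketch looks
for a change point in the daily average of a metric; the index and field names
are placeholders, and the queried range must produce enough buckets (roughly
between 22 and 1,000) for the aggregation to run:

[source,console]
----
GET my-metrics-*/_search
{
  "size": 0,
  "aggs": {
    "daily": {
      "date_histogram": { "field": "@timestamp", "fixed_interval": "1d" },
      "aggs": {
        "avg_cpu": { "avg": { "field": "cpu_usage" } }
      }
    },
    "cpu_change_point": {
      "change_point": { "buckets_path": "daily>avg_cpu" }
    }
  }
}
----
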
You can find change point detection under **{ml-app}** > **AIOps Labs** where
you can select the {data-source} or saved search that you want to analyze.
[role="screenshot"]
image::user/ml/images/ml-change-point-detection.png[Change point detection UI]
Select a function and a metric field, then pick a date range to start detecting
change points in the defined range. Optionally, you can split the data by a
field. If the cardinality of the split field exceeds 10,000, only the first
10,000 values, sorted by document count, are analyzed. You can configure a
maximum of six combinations of a function applied to a metric field,
partitioned by a split field, to identify change points.
When a change point is detected, a row displays basic information including the
timestamp of the change point, a preview chart, the type of change point, its
p-value, and the name and value of the split field. You can further examine the
selected change point in a detailed view. A chart visualizes the identified
change point within the analyzed time window, making the interpretation easier.
If the analysis is split by a field, a separate chart is shown for every
partition that has a detected change point. The chart displays the type of
change point, its value, and the timestamp of the bucket where the change point
has been detected. The corresponding `p-value` indicates the magnitude of the
change; lower values indicate more significant changes. You can use the change
point type selector to filter the results by specific types of change points.
[role="screenshot"]
image::user/ml/images/ml-change-point-detection-selected.png[Selected change points]
You can attach change point charts to a dashboard or a case by using the context
menu. If a split field is selected, you can either select specific charts
(partitions) or set the maximum number of top change points to plot. It is
possible to preserve the applied time range or to use the time bounds from the
page date picker. You can also add or edit change point charts directly from the
**Dashboard** app.