mirror of https://github.com/elastic/kibana.git
synced 2025-04-23 17:28:26 -04:00

This commit is contained in:
parent c08c6a7bb7
commit db9fc60384

9 changed files with 27 additions and 201 deletions
@@ -16,3 +16,14 @@ See <<monitoring-kibana>>.

== Upgrade Assistant

See <<upgrade-assistant>>.

[role="exclude",id="ml-jobs"]
== Creating {anomaly-jobs}

This page has moved. Please see {stack-ov}/create-jobs.html[Creating {anomaly-jobs}].

[role="exclude",id="job-tips"]
== Machine learning job tips

This page has moved. Please see {stack-ov}/job-tips.html[Machine learning job tips].
12 docs/user/extend.asciidoc Normal file
@@ -0,0 +1,12 @@
[[extend]]
= Extend your use case

[partintro]
--
//TBD

* <<xpack-ml>>

--

include::ml/index.asciidoc[]
@@ -20,7 +20,7 @@ include::timelion.asciidoc[]

include::canvas.asciidoc[]

include::ml/index.asciidoc[]
include::extend.asciidoc[]

include::{kib-repo-dir}/maps/index.asciidoc[]
@ -1,67 +0,0 @@
|
|||
[role="xpack"]
|
||||
[[ml-jobs]]
|
||||
== Creating {anomaly-jobs}
|
||||
|
||||
{anomaly-jobs-cap} contain the configuration information and metadata
|
||||
necessary to perform an analytics task.
|
||||
|
||||
{kib} provides the following wizards to make it easier to create jobs:
|
||||
|
||||
[role="screenshot"]
|
||||
image::user/ml/images/ml-create-job.jpg[Create New Job]
|
||||
|
||||
A _single metric job_ is a simple job that contains a single _detector_. A
|
||||
detector defines the type of analysis that will occur and which fields to
|
||||
analyze. In addition to limiting the number of detectors, the single metric job
|
||||
creation wizard omits many of the more advanced configuration options.
|
||||
|
||||
A _multi-metric job_ can contain more than one detector, which is more efficient
|
||||
than running multiple jobs against the same data.
|
||||
|
||||
A _population job_ detects activity that is unusual compared to the behavior of
|
||||
the population. For more information, see
|
||||
{stack-ov}/ml-configuring-pop.html[Performing population analysis].
|
||||
|
||||
An _advanced job_ can contain multiple detectors and enables you to configure all
|
||||
job settings.
|
||||
|
||||
{kib} can also recognize certain types of data and provide specialized wizards
|
||||
for that context. For example, if you use {filebeat-ref}/index.html[{filebeat}]
|
||||
to ship access logs from your
|
||||
http://nginx.org/[Nginx] and https://httpd.apache.org/[Apache] HTTP servers to
|
||||
{es}, the following wizards appear:
|
||||
|
||||
[role="screenshot"]
|
||||
image::user/ml/images/ml-data-recognizer-filebeat.jpg[A screenshot of the {filebeat} job creation wizards]
|
||||
|
||||
Likewise, if you use {auditbeat-ref}/index.html[{auditbeat}] to audit process
|
||||
activity on your systems, the following wizards appear:
|
||||
|
||||
[role="screenshot"]
|
||||
image::user/ml/images/ml-data-recognizer-auditbeat.jpg[A screenshot of the {auditbeat} job creation wizards]
|
||||
|
||||
These wizards create {anomaly-jobs}, dashboards, searches, and visualizations that
|
||||
are customized to help you analyze your {auditbeat} and {filebeat} data.
|
||||
|
||||
If you are not certain which type of job to create, you can use the
|
||||
*Data Visualizer* to learn more about your data. If your index pattern contains
|
||||
a time field, it can identify possible fields for {ml} analysis.
|
||||
|
||||
[NOTE]
|
||||
===============================
|
||||
If your data is located outside of {es}, you cannot use {kib} to create
|
||||
your jobs and you cannot use {dfeeds} to retrieve your data in real time.
|
||||
{anomaly-detect-cap} is still possible, however, by using APIs to
|
||||
create and manage jobs and post data to them. For more information, see
|
||||
{ref}/ml-apis.html[Machine Learning APIs].
|
||||
===============================
|
||||
|
||||
Ready to get some hands-on experience? See
|
||||
{stack-ov}/ml-getting-started.html[Getting Started with Machine Learning].
|
||||
|
||||
The following video tutorials also demonstrate single metric, multi-metric, and
|
||||
advanced jobs:
|
||||
|
||||
* https://www.elastic.co/videos/machine-learning-tutorial-creating-a-single-metric-job[Machine Learning for the Elastic Stack: Creating a single metric job]
|
||||
* https://www.elastic.co/videos/machine-learning-tutorial-creating-a-multi-metric-job[Machine Learning for the Elastic Stack: Creating a multi-metric job]
|
||||
* https://www.elastic.co/videos/machine-learning-lab-3-detect-outliers-in-a-population[Machine Learning for the Elastic Stack: Detect Outliers in a Population]
|
Binary file not shown. Before: Size: 359 KiB
Binary file not shown. Before: Size: 173 KiB
Binary file not shown. Before: Size: 169 KiB
@@ -1,9 +1,6 @@
[role="xpack"]
[[xpack-ml]]
= {ml-cap}

[partintro]
--
== {ml-cap}

As datasets increase in size and complexity, the human effort required to
inspect dashboards or maintain rules for spotting infrastructure problems,
@@ -29,7 +26,7 @@ The *Data Visualizer* identifies the file format and field mappings. You can then
optionally import that data into an {es} index.

If you have a trial or platinum license, you can
<<ml-jobs,create {anomaly-jobs}>> and manage jobs and {dfeeds} from the *Job
create {anomaly-jobs} and manage jobs and {dfeeds} from the *Job
Management* pane:

[role="screenshot"]
@ -67,10 +64,6 @@ web browser so that it does not block pop-up windows or create an exception for
|
|||
your {kib} URL.
|
||||
|
||||
For more information about the {anomaly-detect} feature, see
|
||||
https://www.elastic.co/what-is/elastic-stack-machine-learning and
|
||||
{stack-ov}/xpack-ml.html[{ml-cap} {anomaly-detect}].
|
||||
|
||||
--
|
||||
|
||||
include::creating-jobs.asciidoc[]
|
||||
include::job-tips.asciidoc[]
|
||||
|
||||
|
|
|
@ -1,123 +0,0 @@
|
|||
[role="xpack"]
|
||||
[[job-tips]]
|
||||
=== Machine learning job tips
|
||||
++++
|
||||
<titleabbrev>Job tips</titleabbrev>
|
||||
++++
|
||||
|
||||
When you are creating a job in {kib}, the job creation wizards can provide
|
||||
advice based on the characteristics of your data. By heeding these suggestions,
|
||||
you can create jobs that are more likely to produce insightful {ml} results.
|
||||
|
||||
[[bucket-span]]
|
||||
==== Bucket span
|
||||
|
||||
The bucket span is the time interval that {ml} analytics use to summarize and
|
||||
model data for your job. When you create a job in {kib}, you can choose to
|
||||
estimate a bucket span value based on your data characteristics.
|
||||
|
||||
NOTE: The bucket span must contain a valid time interval. For more information,
|
||||
see {ref}/ml-job-resource.html#ml-analysisconfig[Analysis configuration objects].
|
||||
|
||||
If you choose a value that is larger than one day or is significantly different
|
||||
than the estimated value, you receive an informational message. For more
|
||||
information about choosing an appropriate bucket span, see
|
||||
{xpack-ref}/ml-buckets.html[Buckets].
|
||||
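The bucket span advice above can be sketched in a few lines. This is a hypothetical illustration, not {kib}'s actual validation code: the helper names are invented, and the "significantly different" threshold (4x in either direction here) is an assumption; only the larger-than-one-day rule comes from the text.

```python
import re

# Seconds per interval unit; a minimal subset of valid time intervals.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_interval(interval: str) -> int:
    """Parse a time interval such as '15m' or '1h' into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", interval)
    if not match:
        raise ValueError(f"not a valid time interval: {interval!r}")
    value, unit = match.groups()
    return int(value) * UNIT_SECONDS[unit]

def bucket_span_advice(chosen: str, estimated: str) -> list[str]:
    """Return informational messages for a chosen bucket span."""
    chosen_s = parse_interval(chosen)
    estimated_s = parse_interval(estimated)
    messages = []
    if chosen_s > UNIT_SECONDS["d"]:
        messages.append("bucket span is larger than one day")
    # "Significantly different" is vague in prose; assume 4x either way.
    if chosen_s > 4 * estimated_s or estimated_s > 4 * chosen_s:
        messages.append(f"bucket span differs significantly from the estimate {estimated}")
    return messages
```

For example, `bucket_span_advice("2d", "15m")` triggers both messages, while a choice close to the estimate returns none.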
[[cardinality]]
==== Cardinality

If there are logical groupings of related entities in your data, {ml} analytics
can make data models and generate results that take these groupings into
consideration. For example, you might choose to split your data by user ID and
detect when users are accessing resources differently than they usually do.

If the field that you use to split your data has many different values, the
job uses more memory resources. In particular, if the cardinality of the
`by_field_name`, `over_field_name`, or `partition_field_name` is greater than
1000, you are advised that there might be high memory usage.

Likewise, if you are performing population analysis and the cardinality of the
`over_field_name` is below 10, you are advised that this might not be a suitable
field to use. For more information, see
{xpack-ref}/ml-configuring-pop.html[Performing Population Analysis].
[[detectors]]
==== Detectors

Each job must have one or more _detectors_. A detector applies an analytical
function to specific fields in your data. If your job does not contain a
detector or the detector does not contain a
{stack-ov}/ml-functions.html[valid function], you receive an error.

If a job contains duplicate detectors, you also receive an error. Detectors are
duplicates if they have the same `function`, `field_name`, `by_field_name`,
`over_field_name`, and `partition_field_name`.
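The duplicate rule above is a straightforward five-field comparison, which can be sketched as follows. This is a minimal illustration under the stated rule; the function name is hypothetical.

```python
# Two detectors are duplicates when these five fields all match.
DUPLICATE_KEYS = ("function", "field_name", "by_field_name",
                  "over_field_name", "partition_field_name")

def find_duplicate_detectors(detectors: list[dict]) -> list[tuple]:
    """Return the (function, field, by, over, partition) combinations
    that occur more than once in the detector list."""
    seen, duplicates = set(), []
    for detector in detectors:
        key = tuple(detector.get(k) for k in DUPLICATE_KEYS)
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates
```

Missing fields default to `None`, so two `{"function": "mean", "field_name": "bytes"}` detectors count as duplicates even though neither sets a split field.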
[[influencers]]
==== Influencers

When you create a job, you can specify _influencers_, which are also sometimes
referred to as _key fields_. Picking an influencer is strongly recommended for
the following reasons:

* It allows you to more easily assign blame for the anomaly
* It simplifies and aggregates the results

The best influencer is the person or thing that you want to blame for the
anomaly. In many cases, users or client IP addresses make excellent influencers.
Influencers can be any field in your data; they do not need to be fields that
are specified in your detectors, though they often are.

As a best practice, do not pick too many influencers. For example, you generally
do not need more than three. If you pick many influencers, the results can be
overwhelming and there is a small overhead to the analysis.

The job creation wizards in {kib} can suggest which fields to use as influencers.
[[model-memory-limits]]
==== Model memory limits

For each job, you can optionally specify a `model_memory_limit`, which is the
approximate maximum amount of memory resources that are required for analytical
processing. The default value is 1 GB. Once this limit is approached, data
pruning becomes more aggressive. Upon exceeding this limit, new entities are not
modeled.

You can also optionally specify the `xpack.ml.max_model_memory_limit` setting.
By default, it's not set, which means there is no upper bound on the acceptable
`model_memory_limit` values in your jobs.

TIP: If you set the `model_memory_limit` too high, it will be impossible to open
the job; jobs cannot be allocated to nodes that have insufficient memory to run
them.

If the estimated model memory requirement for a job is greater than the model memory
limit for the job or the maximum model memory limit for the cluster, the job
creation wizards in {kib} generate a warning. If the estimated memory
requirement is only a little higher than the `model_memory_limit`, the job will
probably produce useful results. Otherwise, the actions you take to address
these warnings vary depending on the resources available in your cluster:

* If you are using the default value for the `model_memory_limit` and the {ml}
nodes in the cluster have lots of memory, the best course of action might be to
simply increase the job's `model_memory_limit`. Before doing this, however,
double-check that the chosen analysis makes sense. The default
`model_memory_limit` is relatively low to avoid accidentally creating a job that
uses a huge amount of memory.
* If the {ml} nodes in the cluster do not have sufficient memory to accommodate
a job of the estimated size, the only options are:
** Add bigger {ml} nodes to the cluster, or
** Accept that the job will hit its memory limit and will not necessarily find
all the anomalies it could otherwise find.

If you are using {ece} or the hosted Elasticsearch Service on Elastic Cloud,
`xpack.ml.max_model_memory_limit` is set to prevent you from creating jobs
that cannot be allocated to any {ml} nodes in the cluster. If you find that you
cannot increase `model_memory_limit` for your {ml} jobs, the solution is to
increase the size of the {ml} nodes in your cluster.

For more information about the `model_memory_limit` property and the
`xpack.ml.max_model_memory_limit` setting, see
{ref}/ml-job-resource.html#ml-analysisconfig[Analysis limits] and
{ref}/ml-settings.html[Machine learning settings].