elasticsearch/docs/java-rest/high-level/ml/update-job.asciidoc
David Roberts 0059c59e25
[ML] Make ml_standard tokenizer the default for new categorization jobs (#72805)
Categorization jobs created once the entire cluster is upgraded to
version 7.14 or higher will default to using the new ml_standard
tokenizer rather than the previous default of the ml_classic
tokenizer, and will incorporate the new first_non_blank_line char
filter so that categorization is based purely on the first non-blank
line of each message.

The difference between the ml_classic and ml_standard tokenizers
is that ml_classic splits on slashes and colons, so creates multiple
tokens from URLs and filesystem paths, whereas ml_standard attempts
to keep URLs, email addresses and filesystem paths as single tokens.

It is still possible to configure the ml_classic tokenizer if you
prefer: just provide a categorization_analyzer within your
analysis_config and whichever tokenizer you choose (which could be
ml_classic or any other Elasticsearch tokenizer) will be used.

To opt out of using first_non_blank_line as a default char filter,
you must explicitly specify a categorization_analyzer that does not
include it.

If no categorization_analyzer is specified but categorization_filters
are specified then the categorization filters are converted to char
filters that are applied after first_non_blank_line.

Closes elastic/ml-cpp#1724
2021-06-01 15:11:32 +01:00

--
:api: update-job
:request: UpdateJobRequest
:response: PutJobResponse
--
[role="xpack"]
[id="{upid}-{api}"]
=== Update {anomaly-jobs} API
Provides the ability to update an {anomaly-job}.
It accepts a +{request}+ object and responds with a +{response}+ object.

[id="{upid}-{api}-request"]
==== Update {anomaly-jobs} request
An +{request}+ object is created from a `JobUpdate` object.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request]
--------------------------------------------------
<1> Constructing a new request referencing a `JobUpdate` object.

==== Optional arguments
The `JobUpdate` object has many optional arguments with which to update an
existing {anomaly-job}. An existing, non-null `jobId` must be referenced in its
creation.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-options]
--------------------------------------------------
<1> Mandatory, non-null `jobId` referencing an existing {anomaly-job}.
<2> Updated description.
<3> Updated analysis limits.
<4> Updated background persistence interval.
<5> Updated detectors through the `JobUpdate.DetectorUpdate` object.
<6> Updated group membership.
<7> Updated result retention.
<8> Updated model plot configuration.
<9> Updated model snapshot retention setting.
<10> Updated custom settings.
<11> Updated renormalization window.
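
The options above follow a partial-update pattern: only the job ID is mandatory, and the other values are optional. The following is a minimal, self-contained sketch of that idea. The `SketchJob` and `SketchJobUpdate` classes are hypothetical stand-ins written for this illustration, not the client's `Job` and `JobUpdate` classes, and the sketch assumes that an option left unset keeps the job's existing value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-ins for illustration only.
// These are NOT the client's Job / JobUpdate classes.
final class JobUpdateSketch {

    /** Immutable stand-in for a stored job. */
    static final class SketchJob {
        final String jobId;
        final String description;
        final List<String> groups;
        final Long resultsRetentionDays;

        SketchJob(String jobId, String description, List<String> groups, Long resultsRetentionDays) {
            this.jobId = jobId;
            this.description = description;
            this.groups = Collections.unmodifiableList(new ArrayList<>(groups));
            this.resultsRetentionDays = resultsRetentionDays;
        }
    }

    /** Stand-in update: every field except jobId is optional. */
    static final class SketchJobUpdate {
        final String jobId;
        private String description;
        private List<String> groups;
        private Long resultsRetentionDays;

        SketchJobUpdate(String jobId) {
            // Mirrors the documented rule that the job ID is mandatory and non-null.
            this.jobId = Objects.requireNonNull(jobId, "jobId must not be null");
        }

        SketchJobUpdate setDescription(String description) {
            this.description = description;
            return this;
        }

        SketchJobUpdate setGroups(List<String> groups) {
            this.groups = groups;
            return this;
        }

        SketchJobUpdate setResultsRetentionDays(Long days) {
            this.resultsRetentionDays = days;
            return this;
        }

        /** Returns a new job; a null field here is assumed to mean "leave unchanged". */
        SketchJob applyTo(SketchJob job) {
            if (!jobId.equals(job.jobId)) {
                throw new IllegalArgumentException("update targets a different job: " + jobId);
            }
            return new SketchJob(
                job.jobId,
                description != null ? description : job.description,
                groups != null ? groups : job.groups,
                resultsRetentionDays != null ? resultsRetentionDays : job.resultsRetentionDays);
        }
    }

    public static void main(String[] args) {
        SketchJob existing = new SketchJob("my-job", "old description", Arrays.asList("group-a"), 30L);

        // Only the description and the result retention are set; groups stay as they were.
        SketchJob updated = new SketchJobUpdate("my-job")
            .setDescription("new description")
            .setResultsRetentionDays(10L)
            .applyTo(existing);

        System.out.println(updated.description + ", " + updated.groups + ", " + updated.resultsRetentionDays);
    }
}
```

In this sketch the update object carries only the values that were set, so a caller can change one setting without restating the rest of the job configuration.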
Included with these options are specific optional `JobUpdate.DetectorUpdate` updates.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-detector-options]
--------------------------------------------------
<1> The index of the detector. `O` means unknown.
<2> The optional description of the detector.
<3> The `DetectionRule` rules that apply to this detector.

include::../execution.asciidoc[]

[id="{upid}-{api}-response"]
==== Update {anomaly-jobs} response
A +{response}+ contains the updated `Job` object.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-response]
--------------------------------------------------
<1> `getResponse()` returns the updated `Job` object.