[ML] Data frame analytics max_num_threads setting (#59254)

This adds a setting to data frame analytics jobs called
`max_num_threads`. The setting expects a positive integer and
specifies the maximum number of threads that the analysis may
use. Note that the actual number of threads used is limited by
the number of processors on the node where the job is assigned.
Also, the process may use a couple more threads for operational
functionality that is not the analysis itself.
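
As a rough illustration of the capping rule described above, the
sketch below takes the smaller of the requested value and the node's
processor count. The helper name and its parameters are hypothetical
and are not taken from this commit; the actual logic lives in the
Elasticsearch and ML process code and may differ in detail.

```python
def effective_analysis_threads(max_num_threads: int, node_processors: int) -> int:
    """Hypothetical helper: threads the analysis could use on a given node.

    Assumes the cap is a plain minimum of the requested value and the
    node's processor count, as the commit message describes.
    """
    if max_num_threads < 1:
        # The setting expects a positive integer.
        raise ValueError("max_num_threads must be a positive integer")
    if node_processors < 1:
        raise ValueError("node_processors must be a positive integer")
    return min(max_num_threads, node_processors)
```

Under these assumptions, requesting 8 threads on a 4-processor node
yields 4 analysis threads. The extra operational threads mentioned
above are not counted here.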

This setting may also be updated for a stopped job.
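
A minimal sketch of what such an update could look like from a
client's point of view. It only builds the request path and JSON body
and sends nothing. The helper name and the job ID are hypothetical,
and the `_update` endpoint path is an assumption based on the data
frame analytics update API rather than something shown in this commit.

```python
import json
from typing import Tuple


def build_update_request(job_id: str, max_num_threads: int) -> Tuple[str, str]:
    """Hypothetical helper: path and JSON body for updating a stopped job."""
    if max_num_threads < 1:
        # The setting expects a positive integer.
        raise ValueError("max_num_threads must be a positive integer")
    # Assumed endpoint path; check the API reference for your version.
    path = f"_ml/data_frame/analytics/{job_id}/_update"
    body = json.dumps({"max_num_threads": max_num_threads})
    return path, body


# Example with a made-up job ID.
path, body = build_update_request("my-analytics-job", 4)
```

The job is expected to be stopped before the update is sent.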

More threads may reduce the time it takes to complete the job at the cost
of using more CPU.
Dimitris Athanasiou 2020-07-09 16:31:26 +03:00 committed by GitHub
parent 650f20eb0d
commit da0249f6c2
16 changed files with 206 additions and 29 deletions


@ -38,6 +38,7 @@ include-tagged::{doc-tests-file}[{api}-config]
<5> The fields to be included in / excluded from the analysis
<6> The memory limit for the model created as part of the analysis process
<7> Optionally, a human-readable description
<8> The maximum number of threads to be used by the analysis. Defaults to 1.
[id="{upid}-{api}-query-config"]


@ -34,6 +34,7 @@ include-tagged::{doc-tests-file}[{api}-config-update]
<1> The {dfanalytics-job} ID
<2> The human-readable description
<3> The memory limit for the model created as part of the analysis process
<4> The maximum number of threads to be used by the analysis
[id="{upid}-{api}-query-config"]