[ML] Introduce randomize_seed setting for regression and classification (#49990)

This adds a new `randomize_seed` setting for regression and classification.
When not explicitly set, the seed is randomly generated. The same seed can
be reused in a similar job to ensure that the same docs are picked for
training.
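
To illustrate why reusing a seed reproduces the training set, here is a minimal Python sketch. It is not the Elasticsearch implementation: the function `pick_training_rows`, its parameters, and the per-row sampling rule are assumptions made for illustration only.

```python
import random


def pick_training_rows(doc_ids, training_percent, randomize_seed=None):
    """Hypothetical helper: choose training rows and return the seed used.

    Assumption: each row is included independently with probability
    training_percent / 100. The real sampling logic may differ.
    """
    if randomize_seed is None:
        # No seed supplied: generate one at random and report it back
        # so that a later job can reuse it.
        randomize_seed = random.SystemRandom().getrandbits(63)
    rng = random.Random(randomize_seed)
    training = [d for d in doc_ids if rng.random() * 100.0 < training_percent]
    return training, randomize_seed


docs = [f"doc-{i}" for i in range(1000)]

# First job: the seed is generated because none was given.
first, seed = pick_training_rows(docs, 50.0)

# Similar job: passing the same seed selects the same docs.
second, _ = pick_training_rows(docs, 50.0, randomize_seed=seed)
assert first == second
```

Returning the generated seed alongside the selection is what makes the run repeatable: a second job only needs that one value to draw the same rows.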
Dimitris Athanasiou 2019-12-10 10:22:53 +02:00 committed by GitHub
parent a6351d63ad
commit 269425b54d
24 changed files with 460 additions and 76 deletions

@@ -119,7 +119,8 @@ include-tagged::{doc-tests-file}[{api}-classification]
<6> The fraction of features which will be used when selecting a random bag for each candidate split. A double in (0, 1].
<7> The name of the prediction field in the results object.
<8> The percentage of training-eligible rows to be used in training. Defaults to 100%.
-<9> The number of top classes to be reported in the results. Defaults to 2.
+<9> The seed to be used by the random generator that picks which rows are used in training.
+<10> The number of top classes to be reported in the results. Defaults to 2.
===== Regression
@@ -138,6 +139,7 @@ include-tagged::{doc-tests-file}[{api}-regression]
<6> The fraction of features which will be used when selecting a random bag for each candidate split. A double in (0, 1].
<7> The name of the prediction field in the results object.
<8> The percentage of training-eligible rows to be used in training. Defaults to 100%.
+<9> The seed to be used by the random generator that picks which rows are used in training.
==== Analyzed fields