Merge remote-tracking branch 'es/master' into enrich

Martijn van Groningen 2019-10-14 10:17:18 +02:00
commit e06598ba56
412 changed files with 6674 additions and 4217 deletions

@@ -20,14 +20,52 @@ include-tagged::{doc-tests-file}[{api}-request]
<1> Constructing a new evaluation request
<2> Reference to an existing index
<3> The query with which to select data from indices
<4> Kind of evaluation to perform
<5> Name of the field in the index. Its value denotes the actual (i.e. ground truth) label for an example. Must be either true or false
<6> Name of the field in the index. Its value denotes the probability (as per some ML algorithm) of the example being classified as positive
<7> The remaining parameters are the metrics to be calculated based on the two fields described above.
<8> https://en.wikipedia.org/wiki/Precision_and_recall[Precision] calculated at thresholds: 0.4, 0.5 and 0.6
<9> https://en.wikipedia.org/wiki/Precision_and_recall[Recall] calculated at thresholds: 0.5 and 0.7
<10> https://en.wikipedia.org/wiki/Confusion_matrix[Confusion matrix] calculated at threshold 0.5
<11> https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve[AuC ROC] calculated and the curve points returned
<4> Evaluation to be performed
==== Evaluation
Evaluation to be performed.
Currently, supported evaluations include: +BinarySoftClassification+, +Classification+, +Regression+.
===== Binary soft classification
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-evaluation-softclassification]
--------------------------------------------------
<1> Constructing a new evaluation
<2> Name of the field in the index. Its value denotes the actual (i.e. ground truth) label for an example. Must be either true or false.
<3> Name of the field in the index. Its value denotes the probability (as per some ML algorithm) of the example being classified as positive.
<4> The remaining parameters are the metrics to be calculated based on the two fields described above
<5> https://en.wikipedia.org/wiki/Precision_and_recall#Precision[Precision] calculated at thresholds: 0.4, 0.5 and 0.6
<6> https://en.wikipedia.org/wiki/Precision_and_recall#Recall[Recall] calculated at thresholds: 0.5 and 0.7
<7> https://en.wikipedia.org/wiki/Confusion_matrix[Confusion matrix] calculated at threshold 0.5
<8> https://en.wikipedia.org/wiki/Receiver_operating_characteristic#Area_under_the_curve[AuC ROC] calculated and the curve points returned
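To make the threshold-based metrics above concrete, here is a minimal, self-contained sketch of how precision, recall and a confusion matrix can be derived from a ground-truth boolean field and a predicted-probability field. It is not the Elasticsearch client API: the class and method names are hypothetical, and treating a probability greater than or equal to the threshold as a positive prediction is an assumption.

```java
// Illustrative sketch only; names are hypothetical and not part of the client API.
final class SoftClassificationSketch {

    /**
     * Returns {tp, fp, tn, fn} for the given threshold.
     * Assumption: a probability >= threshold counts as a positive prediction.
     */
    static long[] confusionMatrix(boolean[] actual, double[] predictedProbability, double threshold) {
        long tp = 0, fp = 0, tn = 0, fn = 0;
        for (int i = 0; i < actual.length; i++) {
            boolean predictedPositive = predictedProbability[i] >= threshold;
            if (predictedPositive && actual[i]) {
                tp++;
            } else if (predictedPositive) {
                fp++;
            } else if (actual[i]) {
                fn++;
            } else {
                tn++;
            }
        }
        return new long[] {tp, fp, tn, fn};
    }

    /** Precision = tp / (tp + fp); defined here as 0 when nothing was predicted positive. */
    static double precision(long[] cm) {
        long predictedPositive = cm[0] + cm[1];
        return predictedPositive == 0 ? 0.0 : (double) cm[0] / predictedPositive;
    }

    /** Recall = tp / (tp + fn); defined here as 0 when there are no actual positives. */
    static double recall(long[] cm) {
        long actualPositive = cm[0] + cm[3];
        return actualPositive == 0 ? 0.0 : (double) cm[0] / actualPositive;
    }
}
```

For labels `{true, true, false, false, true}` and probabilities `{0.9, 0.45, 0.55, 0.1, 0.6}`, threshold 0.5 gives two true positives, one false positive, one true negative and one false negative, so precision and recall are both 2/3. Lowering the threshold to 0.4 raises recall to 1.0 and precision to 0.75.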
===== Classification
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-evaluation-classification]
--------------------------------------------------
<1> Constructing a new evaluation
<2> Name of the field in the index. Its value denotes the actual (i.e. ground truth) class the example belongs to.
<3> Name of the field in the index. Its value denotes the predicted (as per some ML algorithm) class of the example.
<4> The remaining parameters are the metrics to be calculated based on the two fields described above
<5> Multiclass confusion matrix of size 3
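As a rough illustration of what a size-limited multiclass confusion matrix contains, the sketch below counts (actual, predicted) pairs and keeps only a fixed number of actual classes. It is a hypothetical stand-alone class, not the client API. Keeping the most frequent actual classes, with ties broken by name, is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only; names and the class-selection rule are assumptions.
final class MulticlassConfusionMatrixSketch {

    /** actual class -> (predicted class -> count), limited to the kept actual classes. */
    final Map<String, Map<String, Long>> matrix = new TreeMap<>();

    /** Number of distinct actual classes that did not fit into the matrix. */
    final int otherClassesCount;

    MulticlassConfusionMatrixSketch(String[] actual, String[] predicted, int size) {
        // Count how often each actual class occurs.
        Map<String, Long> frequency = new HashMap<>();
        for (String a : actual) {
            frequency.merge(a, 1L, Long::sum);
        }

        // Order classes by descending frequency, then by name for determinism.
        List<String> classes = new ArrayList<>(frequency.keySet());
        classes.sort((x, y) -> {
            int byCount = Long.compare(frequency.get(y), frequency.get(x));
            return byCount != 0 ? byCount : x.compareTo(y);
        });

        List<String> kept = classes.subList(0, Math.min(size, classes.size()));
        otherClassesCount = classes.size() - kept.size();

        // Fill one row per kept actual class.
        for (String k : kept) {
            matrix.put(k, new TreeMap<>());
        }
        for (int i = 0; i < actual.length; i++) {
            Map<String, Long> row = matrix.get(actual[i]);
            if (row != null) {
                row.merge(predicted[i], 1L, Long::sum);
            }
        }
    }
}
```

With four distinct actual classes and `size` set to 3, the matrix holds three rows and `otherClassesCount` is 1.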
===== Regression
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-evaluation-regression]
--------------------------------------------------
<1> Constructing a new evaluation
<2> Name of the field in the index. Its value denotes the actual (i.e. ground truth) value for an example.
<3> Name of the field in the index. Its value denotes the predicted (as per some ML algorithm) value for the example.
<4> The remaining parameters are the metrics to be calculated based on the two fields described above
<5> https://en.wikipedia.org/wiki/Mean_squared_error[Mean squared error]
<6> https://en.wikipedia.org/wiki/Coefficient_of_determination[R squared]
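For reference, the two regression metrics named above can be computed from the actual and predicted fields as in the following sketch. The class and method names are hypothetical and only illustrate the standard textbook formulas; they are not the client API.

```java
// Illustrative sketch only; names are hypothetical and not part of the client API.
final class RegressionMetricsSketch {

    /** Mean squared error: the average of (actual - predicted)^2. */
    static double meanSquaredError(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - predicted[i];
            sum += diff * diff;
        }
        return sum / actual.length;
    }

    /**
     * R squared: 1 - (residual sum of squares / total sum of squares).
     * The result is not finite when all actual values are identical.
     */
    static double rSquared(double[] actual, double[] predicted) {
        double mean = 0.0;
        for (double a : actual) {
            mean += a;
        }
        mean /= actual.length;

        double residual = 0.0;
        double total = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double error = actual[i] - predicted[i];
            double deviation = actual[i] - mean;
            residual += error * error;
            total += deviation * deviation;
        }
        return 1.0 - residual / total;
    }
}
```

For actual values `{1, 2, 3, 4}` and predictions `{1.5, 2, 2.5, 4}`, the mean squared error is 0.125 and R squared is 0.9.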
include::../execution.asciidoc[]
@@ -41,7 +79,40 @@ The returned +{response}+ contains the requested evaluation metrics.
include-tagged::{doc-tests-file}[{api}-response]
--------------------------------------------------
<1> Fetching all the calculated metrics results
<2> Fetching precision metric by name
<3> Fetching precision at a given (0.4) threshold
<4> Fetching confusion matrix metric by name
<5> Fetching confusion matrix at a given (0.5) threshold
==== Results
===== Binary soft classification
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-results-softclassification]
--------------------------------------------------
<1> Fetching precision metric by name
<2> Fetching precision at a given (0.4) threshold
<3> Fetching confusion matrix metric by name
<4> Fetching confusion matrix at a given (0.5) threshold
===== Classification
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-results-classification]
--------------------------------------------------
<1> Fetching multiclass confusion matrix metric by name
<2> Fetching the contents of the confusion matrix
<3> Fetching the number of classes that were not included in the matrix
===== Regression
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-results-regression]
--------------------------------------------------
<1> Fetching mean squared error metric by name
<2> Fetching the actual mean squared error value
<3> Fetching R squared metric by name
<4> Fetching the actual R squared value

@@ -76,7 +76,7 @@ include-tagged::{doc-tests-file}[{api}-dest-config]
==== Analysis
The analysis to be performed.
Currently, the supported analyses include : +OutlierDetection+, +Regression+.
Currently, the supported analyses include: +OutlierDetection+, +Classification+, +Regression+.
===== Outlier detection
@@ -101,6 +101,24 @@ include-tagged::{doc-tests-file}[{api}-outlier-detection-customized]
<6> The proportion of the data set that is assumed to be outlying prior to outlier detection
<7> Whether to apply standardization to feature values
===== Classification
+Classification+ analysis requires setting the +dependent_variable+ and
has a number of other optional parameters:
["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-classification]
--------------------------------------------------
<1> Constructing a new Classification builder object with the required dependent variable
<2> The lambda regularization parameter. A non-negative double.
<3> The gamma regularization parameter. A non-negative double.
<4> The applied shrinkage. A double in [0.001, 1].
<5> The maximum number of trees the forest is allowed to contain. An integer in [1, 2000].
<6> The fraction of features which will be used when selecting a random bag for each candidate split. A double in (0, 1].
<7> The name of the prediction field in the results object.
<8> The percentage of training-eligible rows to be used in training. Defaults to 100%.
===== Regression
+Regression+ analysis requires setting the +dependent_variable+ and

@@ -12,8 +12,8 @@ API Key can be created using this API.
[id="{upid}-{api}-request"]
==== Create API Key Request
A +{request}+ contains name for the API key,
list of role descriptors to define permissions and
A +{request}+ contains an optional name for the API key,
an optional list of role descriptors to define permissions and
an optional expiration for the generated API key.
If no expiration is provided, the API key
does not expire by default.
@@ -37,4 +37,4 @@ expiration.
include-tagged::{doc-tests-file}[{api}-response]
--------------------------------------------------
<1> the API key that can be used to authenticate to Elasticsearch.
<2> expiration if the API keys expire
<2> expiration if the API keys expire