[ML] adding new PUT trained model vocabulary endpoint (#77387)

This commit removes the ability to set the vocabulary location in the model config.
Instead, sane defaults are always set and used, and storing the vocabulary is
wrapped up in a new API.

The index is now always the internally managed .ml-inference-native index
and the document ID is always <model_id>_vocabulary.

This API only works for PyTorch/NLP type models.
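
As a rough illustration of the storage convention described above, the following Python sketch derives where a model's vocabulary would be stored. The helper name `vocabulary_location` and the example model ID are hypothetical and not part of Elasticsearch; only the index name and the `<model_id>_vocabulary` pattern are taken from this commit message.

```python
from typing import Tuple

# Index name as stated in the commit message.
VOCABULARY_INDEX = ".ml-inference-native"


def vocabulary_location(model_id: str) -> Tuple[str, str]:
    """Return (index, document ID) for a model's vocabulary.

    Hypothetical helper: it only mirrors the naming convention from the
    commit message and does not talk to a cluster.
    """
    if not model_id:
        raise ValueError("model_id must not be empty")
    return VOCABULARY_INDEX, f"{model_id}_vocabulary"


# "my_nlp_model" is an illustrative model ID, not one from the source.
index, doc_id = vocabulary_location("my_nlp_model")
print(index, doc_id)
```

Because both values are now derived rather than configured, a caller only needs the model ID to know where the vocabulary lives.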
Benjamin Trent 2021-09-08 10:21:45 -04:00 committed by GitHub
parent 35e6039c5e
commit a68c6acdb3
32 changed files with 614 additions and 139 deletions


@@ -4,6 +4,7 @@ include::put-dfanalytics.asciidoc[leveloffset=+2]
include::put-trained-models-aliases.asciidoc[leveloffset=+2]
include::put-trained-models.asciidoc[leveloffset=+2]
include::put-trained-model-definition-part.asciidoc[leveloffset=+2]
include::put-trained-model-vocabulary.asciidoc[leveloffset=+2]
//UPDATE
include::update-dfanalytics.asciidoc[leveloffset=+2]
//DELETE


@@ -20,6 +20,7 @@ You can use the following APIs to perform {infer} operations:
* <<put-trained-models>>
* <<put-trained-model-definition-part>>
* <<put-trained-model-vocabulary>>
* <<put-trained-models-aliases>>
* <<delete-trained-models>>
* <<delete-trained-models-aliases>>


@@ -0,0 +1,45 @@
[role="xpack"]
[testenv="basic"]
[[put-trained-model-vocabulary]]
= Create trained model vocabulary API
[subs="attributes"]
++++
<titleabbrev>Create trained model vocabulary</titleabbrev>
++++
Creates a trained model vocabulary.
This is only supported on NLP type models.
experimental::[]
[[ml-put-trained-model-vocabulary-request]]
== {api-request-title}
`PUT _ml/trained_models/<model_id>/vocabulary/`
[[ml-put-trained-model-vocabulary-prereq]]
== {api-prereq-title}
Requires the `manage_ml` cluster privilege. This privilege is included in the
`machine_learning_admin` built-in role.
[[ml-put-trained-model-vocabulary-path-params]]
== {api-path-parms-title}
`<model_id>`::
(Required, string)
include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=model-id]
[[ml-put-trained-model-vocabulary-request-body]]
== {api-request-body-title}
`vocabulary`::
(array)
The model vocabulary. Must not be empty.
////
[[ml-put-trained-model-vocabulary-example]]
== {api-examples-title}
////
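
The examples section of the new page is still commented out, so here is a minimal sketch of how a client might assemble a request for this endpoint. It is not taken from the commit: the function name `build_put_vocabulary_request`, the model ID, and the sample tokens are hypothetical, and treating the vocabulary entries as strings is an assumption (the page only says `array`). The sketch builds the method, path, and JSON body without sending anything.

```python
import json
from urllib.parse import quote


def build_put_vocabulary_request(model_id, vocabulary):
    """Build (method, path, body) for the vocabulary endpoint.

    Hypothetical helper. The path mirrors the documented request line
    `PUT _ml/trained_models/<model_id>/vocabulary/`, with a leading slash
    added so it can be used as an HTTP path.
    """
    tokens = list(vocabulary)
    # The page states that the vocabulary must not be empty.
    if not tokens:
        raise ValueError("vocabulary must not be empty")
    # Percent-encode the model ID so it is safe inside a URL path.
    path = f"/_ml/trained_models/{quote(model_id, safe='')}/vocabulary/"
    # Assumption: the request body is a JSON object with one "vocabulary" key.
    body = json.dumps({"vocabulary": tokens})
    return "PUT", path, body


# Illustrative values only.
method, path, body = build_put_vocabulary_request(
    "my_nlp_model", ["[PAD]", "[UNK]", "hello", "world"]
)
print(method, path)
print(body)
```

The resulting triple can be handed to any HTTP client. Sending it requires a user with the `manage_ml` cluster privilege, as noted in the prerequisites.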