mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-06-30 10:23:41 -04:00
This adds a new parameter to the start trained model deployment API, namely `priority`. The available settings are `normal` and `low`.

For normal priority deployments the allocations get distributed so that node processors are never oversubscribed.

Low priority deployments allow users to test model functionality even if there are no node processors available. They are limited to 1 allocation with a single thread. In addition, the process is executed in low priority which limits the amount of CPU that can be used when the CPU is under pressure. The intention of this is to limit the impact of low priority deployments on normal priority deployments.

When we rebalance model assignments we now:

1. compute a plan just for normal priority deployments
2. fix the resources used by normal deployments
3. compute a plan just for low priority deployments
4. merge the two plans

Closes #91024
apis
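
The four rebalancing steps in the commit message can be sketched roughly as follows. This is an illustrative Python sketch, not the Elasticsearch implementation (which is Java): the `Deployment` class, the function names, and the greedy node selection are all hypothetical stand-ins, since the commit message does not describe how the real planner picks nodes. It shows only the ordering: normal priority deployments are planned first without oversubscribing processors, their usage is then treated as fixed, and low priority deployments are placed afterwards even when no processors remain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """Hypothetical model of a deployment; not an Elasticsearch class."""
    model_id: str
    priority: str = "normal"          # "normal" or "low"
    allocations: int = 1
    threads_per_allocation: int = 1


def validate(d):
    if d.priority not in ("normal", "low"):
        raise ValueError(f"unknown priority: {d.priority}")
    # Low priority deployments are limited to 1 allocation with a single thread.
    if d.priority == "low" and (d.allocations != 1 or d.threads_per_allocation != 1):
        raise ValueError("low priority allows 1 allocation with 1 thread")


def plan_normal(deployments, node_processors):
    """Greedy stand-in planner (assumption): never oversubscribes a node."""
    free = dict(node_processors)
    plan = {}
    for d in deployments:
        for _ in range(d.allocations):
            node = max(free, key=free.get)   # node with the most free processors
            if free[node] < d.threads_per_allocation:
                break                        # no node can fit another allocation
            free[node] -= d.threads_per_allocation
            per_node = plan.setdefault(d.model_id, {})
            per_node[node] = per_node.get(node, 0) + 1
    return plan, free


def plan_low(deployments, free_after_normal):
    """Place one allocation per low priority deployment, even with 0 free processors."""
    load = {node: 0 for node in free_after_normal}
    plan = {}
    for d in deployments:
        # Assumption: prefer leftover processors, then the fewest low priority allocations.
        node = max(load, key=lambda n: (free_after_normal[n], -load[n]))
        load[node] += 1
        plan[d.model_id] = {node: 1}
    return plan


def rebalance(deployments, node_processors):
    for d in deployments:
        validate(d)
    if not node_processors:
        return {}
    normal = [d for d in deployments if d.priority == "normal"]
    low = [d for d in deployments if d.priority == "low"]
    # Steps 1 and 2: plan normal deployments; their resource usage is now fixed.
    normal_plan, remaining = plan_normal(normal, node_processors)
    # Step 3: plan low priority deployments against what is left.
    low_plan = plan_low(low, remaining)
    # Step 4: merge the two plans.
    return {**normal_plan, **low_plan}


if __name__ == "__main__":
    nodes = {"node-a": 2, "node-b": 1}
    deployments = [
        Deployment("m1", allocations=2),
        Deployment("m2"),
        Deployment("m3", priority="low"),
    ]
    print(rebalance(deployments, nodes))
```

In this sketch the low priority deployment `m3` still receives an allocation although the normal priority deployments have already claimed every processor, which mirrors the stated goal of letting users test a model when no node processors are available.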