elasticsearch/docs/reference/inference

Latest commit fdb5058b13 by Jonathan Buttner:
[ML] Inference API rate limit queuing logic refactor (#107706)
* Adding new executor
* Adding in queuing logic
* Working tests
* Added cleanup task
* Update docs/changelog/107706.yaml
* Updating yml
* Deregistering callbacks for settings changes
* Cleaning up code
* Update docs/changelog/107706.yaml
* Fixing rate limit settings bug and only sleeping the least amount
* Removing debug logging
* Removing commented code
* Renaming per feedback
* Fixing tests
* Updating docs and validation
* Fixing source blocks
* Adjusting cancel logic
* Reformatting ascii
* Addressing feedback
* Adding rate limiting for Google embeddings and Mistral

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-06-05 08:25:25 -04:00
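The commit above describes a rate-limit queuing refactor in which requests are buffered and the executor sleeps only the least amount of time needed before dispatching the next one. As a rough illustration of that general idea (a hypothetical sketch, not the actual Elasticsearch implementation; the class and method names here are invented):

```python
import time
from collections import deque

class RateLimitedQueue:
    """Sketch of a rate-limited task queue: submitted tasks are buffered,
    and the drain loop sleeps only until the next permit is due rather
    than a fixed interval. Hypothetical illustration only."""

    def __init__(self, requests_per_second: float):
        self.interval = 1.0 / requests_per_second  # seconds between permits
        self.next_permit = time.monotonic()        # earliest time the next task may run
        self.queue = deque()

    def submit(self, task):
        # Queue the task instead of rejecting it when the limit is hit.
        self.queue.append(task)

    def drain(self):
        """Run all queued tasks, sleeping the minimum time between them."""
        results = []
        while self.queue:
            wait = self.next_permit - time.monotonic()
            if wait > 0:
                time.sleep(wait)  # sleep only the least amount required
            # Schedule the next permit one interval after this dispatch.
            self.next_permit = max(self.next_permit, time.monotonic()) + self.interval
            results.append(self.queue.popleft()())
        return results
```

A high permit rate makes the sleeps negligible, so the queue behaves like a plain FIFO; a low rate spaces dispatches evenly without busy-waiting.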
File                        Latest commit                                                     Date
delete-inference.asciidoc   [DOCS] Expands DELETE inference API docs (#109282)                2024-06-03 17:32:31 +02:00
get-inference.asciidoc      [Inference API] Add Google AI Studio completion docs (#109089)    2024-05-28 15:21:33 +02:00
inference-apis.asciidoc     [Inference API] Add Google AI Studio completion docs (#109089)    2024-05-28 15:21:33 +02:00
post-inference.asciidoc     [Inference API] Add Google AI Studio completion docs (#109089)    2024-05-28 15:21:33 +02:00
put-inference.asciidoc      [ML] Inference API rate limit queuing logic refactor (#107706)    2024-06-05 08:25:25 -04:00