elasticsearch/docs/changelog/124313.yaml
Jim Ferenczi 361b51d436
Optimize memory usage in ShardBulkInferenceActionFilter (#124313)
This refactor improves memory efficiency by processing inference requests in batches, capped by a max input length.
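
A minimal sketch of the batching idea, assuming illustrative names (this is not the actual `ShardBulkInferenceActionFilter` code): inputs are accumulated until their combined UTF-8 size would exceed a configured byte cap, then flushed as one batch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: groups raw inputs into batches whose combined
// UTF-8 size stays under a configurable byte cap, mirroring the idea
// of capping each inference request by a maximum input length.
final class InputBatcher {
    static List<List<String>> batchByBytes(List<String> inputs, long maxBatchSizeInBytes) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentBytes = 0;
        for (String input : inputs) {
            long size = input.getBytes(StandardCharsets.UTF_8).length;
            // Flush the current batch if adding this input would exceed the cap
            // (a single oversized input still ends up in a batch of its own).
            if (currentBytes > 0 && currentBytes + size > maxBatchSizeInBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(input);
            currentBytes += size;
        }
        if (current.isEmpty() == false) {
            batches.add(current);
        }
        return batches;
    }
}
```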

Changes include:
- A new dynamic operator setting to control the maximum batch size in bytes (an illustrative declaration is sketched after this list).
- Dropping input data from inference responses when the legacy semantic text format isn’t used, saving memory.
- Clearing inference results after each bulk item is processed, to free memory sooner.
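
For context, operator-dynamic byte-size settings in Elasticsearch are typically declared along these lines; the setting key and default value below are made up for illustration and are not the ones added by this PR:

```java
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.unit.ByteSizeValue;

// Hypothetical declaration of an operator-dynamic byte-size setting;
// the key and default are illustrative, not the values from #124313.
public final class InferenceBatchSettings {
    public static final Setting<ByteSizeValue> MAX_BATCH_SIZE_IN_BYTES = Setting.byteSizeSetting(
        "example.inference.bulk.max_batch_size", // illustrative key
        ByteSizeValue.ofMb(1),                   // illustrative default
        Setting.Property.OperatorDynamic,        // updatable at runtime, by operator users only
        Setting.Property.NodeScope
    );
}
```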

This is a step toward enabling circuit breakers to better handle memory usage when dealing with large inputs.
2025-03-14 09:51:03 +00:00


pr: 124313
summary: Optimize memory usage in `ShardBulkInferenceActionFilter`
area: Search
type: enhancement
issues: []