The inference node stats for deployed PyTorch inference
models now contain two new fields: `inference_cache_hit_count`
and `inference_cache_hit_count_last_minute`.
These indicate how many inferences on that node were served
from the C++-side response cache that was added in
https://github.com/elastic/ml-cpp/pull/2305. Cache hits
occur when exactly the same inference request is sent to the
same node more than once.
The `average_inference_time_ms` and
`average_inference_time_ms_last_minute` fields now refer to
the time taken to do the cache lookup, plus, if necessary,
the time to do the inference. We would expect average inference
time to be vastly reduced in situations where the cache hit
rate is high.
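
To illustrate how the hit counter and the averaged timing relate, here is a minimal Python sketch of a response cache keyed on the exact request. It is not the ml-cpp implementation (which is C++); the class, the `fake_model` function and the LRU eviction policy are assumptions made purely for illustration.

```python
from collections import OrderedDict


class CachedInference:
    """Toy response cache keyed on the exact request (hypothetical, not ml-cpp)."""

    def __init__(self, infer, max_entries=128):
        self._infer = infer
        self._cache = OrderedDict()
        self._max_entries = max_entries
        self.inference_count = 0
        self.cache_hit_count = 0

    def __call__(self, request):
        # Every call pays for the lookup; only misses pay for the inference.
        self.inference_count += 1
        if request in self._cache:
            self.cache_hit_count += 1
            self._cache.move_to_end(request)  # mark as most recently used
            return self._cache[request]
        response = self._infer(request)
        self._cache[request] = response
        if len(self._cache) > self._max_entries:
            self._cache.popitem(last=False)  # evict least recently used
        return response

    def hit_rate(self):
        if self.inference_count == 0:
            return 0.0
        return self.cache_hit_count / self.inference_count


model_calls = []


def fake_model(text):
    """Stand-in for an expensive model evaluation."""
    model_calls.append(text)
    return text.upper()


cached = CachedInference(fake_model)
results = [cached(t) for t in ["hello", "world", "hello", "hello"]]
```

Only identical requests hit the cache, so the model runs twice for the four requests above while the hit counter reaches two.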
This change adds support for kNN vector fields to the `_disk_usage` API. The
strategy:
* Iterate the vector values (using the same strategy as for doc values) to
estimate the vector data size
* Run some random vector searches to estimate the vector index size
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Closes #84801
This change ensures that existing read_only_allow_delete blocks, which
are placed on indices when the flood_stage watermark threshold is
exceeded, are removed when disk threshold monitoring is disabled.
This is done by changing how InternalClusterInfoService behaves when
disabled. With this change, it will keep calling the registered
listeners periodically, but with an empty ClusterInfo.
Closes #86383
Add the dry_run query parameter to support simulating an update of desired nodes. The update request will be validated, but no cluster state updates will be performed. To indicate that the response was the result of a dry run, we add the dry_run field to the JSON representation of the response.
See #82975
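
The dry-run flow described above can be sketched roughly as follows. This is a Python illustration only: the function name, the validation rule, the shape of `cluster_state` and the choice to always include `dry_run` in the response are assumptions, not the actual Elasticsearch code.

```python
import copy


def update_desired_nodes(cluster_state, request, dry_run=False):
    """Validate the request; apply it only when dry_run is False (hypothetical)."""
    nodes = request.get("nodes")
    # Validation always runs, even for a dry run.
    if not isinstance(nodes, list) or not nodes:
        raise ValueError("request must contain a non-empty 'nodes' list")

    if not dry_run:
        # Only a real update touches the (toy) cluster state.
        cluster_state["desired_nodes"] = copy.deepcopy(nodes)

    # Assumption: the response always carries the flag as a boolean.
    return {"dry_run": dry_run}


state = {"desired_nodes": []}
request = {"nodes": [{"name": "node-1"}]}

simulated = update_desired_nodes(state, request, dry_run=True)
state_after_dry_run = copy.deepcopy(state)
applied = update_desired_nodes(state, request)
```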
This PR adds a user action to the SLM health indicator which checks each SLM policy's invocations
since last success field and reports degraded health (YELLOW) in the event that any policy is at or
above the failure threshold (default is 5 failures in a row).
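
A minimal sketch of the check described above, in Python for illustration. The function name and the input shape (a mapping from policy name to invocations since last success) are hypothetical; only the threshold default of 5 and the YELLOW status come from this message.

```python
FAILURE_THRESHOLD = 5  # default from this commit message


def slm_health(invocations_since_last_success, threshold=FAILURE_THRESHOLD):
    """Return (status, failing_policies) for a mapping of policy name to count."""
    failing = sorted(
        name
        for name, count in invocations_since_last_success.items()
        if count >= threshold  # "at or above" the failure threshold
    )
    status = "YELLOW" if failing else "GREEN"
    return status, failing


status, failing = slm_health({"nightly": 0, "hourly": 5, "weekly": 4})
```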
This commit removes the notion of components from the health API. Components are no longer
a top-level field in the response, and indicators is promoted into their place.
Our current default for the http.max_header_size setting is 8kb. This
is lower than the current default for Kibana (16kb in 8.x), and the ESS
proxy (1mb based on the Go http library default). To align with the
current convention of other Elastic components, this PR increases the
ES header size setting default to 16kb.
Closes #88501
Remove help_url, rename summary -> symptom, user_actions -> diagnosis
Separate the diagnosis `message` field into `cause` and `action`
Co-authored-by: Mary Gouseti <mgouseti@gmail.com>
* Convert disk watermarks to RelativeByteSizeValues
Similar to the existing watermark setting for the frozen tier.
Pre-requisite for PR 88639 that plans to introduce max headroom
settings for the disk watermarks, similar to the frozen tier max
headroom setting.
* Add changelog
* Revert 20gb to 20GB
* Make formatNoTrailingZerosPercent non static
* ByteSizeValue.MINUS_ONE
* Remove getMinimumTotalSizeForBelowWatermark
* Remove comment
* Fix minor stuff
* Make parsing of RelativeByteSizeValue faster
Mimics older definitelyNotPercentage function
* Remove Locale from Strings.format
* More MINUS_ONE
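
As a rough illustration of a setting that accepts either a relative or an absolute value, the Python sketch below parses a string as a percentage or as a byte size. It is not the Java `RelativeByteSizeValue` code; the function name, the return shape and the binary unit multipliers are assumptions made for this example.

```python
UNITS = {
    "b": 1,
    "kb": 1024,
    "mb": 1024 ** 2,
    "gb": 1024 ** 3,
    "tb": 1024 ** 4,
    "pb": 1024 ** 5,
}


def parse_relative_byte_size(text):
    """Return ("ratio", float) for percentages or ("bytes", int) for sizes."""
    value = text.strip().lower()
    if value.endswith("%"):
        ratio = float(value[:-1]) / 100
        if not 0.0 <= ratio <= 1.0:
            raise ValueError(f"percentage out of range: {text!r}")
        return ("ratio", ratio)
    # Check longer suffixes first so "gb" is not mistaken for plain "b".
    for suffix in sorted(UNITS, key=len, reverse=True):
        if value.endswith(suffix):
            number = float(value[: -len(suffix)])
            return ("bytes", int(number * UNITS[suffix]))
    raise ValueError(f"cannot parse {text!r} as a percentage or byte size")


relative = parse_relative_byte_size("85%")
absolute = parse_relative_byte_size("20GB")
```

Testing the trailing character first keeps the common cases cheap, which is the same idea as deciding early that a value is definitely not a percentage.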
This PR adds a new `knn` option to the `_search` API to support ANN search.
It's powered by the same Lucene ANN capabilities as the old `_knn_search`
endpoint. The `knn` option can be combined with other search features like
queries and aggregations.
Addresses #87625
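
For illustration, a `_search` request body that combines the `knn` option with a query and an aggregation might be assembled as in the Python sketch below. The helper function, the field names and the vector values are hypothetical; the `knn` keys shown (`field`, `query_vector`, `k`, `num_candidates`) follow the documented request shape as I understand it, so check the API reference before relying on them.

```python
import json


def knn_search_body(field, query_vector, k, num_candidates, query=None, aggs=None):
    """Build a _search body with a top-level knn section (illustrative helper)."""
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    body = {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        }
    }
    # The knn section sits alongside the usual search features.
    if query is not None:
        body["query"] = query
    if aggs is not None:
        body["aggs"] = aggs
    return body


body = knn_search_body(
    field="image_vector",  # hypothetical dense_vector field
    query_vector=[0.1, 0.2, 0.3],
    k=5,
    num_candidates=50,
    query={"match": {"title": "mountain"}},
    aggs={"by_tag": {"terms": {"field": "tag"}}},
)
payload = json.dumps(body)
```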
This adds support for the `cardinality` aggregation within a random_sampler.
This use case is helpful in determining the ratio of unique values compared to the count of total documents within the sampled set.
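
A hedged sketch of that use case in Python: build a request that nests `cardinality` under `random_sampler`, then compute the unique-value ratio from the sampler bucket. The aggregation names, the field name and the sample response are made up for illustration, and the response shape (a `doc_count` next to the nested `value`) is an assumption.

```python
def sampled_cardinality_request(field, probability):
    """Request body nesting a cardinality agg inside a random_sampler agg."""
    return {
        "size": 0,
        "aggs": {
            "sampled": {
                "random_sampler": {"probability": probability},
                "aggs": {"unique_values": {"cardinality": {"field": field}}},
            }
        },
    }


def unique_ratio(sampler_bucket):
    """Ratio of unique values to documents in the (assumed) sampler bucket."""
    doc_count = sampler_bucket["doc_count"]
    if doc_count == 0:
        return 0.0
    return sampler_bucket["unique_values"]["value"] / doc_count


request = sampled_cardinality_request("user_id", probability=0.1)

# Hypothetical response fragment, not real output.
sample_bucket = {"doc_count": 2000, "unique_values": {"value": 500}}
ratio = unique_ratio(sample_bucket)
```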
* [DOCS] Add minimal security steps back to docs
* Update instructions to use reset password tool
* Update setting built-in user passwords with the es reset passwords tool
* Revert "Update setting built-in user passwords with the es reset passwords tool"
This reverts commit 51b72fdfdf.
* Address review feedback and make clearer distinctions between security configurations
With https://github.com/elastic/ml-cpp/pull/2305, we now support caching PyTorch inference responses per node per model.
By default, the cache will be the same size as the model's on-disk size. This is because our current best estimate for memory used (for deploying) is 2*model_size + constant_overhead.
This is due to the model having to be loaded in memory twice when serializing to the native process.
But, once the model is in memory and accepting requests, its actual memory usage is reduced vs. what we have "reserved" for it within the node.
Consequently, having a cache layer that takes advantage of that unused (but reserved) memory is effectively free. When used in production, especially in search scenarios, caching inference results is critical for decreasing latency.
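
The sizing argument can be worked through with a small Python sketch. The 2*model_size + constant_overhead estimate and the cache size equal to the model size come from this message; the concrete byte values, and the assumption that steady-state usage drops to roughly one model copy plus the overhead, are illustrative only.

```python
MIB = 1024 * 1024


def reserved_bytes(model_size, constant_overhead):
    """Memory reserved for deployment: 2 * model_size + constant_overhead."""
    return 2 * model_size + constant_overhead


def default_cache_bytes(model_size):
    """By default the cache is the same size as the model on disk."""
    return model_size


model_size = 400 * MIB  # hypothetical model
constant_overhead = 240 * MIB  # hypothetical overhead

reserved = reserved_bytes(model_size, constant_overhead)
# Assumption: after loading, only one copy of the model stays resident.
steady_state = model_size + constant_overhead
headroom = reserved - steady_state
cache = default_cache_bytes(model_size)
```

Under that assumption the headroom equals the model size, so the default cache fits entirely inside memory that was already reserved.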
Currently we have two parameters that control how the source of a document
is stored, `enabled` and `synthetic`, both booleans. However, there are only
three possible combinations of these, with `enabled:false` and `synthetic:true`
being disallowed. To make this easier to reason about, this commit replaces
the `enabled` parameter with a new `mode` parameter, which can take the values
`stored`, `synthetic` and `disabled`. The `mode` parameter cannot be set
in combination with `enabled`, and we will subsequently move towards
deprecating `enabled` entirely.
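
The relationship between the old booleans and the new `mode` values can be sketched as below. This is a Python illustration rather than the mapper code; the function names, the dict-based config and the default values (`enabled` true, `synthetic` false) are assumptions for the example.

```python
def mode_from_flags(enabled=True, synthetic=False):
    """Map the legacy (enabled, synthetic) pair onto a single mode value."""
    if not enabled and synthetic:
        raise ValueError("enabled:false cannot be combined with synthetic:true")
    if not enabled:
        return "disabled"
    return "synthetic" if synthetic else "stored"


def resolve_source_mode(config):
    """Resolve a _source config dict, rejecting mode combined with enabled."""
    if "mode" in config:
        if "enabled" in config:
            raise ValueError("mode cannot be set in combination with enabled")
        if config["mode"] not in ("stored", "synthetic", "disabled"):
            raise ValueError(f"unknown mode: {config['mode']!r}")
        return config["mode"]
    return mode_from_flags(
        enabled=config.get("enabled", True),
        synthetic=config.get("synthetic", False),
    )


modes = [
    resolve_source_mode({}),
    resolve_source_mode({"synthetic": True}),
    resolve_source_mode({"enabled": False}),
    resolve_source_mode({"mode": "synthetic"}),
]
```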
The build_flavor was previously removed since it is no longer relevant;
only the default distribution now exists. However, the removal of build
flavor included removing it from the version information on the info
response for the root path. This API is supposed to be stable, so
removing that key was a compatibility break. This commit adds the
build_flavor back to that API, hardcoded to `default`. Additionally, a
test is added to ensure the key exists going forward, until it can be
properly deprecated.
Closes #88318
This PR adds a new setting to enable TCP keepalive probes for the
connections used by the OIDC back-channel communication. It defaults to
true, as TCP keepalive is generally useful for ES.
Relates: #87773
* Adding discovery troubleshooting link
* Add tags to pull in discovery troubleshooting content
* Move discovery troubleshooting to separate page and add redirects
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Plumbs through a new parameter for the cardinality aggregation, to allow configuring the execution mode. This can have a significant impact on speed and memory usage. This PR exposes three collection modes and two heuristics that we can tune going forward. All of these are treated as hints and can be silently ignored, e.g. if not applicable to the given field type. I've changed the default behavior to optimize for time, which potentially uses more memory. Users can override this to get the old behavior if needed.
* Generate release notes for v8.3.0 (#87294)
* Generate release notes for v8.3.0
* [DOCS] Add 8.3 migration file to index
* Fixed version number
* Fix formatting of deprecation in 85326
* Use asciidoc format for deprecations
* Extract static content from migration/index
* This was just an enhancement
* Nope, it was an upgrade
* Added migration/index.asciidoc generation support (#87318)
Including extracting static content from migration/index, so the template would be as light as possible.
The reason for this work is that the gradle task `generateReleaseNotes` was not correctly adding new links and imports to migration/index, which caused the documentation build to fail for 8.3.0.
* [DOCS] Add ml-cpp PRs to release notes
* Added back incorrectly deleted changelog
* Added missing highlight
* Fixed spelling of StackOverflowError
Co-authored-by: lcawl <lcawley@elastic.co>
* Brought back missing changelog for 87235 (#87370)
* Brought back missing changelog for 87235
* Regenerate release notes
* Regenerate release notes for BC3 (#87449)
* Regenerate release notes for BC3
* Re-applied manual fixes that Lisa Cawley used
* Fixed breaking changes generation
To match the manual edits done by Lisa Cawley
* Fixed failing test
Since the manual edits by Lisa removed the `-SNAPSHOT` from the docs
we remove it from the tests too.
* Remove changelogs for intermediate lucene upgrades
* Update release notes for BC4 (#87635)
* Update release notes for BC4
* Adding missing changelog for geo_grid
* Added missing highlight notes for 84250
* Update docs/reference/release-notes/highlights.asciidoc
Co-authored-by: David Kilfoyle <41695641+kilfoyle@users.noreply.github.com>
* Update docs/reference/release-notes/highlights.asciidoc
Co-authored-by: David Kilfoyle <41695641+kilfoyle@users.noreply.github.com>
* Backported fixes to original yaml
* Control sort order of release highlights
So that small changes to an individual highlight don't completely shuffle the entire document.
We also added links to the PRs from the highlight titles, for convenience.
Otherwise readers need to search the release notes for the changelog entry and click there,
which is a lot more work.
* Fixed failing test after new ordered highlights
Also made test verify re-ordering of highlights
Co-authored-by: David Kilfoyle <41695641+kilfoyle@users.noreply.github.com>
* Update release notes for BC5 (#87808)
* Update release notes for BC6 (#87912)
* Update release notes for BC9 (#88011)
* Regenerate release notes after version bump 8.4.0
And after forward porting 8.3.0 release notes changes.
* Regenerate after some changelog entries removed
* Re-prune the changelogs
* Removed three changelog entries
Co-authored-by: lcawl <lcawley@elastic.co>
Co-authored-by: David Kilfoyle <41695641+kilfoyle@users.noreply.github.com>