Adds a new health indicator that reports problems if indexes have a block placed on them, or if
any nodes in the cluster are running low on disk space.
This PR expands the approximate kNN docs to clarify that the filter is applied during
the kNN search, not after it, and explains the downsides of post-filtering.
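For example, a filtered approximate kNN search looks roughly like this (the index, field names, and values are only illustrative):
```
POST image-index/_search
{
  "knn": {
    "field": "image-vector",
    "query_vector": [54, 10, -2],
    "k": 5,
    "num_candidates": 50,
    "filter": {
      "term": {
        "file-type": "png"
      }
    }
  }
}
```
Because the `filter` is applied during the approximate kNN search, the query can still return up to `k` matching hits, whereas post-filtering could leave fewer than `k` results.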
These stats are exposed in NodeIndicesStats only at the node and index (but not shard) levels, and are also visible in the _cat/nodes table. An exact-count YAML REST test is added as well.
Introduce max headroom settings for the low, high, and flood-stage disk watermarks, similar to the existing max headroom setting for the flood stage of the frozen tier. Introduce the new max headrooms in HealthMetadata and in ReactiveStorageDeciderService. Add multiple tests in DiskThresholdDeciderUnitTests, DiskThresholdDeciderTests and DiskThresholdMonitorTests. Also add addition, subtraction, and min operations for ByteSizeValue.
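For illustration, assuming the new settings follow the existing disk watermark naming, they could be updated dynamically roughly like this (the values are only examples):
```
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low.max_headroom": "200gb",
    "cluster.routing.allocation.disk.watermark.high.max_headroom": "150gb",
    "cluster.routing.allocation.disk.watermark.flood_stage.max_headroom": "100gb"
  }
}
```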
This troubleshooting guide is what will be returned from the SLM health indicator
when an SLM policy has suffered too many repeated failures without a successful
execution.
This warning was lost in #83489, but it's important we have it in these
docs since users keep on trying this kind of invalid upgrade. This
commit reinstates the lost warning.
I've been hacking on synthetic source for a while now and not seen any
need to break backwards compatibility or any major bugs. I think it's
time to remove the `preview` marker from it so folks can use it without
fear.
It seems that for now we don't have a good use for the histogram and summary metric types.
They had been left as placeholders for a while, but at this point there is no concrete plan forward for them.
This PR removes the histogram and summary metric types. We may add them back in the future.
It also completely removes the time_series_metric mapping parameter from the histogram field type and only allows the gauge metric type for aggregate_metric_double fields.
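For example, after this change the only valid shape looks roughly like the following sketch (index and field names are made up):
```
PUT metrics-index
{
  "mappings": {
    "properties": {
      "cpu_usage": {
        "type": "aggregate_metric_double",
        "metrics": ["min", "max", "sum", "value_count"],
        "default_metric": "max",
        "time_series_metric": "gauge"
      }
    }
  }
}
```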
Categorization of strings which break down into a huge number of tokens can cause the C++ backend process to choke - see elastic/ml-cpp#2403.
This PR adds a limit filter to the default categorization analyzer which caps the number of tokens passed to the backend at 100.
Unfortunately this isn't a complete panacea for all the issues surrounding categorization of large, many-token messages, as verification checks on the frontend can also fail due to calls to the datafeed _preview API returning an excessive amount of data.
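For illustration, the cap is the same idea as adding a `limit` token filter to a custom categorization analyzer; a simplified sketch of a job's `analysis_config` (not the literal default analyzer definition) might look like:
```
"analysis_config": {
  "categorization_field_name": "message",
  "categorization_analyzer": {
    "tokenizer": "ml_standard",
    "filter": [
      { "type": "limit", "max_token_count": 100 }
    ]
  }
}
```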
ES_JAVA_OPTS is still the correct way to pass options to
the Elasticsearch process; CLI_JAVA_OPTS affects only the
command-line tools. CLI_JAVA_OPTS is therefore the correct way
to pass options for plugin installation or other tools.
The get snapshot status API currently returns a `state` value of `STARTED` for a snapshot that is running, while the documentation says that the `state` value for a running snapshot is `IN_PROGRESS`. This documentation change aligns the docs with the actual result of the get snapshot status API.
Co-authored-by: Austin Smith <76973609+asmith-elastic@users.noreply.github.com>
Part of the stable master history health indicator's results (the
`cluster_formation` section within `details`) used dynamic keys in a
map. This change gets rid of that, so instead of:
```
"details": {
"current_master": {
"node_id": null,
"name": null
},
"recent_masters": [
{
"node_id": "31WBm9iTTRuMyWnBhWNUGA",
"name": "master-node-3"
}
],
"cluster_formation": {
"31WBm9iTTRuMyWnBhWNUGA": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [nADkAeGsT-q12gw89Ga1FA, 31WBm9iTTRuMyWnBhWNUGA, w8v48JvuRsuDCjwBn8KbRw], have only discovered non-quorum [{master-node-3}{31WBm9iTTRuMyWnBhWNUGA}{lJmGYiTPS_W7AJU7csG_gQ}{master-node-3}{127.0.0.1}{127.0.0.1:9301}{dm}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9302, 127.0.0.1:9303, 127.0.0.1:9304, 127.0.0.1:9305, [::1]:9300, [::1]:9302, [::1]:9303, [::1]:9304, [::1]:9305] from hosts providers and [{master-node-2}{nADkAeGsT-q12gw89Ga1FA}{logzEHuuTpqwJp-RWssBPw}{master-node-2}{127.0.0.1}{127.0.0.1:9300}{dm}, {master-node-3}{31WBm9iTTRuMyWnBhWNUGA}{lJmGYiTPS_W7AJU7csG_gQ}{master-node-3}{127.0.0.1}{127.0.0.1:9301}{dm}] from last-known cluster state; node term 39, last-accepted version 461 in term 39"
}
}
```
We will have:
```
"details": {
"current_master": {
"node_id": null,
"name": null
},
"recent_masters": [
{
"node_id": "31WBm9iTTRuMyWnBhWNUGA",
"name": "master-node-3"
}
],
"cluster_formation": [
{
"node_id": "31WBm9iTTRuMyWnBhWNUGA",
"cluster_formation_message": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [nADkAeGsT-q12gw89Ga1FA, 31WBm9iTTRuMyWnBhWNUGA, w8v48JvuRsuDCjwBn8KbRw], have only discovered non-quorum [{master-node-3}{31WBm9iTTRuMyWnBhWNUGA}{lJmGYiTPS_W7AJU7csG_gQ}{master-node-3}{127.0.0.1}{127.0.0.1:9301}{dm}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9302, 127.0.0.1:9303, 127.0.0.1:9304, 127.0.0.1:9305, [::1]:9300, [::1]:9302, [::1]:9303, [::1]:9304, [::1]:9305] from hosts providers and [{master-node-2}{nADkAeGsT-q12gw89Ga1FA}{logzEHuuTpqwJp-RWssBPw}{master-node-2}{127.0.0.1}{127.0.0.1:9300}{dm}, {master-node-3}{31WBm9iTTRuMyWnBhWNUGA}{lJmGYiTPS_W7AJU7csG_gQ}{master-node-3}{127.0.0.1}{127.0.0.1:9301}{dm}] from last-known cluster state; node term 39, last-accepted version 461 in term 39"
}
]
}
```
The cross_fields scoring type can produce negative scores when some documents
are missing fields. When blending term document frequencies, we take the maximum
document frequency across all fields. If a field appears in fewer documents than
that blended maximum, its IDF can become negative, because IDF is calculated as
`Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))`. For example, with
`docCount = 5` and a blended `docFreq = 10`, the ratio is (5 - 10 + 0.5) / (10 + 0.5) ≈ -0.43,
giving log(0.57), which is negative.
This change adjusts the docFreq for each field to `Math.min(docCount, docFreq)`
so that the IDF can never become negative. It makes sense that the term document
frequency should never exceed the number of documents containing the field.
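For context, the scoring type in question is used by `multi_match` queries like the following sketch (index and field names are made up):
```
GET my-index/_search
{
  "query": {
    "multi_match": {
      "query": "Will Smith",
      "type": "cross_fields",
      "fields": ["first_name", "last_name"],
      "operator": "and"
    }
  }
}
```
If many documents have `first_name` but only a few have `last_name`, the blended document frequency taken from `first_name` can exceed the document count of `last_name`, which is exactly the case the `Math.min` clamp guards against.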
Requests to the bulk API comprise a sequence of items, each of which
starts with a JSON object describing the item. This object includes the
type of action to perform with the item, which should be one of `create`,
`update`, `index`, or `delete`. In earlier versions Elasticsearch would
ignore items with an unrecognized type, skipping the next line in the
request, but this lenient behaviour means that there is no way for the
client to associate the items in the response with the items in the
request, and in some cases it would cause the remainder of the request
to be parsed incorrectly.
With this commit, requests to the bulk API must comprise only items with
recognized types. Elasticsearch will reject requests containing any
items with an unrecognized type with a `400 Bad Request` error response.
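For reference, a well-formed `_bulk` request looks roughly like this (the index, ids, and fields are made up); each item starts with one of the recognized action types:
```
POST _bulk
{ "index": { "_index": "my-index", "_id": "1" } }
{ "message": "first document" }
{ "delete": { "_index": "my-index", "_id": "2" } }
```
Any other action name on an item's first line now results in a `400 Bad Request` instead of being silently skipped.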
This change adds the filter query for a filtered alias to the knn query during the dfs phase on the
shard. This ensures that the correct number of results (k) is returned, rather than results being
removed by a post filter.
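To illustrate, a filtered alias attaches a filter query to an index; a rough sketch (the alias, index, and filter values are made up):
```
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "image-index",
        "alias": "png-images",
        "filter": { "term": { "file-type": "png" } }
      }
    }
  ]
}
```
A kNN search against `png-images` now applies that `term` filter during the kNN search on each shard, so it can still return up to k hits.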
Fixes: #89561
* [Doc] Release notes for v8.4.1 (#89636)
* [Doc] Release notes for v8.4.1
Gradle generated release notes for v8.4.1
* address feedback
* [DOCS] Remove coming tag for 8.4.1 RNs
Co-authored-by: Yang Wang <yang.wang@elastic.co>
This adds support for synthetic _source to the `version` field type. It
works very similarly to `keyword` but with an extra decode step.
I modified the decoder to return a `BytesRef` instead of a `String`
because many of the callers seemed to be converting that string directly
back into bytes. Synthetic source would have needed to do that as well,
and so does the query infrastructure.
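For example, a mapping along these lines now supports synthetic source (index and field names are made up):
```
PUT my-index
{
  "mappings": {
    "_source": { "mode": "synthetic" },
    "properties": {
      "software_version": { "type": "version" }
    }
  }
}
```
With synthetic `_source` enabled, values of the `version` field are reconstructed from doc values via the decode step described above.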
* Create restart-cluster.asciidoc
As per https://github.com/elastic/elasticsearch/issues/49972 and https://github.com/elastic/elasticsearch/issues/56578, if a node is above the low disk watermark when it is restarted (rolling restart, network disruption, or crash), the disk threshold decider prevents reusing the shard content on the restarted node.
As a consequence, the node may take a long time to start.
* Update docs/reference/setup/restart-cluster.asciidoc
Co-authored-by: Adam Locke <adam.locke@elastic.co>