This will correct/switch "year" unit diffing from the current integer
subtraction to a crono subtraction. Consequently, two dates are (at
least) one year apart now if (at least) a full calendar year separates
them. The previous implementation simply subtracted the year part of the
dates.
Note: this parts with ES SQL's implementation of the same function,
which itself is aligned with MS SQL's implementation, which works
equivalent to an integer subtraction.
Fixes#112482.
Apparently some users consider "node is restarting" not to apply to a
full-cluster restart. This commit further clarifies that you must not
set `cluster.initial_master_nodes` in a full cluster restart.
* Add connector permissions to fleet server service account
* [Security] Add permissions to manage connectors for fleet-server service account
* Fix tests
* Fix tests
* Fix typ again (tm)
* switch to connector/* vs manage_connectors
@jakelandis pointed out that we don't need connector secrets, which is the only difference between these too. We don't have a pretty name for the narrower permissions, but we don't need one here.
Co-authored-by: Artem Shelkovnikov <lavatroublebubble@gmail.com>
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Sean Story <sean.j.story@gmail.com>
Co-authored-by: Artem Shelkovnikov <lavatroublebubble@gmail.com>
Here we test reindexing logsdb indices, creating and restoring
snapshots. Note that logsdb uses synthetic source and restoring
source only snapshots fails due to missing _source.
ZStandard was added via #103374 a few months ago to snapshot builds of Elasticsearch only and benchmark results have shown that using zstd is a better trade off compared to deflate for when index.codec is set to best_compression.
This change removes the feature flag for ZStandard stored field compression for indices with index.codec set to best_compression.
* (Doc+) Inference Pipeline ignores Mapping Analyzers
From internal Dev feedback (will cross-link after), this updates that inference processors within ingest pipelines run before mapping analyzers effectively ignoring them. So if users want analyzers to take effect, they would need to select the analyzer's ingest pipeline process equivalent and run it higher in flow than the inference processor.
---------
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
👋 howdy, team!
1. Related to https://github.com/elastic/dev/issues/2631, highlights customers are usually seeking `heap.percent` instead of `ram.percent`
2. Aligns the claimed "(Default)" columns in doc to what returned for v8.15.1 test cluster
Adds an API which scans all the metadata (and optionally the raw data)
in a snapshot repository to look for corruptions or other
inconsistencies.
Closes https://github.com/elastic/elasticsearch/issues/52622 Closes
ES-8560
Currently, the data stream lifecycle telemetry has the following
structure:
```
{
....
"data_lifecycle" : {
"available": true,
"enabled": true,
"count": 0,
"default_rollover_used": true,
"retention": {
"minimum_millis": 0,
"maximum_millis": 0,
"average_millis": 0.0
}
}....
```
In the snippet above you can see that we track:
- The amount of data streams managed by the data stream lifecycle by `count`
- If the default rollover has been overwritten by `default_rollover_used`
- The min, max and average of the `data_retention` configured on a data stream level.
In this PR we propose the following extention:
```
....
"data_lifecycle" : {
"available": true,
"enabled": true,
"count": 0,
"default_rollover_used": true,
"effective_retention": { #https://github.com/elastic/dev/issues/2537
"retained_data_streams": 5,
"minimum_millis": 0, # Only if retained data streams > 1
"maximum_millis": 0,
"average_millis": 0.0
},
"data_retention": {
"configured_data_streams": 5,
"minimum_millis": 0, # Only if retained data streams > 1
"maximum_millis": 0,
"average_millis": 0.0
},
"global_retention": {
"default": {
"defined": true/false,
"affected_data_streams": 0,
"millis": 0
},
"max": {
"defined": true/false,
"affected_data_streams": 0,
"millis": 0
}
}
```
With this extension we are tracking:
- The amount of data streams managed by the data stream lifecycle by `count`
- If the default rollover has been overwritten by `default_rollover_used`
- The min, max and average of the `data_retention` configured on a data stream level and the number of data streams that have it configured. We add the min, max and avg only if there are data streams with data retention configuration to avoid messing with the stats in a dashboard.
- The min, max and average of the `effective_retention` and the number of data streams that are retained. We add the min, max and avg only if there are retained data streams to avoid messing with the stats in a dashboard.
- Global retention stats, if they are defined, if the number of the affected data streams and the actual value.
The above metrics allow us to answer questions like:
- How many data streams are affected by global retention.
- How big is the difference between the longest data retention compared to max global retention.
- How much does the effective retention diverging from the data retention, this will show the impact of the global retention.
* #101472 Updates default index.translog.flush_threshold_size value
* Update docs/reference/index-modules/translog.asciidoc
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Updates the description
---------
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
When CASE hits a multivalued field it was previously either crashing on
fold or evaluating it to the first value. Since booleans are loaded in
sorted order from lucene that *usually* means `false`. This changes the
behavior to line up with the rest of ESQL - now multivalued fields are
treated as `false` with a warning.
You might say "hey wait! multivalued fields usually become `null`, not
`false`!". Yes, dear reader, you are right. Very right. But! `CASE`'s
contract is to immediatly convert its values into `true` or `false`
using the standard boolean tri-valued logic. So `null` just become
`false` immediately. This is how PostgreSQL, MySQL, and SQLite behave:
```
> SELECT CASE WHEN null THEN 1 ELSE 2 END;
2
```
They turn that `null` into a false. And we're right there with them.
Except, of course, that we're turning `[false, false]` and the like into
`null` first. See!? It's consitent. Consistently confusing, but sane at
least.
The warning message just says "treating multivalued field as false"
rather than explaining all of that.
This also fixes up a few of CASE's docs which I noticed were kind of
busted while working on CASE. I think the docs generation is having a
lot of trouble with CASE so I've manually hacked the right thing into
place, but we should figure out a better solution eventually.
Closes#112359
- Added mv_median_absolute_deviation function
- Added possibility of having a fixed param in Multivalue "ascending" functions
- Add surrogate to MedianAbsoluteDeviation
### Calculations used to avoid overflows
First, a quick recap of how the MAD is calculated:
1. Sort values, and get the median
2. Calculate the difference between each value with the median (`abs(median - value)`)
3. Sort the differences, and get their median
Calculating a MAD may overflow when calculating the differences (Step 2), given the type is a signed number, as the difference is a positive value, with potentially the same value as `POSITIVE_MAX - NEGATIVE_MIN`.
To solve this, some types are up-casted as follow:
- Int: Stored as longs, simple approach
- Long: Stored as longs, but switched to unsigned long representation when calculating the differences
- Unsigned long: No effect; the resulting range is the same
- Doubles: Nothing. If the values overflow to +/-infinity, they're left that way, as we'll just use those outliers to sort
Closes https://github.com/elastic/elasticsearch/issues/111590
JDK 23 removes the COMPAT locale provider, leaving CLDR as the only option. This commit configures Elasticsearch
to use the CLDR provider when on JDK 23, but still use the existing COMPAT provider when on JDK 22 and below.
This causes some differences in locale behaviour; this also adapts various tests to still work whether run on COMPAT or CLDR.