A DLS query references a runtime field loaded from _source. When we create the collector manager, we retrieve numDocs, which goes through all segments and executes the script for each document. StoredFieldSourceProvider relies on leaf ordinals to build an array, but those ordinals are not populated when numDocs is computed via BaseCompositeReader, because that goes through the subreader contexts rather than the context leaves (a subtle difference that bites us here).
Fixes #111637
We should avoid wrapping EsRejectedExecutionException in an
ElasticsearchException as it would change the status code from 429 to
500. Ideally, we should avoid wrapping exceptions altogether, but that
would require bigger changes.
Closes #112106
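A minimal sketch of why the wrapping matters. The classes `SketchStatusException` and `SketchRejectedException` and the helpers below are hypothetical stand-ins, not the real Elasticsearch exception hierarchy; the only thing carried over from the description is that a rejection maps to 429 and a generic wrapper maps to 500.

```java
// Hypothetical stand-ins for illustration; not the real Elasticsearch classes.
class SketchStatusException extends RuntimeException {
    SketchStatusException(String message, Throwable cause) {
        super(message, cause);
    }

    // A generic wrapper reports an internal server error.
    int status() {
        return 500;
    }
}

class SketchRejectedException extends SketchStatusException {
    SketchRejectedException(String message) {
        super(message, null);
    }

    // A rejection reports "too many requests".
    @Override
    int status() {
        return 429;
    }
}

class RejectionStatusSketch {
    // Wrapping unconditionally hides the 429 behind the wrapper's 500.
    static SketchStatusException wrapAlways(Exception e) {
        return new SketchStatusException("failed", e);
    }

    // Passing rejections through untouched keeps the 429 visible to the caller.
    static SketchStatusException wrapUnlessRejected(Exception e) {
        if (e instanceof SketchRejectedException) {
            return (SketchRejectedException) e;
        }
        return new SketchStatusException("failed", e);
    }
}
```

With this shape, only non-rejection failures end up as a 500.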
This commit will remove the "remote_cluster" from the role descriptor of API keys that is sent to older clusters for RCS 1.0.
This will allow API keys created in 8.15.0 that reference "remote_cluster" to work when sent to an older cluster.
The API key could either explicitly reference "remote_cluster" or implicitly reference it via the limited-by permissions of the superuser built-in role.
Note that this is in reference to the standard API key, not the cross cluster API key. The cross cluster API already removes this for older clusters.
fixes: #112222
related: #107493
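The idea can be sketched roughly as follows. This is an illustration only: the role descriptor is modelled as a plain map, and `forPeer` and the `peerSupportsRemoteCluster` flag are hypothetical names, not the actual security code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RoleDescriptorStripSketch {
    // Returns the descriptor to send to a peer. For peers that do not understand
    // "remote_cluster", a copy without that entry is sent; the original is never mutated.
    static Map<String, Object> forPeer(Map<String, Object> roleDescriptor, boolean peerSupportsRemoteCluster) {
        if (peerSupportsRemoteCluster || roleDescriptor.containsKey("remote_cluster") == false) {
            return roleDescriptor;
        }
        Map<String, Object> copy = new LinkedHashMap<>(roleDescriptor);
        copy.remove("remote_cluster");
        return copy;
    }
}
```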
I have investigated an issue with QA clusters that run release builds. I
wish I could enable query pragmas to confirm the problem instead of
setting up new clusters and replicating data before testing the theory.
This change allows users to enable query pragmas in release builds.
However, due to the risks associated with using pragmas, the
accept_pragma_risks parameter must be explicitly set to true to proceed.
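A rough sketch of the guard, under assumptions: the method name, the `isSnapshotBuild` flag and the error text are hypothetical, and the premise that snapshot builds accept pragmas without the flag is an assumption made for this illustration.

```java
import java.util.Map;

class PragmaGuardSketch {
    // Rejects pragmas on release builds unless the caller explicitly accepted the risks.
    static void validate(Map<String, Object> pragmas, boolean acceptPragmaRisks, boolean isSnapshotBuild) {
        if (pragmas.isEmpty() || isSnapshotBuild) {
            return; // nothing risky requested, or a build assumed to allow pragmas anyway
        }
        if (acceptPragmaRisks == false) {
            throw new IllegalArgumentException(
                "pragmas are only accepted on release builds when accept_pragma_risks is true"
            );
        }
    }
}
```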
* Semantic reranking should fail whenever inference ID does not exist
* Short circuit text similarity reranking on empty result set
* Update tests
* Remove test - it doesn't do anything useful
* Update docs/changelog/112038.yaml
* Always check crsType when folding spatial functions
* Update docs/changelog/112090.yaml
* Only require capability for fixed test
The other tests passed on older versions anyway.
This changes the generated types tables in the docs to say `date`
instead of `datetime`. That's the name of the field in Elasticsearch so
it's a lot less confusing to call it that.
Closes #111650
The node-disconnected exception might not include the root cause. In
this case, the failure collector incorrectly unwraps the exception and
wraps it in a new Elasticsearch exception, losing the message. We should
instead use the original exception to preserve the reason.
Closes #111894
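The difference between the two behaviours can be sketched like this. `SketchNodeDisconnectedException` and both helper methods are hypothetical stand-ins, not the actual failure collector.

```java
// Hypothetical stand-in for a transport-level exception that may lack a cause.
class SketchNodeDisconnectedException extends RuntimeException {
    SketchNodeDisconnectedException(String message, Throwable cause) {
        super(message, cause);
    }
}

class FailureUnwrapSketch {
    // Lossy: always rewraps the cause. With a null cause the new exception has no message.
    static Exception unwrapLossy(Exception e) {
        return new RuntimeException(e.getCause());
    }

    // Preserving: unwrap only when there is a cause, otherwise keep the original exception.
    static Exception unwrapPreserving(Exception e) {
        Throwable cause = e.getCause();
        return cause instanceof Exception ? (Exception) cause : e;
    }
}
```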
This change ensures that we don't try to compute stats on mappings that don't have dense or sparse vector fields. We don't need to go through all the fields on every segment; instead, we can extract the vector fields upfront and limit the work to only indices that define these types.
Closes #111715
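As an illustration of the "extract upfront" idea, here is a small sketch. The mapping is modelled as a flat field-name-to-type map and `vectorFields` is a hypothetical helper; the real stats code works on mapper and segment structures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class VectorFieldFilterSketch {
    // Collects the names of dense_vector and sparse_vector fields once per index.
    // An empty result means the per-segment stats work can be skipped for that index.
    static List<String> vectorFields(Map<String, String> fieldTypes) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : fieldTypes.entrySet()) {
            String type = entry.getValue();
            if ("dense_vector".equals(type) || "sparse_vector".equals(type)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```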
Fix validation of fields mapped to different types in different indices and align with validation of fields of unsupported type.
* Allow using multi-typed fields in KEEP and DROP, just like unsupported fields.
* Explicitly invalidate using both these field kinds in RENAME.
* Map both kinds of fields to UnsupportedAttribute to enforce consistency.
* Consider convert functions containing valid multi-typed fields as resolved to avoid weird workarounds when resolving STATS.
* Add a bunch of tests.
(cherry picked from commit 585480fe44)
# Conflicts:
# x-pack/plugin/esql/qa/testFixtures/src/main/resources/union_types.csv-spec
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Stats.java
Prior to this PR, when the security-crypto threadpool queue overflows and rejects API key hashing submissions, a toxic value (specifically, a future which will never be completed) is added to the API key auth cache. This toxic cache value causes future authentication attempts with that API key to fail by timeout, because they will attempt to wait for the toxic future, until that value is invalidated and removed from the cache. Additionally, this will hold on to memory for each request that waits on the toxic future, even after the request has timed out.
This PR adds a unit test to replicate this case, and adjusts the code which submits the key hashing task to the security-crypto threadpool to properly handle this point of failure by invalidating the cached future and notifying waiting handlers that the computation has failed.
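The failure mode and the fix can be sketched with a plain map of futures. Everything here (`ApiKeyCacheSketch`, `verify`, the `Callable` standing in for the hash check) is a hypothetical simplification, not the actual API key service code.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;

class ApiKeyCacheSketch {
    final Map<String, CompletableFuture<Boolean>> cache = new ConcurrentHashMap<>();

    CompletableFuture<Boolean> verify(String keyId, Executor cryptoPool, Callable<Boolean> hashCheck) {
        CompletableFuture<Boolean> fresh = new CompletableFuture<>();
        CompletableFuture<Boolean> existing = cache.putIfAbsent(keyId, fresh);
        if (existing != null) {
            return existing; // another request is already hashing this key; wait on its future
        }
        try {
            cryptoPool.execute(() -> {
                try {
                    fresh.complete(hashCheck.call());
                } catch (Exception e) {
                    cache.remove(keyId, fresh);
                    fresh.completeExceptionally(e);
                }
            });
        } catch (RejectedExecutionException e) {
            // Without these two lines the cached future would never complete.
            cache.remove(keyId, fresh);
            fresh.completeExceptionally(e);
        }
        return fresh;
    }
}
```

Removing the entry lets a later request retry the hashing, and completing the future exceptionally releases every handler already waiting on it.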
* Replace model_id with inference_id in inference API except when storing ModelConfigs
* Update docs/changelog/111366.yaml
* replace missed literals in tests
Fix bugs caused by pushing down Eval, Grok, Dissect and Enrich past Rename, where after the pushdown, the columns added shadowed the columns to be renamed.
For Dissect and Grok, this enables naming their generated attributes to deviate from the names obtained from the dissect/grok patterns.
(cherry picked from commit e8a01bbd9c)
# Conflicts:
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/OptimizerRules.java
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Dissect.java
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Enrich.java
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Eval.java
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/RegexExtract.java
Update the stack monitoring template for .monitoring-beats-mb to include the latest apm-server monitoring metrics. All stack monitoring apm-server metric references in Kibana should remain intact. To avoid breaking the stack monitoring UI, beat.stats.apm_server.server.response.errors.concurrency is manually kept in the mapping, although it is unused and not present in apm-server stats.
(cherry picked from commit 2fb6c80df2)
* Inject `host.name` field without relying on (component) templates (#110938)
We do not want to rely on templates or component templates to include
the host.name field in indices using LogsDB. The host.name field is a field
we sort on by default when LogsDB is used. As a result, we just inject it
by default, the same way we do for the @timestamp field. This prevents
sorting errors due to missing host.name field in mappings.
The host.name is a keyword field and depending on the value of subobjects it will
be mapped as a name keyword nested inside a host or as a flat host.name keyword.
We also include ignore_above as we normally do for keywords in observability mappings.
* Enable missing hostname test
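A sketch of the two mapping shapes described above, built as plain maps. The helper name is hypothetical and the `ignore_above` value is left as a parameter because the description does not state it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HostNameMappingSketch {
    // Builds the "properties" section for host.name, nested or flat depending on subobjects.
    static Map<String, Object> hostNameProperties(boolean subobjects, int ignoreAbove) {
        Map<String, Object> keyword = new LinkedHashMap<>();
        keyword.put("type", "keyword");
        keyword.put("ignore_above", ignoreAbove);

        Map<String, Object> properties = new LinkedHashMap<>();
        if (subobjects) {
            // host is an object with a nested "name" keyword
            Map<String, Object> hostProperties = new LinkedHashMap<>();
            hostProperties.put("name", keyword);
            Map<String, Object> host = new LinkedHashMap<>();
            host.put("type", "object");
            host.put("properties", hostProperties);
            properties.put("host", host);
        } else {
            // dots stay part of the field name
            properties.put("host.name", keyword);
        }
        return properties;
    }
}
```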
Calling Rename.output() previously returned wrong results.
Since #110488, it throws an IllegalStateException instead. That leads to test failures in the EsqlNodeSubclassTests because e.g. MvExpandExec and FieldExtractExec eagerly call .output() on their child when they are being constructed, and the child can be a fragment containing a Rename.
(cherry picked from commit 7df1b06525)
# Conflicts:
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Rename.java
* ESQL: Validate unique plan attribute names (#110488)
* Enforce an invariant in our dependency checker so that logical plans never have duplicate output attribute names or ids.
* Fix ROW to not produce columns with duplicate names.
* Fix ResolveUnionTypes to not create multiple synthetic field attributes for the same union type.
* Add tests for commands using the same column name more than once.
* Update docs w.r.t. how commands behave if they are used with duplicate column names.
(cherry picked from commit da5392134f)
# Conflicts:
# x-pack/plugin/esql/qa/testFixtures/src/main/resources/stats.csv-spec
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/OptimizerRules.java
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Rename.java
* Remove unrelated csv tests
These slipped in via merge conflicts.
Non-master-eligible nodes that are already part of a cluster when the master is upgraded don't re-join the cluster, so their cluster features never get updated. This adds a cluster listener that spots this occurring, and manually gets the node's features with a new transport action and updates the cluster state after the fact.
* Fix bug in union-types with type-casting in grouping key of STATS (#110476)
* Allow auto-generated type-cast fields in CsvTests
This allows, for example, a csv-spec test result header like `client_ip::ip:ip`, which is generated with a command like `STATS count=count(*) BY client_ip::ip`
It is also a small cleanup of the header parsing code, since it was using Strings.split() in an odd way.
* Fix bug in union-types with type-casting in grouping key of STATS
* Update docs/changelog/110476.yaml
* Added casting_operator required capability
Using the new `::` syntax requires disabling support for older versions in multi-cluster tests.
* Added more tests for inline stats over long/datetime
* Trying to fix the STATS...STATS bug
This makes two changes:
* Keeps the Alias in the aggs.aggregates from the grouping key, so that ReplaceStatsNestedExpressionWithEval still works
* Adds explicit support for union-types conversion at grouping key loading in the ordinalGroupingOperatorFactory
Neither fixes the particular edge case, but both seem correct
* Added EsqlCapability for this change
So that mixed cluster tests don't fail these new queries.
* Fix InsertFieldExtract for union types
Union types require a FieldExtractExec to be performed first thing at
the bottom of local physical plans.
In queries like
```
from testidx*
| eval x = to_string(client_ip)
| stats c = count(*) by x
| keep c
```
The `stats` has the grouping `x` but the aggregates get pruned to just
`c`. In cases like this, we did not insert a FieldExtractExec, which
this fixes.
* Revert query that previously failed
With Alex's fix, this query now passes.
* Revert integration of union-types to ordinals aggregator
This is because we have not found a test case that actually demonstrates this is necessary.
* More tests that would fail without the latest fix
* Correct code style
* Fix failing case when aggregating on union-type with invalid grouping key
* Capabilities restrictions on the new YML tests
* Update docs/changelog/110476.yaml
---------
Co-authored-by: Alexander Spies <alexander.spies@elastic.co>
* An alternative approach to supporting union-types on stats grouping field (#110600)
* Added union-types field extraction to ordinals aggregation
* Revert previous approach to getting union-types working in aggregations
where the grouping field is erased by later commands, like a subsequent stats.
Instead we include union-type support in the ordinals aggregation and mark the block loader as not supporting ordinals.
* Fix union-types when aggregating on inline conversion function (#110652)
A query like:
```
FROM sample_data, sample_data_str
| STATS count=count(*) BY client_ip = TO_IP(client_ip)
| SORT count DESC, client_ip ASC
| KEEP count, client_ip
```
Failed due to unresolved aggregates from the union-type in the grouping key
* Fix for union-types for multiple columns with the same name (#110793)
* Make union types use unique attribute names
* Cleanup leftover
* Added failing test and final fix to EsRelation
* Implement FieldAttribute.fieldName()
* Fix tests
* Refactor
* Do not ignore union typed field's parent
* Fix important typo
D'oh
* Mute unrelated (part of) test
* Move capability to better location
* Fix analyzer tests
* multi-node tests with an earlier version of union-types (before this change) fail
* Add capability to remaining failing tests
* Remove variable
* Add more complex test
* Consolidate union type cleanup rules
* Add 3 more required_capability's to make CI happy
* Update caps for union type subfield yaml tests
* Update docs/changelog/110793.yaml
* Refined changelog text
* Mute BWC for 8.15.0 for failing YAML tests
* union_types_remove_fields for all 160_union_types tests
The tests fail sporadically, so it is safer to mute the entire suite.
---------
Co-authored-by: Alexander Spies <alexander.spies@elastic.co>
* [ESQL] Count_distinct(_source) should return a 400 (#110824)
Resolves
[#105240](https://github.com/elastic/elasticsearch/issues/105240)
Count_distinct doesn't work on _source, but the type resolution was
allowing that through. This resulted in a 500 a layer deeper in the
aggregations code. This PR fixes the 500 error by correctly failing
during type resolution.
* Even hand-backporting, I messed up the capabilities file
* one more
Min/max range for the event.ingested timestamp field (part of Elastic Common
Schema) was added to IndexMetadata in cluster state for searchable snapshots
in #106252.
This commit modifies the search coordinator to rewrite searches to MatchNone
if the query searches a range of event.ingested that, from the min/max range
in cluster state, is known to not overlap. This is the same behavior we currently
have for the @timestamp field.
Resolves https://github.com/elastic/elasticsearch/issues/104323
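The overlap check behind the rewrite can be sketched as follows. This is a simplified illustration with inclusive bounds in epoch milliseconds; `canRewriteToMatchNone` is a hypothetical helper, not the coordinator's actual method.

```java
class RangeRewriteSketch {
    // True when the queried range [queryFrom, queryTo] cannot intersect the
    // known [indexMin, indexMax] range of the field, so no document can match.
    static boolean canRewriteToMatchNone(long queryFrom, long queryTo, long indexMin, long indexMax) {
        return queryTo < indexMin || queryFrom > indexMax;
    }
}
```

Any overlap, even at a single boundary value, means the query has to run as usual.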
This fixes and adds tests for the first of the two bullets in the linked
issue. `ExpressionBuilder#visitIntegerValue` will attempt to parse a
string as an integral value, and return a Literal of the appropriate
type. The actual parsing happens in `StringUtils#parseIntegral`. That
function has special handling for values that are larger than
`Long.MAX_VALUE` where it attempts to turn them into unsigned longs, and
if the number is still out of range, throws `InvalidArgumentException`.
`ExpressionBuilder` catches that `InvalidArgumentException` and tries to
parse a `double` instead. If, on the other hand, the value is smaller
than `Long.MIN_VALUE`, `StringUtils` never enters the unsigned long path
and just calls `intValueExact`, which throws `ArithmeticException`.
This PR solves the issue by catching that `ArithmeticException` and
rethrowing it as an `InvalidArgumentException`.
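The control flow described above can be sketched as below. This is a simplified stand-in, not the real `StringUtils` or `ExpressionBuilder`: `SketchInvalidArgumentException`, `parseIntegral` and `parseNumber` are hypothetical, and `longValueExact` is used here in place of the call named in the description.

```java
import java.math.BigInteger;

// Hypothetical stand-in for InvalidArgumentException.
class SketchInvalidArgumentException extends RuntimeException {
    SketchInvalidArgumentException(String message, Throwable cause) {
        super(message, cause);
    }
}

class IntegralParseSketch {
    static final BigInteger LONG_MAX = BigInteger.valueOf(Long.MAX_VALUE);
    static final BigInteger UNSIGNED_LONG_MAX = BigInteger.ONE.shiftLeft(64).subtract(BigInteger.ONE);

    static Number parseIntegral(String text) {
        BigInteger value = new BigInteger(text);
        if (value.compareTo(LONG_MAX) > 0) {
            // Above Long.MAX_VALUE: try the unsigned long range.
            if (value.compareTo(UNSIGNED_LONG_MAX) > 0) {
                throw new SketchInvalidArgumentException("Number [" + text + "] is too large", null);
            }
            return value;
        }
        try {
            return value.longValueExact();
        } catch (ArithmeticException e) {
            // Below Long.MIN_VALUE: rethrow as the type the caller already handles.
            throw new SketchInvalidArgumentException("Number [" + text + "] is too small", e);
        }
    }

    // Caller side: fall back to a double when the integral parse is out of range.
    static Number parseNumber(String text) {
        try {
            return parseIntegral(text);
        } catch (SketchInvalidArgumentException e) {
            return Double.parseDouble(text);
        }
    }
}
```

With the rethrow in place, values below `Long.MIN_VALUE` take the same double fallback as values that are too large.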
The types from com.nimbusds.jwt are barely needed in x-pack/plugin/core.
They're only needed in module org.elasticsearch.security, x-pack:plugin:security project.