17952 commits

Author SHA1 Message Date
Richard Dennehy
ceaa01a538
Add Issuer to failed SAML Signature validation logs when available (#126310)
* Add Issuer to failed SAML Signature validation logs when available

* [CI] Auto commit changes from spotless

* Fix tests

* Update docs/changelog/126310.yaml

* address review comments

* replace String.format call

* update formatIssuer to describeIssuer

* [CI] Auto commit changes from spotless

* truncate long issuers in log messages

* [CI] Auto commit changes from spotless

* handle null issuer value

* address review comments

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-04-08 10:50:45 +01:00
Slobodan Adamović
284121ad9f
Set keyUsage for generated HTTP certificates and self-signed CA (#126376)
The `elasticsearch-certutil http` command, and security auto-configuration, 
generate the HTTP certificate and CA without setting the `keyUsage` extension.

This PR fixes this by setting (by default):
- `keyCertSign` and `cRLSign` for self-signed CAs 
- `digitalSignature` and `keyEncipherment` for HTTP certificates and CSRs

These defaults can be overridden when running the `elasticsearch-certutil http`
command. The user will be prompted to change them as they wish.

For `elasticsearch-certutil ca`, the default value can be overridden by passing
the `--keyusage` option, e.g.
```
elasticsearch-certutil ca --keyusage "digitalSignature,keyCertSign,cRLSign" -pem    
```

Fixes #117769
2025-04-08 09:44:09 +02:00
Yang Wang
997a7b8fab
FileWatchingService should not throw for missing file (#126264)
A missing file is a valid state for FileWatchingService, so the
exception should be suppressed.
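
A minimal sketch of the idea, written in Python rather than the project's Java. The helper name `read_last_modified` is hypothetical and does not come from the commit; it only illustrates treating a missing file as an ordinary "absent" state instead of an error.

```python
import os
import tempfile


def read_last_modified(path):
    """Return the modification time of `path`, or None if it does not exist."""
    try:
        return os.stat(path).st_mtime
    except FileNotFoundError:
        # A missing file is a valid state for a watcher, so it is reported
        # as "absent" rather than propagated as an exception.
        return None
```

Other I/O failures (for example permission errors) still propagate in this sketch; only the missing-file case is suppressed.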
2025-04-08 09:56:35 +10:00
Kaarina Tungseth
e3e03bc28b
Removes known issues page content (#126429)
* Removes known issues page

* Adds empty known issues page
2025-04-07 15:58:52 -05:00
Joe Gallo
bead858ccd
Correctly handle nulls in nested paths in the remove processor (#126417) 2025-04-07 16:54:07 -04:00
Brian Seeders
5888e903e4
[docs] Re-generate and fix 9.0.0 release notes in markdown (#126425) 2025-04-07 14:17:06 -04:00
István Zoltán Szabó
212971a435
[DOCS] Adds ML-CPP release notes. (#126420) 2025-04-07 19:15:20 +02:00
Oleksandr Kolomiiets
21ff72bef4
Use FallbackSyntheticSourceBlockLoader for text fields (#126237) 2025-04-07 09:32:35 -07:00
Nik Everett
7e1e45eaa4
ESQL: Speed up TO_IP (#126338)
Speed up the TO_IP method by converting directly from utf-8 encoded
strings to the ip encoding. Previously we did:
```
utf-8 -> String -> INetAddress -> ip encoding
```

In a step towards solving #125460 this creates three IP parsing
functions, one that rejects leading zeros, one that interprets leading
zeros as decimal numbers, and one that interprets leading zeros as octal
numbers. IPs have historically been parsed in all three of those ways.

This plugs the "rejects leading zeros" parser into `TO_IP` because
that's the behavior it had before.
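
The three leading-zero behaviors can be illustrated with a small sketch. This is Python, not the Java code from the PR, and it parses a `str` rather than raw utf-8 bytes; the function name `parse_ipv4` and the mode names are assumptions made for illustration, and only IPv4 is covered.

```python
def parse_ipv4(text, leading_zeros="reject"):
    """Parse dotted-quad IPv4 text into its 4-byte encoding.

    leading_zeros is one of "reject", "decimal" or "octal"
    (hypothetical mode names chosen for this sketch).
    """
    if leading_zeros not in ("reject", "decimal", "octal"):
        raise ValueError(f"unknown mode: {leading_zeros!r}")
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four octets: {text!r}")
    octets = []
    for part in parts:
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"invalid octet: {part!r}")
        base = 10
        if len(part) > 1 and part[0] == "0":
            if leading_zeros == "reject":
                raise ValueError(f"leading zero in octet: {part!r}")
            if leading_zeros == "octal":
                base = 8  # int() raises ValueError for the digits 8 and 9
        value = int(part, base)
        if value > 255:
            raise ValueError(f"octet out of range: {part!r}")
        octets.append(value)
    return bytes(octets)
```

With this sketch, `"010.0.0.1"` is rejected by default, parses to `10.0.0.1` in decimal mode, and parses to `8.0.0.1` in octal mode.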

Here is the performance:
```
Benchmark               Score    Error  Units
leadingZerosAreDecimal  14.007 ± 0.093  ns/op
leadingZerosAreOctal    15.020 ± 0.373  ns/op
leadingZerosRejected    14.176 ± 3.861  ns/op
original                32.950 ± 1.062  ns/op
```

So the new parsers take roughly 43% of the original time per operation, which makes them about 2.3 times as fast as what we had.
2025-04-07 09:34:53 -04:00
David Turner
527d2a203b
Improve handling of empty response (#125562)
Today `ActionResponse$Empty` implements `ToXContentObject`, but yields
no bytes of content when serialized which creates an invalid JSON
response. This commit removes the bogus interface and adjusts the
affected REST APIs to send a `text/plain` response instead.
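
As a quick illustration of why a zero-byte body is a problem, the following Python sketch (the helper name `is_valid_json` is hypothetical) shows that a standard JSON parser rejects an empty payload while accepting an empty object.

```python
import json


def is_valid_json(body: bytes) -> bool:
    """Return True if `body` parses as a JSON document."""
    try:
        json.loads(body)
        return True
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        return False
```

Here `is_valid_json(b"")` is false and `is_valid_json(b"{}")` is true, which is why an empty response should not be labelled as JSON.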
2025-04-07 12:10:07 +01:00
Alexander Spies
a152b4e29b
ESQL: Fail with 500 not 400 for ValueExtractor bugs (#126296)
In case of wrong layouts of ESQL's operators, it can happen that
ValueExtractor.extractorFor encounters a data type mismatch. Currently,
this throws IllegalArgumentException, which is treated like a user
exception and triggers a 400 response.

We need to return a 500 status code for such errors; this is also
important for observability of ES clusters, which can normally use 500
responses as an indicator of a bug.

Throw IllegalStateException instead, it's close enough.
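
The distinction can be sketched as a mapping from exception type to HTTP status. This is an illustrative Python analogy, not Elasticsearch code: `ValueError` stands in for Java's `IllegalArgumentException`, `RuntimeError` for `IllegalStateException`, and `status_for` is a hypothetical helper.

```python
def status_for(exc: Exception) -> int:
    """Map an exception to an HTTP status code (illustrative only)."""
    if isinstance(exc, ValueError):
        # Stand-in for a user error: the request itself was bad.
        return 400
    # Anything else is treated as an internal bug.
    return 500
```

Under this mapping, switching the thrown type from the "user error" class to the "internal state" class is enough to turn a 400 into a 500.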
2025-04-07 11:21:57 +02:00
Craig Taverner
1f6518f371
Document special behaviour of ignore_malformed for geo_point mappings (#125692)
With `geo_point` fields, there is the special case of values that have a syntactically valid format, but the numerical values for `latitude` and `longitude` are out of range.

If `ignore_malformed` is `false`, an exception will be thrown as usual. But if it is `true`, the document will be indexed correctly, by normalizing the latitude and longitude values into the valid range. The special `_ignored` field will not be set. The original source document will remain as before, but indexed values, doc-values and stored fields will all be normalized.
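
A rough sketch of what such a normalization can look like, in Python. The commit message does not spell out the exact rule, so this assumes the usual wrap-around convention: longitude wraps at ±180, and latitude reflects at the poles while shifting the longitude by 180. The function name `normalize_point` is hypothetical.

```python
def normalize_point(lat, lon):
    """Fold an out-of-range (lat, lon) pair back into the valid range.

    Assumed convention (not confirmed by the source): crossing a pole
    reflects the latitude and moves the point to the opposite meridian.
    """
    if not -90.0 <= lat <= 90.0:
        lat = (lat + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
        if lat > 90.0:
            lat, lon = 180.0 - lat, lon + 180.0
        elif lat < -90.0:
            lat, lon = -180.0 - lat, lon + 180.0
    if not -180.0 <= lon <= 180.0:
        lon = (lon + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
    return lat, lon
```

Values that are already in range are returned unchanged, so only out-of-range inputs are rewritten.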
2025-04-07 11:05:51 +02:00
Lisa Cawley
1d1feb6010
[DOCS] Migrate search profile API examples (#126347) 2025-04-04 22:42:09 +01:00
George Wallace
ce8b418686
Update esql-lookup-join.md (#126290) 2025-04-04 09:43:45 -06:00
Aurélien FOUCRET
a4a271415d
Adding ES|QL RERANK command in snapshot builds (#123074) 2025-04-04 15:39:18 +01:00
Alexander Spies
8f38b13059
ESQL: Revert "Allow partial results by default in ES|QL (#125060)" (#126286)
This reverts commit 81555cc9d9 from
https://github.com/elastic/elasticsearch/pull/125060.

Fix https://github.com/elastic/elasticsearch/issues/126275

@idegtiarenko and I investigated and believe this needs reverting:
silently dropping results from the query response in case any index is
missing can lead to real problems if users don't spot their mistake. I'm
also not sure if all the results will get dropped, or only from some
nodes/shards/clusters, meaning that this might be hard to spot by users
if only some results get dropped.

The main PR has no transport version bump, no new ESQL capability, and
was merged 15h ago - so it should be safe to just revert it. I noticed
there was a linked Serverless PR on the original PR, but it merely
disabled some obsolete tests on Serverless and doesn't require reverting
itself.
2025-04-05 01:27:13 +11:00
Jeremy Dahlgren
4c979aa365
Accumulate compute() calls and iterations between convergences in DesiredBalanceComputer (#126008)
Add tracking of the number of compute() calls and total iterations
between convergences in the DesiredBalanceComputer, along with the
time since the last convergence.  These are included in the log
message when the computer doesn't converge.
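
The bookkeeping described here can be sketched as follows. This is a simplified Python illustration, not the actual DesiredBalanceComputer; the class name `ConvergenceTracker`, its method, and the message format are all assumptions.

```python
import time


class ConvergenceTracker:
    """Accumulates compute() calls and iterations between convergences."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_convergence = clock()
        self.compute_calls = 0
        self.iterations = 0

    def record_compute(self, iterations, converged):
        """Record one compute() call; return a log message if not converged."""
        now = self._clock()
        self.compute_calls += 1
        self.iterations += iterations
        if converged:
            # Reset the accumulated counters at each convergence.
            self.compute_calls = 0
            self.iterations = 0
            self._last_convergence = now
            return None
        return (
            f"not converged after {self.compute_calls} compute() calls and "
            f"{self.iterations} iterations, "
            f"{now - self._last_convergence:.1f}s since last convergence"
        )
```

The clock is injectable so the elapsed-time part of the message can be tested deterministically.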

Closes #100850.
2025-04-04 08:33:17 -04:00
Mikhail Berezovskiy
70654a3633
Add GCS telemetry with ThreadLocal (#125452) 2025-04-03 23:46:06 -07:00
Kathleen DeRusso
e7d4a28a87
Support configurable chunking in semantic_text fields (#121041)
* test

* Revert "test"

This reverts commit 9f4e2adba0.

* Refactor InferenceService to allow passing in chunking settings

* Add chunking config to inference field metadata and store in semantic_text field

* Fix test compilation errors

* Hacking around trying to get ingest to work

* Debugging

* [CI] Auto commit changes from spotless

* POC works and update TODO to fix this

* [CI] Auto commit changes from spotless

* Refactor chunking settings from model settings to field inference request

* A bit of cleanup

* Revert a bunch of changes to try to narrow down what broke CI

* test

* Revert "test"

This reverts commit 9f4e2adba0.

* Fix InferenceFieldMetadataTest

* [CI] Auto commit changes from spotless

* Add chunking settings back in

* Update builder to use new map

* Fix compilation errors after merge

* Debugging tests

* debugging

* Cleanup

* Add yaml test

* Update tests

* Add chunking to test inference service

* Trying to get tests to work

* Shard bulk inference test never specifies chunking settings

* Fix test

* Always process batches in order

* Fix chunking in test inference service and yaml tests

* [CI] Auto commit changes from spotless

* Refactor - remove convenience method with default chunking settings

* Fix ShardBulkInferenceActionFilterTests

* Fix ElasticsearchInternalServiceTests

* Fix SemanticTextFieldMapperTests

* [CI] Auto commit changes from spotless

* Fix test data to fit within bounds

* Add additional yaml test cases

* Playing with xcontent parsing

* A little cleanup

* Update docs/changelog/121041.yaml

* Fix failures introduced by merge

* [CI] Auto commit changes from spotless

* Address PR feedback

* [CI] Auto commit changes from spotless

* Fix predicate in updated test

* Better handling of null/empty ChunkingSettings

* Update parsing settings

* Fix errors post merge

* PR feedback

* [CI] Auto commit changes from spotless

* PR feedback and fix Xcontent parsing for SemanticTextField

* Remove chunking settings check to use what's passed in from sender service

* Fix some tests

* Cleanup

* Test failure whack-a-mole

* Cleanup

* Refactor to handle memory optimized bulk shard inference actions - this is ugly but at least it compiles

* [CI] Auto commit changes from spotless

* Minor cleanup

* A bit more cleanup

* Spotless

* Revert change

* Update chunking setting update logic

* Go back to serializing maps

* Revert change to model settings - source still errors on missing model_id

* Fix updating chunking settings

* Look up model if null

* Fix test

* Work around https://github.com/elastic/elasticsearch/issues/125723 in semantic text field serialization

* Add BWC tests

* Add chunking_settings to docs

* Refactor/rename

* Address minor PR feedback

* Add test case for null update

* PR feedback - adjust refactor of chunked inputs

* Refactored AbstractTestInferenceService to return offsets instead of just Strings

* [CI] Auto commit changes from spotless

* Fix tests where chunk output was of size 3

* Update mappings per PR feedback

* PR Feedback

* Fix problems related to merge

* PR optimization

* Fix test

* Delete extra file

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-04-03 17:45:26 -04:00
Nhat Nguyen
81555cc9d9
Allow partial results by default in ES|QL (#125060)
With this change, ES|QL will return partial results instead of failing
the entire query when encountering errors. Callers should check the
partial_results flag in the response to determine if the result is
partial or complete. If returning partial results is not desired, this
option can be overridden per request via the allow_partial_results
parameter in the query URL or globally via the cluster setting
esql.allow_partial_results.

Relates #122802
2025-04-03 12:30:47 -07:00
Ben Chaplin
9f6eb1d4e3
Log stack traces on data nodes before they are cleared for transport (#125732)
We recently cleared stack traces on data nodes before transport back to the coordinating node when error_trace=false to reduce unnecessary data transfer and memory on the coordinating node (#118266). However, all logging of exceptions happens on the coordinating node, so stack traces disappeared from any logs. This change logs stack traces directly on the data node when error_trace=false.
2025-04-03 13:45:09 -04:00
Omri Cohen
856ee3a177
Support explicit Z/M attributes using WKT geometry (#125896) 2025-04-03 17:00:12 +02:00
kanoshiou
30b2a1f729
ESQL: Enhanced DATE_TRUNC with arbitrary intervals (#120302)
Originally, `DATE_TRUNC` only supported 1-month and 3-month intervals for months, and 1-year interval for years, while arbitrary intervals were supported for weeks and days. This PR adds support for `DATE_TRUNC` with arbitrary month and year intervals. 
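
To make "arbitrary month intervals" concrete, here is a small Python sketch of truncating a date to an N-month bucket. It assumes buckets are counted from January 1970, which the commit message does not state; the function name `trunc_months` is hypothetical and this is not the ES|QL implementation.

```python
from datetime import date


def trunc_months(d: date, n: int) -> date:
    """Truncate `d` to the start of its n-month bucket.

    Assumption for this sketch: buckets are anchored at 1970-01-01.
    """
    months = (d.year - 1970) * 12 + (d.month - 1)  # months since 1970-01
    start = months - months % n                    # floor to a multiple of n
    return date(1970 + start // 12, start % 12 + 1, 1)
```

For example, under this anchoring a 5-month interval places 2025-04-08 in the bucket starting 2025-01-01, while a 3-month interval places it in the bucket starting 2025-04-01.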

Closes #120094
2025-04-03 16:55:56 +02:00
Richard Dennehy
f821930518
Fix NPE for missing Content Type header in OIDC Authenticator (#126191)
* Fix NPE for missing Content Type header in OIDC Authenticator

* Update docs/changelog/126191.yaml
2025-04-03 12:38:53 +01:00
Alexander Spies
28a544e0c5
ESQL: Fix ReplaceMissingFieldsWithNull (#125764)
* Revert changes to Layout.java

The change in 80125a4bac is a quick fix
and allows breaking an invariant of Layout. Revert that.

* Simplify ReplaceMissingFieldWithNull

When encountering projections, it tries to do the job of field
extraction for missing fields by injecting an Eval that creates a
literal null with the same name id as the field attribute for the
missing field. This is wrong:
1. We only insert an Eval in case that a Project relies on the missing
   attribute. There could be other plan nodes that rely on the missing
   attribute.
2. Even for Projects, we only insert an Eval in case we squarely project
   for the field - in case of aliases (e.g. from RENAME), we do nothing.
3. In case of multiple Projects that use this attribute, we create
   multiple attributes with the original field attribute's id, causing
   a wrong Layout. This triggered
   https://github.com/elastic/elasticsearch/issues/121754.

* Revive logic for EsRelation instead of Project

* Update LocalLogicalPlanOptimizerTests

* Update docs/changelog/125764.yaml

* Update test expectations

* Do not prune attributes from EsRelation

This can lead to empty output, which leads to the EsRelation being
replaced by a LocalRelation with 0 rows.

* Add tests + capability

* Update docs/changelog/125764.yaml

* Add comments
2025-04-03 09:26:26 +02:00
Benjamin Trent
33dcc921be
Mark rescore_vector as generally available (#126038)
* Mark rescore_vector as generally available

* Update docs/changelog/126038.yaml
2025-04-02 16:10:01 -04:00
Niels Bauman
483f97915c
Run TransportGetIndexAction on local node (#125652)
This action solely needs the cluster state, it can run on any node.
Since this is the last class/action that extends the `ClusterInfo`
abstract classes, we remove those classes too as they're not required
anymore.

Relates #101805
2025-04-02 18:41:35 +01:00
Pawan Kartik
e4fb22c4f3
ES|QL: Wrap remote errors with cluster name to provide more context (#123156)
Wrap remote errors with cluster name to provide more context

Previously, if a remote cluster encountered an error, the user would see a top-level error that provided no context about which remote ran into it. Now, such errors are wrapped in a separate remote exception whose message clearly names the remote cluster, with the original error as the cause of this remote exception.
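
The wrapping pattern can be sketched with exception chaining. This is an illustrative Python example, not the ES|QL code; `RemoteClusterError` and `run_on_remote` are hypothetical names.

```python
class RemoteClusterError(Exception):
    """Wraps an error raised while executing on a named remote cluster."""


def run_on_remote(cluster, task):
    """Run `task` and attribute any failure to `cluster`."""
    try:
        return task()
    except Exception as exc:
        # Keep the original error as the cause so no detail is lost.
        raise RemoteClusterError(
            f"remote cluster [{cluster}] failed: {exc}"
        ) from exc
```

The caller sees the cluster name in the top-level message and can still inspect the underlying error through the exception's cause.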
2025-04-02 18:08:20 +01:00
Joe Gallo
078f7ff9f7
Minor docs fixes (#126143) 2025-04-02 12:30:07 -04:00
Mark J. Hoy
e77bf808ab
Add Bounded Window to Inference Models for Rescoring to Ensure Positive Score Range (#125694)
* apply bounded window inference model

* linting

* add unit tests

* [CI] Auto commit changes from spotless

* add additional tests

* remove unused constructor

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-04-02 11:50:04 -04:00
Niels Bauman
509a12058f
Run TransportGetLifecycleAction on local node (#126002)
This action solely needs the cluster state, it can run on any node.

Relates #101805
2025-04-02 16:35:25 +01:00
Niels Bauman
eb4d64f94a
Run TransportGetSettingsAction on local node (#126051)
This action solely needs the cluster state, it can run on any node.
Additionally, it needs to be cancellable to avoid doing unnecessary work
after a client failure or timeout.

Relates #101805
2025-04-02 15:05:31 +01:00
Nick Tindall
58c8f4abae
Upgrade to latest GCS SDK (#126087)
Upgrades google cloud SDK used by repository-gcs to com.google.cloud:google-cloud-storage-bom:2.50.0

Closes: ES-9287
2025-04-02 15:41:50 +11:00
Keith Massey
bb762107b6
Preventing ConcurrentModificationException when updating settings for more than one index (#126077) 2025-04-01 17:10:08 -05:00
Nik Everett
d30296229b
ESQL: Hide some "extras" from docs (#124763)
Hides some of the "extra" lines from ESQL's documentation. These lines
are required to make the documentation into nice tests which is
important to make sure the docs don't get out of date. But readers don't
need to see them.
2025-04-01 21:24:15 +01:00
Oleksandr Kolomiiets
f3ccde6959
Use FallbackSyntheticSourceBlockLoader for point and geo_point (#125816) 2025-04-01 12:55:18 -07:00
Colleen McGinnis
d966938842
add missing mapped pages (#126054) 2025-04-01 19:41:37 +02:00
Colleen McGinnis
0e537325fc
[docs] Remove as many redirects as possible (#125663)
* remove as many assembler-related redirects as possible

* Update docs/redirects.yml

* delete more unused temp redirects

* remove more redirects

* remove all redirects to see remaining errors
2025-04-01 16:53:59 +02:00
Craig Taverner
7b263b4b83
Kibana updates, remove links from JSON and split is-null/is-not-null (#125986)
In particular:
* Remove all links (both asciidoc and markdown) from the JSON definition files.
  * This required a two phase edit, from asciidoc links to markdown, and then removal of markdown (replace with markdown text). This is because the asciidoc does not have the display text, and because some links were already markdown.
* Split predicates into is_null and is_not_null
  * We kept the old combined version because the main docs still use that, so now we have both combined and separate versions, and Kibana can select the version they want.
2025-04-01 15:46:24 +02:00
조혜온
89adec154c
[ML] Resolve duplicate key exception in GetDatafeedRunningStateAction (#125477) 2025-04-01 14:16:17 +01:00
Jim Ferenczi
42b7b78a31
[ES|QL] Infer the score mode to use from the Lucene collector (#125930)
This change uses the Lucene collector to infer which score mode to use
when the topN collector is used.
2025-04-01 11:52:27 +01:00
Luca Cavanna
b01438a95f
Re-enable parallel collection for field sorted top hits (#125916)
With #123610 we disabled parallel collection for field and script sorted top hits,
aligning its behaviour with that of top level search. This was mainly to work around
a bug in script sorting that did not support inter-segment concurrency.

The bug with script sort has been fixed with #123757 and concurrency re-enabled for it.

While sort by field is not optimized for search concurrency, top hits benefits from it
and disabling concurrency for sort by field in top hits has caused performance
regressions in our nightly benchmarks.

This commit re-enables concurrency for top hits when sort by field is used. This
reintroduces a discrepancy between top level search and top hits, in that concurrency
is applied for top hits even though sort by field normally disables it. The key difference
is the context where sorting is applied, and the fact that concurrency is disabled
only for performance reasons on top level searches and not for functional reasons.
2025-04-01 09:27:43 +02:00
Niels Bauman
fd2492f935
Optimize usage calculation in ILM policies retrieval API (#106953)
Optimize calculating the usage of ILM policies in the `GET _ilm/policy` and `GET _ilm/policy/<policy_id>` endpoints by extracting a separate class that pre-computes some parts on initialization (i.e. only once per request) and then uses those pre-computed parts when calculating the usage for an individual policy. By precomputing all the usages, the class makes a tradeoff by using a little bit more memory to significantly improve the overall processing time.
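
The tradeoff can be sketched as building a reverse lookup once and reusing it for every policy. This Python example is a simplification under assumed data shapes (a plain index-to-policy mapping); the class name `PolicyUsageCalculator` is hypothetical and it does not reflect the actual Java class.

```python
from collections import defaultdict


class PolicyUsageCalculator:
    """Precomputes which indices use each policy, once per request."""

    def __init__(self, index_to_policy):
        # One pass over all indices up front, using some extra memory.
        self._indices_by_policy = defaultdict(list)
        for index, policy in index_to_policy.items():
            self._indices_by_policy[policy].append(index)

    def usage(self, policy):
        """Return the indices using `policy` without rescanning everything."""
        return sorted(self._indices_by_policy.get(policy, []))
```

Without the precomputed map, each policy lookup would scan every index, so a request returning many policies would repeat the same work many times.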
2025-03-31 16:11:41 +01:00
Armin Braun
fd2cc97541
Introduce batched query execution and data-node side reduce (#121885)
This change moves the query phase to a single roundtrip per node, just like can_match or field_caps work already.
As a result of executing multiple shard queries from a single request, we can also partially reduce each node's query results on the data node side before responding to the coordinating node.

As a result this change significantly reduces the impact of network latencies on the end-to-end query performance, reduces the amount of work done (memory and cpu) on the coordinating node and the network traffic by factors of up to the number of shards per data node!

Benchmarking shows up to orders of magnitude improvements in heap and network traffic dimensions in querying across a larger number of shards.
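
The data-node side reduce can be illustrated with a toy top-k merge. This Python sketch is not the search code from the PR; hits are modelled as `(score, doc_id)` tuples and both function names are hypothetical.

```python
import heapq


def reduce_on_data_node(shard_results, k):
    """Merge the hits of all shards on one node, keeping only the top k."""
    return heapq.nlargest(k, (hit for hits in shard_results for hit in hits))


def reduce_on_coordinator(node_results, k):
    """Merge the already-reduced per-node results into the final top k."""
    return heapq.nlargest(k, (hit for hits in node_results for hit in hits))
```

Because each node sends at most `k` hits instead of `k` hits per shard, the coordinating node receives and merges far less data when a node hosts many shards.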
2025-03-29 16:53:18 +01:00
Brandon Morelli
74e4ce23e0
Update limitations.md (#125893) 2025-03-28 22:35:41 +01:00
Craig Taverner
98a2c711f8
Refine ESQL docs handling of applies_to (#125835)
This primarily splits the old preview:true warning from the newer applies_to approach. Since all of our current applies_to examples are actually just behaviour modifications of current functions, we do not use the official docs {applies_to} syntax. However there is code to make use of that in the case where we have an entirely new function which will appear in a new version.

Co-authored-by: Alexander Spies <alexander.spies@elastic.co>
2025-03-28 22:09:15 +01:00
Larisa Motova
b4f534cb25
[ES|QL] Fix sorting when aggregate_metric_double present (#125191)
Previously if an aggregate_metric_double was present amongst fields and
you tried to sort on any (not necessarily even on the agg_metric itself)
field in ES|QL, it would break the results.

This commit doesn't add support for sorting _on_ aggregate_metric_double
(it is unclear what aspect would be sorted), but it fixes the previous
behavior.
2025-03-28 10:16:26 -10:00
John Verwolf
848a6783f0
Fix system data stream restore warning (#125881)
This PR fixes a bug in the RestoreService whereby the validation logic for index templates didn't account for system data streams.
2025-03-28 20:14:57 +01:00
Bogdan Pintea
1bd80d10a6
ESQL: supplement docs on LIMIT (#125839)
This adds a few extra details around how ESQL processes input docs and
how it limits output results.

Closes #125819
2025-03-29 06:03:27 +11:00
Mayya Sharipova
332abe4198
[DOCS] Clarify that min_score applies to aggs (#125882)
Clarify that min_score param of a search request
also applies to aggregations.
2025-03-28 14:41:14 -04:00