When an index alias/data stream is replicated via CCR, the follower index needs to unfollow the leader index before executing actions that will remove the original index.
The downsample action is also an action that removes the original index, so the expected behaviour is that ILM would automatically unfollow.
Co-authored-by: Niels Bauman <nielsbauman@gmail.com>
* Fix unsupported privileges error message during role and API key creation
* Update docs/changelog/128858.yaml
* [CI] Auto commit changes from spotless
* Update docs/changelog/128858.yaml
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
As we reported in https://github.com/elastic/elasticsearch/issues/127249, there is no support for DATE_NANOS in LOOKUP JOIN, even though DATETIME is supported. This PR attempts to fix that.
The way that date-time was supported in LOOKUP JOIN (and ENRICH) was by using the `DateFieldMapper.DateFieldType.rangeQuery` (hidden behind the `termQuery` function), which internally takes our long values, casts them to Object, renders them to a string, parses that string back into an Instant (with a bunch of fancy and unnecessary checks for date-math, etc.), and then converts that Instant back into a long for the actual query. Parts of this complex process are precision aware (i.e. differentiate between ms and ns dates), but not the whole process. Simply dividing the original longs by 1_000_000 before passing them in actually works, but obviously loses precision. And the only reason it works at all is that the date parsing code will accept a string containing a simple number and interpret it as either ms since the epoch, or years if the number is short enough. This does not work for nanosecond dates, and in fact is far from ideal for LOOKUP JOIN on dates, which does not need to re-parse the values at all.
This complex loop only makes sense in the Query DSL, where we can get all kinds of interesting sources of range values, but seems quite crazy for LOOKUP JOIN where we will always provide the join key from a LongBlock (the backing store of the DATE_TIME DataType, and the DATE_NANOS too).
So what we do here for DateNanos is provide two new methods to `DateFieldType`:
* `equalityQuery(Long, ...)` to replace `termQuery(Object, ...)`
* `rangeQuery(Long, Long, ...)` to replace `rangeQuery(Object, Object, ...)`
This allows us to pass in already-parsed `long` values and entirely skip the conversion to strings and the re-parsing logic. The new methods are based on the original methods, but are considerably simplified due to the removal of the complex parsing logic. The reason for having both `equalityQuery` and `rangeQuery` is that it mimics the pattern used by the old `termQuery`, which delegated directly down to `rangeQuery`. In addition, we hope to support range matching in `LOOKUP JOIN` in the near future.
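A minimal sketch (in Python, not the actual Java code) of why the divide-by-1_000_000 workaround works for millisecond dates but loses precision for nanosecond dates:

```python
# Hypothetical illustration: integer division from nanoseconds to
# milliseconds drops the sub-millisecond component, so distinct
# nanosecond join keys can collapse onto the same millisecond value.
NANOS_PER_MILLI = 1_000_000

def to_millis(nanos: int) -> int:
    # integer division discards the sub-millisecond part
    return nanos // NANOS_PER_MILLI

t1 = 1_700_000_000_123_456_789  # ns since epoch
t2 = 1_700_000_000_123_999_999  # same millisecond, different nanosecond

# Different nanosecond keys become equal after the division, so a join
# on the divided values could match rows it should not.
assert t1 != t2
assert to_millis(t1) == to_millis(t2)
```

Passing the backing `long` values straight through, as the new `equalityQuery`/`rangeQuery` methods do, avoids this lossy round trip entirely.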
* Initial commit of match_phrase
* Add MatchPhraseQueryTests
* First pass at CSV specs
* Update docs/changelog/127661.yaml
* Refactor so MatchPhrase doesn't use all fulltext test cases, just text only
* Fix tests
* Add some CSV test cases
* Fix test
* Update changelog
* Update tests
* Comment out MATCH_PHRASE in search-functions Markdown
* Minor PR feedback
* PR feedback - refactor/consolidate code
* Add some more tests
* Fix some tests
* [CI] Auto commit changes from spotless
* Fix tests
* PR feedback - add tests, support boost and numeric data
* Revert "PR feedback - add tests, support boost and numeric data"
This reverts commit 4e7a699e3e.
* Apply testing/PR feedback outside numeric support only
* Regenerate docs
* Add negative test
* Update x-pack/plugin/esql/qa/testFixtures/src/main/resources/match-phrase-function.csv-spec
Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
* Update x-pack/plugin/esql/qa/testFixtures/src/main/resources/match-phrase-function.csv-spec
Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
* Update x-pack/plugin/esql/qa/testFixtures/src/main/resources/match-phrase-function.csv-spec
Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
* PR feedback
* Fix auto-commit error
* Regenerate docs
* Update x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/MatchPhrase.java
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
* Remove non text field types
* Fake test data
* Remove tests that should no longer pass without ip/date/version support
* Put real data in score tests now that I was able to engineer a failure
* Realized the scoring test might be flaky because of how it was written; updated
* PR feedback
* PR feedback
* [CI] Auto commit changes from spotless
* Add check to MatchPhrase tests
* Fix merge errors
* [CI] Auto commit changes from spotless
* Test generated docs
* Add additional verifier tests
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
The index pattern provided in the body of `_has_privileges` can
trigger a `TooComplexToDeterminizeException` which is then bubbled up
(badly).
This change catches that exception and provides a better message
This adds the reserved optional characters to the list that is escaped
during conversion. These characters are all enabled by the `RegExp.ALL`
flag in our use.
Closes #128676, closes #128677.
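The escaping fix can be sketched like this (illustrative Python, not the actual conversion code; the exact set of optional operators enabled by Lucene's `RegExp.ALL` is an assumption here):

```python
import re

# Sketch: when converting a simple wildcard pattern into a regular
# expression, every character the regexp engine may treat as an operator
# must be escaped, including "optional" operators enabled by flags
# (e.g. '~', '#', '@', '&', '<', '>', '"' are the kind of operators
# Lucene's RegExp.ALL turns on; treat this set as illustrative).
RESERVED = set('[](){}.*?+|^$\\#@&<>~"')

def simple_pattern_to_regex(pattern: str) -> str:
    out = []
    for ch in pattern:
        if ch == "*":
            out.append(".*")        # wildcard maps to 'any sequence'
        elif ch in RESERVED:
            out.append("\\" + ch)   # escape everything else that is special
        else:
            out.append(ch)
    return "".join(out)

# A pattern containing '&' (an optional intersection operator in some
# engines) now matches literally instead of being misparsed.
rx = simple_pattern_to_regex("logs-&-*")
assert re.fullmatch(rx, "logs-&-2024") is not None
```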
* Granting `kibana_system` reserved role access to "all" privileges to `.adhoc.alerts*` and `.internal.adhoc.alerts*` indices
* Update docs/changelog/127321.yaml
* [CI] Auto commit changes from spotless
* Replace `"all"` with the specific privileges for the `kibana_system` role
* Fix tests
* Fix CI
* Updated privileges
* Updated privileges
Add `"maintenance"` to allow `refresh=true` option on bulk API call.
* Remove redundant code
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
The auth code injects the pattern `["*", "-*"]` to specify that it's okay to return an empty response because the user's patterns did not match any remote clusters. However, we fail to recognise this specific pattern, and `groupIndices()` eventually associates it with the local cluster. This causes Elasticsearch to fall back to the local cluster unknowingly and return its details to the user, even though the user hasn't requested any info about the local cluster.
Added support for the three primary scalar grid functions:
* `ST_GEOHASH(geom, precision)`
* `ST_GEOTILE(geom, precision)`
* `ST_GEOHEX(geom, precision)`
As well as versions of these three that take an optional `geo_shape` boundary (which must be a `BBOX`, i.e. a `Rectangle`).
Also added supporting conversion functions that convert the grid-id from long to string and back to long.
This work represents the core of the feature to support geo-grid aggregations in ES|QL.
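The long-to-string-and-back conversion can be sketched for the geotile case (illustrative Python; the exact bit layout Elasticsearch uses to pack zoom/x/y into one long is an assumption here):

```python
# Sketch of a geotile grid-id round trip: pack zoom/x/y into a single
# long, render it as the familiar "z/x/y" key, and parse it back.
MAX_ZOOM_BITS = 29  # assumed number of bits reserved for each of x and y

def geotile_to_long(z: int, x: int, y: int) -> int:
    return (z << (2 * MAX_ZOOM_BITS)) | (x << MAX_ZOOM_BITS) | y

def geotile_to_string(grid_id: int) -> str:
    z = grid_id >> (2 * MAX_ZOOM_BITS)
    x = (grid_id >> MAX_ZOOM_BITS) & ((1 << MAX_ZOOM_BITS) - 1)
    y = grid_id & ((1 << MAX_ZOOM_BITS) - 1)
    return f"{z}/{x}/{y}"

def string_to_geotile_long(key: str) -> int:
    z, x, y = (int(p) for p in key.split("/"))
    return geotile_to_long(z, x, y)

gid = geotile_to_long(7, 64, 43)
assert geotile_to_string(gid) == "7/64/43"
assert string_to_geotile_long("7/64/43") == gid
```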
Instead of waiting for the next run of the `ClusterStateObserver` (which
might be arbitrarily far in the future, but bound by the timeout if one
is set), we notify the listener immediately that the task has been
cancelled. While doing so, we ensure we invoke the listener only once.
Fixes #117971
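The notify-immediately-but-only-once idea can be sketched like this (illustrative Python, names are not the actual Java API):

```python
import threading

# Minimal sketch of the 'notify the listener immediately on cancellation,
# but only once' pattern: whichever path fires first (cancellation,
# timeout, or an observer run) claims the listener; later paths are no-ops.
class OnceListener:
    def __init__(self, listener):
        self._listener = listener
        self._lock = threading.Lock()
        self._done = False

    def _fire(self, outcome):
        with self._lock:
            if self._done:
                return False      # someone else already notified
            self._done = True
        self._listener(outcome)
        return True

    def on_cancelled(self):
        # invoked as soon as the task is cancelled, instead of waiting
        # for the next observer run
        return self._fire("cancelled")

    def on_timeout(self):
        return self._fire("timeout")

calls = []
once = OnceListener(calls.append)
assert once.on_cancelled() is True   # immediate notification
assert once.on_timeout() is False    # a later path is a no-op
assert calls == ["cancelled"]
```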
* Implement SAML custom attributes support for Identity Provider
This commit adds support for custom attributes in SAML single sign-on requests
in the Elasticsearch X-Pack Identity Provider plugin. This feature allows
passage of custom key-value attributes in SAML requests and responses.
Key components:
- Added SamlInitiateSingleSignOnAttributes class for holding attributes
- Added validation for null and empty attribute keys
- Updated request and response objects to handle attributes
- Modified authentication flow to process attributes
- Added test coverage to validate attributes functionality
The implementation follows Elasticsearch patterns with robust validation
and serialization mechanisms, while maintaining backward compatibility.
* Add test for SAML custom attributes in authentication response
This commit adds a comprehensive test that verifies SAML custom attributes
are correctly handled in the authentication response builder. The test ensures:
1. Custom attributes with single and multiple values are properly included
2. The response with custom attributes is still correctly signed
3. The XML schema validation still passes with custom attributes
4. We can locate and verify individual attribute values in the response
This provides critical test coverage for the SAML custom attributes
feature implementation.
* Add backward compatibility overload for SuccessfulAuthenticationResponseMessageBuilder.build
This commit adds an overloaded build method that accepts only two parameters
(user and authenticationState) and forwards the call to the three-parameter
version with null for the customAttributes parameter. This maintains backward
compatibility with existing code that doesn't use custom attributes.
This fixes a compilation error in ServerlessSsoIT.java which was still using
the two-parameter method signature.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Add validation for duplicate SAML attribute keys
This commit enhances the SAML attributes implementation by adding validation
for duplicate attribute keys. When the same attribute key appears multiple
times in a request, the validation will now fail with a clear error message.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Refactor SAML attributes validation to follow standard patterns
This commit improves the SAML attributes validation by:
1. Adding a dedicated validate() method to SamlInitiateSingleSignOnAttributes
that centralizes validation logic in one place
2. Moving validation from constructor to dedicated method for better error reporting
3. Checking both for null/empty keys and duplicate keys in the validate() method
4. Updating SamlInitiateSingleSignOnRequest to use the new validation method
5. Adding comprehensive tests for the new validation approach
These changes follow standard Elasticsearch validation patterns, making the
code more maintainable and consistent with the rest of the codebase.
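The dedicated validate() idea can be sketched like this (illustrative Python, not the actual Java class; names and messages are assumptions):

```python
# Sketch of centralizing validation in one method: collect all problems
# (null/empty keys, duplicate keys) in a single pass and report them
# together, rather than failing piecemeal in the constructor.
def validate_attributes(pairs):
    """pairs: list of (key, values) tuples as parsed from the request."""
    errors = []
    seen = set()
    for key, _values in pairs:
        if key is None or key == "":
            errors.append("attribute key must not be null or empty")
        elif key in seen:
            errors.append(f"duplicate attribute key [{key}]")
        else:
            seen.add(key)
    return errors

assert validate_attributes([("department", ["eng"]), ("region", ["APJ"])]) == []
assert validate_attributes([("department", ["a"]), ("department", ["b"])]) == [
    "duplicate attribute key [department]"
]
assert validate_attributes([("", ["x"])]) == ["attribute key must not be null or empty"]
```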
* Update docs/changelog/128176.yaml
* Improve SAML response validation in identity provider tests
Enhanced the testCustomAttributesInIdpInitiatedSso test to properly validate
both SAML response structure and custom attributes using DOM parsing and XPath.
Key improvements:
- Validate SAML Response/Assertion elements exist
- Precisely validate custom attributes (department, region) and their values
- Use namespace-aware XML parsing for resilience to format changes
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Simplify SAML attributes representation using JSON object/Map structure
Also, replace internal Attribute class list with a simpler Map<String, List<String>>
structure
This change:
- Removes the redundant Attribute class and replaces it with a direct Map
implementation for storing attribute key-value pairs
- Eliminates the duplicate "attributes" nesting in the JSON structure
- Simplifies attribute validation without needing duplicate key checking
- Updates all related tests and integration points to work with the new structure
Before:
```js
{
  // others
  "attributes": {
    "attributes": [
      {
        "key": "department",
        "values": ["engineering", "product"]
      }
    ]
  }
}
```
After:
```js
{
  // other
  "attributes": {
    "department": ["engineering", "product"]
  }
}
```
(Verified by spitting out JSON entity in IdentityProviderAuthenticationIT.generateSamlResponseWithAttributes
... saw `{"entity_id":"ec:123456:abcdefg","acs":"https://sp1.test.es.elasticsearch.org/saml/acs","attributes":{"department":["engineering","product"],"region":["APJ"]}}`)
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* * Fix up toString dangling quote.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* * Remove attributes from Response object.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* * Remove friendly name.
* Make attributes map final in SamlInitiateSingleSignOnAttributes
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* * Cleanup serdes by using existing utils in the ES codebase
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Touchup comment
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
* Update x-pack/plugin/identity-provider/src/test/java/org/elasticsearch/xpack/idp/action/SamlInitiateSingleSignOnRequestTests.java
Co-authored-by: Tim Vernum <tim@adjective.org>
* Add transport-version checks
---------
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
Co-authored-by: Tim Vernum <tim@adjective.org>
Before this PR, sorting on integer, short and byte field types used
SortField.Type.LONG. This made sort optimization impossible for these
field types.
This PR uses SortField.Type.INT for integer, short and byte fields. This
enables sort optimization.
There are several caveats with changing the sort type, all addressed here:
- Mixed sort on integer and long fields was previously supported automatically, as both field types used SortField.TYPE.LONG. Now, when merging results from different shards, we need to convert the sort to LONG and the results to long values.
- The same applies to collapsing when there are mixed INT and LONG sort types.
- Index sorting. Previously, index sorting on an integer field used SortField.Type.LONG. This sort type is stored in the index writer config on disk and can't be modified. Now, when providing sortField() for index sorting, we need to account for the index version: for older indices we return a sort with SortField.Type.LONG, and for new indices SortField.Type.INT.
---
There is only one change that may be considered not backwards compatible:
before, if an integer field was [missing a
value](https://www.elastic.co/docs/reference/elasticsearch/rest-apis/sort-search-results#_missing_values),
its sort value in a search response was Long.MAX_VALUE. With this change,
its sort value will be Integer.MAX_VALUE. But I think this change is OK:
our documentation doesn't specify what value will be returned, we just
say the field will be sorted last.
---
Also closes #127965 (as the same type validation is added for collapse
queries)
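The mixed-shard merge caveat can be sketched like this (illustrative Python, not the actual coordinator code):

```python
# Illustration: one shard may now report integer sort values (with
# Integer.MAX_VALUE standing in for a missing value) while an older
# shard reports longs (Long.MAX_VALUE for missing), so the coordinator
# widens everything to long before merging the per-shard streams.
INT_MAX = 2**31 - 1
LONG_MAX = 2**63 - 1

def widen_sort_values(values, sort_type):
    # widening INT sort values to LONG lets results from old
    # (LONG-sorted) and new (INT-sorted) shards merge consistently
    return [int(v) for v in values] if sort_type == "INT" else list(values)

shard_a = widen_sort_values([3, 7, INT_MAX], "INT")   # new shard
shard_b = widen_sort_values([5, LONG_MAX], "LONG")    # old shard
merged = sorted(shard_a + shard_b)
assert merged == [3, 5, 7, INT_MAX, LONG_MAX]         # missing values still sort last
```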
This change improves the performance of sparse vector statistics gathering by using the document count of terms directly, rather than relying on the field name field to compute stats.
By avoiding per-term disk/network reads and instead leveraging statistics already loaded into leaf readers at index opening, we expect to significantly reduce overhead.
Relates to #128583
Remove the use_default_lucene_postings_format feature flag and
let the IndexMode decide whether to use the default Lucene postings format, instead of checking for the standard index mode.
The `Lucene101PostingsFormat` has been used for a while now behind a feature flag. Regressions were found, but were fixed via apache/lucene#14511. The `Lucene101PostingsFormat` is now a better trade-off when the index mode is standard.
The last part size is wrongly computed as 0 when the last part's length is
exactly equal to the size of a part. This would probably have been caught
by an existing assertion.
Relates ES-11815
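The bug pattern can be sketched like this (illustrative Python; the actual code is Java and the function names here are hypothetical):

```python
# Sketch of the off-by-one class of bug: computing the last part's size
# with a plain modulo yields 0 when the total length is an exact
# multiple of the part size, instead of a full part.
def last_part_size_buggy(length: int, part_size: int) -> int:
    return length % part_size

def last_part_size_fixed(length: int, part_size: int) -> int:
    remainder = length % part_size
    return part_size if remainder == 0 and length > 0 else remainder

assert last_part_size_buggy(100, 25) == 0     # wrong: the last part is 25 bytes
assert last_part_size_fixed(100, 25) == 25
assert last_part_size_fixed(90, 25) == 15     # partial last part is unchanged
```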
* New l2 normalizer added
* L2 score normaliser is registered
* test case added to the yaml
* Documentation added
* Resolved checkstyle issues
* Update docs/changelog/128504.yaml
* Update docs/reference/elasticsearch/rest-apis/retrievers.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Score 0 test case added to check for corner cases
* Edited the markdown doc description
* Pruned the comment
* Renamed the variable
* Added comment to the class
* Unit tests added
* Spotless and checkstyle fixed
* Fixed build failure
* Fixed the forbidden test
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Starting with Lucene 10, `CharacterRunAutomaton` is no longer determinized automatically.
In Elasticsearch 9, we adapted to this by eagerly determinizing automatons early (via `Regex#simpleMatchToAutomaton`).
However, this introduced a regression: operations like index template conflict checks, which only require intersection testing, now pay the cost of determinization, an expensive step that wasn't needed before. In some cases, especially when many wildcard patterns are involved, determinization can even fail due to state explosion.
This change removes the unnecessary determinization, restoring the pre-9.0 behavior and allowing valid index templates with many patterns to be registered again.
If an index is deleted after a snapshot has written its shardGenerations
file but before the snapshot is finalized, we exclude this index from the
snapshot because its indexMetadata is no longer available. However,
the shardGenerations file is still valid in that it is the latest copy with all
necessary information despite it containing an extra snapshot entry.
This is OK. Instead of dropping this shardGenerations file, this PR
changes the behaviour to carry it forward by updating RepositoryData and
the relevant in-progress snapshots so that the next finalization builds on
top of this one.
Co-authored-by: David Turner <david.turner@elastic.co>
This PR changes the mechanism to pause indexing which was introduced in #127173. The original PR caused IndexStatsIT#testThrottleStats to fail. See #126359.
Today if the get-snapshots API cannot access one of the repositories we
return a fairly low-level message about the problem, perhaps something
like `Could not determine repository generation from root blobs`. This
message is shown verbatim in the Kibana UI so users need something a
little more descriptive. With this commit we wrap the exception in one
that indicates the problem in terms that users are more likely to
understand.
Relates #128208
This introduces an optimization to push down to Lucene those language constructs that aim at case-insensitive regular expression matching, used with the `LIKE` and `RLIKE` operators, such as:
* `| WHERE TO_LOWER(field) LIKE "abc*"`
* `| WHERE TO_UPPER(field) RLIKE "ABC.*"`
These are now pushed as case-insensitive `wildcard` and `regexp` respectively queries down to Lucene.
Closes #127479
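The equivalence behind the rewrite can be sketched like this (illustrative Python, not the actual planner code):

```python
import re
from fnmatch import translate

# Sketch: filtering on TO_LOWER(field) LIKE "abc*" selects the same rows
# as matching the original field case-insensitively, which is what lets
# the engine push a case-insensitive wildcard query down to the index
# instead of lowercasing every stored value first.
def to_lower_like(value: str, pattern: str) -> bool:
    # the unoptimized form: lowercase the field, then match
    return re.fullmatch(translate(pattern), value.lower()) is not None

def pushed_down(value: str, pattern: str) -> bool:
    # the pushed-down form: case-insensitive match on the original value
    return re.fullmatch(translate(pattern), value, flags=re.IGNORECASE) is not None

# equivalent for a lowercase pattern, across mixed-case inputs
for v in ["ABCdef", "abcdef", "xabc", "ABX"]:
    assert to_lower_like(v, "abc*") == pushed_down(v, "abc*")
```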
* Fix scoring and sort handling in pinned retriever
* Remove books.es from version control and add to .gitignore
* Remove books.es from version control and add to .gitignore
* Remove books.es entries from .gitignore
* fixed the mess
* Update x-pack/plugin/search-business-rules/src/test/java/org/elasticsearch/xpack/searchbusinessrules/retriever/PinnedRetrieverBuilderTests.java
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update x-pack/plugin/search-business-rules/src/main/java/org/elasticsearch/xpack/searchbusinessrules/retriever/PinnedRetrieverBuilder.java
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Currently, the Limit operator does not combine small pages into larger
ones; it simply passes them along, except for chunking pages larger than
the limit. This change integrates EstimatesRowSize into Limit and
adjusts it to emit larger pages. As a result, pages up to twice the
pageSize may be emitted, which is preferable to emitting undersized
pages. This should reduce the number of transport requests and responses
between clusters or coordinator-data nodes for queries without TopN or
STATS when target shards produce small pages due to their size or highly
selective filters.
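The accumulation behaviour can be sketched like this (illustrative Python over plain lists, not the actual Page/Block machinery):

```python
# Sketch of the new Limit behaviour: buffer small input pages and emit
# combined pages once the target size is reached. Because a buffer of up
# to page_size - 1 rows can be joined by an incoming page of up to
# page_size rows, an emitted page may reach almost 2 * page_size.
def combine_pages(pages, page_size):
    out, buf = [], []
    for page in pages:
        buf.extend(page)
        if len(buf) >= page_size:
            out.append(buf)
            buf = []
    if buf:
        out.append(buf)   # flush the trailing partial page
    return out

tiny_pages = [[1], [2, 3], [4], [5, 6, 7], [8]]
combined = combine_pages(tiny_pages, page_size=4)
assert combined == [[1, 2, 3, 4], [5, 6, 7, 8]]
assert all(len(p) < 2 * 4 for p in combined)
```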
* Inference changes
* Custom service fixes
* Update docs/changelog/127939.yaml
* Cleaning up from failed merge
* Fixing changelog
* [CI] Auto commit changes from spotless
* Fixing test
* Adding feature flag
* [CI] Auto commit changes from spotless
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
* Implemented ChatCompletion task for Google VertexAI with Gemini Models
* changelog
* System Instruction bugfix
* Mapping role assistant -> model in vertex ai chat completion request for compatibility
* GoogleVertexAI chat completion using SSE events. Removed JsonArrayEventParser
* Removed buffer from GoogleVertexAiUnifiedStreamingProcessor
* Casting inference inputs with `castTo`
* Registered GoogleVertexAiChatCompletionServiceSettings in InferenceNamedWriteablesProvider. Added InferenceSettingsTests
* Changed transport version to 8_19 for vertexai chatcompletion
* Fix to transport version. Moved ML_INFERENCE_VERTEXAI_CHATCOMPLETION_ADDED to the right location
* VertexAI Chat completion request entity jsonStringToMap using `ensureExpectedToken`
* Fixed TransportVersions. Left vertexAi chat completion 8_19 and added new one for ML_INFERENCE_VERTEXAI_CHATCOMPLETION_ADDDED
* Replaced switch statements with if-else for older Java compatibility. Improved indentation via `{}`
* Removed GoogleVertexAiChatCompletionResponseEntity and refactored code around it.
* Removed redundant test `testUnifiedCompletionInfer_WithGoogleVertexAiModel`
* Returning whole body when fail to parse response from VertexAI
* Refactor use GenericRequestManager instead of GoogleVertexAiCompletionRequestManager
* Refactored to constructorArg for mandatory args in GoogleVertexAiUnifiedStreamingProcessor
* Changed transport version in GoogleVertexAiChatCompletionServiceSettings
* Bugfix in tool calling with role tool
* GoogleVertexAiModel added documentation info on rateLimitGroupingHash
* [CI] Auto commit changes from spotless
* Fix: using Locale.ROOT when calling toLowerCase
* Fix: Renamed test class to match convention & modified use of forbidden api
* Fix: Failing test in InferenceServicesIT
---------
Co-authored-by: lhoet <lhoet@google.com>
Co-authored-by: Jonathan Buttner <56361221+jonathan-buttner@users.noreply.github.com>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Fix and test off-heap stats when using direct IO for accessing the raw vectors. The direct IO reader does not use off-heap memory, so it returns an empty map to indicate that it has no off-heap requirements. I added some overloaded tests with different directories to verify this.
Note: For 9.1 we're still using reflection to access the internals of non-ES readers, but DirectIO is an ES reader so we can use our internal OffHeapStats interface (rather than reflection). This is all replaced when we eventually get Lucene 10.3.