Commit graph

18276 commits

Author SHA1 Message Date
Dan Kortschak
6fe7d1c91c Update docs/changelog/128540.yaml 2025-06-27 01:13:07 +09:30
Dan Kortschak
0e9522c419 address pr comments 2025-06-06 09:51:45 +09:30
Dan Kortschak
04430e39fd Update docs/changelog/128540.yaml 2025-06-06 09:51:45 +09:30
Marci W
a0bfe61a83
Remove stale synthetic source preview note (#128981) 2025-06-05 09:31:25 -04:00
Fang Xing
79e600a269
[ES|QL] Date nanos implicit casting in union types option #2 (#127797)
* implicit casting for union typed fields mixed with datetime and date_nanos
2025-06-05 08:08:01 -04:00
Anirudh2112
cc5a2d86df
Inject an unfollow action before executing a downsample action in ILM (#128788)
When an index alias/data stream is replicated via CCR, the follower index needs to unfollow the leader index before executing actions that will remove the original index.
The downsample action is also an action that removes the original index so the expected behaviour is that ILM would automatically unfollow.

Co-authored-by: Niels Bauman <nielsbauman@gmail.com>
2025-06-05 07:53:12 +02:00
Graeme Mjehovich
64460dfdae
Fix unsupported privileges error message during role and API key crea… (#128858)
* Fix unsupported privileges error message during role and API key creation

* Update docs/changelog/128858.yaml

* [CI] Auto commit changes from spotless

* Update docs/changelog/128858.yaml

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-06-04 18:46:18 -04:00
Craig Taverner
b01d55218f
Support DATE_NANOS in LOOKUP JOIN (#127962)
As reported in https://github.com/elastic/elasticsearch/issues/127249, there is no support for DATE_NANOS in LOOKUP JOIN, even though DATETIME is supported. This PR attempts to fix that.

The way that date-time was supported in LOOKUP JOIN (and ENRICH) was by using the `DateFieldMapper.DateFieldType.rangeQuery` (hidden behind the `termQuery` function) which internally takes our long values, casts them to Object, renders them to a string, parses that string back into an Instant (with a bunch of fancy and unnecessary checks for date-math, etc.), and then converts that instant back into a long for the actual query. Parts of this complex process are precision aware (i.e. they differentiate between ms and ns dates), but not the whole process. Simply dividing the original longs by 1_000_000 before passing them in actually works, but obviously loses precision. And the only reason it works anyway is that the date parsing code will accept a string containing a simple number and interpret it as either ms since the epoch, or years if the number is short enough. This does not work for nanosecond dates, and in fact is far from ideal for LOOKUP JOIN on dates, which does not need to re-parse the values at all.

This complex loop only makes sense in the Query DSL, where we can get all kinds of interesting sources of range values, but seems quite crazy for LOOKUP JOIN where we will always provide the join key from a LongBlock (the backing store of the DATE_TIME DataType, and the DATE_NANOS too).

So what we do here for DateNanos is provide two new methods to `DateFieldType`:
* `equalityQuery(Long, ...)` to replace `termQuery(Object, ...)`
* `rangeQuery(Long, Long, ...)` to replace `rangeQuery(Object, Object, ...)`

This allows us to pass in already parsed `long` values, and entirely skip the conversion to strings and re-parsing logic. The new methods are based on the original methods, but considerably simplified due to the removal of the complex parsing logic. The reason for both `equalityQuery` and `rangeQuery` is that it mimics the pattern used by the old `termQuery`, which delegated directly down to `rangeQuery`. In addition to this, we hope to support range matching in `LOOKUP JOIN` in the near future.
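
The precision loss described above can be illustrated with a minimal Python sketch. The constant and function names are hypothetical and are not part of the Elasticsearch code; the only point is that two distinct nanosecond join keys collapse to the same value once they are divided by 1_000_000.

```python
NANOS_PER_MILLI = 1_000_000  # nanoseconds in one millisecond


def nanos_to_millis(nanos: int) -> int:
    """Truncate a nanosecond epoch value to milliseconds (lossy)."""
    return nanos // NANOS_PER_MILLI


# Two different nanosecond join keys that fall inside the same millisecond.
first = 1_749_072_339_000_000_001
second = 1_749_072_339_000_999_999

# After the division the two keys can no longer be told apart.
collapsed = nanos_to_millis(first) == nanos_to_millis(second)
```

Passing the original `long` values straight through, as the new methods do, avoids this collapse.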
2025-06-04 23:25:39 +02:00
Jan-Kazlouski-elastic
767d53fefa
Add Mistral AI Chat Completion support to Inference Plugin (#128538)
* Add Mistral AI Chat Completion support to Inference Plugin

* Add changelog file

* Fix tests and typos

* Refactor Mistral chat completion integration and add tests

* Refactor Mistral error response handling and extract StreamingErrorResponse entity

* Add Mistral chat completion request and response tests

* Enhance error response documentation and clarify StreamingErrorResponse structure

* Refactor Mistral chat completion request handling and introduce skip stream options parameter

* Refactor MistralChatCompletionServiceSettings to include rateLimitSettings in equality checks

* Enhance MistralErrorResponse documentation with detailed error examples

* Add comment for Mistral-specific 422 validation error in OpenAiResponseHandler

* [CI] Auto commit changes from spotless

* Refactor OpenAiUnifiedChatCompletionRequestEntity to remove unused fields and streamline constructor

* Refactor UnifiedChatCompletionRequestEntity and UnifiedCompletionRequest to rename and update stream options parameter

* Refactor MistralChatCompletionRequestEntityTests to improve JSON assertion and remove unused imports

* Add unit tests for MistralUnifiedChatCompletionResponseHandler to validate error handling

* Add unit tests for MistralService

* Update expected service count in testGetServicesWithCompletionTaskType

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-06-04 13:43:33 -04:00
Kathleen DeRusso
eee423aaa0
[ES|QL] Add MATCH_PHRASE (#127661)
* Initial commit of match_phrase

* Add MatchPhraseQueryTests

* First pass at CSV specs

* Update docs/changelog/127661.yaml

* Refactor so MatchPhrase doesn't use all fulltext test cases, just text only

* Fix tests

* Add some CSV test cases

* Fix test

* Update changelog

* Update tests

* Comment out MATCH_PHRASE in search-functions Markdown

* Minor PR feedback

* PR feedback - refactor/consolidate code

* Add some more tests

* Fix some tests

* [CI] Auto commit changes from spotless

* Fix tests

* PR feedback - add tests, support boost and numeric data

* Revert "PR feedback - add tests, support boost and numeric data"

This reverts commit 4e7a699e3e.

* Apply testing/PR feedback outside numeric support only

* Regenerate docs

* Add negative test

* Update x-pack/plugin/esql/qa/testFixtures/src/main/resources/match-phrase-function.csv-spec

Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>

* Update x-pack/plugin/esql/qa/testFixtures/src/main/resources/match-phrase-function.csv-spec

Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>

* Update x-pack/plugin/esql/qa/testFixtures/src/main/resources/match-phrase-function.csv-spec

Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>

* PR feedback

* Fix auto-commit error

* Regenerate docs

* Update x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/MatchPhrase.java

Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>

* Remove non text field types

* Fake test data

* Remove tests that no longer should pass without ip/date/version support

* Put real data in score tests now that I was able to engineer a failure

* Realized the scoring test might be flaky because of how it was written, updated

* PR feedback

* PR feedback

* [CI] Auto commit changes from spotless

* Add check to MatchPhrase tests

* Fix merge errors

* [CI] Auto commit changes from spotless

* Test generated docs

* Add additional verifier tests

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
2025-06-04 12:32:24 -04:00
mbivert-ipsos
aa0a829a08
[DOCS] Fix missing spaces (#128550) 2025-06-04 18:59:42 +03:00
Samiul Monir
d1b5532dbf
Semantic_text match_all with Highlighter (#128702)
* initial implementation for match_All

* reformat

* [CI] Auto commit changes from spotless

* Excluding matchAllintercepter

* Adding matchAllDocs support for vector fields

* [CI] Auto commit changes from spotless

* Remove previous implementation

* Adding yaml tests for match_all

* fixed yaml tests

* Update docs/changelog/128702.yaml

* Update changelog

* changelog - update summary

* Fix wrong inference names for the yaml tests

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2025-06-04 09:33:25 -04:00
Iván Cea Fontenla
36828e2f72
Aggs: Fix significant terms not finding background documents for nested fields (#128472)
Closes https://github.com/elastic/elasticsearch/issues/101163

Fixes the `significant_terms` aggregation not working correctly on nested fields.
2025-06-04 15:54:35 +03:00
Tim Vernum
fb874848ef
Check TooComplex exception for HasPrivileges body (#128870)
The index pattern provided in the body of `_has_privileges` can
trigger a `TooComplexToDeterminizeException` which is then bubbled up
(badly).

This change catches that exception and provides a better message
2025-06-04 21:55:23 +10:00
Bogdan Pintea
5eb54bff84
ESQL: Fix conversion of a Lucene wildcard pattern to a regexp (#128750)
This adds the reserved optional characters to the list that is escaped
during conversion. These characters are all enabled by the `RegExp.ALL`
flag in our use.
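
As a rough illustration of this kind of conversion, here is a Python sketch. It is not the ES|QL implementation; the function name and both character sets are assumptions, based on the standard and optional reserved characters listed in the Elasticsearch regular expression syntax documentation.

```python
# Assumed character sets, taken from the documented Lucene regexp syntax.
STANDARD_RESERVED = set('.?+*|{}[]()"\\')
OPTIONAL_RESERVED = set('#@&<>~')  # operators switched on by optional flags
RESERVED = STANDARD_RESERVED | OPTIONAL_RESERVED


def wildcard_to_regexp(pattern: str) -> str:
    """Convert a wildcard pattern (`*`, `?`, `\\` as escape) to a regexp string."""
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == '*':
            out.append('.*')  # any sequence of characters
        elif c == '?':
            out.append('.')   # exactly one character
        elif c == '\\' and i + 1 < len(pattern):
            # Escaped wildcard character: keep it literal in the regexp too.
            i += 1
            nxt = pattern[i]
            out.append('\\' + nxt if nxt in RESERVED else nxt)
        elif c in RESERVED:
            out.append('\\' + c)  # literal character that the regexp reserves
        else:
            out.append(c)
        i += 1
    return ''.join(out)
```

Without the optional set, a literal `#` or `<` in the wildcard pattern would be read as a regexp operator once the optional operators are enabled.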

Closes #128676, closes #128677.
2025-06-04 17:28:17 +10:00
Niels Bauman
269fbbc289
Readd index.lifecycle.skip setting (#128736)
We want to be able to skip specific indices in ILM again for #109206.
This is essentially just a revert of #34823.
2025-06-04 00:08:12 +03:00
Mayya Sharipova
1ba21c2db6
Add bucketedSort based on int (#128848)
Add bucketedSort on Int

Follow up on #127968
2025-06-04 06:59:21 +10:00
Lisa Cawley
4b7a9bd563
[DOCS] Add missing index setting link (#128794) 2025-06-03 11:40:05 -07:00
eyalkoren
d3d2d9b996
Adding NormalizeForStreamProcessor (#125699) 2025-06-03 13:11:12 -04:00
elasticsearchmachine
2cb3d0d4de Prune changelogs after 9.0.2 release 2025-06-03 15:37:26 +00:00
Ievgen Sorokopud
550cddf5ee
Granting kibana_system reserved role access to "all" privileges to .adhoc.alerts* and .internal.adhoc.alerts* indices (#127321)
* Granting `kibana_system` reserved role access to "all" privileges to `.adhoc.alerts*` and `.internal.adhoc.alerts*` indices

* Update docs/changelog/127321.yaml

* [CI] Auto commit changes from spotless

* Replace `"all"` with the specific privileges for the `kibana_system` role

* Fix tests

* Fix CI

* Updated privileges

* Updated privileges

Add `"maintenance"` to allow `refresh=true` option on bulk API call.

* Remove redundant code

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-06-03 15:37:52 +02:00
Charlotte Hoblik
38fb46d366
Add Connectors release notes for 9.0.2 (#128555) 2025-06-03 15:24:06 +02:00
Pawan Kartik
3f1e1b3c30
Handle the indices pattern ["*", "-*"] when grouping indices by cluster name (#128610)
The auth code injects the pattern `["*", "-*"]` to specify that it's okay to return an empty response because the user's patterns did not match any remote clusters. However, we fail to recognise this specific pattern and `groupIndices()` eventually associates it with the local cluster. This causes Elasticsearch to unknowingly fall back to the local cluster and return its details to the user even though the user hasn't requested any info about the local cluster.
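
A simplified Python sketch of the idea follows. The function, the `LOCAL_CLUSTER` key and the grouping logic are hypothetical stand-ins and do not reproduce the real `groupIndices()`; the sketch only shows the special pattern being recognised before any expression is assigned to the local cluster.

```python
from typing import Dict, List

LOCAL_CLUSTER = ""                 # hypothetical key for the local cluster
NO_MATCH_PATTERN = ["*", "-*"]     # pattern injected by the auth code


def group_indices(expressions: List[str]) -> Dict[str, List[str]]:
    """Group `cluster:index` expressions by cluster name."""
    if list(expressions) == NO_MATCH_PATTERN:
        # Nothing was requested from any cluster, the local one included.
        return {}
    groups: Dict[str, List[str]] = {}
    for expression in expressions:
        cluster, separator, index = expression.partition(":")
        if separator:
            groups.setdefault(cluster, []).append(index)
        else:
            groups.setdefault(LOCAL_CLUSTER, []).append(expression)
    return groups
```

Without the early return, both `*` and `-*` lack a cluster prefix and would end up under the local cluster.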
2025-06-03 13:51:36 +01:00
Luigi Dell'Aquila
d1d7302574
ES|QL: Add support for LOOKUP JOIN on aliases (#128519) 2025-06-03 14:23:49 +03:00
Craig Taverner
11f0c5526a
ES|QL Support for ST_GEOHASH, ST_GEOTILE and ST_GEOHEX (#125143)
Added support for the three primary scalar grid functions:
* `ST_GEOHASH(geom, precision)`
* `ST_GEOTILE(geom, precision)`
* `ST_GEOHEX(geom, precision)`

There are also versions of these three that take an optional `geo_shape` boundary (must be a `BBOX`, i.e. `Rectangle`).

Conversion functions that convert the grid-id from long to string and back to long are supported as well.

This work represents the core of the feature to support geo-grid aggregations in ES|QL.
2025-06-03 11:49:34 +02:00
Johannes Fredén
2696451275
Add retry for AccessDeniedException in AbstractFileWatchingService (#128653)
* Unmute testSymlinkUpdateTriggerReload

* Add retry for AccessDeniedException in AbstractFileWatchingService

* Update docs/changelog/128653.yaml
2025-06-03 11:36:58 +02:00
Niels Bauman
f988611691
React more promptly to task cancellation while waiting for the cluster to unblock (#128737)
Instead of waiting for the next run of the `ClusterStateObserver` (which
might be arbitrarily far in the future, but bound by the timeout if one
is set), we notify the listener immediately that the task has been
cancelled. While doing so, we ensure we invoke the listener only once.
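
The "invoke the listener only once" guarantee can be sketched in Python as below. The class and method names are hypothetical and this is not the Elasticsearch implementation; it only shows one common way to let a cancellation callback and a later observer callback race safely.

```python
import threading
from typing import Callable


class NotifyOnceListener:
    """Wraps two callbacks so that at most one of them ever runs, and only once."""

    def __init__(self,
                 on_response: Callable[[object], None],
                 on_failure: Callable[[Exception], None]) -> None:
        self._on_response = on_response
        self._on_failure = on_failure
        self._lock = threading.Lock()
        self._notified = False

    def _claim(self) -> bool:
        # Only the first caller gets True; later callers are ignored.
        with self._lock:
            if self._notified:
                return False
            self._notified = True
            return True

    def on_response(self, value: object) -> None:
        if self._claim():
            self._on_response(value)

    def on_failure(self, error: Exception) -> None:
        if self._claim():
            self._on_failure(error)
```

If cancellation calls `on_failure` first, a later `on_response` from the observer becomes a no-op.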

Fixes #117971
2025-06-03 11:00:20 +03:00
Lloyd
70368c26e5
Add transport version support for IDP_CUSTOM_SAML_ATTRIBUTES_ADDED_8_19 (#128798) 2025-06-03 15:45:18 +09:00
Lloyd
2625200341
Implement SAML custom attributes support for Identity Provider (#128176)
* Implement SAML custom attributes support for Identity Provider

This commit adds support for custom attributes in SAML single sign-on requests
in the Elasticsearch X-Pack Identity Provider plugin. This feature allows
passage of custom key-value attributes in SAML requests and responses.

Key components:
- Added SamlInitiateSingleSignOnAttributes class for holding attributes
- Added validation for null and empty attribute keys
- Updated request and response objects to handle attributes
- Modified authentication flow to process attributes
- Added test coverage to validate attributes functionality

The implementation follows Elasticsearch patterns with robust validation
and serialization mechanisms, while maintaining backward compatibility.

* Add test for SAML custom attributes in authentication response

This commit adds a comprehensive test that verifies SAML custom attributes
are correctly handled in the authentication response builder. The test ensures:

1. Custom attributes with single and multiple values are properly included
2. The response with custom attributes is still correctly signed
3. The XML schema validation still passes with custom attributes
4. We can locate and verify individual attribute values in the response

This provides critical test coverage for the SAML custom attributes
feature implementation.

* Add backward compatibility overload for SuccessfulAuthenticationResponseMessageBuilder.build

This commit adds an overloaded build method that accepts only two parameters
(user and authenticationState) and forwards the call to the three-parameter
version with null for the customAttributes parameter. This maintains backward
compatibility with existing code that doesn't use custom attributes.

This fixes a compilation error in ServerlessSsoIT.java which was still using
the two-parameter method signature.

Signed-off-by: lloydmeta <lloydmeta@gmail.com>

* Add validation for duplicate SAML attribute keys

This commit enhances the SAML attributes implementation by adding validation
for duplicate attribute keys. When the same attribute key appears multiple
times in a request, the validation will now fail with a clear error message.

Signed-off-by: lloydmeta <lloydmeta@gmail.com>

* Refactor SAML attributes validation to follow standard patterns

This commit improves the SAML attributes validation by:

1. Adding a dedicated validate() method to SamlInitiateSingleSignOnAttributes
   that centralizes validation logic in one place
2. Moving validation from constructor to dedicated method for better error reporting
3. Checking both for null/empty keys and duplicate keys in the validate() method
4. Updating SamlInitiateSingleSignOnRequest to use the new validation method
5. Adding comprehensive tests for the new validation approach

These changes follow standard Elasticsearch validation patterns, making the
code more maintainable and consistent with the rest of the codebase.

* Update docs/changelog/128176.yaml

* Improve SAML response validation in identity provider tests

Enhanced the testCustomAttributesInIdpInitiatedSso test to properly validate
both SAML response structure and custom attributes using DOM parsing and XPath.

Key improvements:
- Validate SAML Response/Assertion elements exist
- Precisely validate custom attributes (department, region) and their values
- Use namespace-aware XML parsing for resilience to format changes

Signed-off-by: lloydmeta <lloydmeta@gmail.com>

* Simplify SAML attributes representation using JSON object/Map structure

Also, replace internal Attribute class list with a simpler Map<String, List<String>>
structure

This change:

- Removes the redundant Attribute class and replaces it with a direct Map
  implementation for storing attribute key-value pairs
- Eliminates the duplicate "attributes" nesting in the JSON structure
- Simplifies attribute validation without needing duplicate key checking

- Updates all related tests and integration points to work with the new structure

Before:

```js
{
  // others
  "attributes": {
    "attributes": [
      {
        "key": "department",
        "values": ["engineering", "product"]
      }
    ]
  }
}
```

After:

```js
{
  // other
  "attributes": {
    "department": ["engineering", "product"]
  }
}
```

(Verified by spitting out JSON entity in IdentityProviderAuthenticationIT.generateSamlResponseWithAttributes
... saw `{"entity_id":"ec:123456:abcdefg","acs":"https://sp1.test.es.elasticsearch.org/saml/acs","attributes":{"department":["engineering","product"],"region":["APJ"]}}`)

Signed-off-by: lloydmeta <lloydmeta@gmail.com>

* * Fix up toString dangling quote.

Signed-off-by: lloydmeta <lloydmeta@gmail.com>

* * Remove attributes from Response object.

Signed-off-by: lloydmeta <lloydmeta@gmail.com>

* * Remove friendly name.
* Make attributes map final in SamlInitiateSingleSignOnAttributes

Signed-off-by: lloydmeta <lloydmeta@gmail.com>

* * Cleanup serdes by using existing utils in the ES codebase

Signed-off-by: lloydmeta <lloydmeta@gmail.com>

* Touchup comment

Signed-off-by: lloydmeta <lloydmeta@gmail.com>

* Update x-pack/plugin/identity-provider/src/test/java/org/elasticsearch/xpack/idp/action/SamlInitiateSingleSignOnRequestTests.java

Co-authored-by: Tim Vernum <tim@adjective.org>

* Add transport-version checks

---------

Signed-off-by: lloydmeta <lloydmeta@gmail.com>
Co-authored-by: Tim Vernum <tim@adjective.org>
2025-06-03 05:26:41 +03:00
Mayya Sharipova
080a0cdd89
Enable sort optimization on int, short and byte fields (#127968)
Before this PR sorting on integer, short and byte fields types used
SortField.Type.LONG. This made sort optimization impossible for these
field types.

This PR uses SortField.Type.INT for integer, short and byte fields. This
enables sort optimization.

There are several caveats with changing sort type that are addressed:
- Before, mixed sort on integer and long fields was automatically
  supported, as both field types used SortField.Type.LONG. Now when
  merging results from different shards, we need to convert sort to LONG
  and results to long values.
- Similar for collapsing when there is mixed INT and LONG sort types.
- Index sorting. Similarly, before, for index sorting on an integer
  field, SortField.Type.LONG was used. This sort type is stored in the
  index writer config on disk and can't be modified. Now when providing
  sortField() for index sorting, we need to account for index version:
  for older indices return sort with SortField.Type.LONG and for new
  indices return SortField.Type.INT.

---

There is only one change that may be considered not backwards compatible:
Before, if an integer field was [missing a
value](https://www.elastic.co/docs/reference/elasticsearch/rest-apis/sort-search-results#_missing_values),
its sort value was returned as Long.MAX_VALUE in a search response. With
this change, its sort value will be returned as Integer.MAX_VALUE. But I
think this change is ok, as in our documentation, we don't provide
information on what value will be returned, we just say it will be
sorted last.
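
To make the size of that difference concrete, here is a small Python sketch. The helper name is hypothetical; the two constants are the values of Java's `Long.MAX_VALUE` and `Integer.MAX_VALUE`.

```python
LONG_MAX = 2**63 - 1  # Java Long.MAX_VALUE    = 9223372036854775807
INT_MAX = 2**31 - 1   # Java Integer.MAX_VALUE = 2147483647


def missing_sort_value(uses_int_sort: bool) -> int:
    """Sort value reported for a document with a missing integer field."""
    return INT_MAX if uses_int_sort else LONG_MAX


before = missing_sort_value(uses_int_sort=False)  # older behaviour
after = missing_sort_value(uses_int_sort=True)    # behaviour with this PR
```

A client that compared the returned sort value against `Long.MAX_VALUE` to detect missing fields would therefore need to be updated.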

---

Also closes #127965 (as same type validation is added for collapse
queries)
2025-06-03 07:50:11 +10:00
Ben Chaplin
13bce60be9
Fix inner hits + aggregations concurrency bug (#128036)
Fork InnerHitSubContext instances before source is fetched in 
aggregations to prevent inter-segment race conditions.

Relates to #122419
2025-06-02 16:44:53 -04:00
Jim Ferenczi
e7565b1f05
Optimize sparse vector stats collection (#128740)
This change improves the performance of sparse vector statistics gathering by using the document count of terms directly, rather than relying on the field name field to compute stats.
By avoiding per-term disk/network reads and instead leveraging statistics already loaded into leaf readers at index opening, we expect to significantly reduce overhead.

Relates to #128583
2025-06-02 17:05:55 +02:00
Martijn van Groningen
041c42a779
Remove use_default_lucene_postings_format feature flag (#128509)
Remove use_default_lucene_postings_format feature flag and
let the IndexMode decide whether to use the default Lucene postings format instead of checking for standard index mode.

The `Lucene101PostingsFormat` has now been used for a while behind a feature flag. Regressions were found but were fixed via apache/lucene#14511. The `Lucene101PostingsFormat` is now a better trade-off when the index mode is standard.
2025-06-02 17:04:08 +02:00
Tanguy Leroux
9b2252afb2
Fix computation of last block size in Azure concurrent multipart uploads (#128746)
Last part size is wrongly computed to 0 when the last part's length is
exactly equal to the size of a part. This would probably have been caught
by an existing assertion.
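
The class of bug can be shown with a short Python sketch. This is an illustration under assumptions, not the actual Azure upload code: the function names are hypothetical and the buggy variant uses a plain modulo, which may differ from how the original computation was written.

```python
from typing import List


def part_sizes_buggy(total: int, part_size: int) -> List[int]:
    """Last part via modulo: yields 0 when total is an exact multiple."""
    parts = -(-total // part_size)  # ceiling division
    last = total % part_size
    return [part_size] * (parts - 1) + [last]


def part_sizes(total: int, part_size: int) -> List[int]:
    """Last part as whatever remains after the full-size parts."""
    parts = -(-total // part_size)  # ceiling division
    last = total - (parts - 1) * part_size
    return [part_size] * (parts - 1) + [last]
```

Both variants assume `total > 0`. With `total=8` and `part_size=4`, the buggy version reports a final part of size 0 while the corrected one reports 4.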

Relates ES-11815
2025-06-03 00:11:16 +10:00
Mridula
81fba27b6b
Add l2_norm normalization support to linear retriever (#128504)
* New l2 normalizer added

* L2 score normaliser is registered

* test case added to the yaml

* Documentation added

* Resolved checkstyle issues

* Update docs/changelog/128504.yaml

* Update docs/reference/elasticsearch/rest-apis/retrievers.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Score 0 test case added to check for corner cases

* Edited the markdown doc description

* Pruned the comment

* Renamed the variable

* Added comment to the class

* Unit tests added

* Spotless and checkstyle fixed

* Fixed build failure

* Fixed the forbidden test
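
For reference, L2 normalization divides each score by the Euclidean norm of the whole score vector. The Python sketch below uses a hypothetical function name and is not the retriever's implementation; in particular, returning the scores unchanged when the norm is 0 is an assumption about how the all-zero corner case could be handled.

```python
import math
from typing import List


def l2_normalize(scores: List[float]) -> List[float]:
    """Scale scores so that the sum of their squares is 1."""
    norm = math.sqrt(sum(s * s for s in scores))
    if norm == 0.0:
        # Assumed corner-case handling: avoid dividing by zero.
        return list(scores)
    return [s / norm for s in scores]
```

For example, the scores `[3.0, 4.0]` have norm 5 and normalize to approximately `[0.6, 0.8]`.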

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-02 14:59:03 +01:00
Mike Pellegrini
adda402a4c
Fix minmax normalizer handling of single-doc result sets (#128689) 2025-06-02 09:39:44 -04:00
Liam Thompson
5a4c42819f
[DOCS] Move applies_to to sit under heading in ES release notes (#128731)
^^
2025-06-02 12:16:43 +02:00
Ioana Tagirta
abf5f00413
Document boost option for match_phrase (#128738) 2025-06-02 11:57:03 +02:00
Jim Ferenczi
83126135fa
Avoid unnecessary determinization in index pattern conflict checks (#128362)
Starting with Lucene 10, `CharacterRunAutomaton` is no longer determinized automatically.
In Elasticsearch 9, we adapted to this by eagerly determinizing automatons early (via `Regex#simpleMatchToAutomaton`).
However, this introduced a regression: operations like index template conflict checks, which only require intersection testing, now pay the cost of determinization, an expensive step that wasn't needed before. In some cases, especially when many wildcard patterns are involved, determinization can even fail due to state explosion.

This change removes the unnecessary determinization, restoring the pre-9.0 behavior and allowing valid index templates with many patterns to be registered again.
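
To see why intersection testing does not need a deterministic automaton, consider the Python sketch below. It is not Lucene's or Elasticsearch's code and the function name is hypothetical; it handles only literal characters and `*`, and decides whether two patterns can match a common string by walking pairs of positions in the two patterns.

```python
from functools import lru_cache


def patterns_overlap(a: str, b: str) -> bool:
    """Return True if some string matches both wildcard patterns."""

    @lru_cache(maxsize=None)
    def go(i: int, j: int) -> bool:
        if i == len(a) and j == len(b):
            return True
        if i < len(a) and a[i] == '*':
            # The star matches nothing, or absorbs the next element of b.
            return go(i + 1, j) or (j < len(b) and go(i, j + 1))
        if j < len(b) and b[j] == '*':
            return go(i, j + 1) or (i < len(a) and go(i + 1, j))
        if i < len(a) and j < len(b) and a[i] == b[j]:
            return go(i + 1, j + 1)
        return False

    return go(0, 0)
```

The work is bounded by the number of position pairs, so it stays polynomial in the pattern lengths, whereas determinizing first can blow up in the number of states.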
2025-06-02 10:39:35 +02:00
George Wallace
4eca31756f
Update dissect-processor.md (#128708) 2025-06-02 08:49:45 +02:00
Yang Wang
aa0397fb49
Update shardGenerations for all indices on snapshot finalization (#128650)
If an index is deleted after a snapshot has written its shardGenerations 
file but before the snapshot is finalized, we exclude this index from the 
snapshot because its indexMetadata is no longer available. However, 
the shardGenerations file is still valid in that it is the latest copy with all 
necessary information despite it containing an extra snapshot entry. 
This is OK. Instead of dropping this shardGenerations file, this PR 
carries it forward by updating RepositoryData and relevant 
in-progress snapshots so that the next finalization builds on top of this one.

Co-authored-by: David Turner <david.turner@elastic.co>
2025-06-02 15:09:04 +10:00
Ankita Kumar
3e0584a3c0
Modify the mechanism to pause indexing (#128405)
This PR changes the mechanism to pause indexing which was introduced in #127173. The original PR caused IndexStatsIT#testThrottleStats to fail. See #126359.
2025-05-30 14:39:01 -04:00
David Turner
3443a6ef3b
Improve get-snapshots message for unreadable repository (#128273)
Today if the get-snapshots API cannot access one of the repositories we
return a fairly low-level message about the problem, perhaps something
like `Could not determine repository generation from root blobs`. This
message is shown verbatim in the Kibana UI so users need something a
little more descriptive. With this commit we wrap the exception in one
that indicates the problem in terms that users are more likely to
understand.

Relates #128208
2025-05-30 21:02:49 +10:00
Bogdan Pintea
0a8091605b
ESQL: Pushdown constructs doing case-insensitive regexes (#128393)
This introduces an optimization to push down to Lucene those language constructs that aim at case-insensitive regular expression matching, used with `LIKE` and `RLIKE` operators, such as:
* `| WHERE TO_LOWER(field) LIKE "abc*"`
* `| WHERE TO_UPPER(field) RLIKE "ABC.*"` 
 
These are now pushed down to Lucene as case-insensitive `wildcard` and `regexp` queries, respectively.
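
The equivalence behind this rewrite can be checked with a small Python sketch. The function names are hypothetical, and Python's `fnmatch` is used only as a stand-in for `LIKE` wildcards (it also accepts `[seq]` classes, which `LIKE` does not); the sketch assumes an all-lowercase pattern and ASCII values.

```python
import fnmatch
import re


def lowered_like(value: str, pattern: str) -> bool:
    """Mimics `TO_LOWER(field) LIKE pattern`: lower-case first, then match."""
    return fnmatch.fnmatchcase(value.lower(), pattern)


def case_insensitive_like(value: str, pattern: str) -> bool:
    """Mimics the pushed-down form: a single case-insensitive wildcard match."""
    regex = fnmatch.translate(pattern)
    return re.fullmatch(regex, value, flags=re.IGNORECASE) is not None
```

For a lowercase pattern such as `abc*`, both functions agree on every ASCII input, which is what makes it safe to replace the per-row `TO_LOWER` call with one case-insensitive query.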

Closes #127479
2025-05-30 10:55:00 +02:00
Mridula
cc461afa0a
Fix: Allow non-score sorts in pinned retriever sub-retrievers (#128323)
* Fix scoring and sort handling in pinned retriever

* Remove books.es from version control and add to .gitignore

* Remove books.es from version control and add to .gitignore

* Remove books.es entries from .gitignore

* fixed the mess

* Update x-pack/plugin/search-business-rules/src/test/java/org/elasticsearch/xpack/searchbusinessrules/retriever/PinnedRetrieverBuilderTests.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update x-pack/plugin/search-business-rules/src/main/java/org/elasticsearch/xpack/searchbusinessrules/retriever/PinnedRetrieverBuilder.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-30 10:01:01 +03:00
Nhat Nguyen
1ab2e6ca6c
Combine small pages in Limit (#128531)
Currently, the Limit operator does not combine small pages into larger 
ones; it simply passes them along, except for chunking pages larger than
the limit. This change integrates EstimatesRowSize into Limit and
adjusts it to emit larger pages. As a result, pages up to twice the
pageSize may be emitted, which is preferable to emitting undersized
pages. This should reduce the number of transport requests and responses
between clusters or coordinator-data nodes for queries without TopN or
STATS when target shards produce small pages due to their size or highly
selective filters.
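
The combining behaviour can be sketched in Python as follows. This is a simplified model, not the ES|QL operator: pages are plain lists of rows, the function name is hypothetical, and a fixed `page_size` stands in for the row-size estimate the real operator takes from EstimatesRowSize.

```python
from typing import Iterable, Iterator, List


def limit_and_combine(pages: Iterable[List[int]],
                      limit: int,
                      page_size: int) -> Iterator[List[int]]:
    """Apply a row limit while merging small pages into larger ones."""
    remaining = limit
    buffer: List[int] = []
    for page in pages:
        if remaining == 0:
            break
        taken = page[:remaining]        # chunk pages that exceed the limit
        remaining -= len(taken)
        buffer.extend(taken)
        if len(buffer) >= page_size:    # large enough to emit
            yield buffer
            buffer = []
    if buffer:
        yield buffer                    # flush what is left at the end
```

When every incoming page is smaller than `page_size`, the buffer holds fewer than `page_size` rows before each append, so an emitted page stays below twice `page_size`, in line with the bound described above.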
2025-05-29 16:04:49 -07:00
Jonathan Buttner
9db18373ba
Adding configurable inference service (#127939)
* Inference changes

* Custom service fixes

* Update docs/changelog/127939.yaml

* Cleaning up from failed merge

* Fixing changelog

* [CI] Auto commit changes from spotless

* Fixing test

* Adding feature flag

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-05-30 00:12:19 +03:00
elasticsearchmachine
e241efabb6 Prune changelogs after 8.18.2 release 2025-05-29 17:47:46 +00:00
Leonardo Hoet
107daf3321
Implemented ChatCompletion task for Google VertexAI with Gemini Models (#128105)
* Implemented ChatCompletion task for Google VertexAI with Gemini Models

* changelog

* System Instruction bugfix

* Mapping role assistant -> model in vertex ai chat completion request for compatibility

* GoogleVertexAI chat completion using SSE events. Removed JsonArrayEventParser

* Removed buffer from GoogleVertexAiUnifiedStreamingProcessor

* Casting inference inputs with `castoTo`

* Registered GoogleVertexAiChatCompletionServiceSettings in InferenceNamedWriteablesProvider. Added InferenceSettingsTests

* Changed transport version to 8_19 for vertexai chatcompletion

* Fix to transport version. Moved ML_INFERENCE_VERTEXAI_CHATCOMPLETION_ADDED to the right location

* VertexAI Chat completion request entity jsonStringToMap using `ensureExpectedToken`

* Fixed TransportVersions. Left vertexAi chat completion 8_19 and added new one for ML_INFERENCE_VERTEXAI_CHATCOMPLETION_ADDDED

* Refactor switch statements by if-else for older java compatibility. Improved indentation via `{}`

* Removed GoogleVertexAiChatCompletionResponseEntity and refactored code around it.

* Removed redundant test `testUnifiedCompletionInfer_WithGoogleVertexAiModel`

* Returning whole body when fail to parse response from VertexAI

* Refactor use GenericRequestManager instead of GoogleVertexAiCompletionRequestManager

* Refactored to constructorArg for mandatory args in GoogleVertexAiUnifiedStreamingProcessor

* Changed transport version in GoogleVertexAiChatCompletionServiceSettings

* Bugfix in tool calling with role tool

* GoogleVertexAiModel added documentation info on rateLimitGroupingHash

* [CI] Auto commit changes from spotless

* Fix: using Locale.ROOT when calling toLowerCase

* Fix: Renamed test class to match convention & modified use of forbidden api

* Fix: Failing test in InferenceServicesIT

---------

Co-authored-by: lhoet <lhoet@google.com>
Co-authored-by: Jonathan Buttner <56361221+jonathan-buttner@users.noreply.github.com>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-05-29 13:35:24 -04:00
Chris Hegarty
7f05ab9cf6
Fix and test off-heap stats when using direct IO for accessing the raw vectors (#128615)
Fix and test off-heap stats when using direct IO for accessing the raw vectors. The direct IO reader is not using off-heap, so it returns an empty map to indicate that there are no off-heap requirements. I added some overloads of tests with different directories to verify this.

Note: For 9.1 we're still using reflection to access the internals of non-ES readers, but DirectIO is an ES reader so we can use our internal OffHeapStats interface (rather than reflection). This is all replaced when we eventually get Lucene 10.3.
2025-05-29 17:43:07 +01:00