Frozen indices, the freeze index API and the private index.frozen setting have been removed with #120539.
There is also a search throttled thread pool that can now be removed, as well as a private search.throttled index setting that is no longer used, as it could only be set internally by freezing an index.
While the index setting is private and can be removed, as it should no longer be present in any index on 9.0+, the thread pool settings associated with the removed pool are still accepted as no-ops in case users have customized them and are upgrading without removing them. These settings will also trigger a deprecation warning.
This change also removes the search.throttled related output from the thread pool section of the cluster info API.
* Fix Gradle Deprecation warning as declaring an is- property with a Boolean type has been deprecated.
* Make use of new layout.settingsFolder api to address some cross project references
* Fix buildParams snapshot check for multiproject projects
The keyword doc values field gets an extra sorted doc values field, that encodes the order of how array values were specified at index time. This also captures duplicate values. This is stored in an offset to ordinal array that gets zigzag vint encoded into a sorted doc values field.
For example, consider the following string array for a keyword field: ["c", "b", "a", "c"].
Sorted set doc values: ["a", "b", "c"] with ordinals: 0, 1 and 2. The offset array will be: [2, 1, 0, 2]
Null values are also supported. For example, ["c", "b", null, "c"] results in sorted set doc values: ["b", "c"] with ordinals: 0 and 1. The offset array will be: [1, 0, -1, 1]
Empty arrays are also supported by encoding a zigzag vint array of zero elements.
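The offset encoding described above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the leading element count in `encode_offsets` is an assumption rather than the byte layout this PR actually writes.

```python
def build_offsets(values):
    """Return (sorted unique non-null values, offset array) for one array."""
    sorted_values = sorted({v for v in values if v is not None})
    ordinal_of = {v: i for i, v in enumerate(sorted_values)}
    # Nulls have no ordinal, so they are recorded as -1.
    offsets = [-1 if v is None else ordinal_of[v] for v in values]
    return sorted_values, offsets


def rebuild(sorted_values, offsets):
    """Recover the original order, including duplicates and nulls."""
    return [None if o == -1 else sorted_values[o] for o in offsets]


def zigzag(n):
    """Map small signed ints to small unsigned ints: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return (n << 1) ^ (n >> 31)


def vint(n):
    """Variable-length int: 7 payload bits per byte, high bit set if more bytes follow."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def encode_offsets(offsets):
    # Assumption: element count first, then each offset as a zigzag vint.
    return vint(len(offsets)) + b"".join(vint(zigzag(o)) for o in offsets)
```

Zigzag encoding is what keeps the -1 used for nulls cheap: it becomes the unsigned value 1 and still fits in a single byte.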
Limitations:
* Currently only doc values based array support for the keyword field mapper.
* Multi-level leaf arrays are flattened. For example: [[b], [c]] -> [b, c]
* Arrays are always synthesized as one type. In the case of a keyword field, [1, 2] gets synthesized as ["1", "2"].

These limitations can be addressed, but some require more complexity and/or additional storage.
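The flattening and single-type limitations can be illustrated with a small Python sketch. The helper name `synthesize_keyword_array` is hypothetical and only mimics the observable result for a keyword field; it is not the mapper's implementation.

```python
def synthesize_keyword_array(value):
    """Flatten nested leaf arrays and render every non-null leaf as a string."""
    if isinstance(value, list):
        flat = []
        for item in value:
            flat.extend(synthesize_keyword_array(item))
        return flat
    if value is None:
        return [None]
    # A keyword field synthesizes every leaf as a string.
    return [str(value)]
```

For instance, `synthesize_keyword_array([["b"], ["c"]])` gives a flat two-element list, and integer leaves come back as strings.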
With this PR, keyword field arrays will no longer be stored in ignored source; instead, array offsets are tracked in an adjacent sorted doc values field. This only applies if index.mapping.synthetic_source_keep is set to arrays (default for logsdb).
This patch removes the check that fails requests that attempt to use fields of type: nested within indices with mode time_series.
This patch also updates TimeSeriesIdFieldMapper#postParse to set the _id field on child documents once it's calculated.
Closes #120874
This change disables auto_expand_replicas on lookup indices to enhance
the lookup join user experience. Users can, however, enable this setting
at any time to optimize performance.
When originally added, the knn query didn't apply `top-k` restrictions
to the query. Instead, it would allow the resulting `num_candidates` to
be combined with sibling queries without restricting to the top `size`
results ahead of time.
This is confusing behavior and leads to mistakes in understanding how
it all works.
This commit addresses this by eagerly gathering only `size` results when
`k==null` before combining with other queries.
To achieve the previous behavior, set `k==num_candidates` directly in
the query.
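As a rough sketch of that workaround, the query body can be built like this in Python. The helper names and the `embedding` field in the usage note are hypothetical; only the `knn` query parameters `field`, `query_vector`, `num_candidates` and `k` are taken from the query DSL.

```python
def knn_query(field, query_vector, num_candidates, k=None):
    """Build a knn query clause; k is omitted when not given."""
    body = {
        "field": field,
        "query_vector": query_vector,
        "num_candidates": num_candidates,
    }
    if k is not None:
        body["k"] = k
    return {"knn": body}


def knn_query_previous_behavior(field, query_vector, num_candidates):
    """Keep every candidate for sibling queries by setting k == num_candidates."""
    return knn_query(field, query_vector, num_candidates, k=num_candidates)
```

Calling `knn_query_previous_behavior("embedding", [0.1, 0.2], 50)` yields a clause with both `num_candidates` and `k` set to 50.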
* Reapply "[Build] Do not invalidate configuration cache when branch is switched (#118894)" (#119300)
The original PR (#118894) has broken serverless.
* Fix gitinfo plugin for serverless usage
* Update buildscan git revision reference
This pull request introduces a new retriever called `rescorer`, which leverages the `rescore` functionality of the search request.
The `rescorer` retriever re-scores only the top documents retrieved by its child retriever, offering fine-tuned scoring capabilities.
All rescorers supported in the `rescore` section of a search request are available in this retriever, and the same format is used to define the rescore configuration.
<details>
<summary>Example:</summary>
```yaml
- do:
    search:
      index: test
      body:
        retriever:
          rescorer:
            rescore:
              window_size: 10
              query:
                rescore_query:
                  rank_feature:
                    field: "features.second_stage"
                    linear: { }
                query_weight: 0
            retriever:
              standard:
                query:
                  rank_feature:
                    field: "features.first_stage"
                    linear: { }
        size: 2
```
</details>
Closes #118327
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
This measurably improves BBQ by adjusting the underlying algorithm to an
optimized per vector scalar quantization.
This is a brand new way to quantize vectors. Instead of there being a
global set of upper and lower quantile bands, these are optimized and
calculated per individual vector. Additionally, vectors are centered on
a common centroid.
This allows for an almost 32x reduction in memory, and even better
recall than before at the cost of slightly increasing indexing time.
Additionally, this new approach is easily generalizable to various other
bit sizes (e.g. 2 bits, etc.). While not taken advantage of yet, we may
update our scalar quantized indices in the future to use this new
algorithm, giving significant boosts in recall.
The recall gains range from 2% to almost 10% for certain datasets, with
an additional 5-10% indexing cost when indexing with HNSW, compared
with current BBQ.
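To make the idea of a per-vector interval concrete, here is a deliberately simplified Python sketch. It centers a vector on the centroid and then quantizes against that vector's own minimum and maximum. This is a plain min/max stand-in, not the optimized interval calculation this change implements, and all names are hypothetical.

```python
def quantize(vector, centroid, bits=1):
    """Quantize one vector against its own [lo, hi] interval after centering."""
    centered = [v - c for v, c in zip(vector, centroid)]
    lo, hi = min(centered), max(centered)
    levels = (1 << bits) - 1
    # Guard against a zero-width interval (all components equal).
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [min(levels, int((x - lo) / step + 0.5)) for x in centered]
    return codes, lo, step


def dequantize(codes, lo, step, centroid):
    """Approximate the original vector from its codes and per-vector interval."""
    return [lo + code * step + c for code, c in zip(codes, centroid)]
```

In this sketch each 32-bit float component shrinks to `bits` bits, while the interval (`lo`, `step`) has to be stored once per vector.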
Remove to, from, include_lower, include_upper range query params.
These params were removed from our documentation in v0.90.4 (d6ecdec),
and were deprecated in 8.16 in #113286.
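For callers that still send the removed params, a migration could look roughly like the Python sketch below. It assumes the replacements are the range query's `gt`, `gte`, `lt` and `lte` params, and that `include_lower` and `include_upper` defaulted to true; the helper name is hypothetical.

```python
def migrate_range_params(params):
    """Rewrite from/to/include_lower/include_upper into gt/gte/lt/lte."""
    migrated = dict(params)
    # Assumption: both bounds were inclusive unless stated otherwise.
    include_lower = migrated.pop("include_lower", True)
    include_upper = migrated.pop("include_upper", True)
    if "from" in migrated:
        lower = migrated.pop("from")
        if lower is not None:
            migrated["gte" if include_lower else "gt"] = lower
    if "to" in migrated:
        upper = migrated.pop("to")
        if upper is not None:
            migrated["lte" if include_upper else "lt"] = upper
    return migrated
```

Other params in the clause, such as `boost`, pass through unchanged.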
* Track source for objects and fields with [synthetic_source_keep:arrays] in arrays as ignored
* Update TransportResumeFollowActionTests.java
* rest compat fixes
* update test
A Lucene commit no longer contains sync ids in its `SegmentInfos`, so we can't rely on them during recovery. The field was marked as deprecated in #102343.
It was deprecated in #104209 (8.13) and shouldn't be set or returned in 9.0
The Desired Nodes API is an internal API, and users shouldn't depend on its backward compatibility.
The most relevant ES changes that upgrading to Lucene 10 requires are:
- use the appropriate IOContext
- Scorer / ScorerSupplier breaking changes
- Regex automata are no longer determinized by default
- minimize moved to test classes
- introduce Elasticsearch900Codec
- adjust slicing code according to the added support for intra-segment concurrency
- disable intra-segment concurrency in tests
- adjust accessor methods for many Lucene classes that became a record
- adapt to breaking changes in the analysis area
Co-authored-by: Christoph Büscher <christophbuescher@posteo.de>
Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>
Co-authored-by: ChrisHegarty <chegar999@gmail.com>
Co-authored-by: Brian Seeders <brian.seeders@elastic.co>
Co-authored-by: Armin Braun <me@obrown.io>
Co-authored-by: Panagiotis Bailis <pmpailis@gmail.com>
Co-authored-by: Benjamin Trent <4357155+benwtrent@users.noreply.github.com>
This commit bumps the REST API version from 8 to 9. This effectively removes all support for REST API
compatibility with version 7 (though earlier commits already chipped away at some v7 support).
This also enables REST API compatibility support for version 8, providing support for v8 compatibility headers,
i.e. "application/vnd.elasticsearch+json;compatible-with=8" and no-op support (no errors) to accept v9
compatibility headers i.e. "application/vnd.elasticsearch+json;compatible-with=9".
See additional context in GH PR #113151.
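A client opting into compatibility mode would send the media type quoted above in its request headers. The Python sketch below only builds those headers; the helper name is hypothetical and no HTTP client is assumed.

```python
def compat_headers(major_version, has_body=True):
    """Build REST API compatibility headers for the given major version."""
    media_type = (
        f"application/vnd.elasticsearch+json;compatible-with={major_version}"
    )
    headers = {"Accept": media_type}
    if has_body:
        # Requests with a body declare the same media type for the payload.
        headers["Content-Type"] = media_type
    return headers
```

With this change, `compat_headers(8)` requests v8-compatible behavior, while `compat_headers(9)` is accepted without error.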