Always return `KEYWORD` for functions that previously returned `TEXT`, because any change to the value, no matter how small, is enough to render meaningless the original analyzer associated with the `TEXT` field value. In principle, if the attribute is no longer the original `FieldAttribute`, it can no longer claim to have the type `TEXT`.
This has been done for all functions: conversion functions, aggregating functions, multi-value functions. Several already produced `KEYWORD` for `TEXT` input (e.g. ToString, FromBase64 and ToBase64, MvZip, ToLower, ToUpper, DateFormat, Concat, Left, Repeat, Replace, Right, Split, Substring), but many others incorrectly claimed to produce `TEXT`. This PR makes the rule strict, and changes the functions' unit tests so that no test may expect a function's output to be `TEXT`.
One side effect of this change is that functions that take multiple parameters and require all of them to have the same type will now treat `TEXT` and `KEYWORD` the same. This was already the case for functions like `Concat`, but is now also the case for `Greatest`, `Least`, `Case`, `Coalesce` and `MvAppend`.
An associated change is that the type casting operator `::text` has been entirely removed. It used to map onto the `ToString` function, which returns type `KEYWORD`, so `::text` really produced a `KEYWORD`. That was misleading, or at least a bug, and is now fixed. Should we ever wish to produce real `TEXT`, this operator has been freed up for future use (although it seems likely that such a function will require parameters to specify the analyzer, so it might never be an operator again).
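The rule described above can be illustrated with a minimal sketch. Everything in it is hypothetical and heavily reduced: the `DataType` enum, the `noText` helper and the `commonType` method are stand-ins invented for illustration, not the actual ES|QL classes. It only shows the idea that a derived value never reports `TEXT`, and that same-type parameter checks compare types after that normalization.

```java
import java.util.List;

public class TextToKeywordSketch {
    // Hypothetical, reduced stand-in for the ES|QL data type enum.
    public enum DataType {
        TEXT, KEYWORD, INTEGER, DATETIME;

        // Any value derived from a TEXT field is reported as KEYWORD,
        // because the original analyzer no longer applies to it.
        public DataType noText() {
            return this == TEXT ? KEYWORD : this;
        }
    }

    // Hypothetical type resolution for functions whose parameters must all
    // share one type (for example Coalesce): TEXT and KEYWORD are treated alike.
    public static DataType commonType(List<DataType> argTypes) {
        DataType first = argTypes.get(0).noText();
        for (DataType type : argTypes) {
            if (type.noText() != first) {
                throw new IllegalArgumentException("incompatible types: " + argTypes);
            }
        }
        return first;
    }

    public static void main(String[] args) {
        // Mixed TEXT and KEYWORD arguments resolve to KEYWORD.
        System.out.println(commonType(List.of(DataType.TEXT, DataType.KEYWORD)));
        // Non-string types are left untouched.
        System.out.println(DataType.INTEGER.noText());
    }
}
```

With this shape, a function's declared output type can never be `TEXT`, which is the property the updated unit tests enforce.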
### Backwards compatibility issues:
This change will fail BWC tests, since we have many tests that assert on `TEXT` output from functions. For this reason we needed to block two scenarios:
* We used the capability `functions_never_emit_text` to prevent 7 csv-spec tests and 2 yaml tests from being run against older versions that still emit text.
* We used `skipTest` to also block those two yaml tests from being run against the latest build when it uses older downloaded yaml files (as far back as 8.14).
In all cases the change observed in these tests was simply the results columns no longer having `text` type, and instead being `keyword`.
---------
Co-authored-by: Luigi Dell'Aquila <luigi.dellaquila@gmail.com>
This PR adds detailed documentation for `logsdb` mode, covering several key aspects of its default behavior and configuration options.
It includes:
- default settings for index sorting (`index.sort.field`, `index.sort.order`, etc.).
- usage of synthetic `_source` by default.
- information about specialized codecs and how users can override them.
- default behavior for `ignore_malformed` and `ignore_above` settings, including precedence rules.
- explanation of how fields without `doc_values` are handled and what we do if they are missing.
* Update settings endpoint modified
Now accepts index.routing.allocation.* settings but denies changing
the allocation setting that keeps watches on data nodes
* Get settings endpoint modified
Now returns index.routing.allocation.* settings and explicitly filters out
the `index.routing.allocation.include._tier_preference` setting
* Tests for modified endpoints
* Update docs
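The filtering on the get settings endpoint can be sketched roughly as follows. This is an illustration only: the class and method names are hypothetical, settings are modelled as a plain `Map<String, String>` rather than the real settings classes, and only the allocation part of the response is considered.

```java
import java.util.Map;
import java.util.TreeMap;

public class AllocationSettingsFilterSketch {
    static final String ALLOCATION_PREFIX = "index.routing.allocation.";
    static final String TIER_PREFERENCE = "index.routing.allocation.include._tier_preference";

    // Hypothetical helper: select the index.routing.allocation.* settings to
    // return, leaving out the tier preference setting.
    public static Map<String, String> visibleAllocationSettings(Map<String, String> indexSettings) {
        Map<String, String> visible = new TreeMap<>();
        for (Map.Entry<String, String> entry : indexSettings.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(ALLOCATION_PREFIX) && key.equals(TIER_PREFERENCE) == false) {
                visible.put(key, entry.getValue());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        Map<String, String> settings = Map.of(
            "index.routing.allocation.include.role", "watcher",
            TIER_PREFERENCE, "data_content"
        );
        // Only the first setting survives the filter.
        System.out.println(visibleAllocationSettings(settings));
    }
}
```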
The most relevant ES changes that upgrading to Lucene 10 requires are:
- use the appropriate IOContext
- Scorer / ScorerSupplier breaking changes
- Regex automata are no longer determinized by default
- minimize moved to test classes
- introduce Elasticsearch900Codec
- adjust slicing code according to the added support for intra-segment concurrency
- disable intra-segment concurrency in tests
- adjust accessor methods for many Lucene classes that became a record
- adapt to breaking changes in the analysis area
Co-authored-by: Christoph Büscher <christophbuescher@posteo.de>
Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>
Co-authored-by: ChrisHegarty <chegar999@gmail.com>
Co-authored-by: Brian Seeders <brian.seeders@elastic.co>
Co-authored-by: Armin Braun <me@obrown.io>
Co-authored-by: Panagiotis Bailis <pmpailis@gmail.com>
Co-authored-by: Benjamin Trent <4357155+benwtrent@users.noreply.github.com>
* docs: update synthetic source docs
* fix: also doc values false works
* Revert "fix: also doc values false works"
This reverts commit 0895a76758.
* fix: update synthetic source documentation
* fix: all field types support it
* fix: no need to explicitly mention it
* fix: synthetic source sorting
* fix: may instead of might
Because of #93575 it's not sufficient to mark repositories with
`readonly: true` while taking a backup. The only safe way to avoid
writes is to completely unregister them.
* (Doc+) Cross-link max shards
👋 It appears we have two docs with similar content about max open shards. This one contains the error users search for (so it is what we linked the error to in https://github.com/elastic/elasticsearch/pull/110993), but the other I believe is a placeholder doc for the health API code. We should maybe consolidate them some day, but in the meantime at least cross-link them.
---------
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
We will deprecate the `_source.mode` mapping level configuration
in favor of the index-level `index.mapping.source.mode` setting.
As a result, we go through the documentation and update it to reflect
the introduction of the setting.
S3 register reads are subject to the regular client retry policy, but in
practice we see failures of these reads sometimes for errors that are
transient but for which the SDK does not retry. This commit adds another
layer of retries to these reads.
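
The extra retry layer can be pictured with a small generic wrapper. This is only a sketch under simplifying assumptions: `readWithRetries` is a hypothetical name, it retries on any exception instead of only the transient errors the SDK skips, and it applies no backoff between attempts.

```java
import java.util.concurrent.Callable;

public class RegisterReadRetrySketch {
    // Hypothetical wrapper: run the read up to maxAttempts times and
    // rethrow the last failure if every attempt fails.
    public static <T> T readWithRetries(Callable<T> read, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return read.call();
            } catch (Exception e) {
                // Remember the failure and try again on the next iteration.
                last = e;
            }
        }
        throw last;
    }
}
```

A caller would wrap the register read in a lambda, for example `readWithRetries(() -> readRegister(), 3)`, where `readRegister` is a placeholder for the actual read.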
Relates ES-9721
* Add data stream template validation
to snapshot restore
* Add data stream template validation
to data stream promotion endpoint
* Add new assertion for response headers
Add a new assertion to synchronously execute a request and check the
response contains a specific warning header
* Test for warning header on snapshot restore
When missing templates
* Test for promotion warnings
* Add documentation for the potential error states
* PR changes
* Spotless reformatting
* Add logic to look in snapshot global metadata
This checks if the snapshot contains a matching template for the DS
* Comment on test cleanup to explain it was copied
* Removed cluster service field