We have gotten more than one SDH due to customers not understanding
why restarts involving fully-mounted indices can pull a lot of data
from the snapshot tier, so it may help to be more explicit about
why this happens and how it can be avoided.
* Add total rule type counts to list calls and xpack usage
* Add feature
* Update docs/changelog/116357.yaml
* Fix docs test failure & update yaml tests
* remove additional spaces
---------
Co-authored-by: Mark J. Hoy <mark.hoy@elastic.co>
Adds clarification for vector preloading: which file extension maps to
which storage kind, and that quantized vectors are stored in separate
files, allowing them to be preloaded individually.
closes: https://github.com/elastic/elasticsearch/issues/116273
This commit does the following:
* Add a new monitor_stats privilege
* Ensure that monitor_stats can be set in the remote_cluster privileges
* Gives Kibana the ability to remotely call monitor_stats via RCS 2.0
Since this is the first case where there is more than 1 remote_cluster privilege,
the following framework concerns have been addressed:
* Ensure that when sending to older RCS 2.0 clusters we don't send the new privilege
(previously only all-or-nothing remote_cluster blocks were supported)
* Ensure that when we send API key role descriptors that contain remote_cluster,
we don't send the new privileges for RCS 1.0/2.0 if the remote cluster is not new enough
* Fix and extend the BWC tests for RCS 1.0 and RCS 2.0
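
A rough sketch of the version gating described above, for illustration only.
The function name, the integer version scheme, and the version numbers are
hypothetical and are not taken from the Elasticsearch code base; `monitor_enrich`
is assumed here to be the pre-existing remote_cluster privilege.

```python
def privileges_for_remote(requested, remote_version, min_version_by_privilege):
    """Return only the privileges the remote cluster is new enough to understand.

    Privileges without an entry in min_version_by_privilege are treated as
    understood by every version (hypothetical convention for this sketch).
    """
    return [
        privilege
        for privilege in requested
        if remote_version >= min_version_by_privilege.get(privilege, 0)
    ]


# Hypothetical version numbers: the new privilege is understood from version 2 on.
MIN_VERSIONS = {"monitor_stats": 2}
REQUESTED = ["monitor_enrich", "monitor_stats"]

older_remote = privileges_for_remote(REQUESTED, 1, MIN_VERSIONS)
newer_remote = privileges_for_remote(REQUESTED, 2, MIN_VERSIONS)
```

With these made-up versions, the older remote only receives `monitor_enrich`,
while the newer remote receives both privileges.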
Corrects the explanation of `percentiles_bucket` so it's clear that it
returns the `nth` largest item always, and it rounds `n` towards
infinity. That's how it's worked since 2016 but the docs talked about
"not greater than" which I don't think is particularly clear.
If xpack.ml.use_auto_machine_memory_percent is not explicitly set to true then
the default value (false) means ML will only use 30% of the available memory, making
it impractical to run the ELSER model. This is useful for users wanting to get started
with semantic search. The single node docker instructions have been updated with a
command that gives the container enough memory to run the ELSER model and enables
xpack.ml.use_auto_machine_memory_percent. For the multi-node guide the docker
compose file is updated to enable the ml setting for every node in the cluster.
Documentation for the remote_cluster in the role was added
in #111682 and #108840, but a few places were missed.
This commit fills the gaps in the documentation.
This adds bitwise inner product to painless.
The idea here is:
- For two bit arrays, which we take to be byte arrays whose length matches `dense_vector.dim/8`, we simply return the bit count of the bitwise `&`
- For a stored bit array (remember, with `dense_vector.dim/8` bytes), sum up the provided byte or float array using the bit array as a mask.
This is effectively supporting asymmetric quantization. A prime
example of how this works is:
https://github.com/cohere-ai/BinaryVectorDB
Basically, you do your initial search against the binary space and then
rerank with a differently quantized vector allowing for more information
without additional storage space.
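
The two cases above can be sketched in plain Python. This is not the painless
implementation: the function names are hypothetical, and the most-significant-bit-first
ordering within each byte is an assumption made for this example.

```python
def and_bit_count(a: bytes, b: bytes) -> int:
    """Inner product of two bit arrays: count the bits set in a & b."""
    if len(a) != len(b):
        raise ValueError("bit arrays must have the same number of bytes")
    return sum(bin(x & y).count("1") for x, y in zip(a, b))


def masked_sum(stored_bits: bytes, query) -> float:
    """Sum the query values at the positions where the stored bit is set.

    The query must have 8 values per stored byte (one per bit).
    """
    if len(query) != len(stored_bits) * 8:
        raise ValueError("query length must be 8 * number of stored bytes")
    total = 0.0
    for i, value in enumerate(query):
        # Assumption: bit 0 of the vector is the most significant bit of byte 0.
        if stored_bits[i // 8] & (0x80 >> (i % 8)):
            total += value
    return total


bit_result = and_bit_count(bytes([0b10110000]), bytes([0b10010001]))
float_result = masked_sum(bytes([0b10100000]), [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
```

Here `bit_result` counts the two positions set in both arrays, and `float_result`
adds the first and third query values because only those bits are set in the
stored array.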
closes: https://github.com/elastic/elasticsearch/issues/111232
The docs kinda imply that circuit breakers protect against OOMEs, at
least that's how some customers seem to interpret them. This commit adds
a note spelling out that this isn't the case.
While working on #110008 I discovered that the Date Trunc tests were only running in folding mode, because the interval types are marked as not representable. The correct way to test this is to set the forceLiteral flag for those fields, which will (as the name suggests) force them to be literals even in non-folding tests.
Doing that turned up errors in the evaluatorToString tests, which I fixed. There are two big changes here. First, the second parameter to the evaluator is a Rounding instance, not the actual interval. Since Rounding includes some information about the specific rounding in the toString results, I am just using a starts with matcher to validate the majority of the string, rather than trying to reconstruct the expected rounding string. Second, passing in a literal null for the interval parameter folds the whole expression to null, and thus a completely different toString. I added a clause in AnyNullIsNull to account for this.
While I was in there, I moved some specific test cases to a different file. I know moving code is something we're trying to minimize right now, but this seemed worth it. The tests in question do not depend on the parameters of the test case, but all methods in the class get run for every set of parameters. This was causing these tests to be run many times with the same values, which bloats our test run time and test count. Moving them to a distinct class means they'll only be executed once per test run. I feel like this benefit outweighs the cost of git history complexity.
* Refine ESQL limitations (full-text, TEXT fields, unassigned indexes)
This PR refactors a section of the ES|QL Limitations page to:
* Refactor both full-text and text-behaves-as-keyword sections to better reflect the new behaviour (the old text implies that no full-text search of any kind exists anywhere, which immediately contradicts the statements directly above it).
* Update text-behaves-as-keyword to include my recent work on making all functions return KEYWORD instead of TEXT or SEMANTIC_TEXT
* Add a section on multi-index querying to cover two limitations (union types and unassigned indexes).
* Fix full-text-search examples
This PR adds telemetry for logsdb. However, this change only tracks the
count of indices using logsdb and those that use synthetic source.
Additional stats, such as shard, indexing, and search stats, will be
added in a follow-up, as they require reaching out to data nodes.
A Lucene commit's `SegmentInfos` doesn't contain sync ids anymore, so we can't rely on them during recovery. The field was marked as deprecated in #102343.
Now that the match and qstr functions are in Tech Preview, we should add them to the top-level functions doc page.
Co-authored-by: Craig Taverner <craig@amanzi.com>
This commit prepares the documentation for version 9.
Some of the automation generates docs that are not correct for version 9.
The content has been commented out with a reference to an internal issue
for us to address before this documentation is used.
* Term Stats documentation
* Update docs/reference/reranking/learning-to-rank-model-training.asciidoc
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Fix query example.
---------
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>