Add the `semantic` query to the Query DSL. It is used to query `semantic_text` fields.
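As a rough illustration (the index name, field name and query text are made up), a search using the new query might look like this, sketched with the low-level REST client (`org.elasticsearch.client.Request`):
```java
// Sketch: query a semantic_text field with the new `semantic` query.
Request search = new Request("GET", "/my-index/_search");
search.setJsonEntity("""
    {
      "query": {
        "semantic": {
          "field": "my_semantic_text_field",
          "query": "shoes suitable for hiking in wet weather"
        }
      }
    }
    """);
```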
---------
Co-authored-by: carlosdelest <carlos.delgado@elastic.co>
Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>
In order to improve the experience of cross-cluster search, we are changing
the default value of the remote cluster `skip_unavailable` setting from `false` to `true`.
This setting causes any cross-cluster `_search` (or `_async_search`) to fail entirely when
any remote cluster with `skip_unavailable=false` is either unavailable (the connection to it fails)
or the search on it fails on all shards.
Setting `skip_unavailable=true` allows partial results from other clusters to be
returned. In that case, the search response cluster metadata will show a `skipped`
status, so the user can see that no data came in from that cluster. Kibana also
now leverages this metadata in the cross-cluster search responses to allow users
to see how many clusters returned data and drill down into which clusters did not
(including failure messages).
Currently, the user/admin has to specifically set the value to `true` in the configs, like so:
```
cluster:
    remote:
        remote1:
            seeds: 10.10.10.10:9300
            skip_unavailable: true
```
even though that is probably what search admins want in the vast majority of cases.
Setting `skip_unavailable=false` should be a conscious (and probably rare) choice
by an Elasticsearch admin that a particular cluster's results are so essential to a
search (or visualization in dashboard or Discover panel) that no results at all should
be shown if it cannot return any results.
To simplify the migration away from version based skip checks in YAML specs,
this PR adds a synthetic version feature `gte_vX.Y.Z` for any version at or before 8.14.0.
New test specs for 8.14 or later are expected to use respective new cluster features,
or a test-only feature supplied via ESRestTestCase#createAdditionalFeatureSpecifications
if sufficient.
We are adding a query parameter to the field_caps API in order to filter out
fields with no values. The parameter is called `include_empty_fields` and
defaults to `true`; if set to `false` it filters out from the field_caps
response all fields that have no value in the index.
We keep track of FieldInfos during refresh in order to know which fields have
values in an index. We also added a system property,
`es.field_caps_empty_fields_filter`, to disable this feature if needed.
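As a hedged sketch of how the new parameter is used (the index name is made up), via the low-level REST client:
```java
// Sketch: request field capabilities but omit fields that have no values in the index.
Request fieldCaps = new Request("GET", "/my-index/_field_caps");
fieldCaps.addParameter("fields", "*");
fieldCaps.addParameter("include_empty_fields", "false");
```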
---------
Co-authored-by: Matthias Wilhelm <ankertal@gmail.com>
A couple of children of `BroadcastResponse` are completely redundant,
adding no extra fields or separate serialization.
This removes them and replaces their uses with the broadcast response itself.
* Pushing down node versions as strings
Deferring Version parsing to the actual places where a minimum node version/common cluster version is needed; eventually this will be completely lazy and/or replaced by other checks (e.g. features).
Combine versions, OSes and features in multi-cluster YAML test contexts.
Just moving stuff around, no change in behaviour. Moving these fields showed that we are not treating them correctly in derived classes where multiple clusters are tested (e.g. CCR), but that is for another time.
Co-authored-by: Moritz Mack <moritz@mackmail.net>
Just found that we have a lot of inconsistency and needless verbosity
here in tests. We can just use `assertAcked` in a couple of spots
to save the `.get()`, `.actionGet()` etc. calls, especially with the signature
change I added here.
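For illustration, a hedged sketch of the kind of simplification this enables (the index name is made up; `assertAcked` is the static helper from `ElasticsearchAssertions`):
```java
// Before: execute the request explicitly and assert on the response.
AcknowledgedResponse response = client().admin().indices().prepareDelete("test-index").get();
assertTrue(response.isAcknowledged());

// After: assertAcked executes the builder and checks acknowledgement in one call.
assertAcked(client().admin().indices().prepareDelete("test-index"));
```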
To help the user know what the possible cluster states are, and to
provide an accurate accounting, we added counters summarising
`running`, `partial` and `failed` clusters to the `_clusters` section.
Changes:
- The response now includes the number of `running` clusters.
- We split `partial` out of `successful` (previously both were counted in the
`successful` counter).
- There is now a counter for `failed` clusters.
- `total` is now always equal to `running` + `skipped` + `failed` +
`partial` + `successful`.
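As a hedged illustration (the counts are made up), the `_clusters` section of a response now looks roughly like this, shown here as a Java text block:
```java
// Hypothetical `_clusters` section; note 5 == 1 (running) + 1 (skipped) + 0 (failed)
// + 1 (partial) + 2 (successful).
String clustersSection = """
    "_clusters": {
      "total": 5,
      "successful": 2,
      "skipped": 1,
      "running": 1,
      "partial": 1,
      "failed": 0
    }
    """;
```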
This commit tracks progress for each shard search by cluster alias
using a new SearchProgressListener (CCSSingleCoordinatorSearchProgressListener).
Both sync and async CCS searches use this new progress listener when
minimize_roundtrips=false.
Two of the SearchProgressListener methods had to be extended to allow tracking
per-cluster took values (TransportSearchAction.SearchTimeProvider) and
whether searches timed out (by passing in QuerySearchResult to the onQueryResult
listener method).
This commit brings parity between minimize_roundtrips=true and false to have
the same _cluster/details sections in CCS search responses.
Note that there are still a few differences between minimize_roundtrips=true and false.
1. The per-cluster took value for minimize_roundtrips=true is accurate, but
for 'false' it is only measured at the granularity of each partial reduce,
so the per-cluster took time is overestimated in basically all cases.
2. For minimize_roundtrips=true, a skip_unavailable=false cluster that disconnects
during the search, or whose searches fail on all shards, will cause the entire
search to fail. This is (still) not true for minimize_roundtrips=false. The search
only fails if the skip_unavailable=false cluster cannot be connected to at the
start of the search. (This will likely be changed in a follow-up ticket that implements
fail-fast logic for in-progress searches that should fail due to a skip_unavailable=false
cluster failing.)
3. The shard accounting for minimize_roundtrips=false is always accurate (total shard counts
are known at the start of the search). For minimize_roundtrips=true, the shard accounting
is only accurate per cluster when that cluster's search is successful (or partially successful);
for clusters whose searches fail we do not have shard count info.
Several minor structural and test improvements for cross-cluster search
These changes set the stage for a follow-on ticket to add _cluster/details to
cross-cluster searches with minimize_roundtrips = false. To help keep that
PR from being too large some of the simpler required changes and tests
are added in this PR.
* Support CCS minimize round trips in async search
This commit makes the smallest set of changes to allow async-search based cross-cluster search
to work with the CCS minimize_round_trips feature without changing the internals/architecture of
the search action.
When ccsMinimizeRoundtrips is set to true on SubmitAsyncSearchRequest, the AsyncSearchTask on the
primary CCS coordinator sends a synchronous SearchRequest to each remote cluster, for a remote coordinator
to orchestrate and return the entire result set to the CCS coordinator as a single response.
This is the same functionality provided by synchronous CCS search using minimize_roundtrips.
Since this is an async search, it means that the async search coordinator has no visibility
into search progress on the remote clusters while they are running the search, thus losing one of
the key features of async search. However, this is a good first approach for improving overall search
latency for cross cluster searches that query a large number of shards on remote clusters, since
Kibana does not currently expose incremental progress of an async search to users.
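As a hedged sketch (the cluster alias and index pattern are made up), submitting such a search via the low-level REST client might look like:
```java
// Sketch: submit an async cross-cluster search with minimize-roundtrips enabled.
Request submit = new Request("POST", "/remote1:logs-*,logs-*/_async_search");
submit.addParameter("ccs_minimize_roundtrips", "true");
submit.setJsonEntity("""
    { "query": { "match_all": {} } }
    """);
```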
Relates #73971
In https://github.com/elastic/elasticsearch/pull/91238 we rewrote
BulkProcessor to avoid a deadlock that had been seen in the
IlmHistoryStore. At some point we will remove BulkProcessor altogether.
This PR ports a couple of integration tests that were using BulkProcessor
over to BulkProcessor2.
This cleans up the derivative pipeline agg in a few ways:
1. Moves it to the `aggregations` module ala #90283
2. Drops the ceremonial interface from the result ala #82273
3. Adds comprehensive REST layer tests for it ala #26220
4. Removes some `ESIntegTestCase` tests that duplicated those in the
`AggregatorTestCase`.
Currently the "skip" section in out rest yaml tests take into account the lowest
minor version of nodes in the connected local cluster. For some multi cluster
CCS test we will also need the ability to skip certain tests based on the
connected remote cluster version, e.g. if in bwc tests some functionality isn't
available yet on some bwc versions we test against. This change adds that
ability to yaml rest test in the :qa:multi-cluster-search module.
Relates to #84481
Most classes under elasticsearch-core had been moved to the o.e.core
package. However, a couple io related classes remained in an "internal"
package. This commit moves Streams and IOUtils to the core package, as
they are no more "internal" than the rest of the classes in core.
Currently this qa module runs integration tests that cover cross-cluster search
with both the local and the remote cluster on the same version. In order to also
cover cross-cluster communication across multiple versions, this change adds
additional test tasks that also start a remote cluster in each wire-compatible
previous minor version and run the same tests against that configuration.
Relates #84481
Lucene issues that resulted in elasticsearch changes:
- LUCENE-9820: Separate logic for reading the BKD index from logic to intersecting it.
- LUCENE-10377: Replace 'sortPos' with 'enableSkipping' in SortField.getComparator()
- LUCENE-10301: make the test-framework a proper module by moving all test classes to org.apache.lucene.tests
- LUCENE-10300: rewrite how resources are read in ukrainian morfologik analyzer
- LUCENE-10054: Make HnswGraph hierarchical
The ES code base is quite JSON heavy. It uses a lot of multi-line JSON requests in tests which need to be escaped and concatenated which in turn makes them hard to read. Let's try to leverage Java 15 text blocks for representing them.
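As a small sketch of the difference (the JSON body is made up):
```java
// Before: escaped and concatenated JSON, hard to read at a glance.
String oldStyle = "{\n"
    + "  \"query\": {\n"
    + "    \"match\": { \"message\": \"hello\" }\n"
    + "  }\n"
    + "}";

// After: the same request body as a Java 15 text block.
String newStyle = """
    {
      "query": {
        "match": { "message": "hello" }
      }
    }
    """;
```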
Make the `index.time_series.start_time` and `index.time_series.end_time` settings required in tsdb indices.
This changes many tests in which a time_series index is created, adding the `index.time_series.start_time` and `index.time_series.end_time` settings to them.
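A hedged sketch of creating such an index via the low-level REST client (the index name, field names and time range are made up):
```java
// Sketch: a time_series index must now declare its start and end time settings.
Request createIndex = new Request("PUT", "/tsdb-metrics");
createIndex.setJsonEntity("""
    {
      "settings": {
        "index.mode": "time_series",
        "index.routing_path": [ "host.name" ],
        "index.time_series.start_time": "2021-04-28T00:00:00Z",
        "index.time_series.end_time": "2021-04-29T00:00:00Z"
      },
      "mappings": {
        "properties": {
          "@timestamp": { "type": "date" },
          "host.name": { "type": "keyword", "time_series_dimension": true }
        }
      }
    }
    """);
```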
Changes can-match from a shard-level to a node-level action, which helps avoid an explosion of shard-level can-match
subrequests in clusters with many shards, that can cause stability issues. Also introduces a new search_coordination
thread pool to handle the sending and handling of node-level can-match requests.
* Route documents to the correct shards in tsdb
This causes elasticsearch to land documents from the same time series on
the same shard. It does so by adding a new index setting `routing_path`,
which must be set when an index is in `mode: time_series` and may not be
set outside of that mode. That setting contains a list of patterns to
extract from the `_source` document that are hashed into the routing
value. (A small settings sketch follows the commit list below.)
* Moar skip
* tsdb survives full cluster restart
* Remove auto generated id rejection
We do want to reject these documents but let's save that for a follow-up
change.
* Simplify
* Forbid routing_required
* Small
* Fork fork knife
* Let failures flow
* Fix full cluster
* Always fork
* Retry?
* Remove pressure test arm
* Add missing settings
* WIP
* Remove leftover
* More cleaning
* Fixup more tests
* Remove old skip
* New tests for Retry
* More tests
* Revert unrelated
* One dispatch please
* Stuff moved
* More moving
* Explain why fork
* Back to ActionRunnable
* Update comment
* Utility method
* Move routing_path under feature flag
* Imports
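As a hedged sketch of the routing setting described above (the field names are made up; `Settings` is `org.elasticsearch.common.settings.Settings`):
```java
// Sketch: routing_path lists the dimension fields whose _source values are hashed
// into the routing value, so documents from the same time series land on the same shard.
Settings settings = Settings.builder()
    .put("index.mode", "time_series")
    .putList("index.routing_path", "host.name", "pod.uid")
    .build();
```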
Fix the split package org.elasticsearch.common.xcontent between server and the x-content lib. Move the x-content lib's exported package from org.elasticsearch.common.xcontent to org.elasticsearch.xcontent (following the naming convention of similar libraries). Removing split packages is a prerequisite to modularization.
When we run the build in non-snapshot mode we don't enable feature flags
by default. This would cause the tests for tsdb to fail. So we enable
the tsdb feature flag in every build that needs it. There were a few
builds that needed it that didn't have it. This adds it to those builds.
Closes #78443
Closes #78598
* Do not create unused testCluster
This avoids creating test clusters that are not required during the build.
We use lazy configuration here on testClusters and only instantiate them as they're required.
* Do not fail on run task (debug)
* Create more test cluster lazy
* Make more test cluster lazy
* Avoid creating unused testcluster
* Fix PluginBuildPlugin
* Fix disabling geo db download
* Fix cluster setup in repository-multi-version
* Polishing
* Fix issue with erratic groovy logic
* Fix bwc tests
* Fix more bwcTests
* Fix more bwc tests
* Fix more bwc tests
* Fix more bwc tests
* Fix typo
* Minor polishing
* Fix rolling upgrade tests
* Fix cluster config in sql qa mixedcluster project
* Fix more bwc tests
* Clean up before review
* Document test cluster usage
* Api polishing after review
provide useCluster(Provider) method to TestClusterAware
Ideally we take this a step further and realize those test clusters only on use.
But that is out of scope for this PR.
* Allow gradle provider as value for nonSystemProperties
* Some simplification on test configuration
* Fix typo in rest test config
* Fix more typos
* Fix another typo
* Fix more typos
This adds profiling to the fetch phase so we can tell when fetching is
slower than we'd like and we can tell which portion of the fetch is
slow. The output includes which stored fields were loaded, how long it
took to load stored fields, which fetch sub-phases were run, and how
long those fetch sub-phases took.
Closes #75892
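As a hedged sketch (the index name and query are made up), the new fetch details appear in the standard profile output when a search is run with profiling enabled:
```java
// Sketch: run a profiled search; the per-shard "profile" section of the response now
// includes a "fetch" sub-section with stored-field loading times and fetch sub-phase timings.
Request search = new Request("GET", "/my-index/_search");
search.setJsonEntity("""
    {
      "profile": true,
      "query": { "match_all": {} }
    }
    """);
```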
* Skip bwc
* Don't compare fetch profiles
* Use passed one
* no npe
* Do last rename
* Move method down
* serialization tests
* Fix sneaky serialization
* Test for sneaky bug
* license header
* Document
* Fix test
* newline
* Restore assertion
* unit test merging
* Handle inner hits
* Fixup
* Revert unneeded
* Revert inner hits profiling
* Fix names
* Fixup names
* Move results building
* Drop loaded_nested
* Checkstyle
* Fixup more
* Finish writeable cleanup
Add unit tests for merge
* Remove null checking builder
* Fix wire mistake
How did this pass before?!
* Rename
* Remove funny builder
* Remove name munging
CCSDuelIT sometimes fails due to an unexpected number of shards returned in the refresh response. We create an index with a certain number of shards and no replicas, and sometimes the number of shards reported by the refresh is higher by one than the number of shards we created. This is not a symptom of a problem though, as we are checking that there are no failures; rather it indicates a relocation happening, during which both copies of the same shard have been refreshed.
Closes #61258
Closes #72637
Closes #67754
With the overall theme of trying to configure and add less to the build instead of just disabling it later,
we're replacing standalone-test with standalone-rest-test tasks, which avoids creating the
unused test tasks.
The standalone rest test plugin and the other rest test plugins behave a little differently in how source sets and test tasks are wired.
The standalone rest test plugin assumes that all RestTestTasks use the same sourceSet (test), whereas the YAML and Java REST test plugins use one dedicated sourceSet per test task.
In the long run we will probably migrate standalone-rest-test usages to one of the other plugins and deprecate standalone-rest-test.
Extract usage of internal API from TestClustersPlugin and PluginBuildPlugin and related plugins and build logic
This includes a refactoring of ElasticsearchDistribution to handle types
better, in a way that lets us differentiate between Elasticsearch
distribution types supported in TestClustersPlugin and types only supported
in internal plugins.
It also introduces a set of internal versions of public plugins.
As part of this we also generate the plugin descriptors now.
As a follow up on this we can actually move these publicly used classes into
an extra project (declared as an included build).
We keep LoggedExec and VersionProperties effectively public, and keep a workaround for RestTestBase.
Related to #71593, we move all build logic that is only for the elasticsearch build into
the org.elasticsearch.gradle.internal* packages.
This makes it clearer which build logic is considered usable by external projects.
Ultimately we want to only expose TestCluster and PluginBuildPlugin logic
to third party plugin authors.
This is a very first step in that direction.