Commit graph

9531 commits

Author SHA1 Message Date
Jack Conradson
58fafe224f
Add source fallback support for match_only_text mapped type (#89473)
This change adds access to mapped match_only_text fields via the Painless scripting fields API. The 
values returned from a match_only_text field via the scripting fields API always use source as described 
by (#81246). These are not available via doc values so there are no bwc issues.
2022-08-22 07:32:25 -07:00
Dimitris Athanasiou
b15f6dd6d0
Mute test in StableMasterDisruptionIT (#89501)
Relates #89431
2022-08-22 13:29:26 +03:00
William Brafford
3c2fc5aab8
Mute failing tests (#89465)
Mute for https://github.com/elastic/elasticsearch/issues/89464
2022-08-19 01:12:09 +09:30
Jack Conradson
058ea4594a
Add source fallback support for date and date_nanos mapped types (#89440)
This change adds source fallback support for date and date_nanos by using the existing 
SourceValueFetcherSortedNumericIndexFieldData to emulate doc values.
2022-08-18 07:33:05 -07:00
David Turner
18328b014f
Remove LegacyClusterTaskResultActionListener (#89459)
Also removes the now-unused legacy method
`ClusterStateTaskListener#onClusterStateProcessed`.

Closes #83784
2022-08-18 22:39:09 +09:30
Nikola Grcevski
825c354791
Clean-up file watcher keys. (#89429)
Clean-up all open watcher keys in FileSettingsService.
2022-08-17 13:02:29 -04:00
dh-cloud
f31b1f6d57
fix a typo in Security.java (#89248)
Fix name of path.conf option.
2022-08-17 12:53:48 -04:00
Jack Conradson
1aa43ecf2c
Add text field support in the Painless scripting fields API (#89396)
This change adds access to mapped text fields via the Painless scripting fields API. The values returned 
from a text field via the scripting fields API always use source, as described in #81246. Old-style access 
through doc still depends on field data, so that path is unchanged and there are no bwc issues.
2022-08-17 09:13:13 -07:00
Jack Conradson
f849847aef
Fix duplication bug for source fallback in numeric types (#89352)
Currently, source fallback numeric types do not match doc values numeric types. Source fallback 
numeric types de-duplicate numeric values in multi-valued fields. This change removes the 
de-duplication for source fallback values for numeric types using value fetchers. This also adds test cases 
for all the supported source fallback types to ensure they continue to match their doc values 
counterparts exactly.
2022-08-17 08:10:51 -07:00
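The mismatch described in the commit above can be illustrated with a small sketch. It is not Elasticsearch code: `MultiValueSketch` and both method names are hypothetical, and sorted output is assumed purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch, not Elasticsearch code.
final class MultiValueSketch {
    /** Old source-fallback behaviour as described: repeated values collapse into one. */
    static List<Long> deduplicated(List<Long> values) {
        return new ArrayList<>(new TreeSet<>(values));
    }

    /** Behaviour after the fix as described: every value is kept, duplicates included. */
    static List<Long> keepingDuplicates(List<Long> values) {
        List<Long> copy = new ArrayList<>(values);
        Collections.sort(copy); // sorted order is an assumption made for this sketch
        return copy;
    }

    public static void main(String[] args) {
        List<Long> field = List.of(3L, 1L, 3L);
        System.out.println(deduplicated(field));
        System.out.println(keepingDuplicates(field));
    }
}
```

A script that counts or sums the values of a multi-valued field would see different results from the two methods, which is why the two access paths need to agree.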
Nik Everett
79a89790e3
Synthetic source: load text from stored fields (#87480)
Adds support for loading `text` and `keyword` fields that have
`store: true`. We could likely load *any* stored fields, but I
wanted to blaze the trail using something fairly useful.
2022-08-17 10:18:36 -04:00
Armin Braun
2a08258224
Fix BlobStoreIncrementalityIT.testRecordCorrectSegmentCountsWithBackgroundMerges (#89416)
Create more segments here to make sure the background merge always merges.
With just 3 segments and a max-segments-per-tier of 2 we don't have the guarantee
that a merge will actually run and hence the test will fail when waiting for the background
merge.

closes #89412
2022-08-17 15:13:01 +02:00
Francisco Fernández Castaño
837a8d7a6e
Add support for floating point node.processors setting (#89281)
This commit adds support for a floating point node.processors setting.
This is useful when the nodes run in an environment where the CPU
time assigned to the ES node process is limited (e.g. using cgroups).
With this change, the system can size the thread pools
accordingly; in this case it rounds up the provided setting
to the closest integer.
2022-08-17 15:00:39 +02:00
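The rounding described in the node.processors commit above can be sketched as follows. This is a minimal illustration under assumptions, not the Elasticsearch implementation: `ProcessorsSketch` and `roundedUpProcessors` are hypothetical names, and the validation rule is invented for the example.

```java
// Hypothetical sketch, not Elasticsearch code.
final class ProcessorsSketch {
    /** Rounds a fractional processor count up to a whole number usable for thread pool sizing. */
    static int roundedUpProcessors(double processors) {
        // Validation rule is an assumption for this sketch.
        if (processors <= 0.0 || Double.isNaN(processors)) {
            throw new IllegalArgumentException("processors must be positive, got " + processors);
        }
        return (int) Math.ceil(processors);
    }

    public static void main(String[] args) {
        System.out.println(roundedUpProcessors(1.5));
        System.out.println(roundedUpProcessors(4.0));
    }
}
```

Rounding up rather than down means a node limited to a fraction of a CPU still gets at least one thread per pool.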
Luca Cavanna
695d1a84af
Remove root argument from buildMappers method (#89390)
The callers of buildMappers can provide the right context, instead of passing a boolean
argument that controls what context is used.
2022-08-17 14:46:20 +02:00
Luca Cavanna
c038a91c60
Assign the right path to objects merged when parsing mappings (#89389)
When parsing mappings, we may find a field with the same name specified twice, either
because JSON duplicate keys are allowed, or because a mix of object notation and dot
notation is used when providing mappings. The same can happen when applying dynamic
mappings as part of parsing an incoming document, as well as when merging separate
index templates that may contain the definition for the same field using a
mix of object notation and dot notation.

While we propagate the MapperBuilderContext across merge calls thanks to #86946, we do not
propagate the right context when we call merge on objects as part of parsing/building
mappings. This causes a situation in which the leaf fields that result from the merge
have the wrong path, which misses the first portion e.g. sub.field instead of obj.sub.field.

This commit applies the correct mapper builder context when building the object mapper builder
and two objects with the same name are found.

Relates to #86946

Closes #88573
2022-08-17 14:39:12 +02:00
Artem Prigoda
f1071cab36
Remove side-effects in streams in PrimaryShardAllocator (#89218)
`map` should be a side-effect free function, because it's a non-terminal operation.
If we want to have a side effect, we should use `forEach`, which is terminal.
2022-08-17 14:27:36 +02:00
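The `map` versus `forEach` point in the commit above can be shown with plain Java streams. This sketch is not taken from `PrimaryShardAllocator`; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch, not Elasticsearch code.
final class StreamSideEffectSketch {
    /** Pure mapping: computes new values without touching outside state. */
    static List<Integer> lengths(List<String> names) {
        return names.stream().map(String::length).collect(Collectors.toList());
    }

    /** Side effect expressed with the terminal forEach operation. */
    static List<String> collectSeen(List<String> names) {
        List<String> seen = new ArrayList<>();
        names.stream().forEach(seen::add);
        return seen;
    }

    /** Shows laziness: a map() with no terminal operation never runs its function. */
    static int sideEffectsWithoutTerminalOp(List<String> names) {
        List<String> seen = new ArrayList<>();
        names.stream().map(n -> {
            seen.add(n); // never executed, the pipeline is only built
            return n;
        });
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> names = List.of("a", "bb");
        System.out.println(lengths(names));
        System.out.println(collectSeen(names));
        System.out.println(sideEffectsWithoutTerminalOp(names));
    }
}
```

Because intermediate operations are lazy, a side effect hidden in `map` runs only if and when a terminal operation pulls elements through it, which makes such code fragile.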
Armin Braun
3c30674a3b
Fix ConcurrentSnapshotsIT.testAssertMultipleSnapshotsAndPrimaryFailOver (#89413)
We have to wait for the snapshot to actually have started before we restart
the data node. This is not guaranteed since we start the snapshot via an async
client call. Otherwise the expectation of partial snapshot failure will not hold
because we will only partially fail if the data node has actually started work on
the snapshot when it's restarted.

closes #89039
2022-08-17 11:47:24 +02:00
Alan Woodward
189f279b4f
Don't modify source map when parsing composite runtime field (#89114)
When calling RuntimeField.parseRuntimeFields() for fields defined in the
search request, we need to wrap the Map containing field definitions in another
Map that supports value removal, so that we don't inadvertently remove the
definitions from the root request. CompositeRuntimeField was not doing this
extra wrapping, which meant that requests that went to multiple shards and
that therefore parsed the definitions multiple times would throw an error
complaining that the fields parameter was missing, because the root request
had been modified.
2022-08-17 10:00:16 +01:00
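The wrapping idea from the commit above, sketched in isolation. This is not the `RuntimeField` code: `DefensiveCopySketch`, `parseAll`, and `parseSafely` are hypothetical names standing in for a parser that consumes map entries by removing them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Elasticsearch code.
final class DefensiveCopySketch {
    /** Consumes entries by removing them, as a removal-based parser might. */
    static int parseAll(Map<String, Object> definitions) {
        int parsed = 0;
        for (String key : new ArrayList<>(definitions.keySet())) {
            definitions.remove(key);
            parsed++;
        }
        return parsed;
    }

    /** Parses a mutable copy so the caller's map survives repeated parsing. */
    static int parseSafely(Map<String, Object> requestDefinitions) {
        return parseAll(new HashMap<>(requestDefinitions));
    }

    public static void main(String[] args) {
        Map<String, Object> request = new HashMap<>();
        request.put("field_a", "long");
        request.put("field_b", "keyword");
        // Two passes stand in for two shards parsing the same request.
        System.out.println(parseSafely(request));
        System.out.println(parseSafely(request));
        System.out.println(request.size());
    }
}
```

Note that `new HashMap<>(...)` is a shallow copy: it protects only the top level, so any nested map that is also consumed by removal would need the same wrapping.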
Ievgen Degtiarenko
2c37c596d0
Allocation commands related refactoring (#89400)
This change includes:
* moving resetFailedAllocationCounter to a common place in RoutingNodes so that it is accessible from multiple allocation service implementations
* splitting ClusterRerouteTests#testClusterStateUpdateTask into 2 distinct test scenarios to avoid reusing the task
2022-08-17 10:17:29 +02:00
Nikola Grcevski
dc672b0e65
Handle snapshot restore in file settings (#89321)
2022-08-16 17:18:18 -04:00
Nik Everett
82ad45f411
TSDB: Build _id without reparsing (#88789)
This replaces the code that builds `_id` in tsid indices that used to
re-parse the entire json object with one that reuses the parsed values.
It speeds up writes by about 4%. Here's the rally output:

```
|    Min Throughput |  8164.67 |  8547.24 | docs/s | +4.69% |
|   Mean Throughput |  8891.11 |  9256.75 | docs/s | +4.11% |
| Median Throughput |  8774.52 |  9134.15 | docs/s | +4.10% |
|    Max Throughput | 10246.7  | 10482.3  | docs/s | +2.30% |
```
2022-08-16 16:17:45 -04:00
Nikola Grcevski
5af8ec52fe
Support camel case dates on 7.x indices (#88914)
This adds back compatibility support for camel case dates
for 7.x indices used in 8.x.
2022-08-16 15:57:59 -04:00
Nik Everett
b327b17653
Fix shard splitting for nested (#89351)
I broke shard splitting when `_routing` is required and you use `nested`
docs. The mapping would look like this:
```
"mappings": {
  "_routing": {
    "required": true
  },
  "properties": {
    "n": { "type": "nested" }
  }
}
```

If you attempt to split an index with a mapping like this it'll blow up
with an exception like this:
```
Caused by: [idx] org.elasticsearch.action.RoutingMissingException: routing is required for [idx]/[0]
	at org.elasticsearch.cluster.routing.IndexRouting$IdAndRoutingOnly.checkRoutingRequired(IndexRouting.java:181)
	at org.elasticsearch.cluster.routing.IndexRouting$IdAndRoutingOnly.getShard(IndexRouting.java:175)
```

This fixes the problem by entirely avoiding the branch of code. That
branch was trying to find any top level documents that don't have a
`_routing`. But we *know* that there aren't any top level documents
without a routing in this case - the routing is "required". ES wouldn't
have let you index any top level documents without the routing.

This also adds a small pile of REST layer tests for shard splitting that
hit various branches in this area. For extra paranoia.

Closes #88109
2022-08-16 11:55:46 -04:00
Tim Brooks
ac9f12fd63
Add logging in GlobalCheckpointSyncIT (#89185)
The test GlobalCheckpointSyncIT#testBackgroundGlobalCheckpointSync
failed once recently due to an engine already closed exception. It has
not occurred again and the reasoning is unclear.

This commit adds a log line to indicate exactly when it happens, which
shard it is, and what the current state of the index shard is.

Closes #88428.
2022-08-16 09:05:01 -06:00
Armin Braun
80796fba28
Small cleanups to Allocation Performance (#89378)
Two fixes:
1. Looking up `Custom` values over and over for every shard incurs a measurable cost.
This removes that cost for desired nodes and node shutdown metadata.
2. Node shutdown metadata logic wasn't inlining nicely because of the wrapped map.
No need to be as complicated as we were in many spots, use a simple immutable map
for all operations and remove a bunch of branching.
2022-08-16 14:16:46 +02:00
Keith Massey
8360bf9a47
Fixing a version check for master stability functionality (#89322)
2022-08-15 09:43:35 -05:00
Mark Tozzi
60016c8cf0
convert raw url to hyperlink in javadoc (#89319)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2022-08-15 10:06:49 -04:00
Mayya Sharipova
10b804730d
Include runtime fields in total fields count (#89251)
We have a check that enforces that the total number of fields stays below a
certain (configurable) threshold. Previously, runtime fields did not contribute
to the count. This patch makes all runtime fields contribute to the
count:
- runtime fields that were explicitly defined in the mapping by a user
- runtime fields that were dynamically created by dynamic mappings

Closes #88265
2022-08-15 09:43:12 -04:00
David Turner
9b24b418a1
Force rejection of unsupported bulk actions in v9 (#89339)
Adds a mention of `Version.V_7_17_0` so that we don't forget to remove
this deprecated behaviour in the next major version.

Relates #78876
2022-08-15 13:14:36 +01:00
David Turner
8d37d48426
Check circuit breaker before sending join request (#89318)
Adds a simple preflight check to `JoinHelper#sendJoinRequest` to avoid
sending a join request if it looks like the `inflight_requests` circuit
breaker is going to trip on the join validation message.

Closes #85003
2022-08-15 10:53:41 +01:00
David Turner
745947e854
Capture deprecation warnings in batched master tasks (#85525)
It's possible for a cluster state update task to emit deprecation
warnings, but if the task is executed in a batch then these warnings
will be exposed to the listener for every item in the batch. With this
commit we introduce a mechanism for tasks to capture just the warnings
relevant to them, along with assertions that warnings are not
inadvertently leaked back to the master service.

Closes #85506
2022-08-15 19:20:15 +09:30
David Turner
51f89f43e5
Handle rejection in LeaderChecker (#89326)
Closes #89325
2022-08-15 10:09:14 +01:00
David Turner
4779893b25
Introduce BatchExecutionContext (#89323)
Replaces the two arguments to `ClusterStateTaskExecutor#execute` with a
parameter object called `BatchExecutionContext` so that #85525 can add a
new and rarely-used parameter without generating tons of noise.
2022-08-13 19:03:15 +09:30
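The parameter-object refactoring described in the commit above follows a general pattern, sketched here with hypothetical types. `BatchContext`, `Executor`, and `run` are illustrative only and do not mirror the real `ClusterStateTaskExecutor` or `BatchExecutionContext` signatures.

```java
import java.util.List;

// Hypothetical sketch, not Elasticsearch code.
final class ParameterObjectSketch {
    /** Bundles what would otherwise be separate arguments to execute(). */
    static final class BatchContext {
        private final String initialState;
        private final List<String> tasks;

        BatchContext(String initialState, List<String> tasks) {
            this.initialState = initialState;
            this.tasks = List.copyOf(tasks);
        }

        String initialState() {
            return initialState;
        }

        List<String> tasks() {
            return tasks;
        }
    }

    /** Implementations take one context argument instead of a growing parameter list. */
    interface Executor {
        String execute(BatchContext context);
    }

    static String run(Executor executor, String initialState, List<String> tasks) {
        return executor.execute(new BatchContext(initialState, tasks));
    }

    public static void main(String[] args) {
        Executor counting = ctx -> ctx.initialState() + ":" + ctx.tasks().size();
        System.out.println(run(counting, "s0", List.of("a", "b")));
    }
}
```

Adding a rarely used accessor to the context class later leaves every existing `execute` implementation untouched, which is the noise reduction the commit message refers to.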
David Turner
dcc87ddb41
AwaitsFix for #89325
2022-08-13 08:33:13 +01:00
Keith Massey
5a26455170
Adding a check to the master stability health API when there is no master and the current node is not master eligible (#89219)
This PR builds on #86524, #87482, and #87306 by supporting the case where there has been no
master node in the last 30 seconds, no node has been elected master, and the current node is not
master eligible.
2022-08-12 08:28:31 -05:00
David Turner
f9055b5acf
Miscellaneous cleanups in TransportService (#89299)
2022-08-12 18:36:23 +09:30
David Turner
ed940b6ed5
Clarify that TransportService#sendRequest never throws (#89298)
It's not obvious from reading the code that
`TransportService#sendRequest` and friends always catch exceptions and
pass them to the response handler, which means some callers are wrapping
calls to `sendRequest` in their own unnecessary try/catch blocks. This
commit makes it clear that all exceptions are handled and removes the
unnecessary exception handling in callers.

Closes #89274
2022-08-12 09:52:58 +01:00
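The contract described in the commit above, where failures are routed to the response handler instead of being thrown at the caller, can be sketched generically. The names below are hypothetical and the real `TransportService#sendRequest` signature is different.

```java
import java.util.function.Supplier;

// Hypothetical sketch, not Elasticsearch code.
final class NeverThrowsSketch {
    interface ResponseHandler<T> {
        void onResponse(T response);

        void onFailure(Exception e);
    }

    /** Never throws: any exception raised by the action is passed to the handler. */
    static <T> void sendRequest(Supplier<T> action, ResponseHandler<T> handler) {
        final T response;
        try {
            response = action.get();
        } catch (Exception e) {
            handler.onFailure(e);
            return;
        }
        handler.onResponse(response);
    }

    public static void main(String[] args) {
        ResponseHandler<String> printing = new ResponseHandler<String>() {
            @Override
            public void onResponse(String response) {
                System.out.println("response: " + response);
            }

            @Override
            public void onFailure(Exception e) {
                System.out.println("failure: " + e.getMessage());
            }
        };
        // No try/catch is needed around either call.
        sendRequest(() -> "pong", printing);
        sendRequest(() -> {
            throw new IllegalStateException("node not connected");
        }, printing);
    }
}
```

With this contract a caller that wraps the call in its own try/catch adds a second, unreachable failure path, which is the redundancy the commit removes.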
Yang Wang
96febb7d1a
Ensure secureString remain open when reloading secure settings (#88922)
The reloading secure settings action sends one node level request with
password (secureString) to each node. These node level requests share
the same secureString instance. This is not a problem when the requests
are sent across the wire because the secureString will end up as
independent instances after de/serialization. But when the request is
handled by the local node, it skips the de/serialization process. This
means when the secureString gets closed, it is closed for all the node
level requests. If a node level request has not been sent across the wire
when the secureString is closed under it, the serialization process will
result in an error.

This PR fixes the bug by letting each Node level request create a clone
of the secureString and having the Nodes level request track all Node
level requests. All copies of secureString (on the coordinating node) will
be closed at Nodes request level which is safe because it is after
completion of all Node level requests.

Resolves: #88887
2022-08-12 15:31:35 +09:30
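The shared-instance problem from the commit above can be reproduced with a toy closeable secret. `SecretSketch` is a hypothetical stand-in written for this example; it is not the Elasticsearch `SecureString` class and makes no claim about that class's API.

```java
import java.util.Arrays;

// Hypothetical sketch, not Elasticsearch code.
final class SecretSketch implements AutoCloseable {
    private char[] chars;

    SecretSketch(char[] chars) {
        this.chars = chars.clone(); // own the characters so callers cannot mutate them
    }

    /** Returns an independent copy that stays open when this instance is closed. */
    SecretSketch copy() {
        ensureOpen();
        return new SecretSketch(chars);
    }

    String reveal() {
        ensureOpen();
        return new String(chars);
    }

    boolean isOpen() {
        return chars != null;
    }

    @Override
    public void close() {
        if (chars != null) {
            Arrays.fill(chars, '\0'); // wipe before dropping the reference
            chars = null;
        }
    }

    private void ensureOpen() {
        if (chars == null) {
            throw new IllegalStateException("secret is already closed");
        }
    }

    public static void main(String[] args) {
        SecretSketch original = new SecretSketch("pw".toCharArray());
        SecretSketch perNodeCopy = original.copy();
        original.close();
        // The copy is unaffected by closing the original.
        System.out.println(perNodeCopy.isOpen());
        perNodeCopy.close();
    }
}
```

If every node level request held the same instance instead of a copy, the first `close()` would invalidate the secret for all of them.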
Tanguy Leroux
89ff87d20c
Fix CloseIndexIT.testConcurrentClose (#89173)
Closes #88936
2022-08-11 17:07:00 +02:00
Keith Massey
e4a19d4c03
Fixing remote master stability request when there has never been an elected master (#89214)
This fixes an edge case in the master stability polling code from
#89014. If there has not been an elected master node for the entire life
of a non-master-eligible node, then `clusterChanged()` will have never
been called on that node, so
`beginPollingRemoteMasterStabilityDiagnostic()` will have never been
called. And even though the node might know of some master-eligible
nodes, it will never have requested diagnostic information from them.
This PR adds a call to `beginPollingRemoteMasterStabilityDiagnostic` in
`CoordinationDiagnosticsService`'s constructor to cover this edge case.
In almost all cases, `clusterChanged()` will be called within 10 seconds
so the polling will never occur. However if there is no master node then
there will be no cluster changed events, and `clusterChanged()` will not
be called, and the results of the polling will likely be useful. This PR
has several possibly controversial pieces of code. I'm listing them here
with some discussion:

1. Because there is now a call to `beginPollingRemoteMasterStabilityDiagnostic()` in the ~~constructor~~ object's initialization code, `beginPollingRemoteMasterStabilityDiagnostic()` is no longer solely called from the cluster change thread. However, this call happens before the object is registered as a cluster service listener, so there is no new thread safety concern.
2. Because there is now a call to `beginPollingRemoteMasterStabilityDiagnostic()` in the ~~constructor~~ object's initialization code, we have to explicitly switch to the system context so that the various transport requests work in secure mode.
3. ~~When we're in the constructor, we don't actually know yet whether we're a master eligible node or not, so we kick off `beginPollingRemoteMasterStabilityDiagnostic()` for all node types, including master-eligible nodes. This will be fairly harmless for master eligible nodes though. In the worst case, they'll retrieve some information that they'll never use. This explains why `clusterChanged()` now cancels polling even if we are on a master eligible node.~~
4. ~~It is now possible that we use `clusterService.state()` before it is ready when we're trying to get the list of master-eligible peers. In production mode this method returns null, so we can check that before using it. If assertions are enabled in the JVM, just calling that method throws an `AssertionError`. I'm currently catching that with the assumption that it is harmless because there does not seem to be a way around it (without even further complicating code).~~
5. ~~It is now possible that we call `transportService.sendRequest()` before the transport service is ready. This happens if the server is initializing unusually slowly (i.e. it takes more than 10 seconds to complete the `Node` constructor) and if assertions are enabled. I don't see a way around this without further complicating the code, so I'm catching `AssertionError` and moving on, with the assumption that it will work 10 seconds later when it runs again. I'm also catching and storing `Exception`, which I think I should have been doing before anyway.~~

Note: Points 3, 4, and 5 are no longer relevant because I moved the call
to `beginPollingRemoteMasterStabilityDiagnostic()` out of the
constructor, and am now calling it after the transport service and
cluster state have been initialized.
2022-08-11 23:48:50 +09:30
Ignacio Vera
993e467615
Sort ranges in geo_distance aggregation (#89154)
This commit sorts the ranges provided in a geo_distance aggregation; otherwise it fails to provide the right results if 
the ranges are unordered.
2022-08-11 13:53:32 +02:00
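A minimal sketch of normalising range order up front, as the commit above describes. The commit message does not say why unordered ranges gave wrong results, so this only illustrates the sorting step; `Range`, its fields, and the ordering by lower then upper bound are assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not Elasticsearch code.
final class RangeSortSketch {
    static final class Range {
        final double from;
        final double to;

        Range(double from, double to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Returns a sorted copy, ordered by lower bound and then by upper bound. */
    static List<Range> sorted(List<Range> ranges) {
        List<Range> copy = new ArrayList<>(ranges);
        copy.sort(Comparator.comparingDouble((Range r) -> r.from).thenComparingDouble(r -> r.to));
        return copy;
    }

    public static void main(String[] args) {
        List<Range> userOrder = List.of(new Range(100, 300), new Range(0, 100));
        for (Range r : sorted(userOrder)) {
            System.out.println(r.from + " to " + r.to);
        }
    }
}
```

Sorting a copy leaves the user-supplied order intact in case it is still needed elsewhere, for example when reporting the request back.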
Mary Gouseti
892ad014ff
Refactor registering listeners out of constructors (#89265)
Classes affected:
- Fix LocalHealthMonitor
- Refactor HealthNodeTaskExecutor
- Refactor HealthMetadataService
2022-08-11 13:40:55 +02:00
David Turner
0bf31b77fb
Fix message for stalled shutdown (#89254)
Today if a node shutdown is stalled due to unmoveable shards then we say
to use the allocation explain API to find details. In fact, since #78727
we include the allocation explanation in the response already so we
should tell users just to look at that instead. This commit adjusts the
message to address this.
2022-08-11 07:48:03 +01:00
Mary Gouseti
399a8ac283
Add TransportHealthNodeAction (#89127)
2022-08-10 17:04:22 +02:00
David Turner
ceffaf9aad
Improve rejection of ambiguous voting config name (#89239)
Today if there are multiple nodes with the same name then
`POST /_cluster/voting_config_exclusions?node_names=ambiguous-name` will
return a `500 Internal Server Error` and a mysterious message. This
commit changes the behaviour to throw an `IllegalArgumentException`
(i.e. `400 Bad Request`) along with a more useful message describing the
problem.
2022-08-10 12:39:24 +01:00
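The behaviour change in the commit above, rejecting an ambiguous name with an `IllegalArgumentException` rather than failing obscurely, can be sketched as follows. The resolver, its map-based input, and the message text are hypothetical and not taken from the Elasticsearch source.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch, not Elasticsearch code.
final class NodeNameSketch {
    /** Resolves a node name to a node id, or null if no node has that name. */
    static String resolveNodeId(String nodeName, Map<String, String> nodeNamesById) {
        List<String> matches = nodeNamesById.entrySet()
            .stream()
            .filter(e -> e.getValue().equals(nodeName))
            .map(Map.Entry::getKey)
            .sorted()
            .collect(Collectors.toList());
        if (matches.size() > 1) {
            // A caller error, so it should surface as a bad request rather than a server error.
            throw new IllegalArgumentException(
                "node name [" + nodeName + "] is ambiguous, matching nodes " + matches);
        }
        return matches.isEmpty() ? null : matches.get(0);
    }

    public static void main(String[] args) {
        Map<String, String> nodes = new HashMap<>();
        nodes.put("id-1", "unique-name");
        nodes.put("id-2", "ambiguous-name");
        nodes.put("id-3", "ambiguous-name");
        System.out.println(resolveNodeId("unique-name", nodes));
        try {
            resolveNodeId("ambiguous-name", nodes);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```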
Ievgen Degtiarenko
72e24d38bb
Log when repository is marked as corrupted (#89132)
2022-08-10 08:12:00 +02:00
Nikola Grcevski
895baf011c
Delete invalid settings for system indices (#88903)
2022-08-09 17:11:55 -04:00
Stuart Tettemer
264f09f3d5
Script: Common base class for write scripts (#89141)
Adds `WriteScript` as the common base class for the write scripts: `IngestScript`, `UpdateScript`, `UpdateByQueryScript` and `ReindexScript`.

This pulls the common `getCtx()` and `metadata()` methods into the base class and prepares for the implementation of the ingest fields api (https://github.com/elastic/elasticsearch/issues/79155).

As part of the refactor, `IngestScript` now takes a `CtxMap` directly rather than taking "sourceAndMetadata" (`CtxMap`) and `Metadata` (from `CtxMap`).  There is a new `getCtxMap()` getter to get the typed `CtxMap`.  `getSourceAndMetadata` could have been refactored to do this, but most of the callers of that don't need to know about `CtxMap` and are happy with a `Map<String, Object>`.
2022-08-09 12:31:18 -05:00
David Turner
de281b5072
Complete listener in ReservedStateErrorTaskExecutor (#89191)
2022-08-10 01:10:38 +09:30
Keith Massey
e63bcb550e
Fixing internal action names (#89182)
Fixing the names of the internal actions used by CoordinationDiagnosticsService to begin with "internal:" so
that they can be used in the system context with security enabled.
2022-08-09 08:47:29 -05:00
Armin Braun
c6c05bb625
Deduplicate ShardRouting instances when building ClusterInfo (#89190)
The equality checks on these in `DiskThresholdDecider` become very expensive
during reroute in a large cluster. Deduplicating these when building the `ClusterInfo`
saves more than 2% CPU time during many-shards benchmark bootstrapping because
the lookup of the shard data path by shard-routing mostly hits instance equality.
Also, this saves a little memory.

This PR also moves the callback for building `ClusterInfo` from the stats response to
the management pool as it is now more expensive (though the overall CPU use from it is trivial
relative to the cost savings during reroute) and was questionable to run on
a transport thread in a large cluster to begin with.

Co-authored-by: David Turner <david.turner@elastic.co>
2022-08-09 11:08:13 +02:00
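The deduplication idea in the commit above can be sketched as a small interning helper. `DeduplicatorSketch` is hypothetical and unrelated to how `ClusterInfo` is actually built; it only shows how equal objects can be collapsed to one canonical instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Elasticsearch code.
final class DeduplicatorSketch<T> {
    private final Map<T, T> canonical = new HashMap<>();

    /** Returns the first instance seen that is equal to the given value. */
    T deduplicate(T value) {
        T existing = canonical.putIfAbsent(value, value);
        return existing == null ? value : existing;
    }

    public static void main(String[] args) {
        DeduplicatorSketch<String> dedup = new DeduplicatorSketch<>();
        String first = new String("shard-0");
        String second = new String("shard-0"); // equal content, different instance
        System.out.println(first == second);
        System.out.println(dedup.deduplicate(first) == dedup.deduplicate(second));
    }
}
```

Once equal values share one instance, an `equals` implementation that starts with the common `this == other` check can return early, which is the "instance equality" fast path the commit message mentions.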