Today an `HttpResponse` is always released via a `ChannelPromise`, which
means the release happens on a network thread. However, it's possible we
try to send an `HttpResponse` after the node has got far enough through
shutdown that it has no running network threads left, in which case
the response just leaks.
This is no big deal in production, where it becomes irrelevant when the
process exits, but in tests we start and stop many nodes within the same
process and so must not leak anything.
At this point in shutdown, all HTTP channels are now closed, so it's
sufficient to check whether the channel is open first, and to fail the
listener on the calling thread if not. That's what this commit does.
Closes #104651
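A minimal sketch of the pattern described above, using hypothetical `Channel`/`ActionListener` stand-ins rather than the actual Netty/Elasticsearch types: check whether the channel is still open and, if not, fail the listener on the calling thread instead of handing work to a network thread that may no longer exist.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch, not the actual Netty/Elasticsearch types.
class ShutdownSafeSend {
    interface ActionListener { void onResponse(); void onFailure(Exception e); }

    static final class Channel {
        private volatile boolean open = true;
        boolean isOpen() { return open; }
        void close() { open = false; }
    }

    /**
     * If the channel is already closed (e.g. late in node shutdown, when no
     * network threads remain), fail the listener immediately on the calling
     * thread instead of enqueueing work that would never run.
     */
    static void send(Channel channel, ActionListener listener) {
        if (channel.isOpen() == false) {
            listener.onFailure(new IllegalStateException("channel closed"));
            return;
        }
        // Normal path: hand off to the network thread via a ChannelPromise.
        // Simplified here to an immediate completion.
        listener.onResponse();
    }

    // Small demo helper: returns which listener branch was taken.
    static String sendAndRecord(boolean channelOpen) {
        Channel channel = new Channel();
        if (channelOpen == false) channel.close();
        AtomicReference<String> result = new AtomicReference<>();
        send(channel, new ActionListener() {
            public void onResponse() { result.set("sent"); }
            public void onFailure(Exception e) { result.set("failed"); }
        });
        return result.get();
    }
}
```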
There is no need to read a single `BytesReference` (backed by a single large `byte[]`)
here when we only care about the individual values in the list.
Without breaking the behavior of serializing only once when sending to multiple targets, this change:
* lazily serializes as needed and keeps the original terms, so we don't needlessly go through serialization in e.g. a single-node situation
or for requests that are handled directly on the coordinator (concurrency should be fine here: in practice we serialize on the same thread, and
should we ever not be on the same thread at all times, the worst case is serializing multiple times).
* stops allocating a potentially huge `byte[]` when receiving these values over the wire
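The lazy, serialize-at-most-once idea can be sketched as follows. This is illustrative only (the names and the comma-join "serialization" are stand-ins, not the actual Elasticsearch wire format); the benign-race comment mirrors the reasoning above.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Illustrative sketch, not the actual terms-query code.
class LazyTerms {
    private final List<String> terms;   // original values kept for local use
    private byte[] serialized;          // filled on first send, reused afterwards

    LazyTerms(List<String> terms) { this.terms = terms; }

    // Local/coordinator path: use the values directly, no serialization.
    List<String> terms() { return terms; }

    // Worst case a rare race serializes twice; both results are identical,
    // so correctness is unaffected.
    byte[] serialized() {
        if (serialized == null) {
            serialized = String.join(",", terms).getBytes(StandardCharsets.UTF_8);
        }
        return serialized;
    }

    // Demo helper: the same cached array is returned on repeated calls.
    static boolean serializesOnce() {
        LazyTerms t = new LazyTerms(List.of("foo", "bar"));
        return t.serialized() == t.serialized() && t.terms().size() == 2;
    }
}
```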
We are adding a query parameter to the field_caps API in order to filter out
fields with no values. The parameter is called `include_empty_fields` and
defaults to true; if set to false, it filters out of the field_caps
response all fields that have no values in the index.
We keep track of FieldInfos during refresh in order to know which fields have
values in an index. We also added a system property
`es.field_caps_empty_fields_filter` in order to disable this feature if needed.
---------
Co-authored-by: Matthias Wilhelm <ankertal@gmail.com>
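A rough sketch of the filtering behaviour with hypothetical names (the real implementation works on FieldInfos tracked during refresh, not a plain `Set`): when `include_empty_fields=false`, only fields known to have values survive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the include_empty_fields filtering described above.
class FieldCapsFilter {
    static List<String> filter(List<String> allFields,
                               Set<String> fieldsWithValues,
                               boolean includeEmptyFields) {
        if (includeEmptyFields) {
            return allFields;               // default behaviour: keep everything
        }
        List<String> out = new ArrayList<>();
        for (String field : allFields) {
            if (fieldsWithValues.contains(field)) {
                out.add(field);             // drop fields with no values
            }
        }
        return out;
    }
}
```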
To improve cross-cluster search user experience, Kibana needs an endpoint that is accessible
by arbitrary Kibana dashboard search users and provides:
1. a listing of clusters in scope for a CCS query (based on the index expression and whether
there are any indices on each cluster that the Kibana user has access to query).
2. whether that cluster is currently connected to the querying cluster (will it come back as
skipped or failed in a CCS search)
3. the skip_unavailable setting for those clusters (so you can know whether it will
return skipped or failed in a CCS search)
4. the ES version of the cluster
Since no single Elasticsearch endpoint provides all of these features, this PR creates a new endpoint `_resolve/cluster` that works alongside the existing `_resolve/index` endpoint
(and leverages some of its features).
Example usage against a cluster with 2 remote clusters configured:
GET /_resolve/cluster/*,remote*:bl*
Response:
{
  "(local)": {
    "connected": true,
    "skip_unavailable": false,
    "matching_indices": true,
    "version": {
      "number": "8.12.0-SNAPSHOT",
      "build_flavor": "default",
      "minimum_wire_compatibility_version": "7.17.0",
      "minimum_index_compatibility_version": "7.0.0"
    }
  },
  "remote2": {
    "connected": true,
    "skip_unavailable": true,
    "matching_indices": true,
    "version": {
      "number": "8.12.0-SNAPSHOT",
      "build_flavor": "default",
      "minimum_wire_compatibility_version": "7.17.0",
      "minimum_index_compatibility_version": "7.0.0"
    }
  },
  "remote1": {
    "connected": true,
    "skip_unavailable": false,
    "matching_indices": false,
    "version": {
      "number": "8.12.0-SNAPSHOT",
      "build_flavor": "default",
      "minimum_wire_compatibility_version": "7.17.0",
      "minimum_index_compatibility_version": "7.0.0"
    }
  }
}
Almost all errors show up as "error" entries in the response.
Only the local SecurityException returns a 403, since that happens before the ResolveCluster
transport code kicks in.
We see occasional test failures in CI due to the analysis not completing
within this 30s timeout. It doesn't look like anything is actually
wrong, the test machine is just busy and these tests can be quite
IO-intensive. This commit gives them more time.
Closes #99422
Detects and efficiently encodes cyclic ordinals, as proposed by
@jpountz. This is beneficial for encoding dimensions that are
multivalued, such as host.ip.
A follow-up on #99747
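A simplified sketch of the cycle-detection half of this idea (not the actual Lucene/Elasticsearch encoder): find the shortest period of the ordinal sequence, so a cyclic sequence can be stored as a single cycle plus a total length rather than the full array.

```java
// Illustrative sketch of detecting a cyclic pattern in an ordinal sequence.
class CyclicOrdinals {
    /**
     * Returns the shortest cycle length p such that ords[i] == ords[i - p]
     * for all i >= p. A fully non-cyclic sequence returns its own length.
     */
    static int cycleLength(int[] ords) {
        outer:
        for (int p = 1; p < ords.length; p++) {
            for (int i = p; i < ords.length; i++) {
                if (ords[i] != ords[i - p]) {
                    continue outer;     // period p broken, try the next one
                }
            }
            return p;                   // every element repeats with period p
        }
        return ords.length;
    }
}
```

For example, the multivalued-dimension pattern `1,2,3,1,2,3,1` has cycle length 3, so an encoder could store just `1,2,3` and the total count.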
If the client closes the channel while we're in the middle of a chunked
write then today we don't complete the corresponding listener. This
commit fixes the problem.
This reduces the risk of document loss if too many fields are added.
As these component templates are imported by Fleet, this also affects
integrations.
Today the various `ESTestCase#safeAwait` variants do not include a
descriptive message with their failures, which means you have to dig
through the stack trace to work out the reason for a test failure. This
commit adds the missing messages to make it a little easier on the
reader.
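The idea can be sketched like this (a hypothetical helper, not the actual `ESTestCase` code): the failure message names what we were waiting for, so the reader doesn't have to reconstruct it from the stack trace.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a safeAwait variant with a descriptive message.
class SafeAwait {
    static String describeFailure(String description) {
        return "safeAwait: [" + description + "] did not complete in time";
    }

    static void safeAwait(CountDownLatch latch, String description) {
        try {
            if (latch.await(10, TimeUnit.SECONDS) == false) {
                // The description tells the reader *what* timed out.
                throw new AssertionError(describeFailure(description));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new AssertionError(
                "safeAwait: interrupted while waiting for [" + description + "]", e);
        }
    }
}
```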
We sometimes see a need to run a full GC twice within the current 5s interval.
While we should work to improve our allocation pattern for that, it also
seems too conservative not to allow more frequent full GCs, as long as we also
get some real work done. Hence we lower the interval to 2s here, which fixes
the currently problematic cases.
Elasticsearch requires access to some native functions. Historically
this has been achieved with the JNA library. However, JNA is a
complicated, magical library, and has caused various problems booting
Elasticsearch over the years. The new Java Foreign Function and Memory
API allows native functions to be called directly from Java. It also
has the advantage of tight integration with hotspot which can improve
performance of these functions (though performance of Elasticsearch's
native calls has never been much of an issue since they are mostly at
boot time).
This commit adds a new native lib that is internal to Elasticsearch. It
is built to use the Foreign Function API starting with Java 21, and
continues using JNA on Java versions below that.
Only one function, checking whether Elasticsearch is running as root, is
migrated. Future changes will migrate other native functions.
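The version dispatch can be sketched as below; `NativeAccess` and the implementation names are illustrative stand-ins, not the actual Elasticsearch class names.

```java
// Hypothetical sketch of choosing a native-access implementation by
// Java feature version: FFM on 21+, JNA below that.
class NativeAccess {
    interface Impl { String name(); }

    static Impl choose(int javaFeatureVersion) {
        if (javaFeatureVersion >= 21) {
            return () -> "ffm";   // java.lang.foreign based implementation
        }
        return () -> "jna";       // legacy JNA based implementation
    }

    static Impl current() {
        // Runtime.version().feature() is e.g. 21 on Java 21.
        return choose(Runtime.version().feature());
    }
}
```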
Preallocate opens a FileInputStream in order to get a native file
descriptor to pass to native functions. However, getting at the file
descriptor requires breaking modular access. This commit adds native
posix functions for opening/closing and retrieving stats on a file in
order to avoid requiring additional permissions.
Qualified exports in the boot layer only work when they are to other boot
modules. Yet Elasticsearch also has dynamically loaded modules, as in plugins.
For this purpose we have ModuleQualifiedExportsService. This commit
moves loading of ModuleQualifiedExportsService instances in the boot layer
into core so that it can be reused by ProviderLocator when a qualified
export applies to an embedded module.
We have some usages of SearchException that provide neither a cause exception nor a status code. That means the status code of such requests defaults to 500, which is in many cases not a good choice: normally an internal server error has a cause associated with the wrapper exception.
This scenario is not very common, and it looks like a leftover of validation that used to happen on the shards, which can be moved to the coordinating node. This commit moves some of the exceptions thrown in SearchService#parseSource to SearchRequest#validate. This way we fail before serializing the shard-level request to all the shards, which is much better.
Note that for backwards compatibility reasons we need to keep throwing the same exception from the data node, even though it is now intuitively replaced by the same validation on the coordinating node. In a mixed-cluster scenario, an older node acting as coordinator would not perform the validation, and could serialize shard-level requests that still need to be checked on the data nodes to prevent unexpected situations.
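The shape of the change can be sketched as a coordinator-side validate step that runs before any shard-level request is serialized. The specific checks below are illustrative, not the exact ones moved by this commit.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: validate on the coordinating node before fanning
// out to the shards, so bad requests fail once, early, with a 4xx.
class CoordinatorValidation {
    static List<String> validate(int from, int size, boolean scroll) {
        List<String> errors = new ArrayList<>();
        if (from + size > 10_000) {
            errors.add("from + size must be <= 10000");
        }
        if (scroll && from > 0) {
            errors.add("`from` is not supported with scroll");
        }
        return errors; // empty list: safe to serialize shard-level requests
    }
}
```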
This avoids keeping the downsamplingInterval field around. Additionally, the
downsample interval is known when downsampling is invoked and
doesn't change.
This PR extends the repository integrity health indicator to cover also unknown and invalid repositories. Because these errors are local to a node, we extend the `LocalHealthMonitor` to monitor the repositories and report the changes in their health regarding the unknown or invalid status.
To simplify this extension in the future, we introduce the `HealthTracker` abstract class that can be used to create new local health checks.
Furthermore, we change the severity of the health status when the repository integrity indicator reports unhealthy from `RED` to `YELLOW` because even though this is a serious issue, there is no user impact yet.
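A minimal sketch of what such an abstract tracker could look like (hypothetical, not the actual `HealthTracker` API): a subclass computes its current health, and the base class reports only when the value changes.

```java
import java.util.Objects;

// Hypothetical sketch of a local health tracker base class.
abstract class HealthTracker<T> {
    private T lastReported;

    /** Subclasses compute the current health of whatever they monitor. */
    abstract T checkCurrentHealth();

    /** Returns the new health value if it changed since the last report, else null. */
    T pollForChange() {
        T current = checkCurrentHealth();
        if (Objects.equals(current, lastReported)) {
            return null;           // unchanged: nothing to report
        }
        lastReported = current;
        return current;            // changed: report upstream
    }

    // Demo: two distinct health values are reported across three polls.
    static int demo() {
        final String[] state = { "GREEN" };
        HealthTracker<String> tracker = new HealthTracker<String>() {
            String checkCurrentHealth() { return state[0]; }
        };
        int changes = 0;
        if (tracker.pollForChange() != null) changes++; // first poll reports
        if (tracker.pollForChange() != null) changes++; // unchanged: no report
        state[0] = "YELLOW";
        if (tracker.pollForChange() != null) changes++; // change reported
        return changes;
    }
}
```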
* Move doc-values classes needed by ST_INTERSECTS to server
These classes are needed by ESQL spatial queries, and are not licensed in a way that prevents this move.
Since they depend on Lucene, it is not possible to move them to a library.
Instead they are moved to be co-located with the GeoPoint doc-values classes that already exist in server.
* Moved to lucene package org.elasticsearch.lucene.spatial
* Moved Geo/ShapeDocValuesQuery to server because it is Lucene specific
This gives us access to these classes from ESQL for Lucene pushdown of spatial queries.
With this commit we refactor the internal representation of stacktraces
to use plain arrays instead of lists for some of its properties. The
motivation behind this change is simplicity:
* It avoids unnecessary boxing
* We could eliminate a few redundant null checks because we use
primitive types now in some places
* We could slightly simplify run-length decoding
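For illustration, run-length decoding into a primitive `int[]` (no boxing) might look like this sketch; the (count, value) pair layout is an assumption, not the actual profiling wire format.

```java
// Illustrative sketch of run-length decoding with primitive arrays.
class RunLength {
    /** Decodes pairs of (count, value) into a flat int[], avoiding boxing. */
    static int[] decode(int[] encoded) {
        int total = 0;
        for (int i = 0; i < encoded.length; i += 2) {
            total += encoded[i];            // sum the run counts
        }
        int[] out = new int[total];
        int pos = 0;
        for (int i = 0; i < encoded.length; i += 2) {
            for (int n = 0; n < encoded[i]; n++) {
                out[pos++] = encoded[i + 1]; // expand each run
            }
        }
        return out;
    }
}
```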
Improve downsampling by making the following changes:
- Avoid NPE and assert tripping when fetching the last processed tsid.
- If the write block has been set, there is no reason to start the downsample persistent tasks, since shard-level downsampling has already completed and should be skipped; starting them anyway also causes ILM/DSL to get stuck on downsampling.
- Sometimes the source index may not be allocated yet on the node performing the shard-level downsampling operation. This caused an NPE; with this PR, it now fails the shard-level downsample with a less disturbing error.
Additionally unmute
DataStreamLifecycleDownsampleDisruptionIT#testDataStreamLifecycleDownsampleRollingRestart
Relates to #105068
The unconditional rollover that is a consequence of a lazy rollover command is triggered by the creation of a document. In many cases, the user triggering this rollover won't have sufficient privileges to ensure the successful execution of this rollover. For this reason, we introduce a dedicated rollover action and a dedicated internal user to cover this case and enable this functionality.