Commit graph

7747 commits

Author SHA1 Message Date
Yang Wang
50c71bcfdb
[Test] Ranged read should read non-empty content (#106000) (#106525)
Empty read is
[short-circuited](e8039b9ecb/modules/repository-s3/src/main/java/org/elasticsearch/repositories/s3/S3BlobContainer.java (L115-L116))
without going to the blob store. In order to test s3 blob store, ranged
read should read at least one byte. This PR ensures that.

Resolves: #105958
2024-03-20 04:56:14 -04:00
Martijn van Groningen
0d1e7a4a0e
Small time series agg improvement (#106288) (#106307)
After tsid hashing was introduced (#98023), the time series aggregator generates the tsid (from all dimension fields) instead of using the value from the _tsid field directly. This generation of the tsid happens for every time series, parent bucket and segment combination.

This change alters that by only generating the tsid once per time series and segment. This is done by just locally recording the current tsid.
2024-03-13 13:02:58 -04:00
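To illustrate the idea in 0d1e7a4a0e of "locally recording the current tsid", here is a minimal sketch. Every name in it (`CurrentTsidCache`, `tsidFor`, `startSegment`, the per-segment ordinal) is hypothetical and not taken from the Elasticsearch code base. It assumes each time series can be identified by a non-negative ordinal within a segment and that generating the tsid is the expensive step.

```java
import java.util.function.IntFunction;

// Hypothetical sketch: none of these names come from the Elasticsearch code base.
final class CurrentTsidCache {
    private int currentOrd = -1;   // ordinal of the time series seen last in this segment
    private String currentTsid;    // tsid generated for that time series
    private int generations;       // how many times the expensive generation ran

    // Returns the tsid for the given per-segment ordinal, regenerating it only
    // when the ordinal differs from the one recorded by the previous call.
    String tsidFor(int ord, IntFunction<String> generator) {
        if (ord != currentOrd) {
            currentTsid = generator.apply(ord);
            currentOrd = ord;
            generations++;
        }
        return currentTsid;
    }

    // Ordinals are assumed to be meaningful only within one segment,
    // so the recorded value is dropped when a new segment starts.
    void startSegment() {
        currentOrd = -1;
        currentTsid = null;
    }

    int generations() {
        return generations;
    }
}
```

With this shape, repeated calls for the same time series (for example once per parent bucket) reuse the recorded value instead of regenerating it.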
Matteo Piergiovanni
4cc9f5cf64
Field-caps field has value lookup use map instead of looping array (#105770) (#106131)
(cherry picked from commit 35b2dbee2a)
2024-03-11 08:40:27 +01:00
Jim Ferenczi
34fe40b5b0
Fix performance bug in SourceConfirmedTextQuery#matches (#105930) (#105983)
This change ensures that the matches implementation of the `SourceConfirmedTextQuery` only checks the current document instead of calling advance on the two phase iterator. The latter tries to find the first doc that matches the query instead of restricting the search to the current doc. This can lead to abnormally slow highlighting if the query is very restrictive and the highlight is done on a non-matching document.

Closes #103298
2024-03-06 08:48:28 +00:00
David Turner
203f549e14
URLRepository should not block shutdown (#105588) (#105614)
Today a node with a registered `URLRepository` will not shut down
cleanly because it never releases the last of the `activityRefs`. This
commit fixes that.
2024-02-19 06:11:55 -05:00
David Turner
127da57578
Fix use-after-free at event-loop shutdown (#105486) (#105575)
We could still be manipulating a network message when the event loop
shuts down, causing us to close the message while it's still in use.
This is at best going to be a little surprising to the caller, and at
worst could be an outright use-after-free bug.

This commit moves the double-check for a leaked promise to happen
strictly after the event loop has fully terminated, so that we can be
sure we've finished using it by this point.

Relates #105306, #97301
2024-02-15 15:24:11 -05:00
David Turner
c0e931af06
Detach persistent task execution from ThreadPool (#105460)
Similar to #99392, #97879 etc, no need to have the
`NodePersistentTasksExecutor` look up the executor to use each time, nor
does it necessarily need to use a named executor from the `ThreadPool`.
This commit pulls the lookup earlier in initialization so we can just
use a bare `Executor` instead.
2024-02-14 08:55:05 +00:00
Martijn van Groningen
67bf5f3d28
Improve test coverage for index shrinking a tsdb index. (#105459) 2024-02-14 08:10:25 +01:00
Keith Massey
f0ec294382
Limiting the number of nested pipelines that can be executed (#105428)
Limiting the number of nested pipelines that can be executed within a single pipeline to 100
2024-02-13 16:28:31 -06:00
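A minimal sketch of how a nesting limit like the one in f0ec294382 could be enforced. Only the limit of 100 comes from the commit message. The class, the method names, and the choice of exception are assumptions made for illustration and may not match the actual implementation.

```java
// Hypothetical sketch: only the limit of 100 is taken from the commit message.
final class NestedPipelineLimitSketch {
    static final int MAX_PIPELINES = 100;

    private int depth; // number of pipelines currently being executed for this document

    // Called before a (nested) pipeline starts running on the document.
    void enter(String pipelineId) {
        if (depth >= MAX_PIPELINES) {
            throw new IllegalStateException(
                "Too many nested pipelines (limit " + MAX_PIPELINES + "), rejected: " + pipelineId);
        }
        depth++;
    }

    // Called after a pipeline has finished running on the document.
    void exit() {
        depth--;
    }

    int depth() {
        return depth;
    }
}
```

Failing fast with an explicit error keeps a very deep chain of pipeline processors from running until some other resource is exhausted.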
Keith Massey
c884945a93
Adding executedPipelines to the IngestDocument copy constructor (#105427) 2024-02-13 15:11:47 -06:00
Jack Conradson
b5828fbb67
Add plumbing to check cluster features in SearchSourceBuilder (#105417)
This change adds additional plumbing to pipe through the available cluster features into
SearchSourceBuilder. A number of different APIs use SearchSourceBuilder, so they had to make this
available through their parsers as well, often through ParserContext. This change is largely mechanical:
it passes a Predicate into existing REST actions to check for feature availability.

Note that this change was pulled mostly from this PR (#105040).
2024-02-13 08:30:04 -08:00
Keith Massey
e2b2232569
Improving the performance of the ingest simulate verbose API (#105265)
This updates the simulate verbose API to run in O(N) (for number of pipelines)
time and memory like the simulate and ingest APIs rather than O(N^2).
2024-02-12 16:04:21 -06:00
Dmitry Cherniachenko
a50e58d99a
Use single-char variant of String.indexOf() where possible (#105205)
* Use single-char variant of String.indexOf() where possible

indexOf(char) is more efficient than searching for the same one-character String.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-02-12 14:14:32 -05:00
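The two overloads touched by a50e58d99a can be compared side by side. This is an illustrative sketch, not code from the commit; the class name and the sample string are made up. Both calls return the same index, and the only difference is which `String.indexOf` overload is selected.

```java
final class IndexOfDemo {
    // Single-char overload: String.indexOf(int ch).
    static int indexOfChar(String s) {
        return s.indexOf(':');
    }

    // One-character String overload: String.indexOf(String str).
    static int indexOfString(String s) {
        return s.indexOf(":");
    }

    public static void main(String[] args) {
        String header = "Content-Type: text/plain";
        // Both overloads locate the same position.
        System.out.println(indexOfChar(header) == indexOfString(header)); // prints "true"
    }
}
```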
Keith Massey
5ccadaee7b
Adding a custom exception for problems with the graph of pipelines to be applied to a document (#105196)
This PR removes the need to parse the exception message to detect if a cycle has been detected
in the ingest pipelines to be run on a document.
2024-02-12 13:11:00 -06:00
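The benefit described in 5ccadaee7b can be sketched as follows: callers test for an exception type rather than inspecting message text. The exception name `PipelineGraphException` and the helper `isGraphProblem` are hypothetical and chosen for illustration. The commit message does not name the actual class.

```java
// Hypothetical sketch: the type and helper names are not taken from the commit.
final class PipelineGraphSketch {
    // Dedicated exception type for problems with the graph of pipelines (for example a cycle).
    static final class PipelineGraphException extends RuntimeException {
        PipelineGraphException(String message) {
            super(message);
        }
    }

    // Walks the cause chain and checks the type, so the result does not
    // depend on the wording of any exception message.
    static boolean isGraphProblem(Throwable t) {
        for (Throwable cause = t; cause != null; cause = cause.getCause()) {
            if (cause instanceof PipelineGraphException) {
                return true;
            }
        }
        return false;
    }
}
```

A type check keeps working when the message is reworded, which a `getMessage().contains(...)` check would not.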
Dmitry Cherniachenko
e21a4874ab
Use String.replace() instead of replaceAll() for non-regexp replacements (#105127)
* Use String.replace() instead of replaceAll() for non-regexp replacements

When arguments do not make use of regexp features replace() is a more efficient option, especially the char-variant.
2024-02-12 13:11:15 -05:00
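A short illustration of the distinction behind e21a4874ab, using made-up method names and sample input. `String.replace` treats its arguments literally, while `String.replaceAll` interprets the first argument as a regular expression, so a literal replacement through `replaceAll` needs escaping.

```java
final class ReplaceDemo {
    // Literal replacement via String.replace(CharSequence, CharSequence).
    static String literal(String s) {
        return s.replace(".", "_");
    }

    // Literal replacement via the char variant, String.replace(char, char).
    static String literalChar(String s) {
        return s.replace('.', '_');
    }

    // Regex replacement: the dot has to be escaped to be matched literally.
    static String regexEscaped(String s) {
        return s.replaceAll("\\.", "_");
    }

    // Regex replacement with an unescaped dot, which matches every character.
    static String regexUnescaped(String s) {
        return s.replaceAll(".", "_");
    }

    public static void main(String[] args) {
        System.out.println(literal("a.b.c"));        // prints "a_b_c"
        System.out.println(regexUnescaped("a.b.c")); // prints "_____"
    }
}
```

Besides the efficiency argument made in the commit message, the literal variants avoid the escaping pitfall shown by `regexUnescaped`.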
Przemyslaw Gomulka
11f3c29089
DocumentSizeObserver infrastructure to allow not reporting upon failures (#104859)
We want to report that observation of document parsing has finished only upon a successful indexing.
To achieve this, we need to perform reporting only in one place (not as previously in both IngestService and 'bulk action')

This commit splits the DocumentParsingObserver in two. One for wrapping an XContentParser and returning the observed state - the DocumentSizeObserver and a DocumentSizeReporter to perform an action when parsing has been completed and indexing successful.

To perform reporting in one place we need to pass the state from IngestService to 'bulk action'. The state is currently represented as long - normalisedBytesParsed.

In TransportShardBulkAction we get the normalisedBytesParsed information, and in the serverless plugin we check whether the value indicates that parsing already happened in IngestService (value != -1). If it did, we create a DocumentSizeObserver with the fixed normalisedBytesParsed and won't increment it.

When the indexing is completed and successful we report the observed state for an index with DocumentSizeReporter

small nit: by passing the documentParsingObserve via SourceToParse we no longer have to inject it via complex hierarchy for DocumentParser. Hence some constructor changes
2024-02-12 17:16:24 +01:00
Alexander Spies
a241b96b8f
Upgrade ANTLR4 to 4.13.1 (#105334) 2024-02-12 15:24:32 +01:00
Kostas Krikellas
510c8515ab
Refactor getAliases, add more yaml tests (#105370)
* Refactor getAliases, add more yaml tests

* more test updates
2024-02-12 09:10:45 +02:00
Martijn van Groningen
47304a1f04
Added a more randomized passthrough indexing test. (#105344)
That also asserts routing aspects of indexing, searching and getting by
id.

Relates to #103567
2024-02-09 11:34:15 -05:00
Keith Massey
45885fdb91
Adding some tests for many nested pipeline processors (#105291) 2024-02-09 08:09:50 -06:00
Mary Gouseti
f1aae380d9
[IndicesOptions] Group indices options based on what they are applied on. (#104655)
In this PR we are refactoring the internals of the `IndicesOptions` class. Because this class is widely used, the refactoring is strictly internal; we do not change the existing serialisation. This allows us to better test this and to preserve performance over the wire. The improvements we are bringing forth with this PR are:
- New internal structure of the flags, based on what the flags influence.
- Every flag is a boolean instead of using the presence of an enum option in a set.
- We provide builders to allow easier construction of the object and easier overriding of the defaults.
- This will enable easier extension that might be useful for other projects.
2024-02-09 13:01:34 +02:00
Armin Braun
5c8006499a
Move test-only search response x-content-parsing code to test codebase (#105308)
Loads of code here that is only used in tests and one duplicate unused
class that was only used as an indirection to parsing the
`AsyncSearchResponse`. Moved what I could easily move via automated
refactoring to `SearchResponseUtils` in tests and removed the duplicate
now unused class from the client codebase.
2024-02-09 11:56:39 +01:00
David Turner
2615aa00b8
Fix race in HTTP response shutdown handling (#105306)
Similar to #97301, the fix in #105293 was still not quite correct: we
could in principle shut down the transport after checking `isOpen()` but
before sending the message. Applying the same fix as for the transport
layer here.
2024-02-09 08:43:13 +00:00
Yang Wang
5cf80496e5
Add s3 HeadObject request to request stats (#105105)
The HeadObject request should be included in requests stats and metrics
for completeness. This PR does that.

Relates: #98083
Resolves: ES-7810
2024-02-08 21:29:14 -05:00
Yang Wang
a3b3083d2c
Do not record s3 http request time when it is not available (#105103)
The metrics of HTTP request time can be unavailable (null) when the
request fails on the client side. This PR makes sure we do not attempt
to record it when it happens to avoid NPE.
2024-02-09 12:32:55 +11:00
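The guard described in a3b3083d2c amounts to a null check before recording. The sketch below uses a hypothetical `RequestTimeRecorder` and is not the actual S3 metrics code. It assumes the request time arrives as a boxed `Long` that may be null when the request failed on the client side.

```java
// Hypothetical sketch: the class and its methods are made up for illustration.
final class RequestTimeRecorder {
    private long totalMillis;
    private int samples;

    // Records the request time only when one is available; a null value is
    // skipped so that unboxing it cannot throw a NullPointerException.
    void record(Long requestTimeMillis) {
        if (requestTimeMillis == null) {
            return;
        }
        totalMillis += requestTimeMillis;
        samples++;
    }

    long totalMillis() {
        return totalMillis;
    }

    int samples() {
        return samples;
    }
}
```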
David Turner
97dbb2a27e
Fix leaked HTTP response sent after close (#105293)
Today a `HttpResponse` is always released via a `ChannelPromise` which
means the release happens on a network thread. However, it's possible we
try and send a `HttpResponse` after the node has got far enough through
shutdown that it doesn't have any running network threads left, which
means the response just leaks.

This is no big deal in production, it becomes irrelevant when the
process exits, but in tests we start and stop many nodes within the same
process so mustn't leak anything.

At this point in shutdown, all HTTP channels are now closed, so it's
sufficient to check whether the channel is open first, and to fail the
listener on the calling thread if not. That's what this commit does.

Closes #104651
2024-02-08 14:57:02 -05:00
Matteo Piergiovanni
54cfce4379
Flag in _field_caps to return only fields with values in index (#103651)
We are adding a query parameter to the field_caps api in order to filter out
fields with no values. The parameter is called `include_empty_fields` and
defaults to true. If set to false, it filters out of the field_caps
response all the fields that have no value in the index.
We keep track of FieldInfos during refresh in order to know which fields have
values in an index. We also added a system property
`es.field_caps_empty_fields_filter` in order to disable this feature if needed.

---------

Co-authored-by: Matthias Wilhelm <ankertal@gmail.com>
2024-02-08 17:52:21 +01:00
Michael Peterson
ac36aa7795
Resolve Cluster API (#102726)
To improve cross-cluster search user experience, Kibana needs an endpoint that is accessible
by arbitrary Kibana dashboard search users and provides:

1. a listing of clusters in scope for a CCS query (based on the index expression and whether 
there are any indices on each cluster that the Kibana user has access to query).
2. whether that cluster is currently connected to the querying cluster (will it come back as 
skipped or failed in a CCS search)
3. showing the skip_unavailable setting for those clusters (so you can know whether it will
return skipped or failed in a CCS search)
4. the ES version of the cluster

Since no single Elasticsearch endpoint provides all of these features, this PR creates a new endpoint `_resolve/cluster` that works alongside the existing `_resolve/index` endpoint
(and leverages some of its features).

Example usage against a cluster with 2 remote clusters configured:

GET /_resolve/cluster/*,remote*:bl*

Response:

{
  "(local)": {
    "connected": true,
    "skip_unavailable": false,
    "matching_indices": true,
    "version": {
      "number": "8.12.0-SNAPSHOT",
      "build_flavor": "default",
      "minimum_wire_compatibility_version": "7.17.0",
      "minimum_index_compatibility_version": "7.0.0"
    }
  },
  "remote2": {
    "connected": true,
    "skip_unavailable": true,
    "matching_indices": true,
    "version": {
      "number": "8.12.0-SNAPSHOT",
      "build_flavor": "default",
      "minimum_wire_compatibility_version": "7.17.0",
      "minimum_index_compatibility_version": "7.0.0"
    }
  },
  "remote1": {
    "connected": true,
    "skip_unavailable": false,
    "matching_indices": false,
    "version": {
      "number": "8.12.0-SNAPSHOT",
      "build_flavor": "default",
      "minimum_wire_compatibility_version": "7.17.0",
      "minimum_index_compatibility_version": "7.0.0"
    }
  }
}

Almost all errors show up as "error" entries in the response.
Only the local SecurityException returns a 403 since that happens before the ResolveCluster
Transport code kicks in.
2024-02-08 10:50:05 -05:00
Felix Barnsteiner
f426b68a82
Unmute LogsDataStreamIT.testIgnoreDynamicBeyondLimit (#105282) 2024-02-08 13:26:42 +01:00
David Turner
e489951d84
Close currentChunkedWrite on client cancel (#105258)
If the client closes the channel while we're in the middle of a chunked
write then today we don't complete the corresponding listener. This
commit fixes the problem.
2024-02-08 07:07:04 -05:00
Felix Barnsteiner
9dfd5dbd8f
Mute LogsDataStreamIT.testIgnoreDynamicBeyondLimit (#105280) 2024-02-08 12:25:45 +01:00
Ignacio Vera
8f37ef977f
Remove abstract method InternalMultiBucketAggregation#reduceBucket (#105275) 2024-02-08 11:24:02 +01:00
Felix Barnsteiner
50902e15a6
Use new ignore_dynamic_beyond_limit setting in logs and metrics data streams (#105180)
This reduces the risk of document loss if too many fields are added.

As these component templates are imported by Fleet, this also affects
integrations.
2024-02-08 04:23:50 -05:00
Ignacio Vera
609e8059eb
Introduce an AggregatorReducer to reduce the footprint of aggregations in the coordinating node (#105207)
This commit adds an abstraction that performs reduction of InternalAggregations in a streaming fashion.
2024-02-08 09:30:54 +01:00
Mary Gouseti
65d1d3d47d
Change the rest client configuration in the LazyRolloverDataStreamIT (#105243) 2024-02-07 17:44:40 +02:00
Niels Bauman
64891011d3
Extend repository_integrity health indicator for unknown and invalid repos (#104614)
This PR extends the repository integrity health indicator to cover also unknown and invalid repositories. Because these errors are local to a node, we extend the `LocalHealthMonitor` to monitor the repositories and report the changes in their health regarding the unknown or invalid status.
To simplify this extension in the future, we introduce the `HealthTracker` abstract class that can be used to create new local health checks.
Furthermore, we change the severity of the health status when the repository integrity indicator reports unhealthy from `RED` to `YELLOW` because even though this is a serious issue, there is no user impact yet.
2024-02-07 15:18:55 +01:00
Mary Gouseti
011876367a
Execute lazy rollover with an internal dedicated user #104732 (#104905)
The unconditional rollover that is a consequence of a lazy rollover command is triggered by the creation of a document. In many cases, the user triggering this rollover won't have sufficient privileges to ensure the successful execution of this rollover. For this reason, we introduce a dedicated rollover action and a dedicated internal user to cover this case and enable this functionality.
2024-02-07 13:01:01 +02:00
Ignacio Vera
4d5416912b
Use an AbstractList to build the AggregationList for reduction (#105200)
We are building a list of InternalAggregations from a list of Buckets, therefore we can use an AbstractList to create the actual list and save some allocations.
2024-02-06 17:53:41 +01:00
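The allocation-saving technique mentioned in 4d5416912b can be shown generically: an `AbstractList` subclass exposes a mapped view of another list without copying its elements. The class and method names below are illustrative and not taken from the commit, which applies the idea to buckets and their `InternalAggregations`.

```java
import java.util.AbstractList;
import java.util.List;
import java.util.function.Function;

final class MappedListView {
    // Returns a read-only view whose i-th element is mapper.apply(source.get(i)).
    // No backing array is allocated; elements are computed on access.
    static <T, R> List<R> view(List<T> source, Function<? super T, ? extends R> mapper) {
        return new AbstractList<R>() {
            @Override
            public R get(int index) {
                return mapper.apply(source.get(index));
            }

            @Override
            public int size() {
                return source.size();
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> lengths = view(List.of("a", "bb", "ccc"), String::length);
        System.out.println(lengths); // prints "[1, 2, 3]"
    }
}
```

The trade-off is that the mapper runs on every access, so a view like this suits lists that are traversed once, as in a reduction.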
Joe Gallo
341f845832
Ingest geoip: tidy up logging code (#105086) 2024-02-06 10:44:48 -05:00
Joe Gallo
d392cd7d56
Tidy up collections code (#105085) 2024-02-06 10:44:20 -05:00
Felix Barnsteiner
ff0f83f59d
Make field limit more predictable (#102885)
Today, we're counting all mappers towards the field limit, including
mappers for subfields that aren't explicitly added to the mapping.

This means that some field types, such as `search_as_you_type` or
`percolator` count as more than one field even though that's not
apparent to users as they're just defining them as a single field in the
mapping.

This change makes it so that each field mapper only counts as one. We're
still counting multi-fields.

This makes it easier for users to understand why the field limit is hit.

~In addition to that, it also simplifies
https://github.com/elastic/elasticsearch/pull/96235 as it makes the
implementation of `Mapper.Builder#getTotalFieldsCount` much easier and
easier to align with `Mapper#getTotalFieldsCount`. This reduces the risk
of over- or under-estimating the field count of a `Mapper.Builder` in
`DocumentParserContext#addDynamicMapper`, which in turn reduces the risk
of data loss due to the issue described here:
https://github.com/elastic/elasticsearch/pull/96235#discussion_r1402495749.~

*Edit: due to https://github.com/elastic/elasticsearch/pull/103865, we
don't need an implementation of `getTotalFieldsCount` or `mapperSize` in
`Mapper.Builder`. Still, this PR more closely aligns
`Mapper#getTotalFieldsCount` with `MappingLookup#getTotalFieldsCount`,
which  `DocumentParserContext#addDynamicMapper` uses to determine
whether the field limit is hit*

A potential risk of this is that we're now effectively allowing more
fields in the mapping. It may be surprising to users that more fields
can be added to a mapping. Although, I'd not expect negative
consequences from that. Generally, I'd  expect users to be happy about
any change that reduces the risk of data loss.

We could also think about whether to apply the new counting logic only
to new indices (depending on the `IndexVersion`). However, that would
add more complexity and I'm not convinced about the value. We'd then
need to maintain two different ways of counting fields and also require
passing in the `IndexVersion` to `MappingLookup` which previously didn't
require the `IndexVersion`.

This PR is meant as a conversation starter. It would also simplify
https://github.com/elastic/elasticsearch/pull/96235 but I don't think
this blocks that PR in any way.

I'm curious about the opinion of @javanna and @jpountz on this.
2024-02-06 06:58:42 -05:00
James Baiera
9d3a645d59
Redirect failed ingest node operations to a failure store when available (#103481)
This PR updates the ingest service to detect if a failed ingest document was bound for a data stream configured 
with a failure store, and in that event, restores the document to its original state, transforms it with its failure 
information, and redirects it to the failure store for the data stream it was originally targeting.
2024-02-05 14:37:30 -05:00
Armin Braun
f879508834
Avoid building large CompositeByteBuf when sending transport messages (#105137)
We can avoid building composite byte buf instances on the transport
layer (they have quite a bit of overhead and make heap dumps more
complicated to read). There's no need to add another round of references
to the BytesReference components here. Just write these out as they come
in. This would allow for some efficiency improving follow-ups where we
can essentially release the pages that have passed the write pipeline.
To avoid having this explode the size of the queue for writes per
channel, I moved that to a linked list. The slowdown from a linked list
is irrelevant I believe. Mostly the queue is empty so it doesn't matter
or if it isn't empty, operations other than dequeuing are much more
important to performance in this logic anyway (+ Netty internally uses a
LL down the line anyway).

I would regard this as step-1 in making the serialisation here more lazy
like on the REST layer to avoid copying bytes to the outbound buffer
that we already have as `byte[]`.
2024-02-05 14:35:15 -05:00
Martijn van Groningen
39eefb3197
Unmute TimeSeriesTsidHashCardinalityIT (#105121)
and reduce the number of time series in order to fix test related OOME.

Relates to #105104
2024-02-05 17:20:30 +01:00
Kostas Krikellas
e85bb5afc3
Nest pass-through objects within objects (#105062)
* Fix test failure

https://gradle-enterprise.elastic.co/s/icg66i6mwnjoi

* Fix test failure

https://gradle-enterprise.elastic.co/s/icg66i6mwnjoi

* Nest pass-through objects within objects

* Update docs/changelog/105062.yaml

* improve test
2024-02-05 09:31:13 +02:00
Yang Wang
552d2f563b
Expose OperationPurpose via CustomQueryParameter to s3 logs (#105044)
This PR adds the OperationPurpose as a custom query parameter for each
S3 request so that they are available in s3 access logs.

Resolves: ES-7750
2024-02-04 03:21:50 -05:00
Nhat Nguyen
40a61abb95
Awaits fix #105104 2024-02-03 18:34:03 -08:00
Moritz Mack
54088839b4
Do not enable APM agent 'instrument', it's not required for manual tracing. (#105055) 2024-02-02 18:13:00 +01:00
Mary Gouseti
55cf726a80
Fix typo in test (#104744) (#105052)
Easy fix: there was a typo in the warning. Instead of checking for the
correct index pattern `other-*`, it was checking for `ds-*`.

Closes #104774
2024-02-02 06:30:54 -05:00
John Verwolf
98a37c7b6b
Enhancement: Metrics for Search Took Times using Action Listeners (#104996)
* Instrument search took times

* Update assertion helper method to use client param

* Update docs/changelog/104996.yaml

* spotless

* Fix test
2024-02-01 12:51:12 -08:00