elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-04-23 06:37:27 -04:00

Author	SHA1	Message	Date
Rene Groeschke	5afd06ae57	[7.17] Update Gradle Wrapper to 8.2 (#96686 ) (#97484 ) * Update Gradle Wrapper to 8.2 (#96686) - Convention usage has been deprecated and was fixed in our build files - Fix test dependencies and deprecation	2023-09-27 08:46:44 +02:00
Ryan Ernst	085744a689	Rename NodeEnvironment.NodePath to DataPath (#86942 ) (#86943 ) The NodePath inner class of NodeEnvironment represents path.data entries of the node. The name however is sometimes confusing since these are the data paths, and there is normally no singular concept of "node path" (an installation has several different paths). This commit renames NodePath to DataPath, as well as all methods and variables referring to it, so that the more general term "node paths" can be utilized for something node wide: the paths in environment (in a future PR). backport of #86942	2022-05-19 18:44:43 -04:00
Martijn van Groningen	dad1c7b5e6	Turn RoutingNodes#shardsWithState(...) methods into test methods. (#79134 ) Backport of #79001 to 7.x branch. Move shardsWithState(...) methods from RoutingNodes to RoutingNodesHelper helper class in the test framework library. This is a large change, but the majority of this change is changing tests to use `RoutingNodesHelper#shardsWithState(...)` method. The changes to `ReactiveStorageDeciderService` are the only changes to production code. Originates from #78959	2021-10-14 08:16:57 -04:00
Jim Ferenczi	5416582f57	Filter original indices in shard level request (#78508) Today the search action sends the full list of original indices on every shard request. This change restricts the list to the concrete index of the shard or the alias that was used to resolve it. Relates #78314	2021-10-14 10:34:08 +02:00
Chris Hegarty	964180ba99	[7.x] Fix split package org.elasticsearch.common.xcontent (#79061 ) * Fix split package org.elasticsearch.common.xcontent * Fix test	2021-10-13 15:43:41 +01:00
Igor Motov	3487a66414	[7.x] Interrupt aggregation reduce phase if the search task is cancelled (#78583 ) This change raises a TaskCancelledException to stop the search query if it is detected that the SearchTask has been cancelled during the reduce phase. Issue: #71021 Co-authored-by: Daniel Hsu <daniel.hsu7@gmail.com> Co-authored-by: Igor Motov <igor@motovs.org>	2021-10-04 18:25:25 -10:00
Benjamin Trent	0782ae7427	[7.x] [ML] Text/Log categorization multi-bucket aggregation (#71752) (#78623)
* [ML] Text/Log categorization multi-bucket aggregation (#71752)
This commit adds a new multi-bucket aggregation: `categorize_text`
The aggregation follows a similar design to significant text in that it reads from `_source` and re-analyzes the text as it is read. The key difference is that it does not use the indexed field's analyzer, but instead relies on the `ml_standard` tokenizer with specialized ML token filters. The tokenizer + filters are the same that machine learning categorization anomaly jobs utilize.
The high level logical flow is as follows:
- At each shard, read in the text field with a custom analyzer using the `ml_standard` tokenizer
- Read in the particular tokens from the analyzer
- Feed these tokens to a token tree algorithm (an adaptation of the drain categorization algorithm)
- Gather the individual log categories (the leaf nodes), sort them by doc_count, ship those buckets to be merged
- Merge all buckets that have the EXACT same key
- Once all buckets are merged, pass those keys + counts to a new token tree for additional merging
- That tree builds the final buckets and that is returned to the user
Algorithm explanation:
- Each log is parsed with the ml-standard tokenizer
- Each token is passed into a token tree
- For `max_match_token` each token is stored in the tree and at `max_match_token+1` (or `len(tokens)`) a log group is created
- If another log group exists at that leaf, merge it if they have `similarity_threshold` percentage of tokens in common
- Merging simply replaces tokens that are different in the group with a wildcard token
- If a layer in the tree has `max_unique_tokens` we add a wildcard child and any new tokens are passed through there.
The catch here is that on the final merge, we first attempt to merge together subtrees with the smallest number of documents. Especially if the new sub tree has more documents counted.
## Aggregation configuration.
Here is an example on some openstack logs
```js
POST openstack/_search?size=0
{
  "aggs": {
    "categories": {
      "categorize_text": {
        "field": "message",         // The field to categorize
        "similarity_threshold": 20, // merge log groups if they are this similar
        "max_unique_tokens": 20,    // Max Number of children per token position
        "max_match_token": 4,       // Maximum tokens to build prefix trees
        "size": 1
      }
    }
  }
}
```
This will return buckets like
```json
"aggregations" : {
  "categories" : {
    "buckets" : [
      {
        "doc_count" : 806,
        "key" : "nova-api.log.1.2017-05-16_13 INFO nova.osapi_compute.wsgi.server * HTTP/1.1 status len time"
      }
    ]
  }
}
```
* fixing for backport
* fixing test after backport	2021-10-04 13:33:56 -04:00
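The merge rule described in the commit above (merge two log groups when they share at least `similarity_threshold` percent of their tokens, replacing differing tokens with a wildcard) can be sketched roughly as follows. This is an illustration only, not the Elasticsearch implementation: the names `similarity` and `merge_groups`, the `*` marker, and the restriction to equal-length token lists are assumptions made for the example.

```python
WILDCARD = "*"  # assumed marker for a token position that differs between groups


def similarity(a, b):
    """Percentage of positions at which two equal-length token lists agree."""
    if len(a) != len(b) or not a:
        return 0.0
    same = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * same / len(a)


def merge_groups(a, b, similarity_threshold):
    """Return the merged token list, or None if the groups are too different."""
    if similarity(a, b) < similarity_threshold:
        return None
    # Keep tokens that agree; replace tokens that differ with the wildcard.
    return [x if x == y else WILDCARD for x, y in zip(a, b)]


merged = merge_groups(
    ["GET", "/servers/1", "HTTP/1.1", "status", "200"],
    ["GET", "/servers/2", "HTTP/1.1", "status", "200"],
    similarity_threshold=50,
)
print(" ".join(merged))
```

With these hypothetical inputs the two groups agree on four of five tokens, so they merge and only the differing path token becomes a wildcard.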
Christos Soulios	6cccbf55c3	[7.x] Add `time_series_dimension` and `time_series_metric` mapping parameters (#78265)
Backports the following PRs:
* Add dimension mapping parameter (#74450)
Added the dimension parameter to the following field types:
- keyword
- ip
- Numeric field types (integer, long, byte, short)
The dimension parameter is of type boolean (default: false) and is used to mark that a field is a time series dimension field. Relates to #74014
* Add constraints to dimension fields (#74939)
This PR adds the following constraints to dimension fields:
- It must be an indexed field and must have doc values
- It cannot be multi-valued
- The number of dimension fields in the index mapping must not be more than 16. This should be configurable through an index property (index.mapping.dimension_fields.limit)
- keyword fields cannot be more than 1024 bytes long
- keyword fields must not use a normalizer
Based on the code added in PR #74450. Relates to #74660
* Expand DocumentMapperTests (#76368)
Adds a test for setting the maximum number of dimensions setting and tests the names and types of the metadata fields in the index. Previously we just asserted the count of metadata fields. That made it hard to read failures.
* Fix broken test for dimension keywords (#75408)
Test was failing because it was testing a 1024 bytes long keyword and the assertion was failing. Closes #75225
* Checkstyle
* Add time_series_metric parameter (#76766)
This PR adds the time_series_metric parameter to the following field types:
- Numeric field types
- histogram
- aggregate_metric_double
* Rename `dimension` mapping parameter to `time_series_dimension` (#78012)
This PR renames the dimension mapping parameter to time_series_dimension to make it consistent with the time_series_metric parameter (#76766). Relates to #74450 and #74014
* Add time series params to `unsigned_long` and `scaled_float` (#78204)
- Added the time_series_metric mapping parameter to the unsigned_long and scaled_float field types
- Added the time_series_dimension mapping parameter to the unsigned_long field type
Fixes #78100 Relates to #76766, #74450 and #74014
Co-authored-by: Nik Everett <nik9000@gmail.com>	2021-09-27 11:59:05 +03:00
Nik Everett	05af24335d	Memory efficient xcontent filtering (backport of #77154) (#77653)
* Memory efficient xcontent filtering (backport of #77154)
I found myself needing support for something like `filter_path` on `XContentParser`. It was simple enough to plug it in so I did. Then I realized that it might offer more memory efficient source filtering (#25168) so I put together a quick benchmark comparing the source filtering that we do in `_search`.
Filtering using the parser is about 33% faster than how we filter now when you select a single field from a 300 byte document:
```
Benchmark                                          (excludes)  (includes)  (source)  Mode  Cnt     Score    Error  Units
FetchSourcePhaseBenchmark.filterObjects                           message     short  avgt    5  2360.342 ±  4.715  ns/op
FetchSourcePhaseBenchmark.filterXContentOnBuilder                 message     short  avgt    5  2010.278 ± 15.042  ns/op
FetchSourcePhaseBenchmark.filterXContentOnParser                  message     short  avgt    5  1588.446 ± 18.593  ns/op
```
The top line is the way we filter now. The middle line is adding a filter to `XContentBuilder` - something we can do right now without any of my plumbing work. The bottom line is filtering on the parser, requiring all the new plumbing.
This isn't particularly impressive. 33% sounds great! But 700 nanoseconds per document isn't going to cut into anyone's search times. If you fetch a thousand documents that's .7 milliseconds of savings. But we mostly advise folks to use source filtering on fetch when the source is large and you only want a small part of it. So I tried when the source is about 4.3kb and you want a single field:
```
Benchmark                                          (excludes)  (includes)      (source)  Mode  Cnt     Score     Error  Units
FetchSourcePhaseBenchmark.filterObjects                           message  one_4k_field  avgt    5  5957.128 ± 117.402  ns/op
FetchSourcePhaseBenchmark.filterXContentOnBuilder                 message  one_4k_field  avgt    5  4999.073 ±  96.003  ns/op
FetchSourcePhaseBenchmark.filterXContentonParser                  message  one_4k_field  avgt    5  3261.478 ±  48.879  ns/op
```
That's 45% faster. Put another way, 2.7 microseconds a document. Not bad!
But have a look at how things come out when you want a single field from a 4 megabyte document:
```
Benchmark                                          (excludes)  (includes)      (source)  Mode  Cnt        Score        Error  Units
FetchSourcePhaseBenchmark.filterObjects                           message  one_4m_field  avgt    5  8266343.036 ± 176197.077  ns/op
FetchSourcePhaseBenchmark.filterXContentOnBuilder                 message  one_4m_field  avgt    5  6227560.013 ±  68306.318  ns/op
FetchSourcePhaseBenchmark.filterXContentonParser                  message  one_4m_field  avgt    5  1617153.472 ±  80164.547  ns/op
```
These documents are very large. I've encountered documents like them in real life, but they've always been the outlier for me. But a 6.5 millisecond per document savings ain't anything to sneeze at.
Take a look at what you get when I turn on gc metrics:
```
FetchSourcePhaseBenchmark.filterObjects                            message  one_4m_field  avgt    5  7036097.561 ± 84721.312   ns/op
FetchSourcePhaseBenchmark.filterObjects:·gc.alloc.rate             message  one_4m_field  avgt    5     2166.613 ±    25.975  MB/sec
FetchSourcePhaseBenchmark.filterXContentOnBuilder                  message  one_4m_field  avgt    5  6104595.992 ± 55445.508   ns/op
FetchSourcePhaseBenchmark.filterXContentOnBuilder:·gc.alloc.rate   message  one_4m_field  avgt    5     2496.978 ±    22.650  MB/sec
FetchSourcePhaseBenchmark.filterXContentonParser                   message  one_4m_field  avgt    5  1614980.846 ± 31716.956   ns/op
FetchSourcePhaseBenchmark.filterXContentonParser:·gc.alloc.rate    message  one_4m_field  avgt    5        1.755 ±     0.035  MB/sec
```
* Fixup benchmark for 7.x	2021-09-13 16:08:45 -04:00
Armin Braun	fdb6fe5046	Remove Dead NamedWritableRegistry Fields in Aggs/Search Code (#76743 ) (#76774 ) No need for the registry in these places anymore => we can simplify things here and there.	2021-08-20 22:16:15 +02:00
Igor Motov	d13ec5e64c	[7.x] Fix wrong error upper bound when performing incremental reductions (#43874 ) (#76475 ) When performing incremental reductions, 0 value of docCountError may mean that the error was not previously calculated, or that the error was indeed previously calculated and its value was 0. We end up rejecting true values set to 0 this way. This may lead to wrong upper bound of error in result. To fix it, this PR makes docCountError nullable. null values mean that error was not calculated yet. Fixes #40005, #75667 Co-authored-by: Nikita Glashenko <nikita.glashenko@gmail.com>	2021-08-16 08:06:21 -10:00
Stuart Tettemer	91e9c5c5a0	Script: Fields API for Sort and Score scripts (#75863 ) (#76108 ) Adds minimal fields API support to sort and score scripts. Example: `field('myfield').getValue(123)` where `123` is the default if the field has no values. Refs: #61388 Backport: `6c02a6c`	2021-08-04 11:40:52 -05:00
Rory Hunter	fb8f84fdae	Order imports when reformatting (#74059 ) Change the formatter config to sort / order imports, and reformat the codebase. We already had a config file for Eclipse users, so Spotless now uses that. The "Eclipse Code Formatter" plugin ought to be able to use this file as well for import ordering, but in my experiments the results were poor. Instead, use IntelliJ's `.editorconfig` support to configure import ordering. I've also added a config file for the formatter plugin. Other changes: * I've quietly enabled the `toggleOnOff` option for Spotless. It was already possible to disable formatting for sections using the markers for docs snippets, so enabling this option just accepts this reality and makes it possible via `formatter:off` and `formatter:on` without the restrictions around line length. It should still only be used as a very last resort and with good reason. * I've removed mention of the `paddedCell` option from the contributing guide, since I haven't had to use that option for a very long time. I moved the docs to the spotless config.	2021-06-16 09:25:55 +01:00
Ryan Ernst	c9471144f5	Add precommit task for detecting split packages (#73784 ) (#73931 ) Modularization of the JDK has been ongoing for several years. Recently in Java 16 the JDK began enforcing module boundaries by default. While Elasticsearch does not yet use the module system directly, there are some side effects even for those projects not modularized (eg #73517). Before we can even begin to think about how to modularize, we must Prepare The Way by enforcing packages only exist in a single jar file, since the module system does not allow packages to coexist in multiple modules. This commit adds a precommit check to the build which detects split packages. The expectation is that we will add the existing split packages to the ignore list so that any new classes will not exacerbate the problem, and the work to cleanup these split packages can be parallelized. relates #73525	2021-06-08 16:53:56 -07:00
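The idea behind the split-package check in the commit above can be sketched as follows. This is a hedged illustration, not the actual precommit task: the function name `find_split_packages`, the input shape (jar name mapped to the packages it contains), and the example jar names are all hypothetical.

```python
from collections import defaultdict


def find_split_packages(jar_packages, ignored=frozenset()):
    """Return packages that appear in more than one jar, minus an ignore list.

    jar_packages maps a (hypothetical) jar name to the set of packages in it.
    """
    owners = defaultdict(set)
    for jar, packages in jar_packages.items():
        for package in packages:
            owners[package].add(jar)
    return {
        package: sorted(jars)
        for package, jars in owners.items()
        if len(jars) > 1 and package not in ignored
    }


jars = {
    "core.jar": {"org.elasticsearch.common", "org.elasticsearch.core"},
    "server.jar": {"org.elasticsearch.common", "org.elasticsearch.index"},
}
print(find_split_packages(jars))
```

Passing known offenders through `ignored` mirrors the ignore list the commit message mentions: existing split packages are tolerated while new ones are reported.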
Ryan Ernst	393ab2d813	Rename o.e.common in libs/core to o.e.core (#73909 ) (#73920 ) When libs/core was created, several classes were moved from server's o.e.common package, but they were not moved to a new package. Split packages need to go away long term, so that Elasticsearch can even think about modularization. This commit moves all the classes under o.e.common in core to o.e.core. relates #73784 backport #73909	2021-06-08 14:17:44 -07:00
Luca Cavanna	95521018b0	Remove getMatchingFieldTypes method (#73655) FieldTypeLookup and MappingLookup expose the getMatchingFieldTypes method to look up matching field types by a string pattern. We have migrated ExistsQueryBuilder to instead rely on getMatchingFieldNames, hence we can go ahead and remove the remaining usages and the method itself. The remaining usages are to find specific field types from the mappings, specifically to eagerly load global ordinals and for the join field type. These are operations that are performed only once when loading the mappings, and may be refactored to work differently in the future. For now, we remove getMatchingFieldTypes and instead, for the two mentioned scenarios, call getMatchingFieldNames(*) and then getFieldType for each of the returned field names. This is a bit wasteful but performance can be sacrificed for these scenarios in favour of less code to maintain.	2021-06-03 11:43:28 +02:00
Nik Everett	8864e00d96	Add setting to disable aggs optimization (backport of #73620) (#73668) Sometimes our fancy "run this agg as a Query" optimizations end up slower than running the aggregation in the old way. We know that and use heuristics to disable the optimization in that case. But it turns out that the process of running the heuristics itself can be slow, depending on the query. Worse, changing the heuristics requires an upgrade, which means waiting. If the heuristics make a terrible choice folks need a quick way out. This adds such a way: a cluster level setting that contains a list of queries that are considered "too expensive" to try and optimize. If the top level query contains any of those queries we'll disable the "run as Query" optimization. The default for this setting is wildcard and term-in-set queries, which is fairly conservative. There are certainly wildcard and term-in-set queries that the optimization works well with, but there are other queries of that type that it works very badly with. So we're being careful. Better, you can modify this setting in a running cluster to disable the optimization if we find a new type of query that doesn't work well. Closes #73426	2021-06-02 12:04:18 -04:00
Alan Woodward	00ac2b12ba	Replace simpleMatchToFullName (#72674) (#73036) MappingLookup has a method simpleMatchToFullName that attempts to return all field names that match a given pattern; if no patterns match, then it returns a single-valued collection containing just the pattern that was originally passed in. This is a fairly confusing semantic. This PR replaces simpleMatchToFullName with two new methods: * getMatchingFieldNames(), which returns a set of all mapped field names that match a pattern. Calling getFieldType() with a name returned by this method is guaranteed to return a non-null MappedFieldType * getMatchingFieldTypes(), which returns a collection of all MappedFieldTypes in a mapping that match the passed-in pattern. This allows us to clean up several call-sites because we know that MappedFieldTypes returned from these calls will never be null. It also simplifies object field exists query construction.	2021-05-13 14:07:16 +01:00
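The difference in semantics described in the commit above can be illustrated with a small sketch. This is not the `MappingLookup` code: the Python function names are made up for the example, and `fnmatch.fnmatchcase` is only a stand-in for the wildcard matching Elasticsearch uses (it accepts more pattern syntax than a plain `*`).

```python
from fnmatch import fnmatchcase


def simple_match_to_full_name(mapped_fields, pattern):
    """Old semantic: fall back to the raw pattern when nothing matches."""
    matches = {name for name in mapped_fields if fnmatchcase(name, pattern)}
    return matches if matches else {pattern}


def get_matching_field_names(mapped_fields, pattern):
    """New semantic: only names that are really mapped, possibly empty."""
    return {name for name in mapped_fields if fnmatchcase(name, pattern)}


fields = {"user.name", "user.id", "message"}
print(sorted(get_matching_field_names(fields, "user.*")))
print(simple_match_to_full_name(fields, "missing*"))
print(get_matching_field_names(fields, "missing*"))
```

Because the second function never returns an unmapped name, a caller can look up the field type for every returned name without a null check, which is the guarantee the commit message describes.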
Nik Everett	a8999f0b77	update benchmark readme (#72620 ) Documents that version 2.0 of the async profiler doesn't seem to work with jmh. Fixes some syntax in another profiling example.	2021-05-03 11:36:28 -04:00
Rene Groeschke	59126ea871	Restructure buildsrc restructure buildsrc (7.x backport) (#72315 ) backports #72030 to 7.x Related to #71593 we move all build logic that is for elasticsearch build only into the org.elasticsearch.gradle.internal* packages This makes it clearer if build logic is considered to be used by external projects Ultimately we want to only expose TestCluster and PluginBuildPlugin logic to third party plugin authors. This is a very first step towards that direction.	2021-04-28 08:52:56 +02:00
Nik Everett	d46ea3cf54	Fix profiled global agg (backport of #71575 ) (#71634 ) This fixes the `global` aggregator when `profile` is enabled. It does so by removing all of the special case handling for `global` aggs in `AggregationPhase` and having the global aggregator itself perform the scoped collection using the same trick that we use in filter-by-filter mode of the `filters` aggregation. Closes #71098	2021-04-13 10:28:37 -04:00
Rory Hunter	76b887280b	Refresh formatter config (#71588) Write out the formatter config using the latest Eclipse. This has the effect of configuring assertion formatting properly, which has improved how some of our assertion messages are formatted. Also reconfigure how annotations are formatted, so that they are correctly line-wrapped.	2021-04-13 09:35:49 +01:00
Armin Braun	7928a3308d	Use ByteBufferStreamInput to Stream Byte Arrays (#71538 ) (#71591 ) For bulk operations that fall back to hotspot intrinsic code (reading short, int, long) using this stream brings a massive speedup. The added benchmark for reading `long` values sees a ~100x speedup in local benchmarking and the vLong read benchmark still sees a slightly under ~10x speedup. Also, this PR moves creation of the `StreamInput` out of the hot benchmark loop for all the bytes reference benchmarks to make the benchmark less noisy and more practically useful. (the `readLong` case using intrinsic operations is so fast that it wouldn't even show up in a profile relative to instantiating the stream otherwise). Relates work in #71181	2021-04-13 09:52:35 +02:00
Armin Braun	88b3d65ebc	Optimize Reading vInt and vLong from BytesReference (#71522 ) (#71531 ) Same optimization as in #71181 (also used by buffering Lucene DataInput implementations) but for the variable length encodings. Benchmarks show a ~50% speedup for the benchmarked mix of values for `vLong`. Generally this change helps the most with large values but shows a slight speedup even for the 1 byte length case by avoiding some indirection and bounds checking.	2021-04-10 07:21:47 +02:00
Armin Braun	db2b29d2b4	Add Benchmark for Long Reads from BytesReference Stream (#71310 ) (#71316 ) Relates #70800 This reproduces the issue reported in #70800 and demonstrates that the fix in #71181 brings about a 5x speedup for reading `long` from a bytes reference stream when backed by a paged bytes reference.	2021-04-06 10:38:20 +02:00
Alan Woodward	e3b3bd20a4	Add script parameter to long and double field mappers (#69531 ) (#71105 ) This commit adds a script parameter to long and double fields that makes it possible to calculate a value for these fields at index time. It uses the same script context as the equivalent runtime fields, and allows for multiple index-time scripted fields to cross-refer while still checking for indirection loops.	2021-03-31 15:28:47 +01:00
Jim Ferenczi	f6cadd4f7f	Remove the _parent_join metadata field (#70143) This commit removes the metadata field _parent_join that was needed to ensure that only one join field is used in a mapping. It is replaced with a validation at the field level. This change also fixes a [bug](https://github.com/elastic/kibana/issues/92960) in the handling of parent join fields in _field_caps. This metadata field throws an unexpected exception in [7.11](https://github.com/elastic/elasticsearch/pull/63878) when checking if the field is aggregatable. That's now fixed since this unused field has been removed.	2021-03-10 09:21:59 +01:00
Nik Everett	d033de0316	Modest memory savings in date_histogram>terms (backport of #68592) (#69305)
This saves 16 bytes of memory per bucket for some aggregations. Specifically, it kicks in when there is a parent bucket and we have a good estimate on its upper bound cardinality, and we have a good estimate on the per-bucket cardinality of this aggregation, and both those upper bounds will fit into a single `long`.
That sounds unlikely, but there is a fairly common case where we have it: a `date_histogram` followed by a `terms` aggregation powered by global ordinals. This is common enough that we already had at least two rally operations for it:
* `date-histo-string-terms-via-global-ords`
* `filtered-date-histo-string-terms-via-global-ords`
Running those rally tracks shows that the space savings yields a small but statistically significant performance bump. The 90th percentile service time drops by about 4% in the unfiltered case and 1% for the filtered case. That's not great but it is good to know saving 16 bytes doesn't slow us down.
```
|  50th percentile latency      | date-histo          | 3185.77 | 3028.65 | -157.118 | ms |
|  90th percentile latency      | date-histo          | 3237.07 | 3101.32 | -135.752 | ms |
| 100th percentile latency      | date-histo          | 3270.53 | 3178.7  | -91.8319 | ms |
|  50th percentile service time | date-histo          | 3181.55 | 3024.32 | -157.238 | ms |
|  90th percentile service time | date-histo          | 3232.91 | 3097.67 | -135.238 | ms |
| 100th percentile service time | date-histo          | 3266.63 | 3175.08 | -91.5494 | ms |
|  50th percentile latency      | filtered-date-histo | 1349.22 | 1331.94 | -17.2717 | ms |
|  90th percentile latency      | filtered-date-histo | 1402.71 | 1383.7  | -19.0131 | ms |
| 100th percentile latency      | filtered-date-histo | 1412.41 | 1397.7  | -14.7139 | ms |
|  50th percentile service time | filtered-date-histo | 1345.18 | 1326.2  | -18.9806 | ms |
|  90th percentile service time | filtered-date-histo | 1397.24 | 1378.14 | -19.1031 | ms |
| 100th percentile service time | filtered-date-histo | 1406.69 | 1391.63 | -15.0529 | ms |
```
The microbenchmarks for `LongKeyedBucketOrds`, the interface we're targeting, show a performance boost on the method in the path of about 13%. This is obviously not the entire hot path, given that the 13% savings translated to a 4% performance savings over the whole agg. But it's something.
```
Benchmark                                            Mode  Cnt   Score   Error  Units
multiBucketMany                                      avgt    5  10.038 ± 0.009  ns/op
multiBucketManySmall                                 avgt    5   8.738 ± 0.029  ns/op
singleBucketIntoMulti                                avgt    5   7.701 ± 0.073  ns/op
singleBucketIntoSingleImmutableBimorphicInvocation   avgt    5   6.160 ± 0.029  ns/op
singleBucketIntoSingleImmutableMonmorphicInvocation  avgt    5   6.571 ± 0.043  ns/op
singleBucketIntoSingleMutableBimorphicInvocation     avgt    5   7.714 ± 0.010  ns/op
singleBucketIntoSingleMutableMonmorphicInvocation    avgt    5   7.459 ± 0.017  ns/op
```
While I was touching the JMH benchmarks for `LongKeyedBucketOrds` I took the opportunity to try and make the runs that collect from a single bucket more comparable to the ones that collect from many buckets. It only seemed fair.	2021-02-19 17:15:36 -05:00
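The commit above relies on two upper bounds fitting "into a single `long`". One way to picture that is bit packing: put the parent bucket ordinal in the high bits and the per-bucket value in the low bits of one 64-bit key. The sketch below is an illustration under assumptions, not the `LongKeyedBucketOrds` implementation; the function names and the choice to stay within 63 bits (so the packed key stays non-negative in a signed 64-bit integer) are made up for the example.

```python
def bits_needed(max_value):
    """Number of bits required to represent values in [0, max_value]."""
    return max(1, max_value.bit_length())


def fits_in_long(max_owning_ord, max_value):
    """Assumed rule: both bounds must fit in 63 bits combined."""
    return bits_needed(max_owning_ord) + bits_needed(max_value) <= 63


def pack(owning_bucket_ord, value, value_bits):
    """Store the owning bucket ordinal above the low value_bits bits."""
    return (owning_bucket_ord << value_bits) | value


def unpack(packed, value_bits):
    """Recover (owning_bucket_ord, value) from a packed key."""
    return packed >> value_bits, packed & ((1 << value_bits) - 1)


value_bits = bits_needed(4095)        # values up to 4095 need 12 bits
key = pack(5, 1000, value_bits)
print(key, unpack(key, value_bits))
```

A single packed key can then be used where a pair of keys would otherwise be stored, which is one plausible source of the per-bucket savings the commit describes.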
Nik Everett	bb1f8bfdcd	Add benchmark racing scripts (backport of #68369 ) This adds a microbenchmark running our traditional `ScriptScoreQuery` race. This races Lucene Expressions, Painless, and a hand rolled implementation of `ScoreScript`. Through the magic of the async profiler, this revealed a few bottlenecks that hit painless that we likely can fix! Happy times. Co-authored-by: Rene Groeschke <rene@breskeby.com>	2021-02-03 12:30:35 -05:00
Mark Vieira	2d1e8b3abd	Update sources with new SSPL+Elastic-2.0 license headers As per the new licensing change for Elasticsearch and Kibana this commit moves existing Apache 2.0 licensed source code to the new dual license SSPL+Elastic license 2.0. In addition, existing x-pack code now uses the new version 2.0 of the Elastic license. Full changes include: - Updating LICENSE and NOTICE files throughout the code base, as well as those packaged in our published artifacts - Update IDE integration to now use the new license header on newly created source files - Remove references to the "OSS" distribution from our documentation - Update build time verification checks to no longer allow Apache 2.0 license header in Elasticsearch source code - Replace all existing Apache 2.0 license headers for non-xpack code with updated header (vendored code with Apache 2.0 headers obviously remains the same). - Replace all Elastic license 1.0 headers with new 2.0 header in xpack.	2021-02-02 18:07:23 -08:00
Nik Everett	f4ce451fc7	It's flame graph time! (backport of #68312) (#68399) Upgrade JMH to latest (1.26) to pick up its async profiler integration and update the documentation to include instructions for running the async profiler and making pretty pretty flame graphs.	2021-02-02 16:59:23 -05:00
Nik Everett	4fd9b1dd1b	Lower contention on requests with many aggs (backport of #66895) (#66941)
This lowers the contention on the `REQUEST` circuit breaker when building many aggregations on many threads by preallocating a chunk of breaker up front. This cuts down on the number of times we enter the busy loop in `ChildMemoryCircuitBreaker.limit`. Now we hit it one time when building aggregations. We still hit the busy loop if we collect many buckets.
We let the `AggregationBuilder` pick the size of the "chunk" that we preallocate but it doesn't have much to go on - not even the field types. But it is available in a convenient spot and the estimates don't have to be particularly accurate.
The benchmarks on my 12 core desktop are interesting:
```
Benchmark         (breaker)  Mode  Cnt    Score    Error  Units
sum                    noop  avgt   10    1.672 ±  0.042  us/op
sum                    real  avgt   10    4.100 ±  0.027  us/op
sum             preallocate  avgt   10    4.230 ±  0.034  us/op
termsSixtySums         noop  avgt   10   92.658 ±  0.939  us/op
termsSixtySums         real  avgt   10  278.764 ± 39.751  us/op
termsSixtySums  preallocate  avgt   10  120.896 ± 16.097  us/op
termsSum               noop  avgt   10    4.573 ±  0.095  us/op
termsSum               real  avgt   10    9.932 ±  0.211  us/op
termsSum        preallocate  avgt   10    7.695 ±  0.313  us/op
```
They show pretty clearly that not using the circuit breaker at all is faster. But we can't do that because we don't want to bring the node down on bad aggs. When there are many aggs (termsSixtySums) the preallocation claws back much of the performance. It even helps marginally when there are two aggs (termsSum). For a single agg (sum) we see a 130 nanosecond hit. Fine.
But these values are all pretty small. At best we're seeing a 160 microsecond savings. Not so on a 160 vCPU machine:
```
Benchmark         (breaker)  Mode  Cnt      Score       Error  Units
sum                    noop  avgt   10     44.956 ±     8.851  us/op
sum                    real  avgt   10    118.008 ±    19.505  us/op
sum             preallocate  avgt   10    241.234 ±   305.998  us/op
termsSixtySums         noop  avgt   10   1339.802 ±    51.410  us/op
termsSixtySums         real  avgt   10  12077.671 ± 12110.993  us/op
termsSixtySums  preallocate  avgt   10   3804.515 ±  1458.702  us/op
termsSum               noop  avgt   10     59.478 ±     2.261  us/op
termsSum               real  avgt   10    293.756 ±   253.854  us/op
termsSum        preallocate  avgt   10    197.963 ±    41.578  us/op
```
All of these numbers are larger because we're running all the CPUs flat out and we're seeing more contention everywhere. Even the "noop" breaker sees some contention, but I think it is mostly around memory allocation. Anyway, with many many (termsSixtySums) aggs we're looking at 8 milliseconds of savings by preallocating. Just by dodging the busy loop as much as possible. The error in the measurements there is substantial. Here are the runs:
```
real:
Iteration 1:  8679.417 ±(99.9%)  273.220 us/op
Iteration 2:  5849.538 ±(99.9%)  179.258 us/op
Iteration 3:  5953.935 ±(99.9%)  152.829 us/op
Iteration 4:  5763.465 ±(99.9%)  150.759 us/op
Iteration 5: 14157.592 ±(99.9%)  395.224 us/op
Iteration 1: 24857.020 ±(99.9%) 1133.847 us/op
Iteration 2: 24730.903 ±(99.9%) 1107.718 us/op
Iteration 3: 18894.383 ±(99.9%)  738.706 us/op
Iteration 4:  5493.965 ±(99.9%)  120.529 us/op
Iteration 5:  6396.493 ±(99.9%)  143.630 us/op
preallocate:
Iteration 1:  5512.590 ±(99.9%)  110.222 us/op
Iteration 2:  3087.771 ±(99.9%)  120.084 us/op
Iteration 3:  3544.282 ±(99.9%)  110.373 us/op
Iteration 4:  3477.228 ±(99.9%)  107.270 us/op
Iteration 5:  4351.820 ±(99.9%)   82.946 us/op
Iteration 1:  3185.250 ±(99.9%)  154.102 us/op
Iteration 2:  3058.000 ±(99.9%)  143.758 us/op
Iteration 3:  3199.920 ±(99.9%)   61.589 us/op
Iteration 4:  3163.735 ±(99.9%)   71.291 us/op
Iteration 5:  5464.556 ±(99.9%)   59.034 us/op
```
That variability from 5.5ms to 25ms is terrible. It makes me not particularly trust the 8ms savings from the report. But still, the preallocating method has much less variability between runs and almost all the runs are faster than all of the non-preallocated runs. Maybe the savings is more like 2 or 3 milliseconds, but still. Or maybe we should think of the savings as worst vs worst? If so it's 19 milliseconds.
Anyway, it's hard to measure how much this helps. But, certainly some. Closes #58647	2021-01-04 12:40:59 -05:00
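The preallocation idea in the commit above can be sketched with a toy model: reserve one chunk from the shared breaker up front, then serve small requests from that local chunk so the contended shared breaker is touched far less often. This is a simplified illustration, not `ChildMemoryCircuitBreaker`; the class names `CountingBreaker` and `PreallocatedBreaker`, the lock (standing in for the busy loop), and the chunk size are all assumptions.

```python
import threading


class CountingBreaker:
    """Toy shared breaker that counts how often it is touched."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0
        self.calls = 0
        self._lock = threading.Lock()  # stands in for the contended path

    def add(self, n):
        with self._lock:
            self.calls += 1
            if self.used + n > self.limit:
                raise MemoryError("would exceed breaker limit")
            self.used += n


class PreallocatedBreaker:
    """Reserves a chunk once, then hands it out without touching the shared breaker."""

    def __init__(self, shared, chunk):
        self.shared = shared
        shared.add(chunk)          # the single contended call
        self.remaining = chunk

    def add(self, n):
        if n <= self.remaining:
            self.remaining -= n    # served locally, no contention
            return
        # Chunk exhausted: charge only the shortfall to the shared breaker.
        self.shared.add(n - self.remaining)
        self.remaining = 0


shared = CountingBreaker(limit=10_000)
local = PreallocatedBreaker(shared, chunk=1024)
for _ in range(60):                # e.g. sixty small aggregations
    local.add(16)
print(shared.calls, shared.used)
```

In this model sixty small reservations cost one call on the shared breaker instead of sixty; only when the chunk runs out does the code fall back to the shared path.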
Rene Groeschke	709643e649	Move tasks in build scripts to task avoidance api (7.x backport) (#64990 ) * Move tasks in build scripts to task avoidance api (#64046) - Some trivial cleanup on build scripts - Change task referencing in build scripts to use task avoidance api where replacement is trivial.	2020-11-12 13:57:01 +01:00
Tanguy Leroux	87076c32e2	Determine shard size before allocating shards recovering from snapshots (#61906) (#63337) Determines the shard size of shards before allocating shards that are recovering from snapshots. It ensures during shard allocation that the target node that is selected as recovery target will have enough free disk space for the recovery event. This applies to regular restores, CCR bootstrap from remote, as well as mounting searchable snapshots. The InternalSnapshotInfoService is responsible for fetching snapshot shard sizes from repositories. It provides a getShardSize() method to other components of the system that can be used to retrieve the latest known shard size. If the latest snapshot shard size retrieval failed, getShardSize() returns ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE. While we'd like a better way to handle such failures, returning this value allows us to keep the existing behavior for now. Note that this PR does not address an issue (we already have today) where a replica is being allocated without knowing how much disk space is being used by the primary. Co-authored-by: Yannick Welsch <yannick@welsch.lu>	2020-10-06 18:37:05 +02:00
Jim Ferenczi	78a93dc18f	Request-level circuit breaker support on coordinating nodes (#62884 ) This commit allows coordinating nodes to account for the memory used to perform partial and final reduces of aggregations in the request circuit breaker. The search coordinator adds the memory that it used to save and reduce the results of shard aggregations to the request circuit breaker. Before any partial or final reduce, the memory needed to reduce the aggregations is estimated and a CircuitBreakingException is thrown if it exceeds the maximum memory allowed in this breaker. This size is estimated as roughly 1.5 times the size of the serialized aggregations that need to be reduced. This estimation can be completely off for some aggregations but it is corrected with the real size after the reduce completes. If the reduce is successful, we update the circuit breaker to remove the size of the source aggregations and replace the estimation with the serialized size of the newly reduced result. As a follow up we could trigger partial reduces based on the memory accounted in the circuit breaker instead of relying on a static number of shard responses. A simpler follow up that could be done in the meantime is to [reduce the default batch reduce size](https://github.com/elastic/elasticsearch/issues/51857) of blocking search requests to a more sane number. Closes #37182	2020-09-24 18:59:28 +02:00
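The accounting described in this message (reserve roughly 1.5 times the serialized size before a reduce, then swap the source sizes and the estimate for the real reduced size) can be sketched as below. This is an illustration only, not the Elasticsearch implementation: `ReduceAccountingSketch` and its methods are hypothetical names, and a plain `IllegalStateException` stands in for `CircuitBreakingException`.

```java
final class ReduceAccountingSketch {
    private final long limitBytes;
    private long usedBytes;

    ReduceAccountingSketch(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    long usedBytes() {
        return usedBytes;
    }

    /** Accounts for a buffered shard result using its serialized size. */
    void addShardResult(long serializedBytes) {
        reserve(serializedBytes);
    }

    /** Reserves about 1.5x the size of the buffered results before reducing them. */
    long beforeReduce(long sourceBytes) {
        long estimate = sourceBytes + sourceBytes / 2;
        reserve(estimate);
        return estimate;
    }

    /** Drops the source sizes and the estimate, keeping only the real reduced size. */
    void afterReduce(long sourceBytes, long estimate, long reducedBytes) {
        usedBytes += reducedBytes - sourceBytes - estimate;
    }

    private void reserve(long bytes) {
        if (usedBytes + bytes > limitBytes) {
            // Stand-in for the breaker tripping.
            throw new IllegalStateException("would use " + (usedBytes + bytes) + " of " + limitBytes + " bytes");
        }
        usedBytes += bytes;
    }
}
```

With a 1000 byte limit, two 100 byte shard results and a reduce that produces 120 bytes, the breaker peaks at 500 bytes during the reduce and settles at 120 bytes afterwards.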
Nik Everett	0a7f335215	Speed up writeVInt (backport of #62345 ) (#62419 ) This speeds up `StreamOutput#writeVInt` quite a bit which is nice because it is very commonly called when serializing aggregations. Well, when serializing anything. All "collections" serialize their size as a vint. Anyway, I was examining the serialization speeds of `StringTerms` and this saves about 30% of the write time for that. I expect it'll be useful other places.	2020-09-15 17:14:08 -04:00
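For readers unfamiliar with vints: the message does not show the format, so the sketch below assumes the common variable-length layout of seven payload bits per byte with the high bit as a "more bytes follow" flag. `VIntSketch` is a hypothetical name and this straightforward loop is not the optimized code from the commit; it only illustrates why small values such as collection sizes serialize to a single byte.

```java
import java.io.ByteArrayOutputStream;

final class VIntSketch {
    /** Encodes an int using 7 bits per byte; the high bit marks a continuation. */
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // low 7 bits plus continuation flag
            value >>>= 7;
        }
        out.write(value); // last byte, high bit clear
        return out.toByteArray();
    }

    /** Decodes a byte array produced by {@link #encode(int)}. */
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            shift += 7;
        }
        return result;
    }
}
```

Values up to 127 take one byte, values up to 16383 take two, and negative ints always take five.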
Nik Everett	dfd502f9ca	Rework checking if a year is a leap year (#60585 ) (#60790 ) This way is faster, saving about 8% on the microbenchmark that rounds to the nearest month. That is in the hot path for `date_histogram` which is a very popular aggregation so it seems worth it to at least try and speed it up a little. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-08-10 12:45:34 -04:00
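The message does not show the new check. As an illustration of what a cheap leap year test can look like, here is one common formulation that rejects most years with a single bit mask before doing any division. It is an assumption for the sake of example, not necessarily what the commit does, and `LeapYearSketch` is a hypothetical name.

```java
final class LeapYearSketch {
    /** Proleptic Gregorian leap year test: divisible by 4, and not by 100 unless also by 400. */
    static boolean isLeapYear(long year) {
        // (year & 3) == 0 is a divisibility-by-4 test that also works for negative years.
        return (year & 3) == 0 && (year % 100 != 0 || year % 400 == 0);
    }
}
```

It agrees with `java.time.Year.isLeap(long)`, which makes that method a convenient oracle when testing a hand-rolled variant.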
Nik Everett	81cba796e6	Add microbenchmark for LongKeyedBucketOrds (#58608 ) (#59459 ) I've always been confused by the strange behavior that I saw when working on #57304. Specifically, I saw switching from a bimorphic invocation to a monomorphic invocation to give us a 7%-15% performance bump. This felt bonkers to me. And, it also made me wonder whether it'd be worth looking into doing it everywhere. It turns out that, no, it isn't needed everywhere. This benchmark shows that a bimorphic invocation like:
```
LongKeyedBucketOrds ords = new LongKeyedBucketOrds.ForSingle();
ords.add(0, 0); <------ this line
```
is 19% slower than a monomorphic invocation like:
```
LongKeyedBucketOrds.ForSingle ords = new LongKeyedBucketOrds.ForSingle();
ords.add(0, 0); <------ this line
```
But only when the reference is mutable. In the example above, if `ords` is never changed then both perform the same. But if the `ords` reference is assigned twice then we start to see the difference:
```
immutable bimorphic    avgt 10 6.468 ± 0.045 ns/op
immutable monomorphic  avgt 10 6.756 ± 0.026 ns/op
mutable   bimorphic    avgt 10 9.741 ± 0.073 ns/op
mutable   monomorphic  avgt 10 8.190 ± 0.016 ns/op
```
So the conclusion from all this is that we've done the right thing: `auto_date_histogram` is the only aggregation in which `ords` isn't final and it is the only aggregation that forces monomorphic invocations. All other aggregations use an immutable bimorphic invocation. Which is fine. Relates to #56487	2020-07-13 17:22:46 -04:00
Rene Groeschke	a896df53ac	Remove misc dependency related deprecation warnings (7.x backport) (#59122 ) * Fix dependency related deprecations (#58892) * Fix classpath setup for forbiddenapi usage	2020-07-07 17:10:31 +02:00
Rene Groeschke	d952b101e6	Replace compile configuration usage with api (7.x backport) (#58721 ) * Replace compile configuration usage with api (#58451) - Use java-library instead of plugin to allow api configuration usage - Remove explicit references to runtime configurations in dependency declarations - Make test runtime classpath input for testing convention - required as java library will by default not have a built jar file - jar file is now an explicit input of the task and gradle will ensure it's properly built * Fix compile usages in 7.x branch	2020-06-30 15:57:41 +02:00
Rene Groeschke	abc72c1a27	Unify dependency licenses task configuration (#58116 ) (#58274 ) - Remove duplicate dependency configuration - Use task avoidance api across the build - Remove redundant licensesCheck config	2020-06-18 08:15:50 +02:00
Nik Everett	bd4b9dd10e	Speed up time interval rounding around dst (backport #56371 ) (#56396 ) When an index spans a daylight savings time transition we can't use our optimization that rewrites the requested time zone to a fixed time zone and instead we used to fall back to a java.time based rounding implementation. In #55559 we optimized "time unit" rounding. This optimizes "time interval" rounding. The java.time based implementation is about 1650% slower than the rounding implementation for a fixed time zone. This replaces it with a similar optimization that is only about 30% slower than the fixed time zone. The java.time implementation allocates a ton of short lived objects but the optimized implementation doesn't. So it might end up being faster than the microbenchmarks imply.	2020-05-08 13:39:27 -04:00
Nik Everett	e35919d3b8	Optimize date_histograms across daylight savings time (backport of #55559 ) (#56334 ) Rounding dates on a shard that contains a daylight savings time transition is currently something like 1400% slower than when a shard contains dates only on one side of the DST transition. And it makes a ton of short lived garbage. This replaces that implementation with one that benchmarks to having around 30% overhead instead of the 1400%. And it doesn't generate any garbage per search hit. Some background: There are two ways to round in ES:
* Round to the nearest time unit (Day/Hour/Week/Month/etc)
* Round to the nearest time interval (3 days/2 weeks/etc)
I'm only optimizing the first one in this change and plan to do the second in a follow up. It turns out that rounding to the nearest unit really is two problems: when the unit rounds to midnight (day/week/month/year) and when it doesn't (hour/minute/second). Rounding to midnight is consistently about 25% faster than rounding to individual hours or minutes. This optimization relies on being able to usually figure out what the minimum and maximum dates are on the shard. This is similar to an existing optimization where we rewrite time zones that aren't fixed (think America/New_York and its daylight savings time transitions) into fixed time zones so long as there isn't a daylight savings time transition on the shard (UTC-5 or UTC-4 for America/New_York). Once I implement time interval rounding the time zone rewriting optimization should no longer be needed. This optimization doesn't come into play for `composite` or `auto_date_histogram` aggs because neither have been migrated to the new `DATE` `ValuesSourceType` which is where that range lookup happens. When they are they will be able to pick up the optimization without much work. I expect this to be substantial for `auto_date_histogram` but less so for `composite` because it deals with fewer values.
Note: My 30% overhead figure comes from small numbers of daylight savings time transitions. That overhead grows logarithmically with the number of transitions. When there are two thousand years worth of transitions my algorithm ends up being 250% slower than rounding without a time zone, but java.time is 47000% slower at that point, allocating memory as fast as it possibly can.	2020-05-07 09:10:51 -04:00
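The "existing optimization" mentioned in this message, rewriting a zone like America/New_York to a fixed offset when the shard's dates never cross a transition, can be sketched with plain `java.time`. This is a rough illustration under stated assumptions, not the Elasticsearch code: `FixedZoneRewriteSketch` is a hypothetical name, and the minimum and maximum instants are assumed to be known up front.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.zone.ZoneOffsetTransition;
import java.time.zone.ZoneRules;

final class FixedZoneRewriteSketch {
    /**
     * Returns a fixed offset if the zone has no offset transition between
     * min and max (inclusive), otherwise returns the zone unchanged.
     */
    static ZoneId rewrite(ZoneId zone, Instant min, Instant max) {
        ZoneRules rules = zone.getRules();
        if (rules.isFixedOffset()) {
            return rules.getOffset(min);
        }
        // nextTransition returns the first transition strictly after min, or null if none.
        ZoneOffsetTransition next = rules.nextTransition(min);
        if (next == null || next.getInstant().isAfter(max)) {
            return rules.getOffset(min); // same offset across the whole range
        }
        return zone; // a transition falls inside the range, keep the full rules
    }
}
```

For America/New_York, a range covering only January 2020 rewrites to UTC-5 and a range covering only June 2020 rewrites to UTC-4, while a range from January to June keeps the original zone because the March transition falls inside it.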
Ryan Ernst	29b70733ae	Use task avoidance with forbidden apis (#55034 ) Currently forbidden apis accounts for 800+ tasks in the build. These tasks are aggressively created by the plugin. In forbidden apis 3.0, we will get task avoidance (https://github.com/policeman-tools/forbidden-apis/pull/162), but we need to use the same task avoidance mechanisms ourselves to not trigger these task creations. This commit does that for our forbidden apis usages, in preparation for upgrading to 3.0 when it is released.	2020-04-15 13:27:53 -07:00
Tanguy Leroux	4d36917e52	Merge feature/searchable-snapshots branch into 7.x (#54803 ) (#54825 ) This is a backport of #54803 for 7.x. This pull request cherry picks the squashed commit from #54803 with the additional commits: `6f50c92` which adjusts master code to 7.x `a114549` to mute a failing ILM test (#54818) `48cbca1` and `50186b2` that clean up and fix the previous test `aae12bb` that adds a missing feature flag (#54861) `6f330e3` that adds missing serialization bits (#54864) `bf72c02` that adjusts the version in YAML tests `a51955f` that adds some plumbing for the transport client used in integration tests Co-authored-by: David Turner <david.turner@elastic.co> Co-authored-by: Yannick Welsch <yannick@welsch.lu> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> Co-authored-by: Andrei Dan <andrei.dan@elastic.co>	2020-04-07 13:28:53 +02:00
Jason Tedor	5fcda57b37	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 17:24:38 -04:00
Rory Hunter	39de995740	Exclude generated source from benchmarks formatting (#52968 ) IDEs can sometimes run annotation processors that leave files in `src/main/generated/*/.java`, causing Spotless to complain. Even though this path ought not to exist, exclude it anyway in order to avoid spurious failures.	2020-02-28 20:55:52 +00:00
Rory Hunter	d863c510da	Autoformat :qa:os and :benchmarks (#52816 ) Add `:qa:os` and `:benchmarks` to the list of automatically formatted projects, and apply some manual fix-ups to polish it up. In particular, I noticed that `Files.write(...)` when passed a list will automatically apply a UTF-8 encoding and write a newline after each line, making it easier to use than FileUtils.append. It's even available from 1.8. Also, in the Allocators class, a number of methods declared thrown exceptions that IntelliJ reported were never thrown, and as far as I could see this is true, so I removed the exceptions.	2020-02-28 14:48:04 +00:00
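The `Files.write(...)` behavior mentioned in this message can be seen in a few lines. A minimal sketch: `FilesWriteSketch` and `demo` are hypothetical names, and a temporary file stands in for whatever file the build actually appends to.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

final class FilesWriteSketch {
    /** Appends lines to a temp file in two calls, then reads everything back. */
    static List<String> demo() {
        try {
            Path file = Files.createTempFile("files-write-sketch", ".txt");
            try {
                // Passing a list encodes as UTF-8 and adds a line separator after every element.
                Files.write(file, Arrays.asList("first", "second"), StandardOpenOption.APPEND);
                Files.write(file, Arrays.asList("third"), StandardOpenOption.APPEND);
                return Files.readAllLines(file); // also UTF-8 by default
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because each element is followed by a line separator, repeated appends never run two entries together on one line.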
Maria Ralli	ba8d6d1fb5	Remove Xlint exclusions from gradle files Backport of #52542. This commit is part of issue #40366 to remove disabled Xlint warnings from gradle files. In particular, it removes the Xlint exclusions from the following files: - benchmarks/build.gradle - client/client-benchmark-noop-api-plugin/build.gradle - x-pack/qa/rolling-upgrade/build.gradle - x-pack/qa/third-party/active-directory/build.gradle - modules/transport-netty4/build.gradle For the first three files no code adjustments were needed. For x-pack/qa/third-party/active-directory move the suppression to the code level. For transport-netty4 replace the variable arguments with ArrayLists and remove any redundant casts.	2020-02-20 14:12:05 +00:00
Rory Hunter	c46a0e8708	Apply 2-space indent to all gradle scripts (#49071 ) Backport of #48849. Update `.editorconfig` to make the Java settings the default for all files, and then apply a 2-space indent to all `*.gradle` files. Then reformat all the files.	2019-11-14 11:01:23 +00:00

99 commits