Optimize object parsers a little by extracting cold paths, removing some unnecessary
lambda wrapping, and making other small improvements.
Also fixes a very expensive use of these APIs in Phase by moving from a very hot stream
instantiation to a standard loop, as sketched below.
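As a hedged illustration of that change (generic names, not the actual Phase code): a `Stream` pipeline instantiated on a hot path allocates a spliterator, pipeline objects, and lambdas on every call, while a plain loop does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class PhaseSketch {
    interface Step {
        String name();
    }

    // Before: a new stream pipeline is instantiated on every invocation.
    static List<String> namesViaStream(List<Step> steps) {
        return steps.stream().map(Step::name).collect(Collectors.toList());
    }

    // After: a standard loop with a pre-sized list and no pipeline allocation.
    static List<String> namesViaLoop(List<Step> steps) {
        List<String> names = new ArrayList<>(steps.size());
        for (Step step : steps) {
            names.add(step.name());
        }
        return names;
    }
}
```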
A recent change to the deprecation logs added the capability to emit deprecations at either
critical or warning level (#77482).
However, deprecated settings always log at the critical level, with no way to express that a
setting's deprecation is only a warning.
This commit exposes the ability to set the deprecation level when deprecating a setting.
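A minimal sketch of what this enables; `Property.DeprecatedWarning` is the name suggested by this change, alongside the existing `Property.Deprecated`, and may differ in detail:

```java
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Setting.Property;

class DeprecatedSettingsSketch {
    // Logs at CRITICAL when used: the previous (and still default) behaviour.
    static final Setting<Boolean> OLD_FLAG =
        Setting.boolSetting("my_plugin.old_flag", false, Property.NodeScope, Property.Deprecated);

    // Logs at WARN: deprecated, but only as a warning.
    static final Setting<Boolean> SOFT_FLAG =
        Setting.boolSetting("my_plugin.soft_flag", false, Property.NodeScope, Property.DeprecatedWarning);
}
```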
Relates #78781
* Move xcontent filtering tests (#79298)
* Move xcontent filtering tests
Moves the xcontent filtering tests to the xcontent project because it's
testing code *in* the xcontent project.
* More clear
* Spotless
* Fixup
Found this when benchmarking large cluster states. When serializing collections we'd mostly
not take any advantage of what we know about the collection contents (like we do in `StreamOutput`).
This PR adds a couple of helpers to `XContentBuilder`, similar to what we have on `StreamOutput`,
to allow for faster serialization by avoiding the writer lookup and some self-reference checks.
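A sketch of the kind of helper this means (the method name is illustrative, not the exact API): a typed loop writes each element through the direct `String` overload instead of sending every value through the generic `Object` writer lookup.

```java
import java.io.IOException;
import java.util.List;
import org.elasticsearch.common.xcontent.XContentBuilder;

class XContentHelperSketch {
    // Hypothetical typed helper in the spirit of the change.
    static XContentBuilder stringListField(XContentBuilder builder, String name, List<String> values) throws IOException {
        builder.startArray(name);
        for (String value : values) {
            builder.value(value); // direct overload: no per-value writer lookup or self-reference check
        }
        return builder.endArray();
    }
}
```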
* Memory efficient xcontent filtering (backport of #77154)
I found myself needing support for something like `filter_path` on
`XContentParser`. It was simple enough to plug it in so I did. Then I
realized that it might offer more memory efficient source filtering
(#25168) so I put together a quick benchmark comparing the source
filtering that we do in `_search`.
Filtering using the parser is about 33% faster than how we filter now
when you select a single field from a 300 byte document:
```
Benchmark                                          (excludes)  (includes)  (source)  Mode  Cnt     Score    Error  Units
FetchSourcePhaseBenchmark.filterObjects                               message     short  avgt    5  2360.342 ±  4.715  ns/op
FetchSourcePhaseBenchmark.filterXContentOnBuilder                     message     short  avgt    5  2010.278 ± 15.042  ns/op
FetchSourcePhaseBenchmark.filterXContentOnParser                      message     short  avgt    5  1588.446 ± 18.593  ns/op
```
The top line is the way we filter now. The middle line is adding a
filter to `XContentBuilder` - something we can do right now without any
of my plumbing work. The bottom line is filtering on the parser,
requiring all the new plumbing.
This isn't particularly impressive. 33% *sounds* great! But 700
nanoseconds per document isn't going to cut into anyone's search times.
If you fetch a thousand documents that's 0.7 milliseconds of savings.
But we mostly advise folks to use source filtering on fetch when the
source is large and you only want a small part of it. So I tried when
the source is about 4.3kb and you want a single field:
```
Benchmark                                          (excludes)  (includes)      (source)  Mode  Cnt     Score     Error  Units
FetchSourcePhaseBenchmark.filterObjects                               message  one_4k_field  avgt    5  5957.128 ± 117.402  ns/op
FetchSourcePhaseBenchmark.filterXContentOnBuilder                     message  one_4k_field  avgt    5  4999.073 ±  96.003  ns/op
FetchSourcePhaseBenchmark.filterXContentonParser                      message  one_4k_field  avgt    5  3261.478 ±  48.879  ns/op
```
That's 45% faster. Put another way, 2.7 microseconds a document. Not
bad!
But have a look at how things come out when you want a single field from
a 4 *megabyte* document:
```
Benchmark                                          (excludes)  (includes)      (source)  Mode  Cnt        Score        Error  Units
FetchSourcePhaseBenchmark.filterObjects                               message  one_4m_field  avgt    5  8266343.036 ± 176197.077  ns/op
FetchSourcePhaseBenchmark.filterXContentOnBuilder                     message  one_4m_field  avgt    5  6227560.013 ±  68306.318  ns/op
FetchSourcePhaseBenchmark.filterXContentonParser                      message  one_4m_field  avgt    5  1617153.472 ±  80164.547  ns/op
```
These documents are very large. I've encountered documents like them in
real life, but they've always been the outlier for me. But a 6.5
millisecond per document savings ain't anything to sneeze at.
Take a look at what you get when I turn on gc metrics:
```
FetchSourcePhaseBenchmark.filterObjects                           message  one_4m_field  avgt    5  7036097.561 ± 84721.312   ns/op
FetchSourcePhaseBenchmark.filterObjects:·gc.alloc.rate            message  one_4m_field  avgt    5     2166.613 ±    25.975  MB/sec
FetchSourcePhaseBenchmark.filterXContentOnBuilder                 message  one_4m_field  avgt    5  6104595.992 ± 55445.508   ns/op
FetchSourcePhaseBenchmark.filterXContentOnBuilder:·gc.alloc.rate  message  one_4m_field  avgt    5     2496.978 ±    22.650  MB/sec
FetchSourcePhaseBenchmark.filterXContentonParser                  message  one_4m_field  avgt    5  1614980.846 ± 31716.956   ns/op
FetchSourcePhaseBenchmark.filterXContentonParser:·gc.alloc.rate   message  one_4m_field  avgt    5        1.755 ±     0.035  MB/sec
```
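For context, a hedged sketch of the two sides being compared. The builder-side filtering overload already exists; the parser-side entry point is the new plumbing, so the call shown in the comment is illustrative and its real signature may differ.

```java
import java.util.Set;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentType;

class SourceFilteringSketch {
    // Builder-side filtering: everything is parsed, and the filter is applied
    // while writing the copy back out.
    static XContentBuilder filteredBuilder(Set<String> includes, Set<String> excludes) throws Exception {
        return XContentBuilder.builder(XContentType.JSON.xContent(), includes, excludes);
    }

    // Parser-side filtering (the new plumbing): the filter is applied while
    // *reading*, so excluded subtrees are skipped rather than materialized,
    // which is where the allocation savings in the gc table come from, e.g.:
    //   XContentParser parser = xContent.createParser(/* config with includes/excludes */, source);
}
```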
* Fixup benchmark for 7.x
* Reformatting to keep Checkstyle happy after formatting
* Configure spotless everywhere, and disable the tasks if necessary
* Add XContentBuilder helpers, fix test
* Fix copyCurrentStructure(MapXContentParser) (#76357)
This stops `MapXContentParser` from throwing an
`UnsupportedOperationException` when passed as an argument to
`XContentBuilder#copyCurrentStructure`. This is mostly useful in tests
where `Map` is a convenient way to talk about structured configuration
but the production APIs need the map to be embedded into a larger blob
of `XContent`.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
* Fixup
Co-authored-by: Nik Everett <nik9000@gmail.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Today, writing a Writeable value to XContent in Base64 format performs
these steps: (1) create a BytesStreamOutput, (2) write the Writeable to that
output, (3) encode a copy of the bytes from that output stream, (4) create a
string from the encoded bytes, (5) write the encoded string to XContent.
These steps allocate/use about five times as much memory as writing the
encoded chars directly to the XContent output.
This API would help reduce memory usage when storing a large response
of an async search.
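A generic sketch of the streaming idea using the JDK's `Base64` encoder and ES's `OutputStreamStreamOutput` (not necessarily the exact API this change adds): the bytes flow Writeable -> encoder -> sink, so no intermediate copy or encoded `String` is materialized.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Base64;
import org.elasticsearch.common.io.stream.OutputStreamStreamOutput;
import org.elasticsearch.common.io.stream.Writeable;

class Base64StreamingSketch {
    // Encode while writing; closing the stream flushes the Base64 padding.
    static void writeBase64(Writeable value, OutputStream sink) throws IOException {
        try (OutputStreamStreamOutput out = new OutputStreamStreamOutput(Base64.getEncoder().wrap(sink))) {
            value.writeTo(out);
        }
    }
}
```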
Relates #67594
ParseField is part of the x-content lib, yet it doesn't exist under the
same root package as the rest of the lib. This commit moves the class to
the appropriate package.
relates #73784
When libs/core was created, several classes were moved from server's
o.e.common package, but they were not moved to a new package. Split
packages need to go away long term, so that Elasticsearch can even think
about modularization. This commit moves all the classes under o.e.common
in core to o.e.core.
relates #73784
backport #73909
The recent upgrade of the Azure SDK has caused a few test failures that
have been difficult to debug and do not yet have a fix. In particular, a
change to netty reactor resolution (reactor/reactor-netty#1655) is
implicated. We need to wait for a fix for that issue, so this reverts
commit f454cefc26.
relates #73493
This commit upgrades the Azure SDK to 12.11.0 and Jackson to 2.12.2. The
Jackson upgrade must happen at the same time due to Azure depending on
this new version of Jackson.
closes #66555, closes #67214
backport #72995
backport #73011
Co-authored-by: Francisco Fernández Castaño <francisco.fernandez.castano@gmail.com>
Co-authored-by: Mark Vieira <portugee@gmail.com>
The majority of field mappers read a single value from their positioned
XContentParser, and do not need to call nextToken. There is a general
assumption that the same holds for any multifields defined on them, and
so the XContentParser is passed down to their multifields builder as-is.
This assumption does not hold for mappers that accept json objects,
and so we have a second mechanism for passing values around called
'external values', where a mapper can set a specific value on its context
and child mappers can then check for these external values before reading
from xcontent. The disadvantage of this is that every field mapper now
needs to check its context for external values. Because the values are
defined by their java class, we can also know that in the vast majority of
cases this functionality is unused. We have only two mappers that actually
make use of this, CompletionFieldMapper and GeoPointFieldMapper.
This commit removes external values entirely, and replaces it with the ability
to pass a modified XContentParser to multifields. FieldMappers can just check
the parser attached to their context for data and don't need to worry about
multiple sources.
Plugins implementing field mappers will need to take the removal of external
values into account. Implementations that are passing structured objects
as external values should instead use ParseContext.switchParser and
wrap the objects using MapXContentParser.wrapObject().
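A short migration sketch using the two calls named above (the wrapped map and the return type are illustrative):

```java
import java.io.IOException;
import java.util.Map;
import org.elasticsearch.common.xcontent.XContentParser;
import org.elasticsearch.common.xcontent.support.MapXContentParser;
import org.elasticsearch.index.mapper.ParseContext;

class ExternalValueMigrationSketch {
    // Instead of setting an external value on the context, wrap the structured
    // object in a parser and hand the switched context to child mappers.
    static ParseContext wrapAndSwitch(ParseContext context) throws IOException {
        XContentParser wrapped = MapXContentParser.wrapObject(Map.of("lat", 41.12, "lon", -71.34));
        return context.switchParser(wrapped);
    }
}
```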
GeoPointFieldMapper passes on a fake parser that just wraps its input data
formatted as a geohash; CompletionFieldMapper has a slightly more complicated
parser that in general wraps its metadata, but if textOrNull() is called without
the parser being advanced just returns its text input.
Relates to #56063
The various geo field mappers are organised in a hierarchy that shares
parsing and indexing code. This ends up over-complicating things,
particularly when we have some mappers that accept multiple values
and others that only accept singletons. It also leads to confusing
ignore_malformed behaviour: geo fields will ignore all values if a
single one is badly formed, while all other field mappers will only
ignore the problem value and index the rest. Finally, this
structure makes adding index-time scripts to geo_point needlessly
complex.
This commit refactors the indexing logic of the hierarchy to move the
individual value indexing logic into the concrete implementations,
and aligns the ignore_malformed behaviour with that of other mappers.
It contains two breaking changes:
* The geo field mappers no longer check for external field values on the
parse context. This added considerable complication to the refactored
parse methods, and is unused anywhere in our codebase, but may
impact plugin-based field mappers which expect to use geo fields
as multifields
* The geo_point field mapper now passes geohashes to its multifields
one-by-one, instead of formatting them into a comma-delimited
string and passing them all at once. Completion multifields using
this as an input should still behave as normal because by default
they would split this combined geohash string on the commas in any
case, but keyword subfields may look different.
Fixes #69601
This change adds the ability to call value on an XContentBuilder and consume a boolean[]. This was
missing from the set of writers handled by the unknown-value call.
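A minimal usage sketch of the new overload:

```java
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentFactory;

class BooleanArraySketch {
    static XContentBuilder flags() throws Exception {
        XContentBuilder builder = XContentFactory.jsonBuilder();
        builder.startObject();
        builder.field("flags");
        builder.value(new boolean[] { true, false, true }); // previously unhandled by the unknown-value writer
        return builder.endObject();
    }
}
```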
Part 8.
We have an in-house rule to compare explicitly against `false` instead
of using the logical not operator (`!`). However, this hasn't
historically been enforced, meaning that there are many violations in
the source at present.
We now have a Checkstyle rule that can detect these cases, but before we
can turn it on, we need to fix the existing violations. This is being
done over a series of PRs, since there are a lot to fix.
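The rule in one line, as a small illustration:

```java
class BooleanStyleSketch {
    static void handle(boolean enabled) {
        if (enabled == false) {   // preferred under the in-house rule
            fallback();
        }
        // if (!enabled) { ... }  // flagged by the new Checkstyle rule
    }

    static void fallback() {}
}
```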
As per the new licensing change for Elasticsearch and Kibana this commit
moves existing Apache 2.0 licensed source code to the new dual license
SSPL+Elastic license 2.0. In addition, existing x-pack code now uses
the new version 2.0 of the Elastic license. Full changes include:
- Updating LICENSE and NOTICE files throughout the code base, as well
as those packaged in our published artifacts
- Update IDE integration to now use the new license header on newly
created source files
- Remove references to the "OSS" distribution from our documentation
- Update build time verification checks to no longer allow Apache 2.0
license header in Elasticsearch source code
- Replace all existing Apache 2.0 license headers for non-xpack code
with updated header (vendored code with Apache 2.0 headers obviously
remains the same).
- Replace all Elastic license 1.0 headers with new 2.0 header in xpack.
We have an in-house rule to compare explicitly against `false` instead
of using the logical not operator (`!`). However, this hasn't
historically been enforced, meaning that there are many violations in
the source at present.
We now have a Checkstyle rule that can detect these cases, but before we
can turn it on, we need to fix the existing violations. This is being
done over a series of PRs, since there are a lot to fix.
Transform writes dates as epoch millis; in some cases this does not work for historic data or is
unsupported. Dates should be written as dates. With this PR transform starts writing dates in ISO
format, but since existing transforms might rely on the old format, it provides backwards
compatibility for old jobs as well as a setting to write dates as epoch millis.
fixes #63787
backport #65584
A follow-up to #63071, which missed the XContentType.fromMediaType
method. That method also has to remove the vendor-specific substrings
(vnd.elasticsearch+ and the compatible-with parameter) from the media
type value.
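A hedged sketch of that normalization (the string handling is illustrative, not the actual implementation):

```java
class MediaTypeSketch {
    // Strip the vendored prefix and the compatible-with parameter so that
    // "application/vnd.elasticsearch+json;compatible-with=7" resolves like
    // plain "application/json".
    static String canonical(String mediaType) {
        return mediaType
            .replace("vnd.elasticsearch+", "")
            .replaceAll("\\s*;\\s*compatible-with=\\d+", "");
    }
}
```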
relates #51816
* Move tasks in build scripts to task avoidance api (#64046)
- Some trivial cleanup on build scripts
- Change task referencing in build scripts to use task avoidance api
where replacement is trivial.
A 7.x client can pass a media type with a version, which will return the
7.x version of the API in ES 8.
In ES server 7 this media type should be accepted, but it serves the same
version of the API (7.x).
relates #61427
1. Get rid of the capturing lambda on the hot path that inlines very badly
2. Remove as many bounds checks as possible, thereby reducing method size and improving inlining
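A generic illustration of point 1 (not the actual parser code): a lambda that captures a local allocates a fresh instance each time the enclosing method runs and gives the JIT a harder inlining job than a plain loop body.

```java
import java.util.List;
import java.util.function.Consumer;

class HotPathSketch {
    // Before: `prefix` is captured, so a new Consumer is allocated per call.
    static void emitCapturing(List<String> values, String prefix, Consumer<String> sink) {
        values.forEach(v -> sink.accept(prefix + v));
    }

    // After: a standard loop; nothing extra is allocated on the hot path.
    static void emitLoop(List<String> values, String prefix, Consumer<String> sink) {
        for (String v : values) {
            sink.accept(prefix + v);
        }
    }
}
```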
Wrapping a `BytesArray` in a `StreamInput` for deserialization is inefficient.
This forces Jackson to internally buffer (i.e. copy) all bytes from the `BytesArray`
before deserializing, adding overhead for copying the bytes and managing the buffers.
This commit special-cases `BytesArray`, the most common concrete type of
`BytesReference`, in a number of spots, parsing it more efficiently (see the sketch below).
It also improves `String` parsing by using the more efficient direct `String` parsing APIs.
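A hedged sketch of the special-casing pattern (class names as in the codebase of this era; the registry and handler are passed through as placeholders):

```java
import java.io.IOException;
import org.elasticsearch.common.bytes.BytesArray;
import org.elasticsearch.common.bytes.BytesReference;
import org.elasticsearch.common.xcontent.DeprecationHandler;
import org.elasticsearch.common.xcontent.NamedXContentRegistry;
import org.elasticsearch.common.xcontent.XContent;
import org.elasticsearch.common.xcontent.XContentParser;

class BytesArraySketch {
    // Hand Jackson the backing array directly instead of a wrapping
    // StreamInput that it would have to re-buffer internally.
    static XContentParser parser(XContent xContent, BytesReference bytes,
                                 NamedXContentRegistry registry, DeprecationHandler handler) throws IOException {
        if (bytes instanceof BytesArray) {
            BytesArray array = (BytesArray) bytes;
            return xContent.createParser(registry, handler, array.array(), array.offset(), array.length());
        }
        return xContent.createParser(registry, handler, bytes.streamInput());
    }
}
```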
We have various ways of copying between two streams and handling thread-local
buffers throughout the codebase. This commit unifies a number of them and
removes buffer allocations in many spots.
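A generic sketch of the unified pattern: one copy loop sharing a reusable thread-local buffer rather than each call site allocating its own.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StreamCopySketch {
    // One buffer per thread, reused across copies instead of allocated per call.
    private static final ThreadLocal<byte[]> BUFFER = ThreadLocal.withInitial(() -> new byte[8 * 1024]);

    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = BUFFER.get();
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```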
This PR ensures that the same roles are cached only once, even when they come from different API keys.
API key role descriptors and limited role descriptors are now saved in Authentication#metadata
as raw bytes instead of a deserialised Map<String, Object>.
Hashes of these bytes are used as keys for API key roles. Only when the required role is not found
in the cache will they be deserialised to build the RoleDescriptors. The deserialisation goes directly
from raw bytes to RoleDescriptors without the current detour of
"bytes -> Map -> bytes -> RoleDescriptors".
* Replace compile configuration usage with api (#58451)
- Use java-library instead of plugin to allow api configuration usage
- Remove explicit references to runtime configurations in dependency declarations
- Make test runtime classpath input for testing convention
- required as java-library will by default not build the jar file
- the jar file is now an explicit input of the task and Gradle will ensure it is properly built
* Fix compile usages in 7.x branch
* Remove usage of deprecated testCompile configuration
* Replace testCompile usage by testImplementation
* Make testImplementation non transitive by default (as we did for testCompile)
* Update CONTRIBUTING about using testImplementation for test dependencies
* Fail on testCompile configuration usage
Until 7.7 we used to ignore `null` values for the `bool` query's `minimum_should_match`
parameter and also for the `must`, `must_not`, `should` and `filter` clauses.
An internal refactoring has changed this, so now we get a parsing error. While `null`
should not be a common value here, we should restore the old behaviour for bwc for now.
Closes #56812
This is another part of the breakup of the massive BuildPlugin. This PR
moves the code for configuring publications to a separate plugin. Most
of the time these publications are jar files, but this also supports the
zip publication we have for integ tests.
Another Jackson release is available. There are some CVEs addressed,
none of which impact us, but since we can now bump Jackson easily, let
us move along with the train to avoid the false positives from security
scanners.
Introduces InstantiatingObjectParser which is similar to the
ConstructingObjectParser, but instantiates the object using its constructor
instead of a builder function.
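A usage sketch based on the description above; the builder-style API is an assumption about the exact shape.

```java
import org.elasticsearch.common.ParseField;
import org.elasticsearch.common.xcontent.InstantiatingObjectParser;

import static org.elasticsearch.common.xcontent.ConstructingObjectParser.constructorArg;

class InstantiatingParserSketch {
    public static class Thing {
        public Thing(String name, int number) { /* fields elided */ }
    }

    // Declared fields are matched to Thing's constructor arguments instead of
    // being routed through a hand-written builder function.
    static final InstantiatingObjectParser<Thing, Void> PARSER;
    static {
        InstantiatingObjectParser.Builder<Thing, Void> builder =
            InstantiatingObjectParser.builder("thing", true, Thing.class);
        builder.declareString(constructorArg(), new ParseField("name"));
        builder.declareInt(constructorArg(), new ParseField("number"));
        PARSER = builder.build();
    }
}
```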
Closes #52499
I've noticed that a lot of our tests are using deprecated static methods
from the Hamcrest matchers. While this is not a big deal in any
objective sense, it seems like a small good thing to reduce compilation
warnings and be ready for a new release of the matcher library if we
need to upgrade. I've also switched a few other methods in tests that
have drop-in replacements.
Currently forbidden apis accounts for 800+ tasks in the build. These
tasks are aggressively created by the plugin. In forbidden apis 3.0, we
will get task avoidance
(https://github.com/policeman-tools/forbidden-apis/pull/162), but we
need to ourselves use the same task avoidance mechanisms to not trigger
these task creations. This commit does that for our forbidden apis
usages, in preparation for upgrading to 3.0 when it is released.
It's simple to deprecate a field used in an ObjectParser just by adding deprecation
markers to the relevant ParseField objects. The warnings themselves don't currently
have any context - they simply say that a deprecated field has been used, but not
where in the input xcontent it appears. This commit adds the parent object parser
name and XContentLocation to these deprecation messages.
Note that the context is automatically stripped from warning messages when they
are asserted on by integration tests and REST tests, because the randomization of
xcontent type during these tests means that the XContentLocation is not constant.
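For reference, deprecating a name is just a matter of listing it in the ParseField (names here are illustrative); with this change the warning also carries the parser name and XContentLocation:

```java
import org.elasticsearch.common.ParseField;

class DeprecatedFieldSketch {
    // "unit" is still accepted but now warns with context along the lines of:
    //   [my_parser][1:17] Deprecated field [unit] used, expected [interval] instead
    static final ParseField INTERVAL = new ParseField("interval", "unit");
}
```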
Sometimes we want to deprecate and remove a ParseField entirely, without replacement;
for example, the various places where we specify a `_type` field in 7.x. Currently we can
tell users only that a particular field name should not be used, and that another name should
be used in its place. This commit adds the ability to say that a field should not be used at
all.
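A sketch of the two flavours, assuming a no-argument overload as described:

```java
import org.elasticsearch.common.ParseField;

class FullyDeprecatedSketch {
    // Deprecated in favour of a named replacement: already possible.
    static final ParseField OLD = new ParseField("old_name").withAllDeprecated("new_name");

    // Deprecated outright with no replacement: what this commit adds.
    static final ParseField TYPE = new ParseField("_type").withAllDeprecated();
}
```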
Re-applies the change from #53523 along with test fixes.
closes #53626, closes #53624, closes #53622, closes #53625
Co-authored-by: Nik Everett <nik9000@gmail.com>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Jake Landis <jake.landis@elastic.co>