Support for `ST_EXTENT_AGG` was added in https://github.com/elastic/elasticsearch/pull/118829, and then partially optimized in https://github.com/elastic/elasticsearch/pull/118829. This optimization worked only for cartesian_shape fields, and worked by extracting the Extent from the doc-values and re-encoding it as a WKB `BBOX` geometry. This does not work for geo_shape, where we need to retain all 6 integers stored in the doc-values so that the dateline choice can be made at reduce time, during the final phase of the aggregation.
Since both geo_shape and cartesian_shape perform the aggregations using integers, and the original Extent values in the doc-values are integers, this PR expands the previous optimization by:
* Saving all Extent values into a multi-valued field in an IntBlock for both cartesian_shape and geo_shape
* Simplifying the logic around merging intermediate states for all cases (geo/cartesian and grouped and non-grouped aggs)
* Widening test cases for testing more combinations of aggregations and types, and fixing a few bugs found
* Enhancing cartesian extent to convert from 6 ints to 4 ints at block loading time (for efficiency)
* Fixing bugs in both cartesian and geo extents for generating intermediate state with missing groups (flaky tests in serverless)
* Moving the int order to always match Rectangle for the 4-int case and Extent for the 6-int case (improved internal consistency)
Since the PR already changed the meaning of the invalid/infinite values of the intermediate state integers, it was no longer compatible with previous cluster versions, and we disabled mixed-cluster testing to prevent errors resulting from that. This leaves us the opportunity to make further mixed-cluster-incompatible changes, hence the decision to perform this consistency update now.
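For illustration, here is a minimal, hedged sketch of why the intermediate state keeps all 6 integers (the field names loosely follow Lucene's Extent and are not the actual aggregator code): merging two geo extents is a per-slot min/max, and the dateline decision is deferred until the final reduce.

```java
// Hedged sketch: illustrative record, not the actual Elasticsearch classes.
record GeoExtent(int top, int bottom, int negLeft, int negRight, int posLeft, int posRight) {

    // Merging intermediate states never commits to a dateline choice;
    // it only widens each of the six slots.
    static GeoExtent merge(GeoExtent a, GeoExtent b) {
        return new GeoExtent(
            Math.max(a.top(), b.top()),
            Math.min(a.bottom(), b.bottom()),
            Math.min(a.negLeft(), b.negLeft()),    // westernmost negative longitude
            Math.max(a.negRight(), b.negRight()),  // easternmost negative longitude
            Math.min(a.posLeft(), b.posLeft()),    // westernmost positive longitude
            Math.max(a.posRight(), b.posRight())   // easternmost positive longitude
        );
    }
}
```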
This PR adds support for ST_EXTENT_AGG aggregation, i.e., computing a bounding box over a set of points/shapes (Cartesian or geo). Note the difference between this aggregation and the already implemented scalar function ST_EXTENT.
This isn't a very efficient implementation, and future PRs will attempt to read these extents directly from the doc values.
We currently always use longitude wrapping, i.e., we may wrap around the dateline for a smaller bounding box. Future PRs will let the user control this behavior.
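As a hedged sketch of that choice (illustrative names, working in degrees rather than the encoded integers the aggregation actually uses): given the negative- and positive-longitude sub-ranges, pick whichever of the wrapped and unwrapped boxes is narrower.

```java
class DatelineChoice {
    // Assumes both hemispheres contain points; the real logic must also handle
    // one-sided cases and the encoded-integer representation.
    static double[] chooseXRange(double negLeft, double negRight, double posLeft, double posRight) {
        double unwrappedWidth = posRight - negLeft;                 // box spanning Greenwich
        double wrappedWidth = (180 - posLeft) + (negRight + 180);   // box spanning the dateline
        return wrappedWidth < unwrappedWidth
            ? new double[] { posLeft, negRight }   // minX > maxX signals a dateline crossing
            : new double[] { negLeft, posRight };
    }
}
```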
Fixes #104659.
The libs projects are configured to all begin with `elasticsearch-`.
While it is desirable for the artifacts to carry this consistent
prefix, it means the project names don't match up with their
directories. Additionally, it creates complexities for subproject naming
that must be manually adjusted.
This commit adjusts the project names for those under libs to be their
directory names. The resulting artifacts for these libs are kept the
same, all beginning with `elasticsearch-`.
* Add maximum nested depth check to WKT parser
This prevents StackOverflowErrors, replacing them with ParseException errors, which are more easily handled by a running server (a minimal sketch of such a guard follows below).
* Update docs/changelog/111843.yaml
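A minimal, hedged sketch of such a depth guard (names are illustrative, not the actual parser code): nesting is counted as geometries are entered, and a ParseException is thrown once a configured maximum is exceeded.

```java
import java.text.ParseException;

class WktDepthGuard {
    private final int maxDepth;
    private int depth = 0;

    WktDepthGuard(int maxDepth) {
        this.maxDepth = maxDepth;
    }

    // Called when the parser descends into a nested geometry (e.g. GEOMETRYCOLLECTION).
    void enterNestedGeometry(int position) throws ParseException {
        if (++depth > maxDepth) {
            throw new ParseException("maximum nested depth of " + maxDepth + " exceeded", position);
        }
    }

    void exitNestedGeometry() {
        depth--;
    }
}
```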
The only reason this method is throwing an exception is because the
method ByteArrayOutputStream#close() is declaring it although it is a
noop. Therefore it can be safely ignored.
Thanks @romseygeek for bringing this to our attention.
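A tiny, hypothetical illustration (not the actual method touched by this change) of why the checked exception can simply be dropped:

```java
import java.io.ByteArrayOutputStream;

class InMemorySerialization {
    // No `throws IOException` needed: ByteArrayOutputStream never performs I/O, and
    // its close() is a no-op that merely declares the exception, so we skip calling it.
    static byte[] copy(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.writeBytes(data);   // Java 11+, declares no checked exception
        return out.toByteArray();
    }
}
```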
Another round of automated fixes to this, marking things that can be
made static as static. Saves some JIT cycles but also turns some lambdas
from capturing to non-capturing and makes the "utilityness" of some
classes visible.
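A small, hypothetical illustration of the lambda effect: a method reference to a static helper is non-capturing (and can be reused), whereas `this::normalize` on an instance method would capture `this`.

```java
import java.util.List;
import java.util.Locale;

class Parser {
    // Static helper: `Parser::normalize` below is a non-capturing method reference.
    private static String normalize(String value) {
        return value.trim().toLowerCase(Locale.ROOT);
    }

    List<String> normalizeAll(List<String> values) {
        return values.stream().map(Parser::normalize).toList();
    }
}
```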
* WIP Started geo_line for TSDB work
Starting with YAML tests (which currently pass) and AggregatorTests
(currently failing, likely due to a mistake in the tests)
* Update docs/changelog/94954.yaml
* WIP Refactoring to prepare for TSDB geo_line
* Created TimeSeries version of GeoLineAggregator, and wired it in so that time-series aggregations use it, but current behavior is still identical to non-time-series.
* Added both yaml and unit tests for testing that geo_line works with correct results in both time-series and non-time-series cases.
* Added additional tests to verify the grouping behaviour of time-series vs. terms aggs, and the combination of the two.
* WIP Refactoring to prepare for TSDB geo_line
* Started refactoring to re-use simplifier for all buckets
* Fixed bug with leaf collector not changing per segment
* Fixed bug with leaf collector not detecting bucket changes
The bucket id can change within a segment, so we need to detect this and save the geo_line when it does (see the sketch after this commit list).
* Renamed class since it no longer extends BucketedSort
The original geo_line relied on the BucketedSort for all intelligence.
The time-series geo_line uses none of that, and does its own memory management.
* Fixed bug with geo_point leaking between geo_line buckets
And enhanced unit tests to cover multiple groups
* Code review updates
* Verify that the sort field is specifically the TS timestamp
Only activate the time-series optimizations if both:
* The aggregation is within a time-series aggregation (i.e. ordered by tsid and @timestamp)
* The geo_line sort field is @timestamp
* Allow geo_point time-series to skip sort config
Also disables the new geo_line for time-series, even when the correct
sort and point fields are used, if the point field is not explicitly
configured as a position metric.
* Support geo_centroid and geo_bounds on position metric
* Update yaml tests for multi-terms tests
* Changed to disallow alternative sort-fields in ts-geo_line
Since the primary criterion for switching to the new algorithm is that
geo_line is within a time-series aggregation, we now disallow any other sort field.
We test the negative case in the yaml tests, but changed the unit tests to
use TermsAggregation to mimic the time-series aggregation and get comparable
results.
* For non-time-series check missing sort field early
The old code only threw an error if there was data, because the check was done
inside the leaf collector just before actually reading the sort field,
and there were no tests for a missing sort field.
This commit adds the tests and performs the check early, so the error is thrown even if data is missing.
* Reviewed TODOs
* Test that behaviour is identical with or without POSITION metric
* Removed fallback code in builder (was switching to old geo_line without POSITION metric)
* Removed two TODO's that are no longer valid concerns
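A standalone, hedged sketch of the bucket-change handling mentioned above (illustrative names, not the actual aggregator classes): the owning bucket ordinal can change from document to document within a segment, so the collector flushes the current line whenever the bucket changes.

```java
class BucketAwareCollector {
    private long currentBucket = -1;

    void collect(int doc, long owningBucketOrd) {
        if (owningBucketOrd != currentBucket) {
            if (currentBucket != -1) {
                flushGeoLine(currentBucket);   // save the geo_line built so far
            }
            currentBucket = owningBucketOrd;   // start simplifying the next bucket
        }
        // ... consume the doc's point and timestamp into the per-bucket simplifier ...
    }

    void postCollect() {
        if (currentBucket != -1) {
            flushGeoLine(currentBucket);       // flush the final bucket of the segment
        }
    }

    private void flushGeoLine(long bucket) {
        // persist the simplified line for `bucket`
    }
}
```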
Support geometry and streaming simplification
There are many opportunities to enable geometry simplification in Elasticsearch, both as an explicit feature available to users, and as an internal optimization technique for reducing memory consumption for complex geometries. For the latter case, it can even be considered a bug fix. This PR provides support for constraining Line and LinearRing sizes to a fixed number of points, and thereby a fixed amount of memory usage.
Consider, for example, the geo_line aggregation. This is similar to the top-10 aggregation, but allows the top-10k (ten thousand) points to be aggregated. This is not only a lot of memory, but can still cause unwanted line truncation for very large geometries. Line simplification is a solution to this. It is likely that a much smaller limit than 10k would suffice, while at the same time not truncating the geometry at all, so we fix a bug (truncation) while improving memory usage (pulling the limit from 10k down to perhaps just 1k).
This PR provides two APIs:
Streaming:
* By using the simplifier.consume(x, y) method on a stream of points, the total memory used is limited to a linear function of k, the total number of points to retain. This algorithm is at its heart based on the Visvalingam–Whyatt algorithm, with concepts from https://bost.ocks.org/mike/simplify/ and in particular the detailed streaming discussions in the paper at https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.106.7132&rep=rep1&type=pdf
Full-geometry:
* Simplifying full geometries using the simplifier.simplify(geometry) method can work with most geometry types, even GeometryCollection, but:
- Some geometries do not get simplified because it makes no sense to do so: Point, Circle, Rectangle
- The maxPoints parameter applies as-is to the main component (the shell for polygons, the largest geometry for multi-polygons and geometry collections), while all other sub-components (holes in polygons, etc.) are simplified to a maxPoints value scaled down by the relative size of the sub-component to the main component.
* The simplification itself is done on each Line and LinearRing component using the same streaming algorithm above. Since we use the Visvalingam–Whyatt algorithm, this approach is applicable to both streaming and full-geometry simplification with essentially the same result, but with better control over memory than typical full-geometry simplifiers.
The basic algorithm for simplification on a stream of points requires maintaining two data structures:
* an array of all currently simplified points (implicitly ordered in stream order)
* a priority queue of all but the two end points with an estimated error on each that expresses the cost of removing that point from the line
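The following is a simplified, hedged sketch of that streaming idea (illustrative class, and a linear scan in place of the priority queue described above, to keep it short): at most maxPoints points are retained, and once full, each new point evicts the interior point whose removal changes the line the least (smallest triangle area).

```java
import java.util.ArrayList;
import java.util.List;

class StreamingSimplifier {
    private final int maxPoints;                       // assumed >= 3
    private final List<double[]> points = new ArrayList<>();

    StreamingSimplifier(int maxPoints) {
        this.maxPoints = maxPoints;
    }

    // Memory is bounded by maxPoints + 1 points regardless of stream length.
    void consume(double x, double y) {
        points.add(new double[] { x, y });
        if (points.size() > maxPoints) {
            points.remove(smallestAreaIndex());
        }
    }

    private int smallestAreaIndex() {
        int best = 1;
        double bestArea = Double.POSITIVE_INFINITY;
        for (int i = 1; i < points.size() - 1; i++) {  // end points are never removed
            double area = triangleArea(points.get(i - 1), points.get(i), points.get(i + 1));
            if (area < bestArea) {
                bestArea = area;
                best = i;
            }
        }
        return best;
    }

    private static double triangleArea(double[] a, double[] b, double[] c) {
        return Math.abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2;
    }

    List<double[]> simplified() {
        return points;
    }
}
```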
This commit adds support for storing geo_shape fields as a Lucene stored field. The geometry will be stored
normalised in well-known binary format (WKB).
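As a hedged illustration only (the JTS `WKBWriter` stands in for whatever encoder the mapper actually uses), storing a normalised geometry as WKB in a Lucene stored field could look like:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.StoredField;
import org.locationtech.jts.geom.Geometry;
import org.locationtech.jts.io.WKBWriter;

class GeoShapeStoredFieldExample {
    static void addStoredShape(Document doc, String fieldName, Geometry normalizedGeometry) {
        byte[] wkb = new WKBWriter().write(normalizedGeometry);  // serialize to well-known binary
        doc.add(new StoredField(fieldName, wkb));                // store the raw bytes with the document
    }
}
```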
This PR represents the initial phase of Modularizing Elasticsearch (with
Java Modules).
This initial phase modularizes the core of the Elasticsearch server
with Java Modules, which is then used to load and configure extension
components atop the server. Only a subset of extension components are
modularized at this stage (other components come in a later phase).
Components are loaded dynamically at runtime with custom class loaders
(same as is currently done). Components with a module-info.class are
defined to a module layer.
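For illustration, a hedged sketch (paths, names, and the single-loader choice are assumptions, not the actual Elasticsearch loader code) of defining a component's modules to their own layer atop the boot layer:

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleFinder;
import java.nio.file.Path;
import java.util.List;

class ComponentLayerExample {
    static ModuleLayer defineComponentLayer(Path componentDir, ClassLoader parentLoader, String rootModule) {
        ModuleFinder finder = ModuleFinder.of(componentDir);            // finds the component's modular jars
        Configuration configuration = ModuleLayer.boot()
            .configuration()
            .resolve(finder, ModuleFinder.of(), List.of(rootModule));   // resolve against the boot layer
        return ModuleLayer.boot().defineModulesWithOneLoader(configuration, parentLoader);
    }
}
```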
This architecture is somewhat akin to the Modular JDK, where
applications run on the classpath. In the analogy, the Elasticsearch
server modules are the platform (thus are always resolved and present),
while components without a module-info.class are non-modular code
running atop the Elasticsearch server modules. The extension components
cannot access types from non-exported packages of the server modules, in
the same way that classpath applications cannot access types from
non-exported packages of modules from the JDK. Broadly, the core
Elasticsearch Java modules simply "wrap" the existing packages and export
them. There are opportunities to export less, which is best done in more
narrowly focused follow-up PRs.
The Elasticsearch distribution startup scripts are updated to put jars
on the module path (the class path is empty), so the distribution will
run the core of the server as java modules. A number of key components
have been retrofitted with module-info.java's too, and the remaining
components can follow later. Unit and functional tests run as
non-modular (since they commonly require package-private access), while
higher-level integration tests, that run the distribution, run as
modular.
Co-authored-by: Chris Hegarty <christopher.hegarty@elastic.co>
Co-authored-by: Ryan Ernst <ryan@iernst.net>
Co-authored-by: Rene Groeschke <rene@elastic.co>
JEP 361 (https://openjdk.java.net/jeps/361) added support for switch expressions,
which can be much more terse and less error-prone than switch statements.
Another useful feature of switch expressions is exhaustiveness: we can make
sure that an enum switch expression covers all the cases at compile time.
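A small, self-contained illustration (hypothetical enum, not taken from the codebase) of the exhaustiveness benefit:

```java
class SwitchExpressionExample {
    enum CardinalDirection { NORTH, SOUTH, EAST, WEST }

    // No default branch is needed: the compiler verifies every constant is covered,
    // and adding a new constant later becomes a compile error here.
    static int headingDegrees(CardinalDirection direction) {
        return switch (direction) {
            case NORTH -> 0;
            case EAST -> 90;
            case SOUTH -> 180;
            case WEST -> 270;
        };
    }
}
```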
The Checkstyle rule that bans unary negation in favour of an explicit
`== false` has a `maximumDepth` of 2 configured, which meant that it
didn't catch all violations. The `maximumDepth` isn't required (actually
it has a really high default), so this change removes the limit and
fixes the resulting violations.
Part 8.
We have an in-house rule to compare explicitly against `false` instead
of using the logical not operator (`!`). However, this hasn't
historically been enforced, meaning that there are many violations in
the source at present.
We now have a Checkstyle rule that can detect these cases, but before we
can turn it on, we need to fix the existing violations. This is being
done over a series of PRs, since there are a lot to fix.
As per the new licensing change for Elasticsearch and Kibana this commit
moves existing Apache 2.0 licensed source code to the new dual license
SSPL+Elastic license 2.0. In addition, existing x-pack code now uses
the new version 2.0 of the Elastic license. Full changes include:
- Updating LICENSE and NOTICE files throughout the code base, as well
as those packaged in our published artifacts
- Update IDE integration to now use the new license header on newly
created source files
- Remove references to the "OSS" distribution from our documentation
- Update build time verification checks to no longer allow Apache 2.0
license header in Elasticsearch source code
- Replace all existing Apache 2.0 license headers for non-xpack code
with updated header (vendored code with Apache 2.0 headers obviously
remains the same).
- Replace all Elastic license 1.0 headers with new 2.0 header in xpack.
We have an in-house rule to compare explicitly against `false` instead
of using the logical not operator (`!`). However, this hasn't
historically been enforced, meaning that there are many violations in
the source at present.
We now have a Checkstyle rule that can detect these cases, but before we
can turn it on, we need to fix the existing violations. This is being
done over a series of PRs, since there are a lot to fix.
* Remove usage of deprecated testCompile configuration
* Replace testCompile usage by testImplementation
* Make testImplementation non transitive by default (as we did for testCompile)
* Update CONTRIBUTING about using testImplementation for test dependencies
* Fail on testCompile configuration usage
This is another part of the breakup of the massive BuildPlugin. This PR
moves the code for configuring publications to a separate plugin. Most
of the time these publications are jar files, but this also supports the
zip publication we have for integ tests.
This commit adds aggregation support for the geo_shape field
type on geo*_grid aggregations.
It introduces a Tiler for both tiles and hashes that enables a new type of
ValuesSource to replace the GeoPoint's CellIdSource. This makes it possible
for the existing Aggregator to be re-used, so no new implementations of
the grid aggregators are added.
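A hedged sketch with illustrative names (not the actual Tiler or ValuesSource API): the tiler maps each shape to the grid cells it intersects at a given precision, so the resulting cell ids can flow through the same aggregator that already consumes geo_point cell ids.

```java
interface GridTiler<SHAPE> {
    /** Returns the ids of all grid cells (geotile or geohash) the shape intersects at the given precision. */
    long[] cellsIntersecting(SHAPE shape, int precision);
}
```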
Currently forbidden apis accounts for 800+ tasks in the build. These
tasks are aggressively created by the plugin. In forbidden apis 3.0, we
will get task avoidance
(https://github.com/policeman-tools/forbidden-apis/pull/162), but we
need to ourselves use the same task avoidance mechanisms to not trigger
these task creations. This commit does that for our forbidden apis
usages, in preparation for upgrading to 3.0 when it is released.
While we use `== false` as a more visible form of boolean negation
(instead of `!`), the true case is implied and the true value does not
need to be explicitly checked. This commit converts cases that have slipped
into the code checking for `== true`.
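A tiny, hypothetical illustration of the preferred forms:

```java
class BooleanStyleExample {
    // Preferred: explicit `== false` instead of `!`; plain `active` instead of `== true`.
    static String describe(boolean active) {
        if (active == false) {
            return "inactive";
        }
        return "active";
    }
}
```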
* Adds JavaDoc to `AbstractWireTestCase` and
`AbstractWireSerializingTestCase` so it is more obvious you should prefer
the latter if you have a choice
* Moves the `instanceReader` method out of `AbstractWireTestCase` because
it is no longer used.
* Marks a bunch of methods final so it is more obvious which classes are
for what.
* Cleans up the side effects of the above.
Closes #48724. Update `.editorconfig` to make the Java settings the default
for all files, and then apply a 2-space indent to all `*.gradle` files.
Then reformat all the files.
* Remove eclipse conditionals
We used to have some meta projects with a `-test` prefix because
historically eclipse could not distinguish between test and main
source-sets and could only use a single classpath.
This is no longer the case for the past few Eclipse versions.
This PR adds the necessary configuration to correctly categorize source
folders and libraries.
With this change Eclipse can import projects, and the visibility rules
are correct, e.g. auto-complete doesn't offer classes from test code or
`testCompile` dependencies when editing classes in `main`.
Unfortunately the cyclic dependency detection in Eclipse doesn't seem to
take the difference between test and non test source sets into account,
but since we are checking this in Gradle anyhow, it's safe to set to
`warning` in the settings. Unfortunately there is no setting to ignore
it.
This might cause problems when building since Eclipse will probably not
know the right order to build things in, so more work might be necessary.
Changes the order of parameters in Geometries from lat, lon to lon, lat
and moves all Geometry classes to the
org.elasticsearch.geometry package.
Closes #45048
Switches to more robust way of generating random test geometries by
reusing lucene's GeoTestUtil. Removes duplicate random geometry
generators by moving them to the test framework.
Closes #37278
By default, we don't check ranges while indexing geo_shapes. As a
result, it is possible to index geoshapes that contain
coordinates outside of the -90 +90 and -180 +180 ranges. Such geoshapes
will currently break the SQL and ML retrieval mechanism. This commit removes
this restriction from the validator that is used in SQL and ML retrieval.
Refactors the WKT and GeoJSON parsers from a utility class into
instantiable objects. This is a preliminary step in
preparation for moving coordinate validators out of the Geometry
constructors. This should allow us to make validators pluggable.
This commit refactors the GeoHashUtils class into a new Geohash utility class located in the ES geo library. The intent is not only to better control what geo methods are whitelisted for painless scripting, but also to clean up the geo utility API in general.