[role="xpack"]
[[search-aggregations-bucket-geohexgrid-aggregation]]
=== Geohex grid aggregation
++++
<titleabbrev>Geohex grid</titleabbrev>
++++

A multi-bucket aggregation that groups <<geo-point,`geo_point`>> and
<<geo-shape,`geo_shape`>> values into buckets that represent a grid.
The resulting grid can be sparse and only
contains cells that have matching data. Each cell corresponds to a
https://h3geo.org/docs/core-library/h3Indexing#h3-cell-indexp[H3 cell index] and is
labeled using the https://h3geo.org/docs/core-library/h3Indexing#h3index-representation[H3Index representation].

See https://h3geo.org/docs/core-library/restable[the table of cell areas for H3
resolutions] on how precision (zoom) correlates to size on the ground.
Precision for this aggregation can be between 0 and 15, inclusive.

WARNING: High-precision requests can be very expensive in terms of RAM and
result sizes. For example, the highest-precision geohex with a precision of 15
produces cells that cover less than one square meter. We recommend you use a
filter to limit high-precision requests to a smaller geographic area. For an example,
refer to <<geohexgrid-high-precision>>.
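
To get a feel for why high precision is expensive, the following Python sketch estimates how many cells cover the globe at a given precision and how large they are on average. It relies on the cell-count formula from the H3 documentation (`2 + 120 * 7^precision`). The function names and the mean Earth radius are assumptions made for illustration, and real cells vary in size around the average.

```python
import math

# Assumption: mean Earth radius in km, used only for a rough average area.
EARTH_RADIUS_KM = 6371.0072
MIN_PRECISION, MAX_PRECISION = 0, 15


def h3_cell_count(precision: int) -> int:
    """Number of H3 cells covering the globe at the given precision."""
    if not MIN_PRECISION <= precision <= MAX_PRECISION:
        raise ValueError(f"precision must be in [0, 15], got {precision}")
    # 122 cells at precision 0; each finer level has roughly 7x as many.
    return 2 + 120 * 7 ** precision


def mean_cell_area_m2(precision: int) -> float:
    """Average cell area in square meters (individual cells differ)."""
    sphere_area_km2 = 4 * math.pi * EARTH_RADIUS_KM ** 2
    return sphere_area_km2 / h3_cell_count(precision) * 1_000_000


if __name__ == "__main__":
    for p in (0, 4, 12, 15):
        print(p, h3_cell_count(p), f"{mean_cell_area_m2(p):.3g} m^2")
```

At precision 15 the average comes out below one square meter, which matches the warning above.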

[[geohexgrid-low-precision]]
==== Simple low-precision request

[source,console,id=geohexgrid-aggregation-example]
--------------------------------------------------
PUT /museums
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}

POST /museums/_bulk?refresh
{"index":{"_id":1}}
{"location": "POINT (4.912350 52.374081)", "name": "NEMO Science Museum"}
{"index":{"_id":2}}
{"location": "POINT (4.901618 52.369219)", "name": "Museum Het Rembrandthuis"}
{"index":{"_id":3}}
{"location": "POINT (4.914722 52.371667)", "name": "Nederlands Scheepvaartmuseum"}
{"index":{"_id":4}}
{"location": "POINT (4.405200 51.222900)", "name": "Letterenhuis"}
{"index":{"_id":5}}
{"location": "POINT (2.336389 48.861111)", "name": "Musée du Louvre"}
{"index":{"_id":6}}
{"location": "POINT (2.327000 48.860000)", "name": "Musée d'Orsay"}

POST /museums/_search?size=0
{
  "aggregations": {
    "large-grid": {
      "geohex_grid": {
        "field": "location",
        "precision": 4
      }
    }
  }
}
--------------------------------------------------

Response:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "large-grid": {
      "buckets": [
        {
          "key": "841969dffffffff",
          "doc_count": 3
        },
        {
          "key": "841fb47ffffffff",
          "doc_count": 2
        },
        {
          "key": "841fa4dffffffff",
          "doc_count": 1
        }
      ]
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]
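
Each bucket `key` is an H3 index written as a hexadecimal string, so the precision used for the request can be read back from the key itself. The Python sketch below decodes the header fields, assuming the bit layout described in the H3 index documentation (4 mode bits, 3 reserved bits, 4 resolution bits, and 7 base-cell bits below the top reserved bit). The function name `decode_h3_key` is hypothetical and not part of any Elasticsearch or H3 API.

```python
def decode_h3_key(key: str) -> dict:
    """Extract header fields from an H3 index given as a hex string."""
    value = int(key, 16)
    return {
        "mode": (value >> 59) & 0xF,        # 1 identifies a cell index
        "resolution": (value >> 52) & 0xF,  # equals the requested precision
        "base_cell": (value >> 45) & 0x7F,  # coarsest-level ancestor cell
    }


if __name__ == "__main__":
    # Keys taken from the example response above.
    for key in ("841969dffffffff", "841fb47ffffffff", "841fa4dffffffff"):
        print(key, decode_h3_key(key))
```

For the keys in the response above, the decoded resolution is `4`, the `precision` passed in the request.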

[[geohexgrid-high-precision]]
==== High-precision requests

When requesting detailed buckets (typically for displaying a "zoomed in" map),
a filter like <<query-dsl-geo-bounding-box-query,geo_bounding_box>> should be
applied to narrow the subject area. Otherwise, potentially millions of buckets
will be created and returned.

[source,console,id=geohexgrid-high-precision-ex]
--------------------------------------------------
POST /museums/_search?size=0
{
  "aggregations": {
    "zoomed-in": {
      "filter": {
        "geo_bounding_box": {
          "location": {
            "top_left": "POINT (4.9 52.4)",
            "bottom_right": "POINT (5.0 52.3)"
          }
        }
      },
      "aggregations": {
        "zoom1": {
          "geohex_grid": {
            "field": "location",
            "precision": 12
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[continued]

Response:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "zoomed-in": {
      "doc_count": 3,
      "zoom1": {
        "buckets": [
          {
            "key": "8c1969c9b2617ff",
            "doc_count": 1
          },
          {
            "key": "8c1969526d753ff",
            "doc_count": 1
          },
          {
            "key": "8c1969526d26dff",
            "doc_count": 1
          }
        ]
      }
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]
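
Client code that builds such requests programmatically could wrap the aggregation in a bounding box filter every time, so that a high-precision request is never sent unfiltered. The following Python sketch only assembles the request body shown above as a dictionary. The helper name `filtered_geohex_request` and its parameters are hypothetical, and sending the body to a cluster is left out.

```python
import json


def filtered_geohex_request(field, precision, top_left, bottom_right):
    """Build a search body that filters by bounding box, then aggregates."""
    if not 0 <= precision <= 15:
        raise ValueError(f"precision must be in [0, 15], got {precision}")
    return {
        "aggregations": {
            "zoomed-in": {
                # Narrow the subject area before bucketing.
                "filter": {
                    "geo_bounding_box": {
                        field: {
                            "top_left": top_left,
                            "bottom_right": bottom_right,
                        }
                    }
                },
                "aggregations": {
                    "zoom1": {
                        "geohex_grid": {"field": field, "precision": precision}
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    body = filtered_geohex_request(
        "location", 12, "POINT (4.9 52.4)", "POINT (5.0 52.3)"
    )
    print(json.dumps(body, indent=2))
```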

[[geohexgrid-addtl-bounding-box-filtering]]
==== Requests with additional bounding box filtering

The `geohex_grid` aggregation supports an optional `bounds` parameter
that restricts the cells considered to those that intersect the
provided bounds. The `bounds` parameter accepts the same
<<query-dsl-geo-bounding-box-query-accepted-formats,bounding box formats>>
as the geo-bounding box query. This bounding box can be used with or
without an additional `geo_bounding_box` query for filtering the points prior to aggregating.
It is an independent bounding box that can intersect with, be equal to, or be disjoint
from any additional `geo_bounding_box` queries defined in the context of the aggregation.

[source,console,id=geohexgrid-aggregation-with-bounds]
--------------------------------------------------
POST /museums/_search?size=0
{
  "aggregations": {
    "tiles-in-bounds": {
      "geohex_grid": {
        "field": "location",
        "precision": 12,
        "bounds": {
          "top_left": "POINT (4.9 52.4)",
          "bottom_right": "POINT (5.0 52.3)"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[continued]

Response:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "tiles-in-bounds": {
      "buckets": [
        {
          "key": "8c1969c9b2617ff",
          "doc_count": 1
        },
        {
          "key": "8c1969526d753ff",
          "doc_count": 1
        },
        {
          "key": "8c1969526d26dff",
          "doc_count": 1
        }
      ]
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]

[discrete]
[role="xpack"]
[[geohexgrid-aggregating-geo-shape]]
==== Aggregating `geo_shape` fields

Aggregating on <<geo-shape>> fields works almost as it does for points. There are two key differences:

* When aggregating over `geo_point` data, points are considered within a hexagonal tile if they lie
within the edges defined by great circles. In other words, the calculation is done using spherical coordinates.
However, when aggregating over `geo_shape` data, the shapes are considered within a hexagon if they lie
within the edges defined as straight lines on an equirectangular projection.
The reason is that Elasticsearch and Lucene treat edges using the equirectangular projection at index and search time.
To keep search results and aggregation results aligned, aggregations therefore also use the equirectangular
projection.
For most data, the difference is subtle or not noticeable.
However, at low zoom levels (low precision), especially far from the equator, the difference can be noticeable.
For example, if the same point data is indexed as `geo_point` and `geo_shape`, it is possible to get
different results when aggregating at lower resolutions.
* As is the case with <<geotilegrid-aggregating-geo-shape,`geotile_grid`>>,
a single shape can be counted in multiple tiles. A shape contributes to the count of matching values
for a tile if any part of the shape intersects that tile. The image below demonstrates this:

image:images/spatial/geoshape_hexgrid.png[]
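
The first difference above (great-circle edges versus straight lines on an equirectangular projection) can be illustrated numerically. The Python sketch below takes a hypothetical edge whose two endpoints share a latitude and compares the latitude of its midpoint under both interpretations. The 20-degree longitude span and the function names are assumptions chosen for illustration, not values taken from Elasticsearch or H3.

```python
import math


def great_circle_midpoint_lat(lat1, lon1, lat2, lon2):
    """Latitude (degrees) of the midpoint along the great-circle edge."""
    def to_xyz(lat, lon):
        la, lo = math.radians(lat), math.radians(lon)
        return (math.cos(la) * math.cos(lo),
                math.cos(la) * math.sin(lo),
                math.sin(la))

    x1, y1, z1 = to_xyz(lat1, lon1)
    x2, y2, z2 = to_xyz(lat2, lon2)
    # The midpoint direction is the sum of the two unit vectors.
    x, y, z = x1 + x2, y1 + y2, z1 + z2
    return math.degrees(math.atan2(z, math.hypot(x, y)))


def equirectangular_midpoint_lat(lat1, lat2):
    """Latitude of the midpoint of a straight line in lat/lon space."""
    return (lat1 + lat2) / 2.0


def midpoint_gap_degrees(lat, lon_span):
    """How far the two edge interpretations diverge at the midpoint."""
    gc = great_circle_midpoint_lat(lat, 0.0, lat, lon_span)
    return gc - equirectangular_midpoint_lat(lat, lat)


if __name__ == "__main__":
    for lat in (0.0, 10.0, 60.0):
        print(lat, round(midpoint_gap_degrees(lat, 20.0), 3))
```

In this sketch the gap is zero at the equator and grows at the higher latitude, which is consistent with the note that differences are most visible far from the equator. Long edges, meaning low precision, make the gap larger.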

==== Options

[horizontal]
field::
(Required, string) Field containing indexed geo-point or geo-shape values.
Must be explicitly mapped as a <<geo-point,`geo_point`>> or a <<geo-shape,`geo_shape`>> field.
If the field contains an array, `geohex_grid` aggregates all array values.

precision::
(Optional, integer) Integer zoom of the key used to define cells/buckets in
the results. Defaults to `6`. Values outside of [`0`,`15`] will be rejected.

bounds::
(Optional, object) Bounding box used to filter the geo-points or geo-shapes in each bucket.
Accepts the same bounding box formats as the
<<query-dsl-geo-bounding-box-query-accepted-formats,geo-bounding box query>>.

size::
(Optional, integer) Maximum number of buckets to return. Defaults to 10,000.
When results are trimmed, buckets are prioritized based on the volume of
documents they contain.

shard_size::
(Optional, integer) Number of buckets returned from each shard. Defaults to
`max(10,(size x number-of-shards))` to allow for a more accurate count of the
top cells in the final result. Since each shard could have a different top result order,
using a larger number here reduces the risk of inaccurate counts, but incurs a performance cost.
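
As a rough illustration of the `size` and `shard_size` options, the Python sketch below evaluates the documented default formula and trims a bucket list to the buckets with the most documents. The function names are hypothetical. Tie-breaking between buckets with equal counts is not specified in this document, so the stable sort used here is an assumption.

```python
def default_shard_size(size: int, number_of_shards: int) -> int:
    """Documented default: max(10, size x number-of-shards)."""
    return max(10, size * number_of_shards)


def trim_buckets(buckets, size):
    """Keep the `size` buckets with the highest doc_count."""
    ranked = sorted(buckets, key=lambda b: b["doc_count"], reverse=True)
    return ranked[:size]


if __name__ == "__main__":
    print(default_shard_size(10_000, 5))
    # Buckets taken from the low-precision example response.
    buckets = [
        {"key": "841fa4dffffffff", "doc_count": 1},
        {"key": "841969dffffffff", "doc_count": 3},
        {"key": "841fb47ffffffff", "doc_count": 2},
    ]
    print(trim_buckets(buckets, 2))
```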