elasticsearch/docs/reference/aggregations/metrics/geoline-aggregation.asciidoc
Tal Levy cd7d1c9183
Add geo_line aggregation (#41612) (#65442)
A metric aggregation that aggregates a set of points as
a GeoJSON LineString ordered by some sort parameter.

A `geo_line` aggregation request would specify a `geo_point` field, as well
as a `sort` field. `geo_point` represents the values used in the LineString,
while the `sort` values will be used as the total ordering of the points.

The `sort` field would support any numeric field, including date.

```
{
	"query": {
		"bool": {
			"must": [
				{ "term": { "person": "004" } },
				{ "term": { "trajectory": "20090131002206.plt" } }
			]
		}
	},
	"aggs": {
		"make_line": {
			"geo_line": {
				"point": {"field": "location"},
				"sort": { "field": "timestamp" },
				"include_sort": true,
				"sort_order": "desc",
				"size": 15
			}
		}
	}
}
```

```
{
    "took": 21,
    "timed_out": false,
    "_shards": {...},
    "hits": {...},
    "aggregations": {
        "make_line": {
            "type": "LineString",
            "coordinates": [
                [
                    121.52926194481552,
                    38.92878997139633
                ],
                [
                    121.52922699227929,
                    38.92876998055726
                ]
            ]
        }
    }
}
```

Due to the cardinality of points, an initial max of 10k points
will be used. This should support many use-cases.

One way to overcome this limitation is to keep a PriorityQueue of
points and simplify the line once it reaches this max. If simplifying
makes sense in general, it could also be offered as an option, with a parameter
that specifies how aggressively to simplify. This parameter could be
the number of points to keep. An example algorithm that works with a PriorityQueue:
https://bost.ocks.org/mike/simplify/. This would still require O(m) space, where m
is the number of points returned. It would also require keeping the triangles in a heap
ordered by their areas, at O(log(m)) per heap operation. Since the points are sorted
anyway, simplifying would still be an O(n log(m)) operation, where n is the total number
of points to filter. Something to explore.
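
To make the idea concrete, here is a minimal Python sketch of area-based simplification with a heap, in the spirit of the linked page. It is only an illustration, not the implementation in this change: the names `triangle_area` and `simplify` are hypothetical, points are assumed to be already sorted `(x, y)` tuples, and this variant holds all n points in memory, so it does not reach the O(m) space bound discussed above.

```python
import heapq


def triangle_area(a, b, c):
    """Area of the triangle formed by three (x, y) points."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def simplify(points, max_points):
    """Drop the least significant interior points until max_points remain."""
    if max_points < 2:
        raise ValueError("max_points must be at least 2")
    n = len(points)
    if n <= max_points:
        return list(points)

    prev = list(range(-1, n - 1))  # index of the previous surviving point
    nxt = list(range(1, n + 1))    # index of the next surviving point
    alive = [True] * n
    version = [0] * n              # bumped whenever a point's area is recomputed

    # Min-heap of (area, index, version) for every interior point.
    heap = [
        (triangle_area(points[i - 1], points[i], points[i + 1]), i, 0)
        for i in range(1, n - 1)
    ]
    heapq.heapify(heap)

    remaining = n
    while remaining > max_points and heap:
        _, i, ver = heapq.heappop(heap)
        if not alive[i] or ver != version[i]:
            continue  # stale entry left over from an earlier recompute
        alive[i] = False
        remaining -= 1
        p, q = prev[i], nxt[i]
        nxt[p], prev[q] = q, p
        # The neighbours now form new triangles; push their updated areas.
        for j in (p, q):
            if 0 < j < n - 1:
                version[j] += 1
                area = triangle_area(points[prev[j]], points[j], points[nxt[j]])
                heapq.heappush(heap, (area, j, version[j]))

    return [pt for pt, keep in zip(points, alive) if keep]
```

The endpoints are never placed in the heap, so the simplified line always starts and ends where the original does. Stale heap entries are skipped lazily instead of being removed in place, which keeps each heap operation logarithmic.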

closes #41649
2020-11-24 09:30:05 -08:00


[role="xpack"]
[testenv="gold"]
[[search-aggregations-metrics-geo-line]]
=== Geo-Line Aggregation
++++
<titleabbrev>Geo-Line</titleabbrev>
++++

The `geo_line` aggregation aggregates all `geo_point` values within a bucket into a LineString ordered
by the chosen `sort` field. This `sort` can be a date field, for example. The bucket returned is a valid
https://tools.ietf.org/html/rfc7946#section-3.2[GeoJSON Feature] representing the line geometry.

[source,console,id=search-aggregations-metrics-geo-line-simple]
----
PUT test
{
  "mappings": {
    "dynamic": "strict",
    "_source": {
      "enabled": false
    },
    "properties": {
      "my_location": {
        "type": "geo_point"
      },
      "group": {
        "type": "keyword"
      },
      "@timestamp": {
        "type": "date"
      }
    }
  }
}

POST /test/_bulk?refresh
{"index": {}}
{"my_location": {"lat": 37.3450570, "lon": -122.0499820}, "@timestamp": "2013-09-06T16:00:36"}
{"index": {}}
{"my_location": {"lat": 37.3451320, "lon": -122.0499820}, "@timestamp": "2013-09-06T16:00:37Z"}
{"index": {}}
{"my_location": {"lat": 37.349283, "lon": -122.0505010}, "@timestamp": "2013-09-06T16:00:37Z"}

POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "line": {
      "geo_line": {
        "point": {"field": "my_location"},
        "sort": {"field": "@timestamp"}
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "line": {
      "type" : "Feature",
      "geometry" : {
        "type" : "LineString",
        "coordinates" : [
          [
            -122.049982,
            37.345057
          ],
          [
            -122.050501,
            37.349283
          ],
          [
            -122.049982,
            37.345132
          ]
        ]
      },
      "properties" : {
        "complete" : true
      }
    }
  }
}
----
// TESTRESPONSE

[[search-aggregations-metrics-geo-line-options]]
==== Options

`point`::
(Required)

This option specifies the name of the `geo_point` field.

Example usage configuring `my_location` as the point field:

[source,js]
----
"point": {
  "field": "my_location"
}
----
// NOTCONSOLE

`sort`::
(Required)

This option specifies the name of the numeric field to use as the sort key
for ordering the points.

Example usage configuring `@timestamp` as the sort key:

[source,js]
----
"sort": {
  "field": "@timestamp"
}
----
// NOTCONSOLE

`include_sort`::
(Optional, boolean, default: `false`)
When `true`, this option includes an additional array of the sort values in the
feature properties.

`sort_order`::
(Optional, string, default: `"ASC"`)
This option accepts one of two values: `"ASC"` or `"DESC"`.
The line is sorted by the sort key in ascending order when set to `"ASC"`, and in
descending order when set to `"DESC"`.

`size`::
(Optional, integer, default: `10000`)
The maximum number of points in the line returned by the aggregation. Valid sizes are
between one and 10000.
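
How the options interact can be illustrated with a small client-side sketch in Python. This is not how Elasticsearch implements `geo_line`. The function name `geo_line`, the `sort_values` property name, and the rule that truncation keeps the first `size` points in sort order are assumptions made for illustration, as is the reading of `complete` as "no points were dropped".

```python
def geo_line(docs, point_field, sort_field,
             sort_order="ASC", include_sort=False, size=10000):
    """Build a GeoJSON Feature whose LineString is ordered by sort_field."""
    if not 1 <= size <= 10000:
        raise ValueError("size must be between 1 and 10000")
    order = sort_order.upper()
    if order not in ("ASC", "DESC"):
        raise ValueError("sort_order must be ASC or DESC")

    # Order the documents by the sort key, then keep at most `size` of them.
    ordered = sorted(docs, key=lambda d: d[sort_field], reverse=(order == "DESC"))
    kept = ordered[:size]  # assumption: truncation keeps the first points

    properties = {"complete": len(kept) == len(ordered)}
    if include_sort:
        # Hypothetical property name for the array of sort values.
        properties["sort_values"] = [d[sort_field] for d in kept]

    return {
        "type": "Feature",
        "geometry": {
            "type": "LineString",
            # GeoJSON positions are [longitude, latitude].
            "coordinates": [
                [d[point_field]["lon"], d[point_field]["lat"]] for d in kept
            ],
        },
        "properties": properties,
    }
```

For example, calling `geo_line(docs, "my_location", "@timestamp", sort_order="DESC", size=2)` on three documents would, under these assumptions, return the two latest points and report `"complete": false`.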