rate aggregation should support being a sub-aggregation
of a composite agg.
The catch is that the composite aggregation source
must be a date histogram. Other sources can be present
but their must be exactly one date histogram source
otherwise the rate aggregation does not know which
interval to compare its unit rate to.
closes https://github.com/elastic/elasticsearch/issues/76988
Changes:
* Use "geopoint" when not referring to the literal field type
* Use "geoshape" when not referring to the literal field type or query type
* Use "GeoJSON" consistently
This replaces the `script` docs for bucket aggregations with runtime
fields. We expect runtime fields to be nicer to work with because you
can also fetch them or filter on them. We expect them to be faster
because their don't need this sort of `instanceof` tree:
a92a647b9f/server/src/main/java/org/elasticsearch/search/aggregations/support/values/ScriptDoubleValues.java (L42)
Relates to #69291
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Co-authored-by: Adam Locke <adam.locke@elastic.co>
In some cases when the rate aggregation is not a child of a date histogram
aggregation, it is not possible to determine the actual size of the date
histogram bucket. In this case the rate aggregation now throws an exception.
Closes#63703
A metric aggregation that aggregates a set of points as
a GeoJSON LineString ordered by some sort parameter.
#### specifics
A `geo_line` aggregation request would specify a `geo_point` field, as well
as a `sort` field. `geo_point` represents the values used in the LineString,
while the `sort` values will be used as the total ordering of the points.
the `sort` field would support any numeric field, including date.
#### sample usage
```
{
"query": {
"bool": {
"must": [
{ "term": { "person": "004" } },
{ "term": { "trajectory": "20090131002206.plt" } }
]
}
},
"aggs": {
"make_line": {
"geo_line": {
"point": {"field": "location"},
"sort": { "field": "timestamp" },
"include_sort": true,
"sort_order": "desc",
"size": 15
}
}
}
}
```
#### sample response
```
{
"took": 21,
"timed_out": false,
"_shards": {...},
"hits": {...},
"aggregations": {
"make_line": {
"type": "LineString",
"coordinates": [
[
121.52926194481552,
38.92878997139633
],
[
121.52922699227929,
38.92876998055726
],
]
}
}
}
```
#### visual response
<img width="540" alt="Screen Shot 2019-04-26 at 9 40 07 AM" src="https://user-images.githubusercontent.com/388837/56834977-cf278e00-6827-11e9-9c93-005ed48433cc.png">
#### limitations
Due to the cardinality of points, an initial max of 10k points
will be used. This should support many use-cases.
One solution to overcome this limitation is to keep a PriorityQueue of
points, and simplifying the line once it hits this max. If simplifying
makes sense, it may be a nice option, in general. The ability to use a parameter
to specify how aggressive one wants to simplify. This parameter could be
the number of points. Example algorithm one could use with a PriorityQueue:
https://bost.ocks.org/mike/simplify/. This would still require O(m) space, where m
is the number of points returned. And would also require heapifying triangles
sorted by their areas, which would be O(log(m)) operations. Since sorting is done,
anyways, simplifying would still be a O(n log(m)) operation, where n is the total number
of points to filter........... something to explore
closes#41649
This PR adds support for the 'fields' option in the following places:
* Anytime `inner_hits` is used, for both fetching nested/ child docs and field collapsing
* The `top_hits` aggregation
Addresses #61949.
Changes:
* Moves `Retrieve selected fields` to its own page and adds a title abbreviation.
* Adds existing script and stored fields content to `Retrieve selected fields`
* Adds a xref for `Retrieve selected fields` to `Search your data`
* Adds related redirects and updates existing xrefs
Plugin discovery documentation contained information about installing
Elasticsearch 2.0 and installing an oracle JDK, both of which is no
longer valid.
While noticing that the instructions used cleartext HTTP to install
packages, this commit replaces HTTPs links instead of HTTP where possible.
In addition a few community links have been removed, as they do not seem
to exist anymore.
Moves the search sort docs from the deprecated 'Request Body Search'
page to a new subpage of 'Run a search'.
No substantive changes were made to the content.
Moves the highlighting docs from the deprecated 'Request Body Search'
chapter to the new subpage of the 'Run a search chapter' section.
No substantive changes were made to the content.
Changes:
* Condenses and relocates the `docvalue_fields` example to the 'Run a search'
page.
* Adds docs for the `docvalue_fields` request body parameter.
* Updates several related xrefs.
Co-authored-by: debadair <debadair@elastic.co>
Per 49554 I added standard deviation sampling and variance sampling to the extended stats interface.
Closes#49554
Co-authored-by: Igor Motov <igor@motovs.org>
Implements value_count and avg aggregations over Histogram fields as discussed in #53285
- value_count returns the sum of all counts array of the histograms
- avg computes a weighted average of the values array of the histogram by multiplying each value with its associated element in the counts array
Adds support for filters to T-Test aggregation. The filters can be used to
select populations based on some criteria and use values from the same or
different fields.
Closes#53692
Adds t_test metric aggregation that can perform paired and unpaired two-sample
t-tests. In this PR support for filters in unpaired is still missing. It will
be added in a follow-up PR.
Relates to #53692
* Removes experimental.
* Replaces `"v"` (for value) with `"m"` (for metric).
* Move the note about tiebreaking into the list of limitations of the
sort.
* Explain how you ask for `metrics`.
* Clean up some wording.
* Link to the docs from `top_metrics`.
Closes#51813
This changes the `top_metrics` aggregation to return metrics in their
original type. Since it only supports numerics, that means that dates,
longs, and doubles will come back as stored, with their appropriate
formatter applied.
The `top_metrics` agg is kind of like `top_hits` but it only works on
doc values so it *should* be faster.
At this point it is fairly limited in that it only supports a single,
numeric sort and a single, numeric metric. And it only fetches the "very
topest" document worth of metric. We plan to support returning a
configurable number of top metrics, requesting more than one metric and
more than one sort. And, eventually, non-numeric sorts and metrics. The
trick is doing those things fairly efficiently.
Co-Authored by: Zachary Tong <zach@elastic.co>