Commit graph

145 commits

Author SHA1 Message Date
James Rodewig
d84cac0590
[DOCS] Fix typos (#72227) (#72256)
Co-authored-by: Pierre Grimaud <grimaud.pierre@gmail.com>
2021-04-26 14:18:27 -04:00
Nik Everett
121ecb959d
Convert metric aggs docs runtime fields (backport of #71260) (#71298)
This replaces the `script` docs for bucket aggregations with runtime
fields. We expect runtime fields to be nicer to work with because you
can also fetch them or filter on them. We expect them to be faster
because they don't need this sort of `instanceof` tree:
a92a647b9f/server/src/main/java/org/elasticsearch/search/aggregations/support/values/ScriptDoubleValues.java (L42)

Relates to #69291

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Co-authored-by: Adam Locke <adam.locke@elastic.co>
2021-04-05 13:24:19 -04:00
James Rodewig
c757f9e4e7
[DOCS] Fix double spaces (#71082) (#71120) 2021-03-31 11:43:34 -04:00
Nik Everett
ec9c9a884b
Docs: Add example fetching keyword in top_metrics (backport of #69135) (#69141)
Adds an example of fetching a keyword field.
2021-02-17 14:30:08 -05:00
James Rodewig
b55249507e
[DOCS] Fix typos for duplicate words (#69125) (#69132) 2021-02-17 11:16:58 -05:00
James Rodewig
59f9f41cf2
[DOCS] Add missing newline for bulleted list in top_metrics docs (#68481) (#68551)
Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
2021-02-04 14:49:09 -05:00
Adam Locke
0324892ed5
[DOCS] Adding headers in TOC for aggregation docs. (#66604) (#66607) 2020-12-18 12:00:11 -05:00
Igor Motov
de3ee05b33
Return an error when a rate aggregation cannot calculate bucket sizes (#65429) (#65502)
In some cases when the rate aggregation is not a child of a date histogram
aggregation, it is not possible to determine the actual size of the date
histogram bucket. In this case the rate aggregation now throws an exception.

Closes #63703
2020-11-25 12:27:08 -05:00
Tal Levy
cd7d1c9183
Add geo_line aggregation (#41612) (#65442)
A metric aggregation that aggregates a set of points as
a GeoJSON LineString ordered by some sort parameter.

A `geo_line` aggregation request would specify a `geo_point` field, as well
as a `sort` field. `geo_point` represents the values used in the LineString,
while the `sort` values will be used as the total ordering of the points.

The `sort` field would support any numeric field, including date.

```
{
	"query": {
		"bool": {
			"must": [
				{ "term": { "person": "004" } },
				{ "term": { "trajectory": "20090131002206.plt" } }
			]
		}
	},
	"aggs": {
		"make_line": {
			"geo_line": {
				"point": {"field": "location"},
				"sort": { "field": "timestamp" },
				"include_sort": true,
				"sort_order": "desc",
				"size": 15
			}
		}
	}
}
```

```
{
    "took": 21,
    "timed_out": false,
    "_shards": {...},
    "hits": {...},
    "aggregations": {
        "make_line": {
            "type": "LineString",
            "coordinates": [
                [
                    121.52926194481552,
                    38.92878997139633
                ],
                [
                    121.52922699227929,
                    38.92876998055726
                ]
            ]
        }
    }
}
```
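
As a rough illustration of what the request above computes, the following Python sketch orders points by a sort value, keeps at most `size` of them, and emits a GeoJSON LineString. It is not the Elasticsearch implementation: the helper name `build_geo_line`, the input layout (dicts with `lon`, `lat`, and `sort` keys), and the choice to keep the first `size` points in sort order are all assumptions made for the example.

```python
from typing import Any


def build_geo_line(points: list[dict[str, float]], size: int = 10_000,
                   sort_order: str = "asc") -> dict[str, Any]:
    """Order points by their sort value and return a GeoJSON LineString.

    Hypothetical helper: each point is a dict with "lon", "lat" and "sort".
    """
    reverse = sort_order == "desc"
    ordered = sorted(points, key=lambda p: p["sort"], reverse=reverse)
    kept = ordered[:size]  # assumption: truncate to the first `size` points
    return {
        "type": "LineString",
        # GeoJSON positions are [longitude, latitude]
        "coordinates": [[p["lon"], p["lat"]] for p in kept],
    }


line = build_geo_line(
    [
        {"lon": 121.5, "lat": 38.9, "sort": 2},
        {"lon": 121.6, "lat": 38.8, "sort": 1},
        {"lon": 121.7, "lat": 38.7, "sort": 3},
    ],
    size=2,
    sort_order="desc",
)
print(line["coordinates"])
```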

Due to the cardinality of points, an initial max of 10k points
will be used. This should support many use-cases.

One way to overcome this limitation is to keep a PriorityQueue of
points and simplify the line once it hits this max. If simplifying
makes sense, it may be a nice option in general, with a parameter
to specify how aggressively one wants to simplify. This parameter could be
the number of points. An example algorithm one could use with a PriorityQueue:
https://bost.ocks.org/mike/simplify/. This would still require O(m) space, where m
is the number of points returned. It would also require heapifying triangles
sorted by their areas, at O(log(m)) per heap operation. Since sorting is done
anyway, simplifying would still be an O(n log(m)) operation, where n is the total number
of points to filter. Something to explore.
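
The triangle-area idea mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions: it repeatedly drops the interior point whose triangle with its two neighbours has the smallest area, and it uses a plain linear scan instead of a priority queue, so it runs in quadratic time rather than the bound discussed above. The names `triangle_area` and `simplify_line` are made up for the example.

```python
Point = tuple[float, float]


def triangle_area(a: Point, b: Point, c: Point) -> float:
    """Area of the triangle a-b-c via the cross product."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def simplify_line(points: list[Point], max_points: int) -> list[Point]:
    """Drop the least significant interior points until max_points remain."""
    if max_points < 2:
        raise ValueError("a line needs at least two points")
    pts = list(points)
    while len(pts) > max_points:
        # Find the interior point with the smallest neighbour triangle.
        # The endpoints (index 0 and -1) are never removed.
        smallest = min(
            range(1, len(pts) - 1),
            key=lambda k: triangle_area(pts[k - 1], pts[k], pts[k + 1]),
        )
        del pts[smallest]
    return pts


track = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0), (3.0, 5.0), (4.0, 0.0)]
print(simplify_line(track, 4))
```

A heap keyed on triangle area would avoid the repeated scan, at the cost of tracking neighbour updates after each removal.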

closes #41649
2020-11-24 09:30:05 -08:00
Mark Tozzi
491a5a08f3
[7.x] Add support for upper and lower values on boxplot based on the IQR value (#63617) (#64611)
* Add support for upper and lower values on boxplot based on the IQR value (#63617)

* fix List.of usage
2020-11-05 09:18:27 -05:00
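
For readers unfamiliar with IQR-based bounds, here is a minimal Python sketch of one common convention. The commit message above does not spell out the formula, so the 1.5 multiplier, the exact (non-approximate) quartile computation, and the function name `boxplot_whiskers` are assumptions; Elasticsearch's own calculation may differ.

```python
import statistics


def boxplot_whiskers(values: list[float], factor: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper): the extreme values that lie inside the IQR fences."""
    data = sorted(values)
    # Exact quartiles; needs at least two data points.
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - factor * iqr   # assumed convention: q1 - 1.5 * IQR
    high_fence = q3 + factor * iqr  # assumed convention: q3 + 1.5 * IQR
    lower = min(v for v in data if v >= low_fence)
    upper = max(v for v in data if v <= high_fence)
    return lower, upper


# 100 is treated as an outlier, so the upper value falls back to 5.
print(boxplot_whiskers([1, 2, 3, 4, 5, 100]))
```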
James Rodewig
354602e798
[DOCS] Change agg titles to sentence case (#64425) (#64430) 2020-10-30 13:45:54 -04:00
Igor Motov
5ebe90daa0
Add value_count mode to rate agg (#63687) (#63847)
Adds a new value count mode to the rate aggregation.

Closes #63575
2020-10-19 13:04:38 -04:00
Igor Motov
3bfb11a32a
Add support for histogram fields to rate aggregation (#63289) (#63511)
The rate aggregation now supports histogram fields. At the moment only sum
is supported.

Closes #62939
2020-10-08 19:16:34 -04:00
Christos Soulios
ad79a2b6a1
[7.x] Histogram field type support for min/max aggregations (#62689)
Implement min/max aggregations for histogram fields.

Backports #62532
2020-09-21 12:53:56 +03:00
Julie Tibshirani
4a19bdb2ea
Support the 'fields' option in inner_hits and top_hits. (#62337)
This PR adds support for the 'fields' option in the following places:
* Anytime `inner_hits` is used, for both fetching nested/child docs and field collapsing
* The `top_hits` aggregation

Addresses #61949.
2020-09-14 11:51:45 -07:00
Igor Motov
f70a59971a
[7.x] Add rate aggregation (#61369) (#61554)
Adds a new rate aggregation that can calculate a document rate for buckets
of a date_histogram.

Closes #60674
2020-08-25 17:39:00 -04:00
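
The idea of a per-bucket document rate can be sketched in Python as below. This is an assumption-laden illustration, not the aggregation's code: the helper `document_rate`, the `UNIT_SECONDS` table, and the fixed-length units are invented for the example, and calendar-aware buckets (months of varying length, for instance) are ignored.

```python
# Hypothetical fixed-length units, in seconds.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def document_rate(doc_count: int, bucket_seconds: int, unit: str) -> float:
    """Convert a bucket's document count into documents per `unit`."""
    units_per_bucket = bucket_seconds / UNIT_SECONDS[unit]
    return doc_count / units_per_bucket


# A 30-day bucket holding 3000 documents averages 100 documents per day.
print(document_rate(3000, 30 * 86400, "day"))
```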
James Rodewig
06d3159125
[DOCS] Add usage tips to top_hits agg (#61215) (#61225) 2020-08-17 13:05:40 -04:00
Adam Locke
a3f357c8a5
[DOCS] Update info about geo_shape bounding boxes (#61214) (#61216)
* Adding information about geo_shape bounding boxes.

* Fixing cross link and incorporating review feedback.
2020-08-17 11:44:46 -04:00
James Rodewig
60876a0e32
[DOCS] Replace Wikipedia links with attribute (#61171) (#61209) 2020-08-17 11:27:04 -04:00
James Rodewig
a761985fab
[DOCS] Move script and stored fields content to search fields page (#60826) (#60835)
Changes:

* Moves `Retrieve selected fields` to its own page and adds a title abbreviation.
* Adds existing script and stored fields content to `Retrieve selected fields`
* Adds an xref for `Retrieve selected fields` to `Search your data`
* Adds related redirects and updates existing xrefs
2020-08-06 13:06:06 -04:00
James Rodewig
815f3d526e
[DOCS] Move named query content to bool query (#60748) (#60772) 2020-08-05 13:42:13 -04:00
James Rodewig
a21ec410c7
[DOCS] Replace twitter dataset in search/agg docs (#60667) (#60675) 2020-08-04 14:16:38 -04:00
James Rodewig
5a2c6f0d4f
[DOCS] http -> https, remove outdated plugin docs (#60380) (#60545)
Plugin discovery documentation contained information about installing
Elasticsearch 2.0 and installing an Oracle JDK, both of which are no
longer valid.

Because the instructions used cleartext HTTP to install packages, this
commit also replaces HTTP links with HTTPS where possible.

In addition, a few community links have been removed, as they do not seem
to exist anymore.

Co-authored-by: Alexander Reelsen <alexander@reelsen.net>
2020-07-31 16:16:31 -04:00
James Rodewig
2e01f652c1
[DOCS] Move search sort docs to separate page (#60123) (#60142)
Moves the search sort docs from the deprecated 'Request Body Search'
page to a new subpage of 'Run a search'.

No substantive changes were made to the content.
2020-07-23 13:44:47 -04:00
Howard
466e947b0e
[DOCS] Fix missing punctuation in agg docs (#59823) 2020-07-21 10:19:29 -04:00
James Rodewig
ff8a042580
[DOCS] Reformat agg snippets to use two-space indents (#59912) (#59922) 2020-07-20 15:59:00 -04:00
James Rodewig
24fec52447
[DOCS] Add performance warning for scripts (#59890) (#59913) 2020-07-20 15:05:33 -04:00
James Rodewig
a672a2a2d4
[DOCS] Move highlighting docs to separate page (#59768) (#59781)
Moves the highlighting docs from the deprecated 'Request Body Search'
chapter to the new subpage of the 'Run a search' section.

No substantive changes were made to the content.
2020-07-17 10:57:00 -04:00
Tal Levy
11086d5c7d
add geo_shape documentation for supported aggregations (#58284) (#58354)
This commit adds documentation for geo_shape fields in aggregations

Closes #55495.
2020-06-18 12:36:24 -07:00
James Rodewig
d534862d41
[DOCS] Move search API's docvalue_fields examples (#57760) (#57989)
Changes:

* Condenses and relocates the `docvalue_fields` example to the 'Run a search'
   page.
* Adds docs for the `docvalue_fields` request body parameter.
* Updates several related xrefs.

Co-authored-by: debadair <debadair@elastic.co>
2020-06-11 11:25:04 -04:00
Igor Motov
947573f309
Added standard deviation / variance sampling to extended stats (#49782) (#57947)
Per #49554, I added standard deviation sampling and variance sampling to the extended stats interface.
 
Closes #49554

Co-authored-by: Igor Motov <igor@motovs.org>

Co-authored-by: andrewjohnson2 <aj114114@gmail.com>
2020-06-11 09:19:44 -04:00
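
The difference between the population and sampling variants comes down to the divisor. The Python sketch below illustrates that; the function names and the `sampling` flag are assumptions for the example and do not mirror the field names of the extended stats response.

```python
import math


def variance(values: list[float], sampling: bool = False) -> float:
    """Population variance divides by n; sampling variance divides by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return squared_deviations / (n - 1 if sampling else n)


def std_deviation(values: list[float], sampling: bool = False) -> float:
    """Standard deviation is the square root of the matching variance."""
    return math.sqrt(variance(values, sampling))


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data), variance(data, sampling=True))
print(std_deviation(data), std_deviation(data, sampling=True))
```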
James Rodewig
b03a83a69d
[DOCS] Fix source filtering xrefs (#57720) (#57725) 2020-06-05 09:05:30 -04:00
Christos Soulios
c65f828cb7
[7.x] Histogram field type support for ValueCount and Avg aggregations (#56099)
Backports #55933 to 7.x

Implements value_count and avg aggregations over Histogram fields as discussed in #53285

- value_count returns the sum of all entries in the counts arrays of the histograms
- avg computes a weighted average of the values array of the histogram by multiplying each value by its associated element in the counts array
2020-05-04 13:23:02 +03:00
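
The two rules in the commit above can be written out as a short Python sketch. It assumes each histogram is a dict with parallel `values` and `counts` arrays; the function names are invented for the example and this is not the server-side code.

```python
Histogram = dict[str, list[float]]


def histogram_value_count(histograms: list[Histogram]) -> float:
    """Sum every entry of every counts array."""
    return sum(sum(h["counts"]) for h in histograms)


def histogram_avg(histograms: list[Histogram]) -> float:
    """Weighted average: each value is weighted by its matching count."""
    weighted_sum = sum(
        value * count
        for h in histograms
        for value, count in zip(h["values"], h["counts"])
    )
    return weighted_sum / histogram_value_count(histograms)


docs = [
    {"values": [1.0, 2.0], "counts": [2, 3]},
    {"values": [4.0], "counts": [5]},
]
print(histogram_value_count(docs), histogram_avg(docs))
```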
Christos Soulios
02bf0c586a
[7.x] Histogram field type support for Sum aggregation (#55916)
Implements Sum aggregation over Histogram fields by summing the value of each bucket multiplied by its count as requested in #53285

Backports #55681 to 7.x
2020-04-29 15:06:12 +03:00
Igor Motov
51c6f69e02
[7.x] Add support for filters to T-Test aggregation (#54980) (#55066)
Adds support for filters to T-Test aggregation. The filters can be used to
select populations based on some criteria and use values from the same or
different fields.

Closes #53692
2020-04-13 12:28:58 -04:00
Igor Motov
2794572a35
[7.x] Add Student's t-test aggregation support (#54469) (#54737)
Adds t_test metric aggregation that can perform paired and unpaired two-sample
t-tests. In this PR, support for filters in unpaired tests is still missing. It will
be added in a follow-up PR.

Relates to #53692
2020-04-06 11:36:47 -04:00
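
To make the paired/unpaired distinction concrete, here is a hedged Python sketch that computes only the t statistic for each case. The commit message does not say which variance assumption the aggregation uses for unpaired tests, so the Welch (unequal variance) form below is an assumption, as are the function names; turning the statistic into a p-value is left out.

```python
import math
from statistics import mean, variance  # variance() is the sample variance (n - 1)


def paired_t(a: list[float], b: list[float]) -> float:
    """t statistic for a paired test: works on per-pair differences."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / math.sqrt(variance(diffs) / len(diffs))


def welch_t(a: list[float], b: list[float]) -> float:
    """t statistic for an unpaired test without assuming equal variances."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


before = [1.0, 2.0, 3.0, 4.0]
after = [2.0, 3.0, 4.0, 6.0]
print(paired_t(before, after), welch_t(before, after))
```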
Gil Raphaelli
2984a54b7f
[DOCS] Fix typos in top metrics agg docs (#54299) 2020-03-27 10:49:21 -04:00
Paweł Krześniak
c0534f4157
[DOCS] link fix (#53973)
Fix bad link in top_metrics.
2020-03-23 14:20:54 -04:00
Lisa Cawley
278e3fce50
[DOCS] Add anchors for scripted metric aggregations (#53618) 2020-03-16 12:20:41 -07:00
Nik Everett
f7482f794a
Improve top_metrics docs (#53521) (#53619)
* Removes experimental.
* Replaces `"v"` (for value) with `"m"` (for metric).
* Move the note about tiebreaking into the list of limitations of the
  sort.
* Explain how you ask for `metrics`.
* Clean up some wording.
* Link to the docs from `top_metrics`.

Closes #51813
2020-03-16 13:47:43 -04:00
Nik Everett
9dcd64c110
Preserve metric types in top_metrics (backport of #53288) (#53440)
This changes the `top_metrics` aggregation to return metrics in their
original type. Since it only supports numerics, that means that dates,
longs, and doubles will come back as stored, with their appropriate
formatter applied.
2020-03-12 17:17:09 -04:00
Nik Everett
28df7ae5ed
Support multiple metrics in top_metrics agg (backport of #52965) (#53163)
This adds support for returning multiple metrics to the `top_metrics`
agg. It looks like:
```
POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": [
          {"field": "v"},
          {"field": "m"}
        ],
        "sort": {"s": "desc"}
      }
    }
  }
}
```
2020-03-05 08:12:01 -05:00
Nik Everett
1d1956ee93
Add size support to top_metrics (backport of #52662) (#52914)
This adds support for returning the top "n" metrics instead of just the
very top.

Relates to #51813
2020-02-27 16:12:52 -05:00
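
Conceptually, returning the top "n" metrics means sorting documents by one field and reading metric fields off the first n of them. The Python sketch below shows that idea only; the function `top_metrics`, its parameters, and the shape of the returned entries are illustrative assumptions, not the aggregation's implementation or its exact response format.

```python
import heapq
from typing import Any


def top_metrics(docs: list[dict[str, Any]], sort_field: str,
                metric_fields: list[str], size: int = 1,
                order: str = "desc") -> list[dict[str, Any]]:
    """Return the metric fields of the `size` best documents by `sort_field`."""
    pick = heapq.nlargest if order == "desc" else heapq.nsmallest
    best = pick(size, docs, key=lambda d: d[sort_field])
    return [
        {"sort": [d[sort_field]], "metrics": {m: d[m] for m in metric_fields}}
        for d in best
    ]


docs = [{"s": 1, "m": 10}, {"s": 3, "m": 30}, {"s": 2, "m": 20}]
print(top_metrics(docs, "s", ["m"], size=2))
```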
Nik Everett
146def8caa
Implement top_metrics agg (#51155) (#52366)
The `top_metrics` agg is kind of like `top_hits` but it only works on
doc values so it *should* be faster.

At this point it is fairly limited in that it only supports a single,
numeric sort and a single, numeric metric. And it only fetches the metric
from the very top document. We plan to support returning a
configurable number of top metrics, requesting more than one metric and
more than one sort. And, eventually, non-numeric sorts and metrics. The
trick is doing those things fairly efficiently.

Co-Authored by: Zachary Tong <zach@elastic.co>
2020-02-14 11:19:11 -05:00
Igor Motov
a66988281f
Add histogram field type support to boxplot aggs (#52265)
Add support for the histogram field type to boxplot aggs.

Closes #52233
Relates to #33112
2020-02-13 18:09:26 -05:00
Igor Motov
667e1a5225
Add Boxplot Aggregation (#52174)
Adds a `boxplot` aggregation that calculates min, max, median and the first
and the third quartiles of the given data set.

Closes #33112
2020-02-11 09:38:17 -05:00
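
The five values a box plot summarises can be computed exactly for a small in-memory list, as in the Python sketch below. The function name `boxplot_summary`, the output keys, and the "inclusive" quartile method are assumptions for illustration; an aggregation running over a large distributed data set may well use approximate quantiles instead.

```python
import statistics


def boxplot_summary(values: list[float]) -> dict[str, float]:
    """Return min, max, and the three quartiles (q2 is the median)."""
    data = sorted(values)
    # Exact quartiles; needs at least two data points.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "max": data[-1], "q1": q1, "q2": q2, "q3": q3}


print(boxplot_summary([5, 1, 4, 2, 3]))
```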
Igor Motov
08e9c673e5
Fix leftover mentions of method parameter in Percentile Aggs (#51272)
The method parameter is not used in the percentile aggs. Instead,
the method is determined by the presence of `hdr` or `tdigest`
objects.

Relates to #8324
2020-01-22 10:03:35 -05:00
James Rodewig
1299dda437
[DOCS] Warn about using geo_centroid as sub-agg to geohash_grid (#50038)
If `geo_point` fields are multi-valued, using `geo_centroid` as a
sub-agg to `geohash_grid` could result in centroids outside of bucket
boundaries.

This adds a related warning to the geo_centroid agg docs.
2020-01-06 07:47:54 -06:00
Gilad Gal
9fdfb075bb
Deleted 'a' before plural 'messages'
2019-12-30 21:25:15 +02:00
James Rodewig
694b119f0a
[DOCS] Percentile aggs are non-deterministic (#50468)
Percentile aggregations are non-deterministic. A percentile aggregation
can produce different results even when using the same data.

Based on [this discuss post][0], the non-deterministic property stems
from processes in Lucene that can affect the order in which docs are
provided to the aggregation.

This adds a warning stating that the aggregation is non-deterministic
and what that means.

[0]: https://discuss.elastic.co/t/different-results-for-same-query/111757
2019-12-23 13:13:34 -05:00