462 commits

Author SHA1 Message Date
Nik Everett
121ecb959d
Convert metric aggs docs to runtime fields (backport of #71260) (#71298)
This replaces the `script` docs for metric aggregations with runtime
fields. We expect runtime fields to be nicer to work with because you
can also fetch them or filter on them. We expect them to be faster
because they don't need this sort of `instanceof` tree:
a92a647b9f/server/src/main/java/org/elasticsearch/search/aggregations/support/values/ScriptDoubleValues.java (L42)

Relates to #69291

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Co-authored-by: Adam Locke <adam.locke@elastic.co>
2021-04-05 13:24:19 -04:00
Nik Everett
1b35100ab0
Convert bucket aggs docs to runtime fields (backport #71202) (#71248)
This replaces the `script` docs for bucket aggregations with runtime
fields. We expect runtime fields to be nicer to work with because you
can also fetch them or filter on them. We expect them to be faster
because they don't need this sort of `instanceof` tree:
a92a647b9f/server/src/main/java/org/elasticsearch/search/aggregations/support/values/ScriptDoubleValues.java (L42)

Relates to #69291

Co-authored-by: Adam Locke <adam.locke@elastic.co>
2021-04-02 12:40:19 -04:00
James Rodewig
c757f9e4e7
[DOCS] Fix double spaces (#71082) (#71120) 2021-03-31 11:43:34 -04:00
Benjamin Trent
abb182d95c
[7.x] [ML] adding support for composite aggs in anomaly detection (#69970) (#71052)
* [ML] adding support for composite aggs in anomaly detection (#69970)

This commit allows for composite aggregations in datafeeds.

Composite aggs provide a much better solution for having influencers, partitions, etc. on high volume data. Instead of worrying about long scrolls in the datafeed, the calculation is distributed across the cluster via the aggregations.

The restrictions for this support are as follows:

- The composite aggregation must have EXACTLY one `date_histogram` source
- The sub-aggs of the composite aggregation must have a `max` aggregation on the SAME timefield as the aforementioned `date_histogram` source
- The composite agg must be the ONLY top level agg and it cannot have a `composite` or `date_histogram` sub-agg
- If using a `date_histogram` to bucket time, it cannot have a `composite` sub-agg.
- The top-level `composite` agg cannot have a sibling pipeline agg. Pipeline aggregations are supported as a sub-agg (thus a pipeline agg INSIDE the bucket).
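
The restrictions above can be sketched as a small check over a datafeed's aggregation definition. This is only an illustration, not the validation Elasticsearch itself performs: the function name, the messages, and the field names (`timestamp`, `airline`, `responsetime`) are hypothetical, and the check only inspects the direct sub-aggs of the composite agg.

```python
def check_composite_datafeed_aggs(aggs):
    """Return a list of restriction violations; an empty list means none were found."""
    # The composite agg must be the only top-level agg (so no sibling pipeline agg).
    if len(aggs) != 1:
        return ["the composite agg must be the only top-level agg"]
    body = next(iter(aggs.values()))
    composite = body.get("composite")
    if composite is None:
        return ["the top-level agg must be a composite agg"]

    problems = []
    # Exactly one date_histogram source is allowed.
    date_sources = [
        definition["date_histogram"]
        for source in composite.get("sources", [])
        for definition in source.values()
        if "date_histogram" in definition
    ]
    if len(date_sources) != 1:
        problems.append("the composite agg needs exactly one date_histogram source")

    # No composite or date_histogram sub-aggs.
    sub_aggs = body.get("aggregations") or body.get("aggs") or {}
    for name, sub_agg in sub_aggs.items():
        if "composite" in sub_agg or "date_histogram" in sub_agg:
            problems.append(f"sub-agg '{name}' may not be a composite or date_histogram agg")

    # A max sub-agg must exist on the same time field as the date_histogram source.
    if len(date_sources) == 1:
        time_field = date_sources[0].get("field")
        if not any(sub.get("max", {}).get("field") == time_field for sub in sub_aggs.values()):
            problems.append(f"no max sub-agg on the time field '{time_field}'")
    return problems


# Hypothetical datafeed aggregation that satisfies the restrictions.
example_aggs = {
    "buckets": {
        "composite": {
            "size": 1000,
            "sources": [
                {"time_bucket": {"date_histogram": {"field": "timestamp", "fixed_interval": "5m"}}},
                {"airline": {"terms": {"field": "airline"}}},
            ],
        },
        "aggregations": {
            "timestamp": {"max": {"field": "timestamp"}},
            "avg_responsetime": {"avg": {"field": "responsetime"}},
        },
    }
}
print(check_composite_datafeed_aggs(example_aggs))
```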

Some key user interaction differences:
- Speed + resources used by the cluster should be controlled by the `size` parameter in the `composite` aggregation. Previously, we said if you are using aggs, use a specific `chunking_config`. But, with composite, that is not necessary.
- Users really shouldn't use nested `terms` aggs any longer. While this is still a "valid" configuration and MAY be desirable for some users (only wanting the top 10 of certain terms), typically when users want influencers, partition fields, etc. they want the ENTIRE population. Previously, this really wasn't possible with aggs; with `composite` it is.
- I cannot really think of a typical use case that SHOULD ever use a multi-bucket aggregation that is NOT supported by composite.
2021-03-30 12:04:54 -04:00
István Zoltán Szabó
591e93397a
[DOCS] Removes beta labels from DFA related docs. (#70808) (#70902) 2021-03-26 10:25:36 +01:00
Nik Everett
05c5ec00f1
Docs: Clean doc for agg parameter (backport of #70675) (#70841)
This adds a heading for `shard_min_doc_count` and merges the paragraphs
for them. I wanted to link to this section earlier today and it wasn't a
"real" section so I couldn't.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-03-24 16:38:44 -04:00
Ignacio Vera
81da10f2e6
Increase search.max_buckets by one (#70645) (#70706) 2021-03-23 10:20:25 +01:00
James Rodewig
896d4f0d13
[DOCS] Reformat adjacency matrix agg reference (#70034) (#70101) 2021-03-08 13:15:13 -05:00
James Rodewig
c7d2cfb920 [DOCS] Fix gap policy xref 2021-03-03 09:31:26 -05:00
James Rodewig
d495b49cc3
[DOCS] Reformat avg bucket agg reference (#69751) (#69830) 2021-03-02 15:16:10 -05:00
Nik Everett
aef2567496
Docs: Switch terms agg scripting to runtime fields (backport of #69628) (#69821)
We expect runtime fields to perform a little better than our "native"
aggregation script so we should point folks to them instead of the
"native" aggregation script.
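
As a rough illustration of the switch, the two request bodies below express the same terms aggregation, first with an aggregation script and then with a runtime field defined in `runtime_mappings`. The field names (`genre`, `normalized_genre`), the aggregation name, and the Painless snippet are hypothetical; this is a sketch of the request shape, not text taken from the converted docs.

```python
import json

# Older style: the terms agg runs a script directly (hypothetical field "genre").
script_body = {
    "size": 0,
    "aggs": {
        "genres": {
            "terms": {"script": {"source": "doc['genre'].value.toUpperCase()"}}
        }
    },
}

# Runtime field style: the script lives in runtime_mappings and emits a value,
# and the terms agg refers to the runtime field by name like any other field.
runtime_field_body = {
    "size": 0,
    "runtime_mappings": {
        "normalized_genre": {
            "type": "keyword",
            "script": {"source": "emit(doc['genre'].value.toUpperCase())"},
        }
    },
    "aggs": {"genres": {"terms": {"field": "normalized_genre"}}},
}

print(json.dumps(runtime_field_body, indent=2))
```

Because the runtime field is an ordinary named field in the request, the same definition can also be fetched or filtered on, which a script embedded in the aggregation cannot offer.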
2021-03-02 11:54:23 -05:00
James Rodewig
8313701dc0
[DOCS] Update example for serial_diff agg (#69635) (#69694)
Co-authored-by: RomainGeffraye <romain.geffraye@elastic.co>
2021-03-01 08:53:29 -05:00
Lisa Cawley
1430e52669
[DOCS] Adds model alias to inference processor and agg (#69576) (#69577) 2021-02-24 16:49:45 -08:00
Igor Motov
a140161f53
Clarify the intended use case for multi_terms aggs (#69397) (#69484)
This PR clarifies when multi_terms aggs should be used instead of composite
aggs or nested term aggs.

Relates to #65623
2021-02-23 16:00:30 -05:00
Nik Everett
ec9c9a884b
Docs: Add example fetching keyword in top_metrics (backport of #69135) (#69141)
Adds an example of fetching a keyword field.
2021-02-17 14:30:08 -05:00
James Rodewig
b55249507e
[DOCS] Fix typos for duplicate words (#69125) (#69132) 2021-02-17 11:16:58 -05:00
James Rodewig
59f9f41cf2
[DOCS] Add missing newline for bulleted list in top_metrics docs (#68481) (#68551)
Co-authored-by: Nathan L Smith <nathan.smith@elastic.co>
2021-02-04 14:49:09 -05:00
Igor Motov
a0604825c6
[7.x] Add multi_terms aggs (#67597) (#68490)
Adds multi_terms aggregation support. The multi_terms aggregation works
very similarly to the terms aggregation but supports multiple terms. The goal
of this PR is to add the basic functionality, so it is not optimized at the
moment. Optimization will be done in follow-up PRs.
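
To make the "terms, but over multiple fields" idea concrete, here is a minimal sketch: a request body of the general shape the aggregation takes, plus an in-memory imitation of the bucketing. The field names (`genre`, `product`), the aggregation name, and the helper function are hypothetical, and the helper simply skips documents missing any of the fields; it is not how Elasticsearch implements the aggregation.

```python
from collections import Counter

# Request body sketch; "genre" and "product" are hypothetical field names.
multi_terms_request = {
    "aggs": {
        "genres_and_products": {
            "multi_terms": {"terms": [{"field": "genre"}, {"field": "product"}]}
        }
    }
}


def multi_terms_buckets(docs, fields, size=10):
    """Count documents per combination of field values, most frequent first."""
    counts = Counter(
        tuple(doc[field] for field in fields)
        for doc in docs
        if all(field in doc for field in fields)
    )
    return [{"key": list(key), "doc_count": count} for key, count in counts.most_common(size)]


docs = [
    {"genre": "rock", "product": "A"},
    {"genre": "rock", "product": "A"},
    {"genre": "rock", "product": "B"},
    {"genre": "jazz", "product": "A"},
    {"genre": "jazz"},  # skipped in this sketch: no "product" value
]
for bucket in multi_terms_buckets(docs, ["genre", "product"]):
    print(bucket)
```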

Closes #65623
2021-02-04 11:19:25 -05:00
James Rodewig
f4f5c7c227
[DOCS] Fix casing for agg type titles (#67469) (#67470) 2021-01-13 15:04:08 -05:00
Adam Locke
0324892ed5
[DOCS] Adding headers in TOC for aggregation docs. (#66604) (#66607) 2020-12-18 12:00:11 -05:00
James Rodewig
e4bf2afd58
[DOCS] Fix search.max_buckets default (#66311) (#66312) 2020-12-15 08:17:50 -05:00
Nik Everett
d13c4b3f4b
Drop experimental from variable width histogram (backport of #66055) (#66060)
It's been several months and we haven't bumped into any good reason to
rework the variable width histogram. So let's drop experimental from it!

Closes #58573
2020-12-08 14:38:00 -05:00
James Rodewig
24cc2139c7
[DOCS] Fix typo in histogram agg docs (#65822) (#65827) 2020-12-03 10:53:09 -05:00
Igor Motov
de3ee05b33
Return an error when a rate aggregation cannot calculate bucket sizes (#65429) (#65502)
In some cases when the rate aggregation is not a child of a date histogram
aggregation, it is not possible to determine the actual size of the date
histogram bucket. In this case the rate aggregation now throws an exception.

Closes #63703
2020-11-25 12:27:08 -05:00
Tal Levy
0e6280ae3e Add mention of geo_shape support in geotile and geohash grid agg docs (#61129)
Previously, geo_shape support was only mentioned in a dedicated x-pack
section. This may be misleading, as the introductory paragraph only
mentions geo_point.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2020-11-24 13:58:29 -08:00
Tal Levy
cd7d1c9183
Add geo_line aggregation (#41612) (#65442)
A metric aggregation that aggregates a set of points as
a GeoJSON LineString ordered by some sort parameter.

A `geo_line` aggregation request would specify a `geo_point` field, as well
as a `sort` field. `geo_point` represents the values used in the LineString,
while the `sort` values will be used as the total ordering of the points.

The `sort` field would support any numeric field, including date.

```
{
    "query": {
        "bool": {
            "must": [
                { "term": { "person": "004" } },
                { "term": { "trajectory": "20090131002206.plt" } }
            ]
        }
    },
    "aggs": {
        "make_line": {
            "geo_line": {
                "point": { "field": "location" },
                "sort": { "field": "timestamp" },
                "include_sort": true,
                "sort_order": "desc",
                "size": 15
            }
        }
    }
}
```

```
{
    "took": 21,
    "timed_out": false,
    "_shards": {...},
    "hits": {...},
    "aggregations": {
        "make_line": {
            "type": "LineString",
            "coordinates": [
                [
                    121.52926194481552,
                    38.92878997139633
                ],
                [
                    121.52922699227929,
                    38.92876998055726
                ]
            ]
        }
    }
}
```

Due to the cardinality of points, an initial max of 10k points
will be used. This should support many use-cases.

One way to overcome this limitation is to keep a PriorityQueue of
points and simplify the line once it hits this max. If simplifying
makes sense, it may be a nice option in general, with a parameter
to specify how aggressively one wants to simplify. This parameter could be
the number of points. An example algorithm one could use with a PriorityQueue:
https://bost.ocks.org/mike/simplify/. This would still require O(m) space, where m
is the number of points returned. It would also require heapifying triangles
sorted by their areas, which would cost O(log(m)) per heap operation. Since sorting is done
anyway, simplifying would still be an O(n log(m)) operation, where n is the total number
of points to filter. Something to explore.
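
For reference, the area-based simplification described at the linked page can be sketched with a heap as follows. This is an illustrative, in-memory version, not part of this change: the function names are hypothetical, and for simplicity it keeps all n points in the heap rather than the bounded O(m) queue discussed above.

```python
import heapq


def triangle_area(a, b, c):
    """Area of the triangle formed by three (x, y) points."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def simplify(points, max_points):
    """Repeatedly drop the interior point whose triangle with its neighbours is smallest."""
    n = len(points)
    if max_points < 2:
        raise ValueError("max_points must be at least 2 to keep both endpoints")
    if n <= max_points:
        return list(points)

    prev = list(range(-1, n - 1))  # index of the previous surviving point
    nxt = list(range(1, n + 1))    # index of the next surviving point
    alive = [True] * n
    area = [float("inf")] * n      # endpoints are never removed
    heap = []
    for i in range(1, n - 1):
        area[i] = triangle_area(points[i - 1], points[i], points[i + 1])
        heap.append((area[i], i))
    heapq.heapify(heap)

    remaining = n
    while remaining > max_points and heap:
        popped_area, i = heapq.heappop(heap)
        if not alive[i] or popped_area != area[i]:
            continue  # stale entry: the point is gone or its area was recomputed
        alive[i] = False
        remaining -= 1
        before, after = prev[i], nxt[i]
        nxt[before] = after
        prev[after] = before
        # The neighbours now form new triangles, so refresh their areas.
        for j in (before, after):
            if 0 < j < n - 1:
                area[j] = triangle_area(points[prev[j]], points[j], points[nxt[j]])
                heapq.heappush(heap, (area[j], j))
    return [point for point, keep in zip(points, alive) if keep]


line = [(0, 0), (1, 0.1), (2, 0), (3, 5), (4, 0)]
print(simplify(line, 4))
```

Stale heap entries are skipped lazily instead of being deleted in place, which keeps each heap operation logarithmic at the cost of some extra entries.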

closes #41649
2020-11-24 09:30:05 -08:00
Wylie Conlon
4d9f5b1867 Clarify field data cache behavior in docs (#64375)
* Clarify that field data cache includes global ordinals
* Describe that the cache should be cleared once the limit is reached
* Clarify that the `_id` field does not support aggregations anymore
* Fold the `fielddata` mapping parameter page into the `text` field docs
* Improve cross-linking
2020-11-20 13:56:02 -08:00
Adam Locke
8530eaaf98
Explicitly defining types for sources parameter (#65006) (#65021) 2020-11-12 17:08:33 -05:00
Mark Tozzi
491a5a08f3
[7.x] Add support for upper and lower values on boxplot based on the IQR value (#63617) (#64611)
* Add support for upper and lower values on boxplot based on the IQR value (#63617)

* fix List.of usage
2020-11-05 09:18:27 -05:00
James Rodewig
354602e798
[DOCS] Change agg titles to sentence case (#64425) (#64430) 2020-10-30 13:45:54 -04:00
James Rodewig
5b1700b660
[DOCS] Rewrite aggs overview (#64318) (#64409)
- Replaces more abstract docs about object structure and values source with task-based examples.
- Relocates several sections from the current `misc.asciidoc` file.
- Alphabetically sorts agg categories in the nav.
- Removes the matrix agg family. Moves the stats matrix agg under the metric agg family

Co-authored-by: debadair <debadair@elastic.co>
2020-10-30 09:29:26 -04:00
Mark Tozzi
51916aa677
[7.x] Allow mixing set-based and regexp-based include and exclude (#63325) (#64014)
* Allow mixing set-based and regexp-based include and exclude (#63325)

Co-authored-by: Hugo Chargois <hugo.chargois@free.fr>
2020-10-27 10:11:24 -04:00
István Zoltán Szabó
b822e582c3
[DOCS] Changes experimental flag to beta in DFA related docs (#63992) (#64176) 2020-10-26 18:04:21 +01:00
Igor Motov
5ebe90daa0
Add value_count mode to rate agg (#63687) (#63847)
Adds a new value count mode to the rate aggregation.

Closes #63575
2020-10-19 13:04:38 -04:00
Aref Razavi
6f7d0d7018 Remove useless parentheses in bucket_key formula (#63868) 2020-10-19 11:53:09 +02:00
Igor Motov
3bfb11a32a
Add support for histogram fields to rate aggregation (#63289) (#63511)
The rate aggregation now supports histogram fields. At the moment only sum
is supported.

Closes #62939
2020-10-08 19:16:34 -04:00
Benjamin Trent
cfcf973259
[7.x] [ML] renames */inference* apis to */trained_models* (#63097) (#63136)
* [ML] renames */inference* apis to */trained_models* (#63097)

This commit renames all `inference` CRUD APIs to `trained_models`.

This aligns with internal terminology, documentation, and use-cases.
2020-10-02 07:34:28 -04:00
Przemyslaw Gomulka
ee500c10b9
[doc] Rounding range query rules backport (#63109) (#63155)
Adds documentation explaining how missing fields are defaulted when using the date math parser.
Relates to #62268
2020-10-02 09:40:01 +02:00
Lisa Cawley
3838fe1fd4 [DOCS] Add experimental tag to inference processor and bucket aggregation (#63023) 2020-09-30 08:51:26 -07:00
Alexander Reelsen
a6548117d0
[DOCS] Backport normalize aggregation fix (#63017)
This is a backport of 8534bd5ce7 which was only applied to the master branch, but not to 7.x or 7.$current
2020-09-29 11:17:40 -04:00
James Rodewig
42437e4b29
[DOCS] Fix elasticsearch-croneval chunking (#63008) (#63009) 2020-09-29 10:35:23 -04:00
Christos Soulios
ad79a2b6a1
[7.x] Histogram field type support for min/max aggregations (#62689)
Implement min/max aggregations for histogram fields.

Backports #62532
2020-09-21 12:53:56 +03:00
Julie Tibshirani
4a19bdb2ea
Support the 'fields' option in inner_hits and top_hits. (#62337)
This PR adds support for the 'fields' option in the following places:
* Anytime `inner_hits` is used, for both fetching nested/child docs and field collapsing
* The `top_hits` aggregation

Addresses #61949.
2020-09-14 11:51:45 -07:00
Igor Motov
f70a59971a
[7.x] Add rate aggregation (#61369) (#61554)
Adds a new rate aggregation that can calculate a document rate for buckets
of a date_histogram.
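
A request using the new aggregation might look like the sketch below, with the rate agg nested inside a `date_histogram`. The field name `date`, the aggregation names, and the chosen interval and unit are hypothetical; only the general request shape is illustrated, and no response values are shown.

```python
import json

# Hypothetical search body: monthly buckets, with the document rate expressed per day.
rate_request = {
    "size": 0,
    "aggs": {
        "by_month": {
            "date_histogram": {"field": "date", "calendar_interval": "month"},
            "aggs": {
                "docs_per_day": {"rate": {"unit": "day"}},
            },
        }
    },
}

print(json.dumps(rate_request, indent=2))
```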

Closes #60674
2020-08-25 17:39:00 -04:00
István Zoltán Szabó
86dbd68131
[DOCS] Adds example to the inference aggregation description (#61290) (#61318) 2020-08-19 12:07:30 +02:00
Nik Everett
8a387d6df1 Redo experimental tag on vwh (#61065)
The docs didn't have the standard experimental text. This adds it.
2020-08-18 10:02:26 -04:00
James Rodewig
06d3159125
[DOCS] Add usage tips to top_hits agg (#61215) (#61225) 2020-08-17 13:05:40 -04:00
Adam Locke
a3f357c8a5
[DOCS] Update info about geo_shape bounding boxes (#61214) (#61216)
* Adding information about geo_shape bounding boxes.

* Fixing cross link and incorporating review feedback.
2020-08-17 11:44:46 -04:00
James Rodewig
60876a0e32
[DOCS] Replace Wikipedia links with attribute (#61171) (#61209) 2020-08-17 11:27:04 -04:00
James Rodewig
cfa67e933f
[DOCS] Fix chunking in query docs (#61053) (#61054)
Changes:
* Moves "Notes" sections for the joining queries and percolate query
  pages to the parent page
* Adds related redirects for the moved "Notes" pages
* Assigns explicit anchor IDs to other "Notes" headings. This was required for
  the redirects to work.
2020-08-12 14:01:10 -04:00