587 commits

Author SHA1 Message Date
Alan Woodward
7a622f024f
Remove types from BulkRequest (#46983)
This commit removes types entirely from BulkRequest, both as a global
parameter and as individual entries on update/index/delete lines.

Relates to #41059
2019-10-07 13:29:12 +01:00
Mark Tozzi
c26ce1d7f5
DocValueFormat implementation for date range fields (#47472) 2019-10-04 16:01:28 -04:00
Mark Tozzi
57a679fbbb
Documentation notes for Range field histograms (#46890) 2019-10-01 10:46:04 -04:00
Alan Woodward
c1f99e2d75
Remove _type from SearchHit (#46942)
This commit removes the `_type` field from all search hit responses.

Relates to #41059
2019-09-23 19:14:54 +01:00
Javier Ruiz
e8dac62a4a [DOCS] Fix calendar interval typos for date histo agg (#46911) 2019-09-20 15:22:04 -04:00
James Rodewig
370e434986
[DOCS] Correct several [source,console-result] snippets (#46930) 2019-09-20 11:23:15 -04:00
markharwood
dc0abec595
Remove Adjacency_matrix setting in favour of Lucene Boolean query clause setting (#46327)
Closes #46324
2019-09-19 16:48:04 +01:00
Philipp Krenn
7c5adcc7c1 Minor improvement to the nested aggregation docs (#46475)
* Minor improvement to the nested aggregation docs

* The attributes name and resellers.name were rather confusing,
  especially since the first one was dynamically mapped and not shown
  in the documentation (you had to read the test to see it). This
  change introduces a unique name for the nested attribute and adds
  the example document to the documentation.
* Change the index name from "index" to something more descriptive.

* Update docs/reference/aggregations/bucket/nested-aggregation.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/aggregations/bucket/nested-aggregation.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/aggregations/bucket/nested-aggregation.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-09-11 11:23:39 -04:00
James Rodewig
e43be90e6c
[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449) 2019-09-06 14:05:36 -04:00
James Rodewig
466c59a4a7
[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result]" (#46295) 2019-09-05 16:47:18 -04:00
James Rodewig
f5827ba0ae
[DOCS] Replace "// CONSOLE" comments with [source,console] (#46159) 2019-09-04 12:51:02 -04:00
Zachary Tong
758f7999b7
Add CumulativeCard pipeline agg to pipeline index (#46279)
The Cumulative Cardinality docs weren't linked
from the pipeline index page
2019-09-03 12:10:34 -04:00
Zachary Tong
273c35f79c
Add Cumulative Cardinality agg (and Data Science plugin) (#43661)
This adds a pipeline aggregation that calculates the cumulative
cardinality of a field.  It does this by iteratively merging in the
HLL sketch from consecutive buckets and emitting the cardinality up
to that point.

This is useful for things like finding the total "new" users that have
visited a website (as opposed to "repeat" visitors).

This is a Basic+ aggregation and adds a new Data Science plugin
to house it and future advanced analytics/data science aggregations.
2019-08-26 10:43:24 -04:00
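A minimal sketch of the iterative merge described in the commit above. It is an illustration, not the Elasticsearch implementation: an exact Python `set` stands in for the HLL sketch, and the function name and sample data are hypothetical.

```python
def cumulative_cardinality(buckets):
    """Return the running count of distinct values across consecutive buckets."""
    seen = set()  # stand-in for the merged HLL sketch (exact, not approximate)
    totals = []
    for values in buckets:
        seen |= set(values)       # merge this bucket into the running sketch
        totals.append(len(seen))  # emit the cardinality up to this bucket
    return totals


# Hypothetical visitor IDs per day.
daily_users = [["a", "b"], ["b", "c"], ["a", "d", "e"]]
totals = cumulative_cardinality(daily_users)  # [2, 3, 5]

# The difference between consecutive totals is the number of "new" users per day.
new_per_day = [t - p for t, p in zip(totals, [0] + totals[:-1])]  # [2, 1, 2]
```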
LHearen
6dadce1112 [DOCS] Correct conditional clause in histogram agg docs (#45643) 2019-08-19 10:09:10 -04:00
LHearen
d1c0ea7833 [DOCS] Fix a 'value' -> 'values' typo in histogram aggregation docs (#45642) 2019-08-19 10:02:44 -04:00
Zachary Tong
ae7c071ec7
Allow pipeline aggs to select specific buckets from multi-bucket aggs (#44179)
This adjusts the `buckets_path` parser so that pipeline aggs can
select specific buckets (via their bucket keys) instead of fetching
the entire set of buckets.  This is useful for bucket_script in
particular, which might want specific buckets for calculations.

It's possible to work around this with `filter` aggs, but the workaround
is hacky and probably less performant.

- Adjusts documentation
- Adds a barebones AggregatorTestCase for bucket_script
- Tweaks AggTestCase to use getMockScriptService() for reductions and
pipelines.  Previously pipelines could just pass in a script service
for testing, but this didn't work for regular aggs.  The new
getMockScriptService() method fixes that issue, but needs to be used
for pipelines too.  This had a knock-on effect of touching MovFn,
AvgBucket and ScriptedMetric
2019-08-05 12:15:42 -04:00
Nikita Glashenko
ead4eb5209 Add more flexibility to MovingFunction window alignment (#44360)
Introduce shift field to MovingFunction aggregation.

By default, shift = 0. Behavior, in this case, is the same as before.
Increasing shift by 1 moves starting window position by 1 to the right.

    To include the current bucket in the window, use shift = 1
    For center alignment (n/2 values before and after the current bucket), use shift = window / 2
    For right alignment (n values after the current bucket), use shift = window.
2019-08-02 15:09:48 -04:00
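A rough sketch of how `shift` moves the window, assuming the window for bucket `i` covers the indices from `i - window + shift` up to (but excluding) `i + shift`. That index arithmetic is an assumption inferred from the description above, and the function is hypothetical. For simplicity an empty window sums to 0 here.

```python
def moving_sum(values, window, shift=0):
    """Sum each bucket's window; `shift` slides the window to the right."""
    out = []
    for i in range(len(values)):
        # Assumed window bounds, clamped to the available buckets.
        start = max(0, i - window + shift)
        end = min(len(values), max(0, i + shift))
        out.append(sum(values[start:end]))
    return out


values = [1, 2, 3, 4, 5]
moving_sum(values, window=2)           # [0, 1, 3, 5, 7]: current bucket excluded
moving_sum(values, window=2, shift=1)  # [1, 3, 5, 7, 9]: current bucket included
```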
Flavio Pompermaier
e66889635d [DOCS] Correct sum_other_doc_count value in terms agg example (#45028)
Closes issue #41902
2019-07-31 14:10:05 -04:00
Sandeep Kanabar
0e4be837db [Docs] Update daterange-aggregation.asciidoc (#44730)
Correcting the value to be the same as that specified for "missing".
2019-07-29 12:51:15 +02:00
James Rodewig
ea1adb61c2
[DOCS] Update anchors and links for Elasticsearch API relocation (#44500) 2019-07-19 09:16:35 -04:00
Zachary Tong
eac86c9bb8
Document that pipeline aggs are not compatible with composite agg (#44180) 2019-07-12 12:34:34 -04:00
Zachary Tong
3e1f73ffa3
Link rare_terms docs from index page (#43882)
Docs for rare_terms were added in #35718, but were not
linked from the bucket index page
2019-07-02 13:10:46 -04:00
Zachary Tong
baf155dced
Add RareTerms aggregation (#35718)
This adds a `rare_terms` aggregation.  It is an aggregation designed
to identify the long-tail of keywords, e.g. terms that are "rare" or
have low doc counts.

This aggregation is designed to be more memory efficient than the
alternative, which is setting a terms aggregation to size: LONG_MAX
(or worse, ordering a terms agg by count ascending, which has
unbounded error).

This aggregation works by maintaining a map of terms that have
been seen. A counter associated with each value is incremented
when we see the term again.  If the counter surpasses a predefined
threshold, the term is removed from the map and inserted into a cuckoo
filter.  If a future term is found in the cuckoo filter we assume it
was previously removed from the map and is "common".

The map keys are the "rare" terms after collection is done.
2019-07-01 10:02:36 -04:00
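The collection scheme in the commit above can be sketched as follows. This is a simplified illustration: a plain `set` stands in for the cuckoo filter (so there are no false positives), and the function and `threshold` parameter names are chosen for this sketch only.

```python
def rare_terms(terms, threshold=1):
    """Return the terms seen at most `threshold` times."""
    counts = {}     # map of terms seen so far and their counters
    common = set()  # stand-in for the cuckoo filter of "common" terms
    for term in terms:
        if term in common:
            continue  # previously evicted from the map, so it is "common"
        counts[term] = counts.get(term, 0) + 1
        if counts[term] > threshold:
            # Counter passed the threshold: evict and remember as common.
            del counts[term]
            common.add(term)
    # The keys left in the map are the "rare" terms.
    return sorted(counts)


rare_terms(["a", "b", "a", "c", "b", "a", "d"])  # ['c', 'd']
```

Because evicted terms only need a membership test, the map stays bounded by the number of terms that are still candidates for being rare.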
Paul Sanwald
6357857bba
Adds a minimum interval to auto_date_histogram. (#42814)
Adds a minimum interval to `auto_date_histogram`. We do this by
restricting the roundings passed to the aggregator.
2019-06-11 15:53:19 -04:00
Zachary Tong
0192fe7d7c
Add documentation for calendar/fixed intervals (#41919)
Original PR missed documentation for the new calendar/fixed
intervals.  This adds the missing documentation
2019-05-10 15:27:41 -04:00
Zachary Tong
290c8b8256
Force selection of calendar or fixed intervals in date histo agg (#33727)
The date_histogram accepts an interval which can be either a calendar 
interval (DST-aware, leap seconds, arbitrary length of months, etc) or 
fixed interval (strict multiples of SI units). Unfortunately this is inferred
by first trying to parse as a calendar interval, then falling back to fixed
if that fails.

This leads to a confusing arrangement where `1d` == calendar, but
`2d` == fixed.  And if you want a day of fixed time, you have to
specify `24h` (i.e. the next smallest unit).  This arrangement is very
error-prone for users.

This PR adds `calendar_interval` and `fixed_interval` parameters to any
code that uses intervals (date_histogram, rollup, composite, datafeed, etc).
Calendar only accepts calendar intervals, fixed accepts any combination of
units (meaning `1d` can be used to specify `24h` in fixed time), and both
are mutually exclusive.  

The old interval behavior is deprecated and will throw a deprecation warning.
It is also mutually exclusive with the two new parameters. In the future the 
old dual-purpose interval will be removed.

The change applies to both REST and java clients.
2019-05-06 17:17:11 -04:00
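A small sketch of the mutual exclusivity described above, building a `date_histogram` request body as a Python dict. The helper function and the `timestamp` field name are hypothetical; only the `calendar_interval` and `fixed_interval` parameter names come from the commit message.

```python
def date_histogram_body(field, calendar_interval=None, fixed_interval=None):
    """Build a date_histogram body with exactly one kind of interval."""
    if (calendar_interval is None) == (fixed_interval is None):
        raise ValueError(
            "specify exactly one of calendar_interval or fixed_interval"
        )
    if calendar_interval is not None:
        return {"date_histogram": {"field": field,
                                   "calendar_interval": calendar_interval}}
    return {"date_histogram": {"field": field,
                               "fixed_interval": fixed_interval}}


# One calendar day (DST-aware) versus a fixed day of exactly 24 hours.
calendar_day = date_histogram_body("timestamp", calendar_interval="1d")
fixed_day = date_histogram_body("timestamp", fixed_interval="1d")
```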
James Rodewig
adf67053f4
[DOCS] Add anchors for Asciidoctor migration (#41648) 2019-04-30 10:19:09 -04:00
Ignacio Vera
cc48427e05
Improve accuracy for Geo Centroid Aggregation (#41033)
Keeps the partial results as doubles and uses Kahan summation to help reduce floating point errors.
2019-04-25 08:06:55 +02:00
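For reference, a minimal sketch of Kahan (compensated) summation, the technique named in the commit above. It shows the general algorithm only, not the centroid aggregation code, and the function name is chosen for this sketch.

```python
def kahan_sum(values):
    """Sum floats while carrying a correction term for lost low-order bits."""
    total = 0.0
    compensation = 0.0  # running estimate of the rounding error so far
    for v in values:
        y = v - compensation            # apply the correction to the next value
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
    return total


values = [0.1] * 1000
naive = sum(values)            # accumulates rounding error on every addition
compensated = kahan_sum(values)  # stays much closer to 100.0
```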
Zachary Tong
5ccc0b5a32
Disallow null/empty or duplicate composite sources (#41359)
Adds some validation to prevent duplicate source names from being 
used in the composite agg.

Also refactored to use a ConstructingObjectParser and removed the 
private ctor and setter for sources, making it mandatory.
2019-04-24 13:22:06 -04:00
Jason Tedor
656cc709b2
Fix intervals section of auto date-histogram docs (#41203)
This section should be at the same sub-level as other sections in the
auto date-histogram docs, otherwise it is rendered onto another page
and is confusing for users to understand what it's in reference to.
2019-04-15 11:27:38 -04:00
Antonio Matarrese
badb8559fb Use the breadth first collection mode for significant terms aggs.
This helps avoid memory issues when computing deep sub-aggregations. Because it
should be rare to use sub-aggregations with significant terms, we opted to always
choose breadth first as opposed to exposing a `collect_mode` option.

Closes #28652.
2019-04-11 15:38:25 -07:00
Lisa Cawley
13454376a4
[DOCS] Fixes callout for Asciidoctor migration (#41127) 2019-04-11 12:04:04 -07:00
Zachary Tong
6f0f8ab4bc
Remove MovingAverage pipeline aggregation (#39328)
This was deprecated in 6.4.0 and for the entirety of 7.0.  Removed
in 8.0
2019-03-19 15:31:05 -04:00
Ian
98bbb4176e Correct date in daterange-aggregation.asciidoc (#39727) 2019-03-06 11:30:17 +01:00
Samuel Cifuentes García
ff6ffe8ba1 Improved Terms Aggregation documentation (#38892)
Added a note after the first query example talking about fielddata.
2019-03-05 10:17:01 -05:00
Hannes Van De Vreken
b76a380f18 Fix typo in DateRange docs (yyy → yyyy) (#38883) 2019-02-15 10:20:16 -05:00
Alexander Reelsen
5f7168ea74
Remove joda time mentions in documentation (#38720)
This is the forward port of #38720 (not containing the 7.0 migration docs)
2019-02-14 10:18:48 +01:00
Yuri Astrakhan
f133bf4ed8
add geotile_grid ref to asciidoc (#38632) 2019-02-08 11:37:35 -05:00
Yuri Astrakhan
f3cde06a1d
geotile_grid implementation (#37842)
Implements `geotile_grid` aggregation

This patch refactors previous implementation https://github.com/elastic/elasticsearch/pull/30240

This code uses the same base classes as `geohash_grid` agg, but uses a different hashing
algorithm to allow zoom consistency.  Each grid bucket is aligned to Web Mercator tiles.
2019-01-31 19:11:30 -05:00
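To illustrate what alignment to Web Mercator tiles and zoom consistency mean, here is a sketch using the widely used "slippy map" tile formula. It is not taken from the Elasticsearch code; the function name and the sample coordinates are hypothetical.

```python
import math


def tile_for(lat, lon, zoom):
    """Return the (x, y) Web Mercator tile containing a point at a zoom level."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge still fall in a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


# Zoom consistency: the tile at zoom z is the parent of the tile at zoom z + 1.
x4, y4 = tile_for(48.8566, 2.3522, 4)
x5, y5 = tile_for(48.8566, 2.3522, 5)
assert (x5 // 2, y5 // 2) == (x4, y4)
```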
Julie Tibshirani
9ca26b7e63
Remove more references to type in docs. (#37946)
* Update the top-level 'getting started' guide.
* Remove custom types from the painless getting started documentation.
* Fix incorrect references to '_doc' in the cardinality query docs.
* Update the _update docs to use the typeless API format.
2019-01-29 10:51:07 -08:00
Jim Ferenczi
cb451edb01
Allow nested fields in the composite aggregation (#37178)
This change adds support for handling `nested` fields in the `composite`
aggregation. A `nested` aggregation can be used as parent of a `composite`
aggregation in order to target `nested` fields in the `sources`.

Closes #28611
2019-01-25 14:00:39 +01:00
Christoph Büscher
967de04257
Uppercasing some docs section title (#37781)
Section titles are mostly uppercase; only in the few cases where query DSL parameters
or Java method names are used as the title should they be lowercase.
2019-01-24 22:54:55 +01:00
Christoph Büscher
95a6951f78
Use new bulk API endpoint in the docs (#37698)
This change switches to using the typeless bulk API endpoint in the
documentation snippets where possible
2019-01-23 09:46:28 +01:00
Boaz Leskes
52ba407931
Expose sequence number and primary terms in search responses (#37639)
Users may require the sequence number and primary term to perform optimistic concurrency control operations. Currently, you can get the sequence number via the `docvalue_fields` API but the primary term is not accessible because it is maintained by the `SeqNoFieldMapper` and the infrastructure can't find it.

This commit adds a dedicated sub fetch phase that returns both numbers and is controlled by a new `seq_no_primary_term` parameter.
2019-01-23 09:01:58 +01:00
Christoph Büscher
34f2d2ec91
Remove remaining occurrences of "include_type_name=true" in docs (#37646) 2019-01-22 15:13:52 +01:00
Christoph Büscher
3a96608b3f
Remove more include_type_name and types from docs (#37601) 2019-01-18 14:11:18 +01:00
Christoph Büscher
25aac4f77f
Remove include_type_name in asciidoc where possible (#37568)
The "include_type_name" parameter was temporarily introduced in #37285 to facilitate
moving the default parameter setting to "false" in many places in the documentation
code snippets. Most of the places can simply be reverted without causing errors.
In this change I looked for asciidoc files that contained the
"include_type_name=true" addition when creating new indices but didn't look
like they made use of the "_doc" type for mappings. This is mostly the case,
e.g., in the analysis docs, where index creation often only contains settings. I
manually corrected the use of types in some places where the docs still used an
explicit type name and not the dummy "_doc" type.
2019-01-18 09:34:11 +01:00
Julie Tibshirani
36a3b84fc9
Update the default for include_type_name to false. (#37285)
* Default include_type_name to false for get and put mappings.

* Default include_type_name to false for get field mappings.

* Add a constant for the default include_type_name value.

* Default include_type_name to false for get and put index templates.

* Default include_type_name to false for create index.

* Update create index calls in REST documentation to use include_type_name=true.

* Some minor clean-ups around the get index API.

* In REST tests, use include_type_name=true by default for index creation.

* Make sure to use 'expression == false'.

* Clarify the different IndexTemplateMetaData toXContent methods.

* Fix FullClusterRestartIT#testSnapshotRestore.

* Fix the ml_anomalies_default_mappings test.

* Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests.

We make sure to specify include_type_name=true during xContent parsing,
so we continue to test the legacy typed responses. XContent generation
for the typeless responses is currently only covered by REST tests,
but we will be adding unit test coverage for these as we implement
each typeless API in the Java HLRC.

This commit also refactors GetMappingsResponse to follow the same approach
as the other mappings-related responses, where we read include_type_name
out of the xContent params, instead of creating a second toXContent method.
This gives better consistency in the response parsing code.

* Fix more REST tests.

* Improve some wording in the create index documentation.

* Add a note about types removal in the create index docs.

* Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL.

* Make sure to mention include_type_name in the REST docs for affected APIs.

* Make sure to use 'expression == false' in FullClusterRestartIT.

* Mention include_type_name in the REST templates docs.
2019-01-14 13:08:01 -08:00
Josh Soref
edb48321ba [DOCS] Various spelling corrections (#37046) 2019-01-07 14:44:12 +01:00
Igor Motov
d6acd8e15f
Docs: add clarification about geohash use in geohashgrid agg (#36901)
Adds an example on translating geohashes returned by geohashgrid 
agg as bucket keys into geo bounding box filters in elasticsearch as well
as 3rd party applications.

Closes #36413
2019-01-03 15:40:48 -05:00