elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-06-30 18:33:26 -04:00

Author	SHA1	Message	Date
AB Prashanth	785527bb58	[DOCS] Remove approximate document counts example from term agg docs (#55442 ) Removes an example from the "Document counts are approximate" section of the terms agg documentation. As #52377 details, the example was no longer accurate in 7.x or 6.8. Document counts were more precise than the example presented. We've opened issue #56025 to discuss re-adding an example later. Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-04-30 09:49:32 -04:00
Zachary Tong	9f165bd44e	Aggs must specify a `field` or `script` (or both) (#52226 ) * Aggs must specify a `field` or `script` (or both) This adds a validation to VSParserHelper to ensure that a field or script or both are specified by the user. This is technically required today already, but throws an exception much deeper in the agg framework and has a very unintuitive error for the user (as well as eating more resources instead of failing early) * Fix StringStats test * Add yaml test * Skip test on older versions Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-04-23 14:26:38 -04:00
Anton Dollmaier	e9c8c03fee	[DOCS] Fix parameter formatting for GeoHash grid agg docs (#53032 ) Adds missing colon (`:`) to the parameter definition list.	2020-03-09 08:17:57 -04:00
Tal Levy	6c86606d2a	Adds support for geo-bounds filtering in geogrid aggregations (#50002 ) It is fairly common to filter the geo point candidates in geohash_grid and geotile_grid aggregations according to some viewable bounding box. This change introduces the option of specifying this filter directly in the tiling aggregation. This is even more relevant to `geo_shape`, where the bounds will restrict the shape to be within the bounds. This optional `bounds` parameter is parsed in an equivalent fashion to the bounds specified in the geo_bounding_box query.	2020-01-14 08:29:10 -08:00
Nik Everett	326d696d9a	Support offset in composite aggs (#50609 ) Adds support for the `offset` parameter to the `date_histogram` source of composite aggs. The `offset` parameter is supported by the normal `date_histogram` aggregation and is useful for folks that need to measure things from, say, 6am one day to 6am the next day. This is implemented by creating a new `Rounding` that knows how to handle offsets and delegates to other rounding implementations. That implementation doesn't fully implement the `Rounding` contract, namely `nextRoundingValue`. That method isn't used by composite aggs so I can't be sure that any implementation that I add will be correct. I propose to leave it throwing `UnsupportedOperationException` until I need it. Closes #48757	2020-01-07 14:49:09 -05:00
Nik Everett	a7cc0b0159	Docs: Refine note about `after_key` (#50475 ) * Docs: Refine note about `after_key` I was curious about composite aggregations, specifically I wanted to know how to write a composite aggregation that had all of its buckets filtered out so you had to use the `after_key`. Then I saw that we've declared composite aggregations not to work with pipelines in #44180. So I'm not sure you can do that any more. Which makes the note about `after_key` inaccurate. This rejiggers that section of the docs a little so it is more obvious that you send the `after_key` back to us. And so it is more obvious that you should only use the `after_key` that we give you rather than try to work it out for yourself. * Apply suggestions from code review Co-Authored-By: James Rodewig <james.rodewig@elastic.co> Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-01-02 10:02:55 -05:00
Lisa Cawley	6d608e6a0d	[DOCS] Move transform resource definitions into APIs (#50108 )	2019-12-17 09:01:31 -08:00
Jim Ferenczi	804a5042e7	Optimize composite aggregation based on index sorting (#48399 ) Co-authored-by: Daniel Huang <danielhuang@tencent.com> This is a spinoff of #48130 that generalizes the proposal to allow early termination with the composite aggregation when leading sources match a prefix or the entire index sort specification. In such case the composite aggregation can use the index sort natural order to early terminate the collection when it reaches a composite key that is greater than the bottom of the queue. The optimization is also applicable when a query other than match_all is provided. However the optimization is deactivated for sources that match the index sort in the following cases: * Multi-valued source, in such case early termination is not possible. * missing_bucket is set to true	2019-12-17 14:02:06 +01:00
Przemko Robakowski	04f6b6fdb2	[DOCS] IDs for doc snippets (#49008 ) * Ids for docs snippets * Ids for tests * Ids for docs snippets * ignoring build folder from idea * Ignoring build-eclipse	2019-11-25 15:30:00 +01:00
Lisa Cawley	a4efab6ab4	[DOCS] Merge rollup config details into API (#49412 )	2019-11-22 08:31:30 -08:00
James Rodewig	f53eba024b	[DOCS] Remove binary gendered language (#48362 )	2019-10-23 09:36:31 -05:00
Alan Woodward	566e1b7d33	Remove type field from DocWriteRequest and associated Response objects (#47671 ) This commit removes the type field from index, update and delete requests, and their associated responses. Relates to #41059	2019-10-11 10:23:55 +01:00
Alan Woodward	7a622f024f	Remove types from BulkRequest (#46983 ) This commit removes types entirely from BulkRequest, both as a global parameter and as individual entries on update/index/delete lines. Relates to #41059	2019-10-07 13:29:12 +01:00
Mark Tozzi	c26ce1d7f5	DocValueFormat implementation for date range fields (#47472 )	2019-10-04 16:01:28 -04:00
Mark Tozzi	57a679fbbb	Documentation notes for Range field histograms (#46890 )	2019-10-01 10:46:04 -04:00
Javier Ruiz	e8dac62a4a	[DOCS] Fix calendar interval typos for date histo agg (#46911 )	2019-09-20 15:22:04 -04:00
James Rodewig	370e434986	[DOCS] Correct several [source,console-result] snippets (#46930 )	2019-09-20 11:23:15 -04:00
markharwood	dc0abec595	Remove Adjacency_matrix setting in favour of Lucene Boolean query clause setting (#46327 ) Closes #46324	2019-09-19 16:48:04 +01:00
Philipp Krenn	7c5adcc7c1	Minor improvement to the nested aggregation docs (#46475 ) * Minor improvement to the nested aggregation docs * The attributes name and resellers.name were rather confusing, especially since the first one was dynamically mapped and not shown in the documentation (you had to read the test to see it). This change introduces a unique name for the nested attribute and adds the example document to the documentation. * Change the index name from "index" to something more speaking. * Update docs/reference/aggregations/bucket/nested-aggregation.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Update docs/reference/aggregations/bucket/nested-aggregation.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co> * Update docs/reference/aggregations/bucket/nested-aggregation.asciidoc Co-Authored-By: James Rodewig <james.rodewig@elastic.co>	2019-09-11 11:23:39 -04:00
James Rodewig	e43be90e6c	[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449 )	2019-09-06 14:05:36 -04:00
James Rodewig	466c59a4a7	[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result] (#46295 )	2019-09-05 16:47:18 -04:00
James Rodewig	f5827ba0ae	[DOCS] Replace "// CONSOLE" comments with [source,console] (#46159 )	2019-09-04 12:51:02 -04:00
LHearen	6dadce1112	[DOCS] Correct conditional clause in histogram agg docs (#45643 )	2019-08-19 10:09:10 -04:00
LHearen	d1c0ea7833	[DOCS] Fix a 'value' -> 'values' typo in histogram aggregation docs (#45642 )	2019-08-19 10:02:44 -04:00
Flavio Pompermaier	e66889635d	[DOCS] Correct sum_other_doc_count value in terms agg example (#45028 ) Closes issue #41902	2019-07-31 14:10:05 -04:00
Sandeep Kanabar	0e4be837db	[Docs] Update daterange-aggregation.asciidoc (#44730 ) Correcting the value to be the same as that specified for "missing".	2019-07-29 12:51:15 +02:00
James Rodewig	ea1adb61c2	[DOCS] Update anchors and links for Elasticsearch API relocation (#44500 )	2019-07-19 09:16:35 -04:00
Zachary Tong	eac86c9bb8	Document that pipeline aggs are not compatible with composite agg (#44180 )	2019-07-12 12:34:34 -04:00
Zachary Tong	baf155dced	Add RareTerms aggregation (#35718 ) This adds a `rare_terms` aggregation. It is an aggregation designed to identify the long-tail of keywords, e.g. terms that are "rare" or have low doc counts. This aggregation is designed to be more memory efficient than the alternative, which is setting a terms aggregation to size: LONG_MAX (or worse, ordering a terms agg by count ascending, which has unbounded error). This aggregation works by maintaining a map of terms that have been seen. A counter associated with each value is incremented when we see the term again. If the counter surpasses a predefined threshold, the term is removed from the map and inserted into a cuckoo filter. If a future term is found in the cuckoo filter we assume it was previously removed from the map and is "common". The map keys are the "rare" terms after collection is done.	2019-07-01 10:02:36 -04:00
Paul Sanwald	6357857bba	Adds a minimum interval to `auto_date_histogram`. (#42814 ) Adds a minimum interval to `auto_date_histogram`. We do this by restricting the roundings passed into the aggregator.	2019-06-11 15:53:19 -04:00
Zachary Tong	0192fe7d7c	Add documentation for calendar/fixed intervals (#41919 ) Original PR missed documentation for the new calendar/fixed intervals. This adds the missing documentation	2019-05-10 15:27:41 -04:00
Zachary Tong	290c8b8256	Force selection of calendar or fixed intervals in date histo agg (#33727 ) The date_histogram accepts an interval which can be either a calendar interval (DST-aware, leap seconds, arbitrary length of months, etc) or fixed interval (strict multiples of SI units). Unfortunately this is inferred by first trying to parse as a calendar interval, then falling back to fixed if that fails. This leads to a confusing arrangement where `1d` == calendar, but `2d` == fixed. And if you want a day of fixed time, you have to specify `24h` (e.g. the next smallest unit). This arrangement is very error-prone for users. This PR adds `calendar_interval` and `fixed_interval` parameters to any code that uses intervals (date_histogram, rollup, composite, datafeed, etc). Calendar only accepts calendar intervals, fixed accepts any combination of units (meaning `1d` can be used to specify `24h` in fixed time), and both are mutually exclusive. The old interval behavior is deprecated and will throw a deprecation warning. It is also mutually exclusive with the two new parameters. In the future the old dual-purpose interval will be removed. The change applies to both REST and java clients.	2019-05-06 17:17:11 -04:00
James Rodewig	adf67053f4	[DOCS] Add anchors for Asciidoctor migration (#41648 )	2019-04-30 10:19:09 -04:00
Zachary Tong	5ccc0b5a32	Disallow null/empty or duplicate composite sources (#41359 ) Adds some validation to prevent duplicate source names from being used in the composite agg. Also refactored to use a ConstructingObjectParser and removed the private ctor and setter for sources, making it mandatory.	2019-04-24 13:22:06 -04:00
Jason Tedor	656cc709b2	Fix intervals section of auto date-histogram docs (#41203 ) This section should be at the same sub-level as other sections in the auto date-histogram docs, otherwise it is rendered on to another page and is confusing for users to understand what it's in reference to.	2019-04-15 11:27:38 -04:00
Antonio Matarrese	badb8559fb	Use the breadth first collection mode for significant terms aggs. This helps avoid memory issues when computing deep sub-aggregations. Because it should be rare to use sub-aggregations with significant terms, we opted to always choose breadth first as opposed to exposing a `collect_mode` option. Closes #28652.	2019-04-11 15:38:25 -07:00
Lisa Cawley	13454376a4	[DOCS] Fixes callout for Asciidoctor migration (#41127 )	2019-04-11 12:04:04 -07:00
Ian	98bbb4176e	Correct date in daterange-aggregation.asciidoc (#39727 )	2019-03-06 11:30:17 +01:00
Samuel Cifuentes García	ff6ffe8ba1	Improved Terms Aggregation documentation (#38892 ) Added a note after the first query example talking about fielddata.	2019-03-05 10:17:01 -05:00
Hannes Van De Vreken	b76a380f18	Fix typo in DateRange docs (yyy → yyyy) (#38883 )	2019-02-15 10:20:16 -05:00
Alexander Reelsen	5f7168ea74	Remove joda time mentions in documentation (#38720 ) This is the forward port of #38720 (not containing the 7.0 migration docs)	2019-02-14 10:18:48 +01:00
Yuri Astrakhan	f3cde06a1d	geotile_grid implementation (#37842 ) Implements `geotile_grid` aggregation This patch refactors previous implementation https://github.com/elastic/elasticsearch/pull/30240 This code uses the same base classes as `geohash_grid` agg, but uses a different hashing algorithm to allow zoom consistency. Each grid bucket is aligned to Web Mercator tiles.	2019-01-31 19:11:30 -05:00
Jim Ferenczi	cb451edb01	Allow nested fields in the composite aggregation (#37178 ) This change adds support for handling `nested` fields in the `composite` aggregation. A `nested` aggregation can be used as parent of a `composite` aggregation in order to target `nested` fields in the `sources`. Closes #28611	2019-01-25 14:00:39 +01:00
Christoph Büscher	967de04257	Uppercasing some docs section title (#37781 ) Section titles are mostly uppercase, only a few cases where query DSL parameters or Java method names are used as the title they should be lowercased.	2019-01-24 22:54:55 +01:00
Christoph Büscher	95a6951f78	Use new bulk API endpoint in the docs (#37698 ) This change switches to using the typeless bulk API endpoint in the documentation snippets where possible	2019-01-23 09:46:28 +01:00
Christoph Büscher	3a96608b3f	Remove more include_type_name and types from docs (#37601 )	2019-01-18 14:11:18 +01:00
Christoph Büscher	25aac4f77f	Remove `include_type_name` in asciidoc where possible (#37568 ) The "include_type_name" parameter was temporarily introduced in #37285 to facilitate moving the default parameter setting to "false" in many places in the documentation code snippets. Most of the places can simply be reverted without causing errors. In this change I looked for asciidoc files that contained the "include_type_name=true" addition when creating new indices but didn't look like they made use of the "_doc" type for mappings. This is mostly the case e.g. in the analysis docs where index creation often only contains settings. I manually corrected the use of types in some places where the docs still used an explicit type name and not the dummy "_doc" type.	2019-01-18 09:34:11 +01:00
Julie Tibshirani	36a3b84fc9	Update the default for include_type_name to false. (#37285 ) * Default include_type_name to false for get and put mappings. * Default include_type_name to false for get field mappings. * Add a constant for the default include_type_name value. * Default include_type_name to false for get and put index templates. * Default include_type_name to false for create index. * Update create index calls in REST documentation to use include_type_name=true. * Some minor clean-ups around the get index API. * In REST tests, use include_type_name=true by default for index creation. * Make sure to use 'expression == false'. * Clarify the different IndexTemplateMetaData toXContent methods. * Fix FullClusterRestartIT#testSnapshotRestore. * Fix the ml_anomalies_default_mappings test. * Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests. We make sure to specify include_type_name=true during xContent parsing, so we continue to test the legacy typed responses. XContent generation for the typeless responses is currently only covered by REST tests, but we will be adding unit test coverage for these as we implement each typeless API in the Java HLRC. This commit also refactors GetMappingsResponse to follow the same approach as the other mappings-related responses, where we read include_type_name out of the xContent params, instead of creating a second toXContent method. This gives better consistency in the response parsing code. * Fix more REST tests. * Improve some wording in the create index documentation. * Add a note about types removal in the create index docs. * Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL. * Make sure to mention include_type_name in the REST docs for affected APIs. * Make sure to use 'expression == false' in FullClusterRestartIT. * Mention include_type_name in the REST templates docs.	2019-01-14 13:08:01 -08:00
Igor Motov	d6acd8e15f	Docs: add clarification about geohash use in geohashgrid agg (#36901 ) Adds an example on translating geohashes returned by geohashgrid agg as bucket keys into geo bounding box filters in elasticsearch as well as 3rd party applications. Closes #36413	2019-01-03 15:40:48 -05:00
Luca Cavanna	42ea644903	Remove single shard optimization when suggesting shard_size (#37041 ) When executing terms aggregations we set the shard_size, meaning the number of buckets to collect on each shard, to a value that's higher than the number of requested buckets, to guarantee some basic level of precision. We have an optimization in place so that we leave shard_size set to size whenever we are searching against a single shard, in which case maximum precision is guaranteed by definition. Such optimization requires access to the total number of shards that the search is executing against. In the context of cross-cluster search, once we introduce multiple reduction steps (one per cluster) each cluster will only know the number of local shards, which is problematic as we should only optimize if we are searching against a single shard in a single cluster. It could be that we are searching against one shard per cluster, in which case the current code would optimize the number of terms, causing a loss of precision. While discussing how to address the CCS scenario, we decided that we do not want to introduce further complexity caused by this single shard optimization, as it benefits only a minority of cases, especially when the benefits are not so great. This commit removes the single shard optimization, meaning that we will always have the heuristic enabled for how many buckets to collect on the shards, even when searching against a single shard. This will cause more buckets to be collected when searching against a single shard compared to before. If that becomes a problem for some users, they can work around that by setting the shard_size equal to the size. Relates to #32125	2019-01-02 17:45:49 +01:00
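
The workaround mentioned in the last commit above (42ea644903), setting `shard_size` equal to `size`, can be sketched as a request body. This is a minimal illustration rather than code from the repository: the helper name `terms_agg_body`, the aggregation name `by_term`, and the field name `genre` are hypothetical, while `size` and `shard_size` are the terms aggregation parameters the commit message refers to.

```python
import json


def terms_agg_body(field, size, pin_shard_size=True):
    """Build a search request body containing one terms aggregation.

    Hypothetical helper for illustration only. When pin_shard_size is
    True, shard_size is set equal to size, which is the workaround the
    commit message suggests for single-shard searches.
    """
    terms = {"field": field, "size": size}
    if pin_shard_size:
        # Collect exactly `size` buckets per shard instead of the
        # larger number the default heuristic would pick.
        terms["shard_size"] = size
    # The top-level "size": 0 asks for aggregation results only, no hits.
    return {"size": 0, "aggs": {"by_term": {"terms": terms}}}


if __name__ == "__main__":
    # "genre" is an assumed field name.
    print(json.dumps(terms_agg_body("genre", 5), indent=2))
```

Leaving `shard_size` out keeps the default heuristic, which is the safer choice whenever the search may span more than one shard.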

202 commits
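
The collection scheme described in the `rare_terms` commit (baf155dced) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Elasticsearch implementation: an exact Python `set` stands in for the cuckoo filter (so there are no false positives and no memory savings), and the function name `rare_terms` and threshold parameter `max_doc_count` are chosen here for illustration.

```python
def rare_terms(values, max_doc_count=1):
    """Return the terms seen at most max_doc_count times, with their counts.

    Sketch of the scheme in the commit message: keep a map of candidate
    rare terms; once a term's counter passes the threshold, move it out
    of the map into a "common" structure and ignore it from then on.
    """
    counts = {}     # candidate rare terms -> number of times seen
    common = set()  # stand-in for the cuckoo filter (assumption)

    for term in values:
        if term in common:
            # Previously evicted from the map, so treat it as common.
            continue
        seen = counts.get(term, 0) + 1
        if seen > max_doc_count:
            # Threshold surpassed: evict from the map, remember as common.
            counts.pop(term, None)
            common.add(term)
        else:
            counts[term] = seen

    # The remaining map keys are the rare terms.
    return counts


if __name__ == "__main__":
    print(rare_terms(["a", "b", "a", "c", "b", "a"]))
```

With a real cuckoo filter the membership check is probabilistic, so a rare term can occasionally be misclassified as common; the exact set used here avoids that at the cost of unbounded memory.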