It's not obvious that a YAML test with a `catch` stanza also permits
`match` blocks to assert things about the structure of the error
response, but this structure may be an important part of the API spec.
This commit adds this info to the docs about YAML tests.
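For example, a test along these lines can assert on the error body (the request and error details here are illustrative, not from a real suite):
```yaml
- do:
    catch: bad_request
    search:
      index: test
      body:
        query:
          # intentionally invalid query to trigger the error response
          no_such_query: {}
- match: { status: 400 }
- match: { error.type: "parsing_exception" }
```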
Adds REST tests for the `percentiles_bucket` pipeline aggregation. This
gives us forwards and backwards compatibility tests for this agg as well
as mixed version cluster tests for it.
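For reference, the kind of request these tests exercise looks roughly like this (index and field names are illustrative):
```yaml
- do:
    search:
      index: test
      body:
        size: 0
        aggs:
          histo:
            date_histogram:
              field: timestamp
              calendar_interval: month
            aggs:
              sales:
                sum:
                  field: price
          sales_percentiles:
            percentiles_bucket:
              buckets_path: "histo>sales"
              percents: [25.0, 50.0, 75.0]
```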
Relates to #26220
Adds REST tests for the `cumulative_cardinality` and `cumulative_sum`
pipeline aggregations. This gives us forwards and backwards compatibility
tests for these aggs as well as mixed version cluster tests for these
aggs.
Relates to #26220
Adds support for loading `text` and `keyword` fields that have
`store: true`. We could likely load *any* stored fields, but I
wanted to blaze the trail using something fairly useful.
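A minimal sketch of the kind of mapping this now supports (names are illustrative):
```yaml
- do:
    indices.create:
      index: test
      body:
        mappings:
          properties:
            title:
              type: text
              store: true     # stored copy of the field, loadable without _source
            tag:
              type: keyword
              store: true
```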
I broke shard splitting when `_routing` is required and you use `nested`
docs. The mapping would look like this:
```
"mappings": {
"_routing": {
"required": true
},
"properties": {
"n": { "type": "nested" }
}
}
```
If you attempt to split an index with a mapping like this it'll blow up
with an exception like this:
```
Caused by: [idx] org.elasticsearch.action.RoutingMissingException: routing is required for [idx]/[0]
at org.elasticsearch.cluster.routing.IndexRouting$IdAndRoutingOnly.checkRoutingRequired(IndexRouting.java:181)
at org.elasticsearch.cluster.routing.IndexRouting$IdAndRoutingOnly.getShard(IndexRouting.java:175)
```
This fixes the problem by avoiding that branch of code entirely. The
branch was trying to find any top-level documents that don't have a
`_routing`. But we *know* that there aren't any top-level documents
without a routing in this case - the routing is "required", so ES
wouldn't have let you index any top-level documents without it.
This also adds a small pile of REST layer tests for shard splitting that
hit various branches in this area, for extra paranoia.
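For reference, a rough YAML REST test reproducing the original failure (index names are illustrative):
```yaml
- do:
    indices.create:
      index: idx
      body:
        settings:
          index.number_of_shards: 1
        mappings:
          _routing:
            required: true
          properties:
            n:
              type: nested
- do:
    index:
      index: idx
      id: "1"
      routing: "r1"
      body:
        n: [ { foo: bar } ]
- do:
    indices.put_settings:
      index: idx
      body:
        index.blocks.write: true    # splitting requires a write block
- do:
    indices.split:
      index: idx
      target: idx-split
      body:
        settings:
          index.number_of_shards: 2
```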
Closes #88109
This PR expands the existing GetProfile API to support getting multiple
profiles by ID. As a result, the response format has also changed to
align with the latest version of the API design guidelines. Concretely,
this means moving the profiles into an array inside a top-level
"profiles" field, so that (1) dynamic fields (the uid) are not mixed
with static fields and (2) the response has a defined order, which is
desirable for clients.
The change also reports any errors encountered during retrieval in a
top-level "errors" field.
Relates: #81910
This adds a new `_ml/trained_models/<model_id>/deployment/cache/_clear` API. This will clear the inference cache on every node where the model is allocated.
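Invoked like so, assuming `ml.clear_trained_model_deployment_cache` is the client name for the new endpoint (the model ID is illustrative):
```yaml
- do:
    ml.clear_trained_model_deployment_cache:
      model_id: my-model            # hypothetical deployed model
```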
If a doc-values field matches multiple field patterns, ES will return
the value of that field multiple times. As when fetching fields from
source, we should deduplicate the matching doc-values fields.
We previously removed support for `fields` in the request body, to ensure there
was only one way to specify the parameter. We've now decided to undo the
change, since it was disruptive and the request body is actually the best place to
pass variable-length data like `fields`.
This PR restores support for `fields` in the request body. It throws an error
if the parameter is specified both in the URL and the body.
Closes #86875
To assist the user in configuring the visualizations correctly while leveraging TSDB
functionality, information about TSDB configuration should be exposed via the field
caps API per field.
Especially for metric fields, it must be clear which fields are metrics and whether they belong
only to time-series indices or to mixed time-series and non-time-series indices.
Metric fields must be further distinguished when they belong to any of the following index types:
- Standard (non-time-series) indices
- Time series indices
- Downsampled time series indices
This PR modifies the field caps API so that the mapping parameters `time_series_dimension`
and `time_series_metric` are presented only when they are set on fields of time-series indices.
Those parameters are completely ignored when they are set on standard (non-time-series) indices.
This PR revisits some of the conventions adopted by #78790
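As a sketch, a field caps request against a time-series index could then surface the metric type (index, field, and type names are illustrative):
```yaml
- do:
    field_caps:
      index: tsdb-index             # hypothetical time-series index
      fields: cpu_usage
- match: { fields.cpu_usage.double.time_series_metric: gauge }
```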
Also add support for new CATALINA/TOMCAT timestamp formats used by ECS Grok patterns
Relates #77065
Co-authored-by: David Roberts <dave.roberts@elastic.co>
This change deprecates the kNN search API in favor of the new 'knn' option
inside the search API. The 'knn' option is now the preferred way of performing
kNN search.
Relates to #87625
This formats the result of the `fields` section of the `_search` API for
runtime `geo_point` fields using the `format` parameter like we do for
non-runtime `geo_point` fields. This changes the default format for
those fields from `lat, lon` to `geojson` with the option to get `wkt`
or any other format we support.
The fix preserves the `double, double` nature of the `geo_point` rather
than encoding it immediately in the script, so callers can format the
results as they need. The field fetchers use the `double, double` values
natively, preserving as much precision as possible. The queries quantize
the points exactly like Lucene indexing does, and like the script did
before this PR.
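With this change, a request like the following (illustrative names) behaves the same for runtime and indexed `geo_point` fields:
```yaml
- do:
    search:
      index: test
      body:
        fields:
          - field: location         # a runtime geo_point field
            format: wkt             # or geojson (the new default)
- match: { hits.hits.0.fields.location.0: "POINT (-71.34 41.12)" }
```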
Closes #85245
This change adds support for kNN vector fields to the `_disk_usage` API. The
strategy:
* Iterate the vector values (using the same strategy as for doc values) to
estimate the vector data size
* Run some random vector searches to estimate the vector index size
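Assuming the usual analyze-disk-usage client name `indices.disk_usage`, the estimate is produced by a call like this (index name is illustrative):
```yaml
- do:
    indices.disk_usage:
      index: my-knn-index
      run_expensive_tasks: true   # required; the analysis is CPU intensive
```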
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Closes #84801
Add the `dry_run` query parameter to support simulating updates of the desired nodes. The update request will be validated, but no cluster state updates will be performed. To indicate that the response is the result of a dry run, we add a `dry_run` field to the JSON representation of the response.
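A sketch of a dry run, assuming the internal endpoint's client name and an illustrative desired-nodes payload:
```yaml
- do:
    _internal.update_desired_nodes:
      history_id: "test-history"
      version: 1
      dry_run: true
      body:
        nodes:
          - settings:
              node.name: "instance-1"
            processors: 8
            memory: "32gb"
            storage: "128gb"
            node_version: "8.4.0"   # assumed field shape for this payload
- match: { dry_run: true }
```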
See #82975
This commit removes the notion of components from the health API. They are gone from being
a top-level field in the response, and indicators is promoted into their place.
It also removes `help_url`, renames `summary` to `symptom` and `user_actions` to `diagnosis`,
and separates the diagnosis `message` field into `cause` and `action`.
Co-authored-by: Mary Gouseti <mgouseti@gmail.com>
This PR adds a new `knn` option to the `_search` API to support ANN search.
It's powered by the same Lucene ANN capabilities as the old `_knn_search`
endpoint. The `knn` option can be combined with other search features like
queries and aggregations.
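For example (index, field, and vector values are illustrative):
```yaml
- do:
    search:
      index: my-index
      body:
        knn:
          field: image_vector
          query_vector: [0.12, 0.45, 0.67]
          k: 10
          num_candidates: 100
        aggs:
          avg_price:
            avg:
              field: price
```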
Addresses #87625
This adds support for the `cardinality` aggregation within a `random_sampler` aggregation.
This use case is helpful in determining the ratio of unique values to the total document count within the sampled set.
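A sketch of the combination (field names and probability are illustrative):
```yaml
- do:
    search:
      index: test
      body:
        size: 0
        aggs:
          sampled:
            random_sampler:
              probability: 0.1
            aggs:
              unique_users:
                cardinality:
                  field: user_id
```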
Propagate alias filters to significance aggs filters
If we have an alias filter, use it as part of the background filter on a
significant terms agg. Previously, alias filters did not apply to background
filters, so this will change `bg_count` results for some significant terms aggs
that use a background filter.
Closes #81585
With https://github.com/elastic/ml-cpp/pull/2305 we now support caching PyTorch inference responses per node per model.
By default, the cache will be the same size as the model's on-disk size. This is because our current best estimate for memory used (for deploying) is 2*model_size + constant_overhead,
which is due to the model having to be loaded in memory twice when serializing to the native process.
But once the model is in memory and accepting requests, its actual memory usage is reduced vs. what we have "reserved" for it within the node.
Consequently, a cache layer that takes advantage of that unused (but reserved) memory is effectively free. When used in production, especially in search scenarios, caching inference results is critical for decreasing latency.
Currently we have two parameters that control how the source of a document
is stored, `enabled` and `synthetic`, both booleans. However, there are only
three possible combinations of these, with `enabled:false` and `synthetic:true`
being disallowed. To make this easier to reason about, this commit replaces
the `enabled` parameter with a new `mode` parameter, which can take the values
`stored`, `synthetic` and `disabled`. The `mode` parameter cannot be set
in combination with `enabled`, and we will subsequently move towards
deprecating `enabled` entirely.
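A minimal sketch of the new parameter:
```yaml
- do:
    indices.create:
      index: test
      body:
        mappings:
          _source:
            mode: synthetic   # or stored / disabled
          properties:
            kwd:
              type: keyword
```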
The build_flavor was previously removed since it is no longer relevant;
only the default distribution now exists. However, the removal of build
flavor included removing it from the version information on the info
response for the root path. This API is supposed to be stable, so
removing that key was a compatibility break. This commit adds the
build_flavor back to that API, hardcoded to `default`. Additionally, a
test is added to ensure the key exists going forward, until it can be
properly deprecated.
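The check is along these lines:
```yaml
- do:
    info: {}
- match: { version.build_flavor: default }
```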
Closes #88318
Plumbs through a new parameter for the cardinality aggregation that allows configuring the execution mode. This can have significant impacts on speed and memory usage. This PR exposes three collection modes and two heuristics that we can tune going forward. All of these are treated as hints and can be silently ignored, e.g. if not applicable to the given field type. I've changed the default behavior to optimize for time, which potentially uses more memory. Users can override this for the old behavior if needed.
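A sketch of the hint, assuming `execution_hint` is the parameter name and `save_memory_heuristic` one of its values (both are assumptions here):
```yaml
- do:
    search:
      index: test
      body:
        size: 0
        aggs:
          distinct_users:
            cardinality:
              field: user_id
              execution_hint: save_memory_heuristic   # assumed name; hints may be silently ignored
```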
This adds the generation and upload logic of Gradle dependency graphs to Snyk.
We directly implemented a REST-API-based Snyk plugin because the existing Snyk
Gradle plugin delegates to the Snyk command line tool, and the command line tool
uses custom Gradle logic by injecting an init file that
a) uses deprecated build logic, which we definitely want to avoid, and
b) uses Gradle APIs we avoid, like eager task creation.
Shipping this as an internal Gradle plugin gives us the most flexibility. As we only want to monitor
production code for now, we apply this plugin as part of the elasticsearch.build plugin;
that usage has so far been the de-facto indicator of whether a project is considered a "production" project
that ends up in our distribution or public Maven repositories. This isn't yet ideal and we will revisit
the distinction between production and non-production code / projects in a separate effort.
As part of this effort we added the elasticsearch.build plugin to more projects that actually end up
in the distribution. To unblock us, we have for now disabled a few check tasks that started failing once elasticsearch.build was applied.
Addresses #87620
Adds REST layer tests for some sneaky cases in the `avg_bucket`,
`max_bucket`, `min_bucket`, and `sum_bucket` pipeline aggregations.
This gives us forwards and backwards compatibility tests for these
aggs as well as mixed version cluster tests for these aggs.
Relates to #26220
Bootstrap plugins were an internal mechanism added to allow a
filesystem provider for cloud with the quota-aware-fs plugin. Since that
was removed, bootstrap plugins no longer serve a purpose. They were
never officially documented because they were for internal use only.
This commit removes the bootstrap plugins infrastructure.
This PR moves kNN search and dense vector support out of an xpack plugin and
into server.
In #87625 we plan to integrate ANN search into the main `_search` endpoint as a
new top-level component called `knn`. So kNN will be a dedicated part of the
search request, and we'll have kNN logic within the search phases. The classes
and logic will live in server, matching the other search components like
suggesters, field collapsing, etc.
This adds the option to force synthetic source to the MGET API. See
#87068 for more discussion on why you'd want to do that - the short
version is to get an upper bound on the performance cost of using
synthetic source in MGET.
This adds tests to make sure that we use all of the normal synthetic
source machinery, even when loading from the translog. So all GETs on
synthetic source indices will require an in memory index. That'll be an
extra cost on indices that are updated very very frequently.
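A sketch of the option on MGET, assuming `force_synthetic_source` is the parameter name:
```yaml
- do:
    mget:
      force_synthetic_source: true    # assumed query parameter name
      body:
        docs:
          - { _index: test, _id: "1" }
```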
Adds REST layer tests for the `avg_bucket`, `max_bucket`, `min_bucket`,
and `sum_bucket` pipeline aggregations. This gives us forwards and
backwards compatibility tests for these aggs as well as mixed version
cluster tests for these aggs.
Relates to #26220
The synthetic source highlighting tests would sometimes fail in a
strange way - they expected the entire search request to fail but it
*didn't* - only a single shard would fail. This locks the tests to
always use single-shard indices so the failures are consistent.
Closes #87730
Synthetic source has a habit of reordering text fields. This frustrates
highlighting because it *often* wants to use index structures to find
the offsets to values in the field. This disables the FVH highlighter
for multi-valued text fields when synthetic source is enabled and runs
the unified highlighter in "analyze" mode when synthetic source is
enabled. That's *enough* to stop them from spitting out wrong answers.
We might be leaving some performance on the table when the unified
highlighter works on a single valued text field that is indexed with
offsets or term vectors. We don't really expect that to be common at all
though because *generally* folks will enable synthetic source to save
space and adding offsets or term vectors is quite space inefficient. If
it comes up, we might be able to improve here.
Adds measures of the total size of all mappings and the total number of
fields in the cluster (both before and after deduplication).
Relates #86639
Relates #77466
This adds the option to force synthetic source to the GET API. See
#87068 for more discussion on why you'd want to do that - the short
version is to get an upper bound on the performance cost of using
synthetic source in GET.