elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-06-28 09:28:55 -04:00

Author	SHA1	Message	Date
likzn	f28f4545b2	In the field capabilities API, re-add support for `fields` in the request body (#88972 ) We previously removed support for `fields` in the request body, to ensure there was only one way to specify the parameter. We've now decided to undo the change, since it was disruptive and the request body is actually the best place to pass variable-length data like `fields`. This PR restores support for `fields` in the request body. It throws an error if the parameter is specified both in the URL and the body. Closes #86875	2022-08-04 13:44:50 -04:00
Christos Soulios	b81f4187ab	[TSDB] Metric fields in the field caps API (#88695 ) To assist the user in configuring the visualizations correctly while leveraging TSDB functionality, information about TSDB configuration should be exposed via the field caps API per field. Especially for metrics fields, it must be clear which fields are metrics and if they belong to only time-series indexes or mixed time-series and non-time-series indexes. To further distinguish metric fields when they belong to any of the following indices: - Standard (non-time-series) indexes - Time series indexes - Downsampled time series indexes This PR modifies the field caps API so that the mapping parameters time_series_dimension and time_series_metric are presented only when they are set on fields of time-series indexes. Those parameters are completely ignored when they are set on standard (non-time-series) indexes. This PR revisits some of the conventions adopted by #78790	2022-08-04 20:42:34 +03:00
Ed Savage	188f8872c6	[ML] ECS Grok patterns in the _text_structure/find_structure endpoint (#88982 ) Also add support for new CATALINA/TOMCAT timestamp formats used by ECS Grok patterns Relates #77065 Co-authored-by: David Roberts <dave.roberts@elastic.co>	2022-08-04 18:39:04 +01:00
Julie Tibshirani	0bed7f768a	Fix failures in vector field usage mixed cluster test	2022-08-03 16:14:46 -04:00
Julie Tibshirani	21eb984e64	Deprecate the _knn_search endpoint (#88828 ) This change deprecates the kNN search API in favor of the new 'knn' option inside the search API. The 'knn' option is now the preferred way of performing kNN search. Relates to #87625	2022-08-03 15:19:01 -04:00
Nikolaj Volgushev	a124bafe7e	REST tests and spec for bulk update API keys (#89027 ) This PR adds REST API spec and YAML test files for the BulkUpdateApiKey operation.	2022-08-03 12:42:54 +02:00
Artem Prigoda	f4e617e894	Add a test for checking for misspelled "dry_run" parameters for Desired Nodes API (#88898 ) Check that the API doesn't accept a misspelled parameter and returns a client error.	2022-07-28 16:15:43 +02:00
Nik Everett	3bcee8eaa0	Format runtime geo_points (#85449 ) This formats the result of the `fields` section of the `_search` API for runtime `geo_point` fields using the `format` parameter like we do for non-runtime `geo_point` fields. This changes the default format for those fields from `lat, lon` to `geojson` with the option to get `wkt` or any other format we support. The fix does so by preserving the `double, double` nature of the `geo_point` rather than encoding it immediately in the script. Callers can use the results. The field fetchers use the `double, double` natively, preserving as much precision as possible. The queries quantize the points exactly like Lucene indexing does, and like the script did before this PR. Closes #85245	2022-07-27 13:11:07 -04:00
Przemko Robakowski	539434dbb4	Add min_* conditions to rollover (#83345 )	2022-07-26 11:46:39 -04:00
Julie Tibshirani	abd561a277	Support kNN vectors in disk usage action (#88785 ) This change adds support for kNN vector fields to the `_disk_usage` API. The strategy: * Iterate the vector values (using the same strategy as for doc values) to estimate the vector data size * Run some random vector searches to estimate the vector index size Co-authored-by: Yannick Welsch <yannick@welsch.lu> Closes #84801	2022-07-26 07:57:47 -07:00
Artem Prigoda	c0bc85522d	Clean up desired nodes in between dry run tests (#88797 )	2022-07-26 12:04:06 +02:00
Artem Prigoda	72a6fdc2b8	Support "dry run" mode for updating Desired Nodes (#88305 ) Add the dry_run query parameter to support simulating an update of desired nodes. The update request will be validated, but no cluster state updates will be performed. In order to indicate that the response was a result of a dry run, we add the dry_run field to the JSON representation of a response. See #82975	2022-07-26 09:03:12 +02:00
Keith Massey	4b060a6046	Removing the notion of components from the health API (#88663 ) This commit removes the notion of components from the health API. They are gone from being a top-level field in the response, and indicators is promoted into its place.	2022-07-25 12:29:06 -05:00
Andrei Dan	da765ced7f	Remove help_url, rename summary to symptom, and user_actions to diagnosis (#88553 ) Remove help_url, rename summary->symptom, user_actions->diagnosis Separate the diagnosis `message` field into `cause` and `action` Co-authored-by: Mary Gouseti <mgouseti@gmail.com>	2022-07-25 10:35:16 +01:00
Julie Tibshirani	e3ede67262	Integrate ANN into _search endpoint (#88694 ) This PR adds a new `knn` option to the `_search` API to support ANN search. It's powered by the same Lucene ANN capabilities as the old `_knn_search` endpoint. The `knn` option can be combined with other search features like queries and aggregations. Addresses #87625	2022-07-22 08:02:07 -07:00
Benjamin Trent	94f2544998	Adding cardinality support for random_sampler agg (#86838 ) This adds support for the `cardinality` aggregation within a random_sampler. This use case is helpful in determining the ratio of unique values compared to the count of total documents within the sampled set.	2022-07-21 07:19:35 -04:00
Seth Michael Larson	fffabae10a	Add pagination parameters to API spec and docs for 'snapshot.get' API	2022-07-20 06:35:52 -05:00
tmgordeeva	ab2602ecb0	Propagate alias filters to significance aggs filters (#88221 ) If we have an alias filter, use it as part of the background filter on a significant terms agg. Previously, alias filters did not apply to background filters, so this will change bg_count results for some significant terms aggs using a background filter. Closes #81585	2022-07-19 10:03:08 -07:00
Seth Michael Larson	478c06ef29	Verify that 'details' aren't sent when explain=false	2022-07-18 09:48:11 -05:00
Benjamin Trent	afa28d49b4	[ML] add new cache_size parameter to trained_model deployments API (#88450 ) With: https://github.com/elastic/ml-cpp/pull/2305 we now support caching PyTorch inference responses per node per model. By default, the cache will be the same size as the model's size on disk. This is because our current best estimate for memory used (for deploying) is 2*model_size + constant_overhead. This is due to the model having to be loaded in memory twice when serializing to the native process. But, once the model is in memory and accepting requests, its actual memory usage is reduced vs. what we have "reserved" for it within the node. Consequently, having a cache layer that takes advantage of that unused (but reserved) memory is effectively free. When used in production, especially in search scenarios, caching inference results is critical for decreasing latency.	2022-07-18 09:19:01 -04:00
Alan Woodward	5c11a81913	Add 'mode' option to `_source` field mapper (#88211 ) Currently we have two parameters that control how the source of a document is stored, `enabled` and `synthetic`, both booleans. However, there are only three possible combinations of these, with `enabled:false` and `synthetic:true` being disallowed. To make this easier to reason about, this commit replaces the `enabled` parameter with a new `mode` parameter, which can take the values `stored`, `synthetic` and `disabled`. The `mode` parameter cannot be set in combination with `enabled`, and we will subsequently move towards deprecating `enabled` entirely.	2022-07-18 12:50:10 +01:00
Chen Ni	c45c205c33	Add test execution guide in yamlRestTest asciidoc (#88490 )	2022-07-14 08:22:35 -07:00
Nhat Nguyen	227d80975b	Add tests for query/agg on lookup runtime fields (#88389 ) Adds tests to ensure that querying and aggregating on lookup runtimes aren't supported. Relates #88296	2022-07-09 02:02:13 +09:30
Nikolaj Volgushev	f42b15bc8c	Updatable API keys - REST API spec and tests (#88270 ) This PR adds REST API spec and YAML test files for the UpdateApiKey operation.	2022-07-08 11:48:02 +02:00
Ryan Ernst	9016883e1c	Add build_flavor back to info api rest response (#88336 ) The build_flavor was previously removed since it is no longer relevant; only the default distribution now exists. However, the removal of build flavor included removing it from the version information on the info response for the root path. This API is supposed to be stable, so removing that key was a compatibility break. This commit adds the build_flavor back to that API, hardcoded to `default`. Additionally, a test is added to ensure the key exists going forward, until it can be properly deprecated. closes #88318	2022-07-08 09:54:29 +09:30
Mark Tozzi	9ee6a19187	Add ability to select execution mode for cardinality aggregation (#87704 ) Plumbs through a new parameter for the cardinality aggregation, to allow configuring the execution mode. This can have significant impacts on speed and memory usage. This PR exposes three collection modes and two heuristics that we can tune going forward. All of these are treated as hints and can be silently ignored, e.g. if not applicable to the given field type. I've changed the default behavior to optimize for time, which potentially uses more memory. Users can override this for the old behavior if needed.	2022-07-05 09:11:22 -04:00
Rene Groeschke	8ccae4da71	Setup elasticsearch dependency monitoring with Snyk for production code (#88036 ) This adds the generation and upload logic of Gradle dependency graphs to Snyk. We directly implemented a REST API based Snyk plugin because the existing Snyk Gradle plugin delegates to the Snyk command line tool, and the command line tool uses custom Gradle logic by injecting an init file that a) uses deprecated build logic, which we definitely want to avoid, and b) uses Gradle API we avoid, like eager task creation. Shipping this as an internal Gradle plugin gives us the most flexibility. As we only want to monitor production code for now, we apply this plugin as part of the elasticsearch.build plugin; that usage has been, for now, the de-facto indicator of whether a project is considered a "production" project that ends up in our distribution or public Maven repositories. This isn't yet ideal, and we will revisit the distinction between production and non-production code / projects in a separate effort. As part of this effort we added the elasticsearch.build plugin to more projects that actually end up in the distribution. To unblock us on this, we have for now disabled a few check tasks that started failing by applying elasticsearch.build. Addresses #87620	2022-06-29 13:29:14 +02:00
Nik Everett	d88dfb11c7	More REST tests for avg/max/min/sum_bucket aggs (#88027 ) Adds REST layer tests for some sneaky cases in the `avg_bucket`, `max_bucket`, `min_bucket`, and `sum_bucket` pipeline aggregations. This gives us forwards and backwards compatibility tests for these aggs as well as mixed version cluster tests for these aggs. Relates to #26220	2022-06-27 13:49:29 -04:00
Ryan Ernst	e3c4cddbe2	Remove legacy bootstrap plugins (#87775 ) Bootstrap plugins were an internal mechanism added to allow a filesystemprovider for cloud with the quota-aware-fs plugin. Since that was removed, bootstrap plugins no longer serve a purpose. They were never officially documented because they were for internal use only. This commit removes the bootstrap plugins infrastructure.	2022-06-23 20:38:06 -04:00
Julie Tibshirani	572a5b9bb4	Skip dense_vector field usage test before 8.1 Fixes #87971.	2022-06-23 10:25:17 -07:00
Julie Tibshirani	3a9e511117	Move kNN search and dense vectors to core (#87815 ) This PR moves kNN search and dense vector support out of an xpack plugin and into server. In #87625 we plan to integrate ANN search into the main `_search` endpoint as a new top-level component called `knn`. So kNN will be a dedicated part of the search request, and we'll have kNN logic within the search phases. The classes and logic will live in server, matching the other search components like suggesters, field collapsing, etc.	2022-06-22 21:10:20 -07:00
Nik Everett	463d46cd79	Add force_synthetic_source to mget (#87574 ) This adds the option to force synthetic source to the MGET API. See #87068 for more discussion on why you'd want to do that - the short version is to get an upper bound on the performance cost of using synthetic source in MGET.	2022-06-22 08:55:55 -04:00
Mark Tozzi	5f2411a3b8	Revert "Correct skip versions for new flattened terms test (#87540 )" (#87764 ) This reverts commit `f72c7da7ee`.	2022-06-16 16:35:39 -04:00
Nik Everett	cf154fd367	Tests for synthetic _source from translog (#87578 ) This adds tests to make sure that we use all of the normal synthetic source machinery, even when loading from the translog. So all GETs on synthetic source indices will require an in memory index. That'll be an extra cost on indices that are updated very very frequently.	2022-06-16 14:51:17 -04:00
Mark Tozzi	f72c7da7ee	Correct skip versions for new flattened terms test (#87540 ) Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2022-06-16 14:38:05 -04:00
Nik Everett	5b525a290e	REST tests for avg/max/min/sum_bucket aggs (#87009 ) Adds REST layer tests for the `avg_bucket`, `max_bucket`, `min_bucket`, and `sum_bucket` pipeline aggregations. This gives us forwards and backwards compatibility tests for these aggs as well as mixed version cluster tests for these aggs. Relates to #26220	2022-06-16 12:16:06 -04:00
Nik Everett	d74b45b9b1	REST tests for stats_bucket aggs (#87006 ) Adds REST tests for `stats_bucket` and `extended_stats_bucket` aggs. Relates to #26220	2022-06-16 12:14:52 -04:00
Nik Everett	48ab87f60b	Fix synthetic source highlighting tests (#87749 ) The synthetic source highlighting tests would sometimes fail in a strange way - they expect the entire search request to fail but it didn't - only a single shard would fail. This locks the tests to always make single shard indices so the failures are consistent. Closes #87730	2022-06-16 12:07:43 -04:00
Nik Everett	8ebf39b7e1	Fixup highlighting with synthetic source (#87667 ) Synthetic source has a habit of reordering text fields. This frustrates highlighting because it often wants to use index structures to find the offsets to values in the field. This disables the FVH highlighter for multi-valued text fields when synthetic source is enabled and runs the unified highlighter in "analyze" mode when synthetic source is enabled. That's enough to stop them from spitting out wrong answers. We might be leaving some performance on the table when the unified highlighter works on a single valued text field that is indexed with offsets or term vectors. We don't really expect that to be common at all though because generally folks will enable synthetic source to save space and adding offsets or term vectors is quite space inefficient. If it comes up, we might be able to improve here.	2022-06-15 14:49:06 -04:00
David Turner	fcf293f87c	Report overall mapping size in cluster stats (#87556 ) Adds measures of the total size of all mappings and the total number of fields in the cluster (both before and after deduplication). Relates #86639 Relates #77466	2022-06-14 13:55:14 +01:00
Nik Everett	a37edb7796	Add force_synthetic_source to GET (#87536 ) This adds the option to force synthetic source to the GET API. See #87068 for more discussion on why you'd want to do that - the short version is to get an upper bound on the performance cost of using synthetic source in GET.	2022-06-09 09:40:36 -04:00
Nik Everett	2ec59e799b	Fix test in synthetic source (#87534 ) Fixes a test for forcing synthetic source that sometimes fails if the index has more than one shard. We're just looking for a sensible failure message here so we can lock it to one shard.	2022-06-08 15:17:25 -04:00
Yang Wang	f5ceed19fc	User Profile - remove feature flag (#87383 ) The feature flag is no longer necessary in the 8.4 release cycle. The feature itself is still in beta.	2022-06-08 10:18:18 -04:00
Mark Tozzi	c9af118237	Fix a bug with flattened fields in terms aggregations (#87392 ) The root cause here was that missing did not correctly delegate `supportsGlobalOrdinalsMapping` to the wrapped values source, instead falling back to the default. I've added the delegation, and made the base method abstract so this doesn't happen again.	2022-06-08 08:08:18 -04:00
Salvatore Campagna	5d062f9fdd	Make the metric in the buckets_path parameter optional (#87220 ) With this change the metric field name becomes optional if the 'buckets_path' is pointing to a multi-value aggregation with a single metric field. Normally the full path would be required, including the aggregation name followed by the metric field. If the metric is not specified in the path and the multi-value aggregation computes more than one value, an error is thrown. The old notation is still supported for backward compatibility in case the full path is specified and the target multi-value aggregation computes a single value.	2022-06-08 10:44:02 +02:00
Nik Everett	d0b50b56a0	Add an option to _search to force synthetic source (#87068 ) This adds `?force_synthetic_source` to, well, force running the fetch phase with synthetic source. If the mapping is incompatible with synthetic source it'll throw a 400 error.	2022-06-07 11:11:14 -04:00
Albert Zaharovits	7be60d6068	[DOCS] Profile Has Privileges API (#87360 ) Docs for the new Has Privileges API for profiles from #85898. [Has privileges user profile API preview](https://elasticsearch_87360.docs-preview.app.elstc.co/guide/en/elasticsearch/reference/master/security-api-has-privileges-user-profiles.html).	2022-06-07 03:58:32 -04:00
Keith Massey	c95230d155	Master stability health indicator part 1 (when a master has been seen recently) (#86524 ) This is the first PR for the master stability check, which is part of the health API. It handles the case when we have seen a master node recently. The more complicated case when we have not seen a master node recently will be in subsequent PRs.	2022-06-06 14:40:15 -05:00
Nik Everett	5f70d30330	Synthetic source: paranoid tests for configuration (#87182 ) This adds some paranoid REST layer tests for modifying the `synthetic` configuration.	2022-06-06 09:37:57 -04:00
Luca Cavanna	50793a68a8	Fields API to allow fetching values when _source is disabled (#87267 ) Back when we introduced the fields parameter to the search API, it could only fetch values from _source, hence the corresponding sub-fetch phase fails early whenever _source is disabled. Today though runtime fields can be retrieved from a separate value fetcher that reads from fielddata, and metadata fields can be retrieved from stored fields. These two scenarios currently throw an unnecessary error whenever _source is disabled. This commit removes the check for disabled _source, so that runtime fields and metadata fields can be retrieved even when _source is disabled. Fields that need to be loaded from _source are simply skipped whenever _source is disabled, similar to when a field is not found in _source. Closes #87072	2022-06-02 11:28:36 +02:00

2817 commits