elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-06-28 09:28:55 -04:00

Author	SHA1	Message	Date
HYUNSANG HAN (한현상, Travis)	d16271b78d	Add RemoveBlock API to allow `DELETE /{index}/_block/{block}` (#129128 ) Introduces a new `RemoveBlock` API that complements the existing `AddBlock` API by allowing users to remove index blocks using `DELETE /{index}/_block/{block}`. Resolves #128966 --------- Co-authored-by: Niels Bauman <nielsbauman@gmail.com>	2025-06-25 06:16:14 +10:00
Panagiotis Bailis	b855266bd1	Make bbq_hnsw the default index option for dense-vector fields with more than 384 dimensions (#129825 )	2025-06-24 12:20:16 +03:00
Niels Bauman	f430a6c28c	Fix index stats field data YAML test (#129816 ) Occasional shard allocation issues were causing the YAML tests to fail because the shard that had the document in it would be unavailable. Fixes #96711	2025-06-24 01:32:27 +10:00
Luke Whiting	1ccf1c6806	Streams - Log's Enable, Disable and Status endpoints (#129474 ) * Enable And Disable Endpoint * Status Endpoint * Integration Tests * REST Spec * REST Spec tests * Some documentation * Update docs/changelog/129474.yaml * Fix failing security test * PR Fixes * PR Fixes - Add missing feature flag name to YAML spec * PR Fixes - Fix support for timeout and master_timeout parameters * PR Fixes - Make the REST handler validation happy with the new params * Delete docs/changelog/129474.yaml * PR Fixes - Switch to local metadata action type and improve request handling * PR Fixes - Make enable / disable endpoint cancellable * PR Fixes - Switch timeout param name for status endpoint * PR Fixes - Switch timeout param name for status endpoint in spec * PR Fixes - Enforce local only use for status action * PR Fixes - Refactor StreamsMetadata into server * PR Fixes - Add streams module to multi project YAML test suite * PR Fixes - Add streams cluster module to multi project YAML test suite	2025-06-19 11:48:44 +01:00
Jeremy Dahlgren	d43198ea3e	Add 'state' query param to GET snapshots API (#128635 ) This change introduces a new optional 'state' query parameter for the Get Snapshots API, allowing users to filter snapshots by state. The parameter accepts comma-separated values for states: SUCCESS, IN_PROGRESS, FAILED, PARTIAL, INCOMPATIBLE (case-insensitive). A new 'snapshots.get.state_parameter' NodeFeature has been added with this change. The new state query parameter will only be supported in clusters where all nodes support this feature. --------- Co-authored-by: Elena Stoeva <elenastoeva99@gmail.com>	2025-06-16 17:07:39 -04:00
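The filtering rule described in the commit above (comma-separated, case-insensitive state names) can be sketched as follows. This is an illustrative sketch only, not the Elasticsearch implementation: `parse_state_param` and `filter_snapshots` are hypothetical helper names, and snapshots are assumed to be plain dicts with a `state` key.

```python
# The five state names listed in the commit message.
VALID_STATES = {"SUCCESS", "IN_PROGRESS", "FAILED", "PARTIAL", "INCOMPATIBLE"}


def parse_state_param(raw):
    """Parse a comma-separated, case-insensitive list of snapshot states."""
    states = set()
    for token in raw.split(","):
        name = token.strip().upper()
        if name not in VALID_STATES:
            raise ValueError(f"unknown snapshot state: {token!r}")
        states.add(name)
    return states


def filter_snapshots(snapshots, raw_state):
    """Keep only the snapshots whose state is in the requested set (hypothetical helper)."""
    wanted = parse_state_param(raw_state)
    return [snap for snap in snapshots if snap["state"] in wanted]
```

Rejecting unknown names up front, rather than ignoring them, is a design choice of this sketch; the commit message does not say how the real API treats invalid values.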
Tommaso Teofili	629a366baa	Make dense_vector fields updatable to bbq_flat/bbq_hnsw (#128291 )	2025-06-16 17:15:59 +02:00
Gal Lalouche	936f3385b0	ESQL: Change queries ID to be the same as the async (#127472 ) This PR changes the list and query API for ESQL, such that the ID now follows the same format as async query IDs. This is saved as part of the task status. For async queries, this is easy, but for sync queries, this is slightly more complicated, since when creating them, we don't have access to a node ID. So instead, the status itself is just the doc ID portion of the async execution ID, which is used for salting, since this part needs to be consistent, so that when we list the queries, we can compute the async execution ID correctly. Also, I've removed the individual ID, node, and data node tags, as mentioned in the ticket. In addition, I've changed the accept and content-type to be JSON for lists. Resolves #127187	2025-06-12 14:37:08 +02:00
Dimitris Rempapis	0193dadae8	Enable Shard-Level Search-load rate metric (#128660 ) Introduces a new search load metric to the stats infrastructure, measured and tracked on a per-shard basis. The metric represents the Exponentially Weighted Moving Rate (EWMR) of search operations, calculated using the "took" time from each completed search phase.	2025-06-11 16:19:48 +03:00
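For readers unfamiliar with the term, one common formulation of an exponentially weighted moving rate is sketched below. It is an assumption-laden illustration, not the code added by the commit above: the class name `MovingRate`, the half-life parameterisation, and the use of plain floats for time are all hypothetical. Each recorded value (for example a search phase's "took" time) is decayed exponentially with age, and the decayed sum is divided by the equally decayed length of the observation window.

```python
import math


class MovingRate:
    """Exponentially weighted moving rate (illustrative sketch, hypothetical API)."""

    def __init__(self, half_life, start_time):
        # Decay constant chosen so that a value's weight halves every `half_life`.
        self.lam = math.log(2) / half_life
        self.start = start_time
        self.last = start_time
        self.weighted_sum = 0.0

    def add(self, value, now):
        """Record `value` at time `now` (assumed to be >= the previous update time)."""
        # Decay the existing sum up to `now`, then add the new increment at full weight.
        self.weighted_sum = self.weighted_sum * math.exp(-self.lam * (now - self.last)) + value
        self.last = now

    def rate(self, now):
        """Return the weighted rate per unit of time as seen at `now`."""
        elapsed = now - self.start
        if elapsed <= 0:
            return 0.0
        decayed = self.weighted_sum * math.exp(-self.lam * (now - self.last))
        # Integral of exp(-lam * s) for s in [0, elapsed]: the weighted window length.
        window = (1.0 - math.exp(-self.lam * elapsed)) / self.lam
        return decayed / window
```

With no further updates the reported rate falls over time, because old contributions lose weight while the window keeps growing.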
Benjamin Trent	b5d522928a	Add support for nested queries for ivf indices (#128782 ) This does a first pass at adding nested query support for bbq_ivf indices. The support is pretty simple right now, basically, we keep exploring until we at least get `k` results to cover the case when the nested docs are all tightly clustered and the typical `nprobe` explores too few clusters to actually get `k` docs. I have some weird test failures I need to debug, so opening as draft for now.	2025-06-10 03:00:40 +10:00
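The "keep exploring until we have at least `k` results" idea from the commit above can be sketched roughly as below. This is a simplified model under stated assumptions, not the bbq_ivf code: clusters are assumed to arrive already ordered nearest-first, each hit is a hypothetical `(parent_id, score)` pair for a nested document, and `search_clusters` is an invented name.

```python
def search_clusters(clusters_by_distance, nprobe, k):
    """Visit at least `nprobe` clusters, then continue until `k` distinct parents are found."""
    best = {}  # parent_id -> best score seen among its nested docs
    for visited, cluster in enumerate(clusters_by_distance):
        # Stop only once the usual probe budget is spent AND enough parents were collected.
        if visited >= nprobe and len(best) >= k:
            break
        for parent_id, score in cluster:
            if score > best.get(parent_id, float("-inf")):
                best[parent_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]
```

When all nested documents of one parent fall into the nearest cluster, the first `nprobe` clusters yield a single parent, so the loop keeps going past the budget instead of returning fewer than `k` results.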
Jim Ferenczi	6e67fac31a	Add option to include or exclude vectors from _source retrieval (#128735 ) This PR introduces a new include_vectors option to the _source retrieval context. When set to false, vectors are excluded from the returned _source. This is especially efficient when used with synthetic source, as it avoids loading vector fields entirely. By default, vectors remain included unless explicitly excluded.	2025-06-09 12:01:41 +01:00
Mary Gouseti	9764730d49	Remove include_default query param from get data stream options. (#128730 ) Initially we added `include_defaults` to the get data stream options REST API because it was used in the lifecycle API; however, we decided to simplify it and not use it. We remove it now before it gets adopted.	2025-06-03 18:15:42 +10:00
Mayya Sharipova	080a0cdd89	Enable sort optimization on int, short and byte fields (#127968 ) Before this PR, sorting on integer, short and byte field types used SortField.Type.LONG. This made sort optimization impossible for these field types. This PR uses SortField.Type.INT for integer, short and byte fields. This enables sort optimization. There are several caveats with changing the sort type that are addressed: - Before, mixed sort on integer and long fields was automatically supported, as both field types used SortField.Type.LONG. Now when merging results from different shards, we need to convert sort to LONG and results to long values. - Similarly for collapsing when there are mixed INT and LONG sort types. - Index sorting. Similarly, before, for index sorting on an integer field, SortField.Type.LONG was used. This sort type is stored in the index writer config on disk and can't be modified. Now when providing sortField() for index sorting, we need to account for index version: for older indices return sort with SortField.Type.LONG and for new indices return SortField.Type.INT. --- There is only 1 change that may be considered not backwards compatible: Before, if an integer field was [missing a value](https://www.elastic.co/docs/reference/elasticsearch/rest-apis/sort-search-results#_missing_values), its sort value would be returned as Long.MAX_VALUE in a search response. With this change, its sort value will be returned as Integer.MAX_VALUE. But I think this change is ok, as in our documentation, we don't provide information what value will be returned, we just say it will be sorted last. --- Also closes #127965 (as same type validation is added for collapse queries)	2025-06-03 07:50:11 +10:00
Benjamin Trent	b73a180bee	This adds a new experimental IVF format behind a feature flag (#128631 ) * Adding new bbq_ivf format behind a feature flag * adding tests * [CI] Auto commit changes from spotless * addressing pr comments * fixing flagging for yaml tests * adjust ivf search to utilize num candidates as approximation measure --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>	2025-05-30 16:20:23 -04:00
Keith Massey	dc2fbe19a6	Removing the data stream settings feature flag (#128594 )	2025-05-29 09:50:14 -05:00
Keith Massey	83a13b9cc4	Making the data stream settings rest-api-spec consistent with the elasticsearch-specification repository (#128535 )	2025-05-28 09:28:11 -05:00
Keith Massey	7207692056	Adding dry_run mode for setting data stream settings (#128269 )	2025-05-23 11:29:00 -05:00
Pete Gillin	1fe3b77a2a	ES-10063 Add multi-project support for more stats APIs (#127650 ) * Add multi-project support for more stats APIs This affects the following APIs: - `GET _nodes/stats`: - For `indices`, it now prefixes the index name with the project ID (for non-default projects). Previously, it didn't tell you which project an index was in, and it failed if two projects had the same index name. - For `ingest`, it now gets the pipeline and processor stats for all projects, and prefixes the pipeline ID with the project ID. Previously, it only got them for the default project. - `GET /_cluster/stats`: - For `ingest`, it now aggregates the pipeline and processor stats for all projects. Previously, it only got them for the default project. - `GET /_info`: - For `ingest`, same as for `GET /_nodes/stats`. This is done by making `IndicesService.stats()` and `IngestService.stats()` include project IDs in the `NodeIndicesStats` and `IngestStats` objects they return, and making those stats objects incorporate the project IDs when converting to XContent. The transitive callers of these two methods are rather extensive (including all callers to `NodeService.stats()`, all callers of `TransportNodesStatsAction`, and so on). To ensure the change is safe, the callers were all checked out, and they fall into the following cases: - The behaviour change is one of the desired enhancements described above. - There is no behaviour change because it was getting node stats but neither `indices` nor `ingest` stats were requested. - There is no behaviour change because it was getting `indices` and/or `ingest` stats but only using aggregate values. - In `MachineLearningUsageTransportAction` and `TransportGetTrainedModelsStatsAction`, the `IngestStats` returned will return stats from all projects instead of just the default with this change, but they have been changed to filter the non-default project stats out, so this change is a noop there. (These actions are not MP-ready yet.) 
- `MonitoringService` will be affected, but this is the legacy monitoring module which is not in use anywhere that MP is going to be enabled. (If anything, the behaviour is probably improved by this change, as it will now include project IDs, rather than producing ambiguous unqualified results and failing in the case of duplicates.) * Update test/external-modules/multi-project/build.gradle Change suggested by Niels. Co-authored-by: Niels Bauman <33722607+nielsbauman@users.noreply.github.com> * Respond to review comments * fix merge weirdness * [CI] Auto commit changes from spotless * Fix test compilation following upstream change to base class * Update x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/datatiers/DataTierUsageFixtures.java Co-authored-by: Niels Bauman <33722607+nielsbauman@users.noreply.github.com> * Make projects-by-index map nullable and omit in single-project; always include project prefix in XContent in multi-project, even if default; also incorporate one other review comment * Add a TODO * update IT to reflect changed behaviour * Switch to using XContent.Params to indicate whether it is multi-project or not * Refactor NodesStatsMultiProjectIT to common up repeated assertions * Defer use of ProjectIdResolver in REST handlers to keep tests happy * Include index UUID in "unknown project" case * Make the index-to-project map empty rather than null in the BWC deserialization case. This works out fine, for the reasons given in the comment. As it happens, I'd already forgotten to do the null check in the one place it's actively used. 
* remove a TODO that is done, and add a comment * fix typo * Get REST YAML tests working with project ID prefix TODO finish this * As a drive-by, fix and un-suppress one of the health REST tests * [CI] Auto commit changes from spotless * TODO ugh * Experiment with different stashing behaviour * [CI] Auto commit changes from spotless * Try a more sensible stash behaviour for assertions * clarify comment * Make checkstyle happy * Make the way `Assertion` works more consistent, and simplify implementation * [CI] Auto commit changes from spotless * In RestNodesStatsAction, make the XContent params to channel.request(), which is the value it would have had before this change --------- Co-authored-by: Niels Bauman <33722607+nielsbauman@users.noreply.github.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>	2025-05-21 19:04:22 +01:00
Keith Massey	bc45087962	Adding rest actions to get and set data stream settings (#127858 )	2025-05-21 12:17:56 -05:00
David Turner	943b22400b	Mark `repository_verify_integrity` API as `public` (#128244 ) Not sure why this was defined as `private` in #112348, it should have been `public`. This commit fixes the visibility so we generate docs for this API.	2025-05-21 23:21:18 +10:00
Shahbaz Aamir	80946ce385	[Dev Docs] Replacing unsupported Note with [!Note] (#127102 ) * Replacing unsupported Note with [!Note]	2025-05-05 11:33:04 +02:00
mushaoqiong	feb44c5c89	Throw exception for unknown token in RestIndexPutAliasAction (#124708 ) This PR throws IllegalArgumentException in RestIndexPutAliasAction to avoid silently swallowing unsupported tokens.	2025-04-29 11:12:09 +10:00
mushaoqiong	637807c82b	Throw exception for unsupported values type in Alias (#124737 ) After creating an index with an alias using the following request ``` PUT test-index { "aliases": { "alias1": { "is_write_index": "true" } } } ``` we got the following result for the get index request: ``` { "test-index" : { "aliases" : { "alias1" : { } }, "mappings" : { }, "settings" : { ... } } } ``` The `is_write_index` field is missing because a string boolean value is not supported for this field, and no warning message is shown, which misleads users. In #120453 I opened a PR to let the createIndex API support string boolean values for the `is_write_index` field, but @dakrone thinks it's better to be strict about boolean values. So I opened this PR to make the Alias class throw an exception for the unsupported value type, to avoid silently swallowing this case.	2025-04-29 06:48:20 +10:00
Samiul Monir	cd4fcbff21	Update Default value of Oversample for bbq (#127134 ) * Unit test to validate default behavior * adding default value to oversample for bbq * Fix code style issue * Update docs/changelog/127134.yaml * Update changelog * Adding index version to support only new indices * Update index version name to better match * Adding a simple yaml test to verify the yaml functionality for oversample value * Refactor knn float to add rescore vector by default when index type is one of bbq * adding yaml tests to verify oversample default value * Fixing format issue for not_exists	2025-04-28 12:36:03 -04:00
Benjamin Trent	fa1a1e8bbd	Add refresh to 41_knn_search_byte_quantized as other yaml tests have it as well (#127352 )	2025-04-25 08:35:32 -04:00
Chris Hegarty	19550a838f	Add dense vector off-heap stats to Node stats and Index stats APIs (#126704 ) This change enhances the dense_vector section of the Nodes stats and Index stats APIs so that they report the desired size of off-heap memory for all indexed vectors. The dense_vector section of the Cluster stats API remains unchanged. The retrieval mechanism and structure of the new stats are the same across the three stats APIs, but more fine-grained information is disclosed when moving from Cluster -> Node -> Index API. For Node stats, we aggregate the total byte sizes for all vectors, categorised by the data type. For example: "dense_vector" : { "value_count" : 5, "off_heap" : { "total_size_in_bytes" : 27, "total_veb_size_in_bytes" : 3, "total_vec_size_in_bytes" : 23, "total_veq_size_in_bytes" : 0, "total_vex_size_in_bytes" : 1 } } Index stats: same as Node stats, with a per-field breakdown included. For example: "dense_vector" : { "value_count" : 5, "off_heap" : { "total_size_in_bytes" : 27, "total_veb_size_in_bytes" : 3, "total_vec_size_in_bytes" : 23, "total_veq_size_in_bytes" : 0, "total_vex_size_in_bytes" : 1, "fielddata" : { "bar" : { "veb_size_in_bytes" : 3, "vec_size_in_bytes" : 14, "vex_size_in_bytes" : 1 }, "foo" : { "vec_size_in_bytes" : 9 } } } } The implementation accesses the actual statistics through reflection. This will be completely removed when Lucene exposes this, which is expected in Lucene 10.3	2025-04-23 15:04:44 +01:00
Carlos Delgado	4d4b962fd1	Synonyms API - Add refresh parameter to check synonyms index and reload analyzers (#126935 ) * Add timeout to SynonymsManagementAPIService put synonyms * Remove replicas 0, as that may impact serverless * Add timeout to put synonyms action, fix tests * Fix number of replicas * Remove cluster.health checks for synonyms index * Revert debugging * Add integration test for timeouts * Use TimeValue instead of an int * Add YAML tests and REST API specs * Fix a validation bug in put synonym rule * Spotless * Update docs/changelog/126314.yaml * Remove unnecessary checks for null * Fix equals / HashCode * Checks that timeout is passed correctly to the check health method * Use correctly the default timeout * spotless * Add monitor cluster privilege to internal synonyms user * [CI] Auto commit changes from spotless * Add capabilities to avoid failing on bwc tests * Replace timeout for refresh param * Add param to specs * Add YAML tests * Fix changelog * [CI] Auto commit changes from spotless * Use BWC serialization tests * Fix bug in test parser * Spotless * Delete doesn't need reloading 🤦 removing it * Revert "Delete doesn't need reloading 🤦 removing it" This reverts commit `9c8e0b62be`. * [CI] Auto commit changes from spotless * Fix refresh for delete synonym rule * Fix tests * Update docs/changelog/126935.yaml * Add reload analyzers test * reload_analyzers is not available on serverless --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>	2025-04-22 17:23:06 +02:00
James Baiera	7b89f4d4a6	Add ability to redirect ingestion failures on data streams to a failure store (#126973 ) Removes the feature flags and guards that prevent the new failure store functionality from operating in production runtimes.	2025-04-18 16:33:03 -04:00
Quentin Pradet	1f68bfbc3e	Add back inference.inference API (#126601 )	2025-04-11 14:09:51 +04:00
Josh Mock	5f871c5cf5	Remove reference to dropped EIS API (#126422 ) Co-authored-by: Quentin Pradet <quentin.pradet@elastic.co>	2025-04-09 12:06:00 +04:00
Gal Lalouche	953b9fbb83	ESQL: List/get query API (#124832 ) This PR adds two new REST endpoints, for listing queries and getting information on a current query. * Resolves #124827 * Related to #124828 (initial work) Changes from the API specified in the above issues: * The get API is pretty initial, as we don't have a way of fetching the memory used or number of rows processed. List queries response: ``` GET /_query/queries // returns for each of the running queries // query_id, start_time, running_time, query { "queries" : { "abc": { "id": "abc", "start_time_millis": 14585858875292, "running_time_nanos": 762794, "query": "FROM logs* | STATS BY hostname" }, "4321": { "id":"4321", "start_time_millis": 14585858823573, "running_time_nanos": 90231, "query": "FROM orders | LOOKUP country_code ON country" } } } ``` Get query response: ``` GET /_query/queries/abc { "id" : "abc", "start_time_millis": 14585858875292, "running_time_nanos": 762794, "query": "FROM logs* | STATS BY hostname", "coordinating_node": "oTUltX4IQMOUUVeiohTt8A", "data_nodes" : [ "DwrYwfytxthse49X4", "i5msnbUyWlpe86e7"] } ```	2025-04-08 22:21:32 +03:00
David Turner	527d2a203b	Improve handling of empty response (#125562 ) Today `ActionResponse$Empty` implements `ToXContentObject`, but yields no bytes of content when serialized which creates an invalid JSON response. This commit removes the bogus interface and adjusts the affected REST APIs to send a `text/plain` response instead.	2025-04-07 12:10:07 +01:00
Jordan Powers	4c174a891f	Use Lucene101 postings format by default (#126080 ) Update the PerFieldFormatSupplier so that new standard indices use the Lucene101PostingsFormat instead of the current default ES812PostingsFormat. Currently, use of the new codec is gated behind a feature flag.	2025-04-04 12:41:27 -07:00
Alexey Ivanov	fd7efe587e	[main] Move system indices migration to migrate plugin (#125437 ) * [main] Move system indices migration to migrate plugin It seems the best way to fix #122949 is to use existing data stream reindex API. However, this API is located in the migrate x-pack plugin. This commit moves the system indices migration logic (REST handlers, transport actions, and task) to the migrate plugin. Port of #123551 * [CI] Auto commit changes from spotless * Fix compilation * Fix tests * Fix test --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>	2025-04-04 18:49:38 +01:00
Stanislav Malyshev	6043d9c675	Update allow_partial_results docs (#126257 )	2025-04-03 22:13:49 -06:00
Niels Bauman	483f97915c	Run `TransportGetIndexAction` on local node (#125652 ) This action solely needs the cluster state, it can run on any node. Since this is the last class/action that extends the `ClusterInfo` abstract classes, we remove those classes too as they're not required anymore. Relates #101805	2025-04-02 18:41:35 +01:00
Mary Gouseti	25050495b9	Data stream options: convert `javaRestTests` to `yamlRestTests`. (#126037 ) In this PR we introduce the data stream API in the `es-rest-api` behind the feature flag. This enables us to use the `yamlRestTests` tests instead of the `javaRestTests`.	2025-04-03 01:32:54 +11:00
Niels Bauman	eb4d64f94a	Run `TransportGetSettingsAction` on local node (#126051 ) This action solely needs the cluster state, it can run on any node. Additionally, it needs to be cancellable to avoid doing unnecessary work after a client failure or timeout. Relates #101805	2025-04-02 15:05:31 +01:00
Niels Bauman	8028d5adde	Fix cat allocation YAML test (#126003 ) This test failed when the `disk.indices.forecast` value was a decimal number. We adjust the regex to allow decimal values and for consistency we also allow negative values. Fixes #125711 Fixes #125848 Fixes #125661	2025-04-01 11:25:13 +01:00
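The kind of regex adjustment described in the commit above (accepting decimal and negative values) might look like the sketch below. The pattern is hypothetical and is not the one used in the YAML test: it assumes the value is a number with an optional byte-size unit suffix.

```python
import re

# Hypothetical pattern: optional minus sign, digits, optional fractional part,
# optional byte-size unit. The real test's regex may differ.
FORECAST_VALUE = re.compile(r"-?\d+(\.\d+)?(b|kb|mb|gb|tb|pb)?")


def is_valid_forecast(text):
    """Return True if the whole string is an (optionally signed, decimal) size value."""
    return FORECAST_VALUE.fullmatch(text) is not None
```

Using `fullmatch` rather than `search` matters here: a partial match would accept trailing garbage such as `1.5kbx`.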
Benjamin Trent	505f21ba42	Simplify tests, bypassing raw score test (#125877 ) I was debating whether to have these tests in the original PR anyway. It isn't worth the flakiness. We know the oversampling setting gets updated given the other tests. closes: https://github.com/elastic/elasticsearch/issues/125851	2025-03-31 23:49:29 +11:00
Armin Braun	fd2cc97541	Introduce batched query execution and data-node side reduce (#121885 ) This change moves the query phase to a single roundtrip per node, just like can_match or field_caps work already. As a result of executing multiple shard queries from a single request, we can also partially reduce each node's query results on the data node side before responding to the coordinating node. This significantly reduces the impact of network latencies on end-to-end query performance, and reduces the amount of work done (memory and CPU) on the coordinating node and the network traffic by factors of up to the number of shards per data node. Benchmarking shows up to orders of magnitude improvements in heap and network traffic dimensions in querying across a larger number of shards.	2025-03-29 16:53:18 +01:00
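The two ideas in the commit above, one request per node instead of one per shard and a partial reduce on the data node, can be illustrated with a toy top-k merge. This is a conceptual sketch under simplifying assumptions, not the Elasticsearch implementation: hits are hypothetical `(score, doc_id)` tuples and both function names are invented.

```python
import heapq
from collections import defaultdict


def group_shards_by_node(shard_to_node):
    """Batch shards by the node that holds them, so each node receives one request."""
    by_node = defaultdict(list)
    for shard, node in shard_to_node.items():
        by_node[node].append(shard)
    return dict(by_node)


def reduce_top_k(hit_lists, k):
    """Merge several hit lists and keep the k highest-scoring hits."""
    all_hits = (hit for hits in hit_lists for hit in hits)
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[0])


def batched_search(shard_hits, shard_to_node, k):
    """Reduce per node first, then reduce the per-node results on the coordinator."""
    per_node = []
    for node, shards in group_shards_by_node(shard_to_node).items():
        # Data-node side reduce: at most k hits leave each node.
        per_node.append(reduce_top_k([shard_hits[s] for s in shards], k))
    return reduce_top_k(per_node, k)
```

In this model the coordinating node merges at most `k` hits per node rather than `k` hits per shard, which is where the "up to the number of shards per data node" factor comes from.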
Carlos Delgado	968bddc462	Non existing synonyms sets do not fail shard recovery (#125659 )	2025-03-27 18:04:20 +02:00
Benjamin Trent	d84eb1f53f	Update bbq test data to better distinguish docs (#125705 ) Adjust the test data. I verified that the scores are now more distinguishable when: - each doc has its own segment - when 1 & 2 are in the same segment but 3 is alone - 2 & 3 in the same segment but 1 alone - 1 & 3 in the same segment but 2 alone - all three in the same segment closes: https://github.com/elastic/elasticsearch/issues/123727 closes: https://github.com/elastic/elasticsearch/issues/124848	2025-03-28 00:12:56 +11:00
Benjamin Trent	dd58b0b6fa	Return appropriate error on null dims update instead of npe (#125716 ) Calling `Object::toString` was trying to call `null.toString()`, really it should have been `Objects::toString`, which accepts `null`. closes: https://github.com/elastic/elasticsearch/issues/125713	2025-03-27 08:47:20 +11:00
Benjamin Trent	009a86a0e3	Allow zero for rescore_vector.oversample to indicate by-passing oversample and rescoring (#125599 ) This allows a `rescore_vector: {oversample: 0}` to indicate bypassing oversampling and rescoring. This is useful for: - Updating a quantized mapping to turn off automatic rescoring - Bypassing oversampling at query time in an ad-hoc manner if its on by default in the mapping closes: https://github.com/elastic/elasticsearch/issues/125157	2025-03-27 06:56:51 +11:00
Stanislav Malyshev	07921a78a6	Handle long overflow in dates (#124048 ) * Handle long overflow in dates	2025-03-26 18:57:04 +02:00
Niels Bauman	fdd453734d	Fix NPE in rolling over unknown target and return 404 (#125352 ) Since #122905 we were throwing NPEs (i.e. 5xxs) when a rollover request has an unknown/non-existent target. Before that, we returned a 400 - illegal argument exception. We now return a 404 which matches "missing target" better. Additionally, to avoid this from happening again, we add a YAML test that asserts the correct exception behavior.	2025-03-22 12:59:13 +02:00
Lisa Cawley	97c5d4e149	Add more inference API REST specifications (#125187 )	2025-03-21 09:44:37 +02:00
Benjamin Trent	e9c4b267c2	Adjusting 41_knn_search_bbq_hnsw tests to have explicit refresh (#125255 )	2025-03-20 17:15:05 -04:00
Tommaso Teofili	6d3dac32c6	Let random_score yaml test explicitly fail on _id field (#125230 ) * constrain the no-field scenario to 9.x	2025-03-20 14:16:02 +01:00
István Zoltán Szabó	8a741bfd62	Adds VoyageAI PUT Inference API. (#125198 )	2025-03-19 13:29:14 +01:00

3822 commits