elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-04-23 14:47:31 -04:00

Author	SHA1	Message	Date
Martijn van Groningen	2b1170509b	Change subobjects yaml tests to use composable index templates. (#112129 ) Currently the legacy templates are being used which are deprecated.	2024-08-23 17:17:54 +07:00
Quentin Pradet	92d25c157a	Fix id and routing types in indices.split YAML tests (#112059 )	2024-08-23 12:46:52 +04:00
Kostas Krikellas	1362d56865	Introduce mode `subobjects=auto` for objects (#110524 ) * Introduce mode `subobjects=auto` for objects * Update docs/changelog/110524.yaml * compilation error * tests and fixes * refactor * spotless * more tests * fix nested objects * fix test * update fetch test * add QA coverage * update tests * update tests * update tests * fix nested	2024-08-22 15:13:52 +03:00
Oleksandr Kolomiiets	27721c3c05	Add a test reproducing issue with lookup of parent document in nested field synthetic source (#112043 )	2024-08-21 08:36:55 -07:00
Keith Massey	fac9b6a21e	Updating fix version for bulk api took time fix now that it has been backported (#111863 ) (#111899 ) (#111906 )	2024-08-14 12:59:01 -05:00
Keith Massey	e63225ae32	Fixing incorrect bulk request took time (#111863 )	2024-08-14 10:39:45 -05:00
Jim Ferenczi	6ee9801a99	Update the intervals query docs (#111808 ) Since https://github.com/apache/lucene-solr/pull/620, intervals disjunctions are automatically rewritten to handle cases where minimizations can miss valid matches. This change updates the documentation to take this behaviour into account (users don't need to manually pull intervals disjunctions to the top anymore).	2024-08-13 13:39:55 +09:00
Benjamin Trent	d0bd1f2cb1	fixing data setup for knn yaml tests (#111794 ) We should do set up just in the test as that is the only place that uses this index. This way we get around any weird bwc checks around previously required parameters. Additionally, this adjusts the bwc version skip as the code fix has been backported. closes: https://github.com/elastic/elasticsearch/issues/111765 closes: https://github.com/elastic/elasticsearch/issues/111766 closes: https://github.com/elastic/elasticsearch/issues/111767 closes: https://github.com/elastic/elasticsearch/issues/111768	2024-08-13 06:38:14 +10:00
Kathleen DeRusso	4e26114764	Fix NullPointerException when doing knn search on empty index without dims (#111756 ) * Fix NullPointerException when doing knn search on empty index without dims * Update docs/changelog/111756.yaml * Fix typo in yaml test --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2024-08-09 13:59:57 -04:00
Simon Cooper	b0c82f4054	Update docs with new behavior on skip conditions (#111640 ) #111585 and #111268 change the behavior to skip on any node having the feature/capability, not all nodes	2024-08-07 10:37:59 +01:00
Simon Cooper	5da4f31a4b	Skip on any node capability being present (#111585 ) Update capabilities skip behavior to skip on any node having the capability, not all nodes	2024-08-07 10:36:23 +01:00
Mike Pellegrini	e11fa74333	Gracefully handle invalid synonym rules in updateable synonyms (#110901 ) Gracefully handle invalid synonym rules by setting lenient to true by default when synonyms are updateable --------- Co-authored-by: carlosdelest <carlos.delgado@elastic.co>	2024-08-06 10:44:23 -04:00
David Turner	e3a2ce99de	Fix trappy timeouts in data stream APIs (#111474 ) Relying on the default 30s timeout is trappy, we should be explicit about the timeouts we're using in these requests. Relates #107984	2024-08-06 21:13:06 +10:00
David Turner	586405d11f	Remove trappy timeout from `ClusterSearchShardsRequest` (#111442 ) Exposes the `?master_timeout` parameter to the REST API and sets it appropriately on internal/test requests. Relates #107984	2024-07-31 08:53:24 +01:00
Benjamin Trent	69c96974de	Ensure vector similarity correctly limits inner_hits returned for nested kNN (#111363 ) For nested kNN we support not only similarity thresholds, but also multi-passage search while retrieving more than one nearest passage. However, the inner_hits retrieved for the kNN search would ignore the restricted similarity. Meaning, the inner hits would return all passages, not just the ones within the limited similarity and this is confusing. closes: https://github.com/elastic/elasticsearch/issues/111093	2024-07-30 06:01:56 +10:00
Nhat Nguyen	52834fe041	Relax assertions in segment level field stats (#111243 ) This PR relaxes the assertions to allow an additional field introduced in serverless.	2024-07-24 12:58:11 -07:00
Nhat Nguyen	20094bfd8f	Use routing_table for allocated node in tests (#111217 ) The previous fix, which uses the search API, doesn't work with the indexing tier only. This change uses the routing table from the cluster state instead. I have tested this change in a serverless environment. Relates #111211	2024-07-24 10:04:27 +10:00
Oleksandr Kolomiiets	b8da526eda	Change the name of logsdb mapping test file to more specific (#111076 )	2024-07-23 15:19:04 -07:00
Nhat Nguyen	8e07c4e572	Replace search_shards with search API in tests (#111211 ) The `search_shards` API is not available in serverless. This PR replaces its usage in the newly added test with the `search` API with profiling. Relates #111123	2024-07-24 06:14:52 +10:00
Nhat Nguyen	f275dff609	Add Lucene segment-level fields stats (#111123 ) This change returns the total number of fields at the segment level, allowing for a more accurate estimate of the memory used by Lucene. The new estimate is expected to be closer to the actual memory usage than the current estimate using the index-level field count, due to the non-trivial overhead incurred by each Lucene segment. Two new fields are introduced: total_segment_fields, which is the total number of fields at the segment level, and average_fields_per_segment. The overhead per field in segments with fewer fields is larger than in segments with many fields.	2024-07-23 08:52:39 -07:00
Oleksandr Kolomiiets	344d846c5b	Fix remaining references to logs index mode (#111164 )	2024-07-22 12:28:10 -07:00
Keith Massey	a2814e816b	Adding mapping validation to the simulate ingest API (#110606 )	2024-07-19 08:08:21 -05:00
Salvatore Campagna	0f584176ca	Rename `logs` index mode to `logsdb` (#111054 )	2024-07-19 13:38:58 +02:00
Salvatore Campagna	9332a937e1	test: re-enable test after backport #11031 (#111035 )	2024-07-19 10:20:43 +02:00
Enrico Zimuel	39aa832400	Changed security API endpoints to stable (#110862 )	2024-07-18 15:24:36 +02:00
Tommaso Teofili	0289ca68b8	Dense vector field types updatable for int4 (#110928 )	2024-07-18 13:54:32 +02:00
Salvatore Campagna	ac2afd7633	Inject `host.name` field without relying on (component) templates (#110938 ) We do not want to rely on templates or component templates to include the host.name field in indices using LogsDB. The host.name field is a field we sort on by default when LogsDB is used. As a result, we just inject it by default, the same way we do for the @timestamp field. This prevents sorting errors due to missing host.name field in mappings. The host.name is a keyword field and depending on the value of subobjects it will be mapped as a name keyword nested inside a host or as a flat host.name keyword. We also include ignore_above as we normally do for keywords in observability mappings.	2024-07-18 12:47:51 +02:00
Joe Gallo	27e7601698	Directly download commercial ip geolocation databases from providers (#110844 ) Co-authored-by: Keith Massey <keith.massey@elastic.co>	2024-07-17 20:55:14 -04:00
Benjamin Trent	28c7cbccce	Make empty string searches be consistent with case (in)sensitivity (#110833 ) If we determine that the searchable term is completely empty, we switch back to a regular term query. This way we return the same docs as expected when we do a case sensitive search. closes: #108968	2024-07-17 15:20:57 -04:00
Oleksandr Kolomiiets	ed0f3d0f70	Revert "Fix logsdb mapping rest tests on serverless (#110900 )" (#110931 ) This reverts commit `1bb58ccff0`.	2024-07-16 09:49:52 -07:00
Oleksandr Kolomiiets	1bb58ccff0	Fix logsdb mapping rest tests on serverless (#110900 ) Currently fails due to validation that is only performed in serverless: ``` java.lang.AssertionError: Failure at [logsdb/20_mapping:94]: Expected: "Failed to parse mapping: Indices with with index mode [logs] only support synthetic source" but: was "Failed to parse mapping: Parameter [mode=disabled] is not allowed in source" ```	2024-07-16 08:15:33 +10:00
Oleksandr Kolomiiets	a25ed530e5	Add validation for synthetic source mode in logs mode indices (#110677 )	2024-07-15 11:11:57 -07:00
Benjamin Trent	c2e1ab8934	Correct tests, skipping on cluster features in mixed clusters is buggy (#110747 ) Cluster feature "skip" just doesn't work as expected in a mixed cluster scenario. It could be that the request is handled by a new node. I honestly don't know what's happening there. This adjusts the tests so that we verify that `allow_unmapped_fields` modifies the behavior as expected. closes: https://github.com/elastic/elasticsearch/issues/110720 closes: https://github.com/elastic/elasticsearch/issues/110719	2024-07-12 22:39:14 +10:00
Nhat Nguyen	1964be565c	Allow querying index_mode (#110676 ) This change allows querying the `index.mode` setting via a new `_index_mode` metadata field, enabling APIs such as `field_caps` or `resolve_indices` to target indices that are either time_series or logs only. This approach avoids adding and handling a new parameter for `index_mode` in these APIs. Both ES\|QL and the `_search` API should also work with this new field.	2024-07-10 16:45:11 -07:00
ghostspiders	3bd192c2e0	KnnVectorQueryBuilder support for allowUnmappedFields (#107047 ) * KnnVectorQueryBuilder support for allowUnmappedFields * Update and rename 106811.yaml to 107047.yaml * Update 107047.yaml * buildkite test * spotless * spotless * Apply suggestions from code review * fixing compilation --------- Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Benjamin Trent <4357155+benwtrent@users.noreply.github.com>	2024-07-10 09:41:15 -04:00
Moritz Mack	a4b3e6ffb5	Use valid documentation url for capabilities in rest specs (#110657 )	2024-07-10 09:30:45 +02:00
Benjamin Trent	9dbe97b2cb	Fix flaky test #109978 (#110245 ) CCS tests could split the vectors over any number of shards. Through empirical testing, I determined that this commit's values provide the expected order, even if they are not all part of the same shard. Quantization can behave oddly when the values are uniform, as they are in this test. closes #109978	2024-07-09 07:28:31 +10:00
Johannes Fredén	89cd966b24	Add bulk delete roles API (#110383 ) * Add bulk delete roles API	2024-07-03 11:04:53 +02:00
Albert Zaharovits	566f5f831a	Query Roles API (#108733 ) This adds the Query Roles API: ``` POST /_security/_query/role GET /_security/_query/role ``` This is similar to the currently existing: * [Query API key API](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-query-api-key.html) * [Query User API](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-query-user.html) Sample request: ``` POST /_security/_query/role { "query": { "bool": { "filter": [ { "terms": { "applications.application": ["app-1", "app-2" ] } } ], "must_not": [ { "match": { "description": { "query": "test match on role description (which is mapped as a text field)" } } } ] } }, "sort": [ "name" ], "search_after": [ "role-name-1" ] } ``` The query supports a subset of query types, including match_all, bool, term, terms, match, ids, prefix, wildcard, exists, range, and simple query string. Currently, the supported fields are: * name * description * metadata * applications.application * applications.resources * applications.privileges The query also supports pagination-related fields (`from`, `size`, `search_after`), analogous to the generic [Search API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html). The response format is similar to that of the [Query API key](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-query-api-key.html) and [Query User](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-query-user.html) APIs. It contains a list of roles, in the sorted order (if specified). Unlike the [Get Roles API](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-get-role.html), the role name is an attribute of the element in the list of roles (in the get-roles API case, the role name was the key in the response map, and the value was the rest of the role descriptor). 
In addition, the element in the list of roles also contains the optional `_sort` field, eg (sample response): ``` { "total": 3, "count": 3, "roles": [ { "name": "LYdz2", "cluster": [], "indices": [], "applications": [ { "application": "ejYWvGQTF", "privileges": [ "pRCfBMgOy", "zDhFtMQfc", "roudxado" ], "resources": [ "nWHEpmgxy", "SOML/hMYrqx", "YIqP/", "ueEomwsA" ] }, { "application": "ampUW9", "privileges": [ "jDvRtp" ], "resources": [ "99" ] } ], "run_as": [], "metadata": { "nFKc": [ 1, 0 ], "PExF": [], "qlqY": -433239865, "IQXm": [] }, "transient_metadata": { "enabled": true }, "description": "KoLlsEbq", "_sort": [ "LYdz2" ] }, { "name": "oaxW0", "cluster": [], "indices": [], "applications": [ { "application": "", "privileges": [ "qZYb" ], "resources": [ "tFrSULaKb" ] }, { "application": "aLaEN9", "privileges": [ "fCOc" ], "resources": [ "gozqXtSgE", "UX/JgydeIM", "sjUp", "Ivdz/UAmuNrQAG" ] }, { "application": "rbxyuKIMPAp", "privileges": [ "lluqieFRu", "xKU", "gHlb" ], "resources": [ "99" ] } ], "run_as": [], "metadata": {}, "transient_metadata": { "enabled": true }, "_sort": [ "oaxW0" ] }, { "name": "vWAV1", "cluster": [], "indices": [], "applications": [ { "application": "*", "privileges": [ "kWBWjCAc" ], "resources": [ "hvEtV", "gZJ" ] }, { "application": "avVUV9", "privileges": [ "newZTa", "gQpxNm" ], "resources": [ "99" ] } ], "run_as": [], "metadata": {}, "transient_metadata": { "enabled": true }, "_sort": [ "vWAV1" ] } ] } ```	2024-07-03 01:59:11 +10:00
Johannes Fredén	55476041d9	Add BulkPutRoles API (#109339 ) * Add BulkPutRoles API	2024-07-02 15:45:39 +02:00
Martijn van Groningen	e0d71d660d	Disallow index.time_series.end_time setting from being set or updated in normal indices (#110268 ) The index.mode setting validates other index settings. When updating the index.time_series.end_time setting and the index.mode setting wasn't defined at index creation time (meaning that the default is active), this validation is skipped, which results in (worse) errors at a later point in time. This problem is fixed by making the index.mode setting a dependency of the index.time_series.end_time setting. Note that this problem doesn't exist for the index.time_series.start_time and index.routing_path index settings, because these index settings are final, which means they can only be defined when an index is being created. Closes #110265	2024-07-02 12:19:09 +02:00
Kostas Krikellas	e3caeed2b6	Fix sort on nested test (#110331 ) * Add test for nested array, fix sort on nested test. * Fix sort on nested test.	2024-07-01 15:00:15 +03:00
Kostas Krikellas	5fa92812cf	Add test for nested array, fix sort on nested test. (#110325 )	2024-07-01 12:08:04 +03:00
Kostas Krikellas	6ae652f90e	Support index sorting with nested fields (#110251 ) This PR piggy-backs on recent changes in Lucene 9.11.1 (https://github.com/apache/lucene/pull/12829, https://github.com/apache/lucene/pull/13341/), setting the parent doc when nested fields are present. This allows moving nested documents along with parent ones during sorting. With this change, sorting is now allowed on fields outside nested objects. Sorting on fields within nested objects is still not supported (throws an exception). Fixes #107349	2024-07-01 17:24:17 +10:00
Mayya Sharipova	405e39660b	Support k parameter for knn query (#110233 ) Introduce an optional k param for knn query. If k is not set, knn query has the previous behaviour: - `num_candidates` docs are collected from each shard. These `num_candidates` docs are used for combining results with other queries and aggregations on each shard. - docs from all shards are merged to produce the top global `size` results. If k is set, the behaviour is instead as follows: - `k` docs are collected from each shard. These `k` docs are used for combining results with other queries and aggregations on each shard. - similarly, docs from all shards are merged to produce the top global `size` results. Having the `k` param makes it more intuitive for users to address their needs. They also don't need to care about the `num_candidates` param and can skip it for this query, as it is more of an internal detail for tuning how knn search operates. Closes #108473	2024-06-28 09:59:28 -04:00
Kathleen DeRusso	959d07f5ee	Rename query rules namespace in rest api spec (#110208 ) * Rename query rules namespace in rest api spec * Rename per Specification PR feedback	2024-06-28 08:19:08 -04:00
Alexander Spies	2876e059f3	Aggs: Improve scripted metric agg allow list tests (#110153 ) * Add an override to the aggs tests to override the allow list default setting. This makes it possible to run the scripted metric aggs tests on Serverless, even when we disallow these aggs per default on Serverless. * Move the allow list tests next to the scripted metric tests since these belong together.	2024-06-28 11:47:30 +02:00
Oleksandr Kolomiiets	736357a9fb	Handle ignore_above in synthetic source for flattened fields (#110214 )	2024-06-27 10:11:26 -07:00
Benjamin Trent	5add44d7d1	Adds new `bit` element_type for dense_vectors (#110059 ) This commit adds `bit` vector support by adding `element_type: bit` for vectors. This new element type works for indexed and non-indexed vectors. Additionally, it works with `hnsw` and `flat` index types. No quantization based codec works with this element type, this is consistent with `byte` vectors. `bit` vectors accept up to `32768` dimensions in size and expect vectors that are being indexed to be encoded either as a hexadecimal string or a `byte[]` array where each element of the `byte` array represents `8` bits of the vector. `bit` vectors support script usage and regular query usage. When indexed, all comparisons done are `xor` and `popcount` summations (aka, hamming distance), and the scores are transformed and normalized given the vector dimensions. Note, indexed bit vectors require `l2_norm` to be the similarity. For scripts, `l1norm` is the same as `hamming` distance and `l2norm` is `sqrt(l1norm)`. `dotProduct` and `cosineSimilarity` are not supported. Note, the dimensions expected by this element_type must always be divisible by `8`, and the `byte[]` vectors provided for indexing must have size `dim/8`, where each byte element represents `8` bits of the vector. closes: https://github.com/elastic/elasticsearch/issues/48322	2024-06-27 04:48:41 +10:00
Quentin Pradet	6d98e0d6b9	Fix trailing slash in two rollup specifications (#110176 )	2024-06-26 12:29:19 +04:00
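
The message of commit 5add44d7d1 above describes `bit` vectors as packed bytes (8 bits per byte, `dim/8` bytes in total) compared by `xor` and `popcount`. A minimal Python sketch of that arithmetic follows. It is an illustration only, not the Elasticsearch implementation: the helper names `pack_bits` and `hamming` are hypothetical, and the most-significant-bit-first ordering within each byte is an assumption the commit message does not state. The score normalization mentioned in the commit is not reproduced here because its formula is not given.

```python
import math


def pack_bits(bits):
    """Pack a sequence of 0/1 values into bytes, 8 bits per byte.

    Assumption: the first bit of each group of 8 becomes the most
    significant bit of the byte. The commit message does not specify this.
    """
    if len(bits) % 8 != 0:
        # The commit message requires dimensions divisible by 8.
        raise ValueError("number of dimensions must be divisible by 8")
    packed = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | (1 if bit else 0)
        packed.append(byte)
    return bytes(packed)


def hamming(a, b):
    """Hamming distance between two packed vectors: popcount of the xor."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# A 16-dimensional vector packs into 16 / 8 = 2 bytes.
v1 = pack_bits([1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1])
v2 = pack_bits([0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1])

hex_form = v1.hex()           # hexadecimal string form of the same bytes
l1 = hamming(v1, v2)          # per the commit, l1norm equals hamming distance
l2 = math.sqrt(l1)            # per the commit, l2norm is sqrt(l1norm)
```

Here the two vectors differ in two bit positions, so `l1` is 2 and `l2` is its square root; `hex_form` shows how the same bytes would look in the hexadecimal string encoding the commit mentions.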


3650 commits