elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-04-25 07:37:19 -04:00

Author	SHA1	Message	Date
kosabogi	ff926182f1	Adds text_similarity task type to inference processor documentation (#113517 ) (#113612 )	2024-09-27 00:38:48 +10:00
István Zoltán Szabó	cf55728d77	[DOCS] Improves semantic text documentation. (#113606 ) (#113611 )	2024-09-27 00:34:37 +10:00
Kostas Krikellas	8539876663	[8.x] Apply auto-flattening to `subobjects: auto` (#113584 ) * Apply auto-flattening to `subobjects: auto` (#112092) * Introduce mode `subobjects=auto` for objects * Update docs/changelog/110524.yaml * compilation error * tests and fixes * refactor * spotless * more tests * fix nested objects * fix test * update fetch test * add QA coverage * update tests * update tests * update tests * Apply auto-flattening to `subobjects: auto` * Update docs/changelog/112092.yaml * sync * dont flatten subobjects auto * refine test * fix path for nested flattened objects and dynamic * document `subobjects: auto` * Apply suggestions from code review Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com> * comment updates * restore indentation in comment * update comment * update comment * update comment * update comment * rename isFlattenable * add test for dynamic template * fix copy_to and noop dynamic updates * tests * update comment * fix tests * update cluster feature in yaml test * address comments --------- Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com> (cherry picked from commit `fffe8844e9`) # Conflicts: # modules/dot-prefix-validation/build.gradle # rest-api-spec/build.gradle * Update build.gradle	2024-09-26 20:17:11 +10:00
Keith Massey	7870e2dbe2	Adding component template substitutions to the simulate ingest API (#113276 ) (#113567 )	2024-09-26 07:32:13 +10:00
Nik Everett	0e6bbb0bea	ESQL: TOP support for strings (#113183 ) (#113408 ) Adds support to the `TOP` aggregation for `keyword` and `text` field types. Closes #109849	2024-09-26 05:18:20 +10:00
Liam Thompson	fd775317ed	[DOCS] Create Elasticsearch basics section, refactor quickstarts section (#112436 ) (#113543 ) Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>	2024-09-26 01:55:19 +10:00
David Kyle	cc3caa228d	[ML] Add deployment threading details and memory usage to telemetry (#113099 ) (#113516 ) Adds deployment threading options and a new memory section reporting the memory usage for each of the ml features # Conflicts: # server/src/main/java/org/elasticsearch/TransportVersions.java	2024-09-25 22:35:09 +10:00
Sam Xiao	ce0681225b	ILM: Add total_shards_per_node setting to searchable snapshot (#112972 ) (#113493 ) Allows setting index total_shards_per_node in the SearchableSnapshot action of ILM to remediate hot spot in shard allocation for searchable snapshot index. Closes #112261	2024-09-25 06:53:11 +10:00
Nik Everett	f8dbda3f98	ESQL: Document esql_worker threadpool (#113203 ) (#113459 ) Documents the thread pool we use to run ESQL operations. It's the same size and queue depth as the `search` thread pool. Closes #113130	2024-09-24 23:28:53 +10:00
Salvatore Campagna	9a21ca63d7	LogsDB data migration integration testing (#112710 ) (#113448 ) Here we test reindexing logsdb indices, creating and restoring snapshots. Note that logsdb uses synthetic source and restoring source only snapshots fails due to missing _source. (cherry picked from commit `f7880ae85f`)	2024-09-24 21:47:09 +10:00
Salvatore Campagna	bac208a154	Introduce an `ignore_above` index-level setting (#113121 ) (#113414 ) Here we introduce a new index-level setting, `ignore_above`, similar to what we have for `ignore_malformed`. The setting will apply to all `keyword`, `wildcard` and `flattened` fields. Each field mapping will still be allowed to override the index-level setting using a mapping-level `ignore_above` value. (cherry picked from commit `208a1fe571`)	2024-09-24 06:16:08 +10:00
Liam Thompson	cbe2faead8	fix typos (#113329 ) (#113400 ) Co-authored-by: Pm Ching <41728178+pionCham@users.noreply.github.com>	2024-09-24 02:05:57 +10:00
Liam Thompson	9ae2439a34	[DOCS] Add snippet tests to retriever API docs (#113289 ) (#113396 )	2024-09-24 01:25:32 +10:00
Felix Barnsteiner	0aebbb53d6	[8.x] Add support for multi-value dimensions (#112645 ) (#113369 ) * Add support for multi-value dimensions (#112645) Closes https://github.com/elastic/elasticsearch/issues/110387 Having this in now affords us not having to introduce version checks in the ES exporter later. We can simply use the same serialization logic for metric attributes as we do for other signals. This also enables us to properly map `.ip` fields to the ip field type as ip fields containing a list of IPs are not converted to a comma-separated list. (cherry picked from commit `8d223cbf7a`) # Conflicts: # server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java Remove skip test for 8.x This was just needed for 8.x to 9.0 compatibility tests	2024-09-24 00:05:25 +10:00
Carlos Delgado	c3a2b19993	[8.x] ESQL QSTR function (#112590 ) (#113189 )	2024-09-23 10:13:53 +02:00
Martijn van Groningen	b82afc1377	Added known issue entry for synthetic source bug. (#113269 ) (#113358 ) Added known issue entry for synthetic source bug. Co-authored-by: Oleksandr Kolomiiets <olkolomiiets@gmail.com>	2024-09-23 15:34:22 +10:00
Iraklis Psaroudakis	6f63a4e08b	fix a couple of docs typos (#112901 ) (#113283 ) Co-authored-by: Pm Ching <41728178+pionCham@users.noreply.github.com>	2024-09-21 01:59:14 +10:00
Bogdan Pintea	6e314d6c2a	ESQL: Align year diffing to the rest of the units in DATE_DIFF: chronological (#113103 ) (#113258 ) This will correct/switch "year" unit diffing from the current integer subtraction to a chronological subtraction. Consequently, two dates are (at least) one year apart now if (at least) a full calendar year separates them. The previous implementation simply subtracted the year part of the dates. Note: this parts with ES SQL's implementation of the same function, which itself is aligned with MS SQL's implementation, which works equivalently to an integer subtraction. Fixes #112482. (cherry picked from commit `f7ff00f645`)	2024-09-20 22:31:36 +10:00
István Zoltán Szabó	ec109dd9bf	[DOCS] Fixes adaptive_allocations examples (#113248 ) (#113254 ) Co-authored-by: Jan Kuipers <148754765+jan-elastic@users.noreply.github.com>	2024-09-20 19:54:50 +10:00
Alexander Spies	afae6b2d46	ESQL Docs: Mention Discover/Field Statistics in OOM known issue in 8.15.1/2 (#113196 ) (#113243 )	2024-09-20 19:02:58 +10:00
Pius	83ea259b7c	Update 8.15.1.asciidoc (#113221 ) (#113240 )	2024-09-20 18:29:25 +10:00
Liam Thompson	8a5d68e390	[DOCS] Fix reranking IA, move retrievers to search api overview (#112949 ) (#113193 )	2024-09-20 01:49:59 +10:00
Simon Cooper	ceb9deff89	Use deprecation logger for CLDR date format specifiers (#112917 ) The addition of the logger requires several updates to tests to deal with the possible warning, or muting if there is no way to specify an allowed (but not mandatory) warning	2024-09-19 15:50:37 +01:00
David Turner	2ba00c2810	Mention full-cluster restart in `initial_master_node` docs (#112986 ) (#113166 ) Apparently some users consider "node is restarting" not to apply to a full-cluster restart. This commit further clarifies that you must not set `cluster.initial_master_nodes` in a full cluster restart.	2024-09-19 20:06:24 +10:00
Stef Nestor	c9764b86c4	(Doc+) Update example SAML blog for Okta (#112934 ) (#113098 )	2024-09-18 20:30:59 +10:00
István Zoltán Szabó	2f7ad416ce	[DOCS] Gives more details to the load data step of the semantic search tutorials (#113088 ) (#113094 ) Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>	2024-09-18 20:03:10 +10:00
Nik Everett	50703cb988	ESQL: Add known issue to 8.15 docs for OOM due to wide index pattern (#112926 ) (#112959 ) Co-authored-by: Alexander Spies <alexander.spies@elastic.co>	2024-09-16 16:30:42 -04:00
István Zoltán Szabó	08ce93eb01	[DOCS] Fixes response object indentation in semantic text tutorial (#112915 ) (#112920 )	2024-09-16 23:05:28 +10:00
Martijn van Groningen	47be9bb975	[8.x] Remove zstd feature flag for index codec best compression. (#112665 ) (#112857 ) * Remove zstd feature flag for index codec best compression. (#112665) ZStandard was added via #103374 a few months ago to snapshot builds of Elasticsearch only and benchmark results have shown that using zstd is a better trade off compared to deflate for when index.codec is set to best_compression. This change removes the feature flag for ZStandard stored field compression for indices with index.codec set to best_compression. * Update docs/changelog/112857.yaml --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2024-09-14 02:48:37 +10:00
István Zoltán Szabó	0c428b4923	[DOCS] Improves inference workflow tutorial. (#112870 ) (#112879 )	2024-09-14 02:01:16 +10:00
István Zoltán Szabó	21183609ae	[DOCS] Simplifies semantic_text tutorial by removing copy_to field (#112864 ) (#112876 )	2024-09-14 01:16:51 +10:00
Benjamin Trent	96cc923dcf	Update knn-query.asciidoc (#112833 ) (#112868 )	2024-09-13 21:40:59 +10:00
Stef Nestor	d039c280af	(Docs+) Flush out Resource+Task troubleshooting (#111773 ) (#112818 ) * (Docs+) Flush out Resource+Task troubleshooting --------- Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com> Co-authored-by: David Turner <david.turner@elastic.co>	2024-09-13 00:09:58 +10:00
István Zoltán Szabó	5b2d861f5a	[DOCS] Rework semantic search main page (#112452 ) (#112808 ) Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co>	2024-09-12 22:30:38 +10:00
Stef Nestor	b9662b505b	(Doc+) Inference Pipeline ignores Mapping Analyzers (#112522 ) (#112776 ) * (Doc+) Inference Pipeline ignores Mapping Analyzers From internal Dev feedback (will cross-link after), this updates that inference processors within ingest pipelines run before mapping analyzers effectively ignoring them. So if users want analyzers to take effect, they would need to select the analyzer's ingest pipeline process equivalent and run it higher in flow than the inference processor. --------- Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>	2024-09-12 08:30:07 +10:00
Stef Nestor	98aa3f2572	(Doc+) Terminating Exit Codes (#112530 ) (#112774 ) 👋 howdy, team! Mini PR to cross-replicate [this knowledge article](https://support.elastic.co/knowledge/6610ba83) about Elasticsearch's exit codes which expands [this ES doc section](https://www.elastic.co/guide/en/elasticsearch/reference/master/stopping-elasticsearch.html#fatal-errors).	2024-09-12 07:58:18 +10:00
Stef Nestor	a5dad1fe0e	(Doc+) CAT Nodes default columns (#112715 ) (#112772 ) 👋 howdy, team! 1. Related to https://github.com/elastic/dev/issues/2631, highlights customers are usually seeking `heap.percent` instead of `ram.percent` 2. Aligns the claimed "(Default)" columns in doc to what is returned for v8.15.1 test cluster	2024-09-12 07:54:54 +10:00
David Turner	f79fb8c25b	Introduce repository integrity verification API (#112348 ) Adds an API which scans all the metadata (and optionally the raw data) in a snapshot repository to look for corruptions or other inconsistencies. Closes https://github.com/elastic/elasticsearch/issues/52622 Closes ES-8560	2024-09-11 23:17:59 +10:00
Mary Gouseti	c1a2d390ef	Update data stream lifecycle telemetry to track global retention (#112451 ) Currently, the data stream lifecycle telemetry has the following structure: ``` { .... "data_lifecycle" : { "available": true, "enabled": true, "count": 0, "default_rollover_used": true, "retention": { "minimum_millis": 0, "maximum_millis": 0, "average_millis": 0.0 } }.... ``` In the snippet above you can see that we track: - The number of data streams managed by the data stream lifecycle by `count` - If the default rollover has been overwritten by `default_rollover_used` - The min, max and average of the `data_retention` configured on a data stream level. In this PR we propose the following extension: ``` .... "data_lifecycle" : { "available": true, "enabled": true, "count": 0, "default_rollover_used": true, "effective_retention": { #https://github.com/elastic/dev/issues/2537 "retained_data_streams": 5, "minimum_millis": 0, # Only if retained data streams > 1 "maximum_millis": 0, "average_millis": 0.0 }, "data_retention": { "configured_data_streams": 5, "minimum_millis": 0, # Only if retained data streams > 1 "maximum_millis": 0, "average_millis": 0.0 }, "global_retention": { "default": { "defined": true/false, "affected_data_streams": 0, "millis": 0 }, "max": { "defined": true/false, "affected_data_streams": 0, "millis": 0 } } ``` With this extension we are tracking: - The number of data streams managed by the data stream lifecycle by `count` - If the default rollover has been overwritten by `default_rollover_used` - The min, max and average of the `data_retention` configured on a data stream level and the number of data streams that have it configured. We add the min, max and avg only if there are data streams with data retention configuration to avoid messing with the stats in a dashboard. - The min, max and average of the `effective_retention` and the number of data streams that are retained. We add the min, max and avg only if there are retained data streams to avoid messing with the stats in a dashboard. - Global retention stats: whether they are defined, the number of affected data streams and the actual value. The above metrics allow us to answer questions like: - How many data streams are affected by global retention. - How big is the difference between the longest data retention and the max global retention. - How much the effective retention diverges from the data retention, which shows the impact of the global retention.	2024-09-11 18:31:04 +10:00
kosabogi	6e7a9eb629	Adds details on Kibana access credentials (#112695 )	2024-09-11 06:20:08 +02:00
Stanislav Malyshev	9081a951d5	Implement CCS telemetry export as part of _cluster/stats (#112310 ) * Implement CCS telemetry export as part of _cluster/stats	2024-09-10 09:31:06 -06:00
István Zoltán Szabó	3636797cfe	[DOCS] Adds path params and available task types to the PUT inference page (#112696 ) Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>	2024-09-10 12:43:08 +02:00
Liam Thompson	c2d4543250	[DOCS][101] Refine mappings + documents/indices overviews (#112545 )	2024-09-10 12:17:10 +02:00
kosabogi	6da37658ad	#101472 Updates default index.translog.flush_threshold_size value (#112052 ) * #101472 Updates default index.translog.flush_threshold_size value * Update docs/reference/index-modules/translog.asciidoc Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co> * Updates the description --------- Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>	2024-09-10 11:08:53 +02:00
Fang Xing	e8569356ea	[ES|QL] explicit cast a string literal to date_period and time_duration in arithmetic operations (#109193 ) explicit cast to date_period and time_duration in arithmetic operation	2024-09-09 14:56:43 -04:00
Nik Everett	ef3a5a1385	ESQL: Fix CASE when conditions are multivalued (#112401 ) When CASE hits a multivalued field it was previously either crashing on fold or evaluating it to the first value. Since booleans are loaded in sorted order from lucene that usually means `false`. This changes the behavior to line up with the rest of ESQL - now multivalued fields are treated as `false` with a warning. You might say "hey wait! multivalued fields usually become `null`, not `false`!". Yes, dear reader, you are right. Very right. But! `CASE`'s contract is to immediately convert its values into `true` or `false` using the standard boolean tri-valued logic. So `null` just becomes `false` immediately. This is how PostgreSQL, MySQL, and SQLite behave: ``` > SELECT CASE WHEN null THEN 1 ELSE 2 END; 2 ``` They turn that `null` into a false. And we're right there with them. Except, of course, that we're turning `[false, false]` and the like into `null` first. See!? It's consistent. Consistently confusing, but sane at least. The warning message just says "treating multivalued field as false" rather than explaining all of that. This also fixes up a few of CASE's docs which I noticed were kind of busted while working on CASE. I think the docs generation is having a lot of trouble with CASE so I've manually hacked the right thing into place, but we should figure out a better solution eventually. Closes #112359	2024-09-10 02:32:19 +10:00
Nik Everett	cf98240950	Update docs from code	2024-09-09 11:28:31 -04:00
David Turner	1977a715df	Add links to network disconnect troubleshooting (#112330 ) Makes the docs added in #112271 more discoverable.	2024-09-10 00:59:39 +10:00
Chris Berkhout	fbaeb1ee61	[ESQL] Add `SPACE` function (#112350 ) Adds the SPACE(number) function, which is equivalent to REPEAT(" ", number).	2024-09-09 21:41:35 +10:00
Iván Cea Fontenla	fc2760cfd4	ESQL: mv_median_absolute_deviation function (#112055 ) - Added mv_median_absolute_deviation function - Added possibility of having a fixed param in Multivalue "ascending" functions - Add surrogate to MedianAbsoluteDeviation ### Calculations used to avoid overflows First, a quick recap of how the MAD is calculated: 1. Sort values, and get the median 2. Calculate the difference between each value with the median (`abs(median - value)`) 3. Sort the differences, and get their median Calculating a MAD may overflow when calculating the differences (Step 2), given the type is a signed number, as the difference is a positive value, with potentially the same value as `POSITIVE_MAX - NEGATIVE_MIN`. To solve this, some types are up-casted as follow: - Int: Stored as longs, simple approach - Long: Stored as longs, but switched to unsigned long representation when calculating the differences - Unsigned long: No effect; the resulting range is the same - Doubles: Nothing. If the values overflow to +/-infinity, they're left that way, as we'll just use those outliers to sort Closes https://github.com/elastic/elasticsearch/issues/111590	2024-09-09 10:04:25 +02:00
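
The three-step MAD recap in commit fc2760cfd4 can be sketched in a few lines. This is only an illustration in Python, not the Elasticsearch implementation: the function name `median_absolute_deviation` is hypothetical, and because Python integers are arbitrary-precision, the overflow handling the commit describes (switching longs to an unsigned representation) is not reproduced here. For an even number of values, `statistics.median` averages the two middle values; whether ESQL does the same is not stated in the commit message.

```python
from statistics import median


def median_absolute_deviation(values):
    """Hypothetical helper: MAD of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("values must be non-empty")
    # Step 1: sort the values and take the median
    # (statistics.median sorts internally).
    center = median(values)
    # Step 2: absolute difference between each value and the median.
    # In a 64-bit signed type this subtraction is the step that can overflow.
    deviations = [abs(center - v) for v in values]
    # Step 3: sort the differences and take their median.
    return median(deviations)


if __name__ == "__main__":
    print(median_absolute_deviation([1, 1, 2, 2, 4, 6, 9]))
```

With the extreme signed 64-bit inputs `[-(2**63), 0, 2**63 - 1]`, one of the differences in step 2 is `2**63`, which does not fit in a signed long; that is the situation the commit's unsigned-long switch is meant to address.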

11966 commits