Commit graph

12605 commits

Author SHA1 Message Date
Slobodan Adamović
265c70423b
[DOCS] Add missing ELASTIC_PASSWORD in docker-compose (#112372)
This PR adds missing ELASTIC_PASSWORD environment variable to es02 and es03 nodes.

Resolves https://github.com/elastic/elasticsearch/issues/112235
2024-09-03 09:58:18 +02:00
Luke Whiting
0426e1fbd5
(API) Cluster Health report unassigned_primary_shards (#111727) (#112024)
This PR adds a count of currently unassigned primary shards to both the
`/_cat/health` and `/_cluster/health` endpoints. This is to aid cluster
administrators in estimating the time remaining for a cluster to go from
RED to YELLOW status as per enhancement request #111727.

Tests and doc updates are in place with this PR and manual testing with
`./gradlew run` has been conducted on the endpoints to ensure correct
output.

## Known Limitations

* Testing
  * Due to limitations in the YAML REST test framework skip
    functionality, YAML REST tests for this endpoint are disabled when
    running a mixed version cluster by using a cluster version number
    synthetic feature to skip when any member of the cluster is not at
    a version greater than when this change is due to be introduced
2024-09-02 20:00:06 +10:00
Liam Thompson
1acba13a44
[DOCS] Update documents and indices overview (#112394) 2024-09-02 11:22:41 +02:00
Mary Gouseti
91f4023e27
Expose global retention settings via data stream lifecycle API (#112210)
In this PR we expose the global retention via the `GET
_data_stream/{target}/_lifecycle` API.

Since the global retention is a main feature of the data stream
lifecycle we chose to expose it by default.

```
GET /_data_stream/my-data-stream/_lifecycle
{
  "global_retention": {
    "default_retention": "7d",
    "max_retention": "365d"
  },
  "data_streams": [...]
}
```
2024-09-02 18:40:08 +10:00
Liam Thompson
3d2ca69b7c
[DOCS] Collapse some content in local dev setup for readability (#112355)
* [DOCS] Collapse some content in local dev setup for readability

* Reword collapsible text
2024-09-02 10:28:35 +02:00
Lee Hinman
4ae88f98dc
Add 'verbose' flag retrieving maximum_timestamp for get data stream API (#112303)
This commit adds support for the `verbose` querystring parameter to the
get data stream API (`GET /_data_stream/{name}`).

The flag defaults to "false".

When set to true, the `maximum_timestamp` for the data stream will be
retrieved and returned for each data stream retrieved. This is the same
information available from the data stream stats API (and internally
uses the same action to retrieve it).
2024-08-31 03:18:15 +10:00
István Zoltán Szabó
adb23531f9
[DOCS] Adds Google Vertex AI tutorial (#112339)
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
2024-08-30 13:17:59 +02:00
David Kyle
e3e562ffbf
[ML] Support sparse embedding models in the elasticsearch inference service (#112270)
For a sparse embedding model created with the ml trained models APIs
2024-08-29 17:18:54 +01:00
David Turner
9387ce3357
Deduplicate unstable-cluster troubleshooting docs (#112333)
We duplicated these docs in order to avoid breaking older links, but
this makes it confusing and hard to link to the right copy of the
information. This commit removes the duplication by replacing the docs
at the old locations with stubs that link to the new locations.
2024-08-29 13:16:37 +01:00
weizijun
35fe3a9c47
some fixed (#112332) 2024-08-29 13:46:58 +02:00
István Zoltán Szabó
2c29a3ae0a
[DOCS] Highlights auto-chunking in intro of semantic text. (#111836) 2024-08-29 12:43:10 +02:00
Liam Thompson
aa57a1553e
[DOCS] Rewrite "What is Elasticsearch?" (Part 1) (#112213) 2024-08-29 10:13:30 +02:00
weizijun
b9dea69b5c
[Inference API] Add Docs for AlibabaCloud AI Search Support for the Inference API (#112273)
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
2024-08-29 09:17:27 +02:00
David Turner
59a42ed41b
Include network disconnect info in troubleshooting docs (#112323)
A misplaced `//end::` tag meant that the docs added in #112271 are only
included in the page on fault detection and not the equivalent
troubleshooting docs. This commit fixes the problem.
2024-08-29 15:03:13 +10:00
David Turner
42d650b9bb
Add docs for troubleshooting network disconnects (#112271)
Basically the same as for nodes that leave the cluster with reason
`disconnected`, except that these disconnects don't involve the master
so don't cause any nodes to leave the cluster.
2024-08-28 18:59:11 +10:00
Mary Gouseti
bed6e18fa3
Exclude internal data streams from global retention (#112100)
With #111972 we enable users to set up global retention for data streams that are managed by the data stream lifecycle. This will allow users of elasticsearch to have more control over their data retention, and consequently better resource management of their clusters.

However, there is a small number of data streams that are necessary for the good operation of elasticsearch and should not follow user defined retention to avoid surprises.

For this reason, we put forth the following definition of internal data streams.

A data stream is internal if it's either a system index (system flag is true) or if its name starts with a dot.

This PR adds the `isInternalDataStream` param in the effective retention calculation making explicit that this is also used to determine the effective retention.
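
The definition above can be sketched as a small predicate. This is only an illustration: the function name and signature are hypothetical and are not taken from the Elasticsearch code base.

```python
def is_internal_data_stream(name: str, is_system: bool) -> bool:
    """Hypothetical helper mirroring the definition in this commit message.

    A data stream counts as internal when its system flag is true
    or when its name starts with a dot.
    """
    return is_system or name.startswith(".")


# Example names are made up for illustration.
for name, is_system in [(".internal-stream", False), ("logs-app", False), ("logs-app", True)]:
    print(name, is_system, is_internal_data_stream(name, is_system))
```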
2024-08-28 11:28:35 +03:00
David Turner
f150e2c11d
Add telemetry for repository usage (#112133)
Adds to the `GET _cluster/stats` endpoint information about the snapshot
repositories in use, including their types, whether they are read-only
or read-write, and for Azure repositories the kind of credentials in
use.
2024-08-27 23:34:02 +10:00
Panagiotis Bailis
7563a724f0
Updating retriever documentation to better explain how filters are applied (#112201) 2024-08-26 16:15:31 +03:00
Panagiotis Bailis
785fe5384b
Adding support for allow_partial_search_results in PIT (#111516) 2024-08-26 12:56:08 +03:00
Panos Koutsovasilis
29453cb2ce
fix: support all allowed protocol numbers (#111528)
* fix(CommunityIdProcessor): support all allowed protocol numbers

* fix(CommunityIdProcessor): update documentation
2024-08-26 08:37:40 +03:00
Nik Everett
9d6bef1651
Docs: Scripted metric not available in serverless (#112161)
This updates the docs to say that scripted metric is not available in
serverless.
2024-08-23 15:26:46 -04:00
Liam Thompson
d71654195c
[DOCS] Wrap document/field restriction tip in IMPORTANT block (#112146) 2024-08-23 18:23:57 +02:00
Mary Gouseti
34a78f3cf3
Add documentation to deprecate the global retention privileges. (#112020) 2024-08-23 11:49:15 +03:00
Niels Bauman
e0c1ccbc1e
Make enrich cache based on memory usage (#111412)
The max enrich cache size setting now also supports an absolute max size in bytes (of used heap space) and a percentage of the max heap space, next to the existing flat document count. The default is 1% of the max heap space.

This should prevent issues where the enrich cache takes up a lot of memory when there are large documents in the cache.
2024-08-23 09:26:55 +02:00
Parker Timmins
1072f2bbab
Add interval based SLM scheduling (#110847)
Add the ability to schedule SLM policies with a time unit interval schedule rather than a cron job schedule. For example, an SLM policy can be created with the argument "schedule":"30m". This will create a policy that will run 30 minutes after the policy modification_date. It will then run again every time another 30 minutes has passed. Every time the policy is changed, the next snapshot will be re-scheduled to run one interval after the new modification date.
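
The timing rule described above can be sketched as follows. This is a minimal illustration, not the SLM implementation: `next_snapshot_times` is a hypothetical helper, and the interval is passed as a `timedelta` rather than parsed from a string such as "30m".

```python
from datetime import datetime, timedelta, timezone


def next_snapshot_times(modification_date, interval, count):
    """Return the first `count` run times for an interval schedule.

    Hypothetical helper: run N fires N intervals after the policy's
    modification date, as described in the commit message.
    """
    return [modification_date + interval * n for n in range(1, count + 1)]


modified = datetime(2024, 8, 22, 12, 0, tzinfo=timezone.utc)
for run in next_snapshot_times(modified, timedelta(minutes=30), 3):
    print(run.isoformat())
```

Changing the policy corresponds to calling the helper again with the new modification date, which shifts every later run accordingly.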
2024-08-22 21:15:29 -05:00
Mary Gouseti
ed60470518
Display effective retention in the relevant data stream APIs (#112019) 2024-08-22 17:42:49 +03:00
Stef Nestor
f37440f441
(Doc+) Allocation Explain Examples: THROTTLED, MAX_RETRY (#111558)
Adds [Allocation Explain examples](https://www.elastic.co/guide/en/elasticsearch/reference/master/cluster-allocation-explain.html#cluster-allocation-explain-api-examples) for `THROTTLED` and `MAX_RETRY`. Also formats the sub TOC so that we can later link code messages to those docs.
2024-08-22 08:16:36 -06:00
David Turner
615e084617
Add more cross-links about sniff/proxy modes (#112079)
The info about remote cluster connection modes is a little disjointed.
This commit adds some cross-links between the sections to help users
find more relevant information.
2024-08-22 14:13:56 +01:00
kosabogi
62305f018b
Updates-warning-about-mounting-snapshots (#112057)
* Updates-warning-about-mounting-snapshots

* Update docs/reference/searchable-snapshots/apis/mount-snapshot.asciidoc

Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
2024-08-22 12:22:32 +02:00
David Turner
f0dbda7529
Expand docs on remote cluster proxying (#112025)
It's not obvious from the docs that transport connections (including
connections to remote clusters) use a custom binary protocol and require
a _layer 4_ proxy. This commit clarifies this point.
2024-08-21 22:26:57 +01:00
Liam Thompson
84ddd6c7af
[DOCS] Update rank_constant value in retriever example (#112056) 2024-08-21 15:11:19 +02:00
Kuni Sen
fa7f836916
Update searchable snapshot doc about the timing to notice data loss (#112050)
Update searchable snapshot doc about the timing to notice data loss:
Sometimes searchable snapshot data is cached on disk, so users may only
notice the data loss later, during a node restart (or, on Elastic Cloud,
host maintenance), after they delete their snapshots.
2024-08-21 22:48:06 +10:00
Stef Nestor
f5de9c00c8
(Doc+) "min_primary_shard_size" for 10-50GB shards (#111574)
👋🏽 howdy, team! 

Expands [10-50GB sharding recommendation](https://www.elastic.co/guide/en/elasticsearch/reference/master/size-your-shards.html#shard-size-recommendation) to include ILM's more recent [`min_primary_shard_size`](https://www.elastic.co/guide/en/elasticsearch/reference/master/ilm-rollover.html) option to avoid small shards.
2024-08-21 11:57:09 +02:00
Stef Nestor
c1019d4c5d
(Doc+) Link API doc to parent object - part1 (#111951)
* (Doc+) Link API to parent Doc part1

---------

Co-authored-by: shainaraskas <shaina.raskas@elastic.co>
Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
2024-08-20 14:58:18 -06:00
Nik Everett
d8e705d5da
ESQL: Document date instead of datetime (#111985)
This changes the generated types tables in the docs to say `date`
instead of `datetime`. That's the name of the field in Elasticsearch so
it's a lot less confusing to call it that.

Closes #111650
2024-08-21 01:59:13 +10:00
Stef Nestor
0dab4b0571
(Doc+) Removing "current_node" from Allocation Explain API under Fix Watermark Errors (#111946)
👋 howdy, team!

This just simplifies the Allocation Explain API request so that it no longer needs to include the `current_node`, which may not be known when following the [Fix Watermark Errors](https://www.elastic.co/guide/en/elasticsearch/reference/current/fix-watermark-errors.html) guide.

TIA!
Stef
2024-08-20 08:22:22 -06:00
Iván Cea Fontenla
65ce50c60a
ESQL: Added mv_percentile function (#111749)
- Added the `mv_percentile(values, percentile)` function
- Used as a surrogate in the `percentile(column, percentile)` aggregation
- Updated docs to specify that the surrogate _should_ be implemented if possible

As `mv_median` does, this yields exact results (ignoring floating-point error in double operations).
To achieve that, some decisions were made, especially in the long evaluator (see the comments in context in `MvPercentile.java`).

Closes https://github.com/elastic/elasticsearch/issues/111591
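
For readers unfamiliar with exact percentiles over a small set of values, here is a minimal sketch of one common definition: sort the values and interpolate linearly between the two closest ranks. Whether `MvPercentile.java` uses exactly this rule is an assumption, and the Python function below is illustrative only, not the ES|QL implementation.

```python
import math


def mv_percentile(values, percentile):
    """Exact percentile of a list via linear interpolation (assumed rule)."""
    if not values:
        return None
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    ordered = sorted(values)
    # Fractional position of the requested percentile in the sorted list.
    position = (percentile / 100) * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    if lower == upper:
        return ordered[lower]
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


print(mv_percentile([5, 1, 3], 50))
print(mv_percentile([1, 2, 3, 4], 50))
```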
2024-08-20 15:29:19 +02:00
Iván Cea Fontenla
e3f378ebd2
ESQL: Strings support for MAX and MIN aggregations (#111544)
Support Version, Keyword and Text in Max and Min aggregations.

The current implementation of both max and min does:

For non-grouping:
- Store a BytesRef
- When there's a max/min, copy it to the internal array. Grow it if needed

For grouping:
- Keep an array of BytesRef (null by default: there's no "initial/default value" here, as there's no "MAX" value for a string)
- Each BytesRef stores its own array, which will be grown as needed to copy the new max/min

Some notes:
- It's not shrinking the arrays, to avoid having to copy and potentially grow them again
- It's using raw arrays. But maybe it should use BigArrays so that the memory is accounted for in the circuit breaker?

Part of https://github.com/elastic/elasticsearch/issues/110346
2024-08-20 15:24:55 +02:00
Bogdan Pintea
dd49c33479
ESQL: BUCKET: allow numerical spans as whole numbers (#111874)
This relaxes the check on numerical spans to allow them to be specified as whole numbers. Until now they had to be provided as doubles.

This also expands the tests for date ranges to include string types.

Resolves #109340, resolves #104646, resolves #105375.
2024-08-20 13:40:59 +02:00
Mary Gouseti
ad90d1f0f6
Introduce global retention in data stream lifecycle (cluster settings) (#111972)
In this PR we introduce cluster settings to manage the global data stream retention.

We introduce two settings `data_streams.lifecycle.retention.max` & `data_streams.lifecycle.retention.default` that configure the respective retentions. The settings are loaded and monitored by the `DataStreamGlobalRetentionSettings`. The validation has also moved there.

We preserved the `DataStreamGlobalRetention` record to reduce the impact of this change. The purpose of this record is simply to be a wrapper that groups the retention settings together.

Temporarily, the `DataStreamGlobalRetentionSettings` is using the `DataStreamFactoryRetention`, which is marked as deprecated for migration purposes.
2024-08-20 09:54:55 +03:00
David Turner
fa58a9d08d
Add known issue docs for #111854 (#111978) 2024-08-20 07:25:55 +01:00
Nik Everett
dc24003540
ESQL: Profile more timing information (#111855)
This profiles additional timing information for each individual driver.
To the results from `profile` it adds the start and stop time for each
driver. That was already in the task status. To the profile and task
status it also adds the number of times the driver slept and some more
detailed history about a few of those times.

Explanation time! The compute engine splits work into some number of
`Drivers` per node. Each `Driver` is a single threaded entity - it runs
on a thread for a while then does one of three things:

1. Finishes
2. Goes async because one of its `Operator`s has gone async
3. Yields the thread pool because it has run for too long

This PR measures the second two. At this point only three operators can
go async:

* ENRICH
* Reading from an empty exchange
* Writing to a full exchange

We're quite interested in these sleeps at the moment because we think
they may be slowing things down. Here's what it looks like when a driver
goes async because it wants to read from an empty exchange:

```
... the rest of the profile ...
        "sleeps" : {
          "counts" : {
            "exchange empty" : 2
          },
          "first" : [
            {
              "reason" : "exchange empty",
              "sleep" : "2024-08-13T19:45:57.943Z",
              "sleep_millis" : 1723578357943,
              "wake" : "2024-08-13T19:45:58.159Z",
              "wake_millis" : 1723578358159
            },
            {
              "reason" : "exchange empty",
              "sleep" : "2024-08-13T19:45:58.164Z",
              "sleep_millis" : 1723578358164,
              "wake" : "2024-08-13T19:45:58.165Z",
              "wake_millis" : 1723578358165
            }
          ],
          "last": [same as above]
```

Every time the driver goes async we count it in the `counts` map -
grouped by the reason the driver slept. We also record the sleep and
wake times for the first and last ten times the driver sleeps. In this
case it only slept twice, so the `first` and `last` ten times are the
same array.

This should give us a good sense about why drivers sleep while using a
limited amount of memory per driver.
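
The bookkeeping described above (a count per reason plus the first and last ten sleeps) can be sketched with a counter and a bounded deque. This is an illustrative model only: the class and method names are hypothetical, and the real implementation is the Java compute engine, not this Python code.

```python
from collections import Counter, deque


class SleepLog:
    """Hypothetical model of per-driver sleep tracking with bounded memory."""

    def __init__(self, keep=10):
        self.keep = keep
        self.counts = Counter()          # sleeps grouped by reason
        self.first = []                  # first `keep` sleeps, never replaced
        self.last = deque(maxlen=keep)   # most recent `keep` sleeps

    def record(self, reason, sleep_millis, wake_millis):
        entry = {
            "reason": reason,
            "sleep_millis": sleep_millis,
            "wake_millis": wake_millis,
        }
        self.counts[reason] += 1
        if len(self.first) < self.keep:
            self.first.append(entry)
        self.last.append(entry)  # the deque drops the oldest entry when full


log = SleepLog()
log.record("exchange empty", 1723578357943, 1723578358159)
log.record("exchange empty", 1723578358164, 1723578358165)
print(dict(log.counts), log.first == list(log.last))
```

Memory stays bounded because at most `2 * keep` entries are retained per driver, however often it sleeps.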
2024-08-20 07:29:01 +10:00
David Turner
e6b830e3b3
Clean up dangling S3 multipart uploads (#111955)
If Elasticsearch fails part-way through a multipart upload to S3 it will
generally try and abort the upload, but it's possible that the abort
attempt also fails. In this case the upload becomes _dangling_. Dangling
uploads consume storage space, and therefore cost money, until they are
eventually aborted.

Earlier versions of Elasticsearch require users to check for dangling
multipart uploads, and to manually abort any that they find. This commit
introduces a cleanup process which aborts all dangling uploads on each
snapshot delete instead.

Closes #44971 Closes #101169
2024-08-20 02:49:48 +10:00
David Turner
69f454370a
Fix known issue docs for #111866 (#111956)
The `known-issue-8.15.0` anchor appears twice which breaks the docs
build. Also the existing message suggests incorrectly that
`bootstrap.memory_lock: true` is recommended.
2024-08-19 17:26:16 +10:00
David Turner
1222496cd0
Improve reaction to blob store corruptions (#111954)
Today there are a couple of assertions that can trip if the contents of
a snapshot repository are corrupted. It makes sense to assert the
integrity of snapshots in most tests, but we must also (a) protect
against these corruptions in production and (b) allow some tests to
verify the behaviour of the system when the repository is corrupted.

This commit introduces a flag to disable certain assertions, converts
the relevant assertions into production failures too, and introduces a
high-level test to verify that we do detect all relevant corruptions
without tripping any other assertions.

Extracted from #93735 as this change makes sense in its own right.
Relates #52622.
2024-08-19 06:35:52 +01:00
David Turner
a406333f87
Revert "Add 8.15.0 known issue for memory locking in Windows (#111949)"
This reverts commit 1e40fe45d6.
2024-08-19 06:20:28 +01:00
Pius
1e40fe45d6
Add 8.15.0 known issue for memory locking in Windows (#111949) 2024-08-17 05:53:36 -07:00
kosabogi
e7518fbe93
Adds a warning about manually mounting snapshots managed by ILM (#111883)
* Adds a warning about manually mounting snapshots managed by ILM

* Shortens text and moves the warning to Searchable snapshots chapter
2024-08-15 18:43:06 +02:00
István Zoltán Szabó
1ba72e4602
[DOCS] Documents output_field behavior after multiple inference runs (#111875)
Co-authored-by: David Kyle <david.kyle@elastic.co>
2024-08-15 12:36:59 +02:00
Luca Belluccini
595628f9ce
[DOCS] The logs index.mode has been renamed logsdb (#111871) 2024-08-14 08:30:31 -07:00