To assist the user in configuring the visualizations correctly while leveraging TSDB
functionality, information about TSDB configuration should be exposed via the field
caps API per field.
Especially for metric fields, it must be clear which fields are metrics and whether they belong
only to time-series indexes or to a mix of time-series and non-time-series indexes.
It should also be possible to distinguish metric fields that belong to any of the following indices:
- Standard (non-time-series) indexes
- Time series indexes
- Downsampled time series indexes
This PR modifies the field caps API so that the mapping parameters time_series_dimension
and time_series_metric are presented only when they are set on fields of time-series indexes.
Those parameters are completely ignored when they are set on standard (non-time-series) indexes.
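The rule above can be sketched in Python as follows. This is an illustration only, not the actual Java implementation: the function name `visible_ts_params` and the `index_mode` argument are hypothetical.

```python
# Mapping parameters that are only meaningful on time-series indexes.
TS_PARAMS = ("time_series_dimension", "time_series_metric")


def visible_ts_params(index_mode, field_mapping):
    """Return the time-series parameters that would be reported for a field.

    Hypothetical helper: `index_mode` is assumed to be "time_series" or
    "standard", and `field_mapping` is a plain dict of mapping parameters.
    """
    if index_mode != "time_series":
        # Parameters set on standard (non-time-series) indexes are ignored.
        return {}
    return {key: field_mapping[key] for key in TS_PARAMS if key in field_mapping}
```

For example, a field mapped with `time_series_metric: gauge` reports that parameter only when the index mode is `time_series`.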
This PR revisits some of the conventions adopted by #78790
Also add support for new CATALINA/TOMCAT timestamp formats used by ECS Grok patterns
Relates #77065
Co-authored-by: David Roberts <dave.roberts@elastic.co>
This change deprecates the kNN search API in favor of the new 'knn' option
inside the search API. The 'knn' option is now the preferred way of performing
kNN search.
Relates to #87625
Introduced in: #88439
* [ML] add text_similarity nlp task documentation
* Apply suggestions from code review
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Update docs/reference/ml/trained-models/apis/infer-trained-model.asciidoc
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Apply suggestions from code review
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Update docs/reference/ml/ml-shared.asciidoc
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
Clean up network setting docs
- Add types for all params
- Remove mention of JDKs before 11
- Clarify some wording
Co-authored-by: Stef Nestor <steffanie.nestor@gmail.com>
This commit fixes the situation where a user wants to use CCR to replicate indices that are part of
a data stream while renaming the data stream. For example, assume a user has an auto-follow request
that looks like this:
```
PUT /_ccr/auto_follow/my-auto-follow-pattern
{
"remote_cluster" : "other-cluster",
"leader_index_patterns" : ["logs-*"],
"follow_index_pattern" : "{{leader_index}}_copy"
}
```
And then the data stream `logs-mysql-error` was created, creating the backing index
`.ds-logs-mysql-error-2022-07-29-000001`.
Prior to this commit, replicating this data stream meant that the backing index would be renamed to
`.ds-logs-mysql-error-2022-07-29-000001_copy` and the data stream would *not* be renamed. This
caused a check to trip in `TransportPutLifecycleAction` asserting that a backing index was not
renamed for a data stream during following.
After this commit, there are a couple of changes:
First, the data stream will also be renamed. This means that `logs-mysql-error` becomes
`logs-mysql-error_copy` when created on the follower cluster. Because of the way that CCR works,
this means we need to support renaming a data stream for a regular "create follower" request, so a
new parameter has been added: `data_stream_name`. It works like this:
```
PUT /mynewindex/_ccr/follow
{
"remote_cluster": "other-cluster",
"leader_index": "myotherindex",
"data_stream_name": "new_ds"
}
```
Second, the backing index for a data stream must be renamed in a way that does not break the parsing
of a data stream backing pattern. Previously, the index
`.ds-logs-mysql-error-2022-07-29-000001` would be renamed to
`.ds-logs-mysql-error-2022-07-29-000001_copy` (an illegal name since it doesn't end with the
rollover digit). After this commit it will be renamed to
`.ds-logs-mysql-error_copy-2022-07-29-000001` to match the renamed data stream. This means that for
the given `follow_index_pattern` of `{{leader_index}}_copy` the index changes look like:
| Leader Cluster | Follower Cluster |
|--------------|-----------|
| `logs-mysql-error` (data stream) | `logs-mysql-error_copy` (data stream) |
| `.ds-logs-mysql-error-2022-07-29-000001` | `.ds-logs-mysql-error_copy-2022-07-29-000001` |
Internally, this means the auto-follow request is turned into the following create follower request:
```
PUT /.ds-logs-mysql-error_copy-2022-07-29-000001/_ccr/follow
{
"remote_cluster": "other-cluster",
"leader_index": ".ds-logs-mysql-error-2022-07-29-000001",
"data_stream_name": "logs-mysql-error_copy"
}
```
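The renaming rule described above can be sketched in Python. This is a simplified illustration, not the code in this commit: the regular expression is an assumption about the backing index layout (`.ds-<data stream>-<date>-<six-digit generation>`), and `follower_names` is a hypothetical helper.

```python
import re

# Assumed layout of a backing index name; the date separator may be "-" or ".".
BACKING_INDEX = re.compile(
    r"^\.ds-(?P<stream>.+)-(?P<date>\d{4}[-.]\d{2}[-.]\d{2})-(?P<gen>\d{6})$"
)


def apply_pattern(pattern, name):
    """Substitute the leader name into a follow_index_pattern."""
    return pattern.replace("{{leader_index}}", name)


def follower_names(leader_index, follow_index_pattern):
    """Return (follower index name, follower data stream name or None)."""
    match = BACKING_INDEX.match(leader_index)
    if match is None:
        # Regular index: the pattern applies to the whole index name.
        return apply_pattern(follow_index_pattern, leader_index), None
    # Backing index: rename the data stream part and keep the date and
    # generation suffix, so the name still ends with the rollover digits.
    stream = apply_pattern(follow_index_pattern, match.group("stream"))
    index = f".ds-{stream}-{match.group('date')}-{match.group('gen')}"
    return index, stream
```

Calling `follower_names(".ds-logs-mysql-error-2022-07-29-000001", "{{leader_index}}_copy")` yields the follower index and data stream names shown in the table above.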
Relates to https://github.com/elastic/elasticsearch/pull/84940 (cherry-picked the commit for a test)
Relates to https://github.com/elastic/elasticsearch/pull/61993 (where data stream support was first introduced for CCR)
Resolves https://github.com/elastic/elasticsearch/issues/81751
DiscoveryPlugin allows extending getJoinValidator and
getElectionStrategies. These are implementation details of the system.
This commit deprecates these methods so that plugin authors are
discouraged from overriding them.
Network plugins provide network implementations. In the past this has
been used for alternatives to netty based networking, using the JDK's
nio. However, nio has now been removed, and it is inadvisable for a
plugin to implement this low level part of the system.
Therefore, this commit marks the NetworkPlugin interface as deprecated.
Adds some docs giving more detailed background about what data
corruption really means and some suggestions about how to narrow down
the root cause.
Co-authored-by: Henning Andersen <33268011+henningandersen@users.noreply.github.com>
The inference node stats for deployed PyTorch inference
models now contain two new fields: `inference_cache_hit_count`
and `inference_cache_hit_count_last_minute`.
These indicate how many inferences on that node were served
from the C++-side response cache that was added in
https://github.com/elastic/ml-cpp/pull/2305. Cache hits
occur when exactly the same inference request is sent to the
same node more than once.
The `average_inference_time_ms` and
`average_inference_time_ms_last_minute` fields now refer to
the time taken to do the cache lookup, plus, if necessary,
the time to do the inference. We would expect average inference
time to be vastly reduced in situations where the cache hit
rate is high.
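As a rough illustration of how the new counter could be read, the sketch below derives a cache hit rate from one node's stats. The `inference_cache_hit_count` field comes from this change; the `inference_count` field name is an assumption about the surrounding stats object, and `cache_hit_rate` is a hypothetical helper.

```python
def cache_hit_rate(node_stats):
    """Fraction of inferences on a node that were served from the cache.

    Assumes `node_stats` is a dict holding `inference_count` (assumed field
    name) and `inference_cache_hit_count` (added by this change).
    """
    total = node_stats.get("inference_count", 0)
    hits = node_stats.get("inference_cache_hit_count", 0)
    # Avoid dividing by zero on a node that has not served any inference yet.
    return hits / total if total else 0.0
```

A high ratio would go along with a lower `average_inference_time_ms`, since cache lookups are counted in that average.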
This change adds support for kNN vector fields to the `_disk_usage` API. The
strategy:
* Iterate the vector values (using the same strategy as for doc values) to
estimate the vector data size
* Run some random vector searches to estimate the vector index size
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Closes #84801
This change ensures that existing read_only_allow_delete blocks that
are placed on indices when the flood_stage watermark threshold is
exceeded, are removed when the disk threshold monitoring is disabled.
This is done by changing how InternalClusterInfoService behaves when
disabled. With this change, it will keep calling the registered
listeners periodically, but with an empty ClusterInfo.
Closes #86383
Add the dry_run query parameter to support simulating updates of desired nodes. The update request will be validated, but no cluster state updates will be performed. To indicate that the response was the result of a dry run, we add the dry_run field to the JSON representation of the response.
See #82975
This PR adds a user action to the SLM health indicator which checks each SLM policy's invocations
since last success field and reports degraded health (YELLOW) in the event that any policy is at or
above the failure threshold (default is 5 failures in a row).
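The check described above can be sketched as follows. This is a simplified Python model, not the indicator's implementation: the input is assumed to be a mapping from policy name to its invocations-since-last-success count, `slm_health` is a hypothetical helper, and the `GREEN` status for the healthy case is an assumption.

```python
DEFAULT_FAILURE_THRESHOLD = 5  # default stated in the description above


def slm_health(invocations_by_policy, threshold=DEFAULT_FAILURE_THRESHOLD):
    """Return (status, names of policies at or above the failure threshold)."""
    failing = sorted(
        name
        for name, count in invocations_by_policy.items()
        if count >= threshold
    )
    # Any single policy at or above the threshold degrades the indicator.
    status = "YELLOW" if failing else "GREEN"
    return status, failing
```

The list of failing policies is returned alongside the status so that a user action could name the affected policies.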
This commit removes the notion of components from the health API. Components are no longer
a top-level field in the response, and indicators is promoted into its place.
Our current default for the http.max_header_size setting is 8kb. This
is lower than the current default for Kibana (16kb in 8.x), and the ESS
proxy (1mb based on the Go http library default). To align with the
current convention of other Elastic components, this PR increases the
ES header size setting default to 16kb.
Closes #88501
Remove help_url, rename summary -> symptom, user_actions -> diagnosis
Separate the diagnosis `message` field into `cause` and `action`
Co-authored-by: Mary Gouseti <mgouseti@gmail.com>
* Convert disk watermarks to RelativeByteSizeValues
Similar to the existing watermark setting for the frozen tier.
Pre-requisite for PR 88639 that plans to introduce max headroom
settings for the disk watermarks, similar to the frozen tier max
headroom setting.
* Add changelog
* Revert 20gb to 20GB
* Make formatNoTrailingZerosPercent non static
* ByteSizeValue.MINUS_ONE
* Remove getMinimumTotalSizeForBelowWatermark
* Remove comment
* Fix minor stuff
* Make parsing of RelativeByteSizeValue faster
Mimics older definitelyNotPercentage function
* Remove Locale from Strings.format
* More MINUS_ONE
This PR adds a new `knn` option to the `_search` API to support ANN search.
It's powered by the same Lucene ANN capabilities as the old `_knn_search`
endpoint. The `knn` option can be combined with other search features like
queries and aggregations.
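A request body using the `knn` option might be assembled as in the Python sketch below. The parameter names (`field`, `query_vector`, `k`, `num_candidates`) are recalled from the Elasticsearch reference rather than stated in this message, so treat them as assumptions to verify; `knn_search_body` is a hypothetical helper.

```python
import json


def knn_search_body(field, query_vector, k, num_candidates, query=None):
    """Build a `_search` request body with a `knn` section.

    The key names inside `knn` are assumptions; check them against the
    search API documentation for the version in use.
    """
    body = {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        }
    }
    if query is not None:
        # The knn option can be combined with a regular query.
        body["query"] = query
    return body


if __name__ == "__main__":
    example = knn_search_body(
        "image_vector", [0.1, 0.2, 0.3], k=10, num_candidates=100,
        query={"match": {"title": "mountain"}},
    )
    print(json.dumps(example, indent=2))
```

The field name `image_vector` and the `match` query in the example are placeholders.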
Addresses #87625
This adds support for the `cardinality` aggregation within a random_sampler.
This use case is helpful in determining the ratio of unique values compared to the count of total documents within the sampled set.
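The ratio mentioned above could be computed along these lines. The sketch builds a request body and reads the ratio from the sampled bucket; the aggregation names `sample` and `unique_values` are placeholders, the `probability` parameter is recalled from the `random_sampler` documentation, and the response shape is an assumption.

```python
def unique_ratio_request(field, probability):
    """Build a search body with a cardinality agg inside a random_sampler."""
    return {
        "size": 0,
        "aggs": {
            "sample": {
                "random_sampler": {"probability": probability},
                "aggs": {"unique_values": {"cardinality": {"field": field}}},
            }
        },
    }


def unique_ratio(sample_bucket):
    """Ratio of unique values to sampled documents.

    Assumes the bucket looks like
    {"doc_count": N, "unique_values": {"value": M}}.
    """
    docs = sample_bucket["doc_count"]
    return sample_bucket["unique_values"]["value"] / docs if docs else 0.0
```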
* [DOCS] Add minimal security steps back to docs
* Update instructions to use reset password tool
* Update setting built-in user passwords with the es reset passwords tool
* Revert "Update setting built-in user passwords with the es reset passwords tool"
This reverts commit 51b72fdfdf.
* Address review feedback and make clearer distinctions between security configurations