https://www.elastic.co/start-local is live and will be our go-to local
dev setup.
This PR:
- Updates both the Elasticsearch root readme and `run-elasticsearch-locally.asciidoc`
🧹 Also tries to keep things as concise as possible by not mirroring _everything_
in the readme
This adds some more counts for dense_vector field mapping stats. This
allows for seeing the number of mappings with a given element type,
similarity, or index type.
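To make the shape of these counts concrete, here is a minimal sketch in Python. It is not the Elasticsearch implementation: the flat list of field-mapping dicts and the `dense_vector_stats` helper are hypothetical, and missing parameters are bucketed as `"unspecified"` rather than resolved to their real defaults.

```python
from collections import Counter


def dense_vector_stats(field_mappings):
    """Count dense_vector fields by element type, similarity, and index type.

    `field_mappings` is a hypothetical flat list of field-mapping dicts;
    the real stats are computed server-side across all index mappings.
    """
    element_types = Counter()
    similarities = Counter()
    index_types = Counter()
    for mapping in field_mappings:
        if mapping.get("type") != "dense_vector":
            continue  # only dense_vector fields contribute to these counts
        element_types[mapping.get("element_type", "unspecified")] += 1
        similarities[mapping.get("similarity", "unspecified")] += 1
        index_options = mapping.get("index_options", {})
        index_types[index_options.get("type", "unspecified")] += 1
    return {
        "element_type": dict(element_types),
        "similarity": dict(similarities),
        "index_type": dict(index_types),
    }


if __name__ == "__main__":
    fields = [
        {"type": "dense_vector", "element_type": "float",
         "similarity": "cosine", "index_options": {"type": "hnsw"}},
        {"type": "dense_vector", "element_type": "byte", "similarity": "cosine"},
        {"type": "keyword"},
    ]
    print(dense_vector_stats(fields))
```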
* (Doc+) Hotspotting link video
👋 howdy, team! Ongoing improvements for common support topics, this [links our example hotspotting video](https://www.youtube.com/watch?v=Q5ODJ5nIKAM&list=PL_mJOmq4zsHbQlfEMEh_30_LuV_hZp-3d&index=5).
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
- Forbid ephemeral `_auto_gen.html` and `page.html#_auto_gen` links.
- Remove dangling/unused `BOOTSTRAP_CHECK_G1GC` link.
- Separate test suite into individual tests.
Enhance ES|QL responses to include information about `took` time (search latency), shards, and
clusters against which the query was executed.
The goal of this PR is to begin to provide parity between the metadata displayed for
cross-cluster searches in _search and ES|QL.
This PR adds the following features:
- add overall `took` time to all ES|QL query responses. And to emphasize: "all" here
means: async search, sync search, local-only and cross-cluster searches, so it goes
beyond just CCS.
- add `_clusters` metadata to the final response for cross-cluster searches, for both
async and sync search (see example below)
- tracking/reporting counts of skipped shards from the can_match (SearchShards API)
phase of ES|QL processing
- marking clusters as skipped if they cannot be connected to (during the field-caps
phase of processing)
Out of scope for this PR:
- honoring the `skip_unavailable` cluster setting
- showing `_clusters` metadata in the async response **while** the search is still running
- showing any shard failure messages (since any shard search failures in ES|QL are
automatically fatal and `_clusters/details` is not shown in 4xx/5xx error responses). Note that
this also means that the `failed` shard count is always 0 in the ES|QL `_clusters` section.
Things changed with respect to behavior in `_search`:
- the `timed_out` field in `_clusters/details/mycluster` was removed in the ESQL
response, since ESQL does not support timeouts. It could be added back later
if/when ESQL supports timeouts.
- the `failures` array in `_clusters/details/mycluster/_shards` was removed in the ESQL
response, since any shard failure causes the whole query to fail.
Example output from ES|QL CCS:
```es
POST /_query
{
"query": "from blogs,remote2:bl*,remote1:blogs|\nkeep authors.first_name,publish_date|\n limit 5"
}
```
```json
{
"took": 49,
"columns": [
{
"name": "authors.first_name",
"type": "text"
},
{
"name": "publish_date",
"type": "date"
}
],
"values": [
[
"Tammy",
"2009-11-04T04:08:07.000Z"
],
[
"Theresa",
"2019-05-10T21:22:32.000Z"
],
[
"Jason",
"2021-11-23T00:57:30.000Z"
],
[
"Craig",
"2019-12-14T21:24:29.000Z"
],
[
"Alexandra",
"2013-02-15T18:13:24.000Z"
]
],
"_clusters": {
"total": 3,
"successful": 2,
"running": 0,
"skipped": 1,
"partial": 0,
"failed": 0,
"details": {
"(local)": {
"status": "successful",
"indices": "blogs",
"took": 43,
"_shards": {
"total": 13,
"successful": 13,
"skipped": 0,
"failed": 0
}
},
"remote2": {
"status": "skipped", // remote2 was offline when this query was run
"indices": "remote2:bl*",
"took": 0,
"_shards": {
"total": 0,
"successful": 0,
"skipped": 0,
"failed": 0
}
},
"remote1": {
"status": "successful",
"indices": "remote1:blogs",
"took": 47,
"_shards": {
"total": 13,
"successful": 13,
"skipped": 0,
"failed": 0
}
}
}
}
}
```
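A client could consume the `_clusters` section along these lines. This is a sketch only: the `summarize_clusters` and `skipped_clusters` helpers are hypothetical, and the code assumes `_clusters` is simply absent from local-only responses, so it returns an empty result in that case.

```python
def summarize_clusters(response):
    """Return (alias, status, took, successful_shards, total_shards) per cluster.

    `response` is the parsed JSON body of an ES|QL query response.
    Assumption: `_clusters` is absent for local-only queries.
    """
    clusters = response.get("_clusters")
    if clusters is None:
        return []
    rows = []
    for alias, info in clusters.get("details", {}).items():
        shards = info.get("_shards", {})
        rows.append((
            alias,
            info.get("status"),
            info.get("took"),
            shards.get("successful", 0),
            shards.get("total", 0),
        ))
    return rows


def skipped_clusters(response):
    """Aliases of clusters that were marked as skipped."""
    return [row[0] for row in summarize_clusters(response) if row[1] == "skipped"]


if __name__ == "__main__":
    # Trimmed-down version of the example response above.
    example = {
        "took": 49,
        "_clusters": {
            "total": 3, "successful": 2, "skipped": 1,
            "details": {
                "(local)": {"status": "successful", "took": 43,
                            "_shards": {"total": 13, "successful": 13}},
                "remote2": {"status": "skipped", "took": 0,
                            "_shards": {"total": 0, "successful": 0}},
                "remote1": {"status": "successful", "took": 47,
                            "_shards": {"total": 13, "successful": 13}},
            },
        },
    }
    for row in summarize_clusters(example):
        print(row)
    print("skipped:", skipped_clusters(example))
```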
Fixes https://github.com/elastic/elasticsearch/issues/112402 and https://github.com/elastic/elasticsearch/issues/110935
Lucene 10 has upgraded its Snowball stemming support. As part of those
upgrades, two stemmers that are no longer supported were removed: `KpStemmer` and
`LovinsStemmer`. These correspond to `dutch_kp` and `lovins`, respectively.
We will deprecate them in 8.16 and remove support for them in a future
version.
* Add new Previous Step Info field to LifecycleExecutionState
* Add new field to IndexLifecycleExplainResponse
* Add new field to TransportExplainLifecycleAction
* Add logic to IndexLifecycleTransition to keep previous step info
* Switch tests to use Java standard Clock class
for any time based testing, this is the recommended method
* Fix tests for new field
Also refactor tests to newer style
* Add test to ensure step info is preserved
Across auto retries
* Add docs for new field
* Changelog Entry
* Update docs/changelog/113187.yaml
* Revert "Switch tests to use Java standard Clock class"
This reverts commit 241074c735.
* PR Changes
* PR Changes - Improve docs wording
Co-authored-by: Mary Gouseti <mgouseti@gmail.com>
* Integration test for new ILM explain field
* Use ROOT locale instead of default toLowerCase
* PR Changes - Switch to block strings
* Remove forbidden API usage
---------
Co-authored-by: Mary Gouseti <mgouseti@gmail.com>
Adds a new option, `trace_redact`, to the redact processor to indicate that a document has been redacted in the ingest pipeline. If a document is processed by a redact processor AND any field is redacted, the ingest metadata `_ingest._redact._is_redacted = true` will be set.
Closes #94633
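The flagging rule can be illustrated with a toy model. This is not the processor's implementation: the real redact processor uses Grok patterns, whereas this sketch uses a plain regular expression, a fixed `<REDACTED>` placeholder, and a hypothetical `redact` helper. Only the condition (option enabled AND something was actually redacted) mirrors the description above.

```python
import re


def redact(doc, field, pattern, trace_redact=False):
    """Toy model: redact `pattern` in `doc[field]` and optionally trace it.

    Returns (new_doc, ingest_metadata). The metadata flag is only set when
    trace_redact is enabled and at least one match was replaced.
    """
    ingest_meta = {}
    value = doc.get(field)
    if isinstance(value, str):
        redacted, count = re.subn(pattern, "<REDACTED>", value)
        doc = {**doc, field: redacted}  # leave the caller's dict untouched
        if trace_redact and count > 0:
            ingest_meta = {"_redact": {"_is_redacted": True}}
    return doc, ingest_meta


if __name__ == "__main__":
    ip_pattern = r"\b\d{1,3}(?:\.\d{1,3}){3}\b"
    print(redact({"message": "login from 10.0.0.1"}, "message", ip_pattern, trace_redact=True))
    print(redact({"message": "nothing sensitive"}, "message", ip_pattern, trace_redact=True))
```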
* Note in docs about incorrect IO stats when running in docker
* Update docs/reference/cluster/nodes-stats.asciidoc
Co-authored-by: David Turner <david.turner@elastic.co>
* Requested PR changes to wording
* Update docs/reference/cluster/nodes-stats.asciidoc
Co-authored-by: David Turner <david.turner@elastic.co>
---------
Co-authored-by: David Turner <david.turner@elastic.co>
Resolves #111842
This adds a conversion function that yields DATE_NANOS. Mostly this is straightforward.
It is worth noting that when converting a millisecond date into a nanosecond date, the conversion function truncates it to 0 nanoseconds (i.e. first nanosecond of that millisecond). This is, of course, a bit of an assumption, but I don't have a better assumption we can make. I'd thought about adding a second, optional, parameter to control this behavior, but it's important that TO_DATE_NANOS extend AbstractConvertFunction, which itself extends UnaryScalarFunction, so that it will work correctly with union types. Also, it's unlikely the user will have any better guess than we do for filling in the nanoseconds.
Making that assumption does, however, create some weirdness. Consider two comparisons:
`TO_DATETIME("2023-03-23T12:15:03.360103847") == TO_DATETIME("2023-03-23T12:15:03.360")` will return true, while `TO_DATE_NANOS("2023-03-23T12:15:03.360103847") == TO_DATE_NANOS("2023-03-23T12:15:03.360")` will return false. This is akin to casting between longs and doubles, where things may compare equal in one type that are not equal in the other. This seems fine, and I can't think of a better way to do it, but it's worth being aware of.
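The arithmetic behind that asymmetry can be sketched in Python using epoch integers. The helper names are hypothetical, the parser only handles the UTC `yyyy-MM-ddTHH:mm:ss[.fraction]` form used in the example, and none of this is the ES|QL implementation.

```python
from datetime import datetime, timezone

NANOS_PER_MILLI = 1_000_000
NANOS_PER_SECOND = 1_000_000_000


def parse_to_epoch_nanos(text):
    """Parse 'YYYY-MM-DDTHH:MM:SS[.fraction]' (UTC) into epoch nanoseconds."""
    base, _, fraction = text.partition(".")
    moment = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    seconds = int(moment.timestamp())
    # Pad or cut the fraction to exactly 9 digits (nanosecond resolution).
    nanos = int(fraction.ljust(9, "0")[:9]) if fraction else 0
    return seconds * NANOS_PER_SECOND + nanos


def nanos_to_millis(epoch_nanos):
    """Millisecond resolution drops the sub-millisecond part."""
    return epoch_nanos // NANOS_PER_MILLI


def millis_to_nanos(epoch_millis):
    """Widening fills the nanoseconds with zeros (first nanosecond of the millisecond)."""
    return epoch_millis * NANOS_PER_MILLI


if __name__ == "__main__":
    a = parse_to_epoch_nanos("2023-03-23T12:15:03.360103847")
    b = parse_to_epoch_nanos("2023-03-23T12:15:03.360")
    print(nanos_to_millis(a) == nanos_to_millis(b))  # compared as millisecond dates
    print(a == b)                                    # compared as nanosecond dates
```

At millisecond resolution both values collapse to the same instant, while at nanosecond resolution they differ by the 103847 ns that the shorter literal never specified.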
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Deprecate the `to`, `from`, `include_lower`, and `include_upper` range query params.
These params were removed from our documentation in v0.90.4 (d6ecdecc19),
but did not go through a deprecation cycle.
These params will be removed in v9.0.
Related to #81276
Closes #48538
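For anyone migrating queries, the translation to the documented `gt`/`gte`/`lt`/`lte` params might look like the sketch below. The `modernize_range` helper is hypothetical, and it assumes both `include_lower` and `include_upper` default to `true` when omitted; verify that against your version before relying on it.

```python
def modernize_range(params):
    """Rewrite legacy range params (from/to/include_*) as gt/gte/lt/lte.

    `params` is the per-field body of a range query. Assumption: both
    include flags default to True when they are not given.
    """
    params = dict(params)  # work on a copy
    include_lower = params.pop("include_lower", True)
    include_upper = params.pop("include_upper", True)
    if "from" in params:
        lower = params.pop("from")
        if lower is not None:  # a null bound means "unbounded"
            params["gte" if include_lower else "gt"] = lower
    if "to" in params:
        upper = params.pop("to")
        if upper is not None:
            params["lte" if include_upper else "lt"] = upper
    return params


if __name__ == "__main__":
    legacy = {"from": 10, "to": 20, "include_upper": False}
    print(modernize_range(legacy))
```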
Allows setting the index `total_shards_per_node` in the SearchableSnapshot action of ILM to remediate hot spots in shard allocation for searchable snapshot indices.
Closes #112261
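As a rough illustration, a policy body using the new option could be built like this. The `searchable_snapshot_policy` helper, the choice of the cold phase, the `30d` age, and the repository name are all placeholders for the example, not values taken from this PR.

```python
import json


def searchable_snapshot_policy(repository, total_shards_per_node, min_age="30d"):
    """Build an ILM policy body whose cold phase mounts a searchable snapshot.

    The phase, min_age, and repository name are illustrative placeholders.
    """
    return {
        "policy": {
            "phases": {
                "cold": {
                    "min_age": min_age,
                    "actions": {
                        "searchable_snapshot": {
                            "snapshot_repository": repository,
                            # Cap on shards of the mounted index per node.
                            "total_shards_per_node": total_shards_per_node,
                        }
                    },
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(searchable_snapshot_policy("my-repo", 1), indent=2))
```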
Here we introduce a new index-level setting, `ignore_above`, similar to what we have
for `ignore_malformed`. The setting will apply to all `keyword`, `wildcard` and `flattened`
fields. Each field mapping will still be allowed to override the index-level setting using a
mapping-level `ignore_above` value.
Closes https://github.com/elastic/elasticsearch/issues/110387
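The precedence described above (a mapping-level value overrides the index-level one) can be sketched as follows. Both helpers are hypothetical and only model the resolution rule, not how Elasticsearch stores or applies the setting.

```python
APPLICABLE_TYPES = {"keyword", "wildcard", "flattened"}


def effective_ignore_above(index_level, field_mapping):
    """Resolve the ignore_above limit for one field, or None if there is no limit.

    `index_level` is the index-wide value (or None if unset);
    `field_mapping` is the mapping dict of a single field.
    """
    if field_mapping.get("type") not in APPLICABLE_TYPES:
        return None  # the setting only applies to the types listed above
    # A mapping-level value, when present, wins over the index-level one.
    return field_mapping.get("ignore_above", index_level)


def is_indexed(value, limit):
    """Strings longer than the limit are ignored (not indexed)."""
    return limit is None or len(value) <= limit


if __name__ == "__main__":
    index_level = 256
    print(effective_ignore_above(index_level, {"type": "keyword"}))
    print(effective_ignore_above(index_level, {"type": "keyword", "ignore_above": 1024}))
    print(effective_ignore_above(index_level, {"type": "text"}))
```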
Having this in now affords us not having to introduce version checks in
the ES exporter later. We can simply use the same serialization logic
for metric attributes as we do for other signals. This also enables us
to properly map `*.ip` fields to the ip field type as ip fields
containing a list of IPs are not converted to a comma-separated list.
This will correct/switch "year" unit diffing from the current integer
subtraction to a chrono subtraction. Consequently, two dates are (at
least) one year apart now if (at least) a full calendar year separates
them. The previous implementation simply subtracted the year part of the
dates.
Note: this departs from ES SQL's implementation of the same function,
which itself is aligned with MS SQL's implementation, which works
equivalently to an integer subtraction.
Fixes #112482.
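The difference between the two approaches can be shown with a small Python sketch. It works on plain dates only and uses hypothetical helper names; the actual function operates on datetimes.

```python
from datetime import date


def year_diff_integer(start, end):
    """Old behavior: subtract the year parts only."""
    return end.year - start.year


def year_diff_calendar(start, end):
    """New behavior: count full calendar years separating the two dates."""
    if end < start:
        return -year_diff_calendar(end, start)
    years = end.year - start.year
    # The last year only counts once its anniversary has been reached.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


if __name__ == "__main__":
    start, end = date(2023, 12, 31), date(2024, 1, 1)
    print(year_diff_integer(start, end))   # one day apart, but the year parts differ
    print(year_diff_calendar(start, end))  # no full calendar year has elapsed
```

Dates one day apart across New Year therefore differ by 1 under integer subtraction but by 0 under the calendar-aware rule.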