* Fixes CORS headers needed by Elastic clients
Updates the default value for the `http.cors.allow-headers`
setting to include headers used by Elastic client libraries.
Also adds the `access-control-expose-headers` header to responses to
CORS requests so that clients can successfully perform their product
check.
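To illustrate why the expose header matters, here is a minimal Python sketch of the browser-side rule: a cross-origin script can only read response headers that are CORS-safelisted or named in `access-control-expose-headers`. It assumes the product check reads the `X-Elastic-Product` response header; the function name and sample responses are hypothetical, and this is a simplified model rather than Elasticsearch or client code.

```python
# CORS-safelisted response headers that browsers always expose to scripts.
SAFELISTED = {"cache-control", "content-language", "content-length",
              "content-type", "expires", "last-modified", "pragma"}


def readable_headers(response_headers):
    """Return the headers a cross-origin script may read (simplified model)."""
    lowered = {k.lower(): v for k, v in response_headers.items()}
    exposed = {h.strip().lower()
               for h in lowered.get("access-control-expose-headers", "").split(",")
               if h.strip()}
    return {k: v for k, v in lowered.items() if k in SAFELISTED or k in exposed}


# Hypothetical responses: one without and one with the expose header.
without_expose = {"Content-Type": "application/json",
                  "X-Elastic-Product": "Elasticsearch"}
with_expose = {**without_expose,
               "Access-Control-Expose-Headers": "X-elastic-product"}
```

Without the expose header the product header is hidden from the script, so a browser-based client cannot complete its check even though the server sent the header.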
In #92309 we aligned the sizes of the `search` and `get` thread
pools, but the docs still contain the prior `get` thread pool size. With
this commit we also align the docs.
Relates #92309
The Redact processor uses the Grok rules engine to
redact text in the input document that matches the
Grok pattern. For example, email or IP addresses can
be redacted using the definitions from the standard
Grok pattern bank. New patterns can be defined in
the processor configuration.
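The redaction idea can be sketched in Python as follows. This is only an analogy: plain regular expressions stand in for Grok patterns, the two patterns are simplified assumptions rather than the definitions from the standard Grok pattern bank, and the `<NAME>` replacement format is chosen for illustration.

```python
import re

# Stand-ins for Grok pattern definitions; these regexes are simplified
# assumptions, not the patterns from the standard Grok pattern bank.
PATTERNS = {
    "EMAIL": r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
    "IP": r"\b(?:\d{1,3}\.){3}\d{1,3}\b",
}


def redact(text, patterns=PATTERNS):
    """Replace every match of each named pattern with <NAME>."""
    for name, regex in patterns.items():
        text = re.sub(regex, f"<{name}>", text)
    return text
```

Passing a different `patterns` mapping plays the role of defining new patterns in the processor configuration.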
* Document datehistogram with long offsets
When offsets are longer than calendar_intervals that are non-standard,
like months which differ in length, then the usual rule of all buckets
starting at the same day and time will no longer apply.
This update attempts to explain this with examples.
* Removed TEST-skip lines
These don't seem to be parsable, even though they match the syntax
described in the README.asciidoc
* Added // TESTRESPONSE[skip:...] lines
* Refined docs description and added more examples
* Update docs/reference/aggregations/bucket/datehistogram-aggregation.asciidoc
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
---------
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
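The long-offset behavior described above can be sketched in Python. This is a simplified model that assumes each bucket key is the start of the calendar month plus the offset; the function name is hypothetical and the year 2022 is picked only because it is not a leap year.

```python
from datetime import date, timedelta


def month_bucket_keys(year, months, offset_days):
    """Bucket keys for a monthly calendar interval shifted by a day offset."""
    return [date(year, m, 1) + timedelta(days=offset_days) for m in months]


# A short offset keeps every bucket on the same day of the month.
short_offset = month_bucket_keys(2022, [1, 2, 3, 4], 5)
# An offset as long as the shortest month pushes February's key into March.
long_offset = month_bucket_keys(2022, [1, 2, 3, 4], 28)
```

With the 5-day offset every key falls on the 6th. With the 28-day offset the keys fall on the 29th, except for February, whose key lands on March 1.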
We sometimes see a `ShardLockObtainFailedException` when a shard failed
to shut down as fast as we expected, often because a node left and
rejoined the cluster. Sometimes this is because it was held open by
ongoing scrolls or PITs, but other times it may be because the shutdown
process itself is too slow. With this commit we add the ability to
capture and log a thread dump at the time of the failure to give us more
information about where the shutdown process might be running slowly.
Relates #93226
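As a rough analogy for this diagnostic, the following Python sketch tries to take a lock with a timeout and, on failure, logs the stack of every live thread. The function names and the log message are hypothetical; this is not how Elasticsearch implements the feature.

```python
import sys
import threading
import traceback


def thread_dump():
    """Format the current stack of every live thread as one string."""
    names = {t.ident: t.name for t in threading.enumerate()}
    parts = []
    for ident, frame in sys._current_frames().items():
        stack = "".join(traceback.format_stack(frame))
        parts.append(f"Thread {names.get(ident, ident)}:\n{stack}")
    return "\n".join(parts)


def acquire_or_dump(lock, timeout_seconds, log=print):
    """Try to take the lock; on timeout, log a thread dump and return False."""
    if lock.acquire(timeout=timeout_seconds):
        return True
    log("lock not obtained in time; thread dump follows:\n" + thread_dump())
    return False
```

Capturing the dump at the moment of failure is the point: it shows what the slow holder was doing while the caller was waiting.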
This adds term query capabilities for rank_features fields. Term queries against rank_features fields are not scored in the same way as term queries against regular fields. This is because the stored feature values take advantage of the term frequency storage mechanism, and thus regular BM25 does not work.
Instead, a term query against a rank_features field is very similar to a linear rank_feature query. If more complicated combinations of features and values are required, the rank_feature query should be used.
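A minimal Python sketch of the scoring idea, assuming the score is simply the stored feature value multiplied by the query boost (a linear function). The exact scoring in Elasticsearch may differ; the function name and the sample document are hypothetical.

```python
def term_score(rank_features, term, boost=1.0):
    """Score a term query on a rank_features field from the stored value."""
    if term not in rank_features:
        return None  # the document does not match the term query
    return boost * rank_features[term]


# Hypothetical document with a rank_features field of topic weights.
doc_topics = {"politics": 20.0, "economics": 50.8}
```

Unlike BM25, nothing here depends on how often the term occurs or how rare it is across the index; only the stored value matters.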
This adds a new option to the knn search clause called query_vector_builder. This is a pluggable configuration that allows the query_vector to be created or retrieved.
This change introduces the configuration option `ignore_missing_component_templates` as discussed in https://github.com/elastic/elasticsearch/issues/92426. The implementation [option 6](https://github.com/elastic/elasticsearch/issues/92426#issuecomment-1372675683) was picked with a slight adjustment, meaning no patterns are allowed.
## Implementation
During the creation of an index template, the list of component templates is checked to ensure that all component templates exist. This check is extended to skip any component templates which are listed under `ignore_missing_component_templates`. An index template that skips the check for the component template `logs-foo@custom` looks as follows:
```
PUT _index_template/logs-foo
{
"index_patterns": ["logs-foo-*"],
"data_stream": { },
"composed_of": ["logs-foo@package", "logs-foo@custom"],
"ignore_missing_component_templates": ["logs-foo@custom"],
"priority": 500
}
```
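The extended existence check can be sketched in Python as follows. This is a minimal illustration, not the actual implementation: the function name is hypothetical, and names are compared exactly because no patterns are allowed.

```python
def missing_component_templates(index_template, existing):
    """Return component templates that are required but do not exist."""
    ignored = set(index_template.get("ignore_missing_component_templates", []))
    return [name for name in index_template.get("composed_of", [])
            if name not in existing and name not in ignored]


# The relevant part of the index template from the request above.
logs_foo = {
    "composed_of": ["logs-foo@package", "logs-foo@custom"],
    "ignore_missing_component_templates": ["logs-foo@custom"],
}
```

An empty result means the index template can be created. A non-empty result corresponds to the error returned for missing component templates that are not on the ignore list.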
The component template `logs-foo@package` has to exist before the index template is created. The ignored component template `logs-foo@custom` can be created later with:
```
PUT _component_template/logs-foo@custom
{
"template": {
"mappings": {
"properties": {
"host.ip": {
"type": "ip"
}
}
}
}
}
```
## Testing
For manual testing, different scenarios can be tested. To simplify testing, the commands from a `.http` file are included. Before each test run, a clean cluster is expected.
### New behaviour, missing component template
With the new config option, it must be possible to create an index template with a missing component template without getting an error:
```
### Add logs-foo@package component template
PUT http://localhost:9200/_component_template/logs-foo@package
Authorization: Basic elastic password
Content-Type: application/json
{
"template": {
"mappings": {
"properties": {
"host.name": {
"type": "keyword"
}
}
}
}
}
### Add logs-foo index template
PUT http://localhost:9200/_index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json
{
"index_patterns": ["logs-foo-*"],
"data_stream": { },
"composed_of": ["logs-foo@package", "logs-foo@custom"],
"ignore_missing_component_templates": ["logs-foo@custom"],
"priority": 500
}
### Create data stream
PUT http://localhost:9200/_data_stream/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json
### Check if mappings exist
GET http://localhost:9200/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json
```
Check that all templates could be created and that the data stream mappings are correct.
### Old behaviour, with all component templates
In the following, a component template is made optional but it already exists. Check that it shows up in the mappings:
```
### Add logs-foo@package component template
PUT http://localhost:9200/_component_template/logs-foo@package
Authorization: Basic elastic password
Content-Type: application/json
{
"template": {
"mappings": {
"properties": {
"host.name": {
"type": "keyword"
}
}
}
}
}
### Add logs-foo@custom component template
PUT http://localhost:9200/_component_template/logs-foo@custom
Authorization: Basic elastic password
Content-Type: application/json
{
"template": {
"mappings": {
"properties": {
"host.ip": {
"type": "ip"
}
}
}
}
}
### Add logs-foo index template
PUT http://localhost:9200/_index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json
{
"index_patterns": ["logs-foo-*"],
"data_stream": { },
"composed_of": ["logs-foo@package", "logs-foo@custom"],
"ignore_missing_component_templates": ["logs-foo@custom"],
"priority": 500
}
### Create data stream
PUT http://localhost:9200/_data_stream/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json
### Check if mappings exist
GET http://localhost:9200/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json
```
### Check old behaviour
Ensure that the old behaviour still exists when a component template is used that is not part of `ignore_missing_component_templates`:
```
### Add logs-foo index template
PUT http://localhost:9200/_index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json
{
"index_patterns": ["logs-foo-*"],
"data_stream": { },
"composed_of": ["logs-foo@package", "logs-foo@custom"],
"ignore_missing_component_templates": ["logs-foo@custom"],
"priority": 500
}
```
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
This commit changes the geoip downloader so that we only download the geoip databases if you
have at least one geoip processor in your cluster, or when you add a new geoip processor (or if
`ingest.geoip.downloader.eager.download` is explicitly set to true).
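
The download decision described here can be sketched in Python. This is a minimal model, assuming pipelines are plain dictionaries with a `processors` list; the function name is hypothetical and the real downloader logic may consider more than this.

```python
def should_download_databases(pipelines, eager_download=False):
    """Decide whether the downloader should fetch the geoip databases."""
    if eager_download:
        # Corresponds to ingest.geoip.downloader.eager.download being true.
        return True
    # Otherwise download only if some pipeline uses a geoip processor.
    return any("geoip" in processor
               for pipeline in pipelines
               for processor in pipeline.get("processors", []))
```

Re-evaluating this function whenever a pipeline is added models the "or when you add a new geoip processor" case.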
* enhancement: boolean field to support ignore_malformed
* fix: changes in current builder for BooleanFieldMappers within tests files.
* Updating documentation
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Amy Jonsson <amy.jonsson@elastic.co>
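
The effect of `ignore_malformed` on a boolean field can be sketched in Python. This is a simplified model, assuming the accepted values are JSON booleans plus the strings `"true"`, `"false"`, and `""`; the function name and the `(value, was_ignored)` return shape are hypothetical.

```python
def index_boolean(value, ignore_malformed=False):
    """Parse a boolean field value; return (parsed_value, was_ignored)."""
    if isinstance(value, bool):
        return value, False
    if isinstance(value, str) and value in ("true", "false", ""):
        # The empty string is treated as false.
        return value == "true", False
    if ignore_malformed:
        # Skip the malformed value but keep indexing the rest of the document.
        return None, True
    raise ValueError(f"failed to parse boolean value [{value!r}]")
```

With `ignore_malformed` disabled, a single bad value rejects the whole document. With it enabled, only that field is dropped.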
This PR makes JsonProcessor's JSON parsing a little bit stricter so that
we are not silently dropping data when given bad inputs. Previously if
the input string began with something that could be parsed as a valid
json field, then the processor would grab that and ignore the rest. For
example, `123 "foo"` would be parsed as `123`, dropping the `"foo"`. Now
by default it will throw an IllegalArgumentException on a string like
this. A user can now set the `strict_json_parsing` parameter to false to
get the old behavior. For example:
```
POST _ingest/pipeline/_simulate
{
"pipeline": {
"description": "",
"processors" : [
{
"json" : {
"field" : "message",
"strict_json_parsing": false
}
}
]
},
"docs": [
{
"_source": {
"message": "123 \"foo\""
}
}
]
}
```
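The difference between the two modes can be illustrated with a Python analogy using the standard `json` module. This is not the processor's implementation; the function name is hypothetical, and Python raises `json.JSONDecodeError` where the processor throws an IllegalArgumentException.

```python
import json


def parse_json(text, strict=True):
    """Parse a JSON value; in lenient mode ignore anything after the first value."""
    if strict:
        # json.loads rejects trailing data after the first value.
        return json.loads(text)
    # raw_decode parses the first value and reports where it stopped.
    value, _end = json.JSONDecoder().raw_decode(text.lstrip())
    return value
```

Called with `strict=False`, the input `123 "foo"` yields `123` and the trailing `"foo"` is silently dropped, which mirrors the old behavior. In strict mode the same input raises an error.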
Closes #92898
The systemd unit file is part of the Elasticsearch package and should
not be edited. Instead, we recommend creating a service override file.
This commit tweaks the docs for setting tmp dir with systemd to use the
override file instead of editing the unit file.
Relates #93121
* Documentation for geohex_grid over geo_shape
The feature to add support for geohex_grid aggregations over geo_shape
fields was added in https://github.com/elastic/elasticsearch/pull/91956.
This is the associated documentation for that.
* Update docs/reference/aggregations/bucket/geohexgrid-aggregation.asciidoc
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
* Fix explanation for geo_point vs geo_shape proj
When aggregating geohex over geo_shape we use equirectangular because
the underlying Lucene index indexes and searches the polygons in that way.
* Correct spelling
According to Grammarly, "therefor" is not an alternative spelling
of "therefore". We should use the conjunctive form here.
See https://www.grammarly.com/blog/therefore-vs-therefor/
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
Update tsdb docs to include a warning that the format of the `_tsid` field shouldn't be relied upon and add additional limitations about dimension fields.
* [+DOC] Restore policies in restoring ILM indices
👋 howdy! This may need Asciidoc reformatting. Will you kindly add in express commentary on [Restore a managed Datastream or Index](https://www.elastic.co/guide/en/elasticsearch/reference/master/index-lifecycle-and-snapshots.html?edit) to also restore ILM policies as needed (via `include_global_state`). Otherwise, you induce ILM errors once ILM starts (and have to do a form of repeating the entire outlined procedure to get indices going through correctly.)
* Apply suggestions from code review
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Today we suggest that users set `ES_TMPDIR` using `export`, which only
works if you're running things directly from the shell. Yet most users
encountering `ES_TMPDIR` problems seem to be on RHEL and trying to run
things via `systemd`, for whom the `export` suggestion doesn't work.
This commit adds to the docs a suggestion of how to adjust the `systemd`
service file to set the appropriate environment variable.
Relates #80651
This PR is another round of documentation updates for the JWT realm, with the goal of achieving better clarity, differentiating more between the two token types, and encouraging readers to choose between them carefully.
Relates: #92409
To make it clear that repository snapshots should be available and reliable for any mounted searchable snapshots.
Co-authored-by: David Turner <david.turner@elastic.co>
The companion PR to elastic/ml-cpp#2440 adds processing of multimodal_distribution field in the anomaly score explanation. I added a changelog entry in the ml-cpp PR hence I mark this PR as a non-issue.