15703 commits

Author SHA1 Message Date
Liam Thompson
d0f4966431
[DOCS] Add local dev setup instructions (#107913)
* [DOCS] Add local dev setup instructions

- Replace existing Run ES in Docker locally page, with simpler no-security local dev setup
- Move this file into Quickstart folder, along with existing quickstart guide
- Update self-managed instructions in Quickstart guide to use local dev approach
2024-05-07 18:10:48 +02:00
Lisa Cawley
a079cdc17d
[DOCS] Update transform and anomaly detection rule creation steps (#107975) 2024-05-07 07:52:45 -07:00
Jonathan Buttner
9623e522c3
[ML] Inference document configurable settings (#108273)
* Starting to document various inference settings

* Finish settings

* Update docs/reference/settings/inference-settings.asciidoc

Co-authored-by: Max Hniebergall <137079448+maxhniebergall@users.noreply.github.com>

* Update docs/reference/settings/inference-settings.asciidoc

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/settings/inference-settings.asciidoc

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/settings/inference-settings.asciidoc

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/settings/inference-settings.asciidoc

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/settings/inference-settings.asciidoc

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/settings/inference-settings.asciidoc

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/settings/inference-settings.asciidoc

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

---------

Co-authored-by: Max Hniebergall <137079448+maxhniebergall@users.noreply.github.com>
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
2024-05-07 10:19:08 -04:00
Liam Thompson
9b7e9b5d59
[DOCS] ESQL goes GA (#108342) 2024-05-07 14:12:50 +02:00
Bogdan Pintea
de725aef80
Add docs clarifications on DATE_DIFF args (#108301)
This adds some clarifications on the time unit strings the function
takes as arguments, noting the differences between these and the time
span literals, as well as the abbreviations' source.
2024-05-07 12:59:01 +02:00
David Turner
26db24317d
Revert "Cluster state role mapper file settings service (#107886)" (#108346)
This reverts commit 391136c089.
2024-05-07 10:52:51 +01:00
Andrew Wilkins
ee85f74e55
apm-data: increase version for templates (#108340)
* apm-data: bump version of resources

* Update docs/changelog/108340.yaml
2024-05-07 16:07:06 +08:00
Nhat Nguyen
e6b43a1709
Fix BlockHash DirectEncoder (#108283)
The DirectEncoder currently returns the incorrect value for the
positionCount() method, which should be the number of positions ready in
the current batch. We need to keep track of whether a position is loaded
via encodeNextBatch() and consumed via the read() method. However, we
can always return 1 for positionCount(), indicating that one position is
already loaded. Our tests failed to catch this because mv_ordering
wasn't enabled when generating test blocks, effectively disabling the
DirectEncoders.

Closes #108268
2024-05-06 09:05:55 -07:00
Nik Everett
d8d25ebdd7
ESQL: Log queries at debug level (#108257)
Previously we were logging all ESQL queries. That's a lot! Plus maybe
there's PII in there or something. Let's not do that unless you ask for
it. This changes the query logging to the `debug` log level. You can
still get at these if you want them, but you don't have them by default;
you have to turn it on.
2024-05-06 11:19:02 -04:00
Nik Everett
089fd7d7da
ESQL: Rework integration-only csv testing (#108313)
This reworks the integration-test-only csv testing for `metadata` to use
the `required_feature:` syntax instead of the `-IT_tests_only`
extension. This is a little more flexible and way nicer on the eyes.
2024-05-06 11:06:50 -04:00
Nikolaj Volgushev
31afff92f8
Invalidate cross cluster API key docs (#108297)
This PR documents privilege requirements for cross-cluster API key
invalidation, which were updated in
https://github.com/elastic/elasticsearch/pull/107411.
2024-05-06 10:02:14 -04:00
Ignacio Vera
c89de11e57
Optimise frequent item sets aggregation for single value fields (#108130)
Similar to #107832, this commit optimizes the frequent item sets aggregation for single value fields.
2024-05-06 16:01:21 +02:00
Liam Thompson
1be1110740
[DOCS] Clarify retriever is not API (#108295) 2024-05-06 15:52:25 +02:00
shainaraskas
58729edc30
add gatekeeper workaround (#108265) 2024-05-06 09:18:51 -04:00
shainaraskas
b84bd458b6
[DOCS] clarify that the repo location setting accepts only one value (#108267) 2024-05-06 09:18:17 -04:00
Bogdan Pintea
b26d7d3e14
Introduce an IP functions group (#108304)
This takes the CIDR_MATCH out of the operators group and adds it to a
new `IP functions` group.
The change also rearranges the groups, grouping together the
type-specific functions and ordering them alphabetically.
2024-05-06 13:43:30 +02:00
István Zoltán Szabó
d90c0af3a6
[DOCS] Documents param for Health API. (#108296) 2024-05-06 12:23:17 +02:00
Yang Wang
bcf4297e89
Ensure necessary security context for s3 bulk deletions (#108280)
This PR moves the doPrivileged wrapper closer to the actual deletion
request to ensure the necessary security context is established at all
times. Also added a new repository setting to configure max size for s3
deleteObjects request.

Fixes: #108049
2024-05-06 06:02:09 -04:00
elasticsearchmachine
844979414b
Forward port release notes for v8.13.3 (#108256) 2024-05-06 07:44:48 +02:00
Nhat Nguyen
16884a3bcd
Fix tsdb codec when doc-values spread in two blocks (#108276)
Currently, loading ordinals multiple times (after advanceExact) for 
documents with values spread across multiple blocks in the TSDB codec
will fail due to the absence of re-seeking for the ordinals block.

Doc-values of a document can spread across multiple blocks in two cases:
when it has more than 128 values or when it exceeds the remaining space
in the current block.
2024-05-04 13:54:56 -07:00
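The two cases named in the commit above can be illustrated with a small sketch. It assumes a fixed block capacity of 128 values (inferred from the "more than 128 values" wording); the function name and the offset-based model are hypothetical and are not the codec's actual API.

```python
BLOCK_SIZE = 128  # assumed block capacity, inferred from the commit message


def spans_multiple_blocks(offset_in_block: int, value_count: int,
                          block_size: int = BLOCK_SIZE) -> bool:
    """Return True when a document's values cannot fit in the current block.

    offset_in_block: number of values already stored in the current block.
    value_count: number of values the document contributes.
    """
    if not 0 <= offset_in_block < block_size:
        raise ValueError("offset_in_block must be within the block")
    remaining = block_size - offset_in_block
    # A document with more than block_size values always exceeds the
    # remaining space, so this single comparison covers both cases.
    return value_count > remaining
```

Under this model, a document whose values spill over is exactly the kind that would need the ordinals block to be re-sought when it is read again.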
Fang Xing
4daac77e3b
[ES|QL] Add/Modify annotations for operators for better doc generation (#108220)
* annotation for operators
2024-05-03 22:59:51 -04:00
Rafi Estrada
295fba33d8
Add note about license to "Restore an Entire Cluster" docs (#87485)
One user reached out mentioning that it would be a good idea to remind
users to re-upload the license after full cluster recovery from snapshot
as one can easily miss this when trying to figure out why some features
aren't working after the restore.
2024-05-03 19:23:51 -04:00
Lee Hinman
5361027989
Log details of non-green indicators in HealthPeriodicLogger (#108266)
* Log details of non-green indicators in HealthPeriodicLogger

This commit adds the details of an indicator that is not green to the fields for
`HealthPeriodicLogger`.

An example of a regular (green) log message:

```
[2024-05-03T13:42:34,346][INFO ][o.e.h.HealthPeriodicLogger] [runTask-0] elasticsearch.health.data_stream_lifecycle.status="green" elasticsearch.health.disk.status="green" elasticsearch.health.ilm.status="green" elasticsearch.health.master_is_stable.status="green" elasticsearch.health.overall.status="green" elasticsearch.health.repository_integrity.status="green" elasticsearch.health.shards_availability.status="green" elasticsearch.health.shards_capacity.status="green" elasticsearch.health.slm.status="green" message="health=green"
```

And a message with details while the cluster is non-green:

```
[2024-05-03T13:43:34,339][INFO ][o.e.h.HealthPeriodicLogger] [runTask-0] elasticsearch.health.data_stream_lifecycle.status="green" elasticsearch.health.disk.status="green" elasticsearch.health.ilm.status="green" elasticsearch.health.master_is_stable.status="green" elasticsearch.health.overall.status="yellow" elasticsearch.health.repository_integrity.status="green" elasticsearch.health.shards_availability.details="{"initializing_primaries":0,"creating_replicas":0,"started_replicas":0,"unassigned_primaries":0,"restarting_replicas":0,"creating_primaries":0,"initializing_replicas":0,"unassigned_replicas":1,"started_primaries":2,"restarting_primaries":0}" elasticsearch.health.shards_availability.status="yellow" elasticsearch.health.shards_capacity.status="green" elasticsearch.health.slm.status="green" message="health=yellow [shards_availability]"
```

* Update docs/changelog/108266.yaml
2024-05-03 16:29:27 -06:00
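The field layout in the two log lines above can be mimicked with a short sketch. This is only an illustration of the shape of the output, not the Java implementation: `build_health_fields` and its input format are hypothetical, and the comma used to join several non-green indicator names is an assumption, since the example shows only one.

```python
from typing import Dict, Optional, Tuple

# (status, details as a JSON string or None); a hypothetical input shape.
Indicator = Tuple[str, Optional[str]]


def build_health_fields(overall: str,
                        indicators: Dict[str, Indicator]) -> Dict[str, str]:
    """Build log fields, adding details only for non-green indicators."""
    fields: Dict[str, str] = {}
    non_green = []
    for name in sorted(indicators):
        status, details = indicators[name]
        prefix = f"elasticsearch.health.{name}"
        if status != "green":
            non_green.append(name)
            if details is not None:
                fields[f"{prefix}.details"] = details
        fields[f"{prefix}.status"] = status
    fields["elasticsearch.health.overall.status"] = overall
    message = f"health={overall}"
    if non_green:
        # Assumed separator when more than one indicator is non-green.
        message += " [" + ",".join(non_green) + "]"
    fields["message"] = message
    return fields
```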
Jake Landis
79e6e770f9
upgrade bouncy castle (non-fips) to 1.78.1 (#108223) 2024-05-03 16:10:20 -05:00
David Turner
eb90e36235
Fix serialization of put/delete shutdown requests (#107862)
Co-authored-by: Simon Cooper <simon.cooper@elastic.co>
2024-05-03 16:57:20 +01:00
elasticsearchmachine
f02761f729
Prune changelogs after 8.13.3 release 2024-05-03 15:14:10 +00:00
Bogdan Pintea
5f4ef87c47
Fix docs generation of signatures for variadic functions (#107865)
This fixes the generation of the signatures for variadic functions,
except for those that take a list as last argument; i.e. functions with
optional arguments (like ROUND) or functions with overloading-like
signatures (like BUCKET).
2024-05-03 15:37:22 +02:00
Howard
8c8063be63
Drop shards close timeout when stopping node. (#107978)
Closing https://github.com/elastic/elasticsearch/issues/107938 to drop
shards close timeout.
2024-05-03 06:03:33 -04:00
Andrew Wilkins
0bb7dc0788
apm-data: improve indexing resilience (#108227)
In 8.11 we were not dynamically mapping any stack trace fields,
exception attributes, {error,transaction}.custom, cookies, or HTTP
request bodies. In 8.12 we changed to using `flattened` in some of these
and mirrored the change in the apm-data plugin.

It has since come to light that we're seeing indexing failures using
`flattened` where the field contents are not completely unsanitised,
e.g. in stack traces with deeply nested stack frame variables, or source
lines that exceed 32KB.

We're reverting this use of `flattened` since there is no requirement
for these fields to be searchable by default, and users can override
this with a custom component template.

On the other hand we are making HTTP request and response headers
`flattened`, since:
- headers cannot be deeply nested
- these have a natural length limit imposed by server implementations
2024-05-03 05:54:24 -04:00
David Kyle
0005949884
[ML] Inference Processor: skip inference when all fields are missing (#108131)
If all the configured input_output fields are missing then skip the inference request.
2024-05-03 10:52:16 +01:00
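The skip rule in the commit above can be sketched as a simple predicate. The function name is hypothetical, the document is modeled as a flat mapping (nested field paths are ignored), and returning False when no input fields are configured is an assumption.

```python
from typing import Any, Iterable, Mapping


def should_skip_inference(document: Mapping[str, Any],
                          input_fields: Iterable[str]) -> bool:
    """Return True only when every configured input field is missing."""
    fields = list(input_fields)
    # Assumption: with no configured fields there is nothing to skip on.
    return bool(fields) and all(field not in document for field in fields)
```

A document that has at least one of the configured fields would still be sent for inference under this sketch.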
David Kyle
a8166ddb6b
[ML] Allow deletion of the ELSER inference service when referenced in ingest (#108146) 2024-05-03 10:49:53 +01:00
Ignacio Vera
fe857b3908
Optimise cardinality aggregations for single value fields (#107892) 2024-05-03 10:09:04 +02:00
Stef Nestor
5b28d3bff4
[Doc+] Add Secure Connection to Setup CCR Tutorial (#103237) 2024-05-03 09:13:42 +02:00
Liam Thompson
9a62dba53c
[DOCS] Remove remaining beta flags for RCS (#108201) 2024-05-03 09:12:37 +02:00
David Turner
f0ea05811f
Async close of IndexShard (#108145)
Moves the work to close an `IndexShard`, including any final flush and
waiting for merges to complete, off the cluster applier thread to avoid
delaying the application of cluster state updates.

Relates #89821 Relates #107513 Relates #108096 Relates ES-8334
2024-05-03 02:46:03 -04:00
Jake Landis
550b7c9e40
Remote cluster - API key security model - cluster privileges (#107493)
This commit adds a new remote_cluster to the role and API keys to enable expressing 
cluster permissions across remote clusters similar to the remote_indices block for 
RCS 2.0 (API key based cross cluster security).
```
"remote_cluster": [
    {
        "privileges": [
            "monitor_enrich"
        ],
        "clusters": [
            "my_remote*"
        ]
    }
]
```
The motivation for this change is to enable ES|QL executed over remote clusters to allow or
disallow remote enrichment. As such, the only privilege allowed here is "monitor_enrich",
which controls whether users can use the ES|QL ENRICH keyword as part of their queries for the
remote cluster. RemoteClusterSecurityEsqlIT has the relevant functional tests, since ES|QL is the
only way to exercise this permission. This change is quite large since it touches
the role and role descriptors used for API keys. That requires updates and/or testing for
CRUD role, API keys, file based roles, get user privileges (which echoes the role), usage stats,
and the various views of the role: role descriptors, roles (base/limited by), role intersections,
and (effective) roles.

The data model design differs a bit from remote_indices in that it is a simpler data model
which shares very little with the non-remote variant of cluster privileges. This allows the
data to be modeled only once via RemoteClusterPermissions and RemoteClusterPermissionGroup.
RemoteClusterPermissions has N RemoteClusterPermissionGroup and is reused across the
various views of the roles. RemoteClusterPermissions is created early and is used as the
primary contract with this new concept. Those data objects are self de/serializing and
self toXContent (we still use old school fromXContent and didn't want to mix and match
old/new parsing strategies). Much like remote_indices, on transfer to the remote cluster,
they will be converted into the non remote variant.
2024-05-02 14:19:48 -05:00
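As a rough illustration of the `remote_cluster` block described in the commit above, the sketch below builds the same structure and rejects any privilege other than "monitor_enrich", which the message says is the only one allowed. The helper names are hypothetical, and the validation is a simplification rather than the actual RemoteClusterPermissions logic.

```python
from typing import Dict, List

# Per the commit message, this is the only privilege currently allowed.
ALLOWED_REMOTE_CLUSTER_PRIVILEGES = {"monitor_enrich"}


def remote_cluster_group(privileges: List[str],
                         clusters: List[str]) -> Dict[str, List[str]]:
    """Build one remote_cluster entry, validating its privileges."""
    if not privileges or not clusters:
        raise ValueError("privileges and clusters must both be non-empty")
    unsupported = sorted(set(privileges) - ALLOWED_REMOTE_CLUSTER_PRIVILEGES)
    if unsupported:
        raise ValueError(f"unsupported remote cluster privileges: {unsupported}")
    return {"privileges": list(privileges), "clusters": list(clusters)}


def role_body(groups: List[Dict[str, List[str]]]) -> Dict[str, list]:
    """Wrap the groups in the remote_cluster key of a role definition."""
    return {"remote_cluster": list(groups)}
```

Calling `role_body([remote_cluster_group(["monitor_enrich"], ["my_remote*"])])` yields the structure shown in the commit message.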
Nik Everett
9cb5b3174e
Docs: Update known issue (#108211)
Updates the known issue for #108181 to include that this can happen
during a rolling restart.
2024-05-02 13:12:21 -04:00
Stef Nestor
a3f3f59399
(Doc+) Delineate Bootstrapping Data Stream from Alias (#107976)
* (Doc+) Delineate Bootstrapping Data Stream from Alias 

👋 howdy, team! 

This is a follow-up to [elasticsearch#107327](https://github.com/elastic/elasticsearch/pull/107327). I realized my mistake was that we had duplicate sentences in different sections, so I edited the wrong area. However, it seemed like a good opportunity to consider clarifying the page better by fixing header links so that the sub-sections reflect as sub-headers instead of all being equal. Thoughts?

* Apply suggestions from code review

Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>

---------

Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
2024-05-02 11:07:07 -06:00
Luigi Dell'Aquila
b277d5b033
ES|QL: limit query depth to 500 levels (#108089) 2024-05-02 17:42:41 +02:00
Jake Landis
6d20cef931
Bump Tika dependencies to 2.9.2 (#108144)
This commit bumps Tika to 2.9.2 and manually bumps the transitive versions 
to match 2.9.2's parent POM. This commit also centralizes the dependency 
versions so that you only need to look at 1 list to see the full set of dependencies 
to manually check.
2024-05-02 10:19:31 -05:00
Jake Landis
a3c00ab1ef
Upgrade to Netty 4.1.109 (#108155)
Routine upgrade for Netty
2024-05-02 10:11:16 -05:00
Nhat Nguyen
0877633be2
Add BlockHash for 3 BytesRefs (#108165)
This change introduces a specialized BlockHash for 3 BytesRefs. Unlike 
other specialized BlockHashes, this BlockHash should handle nulls and
multivalues properly. While not intended as a showcase, it can
illustrate the performance enhancements of ESQL on multi-term 
aggregations over the search API in our benchmark. This change is
expected to reduce the execution time of multi-term aggregations from
2054ms to 1178ms.
2024-05-02 07:04:17 -07:00
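The timing figures quoted in the commit above can be turned into a relative reduction and a speedup factor with a couple of helper functions. The function names are illustrative only; the input numbers come from the commit message.

```python
def relative_reduction(before_ms: float, after_ms: float) -> float:
    """Fraction of the original execution time that was saved."""
    if before_ms <= 0:
        raise ValueError("before_ms must be positive")
    return (before_ms - after_ms) / before_ms


def speedup(before_ms: float, after_ms: float) -> float:
    """How many times faster the new execution time is."""
    if after_ms <= 0:
        raise ValueError("after_ms must be positive")
    return before_ms / after_ms
```

For 2054 ms down to 1178 ms, this works out to a reduction of roughly 43%, or a speedup of about 1.7x.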
florent-leborgne
0c500e5264
Remove Beta label for RCS2.0 from 8.14 (#108030) 2024-05-02 15:43:21 +02:00
Parker Timmins
796b0deeec
Simulate should succeed if ignore_missing_pipeline (#108106)
PipelineProcessors with non-existing pipelines should succeed (as noop)
if ignore_missing_pipeline=true. Currently, this does not work when pipelines are
simulated with verbose=true. In this case, an error is returned and no results
are shown for subsequent processors. This change allows following processors
to run, and changes the status from error to error_ignored.
2024-05-02 08:35:20 -05:00
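The behaviour described in the commit above can be sketched as a toy verbose simulation loop. Everything here is a hypothetical model: processors are plain dicts, existing nested pipelines are not actually executed, and the "success" status for ordinary processors is an assumption; only `error` and `error_ignored` come from the commit message.

```python
from typing import Dict, List


def simulate_verbose(processors: List[dict],
                     pipelines: Dict[str, list]) -> List[dict]:
    """Return one result per processor, stopping at the first hard error."""
    results: List[dict] = []
    for proc in processors:
        is_missing_pipeline = (
            proc.get("type") == "pipeline" and proc.get("name") not in pipelines
        )
        if is_missing_pipeline:
            if proc.get("ignore_missing_pipeline", False):
                # Treated as a no-op: record it and keep going.
                results.append({"processor": "pipeline", "status": "error_ignored"})
                continue
            # Without the flag, the error ends the simulation.
            results.append({"processor": "pipeline", "status": "error"})
            break
        results.append({"processor": proc["type"], "status": "success"})
    return results
```

With `ignore_missing_pipeline` set, the processors after the missing pipeline still produce results; without it, the output ends at the error.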
Alexander Spies
f8c9aace9a
ESQL: Fix error message when failing to resolve aggregate groupings (#108101)
In queries like
STATS count(existing_field) BY non_existant_field
do not respond with validation errors claiming that existing_field was an unknown field.
2024-05-02 15:12:14 +02:00
Ignacio Vera
c8a828265d
Add known issue for CCS duplicated buckets (#108182) 2024-05-02 13:33:31 +02:00
Ignacio Vera
eea94ae66f
Optimise histogram aggregations for single value fields (#107893)
This commit optimises histogram aggregations for single value fields.
2024-05-02 07:30:55 +02:00
Albert Zaharovits
cf1f83fdde
Fix lingering license warning header (#108031)
Fixes the `AcceptChannelHandler#accept` mucking with the netty worker thread context.

Fixes #107573
2024-05-02 07:12:04 +02:00
Nhat Nguyen
05665f5269
Optimize for single value in ordinals grouping (#108118)
This PR introduces two optimizations in the ordinals grouping operator: 
(1) reading single values from doc values, and (2) enabling the vector
path for addInput. For the query `FROM nyc_taxis | stats avg(total_amount) by
rate_code_id | sort rate_code_id`, execution time dropped from 3300 ms to
2200 ms.
2024-05-01 09:21:39 -07:00
Fang Xing
7ae08306a0
mv functions (#107839)
Add annotations for MV functions for better doc generation.
2024-05-01 10:47:22 -04:00