* [DOCS] Add local dev setup instructions
- Replace existing Run ES in Docker locally page, with simpler no-security local dev setup
- Move this file into Quickstart folder, along with existing quickstart guide
- Update self-managed instructions in Quickstart guide to use local dev approach
* Starting to document various inference settings
* Finish settings
* Update docs/reference/settings/inference-settings.asciidoc
Co-authored-by: Max Hniebergall <137079448+maxhniebergall@users.noreply.github.com>
* Update docs/reference/settings/inference-settings.asciidoc
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Update docs/reference/settings/inference-settings.asciidoc
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Update docs/reference/settings/inference-settings.asciidoc
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Update docs/reference/settings/inference-settings.asciidoc
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Update docs/reference/settings/inference-settings.asciidoc
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Update docs/reference/settings/inference-settings.asciidoc
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Update docs/reference/settings/inference-settings.asciidoc
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
---------
Co-authored-by: Max Hniebergall <137079448+maxhniebergall@users.noreply.github.com>
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
This adds some clarifications on the time unit strings the function
takes as arguments, noting the differences between these and the time
span literals, as well as the abbreviations' source.
The DirectEncoder currently returns the incorrect value for the
positionCount() method, which should be the number of positions ready in
the current batch. Strictly, we would need to keep track of whether a
position is loaded via encodeNextBatch() and consumed via the read()
method. However, we can always return 1 for positionCount(), indicating
that one position is already loaded. Our tests failed to catch this because mv_ordering
wasn't enabled when generating test blocks, effectively disabling the
DirectEncoders.
Closes #108268
Previously we were logging all ESQL queries. That's a lot! Plus maybe
there's PII in there or something. Let's not do that unless you ask for
it. This changes the query logging to the `debug` log level. You can
still get at these if you want them, but you don't have them by default;
you have to turn it on.
This reworks the integration-test-only csv testing for `metadata` to use
the `required_feature:` syntax instead of the `-IT_tests_only`
extension. This is a little more flexible and way nicer on the eyes.
This takes the CIDR_MATCH out of the operators group and adds it to a
new `IP functions` group.
The change also rearranges the groups, grouping together the
type-specific functions and ordering them alphabetically.
This PR moves the doPrivileged wrapper closer to the actual deletion
request to ensure the necessary security context is established at all
times. It also adds a new repository setting to configure the maximum
size of an S3 deleteObjects request.
Fixes: #108049
Currently, loading ordinals multiple times (after advanceExact) for
documents with values spread across multiple blocks in the TSDB codec
will fail due to the absence of re-seeking for the ordinals block.
Doc-values of a document can spread across multiple blocks in two cases:
when it has more than 128 values or when it exceeds the remaining space
in the current block.
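
The two cases above can be sketched as a small predicate. This is an illustrative sketch only, not the codec's code: the function name, its parameters, and the assumption that a block holds exactly 128 values (inferred from the 128-value threshold above) are hypothetical.

```python
BLOCK_SIZE = 128  # assumed block capacity, inferred from the threshold above


def spans_multiple_blocks(value_count: int, used_in_current_block: int) -> bool:
    """Return True if a document's values cannot fit in the current block."""
    if not 0 <= used_in_current_block < BLOCK_SIZE:
        raise ValueError("used_in_current_block must be in [0, BLOCK_SIZE)")
    remaining = BLOCK_SIZE - used_in_current_block
    # Case 1: more values than any single block can hold.
    too_many_values = value_count > BLOCK_SIZE
    # Case 2: more values than the space left in the current block.
    exceeds_remaining = value_count > remaining
    return too_many_values or exceeds_remaining
```

A document for which this returns True is the kind that needs the ordinals block to be re-sought when its ordinals are loaded again.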
One user reached out mentioning that it would be a good idea to remind
users to re-upload the license after full cluster recovery from snapshot
as one can easily miss this when trying to figure out why some features
aren't working after the restore.
* Log details of non-green indicators in HealthPeriodicLogger
This commit adds the details of an indicator that is not green to the fields for
`HealthPeriodicLogger`.
An example of a regular (green) log message:
```
[2024-05-03T13:42:34,346][INFO ][o.e.h.HealthPeriodicLogger] [runTask-0] elasticsearch.health.data_stream_lifecycle.status="green" elasticsearch.health.disk.status="green" elasticsearch.health.ilm.status="green" elasticsearch.health.master_is_stable.status="green" elasticsearch.health.overall.status="green" elasticsearch.health.repository_integrity.status="green" elasticsearch.health.shards_availability.status="green" elasticsearch.health.shards_capacity.status="green" elasticsearch.health.slm.status="green" message="health=green"
```
And a message with details while the cluster is non-green:
```
[2024-05-03T13:43:34,339][INFO ][o.e.h.HealthPeriodicLogger] [runTask-0] elasticsearch.health.data_stream_lifecycle.status="green" elasticsearch.health.disk.status="green" elasticsearch.health.ilm.status="green" elasticsearch.health.master_is_stable.status="green" elasticsearch.health.overall.status="yellow" elasticsearch.health.repository_integrity.status="green" elasticsearch.health.shards_availability.details="{"initializing_primaries":0,"creating_replicas":0,"started_replicas":0,"unassigned_primaries":0,"restarting_replicas":0,"creating_primaries":0,"initializing_replicas":0,"unassigned_replicas":1,"started_primaries":2,"restarting_primaries":0}" elasticsearch.health.shards_availability.status="yellow" elasticsearch.health.shards_capacity.status="green" elasticsearch.health.slm.status="green" message="health=yellow [shards_availability]"
```
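
To make the shape of the two messages concrete, here is a rough Python sketch of how such a set of log fields could be assembled from per-indicator results. It is not the `HealthPeriodicLogger` implementation: `build_health_fields` and its input format are hypothetical, the field names are copied from the examples above, and the comma used to join several non-green indicators is an assumption.

```python
import json


def build_health_fields(overall: str, indicators: dict) -> dict:
    """Build log fields from {name: {"status": str, "details": dict}}.

    Hypothetical helper: details are only emitted for non-green indicators.
    """
    fields = {"elasticsearch.health.overall.status": overall}
    non_green = []
    for name in sorted(indicators):
        result = indicators[name]
        prefix = f"elasticsearch.health.{name}"
        fields[f"{prefix}.status"] = result["status"]
        if result["status"] != "green":
            non_green.append(name)
            details = result.get("details")
            if details:
                # Compact JSON, matching the style of the example above.
                fields[f"{prefix}.details"] = json.dumps(
                    details, separators=(",", ":")
                )
    message = f"health={overall}"
    if non_green:
        # Joining with a comma is an assumption; the example shows one name.
        message += f" [{','.join(non_green)}]"
    fields["message"] = message
    return fields
```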
* Update docs/changelog/108266.yaml
This fixes the generation of the signatures for variadic functions,
except for those that take a list as last argument; i.e. functions with
optional arguments (like ROUND) or functions with overloading-like
signatures (like BUCKET).
In 8.11 we were not dynamically mapping any stack trace fields,
exception attributes, {error,transaction}.custom, cookies, or HTTP
request bodies. In 8.12 we changed to using `flattened` in some of these
and mirrored the change in the apm-data plugin.
It has since come to light that we're seeing indexing failures using
`flattened` where the field contents are not sanitised,
e.g. in stack traces with deeply nested stack frame variables, or source
lines that exceed 32KB.
We're reverting this use of `flattened` since there is no requirement
for these fields to be searchable by default, and users can override
this with a custom component template.
On the other hand we are making HTTP request and response headers
`flattened`, since:
- headers cannot be deeply nested
- these have a natural length limit imposed by server implementations
Moves the work to close an `IndexShard`, including any final flush and
waiting for merges to complete, off the cluster applier thread to avoid
delaying the application of cluster state updates.
Relates #89821 Relates #107513 Relates #108096 Relates ES-8334
This commit adds a new remote_cluster block to roles and API keys to enable expressing
cluster permissions across remote clusters, similar to the remote_indices block for
RCS 2.0 (API key based cross cluster security).
```
"remote_cluster": [
{
"privileges": [
"monitor_enrich"
],
"clusters": [
"my_remote*"
]
}
]
```
The motivation for this change is to enable ES|QL executed over remote clusters to allow or
disallow remote enrichment. As such, the only privilege allowed here is "monitor_enrich",
which controls whether users can use the ES|QL ENRICH keyword as part of their queries for the
remote cluster. RemoteClusterSecurityEsqlIT has the relevant functional tests, since ES|QL is the
only way to exercise this permission. This change is quite large since it touches
the role and role descriptors used for API keys. That requires updates and/or testing for
CRUD role, API keys, file based roles, get user privileges (which echoes the role), usage stats,
and the various views of the role: role descriptors, roles (base/limited by), role intersections,
and (effective) roles.
The data model design differs a bit from remote_indices in that it is a simpler data model
which shares very little with the non-remote variant of cluster privileges. This allows the
data to be modeled only once via RemoteClusterPermissions and RemoteClusterPermissionGroup.
RemoteClusterPermissions has N RemoteClusterPermissionGroup and is reused across the
various views of the roles. RemoteClusterPermissions is created early and is used as the
primary contract with this new concept. Those data objects are self de/serializing and
self toXContent (we still use old school fromXContent and didn't want to mix and match
old/new parsing strategies). Much like remote_indices, on transfer to the remote cluster,
they will be converted into the non remote variant.
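
As a rough illustration of the "one container, N groups" shape described above, here is a Python sketch. It is not the Java implementation: the class names mirror the ones mentioned in this message, but every method, the `fnmatch`-based wildcard matching, and the validation details are assumptions made for the example.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase

# Per the description above, this is the only privilege currently allowed.
ALLOWED_REMOTE_CLUSTER_PRIVILEGES = {"monitor_enrich"}


@dataclass(frozen=True)
class RemoteClusterPermissionGroup:
    privileges: tuple
    clusters: tuple

    def __post_init__(self):
        if not self.privileges or not self.clusters:
            raise ValueError("privileges and clusters must be non-empty")
        unknown = set(self.privileges) - ALLOWED_REMOTE_CLUSTER_PRIVILEGES
        if unknown:
            raise ValueError(f"unsupported privileges: {sorted(unknown)}")

    def to_dict(self) -> dict:
        return {
            "privileges": list(self.privileges),
            "clusters": list(self.clusters),
        }


@dataclass
class RemoteClusterPermissions:
    groups: list = field(default_factory=list)

    def add_group(self, group: RemoteClusterPermissionGroup):
        self.groups.append(group)
        return self

    def privileges_for(self, cluster_alias: str) -> set:
        """Collect privileges from every group whose pattern matches."""
        granted = set()
        for group in self.groups:
            # fnmatchcase only approximates the real pattern matching.
            if any(fnmatchcase(cluster_alias, p) for p in group.clusters):
                granted.update(group.privileges)
        return granted

    def to_dict(self) -> dict:
        return {"remote_cluster": [g.to_dict() for g in self.groups]}
```

With a single group of `("monitor_enrich",)` for `("my_remote*",)`, `to_dict()` yields the same structure as the `remote_cluster` JSON block shown earlier.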
* (Doc+) Delineate Bootstrapping Data Stream from Alias
👋 howdy, team!
This is a follow-up to [elasticsearch#107327](https://github.com/elastic/elasticsearch/pull/107327). I realized my mistake was that we had duplicate sentences in different sections, so I edited the wrong area. However, it seemed like a good opportunity to consider clarifying the page better by fixing header links so that the sub-sections render as sub-headers instead of all being equal. Thoughts?
* Apply suggestions from code review
Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
---------
Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
This commit bumps Tika to 2.9.2 and manually bumps the transitive versions
to match 2.9.2's parent POM. This commit also centralizes the dependency
versions so that you only need to look at 1 list to see the full set of dependencies
to manually check.
This change introduces a specialized BlockHash for 3 BytesRefs. Unlike
other specialized BlockHashes, this BlockHash should handle nulls and
multivalues properly. While not intended as a showcase, it can
illustrate the performance enhancements of ESQL on multi-term
aggregations over the search API in our benchmark. This change is
expected to reduce the execution time of multi-term aggregations from
2054ms to 1178ms.
PipelineProcessors with non-existing pipelines should succeed (as noop)
if ignore_missing_pipeline=true. Currently, this does not work when pipelines are
simulated with verbose=true. In this case, an error is returned and no results
are shown for subsequent processors. This change allows following processors
to run, and changes the status from error to error_ignored.
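
The intended behavior can be sketched in a few lines of Python. This is a simplified model rather than the simulate API's code: `simulate_verbose`, the processor dictionaries, and the result shape are hypothetical, and only the `error` and `error_ignored` statuses come from the description above.

```python
def simulate_verbose(processors: list, known_pipelines: set) -> list:
    """Return one result per processor that was reached during simulation."""
    results = []
    for proc in processors:
        is_missing_pipeline = (
            proc["type"] == "pipeline" and proc["name"] not in known_pipelines
        )
        if is_missing_pipeline:
            if proc.get("ignore_missing_pipeline", False):
                # Treated as a no-op: record it and keep going.
                results.append(
                    {"processor_type": "pipeline", "status": "error_ignored"}
                )
                continue
            # A hard error: nothing is reported for later processors.
            results.append({"processor_type": "pipeline", "status": "error"})
            break
        results.append({"processor_type": proc["type"], "status": "success"})
    return results
```

With `ignore_missing_pipeline` set on a processor that references an unknown pipeline, the processors after it still produce results; without it, the list ends at the error.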
For queries like
STATS count(existing_field) BY non_existant_field
do not respond with validation errors claiming that existing_field is an unknown field.
This PR introduces two optimizations in the ordinals grouping operator:
(1) reading single values from doc values, and (2) enabling the vector
path for addInput. For the query FROM nyc_taxis | stats avg(total_amount) by
rate_code_id | sort rate_code_id, execution time dropped from 3300 ms to
2200 ms.