The local health monitor runs every 10 seconds on every node, and the
assertBusy statements we use to check that the results have been received
also have a 10-second timeout. This change increases the timeout of the
assertBusy statements to 30 seconds to make sure the results from each
node are received.
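For illustration, a minimal sketch of the adjusted wait, assuming a test that extends `ESTestCase` (which provides `assertBusy`); the helper and variables inside the assertion are hypothetical:
```
// Wait up to 30 seconds instead of matching the monitor's own 10-second
// interval; TimeUnit is java.util.concurrent.TimeUnit.
assertBusy(() -> {
    var resultsPerNode = fetchHealthResults(); // hypothetical helper
    assertEquals(numberOfNodes, resultsPerNode.size());
}, 30, TimeUnit.SECONDS);
```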
We can DRY up the code for SequenceIDFields etc. quite a bit.
Also, there is no point in storing the BytesRef as a field in
SingleValueLongField. Except for the case of nested docs (and I'm not
sure it's even guaranteed to help there), this saves no allocations and
only costs memory end to end (an extra field, and the BytesRef lives
longer).
* Modify the name of the thread pool metric for rejected tasks to adhere to the naming standard
This PR enables the `secure_bind_password` setting to be updated
via the `reload_secure_settings` API, without the need to restart nodes.
The `secure_bind_password` must be updated on the AD/LDAP
server, changed in the Elasticsearch keystore, and reloaded via the
`reload_secure_settings` API. This change does not include
support for a grace period where both the old and new passwords
are active. The new password takes effect immediately after the
reload and will be used when establishing new LDAP connections.
LDAP connections are stateful: once a connection is established
and bound, it remains open until explicitly closed or until a connection
timeout occurs. Changing the bind password will not affect or
invalidate existing connections.
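As a hedged illustration of the workflow using the low-level Java REST client (host, port, realm name, and keystore password below are placeholders, not part of this change):
```
import org.apache.http.HttpHost;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

// Assumes the password was already changed on the AD/LDAP server and then
// stored in each node's keystore, e.g. (realm name "ldap1" is illustrative):
//   bin/elasticsearch-keystore add xpack.security.authc.realms.ldap.ldap1.secure_bind_password
try (RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build()) {
    Request request = new Request("POST", "/_nodes/reload_secure_settings");
    // Only needed when the keystore itself is password-protected:
    request.setJsonEntity("{\"secure_settings_password\": \"keystore-password\"}");
    Response response = client.performRequest(request);
    // New LDAP connections now bind with the new password; existing
    // connections are left untouched.
}
```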
With this commit we rename the field `stacktrace_ids` to
`stacktrace_ids_field` for the API calls `_profiling/stacktraces` and
`_profiling/flamegraph`. This field is expected to contain the
name of the field to query in the indices provided by the `indices`
parameter. As the old field name was misleading (should we provide the
name of a field or a list of ids?), we rename it. As these APIs are
meant for exclusive use by Kibana and the field has been unused so far,
we make this change directly without introducing any BWC layer.
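A hedged sketch of a request after the rename, using the low-level Java REST client's `Request` (client setup omitted; the index pattern and field value are illustrative only):
```
Request request = new Request("POST", "/_profiling/stacktraces");
// stacktrace_ids_field holds the *name* of the field to query, not a
// list of ids.
request.setJsonEntity("""
    {
      "indices": ["profiling-events*"],
      "stacktrace_ids_field": "Stacktrace.id"
    }
    """);
```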
This PR supports enrich in remote mode. An enrich in remote mode can't
come after another enrich in coordinator mode, an aggregation, or a
limit. While we can't address the first two limitations, we should
remove the constraint with LIMIT. Otherwise, users are forced to write
queries that may not perform well. For instance,
```
FROM test
| ORDER @timestamp
| LIMIT 10
| ENRICH[ccq.mode:remote]
```
does not work. In such cases, users must rewrite it as
```
FROM test
| ENRICH[ccq.mode:remote]
| ORDER @timestamp
| LIMIT 10
```
which is equivalent to bringing all data to the coordinating cluster.
We might consider implementing the actual remote enrich on the
coordinating cluster (like remote field extraction); however, this
requires retaining the originating cluster and restructuring pages for
routing, which might be complicated.
The forbidden apis task always fails when classes are missing. Sometimes
we may want it not to fail, for example when the classes in question are
not accessible, as is the case with preview features. This commit adds
an `ignoreMissingClasses` option to the forbidden apis task and extends
the worker to conditionally set the underlying forbidden apis option
that controls failing on missing classes.
Adds a chunkedInfer() method to the InferenceService interface which
automatically splits long text before sending the inputs to the model.
Chunking is done via a sliding window of length `window_size` with an
overlap of `span`. This change only applies to the ELSER model and to
text embedding models deployed locally in the cluster.
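A conceptual sketch of the windowing, not the actual implementation (the method splits on words here, and all names are illustrative):
```
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

static List<String> chunkWords(String[] words, int windowSize, int span) {
    if (span >= windowSize) {
        throw new IllegalArgumentException("span (overlap) must be smaller than window_size");
    }
    List<String> chunks = new ArrayList<>();
    int step = windowSize - span; // each new window re-reads the last `span` words
    for (int start = 0; start < words.length; start += step) {
        int end = Math.min(start + windowSize, words.length);
        chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
        if (end == words.length) {
            break; // the final window reached the end of the input
        }
    }
    return chunks;
}
```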
We should check and fail if the data node plan and the target indices
are inconsistent. This PR strengthens these checks and adds assertions
to fail hard in tests.
Relates #10480
A predicate to check whether the cluster supports a feature is available
to rest handlers defined in server. This commit makes that predicate
available to plugins that define rest handlers as well.
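A hedged sketch of what consuming such a predicate might look like in a plugin (the feature id and helper method are hypothetical; only the predicate shape is taken from the description above):
```
import java.util.function.Predicate;
import org.elasticsearch.features.NodeFeature;

// Register the endpoint only once every node in the cluster supports the
// feature; "my_plugin.new_endpoint" is illustrative, not a real feature.
static boolean shouldRegisterNewEndpoint(Predicate<NodeFeature> clusterSupportsFeature) {
    return clusterSupportsFeature.test(new NodeFeature("my_plugin.new_endpoint"));
}
```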
* Transforms: Adding basic stats API param
Adds an optional parameter called `basic` to the `transform/_stats`
API. This parameter defaults to false, returning the complete set of
stats (the same functionality as today).
This helps reduce the latency of calling for the stats of one or more
transforms, as the `operationsBehind` calculation becomes increasingly
expensive as the size of the transform and the number of nodes grow.
Users can get a basic view of the current state, health, and progress
of one or more transforms, and a second call for the complete stats set
can be made when users want to drill down into a given transform.
When called with `transform/_stats?basic=true`, Transforms will only
return the subset of stats that is immediately available on the main
node, including `id`, `state`, `node`, `stats`, and `health`.
`checkpointing` may be omitted.
For continuous transforms, `checkpointing` will include the `last`
`checkpoint` id. If there is a difference in data but the transform
has not started on that difference yet, the `next` checkpoint will
be included with the `position` and `progress`.
For stopped transforms, `checkpointing` will include the `last`
`checkpoint` id and the `next` `position` and `progress`.
In both cases, `operationsBehind` will never be calculated, and no
timestamp information will be recorded.
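A hedged example of the new parameter via the low-level Java REST client (the `client` setup is omitted; endpoint and parameter name as described above):
```
// Request only the cheap subset of stats; skips expensive values such as
// operationsBehind.
Request request = new Request("GET", "/_transform/_stats");
request.addParameter("basic", "true");
Response response = client.performRequest(request);
```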
* [Profiling] Read Azure costs data
* Parse the Azure instance_type
* Amend tests for the CostCalculator
* Adjust and comment pre-allocation in InstanceTypeService
* Remove outdated comment in InstanceType
* Amend comments in InstanceType
* Remove 'provider' from costs data
* Add a StopWatch for loading the profiling costs data
* Remove pre-allocation in InstanceTypeService
* InstanceTypeService: Extend the scope of 'log' variable
* Introduce passthrough field type
`PassThroughObjectMapper` extends `ObjectMapper` to create a container
for fields that also need to be referenced as if they were at the root
level. This is done by creating aliases for all its subfields.
It also supports an option to annotate all its subfields as
dimensions. This will be leveraged in TSDB, where dimension fields can
be dynamically defined as nested under a passthrough object and still
referenced directly (i.e. without prefixes) in aggregation queries.
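A hedged mapping sketch (the index name and field names are illustrative, and the exact option name for marking subfields as dimensions is assumed here):
```
Request request = new Request("PUT", "/tsdb-metrics");
request.setJsonEntity("""
    {
      "mappings": {
        "properties": {
          "attributes": {
            "type": "passthrough",
            "time_series_dimension": true,
            "properties": {
              "zone": { "type": "keyword" }
            }
          }
        }
      }
    }
    """);
// "zone" can now be referenced both as attributes.zone and, via the
// generated alias, directly as zone.
```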
Related to #103567
* Update docs/changelog/103648.yaml
* no subobjects
* create dimensions dynamically
* remove unused method
* restore ignoreAbove incompatibility with dimension
* fix test
* refactor, skip aliases on conflict
* fix branch
* fix branch
* add tests
* update test
* remove unused variable
* add yaml test for subobject
* minor refactoring
* add unittest for PassThroughObjectMapper
* suggested fixes
* suggested fixes
* update yaml with warning for duplicate alias
* updates from review
* add withoutMappers()
A Lucene limitation on doc values for UTF-8 fields does not allow us to
write keyword fields whose size is larger than 32K. This limits our
ability to map more than a certain number of dimension fields for time
series indices. Before this change, the tsid is created as a
concatenation of dimension field names and values into a keyword field.
To overcome this limitation we hash the tsid. This PR is intended to be
used as a draft to test different options.
Note that, as a side effect, this reduces the size of the tsid field as
a result of storing far less data when the tsid is hashed. However, we
expect tsid hashing to affect the compression of doc values, resulting
in a larger storage footprint. The effect on query latency needs to be
evaluated too.
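A conceptual sketch of the idea only (this PR is a draft exploring options, so the hash choice and method name here are illustrative):
```
import java.nio.ByteBuffer;
import org.apache.lucene.util.BytesRef;
import org.elasticsearch.common.hash.MurmurHash3;

// Replace the raw concatenation of dimension names/values with a
// fixed-size hash, keeping _tsid well under Lucene's 32K limit.
static BytesRef hashTsid(BytesRef rawTsid) {
    MurmurHash3.Hash128 hash = MurmurHash3.hash128(
        rawTsid.bytes, rawTsid.offset, rawTsid.length, 0, new MurmurHash3.Hash128());
    return new BytesRef(ByteBuffer.allocate(16).putLong(hash.h1).putLong(hash.h2).array());
}
```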
Resolves #93564
A number of aggregations that rely on deferred collection don't work
with the time series index searcher and will produce incorrect results.
These aggregation usages should fail instead. The documentation has been
updated to describe these limitations.
In the case of the multi terms aggregation, depth-first collection is
now forced when a time series aggregation is used. This behaviour is in
line with the terms aggregation.
APM-metrics-based S3 request stats have been available since #102505.
They have collected sufficient data that we can now remove the log-based
stats introduced in #100272.
Resolves: ES-7500
We can DRY up the logic around LeafDoubleFieldData quite a bit. The
ramBytesUsed value passed to the constructor is always `0`, and we can
reuse this class in one more spot. Also, the object -> double path is
the same code we have in `NumberFieldMapper`.
This starts moving the binary compatibility type checking out of the verifier and into the type resolution path. Additionally, as a test case, it wires up the addition tests to use errorForCasesWithoutExamples, which requires the type resolution improvement, and validates that we're doing the right thing for binary addition.
Before we can actually remove the code from the verifier, we need to also adapt the binary comparison functions, which is a fair bit of work. I am planning to do that in a follow-up PR, as this one has already been open for two weeks.
In certain circumstances, if Elasticsearch encounters an error while
starting up, the server CLI may exit without reporting any error. This
commit fixes the CLI to always check and wait on the Elasticsearch
process and exit with the same exit code.
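A minimal sketch of the intended behavior (the process handle and helper are hypothetical):
```
// Always wait for the server process and propagate its exit code, so a
// startup failure can no longer make the CLI exit silently with code 0.
Process serverProcess = startElasticsearchProcess(); // hypothetical helper
int exitCode = serverProcess.waitFor();
System.exit(exitCode);
```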
relates #104055
There is no need to copy here and allocate gigabytes per minute of
additional pointers to the same `byte[]` over and over; the reference
here is, in most if not all relevant cases, backed by an array anyway.