* Use String.replace() instead of replaceAll() for non-regexp replacements
When arguments do not make use of regexp features, replace() is a more efficient option, especially the char variant.
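A minimal sketch of the difference, using a hypothetical `ReplaceExample` class that is not part of this change. Both methods produce the same result, but only the second one goes through the regex engine:

```java
final class ReplaceExample {
    // Char variant: no regex involved, '.' is treated as a literal character.
    static String packageToPath(String packageName) {
        return packageName.replace('.', '/');
    }

    // Regex variant: the '.' has to be escaped, because an unescaped '.'
    // matches any character and would replace the whole string.
    static String packageToPathRegex(String packageName) {
        return packageName.replaceAll("\\.", "/");
    }
}
```

Forgetting the escape in the regex variant is also an easy mistake to make: `"a.b".replaceAll(".", "/")` replaces every character, not just the dot.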
Introducing a plain version of `AbstractRefCounted` as a compromise.
This saves a bunch of allocations and a circular reference to the object
holding the ref counted instance, making smaller SearchHit instances
etc. cheaper. We could get an even more direct solution here by making
these extend `AbstractRefCounted` but that would lose us the ability to
leak-track in tests, so doing it this way (same way Netty does it on
their end) as a compromise.
This is mainly added as a prerequisite to slicing doc sources out of
bulk index requests. The few usages for it added in this PR have limited
performance impact but demonstrate correct functioning of the
implementations.
Co-authored-by: David Turner <david.turner@elastic.co>
Elasticsearch requires access to some native functions. Historically
this has been achieved with the JNA library. However, JNA is a
complicated, magical library, and has caused various problems booting
Elasticsearch over the years. The new Java Foreign Function and Memory
API allows calling native functions directly from Java. It also
has the advantage of tight integration with HotSpot, which can improve
performance of these functions (though performance of Elasticsearch's
native calls has never been much of an issue since they are mostly at
boot time).
This commit adds a new native lib that is internal to Elasticsearch. It
is built to use the Foreign Function API starting with Java 21, and
continue using JNA with Java versions below that.
Only one function, checking whether Elasticsearch is running as root, is
migrated. Future changes will migrate other native functions.
Preallocate opens a FileInputStream in order to get a native file
descriptor to pass to native functions. However, getting at the file
descriptor requires breaking modular access. This commit adds native
posix functions for opening/closing and retrieving stats on a file in
order to avoid requiring additional permissions.
Qualified exports in the boot layer only work when they are to other boot
modules. Yet Elasticsearch has dynamically loaded modules, such as plugins.
For this purpose we have ModuleQualifiedExportsService. This commit
moves loading of ModuleQualifiedExportsService instances in the boot layer
into core so that it can be reused by ProviderLocator when a qualified
export applies to an embedded module.
There's no need for this helper to take more than one argument. Almost
all the usages only passed in a single argument, and the few cases that
supplied more than one can be rewritten as a single argument to save
allocating all those extra lambdas.
x-content embeds its jackson implementation inside its jar. This commit
formalizes the setup for this embedding with a gradle plugin so that it
can be reused by other libs.
It's less code and it actually inlines (avoiding virtual calls in most
cases) to just do the null check here instead of delegating to IOUtils
and then catching the impossible IOException. Also, no need to use
`Releasables` in 2 spots where try-with-resources works as well and
needs less code.
Noticed this when I saw that we had a lot of strange CPU overhead in
this call in some hot loops like translog writes.
Our readEnum code instantiates/clones enum value arrays on read.
Normally, this doesn't matter much but the two spots adjusted here are
visibly hot during bulk indexing, causing GBs of allocations during e.g.
the http_logs indexing run.
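A sketch of the general pattern, assuming a hypothetical `OpType` enum and `fromOrdinal` helper (these are illustrative names, not the code touched here). `values()` returns a fresh copy of the backing array on every call, so caching it once in a static field avoids an allocation per read:

```java
enum OpType {
    INDEX, CREATE, UPDATE, DELETE;

    // values() clones the array on each call; keep one private copy instead.
    private static final OpType[] VALUES = values();

    // Look up a constant by ordinal without allocating.
    static OpType fromOrdinal(int ordinal) {
        if (ordinal < 0 || ordinal >= VALUES.length) {
            throw new IllegalArgumentException("unknown OpType ordinal [" + ordinal + "]");
        }
        return VALUES[ordinal];
    }
}
```

The cached array must stay private and must never be handed out, since callers could otherwise mutate it.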
This commit adds an elasticsearch gradle plugin which sets up a project
to build an MRJAR. By applying the plugin, the src dir is checked for
any directories matching mainXX where XX is the java version number.
A source set is automatically set up, an appropriate compiler is tied to
that source set, and its output is placed in the correct part of the final
jar. Additionally, the source set allows use of preview features in that
version of Java, and the preview bits are stripped from the resulting
class files.
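To illustrate the naming convention only, here is a small sketch with a hypothetical `MrjarSourceDirs` helper. It is not the plugin's implementation. It assumes directory names of the form `mainXX` and relies on the standard multi-release JAR layout, where version-specific classes live under `META-INF/versions/XX/`:

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MrjarSourceDirs {
    // "main" followed by one to three digits, e.g. main21.
    private static final Pattern MAIN_VERSIONED = Pattern.compile("main(\\d{1,3})");

    // Returns the Java version encoded in the directory name, if any.
    static OptionalInt javaVersion(String dirName) {
        Matcher matcher = MAIN_VERSIONED.matcher(dirName);
        return matcher.matches() ? OptionalInt.of(Integer.parseInt(matcher.group(1))) : OptionalInt.empty();
    }

    // Location inside the jar where classes for that version belong.
    static String jarLocation(int javaVersion) {
        return "META-INF/versions/" + javaVersion + "/";
    }
}
```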
* Add extract match ranges functionality to Grok.
* TestGrokPatternAction and Request
* TestGrokPattern response
* Update docs/changelog/104394.yaml
* Polish validation error message
* Improve test_grok_pattern API
* Add explicit CharSet
* Add endpoint to operator constants
* Add TransportTestGrokPatternActionTests
* REST API spec
* One more TransportTestGrokPatternActionTest
* Fix API spec
* Refactor REST API spec
* Polish code
* Replace TransportTestGrokPatternActionTests by a YAML REST test
* Add ecs_compatibility
* Always return arrays in the API
* Documentation
* YAML test for ecs_compatibility
* Rename doc file
* serverless scope
* Fix docs (hopefully)
* Update docs/reference/rest-api/index.asciidoc
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Add "text structure APIs" header in docs TOC
* Move file
* Remove test grok from main index
* typo
* Nested APIs underneath text structure
---------
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
Remove the rough limit on string length from Jackson 2.15. The limit was already relaxed for JSON in #96031, this extends that change to other XContent types.
Refs: #96031
Fixes: #104009
This is needed for the search response pooling work. Also, the one usage
in `ReleasableBytesReference` actually makes outright sense. We
shouldn't be ref-counting on a global constant, that just needlessly
introduces contention that isn't entirely obvious. This change required
a couple tests to be adjusted where we were checking release mechanics
on noop instances.
Today we `decRef()` the per-node responses just after adding them to the
`responses` collection, but in fact we should keep them alive until
we've constructed the final response. This commit does that.
JDK 22 may return a console even if the terminal is redirected. These cases are detected using the new Console#isTerminal() to maintain the current behavior (closes #98033).
We're leaking quite a few of these parsers. That doesn't seem to be much
of a problem but results in some memory inefficiencies in Jackson here
and there. This PR bulk fixes a bunch of instances that I could easily
automatically fix. I'll open a follow-up for closing the parser on the
document parsing context which also suffers from this but is non-trivial
to fix.
In several places we acquire a ref to a resource that we are certain is
not closed, so this commit adds a utility for asserting this to be the
case. This also helps a little with mocks since boolean methods like
`tryIncRef()` return `false` on mock objects by default, but void
methods like `mustIncRef()` default to being a no-op.
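A minimal sketch of the idea, using a hypothetical `SimpleRefCounted` class rather than the actual Elasticsearch implementation. `tryIncRef()` reports failure through its return value, while `mustIncRef()` turns a failure into an exception, so callers that know the resource is open do not need to check a boolean:

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SimpleRefCounted {
    // Starts at 1: the creator holds the initial reference.
    private final AtomicInteger refs = new AtomicInteger(1);

    // Acquires a reference unless the count has already dropped to zero.
    boolean tryIncRef() {
        while (true) {
            int current = refs.get();
            if (current <= 0) {
                return false;
            }
            if (refs.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    // Acquires a reference and fails loudly if the resource is already closed.
    void mustIncRef() {
        if (tryIncRef() == false) {
            throw new IllegalStateException("resource is already closed");
        }
    }

    // Releases a reference; returns true if this was the last one.
    boolean decRef() {
        int remaining = refs.decrementAndGet();
        if (remaining < 0) {
            throw new IllegalStateException("released more often than acquired");
        }
        return remaining == 0;
    }
}
```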
This commit upgrades the Bouncy Castle jars. Bouncy Castle is used for
some internal build concerns as well as a command line application.
Most notably Bouncy Castle is also used as the FIPS-certified JCE/JSSE provider
we use to test our ability to use a FIPS-compliant crypto provider.
The following changes here are a result of the upgraded Bouncy Castle jars:
* TLSv1.3 is now supported when running in FIPS mode
* RSA PKCS#1 v1.5 is no longer allowed in FIPS mode
* Triple DES (3DES) is no longer allowed in FIPS mode
* Minor updates to the security manager configuration used to test FIPS (to read permissions from the security provider)
* Minor adjustments to tests to accommodate the above changes.
* Minor adjustments to the gradle build to accommodate new dependencies
Note - update to the documentation will come in a later commit.
This allows replacing deep copying of blocks with a simple call to
Block::incRef - the block then has to be closed (or decRefed) one
additional time for each call to incRef (and tryIncRef, if successful).
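The contract can be sketched as follows. `SharedBlock` is a hypothetical stand-in and not the actual `Block` interface. Each `incRef()` must be paired with one extra `close()`, and the underlying storage is only released by the final `close()`:

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SharedBlock implements AutoCloseable {
    // Starts at 1 for the creator of the block.
    private final AtomicInteger refs = new AtomicInteger(1);
    private volatile int[] values;

    SharedBlock(int[] values) {
        this.values = values;
    }

    // Shares the block instead of copying it; fails if it was already released.
    void incRef() {
        int previous = refs.getAndUpdate(count -> count > 0 ? count + 1 : count);
        if (previous <= 0) {
            throw new IllegalStateException("block is already released");
        }
    }

    // Drops one reference; the last close releases the storage.
    @Override
    public void close() {
        int remaining = refs.decrementAndGet();
        if (remaining < 0) {
            throw new IllegalStateException("block closed too many times");
        }
        if (remaining == 0) {
            values = null;
        }
    }

    boolean isReleased() {
        return refs.get() <= 0;
    }

    int get(int index) {
        int[] current = values;
        if (current == null) {
            throw new IllegalStateException("block is already released");
        }
        return current[index];
    }
}
```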
* Add static assertRemoveBeforeV9()
* Switch from assertion to annotation + some usage examples
* Fixup ReplicationTracker
* Update name, use annotation on fields
---------
Co-authored-by: David Turner <david.turner@elastic.co>
* Avoid "this-escape" by making classes final
The "this-escape" compiler warning is intended to alert
developers to potential bugs in object initialization due to
subclassing. This class of bugs cannot occur when a class is
final. Here, we take cases where a class has no implementations
but generates a "this-escape" warning, and we make those
classes final rather than suppressing the compiler warning.
This makes the remaining suppressions more meaningful, since
they now indicate places where we may want to look for
initialization bugs.
In a few cases, making a class final meant changing some of its
protected fields and methods to private or default
accessibility.
Some classes with no implementations are mocked in testing.
Since making those classes final would involve non-trivial
rewrites of tests, I've left them alone.
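A small sketch of the class of bug the warning is aimed at, using hypothetical classes (`EscapingBase`, `EscapingChild`, `FinalService`). Whether javac actually reports "this-escape" for a given class depends on additional conditions such as visibility, so treat this as an illustration of the underlying problem only:

```java
import java.util.ArrayList;
import java.util.List;

class EscapingBase {
    EscapingBase() {
        // 'this' escapes to an overridable method before subclasses are initialized.
        init();
    }

    void init() {}
}

class EscapingChild extends EscapingBase {
    private final List<String> items = new ArrayList<>();
    // No initializer, so the value assigned while EscapingBase() runs is kept.
    boolean sawNullItems;

    @Override
    void init() {
        // Runs inside EscapingBase(), before the 'items' initializer has executed.
        sawNullItems = (items == null);
    }
}

// A final class cannot be subclassed, so no override can observe a
// partially constructed instance through the constructor call.
final class FinalService {
    private final List<String> items = new ArrayList<>();

    FinalService() {
        init();
    }

    void init() {
        items.add("ready");
    }

    int size() {
        return items.size();
    }
}
```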
* Spotless, remove redundant modifiers, clean up "protected" usage
* Revert a few more mocked classes
The only reason this method is throwing an exception is because the
method ByteArrayOutputStream#close() is declaring it although it is a
noop. Therefore it can be safely ignored.
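For illustration, a sketch with a hypothetical `InMemoryBytes` helper (not the method changed here). The only checked exception in this block comes from the `close()` call implied by try-with-resources, which does nothing for `ByteArrayOutputStream`:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class InMemoryBytes {
    static byte[] utf8(String value) {
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            // writeBytes does not declare IOException, unlike OutputStream#write(byte[]).
            out.writeBytes(value.getBytes(StandardCharsets.UTF_8));
            return out.toByteArray();
        } catch (IOException e) {
            // Only close() declares this, and it is a no-op for in-memory streams.
            throw new AssertionError("ByteArrayOutputStream#close() cannot fail", e);
        }
    }
}
```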
Thanks @romseygeek for bringing this to my attention.
Another round of automated fixes to this, marking things that can be
made static as static. Saves some JIT cycles but also turns some lambdas
from capturing to non-capturing and makes the "utilityness" of some
classes visible.
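A sketch of the capturing versus non-capturing distinction, using a hypothetical `Scaler` class. A lambda that calls an instance method has to hold on to `this`, while a lambda that only calls static methods captures nothing:

```java
import java.util.function.IntUnaryOperator;

final class Scaler {
    // Instance method that never touches instance state.
    int twiceInstance(int x) {
        return x * 2;
    }

    // Same logic as a static method.
    static int twice(int x) {
        return x * 2;
    }

    // Capturing: the lambda implicitly references 'this' to call twiceInstance.
    IntUnaryOperator capturing() {
        return x -> twiceInstance(x);
    }

    // Non-capturing: nothing from the enclosing scope is referenced.
    static IntUnaryOperator nonCapturing() {
        return x -> twice(x);
    }
}
```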
Adds @SuppressWarnings("this-escape") to all necessary places so that
Elasticsearch can compile with -Werror on JDK21
No investigation has been done to determine whether any of the cases
are a potential source of errors - we have simply suppressed all
existing occurrences.
Resolves: #99845
* Use long in Centroid count
Centroids currently use integers to track how many samples their mean
tracks. This can overflow in case the digest tracks billions of samples
or more.
TDigestState already serializes the count as VLong, so it can be read as
VInt without compatibility issues.
Fixes #80153
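The overflow itself is plain Java integer arithmetic. A minimal sketch, with a hypothetical `CentroidCount` helper that is not the actual Centroid class:

```java
final class CentroidCount {
    // int arithmetic silently wraps around once the count passes Integer.MAX_VALUE.
    static int addAsInt(int count, int delta) {
        return count + delta;
    }

    // The same addition in long arithmetic keeps the correct value.
    static long addAsLong(long count, int delta) {
        return count + delta;
    }
}
```

With these definitions, `addAsInt(Integer.MAX_VALUE, 1)` yields `Integer.MIN_VALUE`, while `addAsLong(Integer.MAX_VALUE, 1)` yields `2147483648L`.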
* Update docs/changelog/99491.yaml
* More test fixes
* Bump TransportVersion
* Revert TransportVersion change
Jackson has a direct method for writing string arrays
that saves us some of the indirection we have when looping
over a string array. This normally doesn't gain much, but for extreme
cases like long index name lists in field caps it saves a couple percent
in CPU time.
This commit fixes a jarhell test to create an unnamed temp dir, instead
of the existing creation which uses the test method name. The reason
this causes problems is when running with many iterations, the test
method name is artificially adjusted to include seed information, using
special characters that are potentially invalid path characters.
closes #98949
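The same idea expressed with plain JDK calls, as a sketch. `TempDirs` is a hypothetical helper, and the actual test relies on the test framework's own temp dir support rather than this code. Passing a `null` prefix lets the JDK choose the entire directory name, so no test-derived text ends up in the path:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class TempDirs {
    // Creates a temp directory whose name contains no caller-supplied text.
    static Path unnamed() {
        try {
            return Files.createTempDirectory(null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```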
When ingesting documents that contain nested objects while the mapping
property `subobjects` is set to false, instead of throwing a mapping
exception and dropping the document(s), we now map only the leaf
field(s), using their full dot-separated path as the field name.
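The resulting field names can be pictured with a small sketch. `DottedFields` is a hypothetical helper operating on plain maps and is not the document parser's code. Nested objects are walked recursively and only leaf values are kept, keyed by their full dotted path:

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DottedFields {
    // Flattens nested maps into leaf entries keyed by their dot-separated path.
    static Map<String, Object> flatten(Map<String, Object> source) {
        Map<String, Object> leaves = new LinkedHashMap<>();
        collect("", source, leaves);
        return leaves;
    }

    private static void collect(String prefix, Map<?, ?> object, Map<String, Object> leaves) {
        for (Map.Entry<?, ?> entry : object.entrySet()) {
            String path = prefix.isEmpty() ? String.valueOf(entry.getKey()) : prefix + "." + entry.getKey();
            if (entry.getValue() instanceof Map<?, ?> nested) {
                // Not a leaf: descend and extend the path.
                collect(path, nested, leaves);
            } else {
                leaves.put(path, entry.getValue());
            }
        }
    }
}
```

For example, `{"host": {"name": "web-1", "os": {"family": "linux"}}}` ends up as the two leaf fields `host.name` and `host.os.family`.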
Lots of spots where we did weird things around streams: redundant stream creation, redundant collecting
before adding all the collected elements to another collection, redundant streams for joining strings,
using the less efficient `Collectors.toList`, and in a few cases incorrectly relying on the result being mutable.
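A sketch of the kinds of cleanup described, using a hypothetical `StreamCleanups` class with illustrative methods (none of these are the actual call sites):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class StreamCleanups {
    // Before: a stream is created only to join strings.
    static String joinWithStream(List<String> names) {
        return names.stream().collect(Collectors.joining(","));
    }

    // After: String.join does the same without a stream.
    static String join(List<String> names) {
        return String.join(",", names);
    }

    // Stream.toList() returns an unmodifiable list, which makes the contract explicit.
    static List<String> upperCased(List<String> names) {
        return names.stream().map(name -> name.toUpperCase(Locale.ROOT)).toList();
    }

    // When the caller needs to mutate the result, ask for a mutable list explicitly.
    static List<String> upperCasedMutable(List<String> names) {
        return names.stream().map(name -> name.toUpperCase(Locale.ROOT)).collect(Collectors.toCollection(ArrayList::new));
    }
}
```

`Collectors.toList()` makes no guarantee about the mutability of the returned list, which is why code that appends to its result is relying on an implementation detail.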
This commit updates the plugin cli and scanner components to use ASM 9.5.
The update is required to successfully test with JDK 21. Tests in this component programmatically run the Java source compiler, which generates class files with major version 65, and then try to parse those generated class files. Without this change, the tests fail with java.lang.IllegalArgumentException: Unsupported class file major version 65.
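For context, the major version is stored in the class file header, and 65 corresponds to Java 21. A minimal sketch of reading it, with a hypothetical `ClassFileVersion` helper that is unrelated to ASM's own parsing code:

```java
import java.nio.ByteBuffer;

final class ClassFileVersion {
    // Header layout: u4 magic (0xCAFEBABE), u2 minor_version, u2 major_version.
    static int majorVersion(byte[] classFile) {
        ByteBuffer header = ByteBuffer.wrap(classFile); // big-endian by default
        if (header.remaining() < 8 || header.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        header.getShort(); // skip minor_version
        return header.getShort() & 0xFFFF;
    }

    // Class file major versions are offset by 44 from the Java release number.
    static int javaRelease(int majorVersion) {
        return majorVersion - 44;
    }
}
```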
This PR introduces downsampling configuration to the data stream lifecycle. Note that the downsampling implementation will come in a follow-up PR. Configuration looks like this:
```
{
  "lifecycle": {
    "data_retention": "90d",
    "downsampling": [
      { "after": "1d", "fixed_interval": "2h" },
      { "after": "15d", "fixed_interval": "1d" },
      { "after": "30d", "fixed_interval": "1w" }
    ]
  }
}
```
We will also support using `null` to unset downsampling configuration during template composition:
```
{
  "lifecycle": {
    "data_retention": "90d",
    "downsampling": null
  }
}
```
* Skip SortingDigest when merging a large digest in HybridDigest.
This is a small performance optimization that avoids creating an
intermediate SortingDigest when merging a digest tracking many samples.
The current behavior is to keep adding values to SortingDigest until we
cross the threshold for switching to MergingDigest, at which point we
copy all values from SortingDigest to MergingDigest and release the
former.
As a side cleanup, remove the methods for adding a list of digests. They're
not used anywhere and they can be tricky to get right - the current
implementation for HybridDigest is buggy.
* Update docs/changelog/97099.yaml