Commit graph

644 commits

Author SHA1 Message Date
Simon Cooper
4aa33409a5
Backport systemd unreadable library path fix (#109419)
* Guard systemd library lookup from unreadable directories (#108931)

When scanning the library path we may come across directories that are
unreadable. If that happens, the recursive walk of the library path
directories will throw a fatal IOException. This commit guards the walk
of the library paths to first check for readability of each directory we
are about to traverse.
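
A minimal sketch of the guard described above, assuming a hypothetical helper named `ReadableDirWalk` (the class and method names are illustrative and not taken from the Elasticsearch source). Each directory is checked for readability before it is listed, so an unreadable directory is skipped instead of aborting the whole walk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the Elasticsearch implementation.
final class ReadableDirWalk {

    // Collects non-directory entries under root whose file name starts with namePrefix.
    static List<Path> findFiles(Path root, String namePrefix) {
        List<Path> found = new ArrayList<>();
        walk(root, namePrefix, found);
        return found;
    }

    private static void walk(Path dir, String namePrefix, List<Path> found) {
        // The guard: silently skip anything that is not a readable directory.
        if (!Files.isDirectory(dir) || !Files.isReadable(dir)) {
            return;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                // Do not follow symlinked directories, to avoid cycles.
                if (Files.isDirectory(entry, LinkOption.NOFOLLOW_LINKS)) {
                    walk(entry, namePrefix, found);
                } else if (entry.getFileName().toString().startsWith(namePrefix)) {
                    found.add(entry);
                }
            }
        } catch (IOException e) {
            // Still possible if permissions change between the check and the listing.
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as `ReadableDirWalk.findFiles(Path.of("/usr/lib"), "libsystemd.")` would return the candidate files found in readable directories only.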

---------

Co-authored-by: Ryan Ernst <ryan@iernst.net>
2024-06-06 13:05:13 +01:00
Chris Hegarty
f83f0bceec
[8.14] Fix multithreading copies in lib vec (#108802) (#108810)
Backport of: #108802
2024-05-19 12:15:14 -04:00
Jim Ferenczi
d0a388d13a
Fix integer overflow in native scalar quantizer (#108493) (#108535)
Offsets in memory segments should be computed as longs to avoid integer overflow on large segments.
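
The overflow can be illustrated with a small sketch. The class and parameter names below are assumptions made for the example and do not come from the native scalar quantizer code; the point is only that the multiplication must be widened to `long` before it happens.

```java
// Hypothetical helper for illustration; names are not from the Elasticsearch source.
final class SegmentOffsets {

    // Widen to long before multiplying so the product cannot wrap around.
    static long offsetOf(int vectorOrdinal, int bytesPerVector) {
        return (long) vectorOrdinal * bytesPerVector;
    }

    // The buggy form: the multiplication is done in 32-bit int arithmetic
    // and only the already-overflowed result is widened to long.
    static long overflowingOffsetOf(int vectorOrdinal, int bytesPerVector) {
        return vectorOrdinal * bytesPerVector;
    }
}
```

With 3,000,000 vectors of 1,024 bytes each, the first method yields 3,072,000,000 while the second wraps to a negative value.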

---------

Co-authored-by: ChrisHegarty <chegar999@gmail.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-05-12 04:09:32 -04:00
Andrew Wilkins
733620d66a
nativeaccess: try to load all located libsystemds (#108238) (#108428)
Linux systems with multiarch (e.g. i386 & x86_64) libraries
may have libsystemd.so.0 in two subdirectories of an entry in
java.library.path. For example, libsystemd.so.0 may be found
in both /usr/lib/i386-linux-gnu and /usr/lib/x86_64-linux-gnu.

Instead of attempting to load just one of the libraries found, attempt
each in turn and stop as soon as one is successfully loaded.
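
The "try each candidate, stop at the first success" strategy might look roughly like the sketch below. `FirstLoadable` and `loadFirst` are hypothetical names; the loader is passed in so the example does not depend on any real native library being present.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not the nativeaccess implementation.
final class FirstLoadable {

    // Tries each candidate path in order and returns the first one the loader accepts.
    static Optional<String> loadFirst(List<String> candidates, Consumer<String> loader) {
        for (String candidate : candidates) {
            try {
                loader.accept(candidate);
                return Optional.of(candidate);
            } catch (UnsatisfiedLinkError e) {
                // Wrong architecture or otherwise unloadable: fall through to the next one.
            }
        }
        return Optional.empty();
    }
}
```

In real use the loader could be `System::load`, which throws `UnsatisfiedLinkError` when a library cannot be loaded.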
2024-05-08 15:02:15 -04:00
Ryan Ernst
dacc0dbce1
Use direct method mapping for zstd (#108172) (#108205)
JNA supports two types of mapping to native methods, proxying and direct
method mapping. Proxying is nicer for unit testing, but unfortunately
the proxied methods are lazily loaded. NativeAccess expects that methods
are linked during static init, before SecurityManager is initialized.
For any native methods called after security manager init, the proxied
method will fail.

This commit changes the zstd bindings to use direct method mapping so
that calling zstd methods does not fail when using JNA (pre Java 21).

closes #107504
closes #107770
2024-05-02 12:51:17 -04:00
Chris Hegarty
6b52d7837b
Add an optimised int8 vector distance function for aarch64. (#106133)
This commit adds an optimised int8 vector distance implementation for aarch64. Additional platforms, such as x64, will be added as a follow-up.

The vector distance implementation outperforms Lucene's Panama Vector implementation for binary comparisons by approx 5x (depending on the number of dimensions). It does so by means of compiler intrinsics built into a separate native library and linked through Panama's FFI. Comparisons are performed on off-heap mmap'ed vector data.
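
For reference, the quantities being computed can be written as a plain scalar loop in Java. This is only a baseline sketch of int8 dot product and squared distance with assumed names; it is not the native implementation from this commit and does not use intrinsics, FFI, or off-heap data.

```java
// Scalar reference sketch; hypothetical names, not the native library's API.
final class Int8VectorMath {

    // Dot product of two signed 8-bit vectors, accumulated in an int.
    static int dotProduct(byte[] a, byte[] b) {
        checkSameLength(a, b);
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i]; // bytes are promoted to int before multiplying
        }
        return sum;
    }

    // Squared Euclidean distance between two signed 8-bit vectors.
    static int squareDistance(byte[] a, byte[] b) {
        checkSameLength(a, b);
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            int diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    private static void checkSameLength(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector dimensions differ: " + a.length + " != " + b.length);
        }
    }
}
```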

The implementation is currently only used during merging of scalar quantized segments, through a custom format ES814HnswScalarQuantizedVectorsFormat, but its usage will likely be expanded over time.

Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
Co-authored-by: Lorenzo Dematté <lorenzo.dematte@elastic.co>
Co-authored-by: Mark Vieira <portugee@gmail.com>
Co-authored-by: Ryan Ernst <ryan@iernst.net>
2024-04-12 08:44:21 +01:00
Ryan Ernst
96230f7a7d
Use CloseableByteBuffer in compress/decompress signatures (#106724)
CloseableByteBuffer is backed by native memory segments, but the
interfaces for compress and decompress methods of zstd take ByteBuffer.
Although both Jna and the Jdk can deal with turning the native
ByteBuffer back into an address to pass to the native method, the jdk
may have a more significant cost to that action.

This commit changes the signature of compress and decompress to take in
CloseableByteBuffer so that each implementation can do its own
unwrapping to get the appropriate native address.

relates #103374
2024-03-25 14:30:27 -04:00
Ryan Ernst
2196576aed
Use confined arena for CloseableByteBuffer (#106723)
The jdk implementation of CloseableByteBuffer currently uses a shared
arena. The assumption was that a buffer might be shared across threads.
However, in practice for compression/decompression that is not true, and
the shared arena has a noticeable impact on deallocation when the buffer
is closed. This commit switches to a confined arena, limiting buffer
creation and compress/decompress calls to a single thread.

relates #103374
2024-03-25 13:26:45 -04:00
Ryan Ernst
55c3357c81
Move common mrjar forbidden apis configuration to plugin (#106385)
Since mrjars may use preview apis, forbidden apis must know about any
preview apis from the jdk. However, we do not run forbidden apis with
the preview enabled flag, nor in a separate jvm, so it does not know
about these classes. Thus we ignore missing classes on source sets added
by the mrjar plugin.

This commit configures all sourcesets added by mrjar plugin to ignore
forbidden apis missing classes.
2024-03-19 16:40:52 -04:00
Ryan Ernst
444866aec9
Set explicit directory and file permissions on native libraries (#106505)
The distributions already have correct permissions set on native
libraries copied to them. However, the build itself to extract the
native libs relies on the upstream file permissions. This commit sets
explicit permissions on the copy task which extracts native libraries.
2024-03-19 15:51:57 -04:00
Ryan Ernst
6731538bbe
Different string allocation on jdk 21/22 (#106492)
Similar to https://github.com/elastic/elasticsearch/pull/106360, the
methods for allocating a native string changed between Java 21 and 22.
This commit adds another util method to handle the differences and uses
it in the jdk systemd impl.
2024-03-19 13:09:18 -04:00
Ryan Ernst
1d7a0159d3
Support jdk22 in zstd bindings (#106360)
The foreign memory API changed between Java 21 and 22 in how to decode a
string from native memory. This commit adds a multi-release class to
handle the two different methods on MemorySegment to decode a string.
2024-03-14 10:30:14 -07:00
Rene Groeschke
0f9ebf268f
Mute ZstdTests (#106348)
failing on jdk22
2024-03-14 10:56:10 +01:00
Ryan Ernst
405b88b882
Add zstd to native access (#105715)
This commit makes zstd compression available to Elasticsearch. The
library is pulled in through maven in jar files for each platform, then
bundled in a new platform directory under lib. Access to the zstd
compression/decompression is through NativeAccess.
2024-03-13 09:45:12 -07:00
Ryan Ernst
10dcb8e8bd
Add systemd native access (#106151)
This commit moves systemd access to the NativeAccess lib.

relates #104876
2024-03-12 07:35:02 -07:00
Daniel Mitterdorfer
890bd4b8a5
Consider context in raw serialization (#106163)
With this commit we use `writeRawValue` instead of `writeRaw` when
serializing raw strings as XContent. The latter method does not consider
context (e.g. is the value being written as part of an array and
requires a comma separator?) whereas the former does. This ensures that
pre-rendered double values as we use them in the flamegraph response are
rendered correctly as XContent.

Closes #106103
2024-03-11 13:48:12 +01:00
Simon Cooper
1b8baf1cf8
Convert most uses of BaseMatcher to TypeSafeMatcher (#105764)
2024-03-11 09:12:42 +00:00
Ryan Ernst
83585315fe
Only apply build to direct libs (#106101)
Sometimes libs have subprojects that may not be java projects. This commit adjusts the shared
configuration for libs to only affect direct subprojects of :lib.
2024-03-08 13:48:26 -08:00
Ryan Ernst
ef680c9200
Remove limitation on cross lib dependencies (#106099)
Libs were meant to be a way to break up code from server without
creating full fledged modules. They still exist on the system classpath,
but we did not want to introduce a spaghetti of jars depending on each
other. The check that ensures libs don't depend on each other was added
before Elasticsearch was modularized. Since it now runs modular, the
cross module dependencies are easy to visualize with module-info, and
the module system protects us from circular deps. Additionally, the
number of exceptions to the no-cross-lib-deps rule has grown
considerably.

Given all of the above, the check on cross lib dependencies no longer
provides much benefit, and is more of a hindrance. This commit removes
the check.
2024-03-07 17:22:39 -08:00
Daniel Mitterdorfer
7179c12b24
[Profiling] Speed up serialization of flamegraph (#105779)
The response of the flamegraph is quite large: A typical response can
easily reach 50MB (uncompressed). In order to reduce memory pressure and
also to start sending the response sooner, we chunk the response.
However, this leads to many chunks that are very small and lead to high
overhead. In our experiments, just the serialization takes more than
500ms.

With this commit we take the following measures:

1. We split the response into chunks only when it makes sense and
   otherwise send one larger chunk.
2. Serialization of doubles is very expensive: Just the serialization of
   annual CO2 tons takes around 80ms in our test setup. Therefore, we
   apply a custom serialization that is both faster than the builtin
   serialization and reduces the amount of bytes sent over the wire
   because we round to four decimal places (which is more than
   sufficient for our purposes).
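
One way to round a double to four decimal places while writing it is sketched below. This is an assumption about how such a serializer could work, with hypothetical names; the commit's actual implementation is not shown here. The sketch scales the value to a `long`, then emits the integer and fractional digits directly, dropping trailing zeros.

```java
// Hypothetical sketch; not the serializer used by the flamegraph response.
final class CompactDoubles {

    // Formats value rounded to at most four decimal places, without trailing zeros.
    static String format(double value) {
        // Also rejects NaN and infinities, because the comparison is false for them.
        if (!(Math.abs(value) < 1e14)) {
            throw new IllegalArgumentException("value out of supported range: " + value);
        }
        long scaled = Math.round(value * 10_000d);
        StringBuilder out = new StringBuilder();
        if (scaled < 0) {
            out.append('-');
            scaled = -scaled;
        }
        out.append(scaled / 10_000);
        long fraction = scaled % 10_000;
        if (fraction != 0) {
            out.append('.');
            String digits = Long.toString(fraction);
            // Left-pad the fraction to four digits, e.g. 12 becomes 0012.
            for (int i = digits.length(); i < 4; i++) {
                out.append('0');
            }
            // Drop trailing zeros, e.g. 2500 becomes 25.
            int end = digits.length();
            while (end > 0 && digits.charAt(end - 1) == '0') {
                end--;
            }
            out.append(digits, 0, end);
        }
        return out.toString();
    }
}
```

For example, `CompactDoubles.format(1.23456)` yields `1.2346` and `CompactDoubles.format(3.0)` yields `3`.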
2024-03-07 15:31:02 +01:00
Armin Braun
fc8e2b7897
Introduce Predicate Utilities for always true/false use-cases (#105881)
Just a suggestion. I think this would save us a bit of memory here and
there. We have loads of places where the always true lambdas are used
with `Predicate.or/and`. Found this initially when looking into field
caps performance where we used to heavily compose these but many spots
in security and index name resolution gain from these predicates.
The better toString also helps in some cases at least when debugging.
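
A rough sketch of what such utilities could look like follows. The class name `ConstantPredicates` and its details are assumptions for illustration, not the code added by this commit. The singletons override `and`, `or`, and `negate` so that composing with them allocates no wrapper lambda, and they provide a readable `toString`.

```java
import java.util.Objects;
import java.util.function.Predicate;

// Hypothetical sketch of always-true / always-false predicate singletons.
final class ConstantPredicates {

    private static final Predicate<Object> ALWAYS = new Predicate<Object>() {
        @Override public boolean test(Object o) { return true; }
        @SuppressWarnings("unchecked")
        @Override public Predicate<Object> and(Predicate<? super Object> other) {
            return (Predicate<Object>) Objects.requireNonNull(other); // true && x == x
        }
        @Override public Predicate<Object> or(Predicate<? super Object> other) {
            Objects.requireNonNull(other);
            return this; // true || x == true
        }
        @Override public Predicate<Object> negate() { return NEVER; }
        @Override public String toString() { return "Predicate[true]"; }
    };

    private static final Predicate<Object> NEVER = new Predicate<Object>() {
        @Override public boolean test(Object o) { return false; }
        @Override public Predicate<Object> and(Predicate<? super Object> other) {
            Objects.requireNonNull(other);
            return this; // false && x == false
        }
        @SuppressWarnings("unchecked")
        @Override public Predicate<Object> or(Predicate<? super Object> other) {
            return (Predicate<Object>) Objects.requireNonNull(other); // false || x == x
        }
        @Override public Predicate<Object> negate() { return ALWAYS; }
        @Override public String toString() { return "Predicate[false]"; }
    };

    @SuppressWarnings("unchecked")
    static <T> Predicate<T> always() { return (Predicate<T>) ALWAYS; }

    @SuppressWarnings("unchecked")
    static <T> Predicate<T> never() { return (Predicate<T>) NEVER; }
}
```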
2024-03-04 14:01:21 +01:00
Dmitry Cherniachenko
e21a4874ab
Use String.replace() instead of replaceAll() for non-regexp replacements (#105127)
* Use String.replace() instead of replaceAll() for non-regexp replacements

When arguments do not make use of regexp features replace() is a more efficient option, especially the char-variant.
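
The difference can be seen in a short example (the class name is made up for illustration). `String.replace` treats its arguments literally, whereas `String.replaceAll` compiles the first argument as a regular expression, which is both slower and easy to get wrong for characters such as `.`.

```java
// Illustration only; the class and method names are hypothetical.
final class LiteralReplace {

    // Literal, char-based replacement: no regex involved.
    static String dotsToDashes(String s) {
        return s.replace('.', '-');
    }

    // Regex replacement with an unescaped dot: "." matches every character.
    static String regexDotsToDashes(String s) {
        return s.replaceAll(".", "-");
    }

    // Regex replacement with the dot escaped: same result as replace(), more work.
    static String escapedRegexDotsToDashes(String s) {
        return s.replaceAll("\\.", "-");
    }
}
```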
2024-02-12 13:11:15 -05:00
Armin Braun
842915701d
Faster ref-count logic for when ref-counted object does not escape (#105338)
Introducing a plain version of `AbstractRefCounted` as a compromise.
This saves a bunch of allocations and a circular reference to the object
holding the ref counted instance, making smaller SearchHit instances
etc. cheaper. We could get an even more direct solution here by making
these extend `AbstractRefCounted` but that would lose us the ability to
leak-track in tests, so doing it this way (same way Netty does it on
their end) as a compromise.
2024-02-10 14:50:12 +01:00
Ryan Ernst
dd51f6b187
Add classpath based SPI for jna native provider (#105320)
Native lib provider is normally run modular, but since tests run on the
classpath it also needs to work with old style SPI.
2024-02-08 17:31:57 -08:00
Armin Braun
cf27a501aa
Introduce StreamInput.readSlicedBytesReference (#105262)
This is mainly added as a prerequisite to slicing doc sources out of
bulk index requests. The few usages for it added in this PR have limited
performance impact but demonstrate correct functioning of the
implementations.

Co-authored-by: David Turner <david.turner@elastic.co>
2024-02-09 02:30:30 +01:00
Dmitry Cherniachenko
263ea5e987
Replace generic HashSet / HashMap with more efficient EnumSet / EnumMap (#105238)
2024-02-08 13:43:14 +00:00
Ryan Ernst
6375e9f443
Add native access library (#105100)
Elasticsearch requires access to some native functions. Historically
this has been achieved with the JNA library. However, JNA is a
complicated, magical library, and has caused various problems booting
Elasticsearch over the years. The new Java Foreign Function and Memory
API allows access to call native functions directly from Java. It also
has the advantage of tight integration with hotspot which can improve
performance of these functions (though performance of Elasticsearch's
native calls has never been much of an issue since they are mostly at
boot time).

This commit adds a new native lib that is internal to Elasticsearch. It
is built to use the foreign function api starting with Java 21, and
continue using JNA with Java versions below that.

Only one function, checking whether Elasticsearch is running as root, is
migrated. Future changes will migrate other native functions.
2024-02-07 18:27:09 -05:00
Ryan Ernst
18a1ac09e7
Use open and fstat in preallocate (#105171)
Preallocate opens a FileInputStream in order to get a native file
descriptor to pass to native functions. However, getting at the file
descriptor requires breaking modular access. This commit adds native
posix functions for opening/closing and retrieving stats on a file in
order to avoid requiring additional permissions.
2024-02-07 13:40:05 -05:00
Ryan Ernst
2ca6df71d6
Make ProviderLocator aware of boot qualified exports (#105250)
Qualified exports in the boot layer only work when they are to other boot
modules. Yet Elasticsearch has dynamically loaded modules as in plugins.
For this purpose we have ModuleQualifiedExportsService. This commit
moves loading of ModuleQualifiedExportsService instances in the boot layer
into core so that it can be reused by ProviderLocator when a qualified
export applies to an embedded module.
2024-02-07 09:43:22 -08:00
David Turner
3b7b86c507
Simplify ChunkedToXContentHelper#singleChunk (#105225)
There's no need for this helper to take more than one argument. Almost
all the usages only passed in a single argument, and the few cases that
supplied more than one can be rewritten as a single argument to save
allocating all those extra lambdas.
2024-02-07 03:53:02 -05:00
Ryan Ernst
b250f06b09
Add a gradle plugin for embedded providers (#105094)
x-content embeds its jackson implementation inside its jar. This commit
formalizes the setup for this embedding with a gradle plugin so that it
can be reused by other libs.
2024-02-05 15:21:52 -05:00
Armin Braun
18bd6c4238
Fix Releasables.close performance issues (#104970)
It's less code and it actually inlines (avoiding virtual calls in most
cases) to just do the null check here instead of delegating to IOUtils
and then catching the impossible IOException. Also, no need to use
`Releasables` in 2 spots where try-with-resources works as well and
needs less code.

Noticed this when I saw that we had a lot of strange CPU overhead in
this call in some hot loops like translog writes.
2024-01-31 08:21:59 -05:00
Armin Braun
50bafd306c
Save allocating enum values array in two hot spots (#104952)
Our readEnum code instantiates/clones enum value arrays on read.
Normally, this doesn't matter much but the two spots adjusted here are
visibly hot during bulk indexing, causing GBs of allocations during e.g.
the http_logs indexing run.
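
The general technique is to call `values()` once and reuse the array, since every call to `values()` returns a fresh clone. The sketch below uses a made-up enum and method name; it is not the code changed by this commit.

```java
// Hypothetical example enum; not one of the enums touched by the commit.
final class CachedEnumValues {

    enum OpType {
        INDEX, CREATE, UPDATE, DELETE;

        // values() clones its backing array on every call, so keep one shared copy.
        private static final OpType[] VALUES = values();

        // Looks up a constant by ordinal without allocating a new array.
        static OpType fromOrdinal(int ordinal) {
            if (ordinal < 0 || ordinal >= VALUES.length) {
                throw new IllegalArgumentException("unknown OpType ordinal [" + ordinal + "]");
            }
            return VALUES[ordinal];
        }
    }
}
```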
2024-01-31 11:26:36 +01:00
Ryan Ernst
d5e727e362
Add plugin for creating MRJARs (#104883)
This commit adds an elasticsearch gradle plugin which sets up a project
to build an MRJAR. By applying the plugin, the src dir is checked for
any directories matching mainXX where XX is the java version number.
A source set is automatically set up, an appropriate compiler tied to
that source set, and its output placed in the correct part of the final
jar. Additionally, the sourceset allows use of preview features in that
version of Java, and the preview bits are stripped from the resulting
class files.
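
The directory-name convention can be sketched in plain Java. The helper below is hypothetical and is not the Gradle plugin itself; it only shows how a name such as `main21` could be mapped to a Java version number.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; the real logic lives in the Gradle plugin.
final class MrjarSourceDirs {

    // Matches "main" followed by one to three digits, e.g. main21 or main9.
    private static final Pattern MAIN_VERSIONED = Pattern.compile("main(\\d{1,3})");

    // Returns the Java version encoded in the directory name, if any.
    static OptionalInt javaVersionOf(String dirName) {
        Matcher m = MAIN_VERSIONED.matcher(dirName);
        return m.matches() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }
}
```

In a multi-release JAR, classes compiled for a given version live under `META-INF/versions/<version>/`, so output from `main21` would end up under `META-INF/versions/21/`.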
2024-01-29 21:06:19 -08:00
Jan Kuipers
5dec83f69e
Endpoint to test Grok pattern (#104394)
* Add extract match ranges functionality to Grok.

* TestGrokPatternAction and Request

* TestGrokPattern response

* Update docs/changelog/104394.yaml

* Polish validation error message

* Improve test_grok_pattern API

* Add explicit CharSet

* Add endpoint to operator constants

* Add TransportTestGrokPatternActionTests

* REST API spec

* One more TransportTestGrokPatternActionTest

* Fix API spec

* Refactor REST API spec

* Polish code

* Replace TransportTestGrokPatternActionTests by a YAML REST test

* Add ecs_compatibility

* Always return arrays in the API

* Documentation

* YAML test for ecs_compatibility

* Rename doc file

* serverless scope

* Fix docs (hopefully)

* Update docs/reference/rest-api/index.asciidoc

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Add "text structure APIs" header in docs TOC

* Move file

* Remove test grok from main index

* typo

* Nested APIs underneath text structure

---------

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
2024-01-24 09:35:59 +01:00
William Brafford
d07651f8b9
Checkstyle: require braces around do, for, and while clauses (#103217)
* Checkstyle requires braces after for, do, while
2024-01-09 16:03:45 -05:00
Simon Cooper
0d53be3bed
Expand uses of matchers for Optionals (#104123)
Also add variants to assert for specific values
2024-01-09 12:55:29 +00:00
Ryan Ernst
2f5247117e
Upgrade ASM to 9.6 for Java 22 support (#104085)
This commit upgrades the version of asm used by the build and plugins in
order to support Java 22 version format.

closes #104065 relates #103963
2024-01-08 15:03:40 -05:00
Stuart Tettemer
a359b1f648
Relax limit on max string size in CBOR, Smile, YAML (#103930)
Remove the rough limit on string length from Jackson 2.15. The limit was already relaxed for JSON in #96031, this extends that change to other XContent types.

Refs: #96031
Fixes: #104009
2024-01-08 13:31:54 -06:00
Ignacio Vera
a6b36eb20a
Add the possibility to transform WKT to WKB directly (#104030)
enhancement to geo utilities.
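
To make the idea concrete, here is a deliberately small sketch that converts a 2D WKT `POINT` into its WKB encoding. It is not the utility added by this commit, the class name is an assumption, and other geometry types are not handled. WKB for a point is a byte-order marker, a 32-bit geometry type (1 for Point), and two 64-bit doubles.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch limited to 2D points; not the Elasticsearch geo utility.
final class PointWktToWkb {

    private static final Pattern POINT = Pattern.compile(
        "\\s*POINT\\s*\\(\\s*(\\S+)\\s+(\\S+)\\s*\\)\\s*", Pattern.CASE_INSENSITIVE);

    // Encodes e.g. "POINT (30 10)" as 21 bytes of little-endian WKB.
    static byte[] toWkb(String wkt) {
        Matcher m = POINT.matcher(wkt);
        if (!m.matches()) {
            throw new IllegalArgumentException("unsupported WKT: " + wkt);
        }
        double x = Double.parseDouble(m.group(1));
        double y = Double.parseDouble(m.group(2));
        ByteBuffer buf = ByteBuffer.allocate(21).order(ByteOrder.LITTLE_ENDIAN);
        buf.put((byte) 1); // byte order marker: 1 = little endian
        buf.putInt(1);     // geometry type: 1 = Point
        buf.putDouble(x);
        buf.putDouble(y);
        return buf.array();
    }
}
```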
2024-01-08 16:10:32 +01:00
Armin Braun
78c365fc96
Introduce a noop, never-released ref counted constant (#103931)
This is needed for the search response pooling work. Also, the one usage
in `ReleasableBytesReference` actually makes outright sense. We
shouldn't be ref-counting on a global constant, that just needlessly
introduces contention that isn't entirely obvious. This change required
a couple tests to be adjusted where we were checking release mechanics
on noop instances.
2024-01-04 13:33:07 -05:00
David Turner
6c13a815bd
Refcount responses in TransportNodesAction (#103254)
Today we `decRef()` the per-node responses just after adding them to the
`responses` collection, but in fact we should keep them alive until
we've constructed the final response. This commit does that.
2024-01-02 13:38:02 +00:00
Moritz Mack
9f8088daec
Adjust terminal tests to new behavior in JDK 22. (#103614)
JDK 22 may return a console even if the terminal is redirected. These cases are detected using the new Console#isTerminal() to maintain the current behavior (closes #98033).
2023-12-21 14:36:53 +01:00
Armin Braun
49f1b5b787
Make sure to close XContentParser in more spots (#103504)
We're leaking quite a few of these parsers. That doesn't seem to be much
of a problem but results in some memory inefficiencies in Jackson here
and there. This PR bulk fixes a bunch of instances that I could easily
automatically fix. I'll open a follow-up for closing the parser on the
document parsing context which also suffers from this but is non-trivial
to fix.
2023-12-19 10:26:06 +01:00
Armin Braun
1b84ea7421
Delete all unused private methods (#98111)
Pretty straightforward dead-code cleanup I think. Just delete all
private methods or methods in private classes that aren't used.
2023-11-25 22:21:59 +01:00
David Turner
b2127ec2f9
Introduce RefCounted#mustIncRef (#102515)
In several places we acquire a ref to a resource that we are certain is
not closed, so this commit adds a utility for asserting this to be the
case. This also helps a little with mocks since boolean methods like
`tryIncRef()` return `false` on mock objects by default, but void
methods like `mustIncRef()` default to being a no-op.
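
The relationship between `tryIncRef()` and `mustIncRef()` can be sketched with a small stand-alone class. `SimpleRefCounted` is a hypothetical name, and throwing `IllegalStateException` on failure is an assumption of this sketch rather than the behaviour of the real `RefCounted` interface.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-alone sketch; not Elasticsearch's RefCounted/AbstractRefCounted.
final class SimpleRefCounted {

    // Starts at 1: the creator holds the initial reference.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private final Runnable onClose;

    SimpleRefCounted(Runnable onClose) {
        this.onClose = onClose;
    }

    // Acquires a reference unless the object has already been fully released.
    boolean tryIncRef() {
        while (true) {
            int current = refCount.get();
            if (current <= 0) {
                return false;
            }
            if (refCount.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    // For callers that are certain the object is still open (assumed failure mode: throw).
    void mustIncRef() {
        if (!tryIncRef()) {
            throw new IllegalStateException("already closed, cannot acquire a reference");
        }
    }

    // Releases one reference; returns true if this call released the last one.
    boolean decRef() {
        int remaining = refCount.decrementAndGet();
        if (remaining < 0) {
            throw new IllegalStateException("released more often than acquired");
        }
        if (remaining == 0) {
            onClose.run();
            return true;
        }
        return false;
    }
}
```

Because `mustIncRef()` returns `void`, a default mock of it does nothing, whereas a default mock of the boolean `tryIncRef()` would report failure.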
2023-11-23 16:40:43 -05:00
David Turner
5e20253fe8
Add @UpdateForV9 markers on versions (#102441)
We probably wouldn't forget these, but for the sake of completeness this
commit marks several lists of versions with the `@UpdateForV9`
annotation.
2023-11-22 08:09:34 +00:00
Jake Landis
17a46a6e9f
upgrade bouncy castle jars (#100923)
This commit upgrades the Bouncy Castle jars. Bouncy Castle is used for
some internal build concerns as well as a command line application.
Most notably Bouncy Castle is also used as the FIPS certified JCE/JSSE provider
we use to test our ability to use a FIPS compliant crypto provider.

The following changes here are a result of the upgraded Bouncy Castle jars:
* TLSv1.3 is now supported when running in FIPS mode
* RSA PKCS#1 v1.5 is no longer allowed in FIPS mode
* Triple DES (3DES) is no longer allowed in FIPS mode
* Minor updates to the security manager configuration used to test FIPS (to read permissions from the security provider)
* Minor adjustments to tests to accommodate the above changes.
* Minor adjustments to the gradle build to accommodate new dependencies

Note - update to the documentation will come in a later commit.
2023-11-21 11:14:41 -06:00
Alexander Spies
92fb7780f9
ESQL: Make blocks ref counted (#100408)
This allows replacing deep copying of blocks by simply calling
Block::incRef - the block then has to be closed (or decRefed) one
additional time for each call to incRef (and tryIncRef, if successful).
2023-11-20 14:09:15 +01:00
Ignacio Vera
c579ab2d4a
Remove HighLevelRestClient from CCSDuelIT (#102222)
2023-11-16 12:18:03 +01:00