43 commits

Author SHA1 Message Date
Chris Hegarty
ccb52bf131
Improve logging of native vector scorer - vec_caps (#118325) (#118356)
This commit adds logging of the system's vector capability check, to help with diagnosing whether AVX2 or AVX-512 will be used.
2024-12-11 10:58:45 +00:00
Ryan Ernst
dedf9fd6d7
Use directory name as project name for libs (#115720) (#115984)
* Use directory name as project name for libs (#115720)

The libs projects are configured to all begin with `elasticsearch-`.
While this is desirable for the artifacts to contain this consistent
prefix, it means the project names don't match up with their
directories. Additionally, it creates complexities for subproject naming
that must be manually adjusted.

This commit adjusts the project names for those under libs to be their
directory names. The resulting artifacts for these libs are kept the
same, all beginning with `elasticsearch-`.

* fixes
2024-10-31 07:52:10 +11:00
Mark Vieira
0279c0a909
Add AGPLv3 as a supported license
2024-09-13 14:30:33 -07:00
Ryan Ernst
ef95cdd4cc
Fix native library loading zstd with jna (#112221)
Recent refactoring of native library paths broke jna loading zstd. This
commit fixes jna to set the jna.library.path during init so that jna
calls to load libraries still work.
2024-08-26 18:51:12 -07:00
Ryan Ernst
0aa4758f02
Stop setting java.library.path (#112119)
Native libraries in Java are loaded by calling System.loadLibrary. This
method inspects paths in the java.library.path to find the requested
library. Elasticsearch previously used this to find libsystemd, but now
the only remaining use is to set the additional platform directory in
which Elasticsearch keeps its own native libraries.

One issue with setting java.library.path is that it's not set for the cli
process, which makes loading the native library infrastructure from clis
difficult. This commit reworks how Elasticsearch native libraries are
found in order to avoid needing to set java.library.path. There are two
cases. The simplest is production, where the working directory is the
Elasticsearch installation directory, so the platform specific directory
can be constructed. The second case is for tests where we don't have an
installation. We already pass in java.library.path there, so this change
renames the system property to be a test specific property that the new
loading infrastructure looks for.
2024-08-23 11:16:18 -07:00
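The two lookup cases described in the commit above (production and tests) can be sketched roughly as follows. This is a minimal illustration, not the actual Elasticsearch code: the class name, the `example.native.libs.path` property, and the `<os>-<arch>` directory naming are all assumptions made for the example. Only the idea of a platform directory under `lib` comes from the commit messages in this log.

```java
import java.nio.file.Path;
import java.util.Locale;

// Hypothetical sketch: locate the platform-specific native library directory
// without relying on java.library.path.
final class NativeLibraryDirs {
    // Assumed, illustrative name for the test-only system property.
    static final String TEST_PATH_PROPERTY = "example.native.libs.path";

    // Test case: an explicit override wins. Production case: derive the
    // directory from the working (installation) directory.
    static Path platformDir(Path workingDir, String osName, String osArch, String testOverride) {
        if (testOverride != null && !testOverride.isEmpty()) {
            return Path.of(testOverride);
        }
        return workingDir.resolve("lib").resolve("platform").resolve(os(osName) + "-" + arch(osArch));
    }

    // Convenience wrapper that reads the values from the running JVM.
    static Path platformDirForCurrentProcess() {
        return platformDir(
            Path.of("").toAbsolutePath(),
            System.getProperty("os.name"),
            System.getProperty("os.arch"),
            System.getProperty(TEST_PATH_PROPERTY)
        );
    }

    // The normalized names below are assumptions for this sketch.
    private static String os(String osName) {
        String name = osName.toLowerCase(Locale.ROOT);
        if (name.startsWith("linux")) return "linux";
        if (name.startsWith("mac")) return "darwin";
        if (name.startsWith("windows")) return "windows";
        throw new IllegalArgumentException("unsupported OS: " + osName);
    }

    private static String arch(String osArch) {
        switch (osArch) {
            case "amd64":
            case "x86_64":
                return "x64";
            case "aarch64":
                return "aarch64";
            default:
                throw new IllegalArgumentException("unsupported architecture: " + osArch);
        }
    }
}
```

Keeping the resolution logic in a pure function makes both cases easy to exercise from a unit test, and a cli process can call the same code since nothing depends on JVM startup flags.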
Ryan Ernst
0f176e1779
Remove leftover libsystemd references (#112078)
Systemd notification now happens by directly communicating with the
systemd socket. This commit removes the native access to libsystemd,
which is no longer used.
2024-08-22 07:57:15 -07:00
Ryan Ernst
69293e28dc
Use systemd socket directly instead of libsystemd (#111131)
The libsystemd library function sd_notify is just a thin wrapper
around opening and writing to a unix filesystem socket. This commit
replaces using libsystemd with opening the socket provided by systemd
directly.

relates #86475
2024-08-19 16:31:59 -07:00
Ryan Ernst
0cf9c54f65
Fix windows memory locking (#111866)
Memory locking on Windows with the bundled jdk was broken by native
access refactoring. This commit fixes the linking issue, as well as adds
a packaging test to ensure memory locking is invoked on all supported
platforms.
2024-08-15 12:00:41 -07:00
Lorenzo Dematté
6fc0047f23
do not execute preallocate tests on encrypted storage (#111627)
2024-08-07 17:50:12 +02:00
Lorenzo Dematté
367133e605
Fix: tryPreallocate open ignored creation flag (#111294)
2024-08-05 18:48:50 +02:00
Ryan Ernst
fcc5b737ea
Use ESTestCase for vector sysprop tests (#110990)
The VectorSystemPropertyTests need to run a child process. Normally this
isn't possible since we run with security manager, but the
`@WithoutSecurityManager` annotation causes the test suite to run with
the security manager disabled. However, that annotation only works with
ESTestCase. This commit changes the base class of the test suite to
ESTestCase.

closes #110949
2024-07-19 07:42:57 -07:00
Ryan Ernst
d21d4242dd
Skip preallocate tests on windows (#110998)
The preallocate tests assumed that preallocation was using the fallback
implementation which calls setLength on Windows. However, that fallback
only happens inside the SharedBytes class, so windows doesn't actually
do anything when tryPreallocate is called. This commit skips the test on
windows.

closes #110948
2024-07-19 05:59:35 -07:00
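The fallback mentioned in the commit above, where the caller extends the file with setLength when native preallocation is unavailable, might look roughly like this. The class name and the `nativePreallocate` callback are hypothetical stand-ins. The real tryPreallocate signature is not shown in this log.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.util.function.BiPredicate;

// Hypothetical sketch of "try native preallocation, else fall back to setLength".
final class PreallocateSketch {
    // nativePreallocate stands in for a platform call that returns false when
    // it did nothing (for example on a platform with no native implementation).
    static void preallocate(Path file, long size, BiPredicate<Path, Long> nativePreallocate) {
        if (nativePreallocate.test(file, size)) {
            return; // the native call reserved the space
        }
        // Fallback: grow the file to the requested length. On many file systems
        // this creates a sparse file and does not actually reserve disk blocks.
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            if (raf.length() < size) {
                raf.setLength(size);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A test that only checks the resulting file size passes on the fallback path but says nothing about whether the native call ran, which is the kind of mismatch the commit describes.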
Ryan Ernst
08e91b7fbc
Add test for native preallocation (#110903)
This commit forward ports a test for native preallocation from #110851.
It also fixes fcntl and ftruncate bindings used by macOS for
preallocation.
2024-07-16 09:06:01 -07:00
Ryan Ernst
e6713a5c0a
Remove JNA from server dependencies (#110809)
All native methods are now bound through NativeAccess. This commit
removes the jna dependency from server.

relates #104876
2024-07-12 19:49:13 -07:00
Ryan Ernst
e4349f8787
Force resolution of fstat64 symbol with JNA (#110807)
When JNA loads libraries it creates a proxy object for the library.
Unfortunately it doesn't actually inspect any of the methods, those get
bound lazily at runtime when the method is called through the proxy. For
fstat64 we need to know at load time whether the symbol exists, so that
we can fall back to an alternate function if it doesn't.

This commit looks up the NativeLibrary object from JNA for libc and
checks if fstat64 exists during load time.
2024-07-12 14:53:01 -07:00
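The load-time check described above can be illustrated without JNA by abstracting the symbol lookup. In this sketch the `lookup` function is a hypothetical stand-in for whatever the native library object exposes, and the fallback name `fstat` is a placeholder, since the commit message does not name the alternate function.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch: decide at construction (load) time which native
// function to bind, instead of discovering a missing symbol on first call.
final class EagerSymbolBinding {
    private final String statFunction;

    // lookup returns the symbol's address if present, or empty if it is missing.
    EagerSymbolBinding(Function<String, Optional<Long>> lookup) {
        if (lookup.apply("fstat64").isPresent()) {
            statFunction = "fstat64";
        } else if (lookup.apply("fstat").isPresent()) {
            // Placeholder fallback name, assumed for this example.
            statFunction = "fstat";
        } else {
            throw new UnsatisfiedLinkError("neither fstat64 nor fstat could be resolved");
        }
    }

    // Name of the function that later calls should go through.
    String boundFunction() {
        return statFunction;
    }
}
```

Because the decision is made once in the constructor, a missing symbol surfaces during initialization rather than on the first call through a lazily bound proxy.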
Ryan Ernst
8417d3f141
Move preallocate functionality to native access (#110678)
This commit moves the file preallocation functionality into
NativeAccess. The code is basically the same. One small tweak is that
instead of breaking Java access boundaries in order to get an open file
handle, the new code uses posix open directly.

relates #104876
2024-07-11 09:42:44 -07:00
Ryan Ernst
c6f82604d7
Move exec syscall filtering to NativeAccess (#108970)
This commit moves the system call filtering initialization into
NativeAccess. The code is essentially unmodified from its existing
state, now existing within the *NativeAccess implementations.

relates #104876
2024-07-09 12:25:27 -07:00
Lorenzo Dematté
0bc2b19ead
Add AVX-512 optimised vector distance functions for int7 on x64 (#109084)
* Add vec_caps and inner implementation for AVX-512-F (without VNNI)
* select FNNI function name based on vec_caps; native templated implementation for manual unrolling
* Switched compiler to clang for x64, as gcc has a bug
2024-06-28 11:15:35 +02:00
Chris Hegarty
fa364bfcaf
Rename the vec module to better reflect that it provides SIMD optimized vector scorers (#109661)
This commit renames the vector module to better reflect its intent - to provide SIMD optimized vector scorer implementations.
2024-06-17 11:10:02 +01:00
Chris Hegarty
f71aba1fdd
Use the SIMD optimized SQ vector scorer at search time (#109109)
This commit extends the custom SIMD optimized SQ vector scorer to include search time scoring.

When run on JDK 22+, vector scoring will be done with the custom scorer. The implementation uses the JDK 22+ on-heap ALLOW_HEAP_ACCESS Linker.Option so that the native code can access the query vector directly.
2024-05-29 16:32:06 +01:00
Ryan Ernst
13b36c1e73
Move Windows native functions into NativeAccess (#108873)
Elasticsearch uses a couple of Windows-specific functions, specifically
getting a short path, and registering a console control handler for
shutdown notification. This commit moves this functionality from the
existing jna natives into NativeAccess.

relates #104876
2024-05-23 09:37:33 -04:00
Ryan Ernst
02083d6f11
Guard systemd library lookup from unreadable directories (#108931)
When scanning the library path we may come across directories that are
unreadable. If that happens, the recursive walk of the library path
directories will throw a fatal IOException. This commit guards the walk
of the library paths to first check for readability of each directory we
are about to traverse.
2024-05-23 05:50:55 -07:00
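A readability guard of the kind described above could be written as a small recursive search. This is a sketch under assumptions: the class and method names are invented for illustration, and the actual lookup code is not shown in this log.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: recursively search a library path entry for a file
// name, skipping any directory that cannot be read.
final class GuardedLibrarySearch {
    static List<Path> find(Path root, String fileName) {
        List<Path> found = new ArrayList<>();
        search(root, fileName, found);
        return found;
    }

    private static void search(Path dir, String fileName, List<Path> found) {
        // The guard: check readability before opening the directory, so an
        // unreadable directory is skipped instead of failing the whole walk.
        if (!Files.isDirectory(dir) || !Files.isReadable(dir)) {
            return;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                // Symlinked directories are not followed, which avoids cycles.
                if (Files.isDirectory(entry, LinkOption.NOFOLLOW_LINKS)) {
                    search(entry, fileName, found);
                } else if (entry.getFileName().toString().equals(fileName)) {
                    found.add(entry);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The check-then-open sequence is still racy if permissions change between the two calls, so this sketch lets a genuine I/O failure on a directory that reported itself readable propagate.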
Chris Hegarty
81a8910eb4
Include compiler options explicitly when building on aarch64 with clang (#108937)
This commit includes the compiler options explicitly when building on aarch64 with clang.
2024-05-23 12:28:30 +01:00
Ryan Ernst
062039b7f4
Move memory locking into NativeAccess (#108829)
This commit moves the implementations of locking virtual memory into RAM
into NativeAccess.

relates https://github.com/elastic/elasticsearch/pull/104876
2024-05-21 09:36:08 -07:00
Ryan Ernst
bc499e7c83
Move rlimit calls into NativeAccess (#108805)
This commit moves getting max threads, max virtual memory size, and max
file size into NativeAccess.

relates https://github.com/elastic/elasticsearch/pull/104876
2024-05-20 11:09:50 -04:00
Lorenzo Dematté
2e0f8d087c
Add a SIMD (AVX2) optimised vector distance function for int7 on x64 (#108088)
* Adding support for x64 to native vec library
* Fix: aarch64 sqr7u dims
* Fix: add symbol stripping (deb lintian)
---------
Co-authored-by: Chris Hegarty <62058229+ChrisHegarty@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-05-10 11:58:34 +02:00
Andrew Wilkins
5a622b0a07
nativeaccess: try to load all located libsystemds (#108238)
Linux systems with multiarch (e.g. i386 & x86_64) libraries
may have libsystemd.so.0 in two subdirectories of an entry in
java.library.path. For example, libsystemd.so.0 may be found
in both /usr/lib/i386-linux-gnu and /usr/lib/x86_64-linux-gnu.

Instead of attempting to load any library found, attempt all
and stop as soon as one is successfully loaded.
2024-05-08 11:06:59 -07:00
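The "attempt all and stop at the first success" behaviour can be sketched as a loop that swallows `UnsatisfiedLinkError` per candidate. The class name and the injected `loader` are illustrative assumptions. In real use the loader would typically be `p -> System.load(p.toString())`.

```java
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;
import java.util.function.Consumer;

// Hypothetical sketch: try each located library in turn and stop at the
// first one that loads.
final class FirstLoadableLibrary {
    static Optional<Path> loadFirst(List<Path> candidates, Consumer<Path> loader) {
        for (Path candidate : candidates) {
            try {
                loader.accept(candidate);
                return Optional.of(candidate); // loaded successfully, stop here
            } catch (UnsatisfiedLinkError e) {
                // For example a library built for another architecture:
                // ignore it and try the next candidate.
            }
        }
        return Optional.empty(); // nothing could be loaded
    }
}
```

Returning an empty `Optional` rather than throwing leaves it to the caller to decide whether a missing library is fatal.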
Chris Hegarty
7f90a98ed5
Update native vector provider to use unsigned int7 values only (#108243)
This commit updates the native vector provider to reflect that Lucene's scalar quantization is unsigned int7, with a range of values from 0 to 127 inclusive. Stride has been pushed down into native, to allow other platforms to more easily select their own stride length.

Previously the implementation supported signed int8. We might want the more general signed int8 implementation in the future, but for now unsigned int7 is sufficient, and allows us to provide more efficient implementations on x64.
2024-05-04 10:42:55 +01:00
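For reference, a plain scalar version of an unsigned int7 dot product is sketched below. It is not the native SIMD implementation. The class and method names are assumptions made for the example, and it only illustrates the 0 to 127 value range stated in the commit.

```java
// Hypothetical scalar reference for a dot product over unsigned int7 values.
final class Int7DotProduct {
    static int dot7u(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch: " + a.length + " != " + b.length);
        }
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            checkInt7(a[i]);
            checkInt7(b[i]);
            sum += a[i] * b[i]; // each product is at most 127 * 127 = 16129
        }
        return sum;
    }

    // A Java byte is signed, so any value outside 0..127 shows up as negative.
    private static void checkInt7(byte value) {
        if (value < 0) {
            throw new IllegalArgumentException("value out of int7 range: " + value);
        }
    }
}
```

With values limited to 0..127, the sum of two adjacent products is at most 32258, which still fits in a signed 16-bit integer. That headroom is one plausible reason the restricted range helps on x64, though the commit message does not spell out the details.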
Ryan Ernst
6180b08ce7
Use direct method mapping for zstd (#108172)
JNA supports two types of mapping to native methods, proxying and direct
method mapping. Proxying is nicer for unit testing, but unfortunately
the proxied methods are lazily loaded. NativeAccess expects that methods
are linked during static init, before SecurityManager is initialized.
For any native methods called after security manager init, the proxied
method will fail.

This commit changes the zstd bindings to use direct method mapping so
that calling zstd methods does not fail when using JNA (pre Java 21).

closes #107504
closes #107770
2024-05-02 08:59:12 -07:00
Lorenzo Dematté
6ef4865195
Add functionality to test if the host CPU supports native SIMD instructions (#107429)
2024-04-29 12:01:41 +02:00
Chris Hegarty
6b52d7837b
Add an optimised int8 vector distance function for aarch64. (#106133)
This commit adds an optimised int8 vector distance implementation for aarch64. Additional platforms like, say, x64, will be added as a follow-up.

The vector distance implementation outperforms Lucene's Panama Vector implementation for binary comparisons by approx 5x (depending on the number of dimensions). It does so by means of compiler intrinsics built into a separate native library and linked through Panama's FFI. Comparisons are performed on off-heap mmap'ed vector data.

The implementation is currently only used during merging of scalar quantized segments, through a custom format ES814HnswScalarQuantizedVectorsFormat, but its usage will likely be expanded over time.

Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
Co-authored-by: Lorenzo Dematté <lorenzo.dematte@elastic.co>
Co-authored-by: Mark Vieira <portugee@gmail.com>
Co-authored-by: Ryan Ernst <ryan@iernst.net>
2024-04-12 08:44:21 +01:00
Ryan Ernst
96230f7a7d
Use CloseableByteBuffer in compress/decompress signatures (#106724)
CloseableByteBuffer is backed by native memory segments, but the
interfaces for compress and decompress methods of zstd take ByteBuffer.
Although both Jna and the Jdk can deal with turning the native
ByteBuffer back into an address to pass to the native method, the jdk
may have a more significant cost to that action.

This commit changes the signature of compress and decompress to take in
CloseableByteBuffer so that each implementation can do its own
unwrapping to get the appropriate native address.

relates #103374
2024-03-25 14:30:27 -04:00
Ryan Ernst
2196576aed
Use confined arena for CloseableByteBuffer (#106723)
The jdk implementation of CloseableByteBuffer currently uses a shared
arena. The assumption was that a buffer might be shared across threads.
However, in practice for compression/decompression that is not true, and
the shared arena has a noticeable impact on deallocation when the buffer
is closed. This commit switches to a confined arena, limiting buffer
creation and compress/decompress calls to a single thread.

relates #103374
2024-03-25 13:26:45 -04:00
Ryan Ernst
55c3357c81
Move common mrjar forbidden apis configuration to plugin (#106385)
Since mrjars may use preview apis, forbidden apis must know about any
preview apis from the jdk. However, we do not run forbidden apis with
the preview enabled flag, nor in a separate jvm, so it does not know
about these classes. Thus we ignore missing classes on source sets added
by the mrjar plugin.

This commit configures all sourcesets added by mrjar plugin to ignore
forbidden apis missing classes.
2024-03-19 16:40:52 -04:00
Ryan Ernst
444866aec9
Set explicit directory and file permissions on native libraries (#106505)
The distributions already have correct permissions set on native
libraries copied to them. However, the build itself to extract the
native libs relies on the upstream file permissions. This commit sets
explicit permissions on the copy task which extracts native libraries.
2024-03-19 15:51:57 -04:00
Ryan Ernst
6731538bbe
Different string allocation on jdk 21/22 (#106492)
Similar to https://github.com/elastic/elasticsearch/pull/106360, the
methods for allocating a native string changed between Java 21 and 22.
This commit adds another util method to handle the differences and uses
it in the jdk systemd impl.
2024-03-19 13:09:18 -04:00
Ryan Ernst
1d7a0159d3
Support jdk22 in zstd bindings (#106360)
The foreign memory API changed between Java 21 and 22 in how to decode a
string from native memory. This commit adds a multi-release class to
handle the two different methods on MemorySegment to decode a string.
2024-03-14 10:30:14 -07:00
Rene Groeschke
0f9ebf268f
Mute ZstdTests (#106348)
failing on jdk22
2024-03-14 10:56:10 +01:00
Ryan Ernst
405b88b882
Add zstd to native access (#105715)
This commit makes zstd compression available to Elasticsearch. The
library is pulled in through maven in jar files for each platform, then
bundled in a new platform directory under lib. Access to the zstd
compression/decompression is through NativeAccess.
2024-03-13 09:45:12 -07:00
Ryan Ernst
10dcb8e8bd
Add systemd native access (#106151)
This commit moves systemd access to the NativeAccess lib.

relates #104876
2024-03-12 07:35:02 -07:00
Ryan Ernst
83585315fe
Only apply build to direct libs (#106101)
Sometimes libs have subprojects that may not be java projects. This commit adjusts the shared
configuration for libs to only affect direct subprojects of :lib.
2024-03-08 13:48:26 -08:00
Ryan Ernst
dd51f6b187
Add classpath based SPI for jna native provider (#105320)
Native lib provider is normally run modular, but since tests run on the
classpath it also needs to work with old style SPI.
2024-02-08 17:31:57 -08:00
Ryan Ernst
6375e9f443
Add native access library (#105100)
Elasticsearch requires access to some native functions. Historically
this has been achieved with the JNA library. However, JNA is a
complicated, magical library, and has caused various problems booting
Elasticsearch over the years. The new Java Foreign Function and Memory
API allows access to call native functions directly from Java. It also
has the advantage of tight integration with hotspot which can improve
performance of these functions (though performance of Elasticsearch's
native calls has never been much of an issue since they are mostly at
boot time).

This commit adds a new native lib that is internal to Elasticsearch. It
is built to use the foreign function api starting with Java 21, and
continue using JNA with Java versions below that.

Only one function, checking whether Elasticsearch is running as root, is
migrated. Future changes will migrate other native functions.
2024-02-07 18:27:09 -05:00
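The version split described above (foreign function API from Java 21, JNA below that) can be pictured as a simple runtime check. This is only a sketch: the names are hypothetical, and the real library may pick its implementation differently, for example through the multi-release jar mechanism mentioned in later commits, rather than an explicit version comparison.

```java
// Hypothetical sketch: choose a native access backend from the Java feature version.
final class NativeProviderSelection {
    enum Backend { FOREIGN_FUNCTION_API, JNA }

    // Java 21 and later use the foreign function API; older runtimes use JNA.
    static Backend select(int javaFeatureVersion) {
        return javaFeatureVersion >= 21 ? Backend.FOREIGN_FUNCTION_API : Backend.JNA;
    }

    // Runtime.version().feature() returns the feature release, e.g. 17 or 21.
    static Backend selectForCurrentRuntime() {
        return select(Runtime.version().feature());
    }
}
```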