elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-06-27 17:10:22 -04:00

Author	SHA1	Message	Date
Chris Hegarty	4d3b699067	JDKVectorLibrary: update low-level bounds checks and add benchmark (#130216 ) This commit updates the low-level bounds checks in JDKVectorLibrary and adds a benchmark, so that we can more easily benchmark the low-level operations. Note: I added the mr-jar gradle plugin to the benchmarks so that we can compile with preview features in Java 21, namely MemorySegment.	2025-06-27 19:21:04 +01:00
Ryan Ernst	5dcded20a9	Show entitlement jar path in agent load failure (#130233 ) When the entitlement agent fails to load, the underlying exception can be cryptic. In some cases it may be that the path to the agent jar is bad. This commit expands the exception message to show the agent path that we tried to load.	2025-06-27 11:00:18 -07:00
Ryan Ernst	2df9dd42fb	Fail startup if entitlement instrumentation failed (#130051 ) Java class transformers swallow exceptions, so any instrumentation failures, for example due to a java version mismatch, will silently proceed with startup, which then will cryptically fail the entitlement self test. This commit logs exceptions that occur during instrumentation, and plumbs through the fact that any occurred so that bootstrap can fail rather than allow startup to proceed.	2025-06-26 10:31:48 +10:00
Ryan Ernst	f4e7ce935f	Upgrade ASM for entitlements to support JDK 25 (#130037 )	2025-06-26 10:30:15 +10:00
Gal Lalouche	6970bd24a0	ESQL: Aggressive release of shard contexts (#129454 ) Keep better track of shard contexts using RefCounted, so they can be released more aggressively during operator processing. For example, during TopN, we can potentially release some contexts if they don't pass the limit filter. This is done in preparation for TopN fetch optimization, which will delay the fetching of additional columns to the data node coordinator, instead of doing it in each individual worker, thereby reducing IO. Since the node coordinator would need to maintain the shard contexts for a potentially longer duration, it is important we try to release what we can earlier. An even more advanced optimization is to delay fetching to the main cluster coordinator, but that would be more involved, since we need to first figure out how to transport the shard contexts between nodes. Summary of main changes: DocVector now maintains a RefCounted instance per shard. Things which can build or release DocVectors (e.g., LuceneSourceOperator, TopNOperator), can also hold RefCounted instances, so they can pass them to DocVector and also ensure contexts aren't released if they can still be potentially used later. Driver's main loop iteration (runSingleLoopIteration), now closes its operators even between different operator processing. This is extra aggressive, and was mostly done to improve testability. Added a couple of tests to TopNOperator and a new integration test EsqlTopNShardManagementIT, which uses the pausable plugin framework to check that TopNOperator releases things as early as possible.	2025-06-26 09:49:40 +10:00
Niels Bauman	126e8cc5dc	Add `NotMultiProjectCapable` annotation (#129934 ) Some features are unavailable in serverless and are thus not worth the investment to make fully project-aware. This new annotation can be used to clearly mark blocks of code that are intentionally not made properly project-aware, in case we need to revisit them in the future.	2025-06-25 09:11:15 -03:00
Patrick Doyle	0e2362432c	Pass empty lists instead of nulls to FileAccessTree.of (#129942 )	2025-06-25 04:44:13 +10:00
Ankit Sethi	4dab46a825	fix file name (#129883 ) * fix file name * Update docs/changelog/129883.yaml * Delete docs/changelog/129883.yaml	2025-06-24 10:30:23 -05:00
Ignacio Vera	ffea6ca2bf	Introduce an int4 off-heap vector scorer (#129824 ) * Introduce an int4 off-heap vector scorer * iter * Update server/src/main/java/org/elasticsearch/index/codec/vectors/DefaultIVFVectorsReader.java Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com> --------- Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>	2025-06-23 18:44:12 +02:00
Ignacio Vera	4ca96c199f	Introduce a vectorize soarDistance function (#129744 ) This commit replaces the method #soarResidual with a method call #soarDistance which performs better for computing soar distances.	2025-06-20 16:23:50 +02:00
Benjamin Trent	0a9f3a9630	Address when scores can be very large in osq score test (#129592 ) Using a static `diff` or epsilon just doesn't work for this test as the scores can be very large, but relatively close. Maybe there is a simpler way, but my mind wasn't wanting to "math" very much. For example, the seed that this previously failed on had scores like `1.726524E9` and `1.7265239E9`, which, given their size, are really close together (within 128). But a static epsilon wouldn't capture that. closes: https://github.com/elastic/elasticsearch/issues/128485	2025-06-19 01:33:29 +10:00
Benjamin Trent	4e926ae41a	Minor ivf cleanups and fixing quantization performance (#129566 ) We are accidentally utilizing the non-vectorized quantizer when building ivf indices. This fixes that, which provides a 3-5x speed improvement on quantizing on my mac, and addresses some minor fixes (removing unused code, etc.). Here is a small benchmark result: time spent quantizing goes down significantly. (Two benchmark screenshots were attached for comparison.)	2025-06-18 05:49:38 +10:00
Simon Cooper	98c1708adb	Add javadocs for BBQ dot product method (#129419 )	2025-06-16 10:18:32 +01:00
Jordan Powers	96300a9d80	Optimized text for full unicode and some escape sequences (#129169 ) Follow-up to #126492 to apply the json parsing optimization to strings containing unicode characters and some backslash-escaped characters. Supporting backslash-escaped strings is tricky as it requires modifying the string. There are two types of modification: some just remove the backslash (e.g. \", \\), and some replace the whole escape sequence with a new character (e.g. \n, \r, \u00e5). In this implementation, the optimization only supports the first case--removing the backslash. This is done by making a copy of the data, skipping the backslash. It should still be more optimized than full String decoding, but it won't be as fast as non-backslashed strings where we can directly reference the input bytes. Relates to #129072.	2025-06-12 09:55:07 -07:00
Patrick Doyle	7ec8fccf94	Refactor before entitlements for testing (#129099 ) * Support multiple plugin source paths * Refactor: remove unnecessary PathLookup method. It's only called in one place, and there's no need to override it for testing. Removing it just makes things simpler. * Refactor: local var for pathLookup * Fix bugs in test build info parsing * Fix representative_class in test * Move BridgeUtilTests. Tests in org.elasticsearch.entitlement.bridge are going to be uniquely hard to test once we patch the bridge into java.base, due to Java's prohibition on split packages. Let's just move this guy to another package. * Upcast (?!) Java23EntitlementChecker to EntitlementChecker * Empty TestPathLookup * Create PolicyManager during bootstrap, allowing us to share initialization * Use empty component path list instead of null * Downcast to the class of the check method. In our unit test, we have a mock checker that doesn't extend EntitlementChecker, so downcasting to that would require us to needlessly rework the unit test. * Fix javadoc typos	2025-06-09 18:56:07 +02:00
Rene Groeschke	342083100b	[Build] Add support for publishing to maven central (#128659 ) This ensures we package an aggregation zip with all artifacts we want to publish to maven central as part of a release. Running zipAggregation will produce a zip file in the build/nmcp/zip folder. The content of this zip is meant to match the maven artifacts we have currently declared as dra maven artifacts.	2025-06-06 17:35:44 +02:00
Mike Pellegrini	5ee6dfadfe	Update AbstractXContentParser to support parsers that don't provide text characters (#129005 )	2025-06-06 09:17:41 -04:00
Jordan Powers	496fb2d5a4	Skip UTF8 to UTF16 conversion during document indexing (#126492 ) When parsing documents, we receive the document as UTF-8 encoded data which we then parse and convert the fields to java-native UTF-16 encoded Strings. We then convert these strings back to UTF-8 for storage in lucene. This patch skips the redundant conversion, instead passing lucene a direct reference to the received UTF-8 bytes when possible.	2025-06-05 19:50:09 -07:00
Jordan Powers	de40ac45d1	Move Text class to libs/xcontent (#128780 ) This PR is a precursor to #126492. It does three things: 1. Move org.elasticsearch.common.text.Text from :server to org.elasticsearch.xcontent.Text in :libs:x-content. 2. Refactor the Text class to use a new EncodedBytes record instead of the elasticsearch BytesReference. 3. Add the XContentString interface, with the Text class implementing that interface. These changes were originally implemented in #127666 and #128316, however they were reverted in #128484 due to problems caused by the mutable nature of java ByteBuffers. This is resolved by instead using a new immutable EncodedBytes record.	2025-06-04 11:22:03 -07:00
Niels Bauman	f988611691	React more promptly to task cancellation while waiting for the cluster to unblock (#128737 ) Instead of waiting for the next run of the `ClusterStateObserver` (which might be arbitrarily far in the future, but bound by the timeout if one is set), we notify the listener immediately that the task has been cancelled. While doing so, we ensure we invoke the listener only once. Fixes #117971	2025-06-03 11:00:20 +03:00
Patrick Doyle	c633345a4d	Initial TestPolicyManager implementation (#128700 ) * Initial TestPolicyManager implementation * The forbidden APIs check is not messing around	2025-06-02 13:08:17 -04:00
Ryan Ernst	2be74a47e1	Fully initialize policy checker before instrumenting (#128703 ) Entitlement instrumentation works by reflectively calling back into the entitlements lib to grab the checker. It must be fully in place before any classes are instrumented. This commit fixes a bug that was introduced by refactoring which caused the checker to not be set until after all classes were instrumented. In some situations this could lead to the checker being null when it is grabbed (and statically cached) by the entitlement bridge.	2025-05-31 02:10:20 +03:00
Patrick Doyle	9e40dc4e3b	Encapsulate entitlements (#128637 ) * Rename and encapsulate InitializeArgs * Move ElasticsearchEntitlementChecker out of api package. It's an implementation detail that doesn't need to be exposed to the rest of the system. * Stub TestPathLookup (not yet implemented)	2025-05-30 17:05:56 -04:00
Patrick Doyle	77595cbccd	[Entitlements] Add test entitlement bootstrap and initialization classes (#128625 ) * Initialization class as argument to EntitlementAgent * visibility changes * WIP: test entitlement bootstrap and initialization classes * Simplify * Moving packages to reduce visibility * adjust visibility * add plugins descriptor + policy parsing * PR comments * update visibility, uncomment TestBuildInfoParser usage * [CI] Auto commit changes from spotless * Factor out createPolicyManager to help merge * TestEntitlementInitialization is not yet implemented * Respond to PR comments --------- Co-authored-by: Lorenzo Dematte <lorenzo.dematte@elastic.co> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>	2025-05-29 22:36:39 +03:00
Lorenzo Dematté	554b96aec9	[Entitlements] Add missing NIO async network instrumentation (#128582 ) This PR adds some additional instrumentation to ensure we capture more cases in which we use async network usage via channels and `select`	2025-05-29 19:52:10 +03:00
Patrick Doyle	ba50798f62	Split PolicyChecker from PolicyManager (#128004 ) * Split PolicyChecker from PolicyManager * Restore EntitlementCheckerUtils * [CI] Auto commit changes from spotless --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>	2025-05-28 12:48:14 -04:00
Patrick Doyle	7690f4667e	Revert changes to Text class (#128483 ) (#128484 ) * Revert "Fix the Text class package change in example plugins (#128316)" This reverts commit `cc486480e3`. * Revert "Update Text class to use native java ByteBuffer (#127666)" This reverts commit `db0c3c7a28`. Co-authored-by: Lorenzo Dematté <lorenzo.dematte@elastic.co>	2025-05-27 18:37:43 +10:00
Patrick Doyle	8d79de51f5	Use package to suppress warning for entitlement self-test (#128223 ) * Use package to suppress warning for entitlement self-test * [CI] Auto commit changes from spotless --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>	2025-05-21 08:59:02 -04:00
Patrick Doyle	43841a5ac3	Fail fast on invalid entitlement patches (#128071 ) * Fail fast on invalid entitlement patches * Don't peel off `PolicyParserException` * Just catch Exception	2025-05-20 13:29:09 -04:00
Benjamin Trent	1324ee0115	Reapply "Adds new unexposed and experimental IVF format (#127528 )" (#128005 ) (#128051 ) This reverts commit `8a17a5ed5f`. reapplying ivf format, but with a fix.	2025-05-14 08:47:59 +10:00
Craig Taverner	cb1391368b	Fix #123425 numerical floating point edge case (#127982 )	2025-05-10 16:37:29 +02:00
John Wagster	8a17a5ed5f	Revert "Adds new unexposed and experimental IVF format (#127528 )" (#128005 ) This reverts commit `ebe8ea6136`.	2025-05-09 17:10:11 -05:00
Ryan Ernst	ab690ba23f	Check hidden frames in entitlements (#127877 ) Entitlements do a stack walk to find the calling class. When method references are used in a lambda, the frame ends up hidden in the stack walk. In the case of using a method reference with AccessController.doPrivileged, the call looks like it is the jdk itself, so the call is trivially allowed. This commit adds hidden frames to the stack walk so that the lambda frame created for the method reference is included. Several internal packages are then necessary to filter out of the stack.	2025-05-08 16:59:03 -07:00
Jordan Powers	db0c3c7a28	Update Text class to use native java ByteBuffer (#127666 ) This PR is a precursor to #126492. It does three things: - Move org.elasticsearch.common.text.Text from :server to org.elasticsearch.xcontent.Text in :libs:x-content. - Refactor the Text class to use a java-native ByteBuffer instead of the elasticsearch BytesReference. - Add the XContentString interface, with the Text class implementing that interface.	2025-05-08 08:19:38 -07:00
Lorenzo Dematté	2d9fc30f62	Initialization class as argument to EntitlementAgent (#127815 ) Preliminary step for test entitlement initialization, extracted from #127814	2025-05-08 10:22:02 +02:00
Benjamin Trent	ebe8ea6136	Adds new unexposed and experimental IVF format (#127528 )	2025-05-07 14:59:57 -04:00
Ryan Ernst	9537388897	Remove doPrivileged uses from server (#127781 ) Now that SecurityManager is no longer used, doPrivileged is no longer necessary. This commit removes uses of it from core and server	2025-05-07 07:24:53 -07:00
Lorenzo Dematté	8bda02dafa	Uniform main and backport code (#127766 ) While backporting entitlement initialization refactorings, I realized there is a mismatch in getVersionSpecificCheckerClass signature, and also that this function in the backports is used in more places (DynamicInstrumentation), making it "strange" to have this in EntitlementInitialization. This PR extracts the function to a separate static class (package-private) and makes the signature uniform with backports. This will need to be backported manually to the 8.x branches, and will make the backported version of DynamicInstrumentation cleaner.	2025-05-07 09:25:15 +02:00
Ryan Ernst	60ad8ba744	Remove custom SecurityManager (#127778 ) Since SecurityManager is no longer used, the custom subclass of SecurityManager, SecureSM, is no longer needed.	2025-05-06 16:16:46 -07:00
Ryan Ernst	b78ac7c94c	Remove PrivilegedOperations (#127726 ) With the SecurityManager gone, the PrivilegedOperations class is no longer needed, these operations can be called directly.	2025-05-06 10:50:49 -07:00
Lorenzo Dematté	79ee234721	Extract hardcoded entitlements creation to a separate class (#127698 ) Moving creation of hardcoded entitlements (server policy + APM agent) to a separate class	2025-05-05 19:43:41 +02:00
Lorenzo Dematté	f90b01597c	Move FilesEntitlements validation to a separate class (#127703 ) Moves FilesEntitlements validation to a separate class. This is the final PR to make EntitlementsInitialization a simpler "orchestrator" of the various steps in the initialization phase.	2025-05-05 17:41:22 +02:00
Lorenzo Dematté	23ab059252	[Entitlements] Extract instrumentation initialization to a separate class (#127702 )	2025-05-05 16:08:18 +02:00
Ankit Sethi	94854b3a3f	Remove dangling spaces wherever found. (#127475 ) * Remove dangling spaces wherever found. This PR addresses #117067 , a report about unexpected spaces breaking message parsers built by customers. I used the regex `(\. \")(?![A-Z(a-z_0-9-;<%\/\.+ \t\n]+)` to detect such instances and clean up. In one case, a minor code improvement helps add optional spaces as necessary for a multi-sentence error message. * fix test * Update docs/changelog/127475.yaml * correct logic * fix test * fix tests * fix tests * fix tests * Update docs/changelog/127475.yaml * Update x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/action/TransportGetInferenceModelAction.java Co-authored-by: Slobodan Adamović <slobodanadamovic@users.noreply.github.com> * Update libs/x-content/src/main/java/org/elasticsearch/xcontent/ObjectParser.java Co-authored-by: Slobodan Adamović <slobodanadamovic@users.noreply.github.com> * correctly reference issue * Update docs/changelog/127475.yaml --------- Co-authored-by: Slobodan Adamović <slobodanadamovic@users.noreply.github.com>	2025-05-01 10:33:54 -05:00
Benjamin Trent	74faf47121	New bulk scorer for binary quantized vectors via optimized scalar quantization (#127189 ) * New bulk scorer for binary quantized vectors via optimized scalar quantization * fixing headers * fixing tests	2025-04-29 07:42:08 -04:00
Lorenzo Dematté	e9bedf1184	[Entitlements] Small docs fixes (#127323 )	2025-04-24 18:11:18 +02:00
Simon Cooper	c5ada66410	Copy Lucene99FlatVectorsReader allowing direct IO to be specified directly (#125921 ) We want to use DirectIO to access raw vector data randomly so it doesn't load everything into the page cache	2025-04-24 11:00:30 +01:00
Lorenzo Dematté	002fef75ff	[Entitlements] Fix: consider case sensitiveness differences (#126990 ) Our path comparison for file access is string based, due to the fact that we need to support Paths created for different file systems/platforms. However, Windows files and paths are (sort of) case insensitive. This PR fixes the problem by abstracting String comparison operations and making them case sensitive or not based on the host OS.	2025-04-23 20:23:45 +02:00
Benjamin Trent	059f91c90c	Panama vector accelerated optimized scalar quantization (#127118 ) * Accelerates optimized scalar quantization with vectorized functions * Adding benchmark * Update docs/changelog/127118.yaml * adjusting benchmark and delta	2025-04-23 12:51:04 -04:00
Patrick Doyle	4d929ca986	Clean up PolicyManager and ScopeResolver tests (#127115 ) * Simplify PolicyManagerTests * Clean and simplify ScopeResolverTests	2025-04-23 08:57:57 -04:00

1119 commits