elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-04-22 06:07:55 -04:00

Author	SHA1	Message	Date
Yang Wang	fd9b290190	Use the current term in a logging where it is relevant (#116786 ) (#116901 ) As title says, this PR logs current term instead of last-accepted term in a logging message where the former is expected. (cherry picked from commit `131d3c1288`)	2024-11-18 11:10:02 +11:00
Benjamin Trent	ef5f439a93	[8.x] Add multi_dense_vector value access to scripts (#116610 ) (#116850 ) * Add multi_dense_vector value access to scripts (#116610) This adds value access to multi_dense_vector values in scripts. The users will get: - Count of vectors per field - Magnitudes of all the individual vectors - Access to each vector with an iterator I will happily take design critiques around how these are exposed in scripting. I initially thought of just providing direct `float[][]` access, but this seems to have some unfavorable behavior around creating a TON of garbage. The reason is that each field could have a different number of vectors, so allocating a new collection of `float[dim]` for every field seemed rough. Generally, when scripting or using the vectors, an iterator should be enough and I have the iterator backed by a simple buffer to keep garbage down. * fixing test	2024-11-16 02:22:27 +11:00
Brendan Cully	4a261ad2a7	Attempt to clean up index before remote transfer (#115142 ) (#116854 ) If a node crashes during recovery, it may leave temporary files behind that can consume disk space, which may be needed to complete recovery. So we attempt to clean up the index before transferring files from a recovery source. We attempt to load the latest snapshot of the target directory, which we supply to store's `cleanupAndVerify` method to remove any files not referenced by it. We treat a failure to load the latest snapshot as equivalent to an empty snapshot, which will cause `cleanupAndVerify` to purge the entire target directory and pull from scratch. Closes #104473	2024-11-15 10:07:16 +11:00
Andrei Dan	28d5ded166	[8.x] Fix testSearchAndRelocateConcurrently (#116806 ) (#116830 ) * Fix testSearchAndRelocateConcurrently (#116806) This aims to test we can search through replica shard relocations. However, the way the test was written it was sometimes also starting another data node. The concurrent search requests would sometimes hit this new node, before its cluster state was RECOVERED. The search action throws an exception when the cluster state is not recovered as it needs to be able to read the cluster state. This fixes the test to grab a copy of the bootstrapped nodes and use them when calling the _search API before the cluster (potentially) resizes. (cherry picked from commit `0be75e1b69`) Signed-off-by: Andrei Dan <andrei.dan@elastic.co> * compile	2024-11-15 07:25:02 +11:00
Mark J. Hoy	2459aa7016	add backport transport versions (#116827 ) (#116834 ) (cherry picked from commit `74e6009bb3`)	2024-11-15 05:44:40 +11:00
Aurélien FOUCRET	e4dbf3823a	Add tracking for query rule types (#116357 ) (#116820 ) * Add total rule type counts to list calls and xpack usage * Add feature * Update docs/changelog/116357.yaml * Fix docs test failure & update yaml tests * remove additional spaces --------- Co-authored-by: Mark J. Hoy <mark.hoy@elastic.co> (cherry picked from commit `1b03a96e52`) Co-authored-by: Kathleen DeRusso <kathleen.derusso@elastic.co>	2024-11-14 18:16:56 +01:00
David Kyle	37ef5f21a6	[8.x] [ML] Pass inference timeout to start deployment (#116725 ) (#116733 ) * [ML] Pass inference timeout to start deployment (#116725) Default inference endpoints automatically deploy the model on inference. The inference timeout is now passed to start model deployment so users can control that timeout * handle max time --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2024-11-15 03:11:10 +11:00
Luke Whiting	8126bf5a49	[8.x] Introduce Email Address Allow Lists For Watcher (#116672 ) (#116805 ) * Introduce Email Address Allow Lists For Watcher (#116672) * New setting plus mutual exclusiveness validation * New domain list checking * Email service tests * Documentation updates * PR Changes Fix comment * Backport missing Settings method for default value with validator	2024-11-15 02:15:12 +11:00
Carlos Delgado	809fd9e7d7	Remove unused method introduced in #113194 (#116793 ) (#116807 ) (cherry picked from commit `25223dddae`)	2024-11-14 14:15:07 +01:00
Carlos Delgado	161b7ef129	[8.x] Add Search Phase APM metrics (#113194 ) (#116751 )	2024-11-14 13:02:00 +01:00
Craig Taverner	f5246cda55	Use SearchStats instead of field.isAggregatable in data node planning (#115744 ) (#116800 ) Since ES|QL makes use of field-caps and only considers `isAggregatable` during Lucene pushdown, turning off doc-values disables Lucene pushdown. This is incorrect. The physical planning decision for Lucene pushdown is made during local planning on the data node, at which point `SearchStats` are known, and both `isIndexed` and `hasDocValues` are separately knowable. The Lucene pushdown should happen for `isIndexed` and not consider `hasDocValues` at all. This PR adds hasDocValues to SearchStats and then uses isIndexed and hasDocValues separately during local physical planning on the data nodes. This immediately cleared up one issue for spatial data, which could not push down a Lucene query when doc-values was disabled. Summary of what `isAggregatable` means for different implementations of `MappedFieldType`: * Default implementation of `isAggregatable` in `MappedFieldType` is `hasDocValues`, and does not consider `isIndexed` * All classes that extend `AbstractScriptFieldType` (eg. `LongScriptFieldType`) hard coded `isAggregatable` to `true`. This presumably means Lucene is happy to mimic having doc-values * `TestFieldType`, and classes that extend it, return the value of `fielddata`, so consider the field aggregatable if there is field-data. * `AggregateDoubleMetricFieldType` and `ConstantFieldType` hard coded to `true` * `DenseVectorFieldType` hard coded to `false` * `IdFieldType` return the value of `fieldDataEnabled.getAsBoolean()` In no case is `isIndexed` used for `isAggregatable`. However, for our Lucene pushdown of filters, `isIndexed` would make a lot more sense. But for pushdown of TopN, `hasDocValues` makes more sense. 
Summarising the results of the various options for the various field types, where `?` means configurable: | Class | isAggregatable | isIndexed | isStored | hasDocValues | | --- | --- | --- | --- | --- | | AbstractScriptFieldType | true | false | false | false | | AggregateDoubleMetricFieldType | true | true | false | false | | DenseVectorFieldType | false | ? | false | !indexed | | IdFieldType | fieldData | true | true | false | | TsidExtractingIdField | false | true | true | false | | TextFieldType | fieldData | ? | ? | false | | ? (the rest) | hasDocValues | ? | ? | ? | It has also been observed that we cannot push filters to source without checking `hasDocValues` when we use the `SingleValueQuery`. So this leads to three groups of conditions: | Category | require `indexed` | require `docValues` | | --- | --- | --- | | Filters(single-value) | true | true | | Filters(multi-value) | true | false | | TopN | true | true | And for all cases we will also consider `isAggregatable` as a disjunction to cover the script field types, leading to two possible combinations: * `fa.isAggregatable() || searchStats.isIndexed(fa.name()) && searchStats.hasDocValues(fa.name())` * `fa.isAggregatable() || searchStats.isIndexed(fa.name())`	2024-11-14 22:11:01 +11:00
Armin Braun	05f3ba3edd	Add singleton for noop BitSetFilterCache.Listener (#116753 ) (#116773 ) Noticed during a code review that added yet another one of these: We have quite a few instances of duplicate noop implementations, lets make tests a little less verbose here. Technically the constant is test-only but it felt right to just leave it on the interface.	2024-11-14 09:01:36 +11:00
Panagiotis Bailis	7cae545fed	Adding patch version from 8.16 for skip_inner_hits_search_source (#116741 )	2024-11-13 19:04:32 +02:00
Tanguy Leroux	d6b2425771	Fix TranslogDeletionPolicy when assertions are disabled (#116654 ) (#116714 ) Current code causes a NPE when assertions are disabled: the openTranslogRef is only non-null when assertions are enabled. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2024-11-14 01:31:12 +11:00
Simon Cooper	c441ada314	[8.x] Add a deprecation warning that the JSON format of non-detailed errors is changing in v9 (#116330 ) (#114739 )	2024-11-13 14:17:50 +00:00
Dimitris Rempapis	08f8312457	_validate request does not honour ignore_unavailable (#116656 ) (#116717 ) The IndicesOptions in the ValidateQueryRequest have been updated to encapsulate the following logic: if we target a closed index and ignore_unavailable=false, we get an IndexClosedException; otherwise, if the request contains ignore_unavailable=true, we safely skip the closed index.	2024-11-13 14:31:18 +02:00
Panagiotis Bailis	7d33c5c597	[8.x] Backporting propagating nested inner_hits to the parent compound retriever (#116707 )	2024-11-13 12:33:43 +01:00
Nikolaj Volgushev	7668eee283	Use retry logic and real file system in file settings ITs (#116392 ) (#116709 ) Several file-settings ITs fail (rarely) with exceptions like: ``` java.nio.file.AccessDeniedException: C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempFile-001.tmp -> C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempDir-002\config\operator\settings.json at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89) at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:317) at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293) at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144) at java.nio.file.Files.move(Files.java:1430) at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.writeJSONFile(SnaphotsAndFileSettingsIT.java:86) at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.testRestoreWithPersistedFileSettings(SnaphotsAndFileSettingsIT.java:321) ``` This happens in Windows file systems, due to a race condition where the file settings service is reading the settings file concurrently with the test trying to modify it (a no-go in Windows). 
It turns out we have already addressed this with a retry for one test suite (https://github.com/elastic/elasticsearch/pull/91863), plus addressed a related issue around mock windows file-systems misbehaving (https://github.com/elastic/elasticsearch/pull/92653). This PR extends the above fixes to all file-settings related ITs. (cherry picked from commit `91559da015`)	2024-11-13 21:30:51 +11:00
Lorenzo Dematté	cb4485e168	[Entitlements] External IT test for checkSystemExit (#116435 ) (#116705 )	2024-11-13 20:41:48 +11:00
Patrick Doyle	37edf70bda	Backport entitlement work up to #116473 to 8.x (#116613 ) * Add initial entitlement policy parsing (#114448) This change adds entitlement policy parsing with the following design: * YAML file for readability and re-use of our x-content parsers * hierarchical structure to group entitlements under a single scope * no general entitlements without a scope or for the entire project * Avoid double instrumentation via class annotation (#115398) * Move entitlement jars to libs (#115883) The distribution tools are meant to be CLIs. This commit moves the entitlements jar projects to the libs dir, under a single libs/entitlement root directory to keep the related jars together. * Entitlement tools: SecurityManager scanner (#116020) * Dynamic entitlement agent (#116125) * Refactor: treat "maybe" JVM options uniformly * WIP * Get entitlement running with bridge all the way through, with qualified exports * Cosmetic changes to SystemJvmOptions * Disable entitlements by default * Bridge module comments * Fixup forbidden APIs * spotless * Rename EntitlementChecker * Fixup InstrumenterTests * exclude recursive dep * Fix some compliance stuff * Rename asm-provider * Stop using bridge in InstrumenterTests * Generalize readme for asm-provider * InstrumenterTests doesn't need EntitlementCheckerHandle * Better javadoc * Call parseBoolean * Add entitlement to internal module list * Docs as requested by Lorenzo * Changes from Jack * Rename ElasticsearchEntitlementChecker * Remove logging javadoc * exportInitializationToAgent should reference EntitlementInitialization, not EntitlementBootstrap. They're currently in the same module, but if that ever changes, this code would have become wrong. * Some suggestions from Mark --------- Co-authored-by: Ryan Ernst <ryan@iernst.net> * Remove unused EntitlementInternals (#116473) * Revert "Entitlement tools: SecurityManager scanner (#116020)" This reverts commit `023fb663de`. 
--------- Co-authored-by: Jack Conradson <osjdconrad@gmail.com> Co-authored-by: Lorenzo Dematté <lorenzo.dematte@elastic.co> Co-authored-by: Ryan Ernst <ryan@iernst.net>	2024-11-13 05:36:55 +11:00
Ying Mao	a49309f9f4	Hides `hugging_face_elser` service from the `GET _inference/_services API` (#116664 ) (#116677 ) * Adding hideFromConfigurationApi flag * Update docs/changelog/116664.yaml	2024-11-13 04:31:01 +11:00
Ying Mao	2ec5299460	Adds support for `input_type` field to Vertex inference service (#116431 ) (#116673 ) * Adding input type to google vertex ai service * Update docs/changelog/116431.yaml * PR feedback - backwards compatibility * Fix lint error (cherry picked from commit `7039a1dc8c`)	2024-11-13 04:13:04 +11:00
elasticsearchmachine	77881c697d	Bump versions after 8.16.0 release	2024-11-12 16:47:20 +00:00
Ignacio Vera	8e35324b8d	Deduplicate DocValueFormat objects from InternalAggregation when deserializing (#116640 ) (#116670 )	2024-11-13 03:06:37 +11:00
elasticsearchmachine	1ce95bbdd9	Bump versions after 8.15.4 release	2024-11-12 12:17:04 +00:00
Kostas Krikellas	de1db9877f	[8.x] Refactor DocumentDimensions to RoutingFields (#116321 ) (#116604 ) * Refactor DocumentDimensions to RoutingFields (#116321) * Refactor DocumentDimensions to RoutingFields * update * add test * add test * updates from review * updates from review * spotless * remove final from subclass * fix final (cherry picked from commit `2054357902`) # Conflicts: # server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java * fix imports	2024-11-12 21:19:43 +11:00
Ignacio Vera	41e07cad23	Deduplicate non-empty InternalAggregation metadata when deserializing (#116589 ) (#116635 )	2024-11-12 18:49:45 +11:00
Keith Massey	7c3d4027cd	Adding a deprecation info API warning for data streams with old indices (#116447 ) (#116626 ) * Adding a deprecation info API warning for data streams with old indices (#116447) * removing use of a method not available in 8.x	2024-11-12 11:26:32 +11:00
Lorenzo Dematté	d698e72af3	[8.x] Add a cluster listener to fix missing system index mappings after upgrade (#115771 ) This PR modifies `TransportVersionsFixupListener` to include all of compatibility versions (not only TransportVersion) in the fixup. `TransportVersionsFixupListener` spots the instances when the master has been upgraded to the most recent code version, along with non-master nodes, but some nodes are missing a "proper" (non-inferred) Transport version. This PR adds another check to also ensure that we have real (non-empty) system index mapping versions. To do so, it modifies NodeInfo so it carries all of CompatibilityVersions (TransportVersion + SystemIndexDescriptor.MappingVersions). This was initially done via a separate fixup listener + ad-hoc transport action, but the 2 listeners "raced" to update ClusterState on the same CompatibilityVersions structure; it just made sense to do it at the same time. The fixup is very similar to https://github.com/elastic/elasticsearch/pull/110710, which does the same for cluster features; plus, it adds a CI test to cover the bug raised in https://github.com/elastic/elasticsearch/issues/112694 Closes https://github.com/elastic/elasticsearch/issues/112694	2024-11-12 05:45:10 +11:00
Lorenzo Dematté	67231ab0d8	Adding full CompatibilityVersions to NodeInfo (#116582 ) Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2024-11-12 03:05:38 +11:00
Benjamin Trent	f14c8bd306	Add new multi_dense_vector field for brute-force search (#116275 ) (#116526 ) This adds a new `multi_dense_vector` field that focuses on the maxSim usecase provided by Col[BERT|Pali]. Indexing vectors in HNSW as it stands makes no sense, either performance-wise or for cost. However, we should totally support rescoring and brute-force search over vectors with maxSim. This is step one of many. Behind a feature flag, this adds support for indexing any number of vectors of the same dimension. Supports bit/byte/float. Scripting support will be a follow up. Marking as non-issue as it's behind a flag and unusable currently. (cherry picked from commit `7369c0818d`) Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2024-11-12 01:02:39 +11:00
Armin Braun	2b5faa9499	Two small improvements to IndexNameExpressionResolver (#116552 ) (#116563 ) Not using an iterator loop for the mostly single item list saves measurable runtime in the benchmarks for the resolver. Also, cleaned up a redundant method argument.	2024-11-11 22:43:12 +11:00
Benjamin Trent	308ad0c05f	[8.x] Add docvalue_fields Support for dense_vector Fields (#114484 ) (#116491 ) * Add `docvalue_fields` Support for `dense_vector` Fields (#114484) Currently dense_vector fields don't support docvalue_fields. This adds this support for debugging purposes. Users can inspect row values of their vectors even if the source is disabled. Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co> (cherry picked from commit `c8a8d4d931`) * fixing for backport --------- Co-authored-by: Rassyan <yjkhngds@gmail.com>	2024-11-09 08:15:13 +11:00
Jake Landis	8adb2c4043	[8.x] Add a monitor_stats privilege and allow that privilege for remote cluster privileges (#114964 ) (#116517 ) * Add a monitor_stats privilege and allow that privilege for remote cluster privileges (#114964) This commit does the following: * Add a new monitor_stats privilege * Ensure that monitor_stats can be set in the remote_cluster privileges * Give Kibana the ability to remotely call monitor_stats via RCS 2.0 Since this is the first case where there is more than 1 remote_cluster privilege, the following framework concerns have been added: * Ensure that when sending to older RCS 2.0 clusters, which previously only supported all-or-nothing remote_cluster blocks, we don't send the new privilege * Ensure that when sending API key role descriptors that contain remote_cluster, we don't send the new privileges for RCS 1.0/2.0 if it is not new enough * Fix and extend the BWC tests for RCS 1.0 and RCS 2.0 (cherry picked from commit `af99654dac`) * adjust bwc for 8.x branch	2024-11-09 06:26:14 +11:00
Benjamin Trent	4eb1c00535	Adjust analyze limit exception to be a bad_request (#116325 ) (#116495 ) The exception is due to large input on the user and is resolvable by either the user adjusting their request or changing their cluster settings. So a user focused error is preferred. I chose bad_request as it seemed like the best fit. closes: https://github.com/elastic/elasticsearch/issues/116323	2024-11-09 03:05:46 +11:00
Ignacio Vera	fc120f7708	Deduplicate the name of the aggregation when deserializing InternalAggregation (#116307 ) (#116457 )	2024-11-08 16:48:45 +01:00
Andrei Dan	d75ed26899	Validate missing shards after the coordinator rewrite (#116382 ) (#116489 ) The coordinator rewrite can skip searching shards when the query filters on `@timestamp`, event.ingested or the _tier field. We currently check for missing shards across all the indices that the query is running against; however, some shards/indices might not play a role in the query at all after the coordinator rewrite. This moves the check for missing shards after we've run the coordinator rewrite so we validate only the shards that will be searched by the query. (cherry picked from commit `cd2433d60c`) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2024-11-09 02:43:33 +11:00
Aurélien FOUCRET	347b7fe369	[8.x] Add kql query to the DSL (#116262 ) (#116482 ) * Add kql query to the DSL (#116262) (cherry picked from commit `e2c29f5487`) # Conflicts: # server/src/main/java/org/elasticsearch/rest/action/search/SearchCapabilities.java * Fix typo introduced during merge.	2024-11-09 01:50:49 +11:00
Nhat Nguyen	9497410147	Add num docs and size to logsdb telemetry (#116128 ) (#116270 ) Follow-up on #115994 to add telemetry for the total number of documents and size in bytes of logsdb indices. Relates #115994	2024-11-07 14:18:33 -08:00
Dimitris Rempapis	bfefe8d789	Fields caps does not honour ignore_unavailable (#116021 ) (#116430 ) The IndicesOptions in the FieldCapabilitiesRequest have been updated to encapsulate the following logic: if we target a closed index and ignore_unavailable=false, we get an IndexClosedException; otherwise, if the request contains ignore_unavailable=true, we safely skip the closed index. (cherry picked from commit `3ae7921fb0`)	2024-11-08 05:52:27 +11:00
Nhat Nguyen	c57f4526d4	Fallback to field-caps (#115977 ) (#116429 ) This change falls back to the old field-caps action if the remote cluster has not been updated to 8.16 or later.	2024-11-08 05:39:22 +11:00
Ignacio Vera	5b6387b7eb	Deduplicate the list of names when deserializing InternalTopMetrics (#116298 ) (#116417 ) use deduplication infrastructure to deduplicate the names of metrics in InternalTopMetrics.	2024-11-08 03:18:25 +11:00
Nikolaj Volgushev	fd97a9b4d2	[8.x] Fix race conditions in file settings service tests (#116309 ) (#116402 ) * Merge * Fix merge	2024-11-08 03:14:35 +11:00
Iván Cea Fontenla	22c0eab6dc	Aggs: Add real memory CB call when building internal aggregators in buckets (#116329 ) (#116393 ) Related to https://github.com/elastic/elasticsearch/issues/88128 This PR aims to reduce the potential OOMs received when building internal aggregations.	2024-11-07 22:29:23 +11:00
Matteo Piergiovanni	94498b4b41	[8.x] Better sizing BytesRef for Strings in Queries (#115655 ) (#116381 ) * Better sizing BytesRef for Strings in Queries (#115655) * Better sizing BytesRefs for Strings in Queries * Update docs/changelog/115655.yaml * iter * added test * iter * extracted method * iter --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> (cherry picked from commit `9ebe95a8a8`) * iter	2024-11-07 11:56:17 +01:00
Pooya Salehi	69df7fbfe1	Long balance computation should not delay new index primary assignment (#115511 ) (#116316 ) A long desired balance computation could delay a newly created index shard from being assigned, since the computation first has to finish before the assignments are published and the shards get assigned. With this change we add a new setting which allows setting a maximum time for a computation in case there are unassigned primary shards. Note that this is similar to how a new cluster state causes early publishing of the desired balance. Closes ES-9616 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2024-11-07 11:48:46 +01:00
Dan Rubinstein	0ac7f65096	Adding inference endpoint validation for AzureAiStudioService (#113713 ) (#116347 ) * Adding inference endpoint validation for AzureAiStudioService * Run spotlessApple * Update docs/changelog/113713.yaml * Remove isInClusterService from InferenceService * Run spotless apply --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2024-11-07 06:09:03 +11:00
Ignacio Vera	2a51685fba	Make InternalCentroid leaner (#116302 ) (#116334 ) We are currently holding to fields to extract values, this commit makes them abstract methods so we don't use any heap.	2024-11-07 03:30:11 +11:00
Benjamin Trent	616b3908a0	[8.x] Add support for bitwise inner-product in painless (#116082 ) (#116285 ) * Add support for bitwise inner-product in painless (#116082) This adds bitwise inner product to painless. The idea here is: - For two bit arrays, which we determine to be a byte array whose dimensions match `dense_vector.dim/8`, we simply return bitwise `&` - For a stored bit array (remember, with `dense_vector.dim/8` bytes), sum up the provided byte or float array using the bit array as a mask. This is effectively supporting asynchronous quantization. A prime example of how this works is: https://github.com/cohere-ai/BinaryVectorDB Basically, you do your initial search against the binary space and then rerank with a differently quantized vector allowing for more information without additional storage space. closes: https://github.com/elastic/elasticsearch/issues/111232 * removing unnecessary task adjustment --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2024-11-07 00:35:19 +11:00
Tim Brooks	735e6355a9	Parse bulk lines in individual steps (#114086 ) (#116210 ) Currently our incremental bulk parsing framework only parses once both the action line and document line are available. In addition, it will re-search lines for line delimiters as data is received. This commit ensures that the state is not lost in between parse attempts.	2024-11-05 12:26:10 -07:00
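
The two cases described in commit 616b3908a0 (bitwise inner-product) can be illustrated with a minimal Python sketch. This is not the Painless API or the Elasticsearch implementation; the function names `bit_dot_bits` and `bit_dot_values` are hypothetical, the bit/bit case is assumed to return the count of set bits in the bitwise `&`, and the most-significant-bit-first bit order is an assumption made for illustration only.

```python
from typing import Sequence


def bit_dot_bits(stored: bytes, query: bytes) -> int:
    """Both vectors are packed bit arrays (dim/8 bytes each).

    Assumption: the inner product is the number of positions where
    both bits are 1, i.e. the popcount of the bitwise AND.
    """
    if len(stored) != len(query):
        raise ValueError("bit vectors must have the same number of bytes")
    return sum(bin(s & q).count("1") for s, q in zip(stored, query))


def bit_dot_values(stored: bytes, query: Sequence[float]) -> float:
    """Stored vector is a packed bit array; query has one value per bit.

    The stored bits act as a mask: query values are summed only where
    the corresponding stored bit is 1.
    """
    if len(query) != len(stored) * 8:
        raise ValueError("query must have 8 values per stored byte")
    total = 0.0
    for i, value in enumerate(query):
        # Assumption: bit 0 of the vector is the most significant bit of byte 0.
        bit = (stored[i // 8] >> (7 - i % 8)) & 1
        if bit:
            total += value
    return total
```

The second function is the asymmetric case the commit message refers to: the stored side stays at one bit per dimension while the query keeps byte or float precision.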

14959 commits