* Default new semantic_text fields to use BBQ when models are compatible
* Update docs/changelog/126629.yaml
* Gate default BBQ by IndexVersion
* Cleanup from PR feedback
* PR feedback
* Fix test
* Fix test
* PR feedback
* Update test to test correct options
* Hack alert: Fix issue where mapper service was always being created with current index version
This change integrates the new model registry with the `SemanticTextFieldMapper`, allowing inference IDs to be eagerly resolved at parse time.
It also preserves the existing lenient behavior: no error is thrown if the specified inference id does not exist, only a warning is logged.
* Revert endpoint creation validation for ELSER and E5
* Update docs/changelog/126792.yaml
* Revert start model deployment being in TransportPutInferenceModelAction
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
We have observed many OOMs due to the memory required to inject chunked inference results for semantic_text fields. This PR uses coordinating indexing pressure to account for this memory usage. When indexing pressure memory usage exceeds the threshold set by indexing_pressure.memory.limit, chunked inference result injection will be suspended to prevent OOMs.
I noticed that we tend to create the flag instance and call this method
everywhere. This doesn't compile the same way as a real boolean constant
unless you're running with `-XX:+TrustFinalNonStaticFields`.
For most of the code spots changed here that's irrelevant but at least
the usage in the mapper parsing code is a little hot and gets a small
speedup from this potentially.
Also we're simply wasting some bytes for the static footprint of ES by
using the `FeatureFlag` indirection instead of just a boolean.
Add support for Cohere Task Settings and Truncate, through
the Amazon Bedrock provider integration.
Task Settings can now be passed bother during Inference endpoint
creation and Inference POST requests.
Close#126156
* Adding validation to ElasticsearchInternalService
* Update docs/changelog/123044.yaml
* [CI] Auto commit changes from spotless
* Removing checkModelConfig
* Fixing IT
* [CI] Auto commit changes from spotless
* Remove DeepSeek checkModelConfig and fix tests
* Cleaning up comments, updating validation input type, and moving model deployment starting to model validator
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Transport actions have associated request and response classes. However,
the base type restrictions are not necessary to duplicate when creating
a map of transport actions. Relatedly, the ActionHandler class doesn't
actually need strongly typed action type and classes since they are lost
when shoved into the node client map. This commit removes these type
restrictions and generic parameters.
* test
* Revert "test"
This reverts commit 9f4e2adba0.
* Refactor InferenceService to allow passing in chunking settings
* Add chunking config to inference field metadata and store in semantic_text field
* Fix test compilation errors
* Hacking around trying to get ingest to work
* Debugging
* [CI] Auto commit changes from spotless
* POC works and update TODO to fix this
* [CI] Auto commit changes from spotless
* Refactor chunking settings from model settings to field inference request
* A bit of cleanup
* Revert a bunch of changes to try to narrow down what broke CI
* test
* Revert "test"
This reverts commit 9f4e2adba0.
* Fix InferenceFieldMetadataTest
* [CI] Auto commit changes from spotless
* Add chunking settings back in
* Update builder to use new map
* Fix compilation errors after merge
* Debugging tests
* debugging
* Cleanup
* Add yaml test
* Update tests
* Add chunking to test inference service
* Trying to get tests to work
* Shard bulk inference test never specifies chunking settings
* Fix test
* Always process batches in order
* Fix chunking in test inference service and yaml tests
* [CI] Auto commit changes from spotless
* Refactor - remove convenience method with default chunking settings
* Fix ShardBulkInferenceActionFilterTests
* Fix ElasticsearchInternalServiceTests
* Fix SemanticTextFieldMapperTests
* [CI] Auto commit changes from spotless
* Fix test data to fit within bounds
* Add additional yaml test cases
* Playing with xcontent parsing
* A little cleanup
* Update docs/changelog/121041.yaml
* Fix failures introduced by merge
* [CI] Auto commit changes from spotless
* Address PR feedback
* [CI] Auto commit changes from spotless
* Fix predicate in updated test
* Better handling of null/empty ChunkingSettings
* Update parsing settings
* Fix errors post merge
* PR feedback
* [CI] Auto commit changes from spotless
* PR feedback and fix Xcontent parsing for SemanticTextField
* Remove chunking settings check to use what's passed in from sender service
* Fix some tests
* Cleanup
* Test failure whack-a-mole
* Cleanup
* Refactor to handle memory optimized bulk shard inference actions - this is ugly but at least it compiles
* [CI] Auto commit changes from spotless
* Minor cleanup
* A bit more cleanup
* Spotless
* Revert change
* Update chunking setting update logic
* Go back to serializing maps
* Revert change to model settings - source still errors on missing model_id
* Fix updating chunking settings
* Look up model if null
* Fix test
* Work around https://github.com/elastic/elasticsearch/issues/125723 in semantic text field serialization
* Add BWC tests
* Add chunking_settings to docs
* Refactor/rename
* Address minor PR feedback
* Add test case for null update
* PR feedback - adjust refactor of chunked inputs
* Refactored AbstractTestInferenceService to return offsets instead of just Strings
* [CI] Auto commit changes from spotless
* Fix tests where chunk output was of size 3
* Update mappings per PR feedback
* PR Feedback
* Fix problems related to merge
* PR optimization
* Fix test
* Delete extra file
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
ServerSentEvent is now a record with `event` and `data`, rather than
it being a record for value with a separate `ServerSentEventField`.
- `value` was renamed to `data`
- `hasValue` was renamed to `hasData`
- Parsing was refactored to read more like its spec: writing to a buffer
and flushing when we reach a blank newline
- We now support multiline data payloads
* Initial draft test with index version setup
* Adding test in phases
* [CI] Auto commit changes from spotless
* Adding test for search functionality
* Adding test for highlighting
* Adding randomization during selection process
* Fix code styles by running spotlessApply
* Fix code styles by running spotlessApply
* Fixing forbiddenAPIcall issue
* Decoupled namedWritables to use separate fake plugin and simplified other override methods
* Updating settings string to variable and removed unused code
* Fix SemanticQueryBuilder dependencies
* fix setting maximum number of tests to run
* utilizing semantci_text index version param and removed unwanted override
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
The chunked text is only required when the actual inference request is made,
using a string supplier means the string creation can be done much much closer
to where the request is made reducing the lifespan of the copied string.
The Elastic inference service removes the default models at startup if the node cannot access EIS.
Since #125242 we don't store default models in the cluster state but we still try to delete them.
This change ensures that we don't try to update the cluster state when a default model is deleted
since the delete is not performed on the master node and default models are never stored in the cluster state.
In preparation for integrating with SageMaker, we want to reuse the
existing SecretSettings.
- AmazonBedrockSecretSettings moved from services.amazonbedrock to
common.amazon.
- AmazonBedrockSecretSettings was renamed to AwsSecretSettings.
- accessKey and secretKey are now encapsulated.
Cohere embeddings are expected to be normalized to unit vectors, but due to floating point precision issues,
our check ({@link DenseVectorFieldMapper#isNotUnitVector(float)}) often fails.
This change fixes this bug by setting the default similarity for newly created Cohere inference endpoint to cosine.
Closes#122878
ModelRegistryMetadata has now been backported to 8.19 via #125150. This update ensures that we properly differentiate between nodes running 8.19.x (which supports the new custom metadata) and 9.0.x (which does not).
To achieve this, this PR introduces a new `supportsVersion(TransportVersion)` method for `NamedWriteable` and `NamedDiff`, allowing subclasses to customize their backward compatibility behavior.
When retrieving a default inference endpoint for the first time, the system automatically creates the endpoint.
However, unlike the `put inference model` action, the `get` action does not redirect the request to the master node.
Since #121106, we rely on the assumption that every model creation (`put model`) must run on the master node, as it modifies the cluster state. However, this assumption led to a bug where the get action tries to store default inference endpoints from a different node.
This change resolves the issue by preventing default inference endpoints from being added to the cluster state. These endpoints are not strictly needed there, as they are already reported by inference services upon startup.
**Note:** This bug did not prevent the default endpoints from being used, but it caused repeated attempts to store them in the index, resulting in logging errors on every usage.