elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-06-28 17:34:17 -04:00

Author	SHA1	Message	Date
Parker Timmins	9aaba25d58	Simple version of patterned_text with a single doc value for arguments (#129292 ) Initial version of patterned_text mapper. Behaves similarly to match_only_text. This version uses a single SortedSetDocValues for a template and another for arguments. It splits the message by delimiters, then classifies a token as an argument if it contains a digit. All arguments are concatenated and inserted as a single doc value. A single inverted index is used, without positions. Phrase queries are still possible, using the SourceConfirmedTextQuery, but are not fast.	2025-06-25 21:31:32 -05:00
Jordan Powers	5d1999781a	Use optimized text in match_only_text fields (#129371 ) Follow-up to #126492 to use the json parsing optimizations for match_only_text fields. Relates to #129072.	2025-06-17 08:15:40 -07:00
Martijn van Groningen	a0cc698fa2	Update multi field stored by default index version check (#129386 ) Relates to #129126	2025-06-17 12:20:38 +02:00
Simon Cooper	3988ee1935	Check positions on MultiPhraseQueries as well as phrase queries (#129326 )	2025-06-12 16:05:07 +01:00
Ignacio Vera	f02a3c423f	Revert "Use IndexOrDocValuesQuery in NumberFieldType#termQuery implementations (#128293 )" (#129206 ) This reverts commit `de7c91c1d9`.	2025-06-12 10:10:29 +02:00
Martijn van Groningen	33af83a0ca	Synthetic source: avoid storing multi fields of type text and match_only_text by default. (#129126 ) Don't store text and match_only_text fields by default when source mode is synthetic and a field is a multi field or when there is a suitable multi field. Without this change, ES would otherwise store the field twice in a multi-field configuration. For example: ``` ... "os": { "properties": { "name": { "ignore_above": 1024, "type": "keyword", "fields": { "text": { "type": "match_only_text" } } } ... ``` In this case, two stored fields were added, one for the `name` field and one for the `name.text` multi-field. This change prevents this and never adds a stored field when a text or match_only_text field is a multi-field.	2025-06-10 16:32:47 +02:00
Benjamin Trent	2a44166a2c	Applying Apache Lucene fix: https://github.com/apache/lucene/pull/14732 (#128671 ) * Applying Apache Lucene fix: https://github.com/apache/lucene/pull/14732 * fixing test * fixing annot	2025-06-02 09:50:25 -04:00
Ignacio Vera	de7c91c1d9	Use IndexOrDocValuesQuery in NumberFieldType#termQuery implementations (#128293 )	2025-05-23 16:58:50 +02:00
Oleksandr Kolomiiets	0c1b3acee2	Properly handle multi fields in block loaders with synthetic source enabled (#127483 )	2025-04-30 09:33:35 -07:00
Benjamin Trent	3d67e0e7ca	Fix npe when using source confirmed text query against missing field (#127414 ) We should check for the field and statistics actually existing when checking matches and explanation with `match_only_text` fields closes: https://github.com/elastic/elasticsearch/issues/125635	2025-04-30 03:05:01 +10:00
Oleksandr Kolomiiets	26e2261132	Remove legacy block loader test infrastructure (#127273 )	2025-04-25 10:26:27 -07:00
Oleksandr Kolomiiets	5e2b199b94	[TEST] Move test data generation out of logsdb namespace (#119994 )	2025-04-23 08:29:32 -07:00
Jordan Powers	71e74bdd66	Store arrays offsets for scaled float fields natively with synthetic source (#125793 ) This patch builds on the work in #113757, #122999, #124594, #125529, and #125709 to natively store array offsets for scaled float fields instead of falling back to ignored source when synthetic_source_keep: arrays.	2025-03-28 20:26:29 +01:00
Oleksandr Kolomiiets	033d28e792	Use FallbackSyntheticSourceBlockLoader for shape and geo_shape (#124927 )	2025-03-18 08:49:08 -07:00
Nik Everett	50aaa1c2a6	ESQL: Pragma to load from stored fields (#122891 ) This creates a `pragma` you can use to request that fields load from a stored field rather than doc values. It implements that pragma for `keyword` and number fields. We expect that, for some disk configuration and some number of fields, it's faster to load those fields from _source or stored fields than it is to use doc values. Our default is doc values and on my laptop it's always faster to use doc values. But we don't ship my laptop to every cluster. This will let us experiment and debug slow queries by trying to load fields a different way. You access this pragma with: ``` curl -HContent-Type:application/json -XPOST localhost:9200/_query?pretty -d '{ "query": "FROM foo", "pragma": { "field_extract_preference": "STORED" } }' ``` On a release build you'll need to add `"accept_pragma_risks": true`.	2025-03-12 09:40:42 -04:00
Oleksandr Kolomiiets	99262c6256	Use FallbackSyntheticSourceBlockLoader for boolean and date fields (#124050 )	2025-03-05 11:43:47 -08:00
Gal Lalouche	a6e47ae85b	Refactor FieldCapabilities creation by adding a proper builder object (#121310 ) Reduce boilerplate associated with creating `FieldCapabilities` instances. Since it's a class with a huge number of fields, it makes sense to define a builder object, as that can also help with all the Boolean and null blindness going on. Note while there is a static Builder class in `FieldCapabilities`, it is not a proper builder object (no setters, still need to pass a lot of otherwise default parameters) and also package-private. To avoid changing that, I defined a new `FieldCapabilitiesBuilder` class. I also went over the code and refactored places which used the old constructor.	2025-03-05 13:09:36 +01:00
Martijn van Groningen	086329c5cb	Tidy up some noise during indexing with synthetic source. (#123724 )	2025-02-28 16:52:17 +00:00
kanoshiou	7326928502	Fix failed ScaledFloatFieldMapperTests (#123144 )	2025-02-21 11:34:46 -08:00
kanoshiou	de41d5704b	ESQL: Fix precision of `scaled_float` field values retrieved from stored source (#122586 )	2025-02-20 14:01:34 -08:00
Oleksandr Kolomiiets	ba8c5764f8	Use FallbackSyntheticSourceBlockLoader for unsigned_long and scaled_float fields (#122637 )	2025-02-18 09:28:26 -08:00
Oleksandr Kolomiiets	b8d7e99cb9	Use FallbackSyntheticSourceBlockLoader for number fields (#122280 )	2025-02-12 16:12:19 -08:00
Chris Hegarty	4baffe4de1	Upgrade to Lucene 10.1.0 (#119308 ) This commit upgrades to Lucene 10.1.0.	2025-01-30 13:41:02 +00:00
Kostas Krikellas	8de9539e29	Lazy initialization for `SyntheticSourceSupport.loader()` (#120896 ) * Lazy initialization for `SyntheticSourceSupport.loader()` * [CI] Auto commit changes from spotless * add missing --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>	2025-01-27 17:12:42 +02:00
Rene Groeschke	ba61f8c7f7	Update Gradle wrapper to 8.12 (#118683 ) This updates the Gradle wrapper to 8.12. We addressed deprecation warnings due to the update, including: - Fix change in TestOutputEvent api - Fix deprecation in groovy syntax - Use latest ospackage plugin containing our fix - Remove project usages at execution time - Fix deprecated project references in repository-old-versions	2024-12-30 15:34:24 +01:00
Armin Braun	e94f145350	Fix a bunch of non-final static fields (#119185 ) Fixing almost all missing `final` spots, who knows maybe we get a small speedup from some constant folding here and there.	2024-12-26 19:14:36 +01:00
Dimitris Rempapis	a514aad3c2	Fix/meta fields bad request (#117229 ) A 400 rather than a 5xx error is returned when _source / _seq_no / _feature / _nested_path / _field_names is requested via fields	2024-12-03 10:58:20 +02:00
Oleksandr Kolomiiets	54db947020	Fix scaled_float test (#117662 )	2024-11-28 07:33:35 -08:00
Oleksandr Kolomiiets	2b8e4e727c	Migrate mapper-related modules to internal-*-rest-test (#117298 )	2024-11-23 00:35:24 +00:00
Rene Groeschke	f6ac6e1c3b	[Build] Remove deprecated BuildParams (#116984 )	2024-11-22 16:30:57 +01:00
Rene Groeschke	13c8aaeffa	[Gradle] Remove static use of BuildParams (#115122 ) Static fields don't do well in Gradle with configuration cache enabled. - Use buildParams extension in build scripts - Keep BuildParams.ci for now for easy serverless migration - Tweak testing doc	2024-11-15 17:58:57 +01:00
Kostas Krikellas	4573ab8ec1	[TEST] Replace _source.mode with index.mapping.source.mode in integration tests - take 2 (#116072 ) * Reapply "[TEST] Replace _source.mode with index.mapping.source.mode in integra…" (#116069) This reverts commit `e8bf344a28`. * [TEST] Replace _source.mode with index.mapping.source.mode in integration tests * add reason * add reason * spotless * revert unneeded	2024-11-04 09:39:34 +02:00
Kostas Krikellas	e8bf344a28	Revert "[TEST] Replace _source.mode with index.mapping.source.mode in integra…" (#116069 ) This reverts commit `a360757968`.	2024-11-01 10:53:08 +02:00
Kostas Krikellas	a360757968	[TEST] Replace _source.mode with index.mapping.source.mode in integration tests (#115926 ) * Replace _source.mode with index.mapping.source.mode in integration tests * fix tests * revert 40_source_mode_setting.yml	2024-11-01 09:46:06 +02:00
Nhat Nguyen	f3b34f3e34	Remove old synthetic source mapping config (#115889 ) This change replaces the old synthetic source config in mappings with the newly introduced index setting. Closes #115859	2024-10-30 09:15:16 -07:00
Martijn van Groningen	387062eb80	Sometimes delegate to SourceLoader in ValueSourceReaderOperator for required stored fields (#115114 ) If source is required by a block loader then the StoredFieldsSpec that gets populated should be enhanced by SourceLoader#requiredStoredFields(...) in ValuesSourceReaderOperator. Otherwise, in case of synthetic source, many stored fields aren't loaded, which causes only a subset of _source to be synthesized. For example, unmapped fields or field values that exceed the configured ignore_above will not appear in _source. This happens when field types fall back to a block loader implementation that uses _source. The required field values are then extracted from the source once loaded. This change also reverts the production code changes introduced via #114903. That change only ensured that the _ignored_source field was added to the required list of stored fields. In reality more fields could be required. This change is a better fix, since it also handles other cases and the SourceLoader implementation indicates which stored fields are needed. Closes #115076	2024-10-23 10:20:42 +02:00
Luca Cavanna	8efd08b019	Upgrade to Lucene 10 (#114741 ) The most relevant ES changes that upgrading to Lucene 10 requires are: - use the appropriate IOContext - Scorer / ScorerSupplier breaking changes - Regex automaton are no longer determinized by default - minimize moved to test classes - introduce Elasticsearch900Codec - adjust slicing code according to the added support for intra-segment concurrency - disable intra-segment concurrency in tests - adjust accessor methods for many Lucene classes that became a record - adapt to breaking changes in the analysis area Co-authored-by: Christoph Büscher <christophbuescher@posteo.de> Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co> Co-authored-by: ChrisHegarty <chegar999@gmail.com> Co-authored-by: Brian Seeders <brian.seeders@elastic.co> Co-authored-by: Armin Braun <me@obrown.io> Co-authored-by: Panagiotis Bailis <pmpailis@gmail.com> Co-authored-by: Benjamin Trent <4357155+benwtrent@users.noreply.github.com>	2024-10-21 13:38:23 +02:00
Martijn van Groningen	c62a96c8ab	Include ignored source as part of loading field values in ValueSourceReaderOperator via BlockSourceReader. (#114903 ) Currently, in compute engine when loading source if source mode is synthetic, the synthetic source loader is already used. But the ignored_source field isn't always marked as a required source field, causing the source to potentially miss a lot of fields. This change includes the _ignored_source field as a required stored field and allows keyword fields without doc values or stored fields to be used in case of synthetic source. Relying on synthetic source to get the values (because a field doesn't have stored fields / doc values) is slow. In case of synthetic source we already keep ignored field/values in a special place, named ignored source. Long term, in case of synthetic source we should only load ignored source in case a field has no doc values or stored field, as is being explored in #114886, thereby avoiding synthesizing the complete _source in order to get only one field.	2024-10-18 07:49:00 +02:00
Oleksandr Kolomiiets	2c10a18774	Fix block loader tests for token_count (#113718 )	2024-10-01 10:25:26 -07:00
Chris Hegarty	32dde26e49	Upgrade to Lucene 9.12.0 (#113333 ) This commit upgrades to Lucene 9.12.0. Co-authored-by: Adrien Grand <jpountz@gmail.com> Co-authored-by: Armin Braun <me@obrown.io> Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com> Co-authored-by: Chris Hegarty <chegar999@gmail.com> Co-authored-by: John Wagster <john.wagster@elastic.co> Co-authored-by: Luca Cavanna <javanna@apache.org> Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>	2024-10-01 08:39:27 +01:00
Mark Vieira	a59c182f9f	Add AGPLv3 as a supported license	2024-09-13 15:29:46 -07:00
Kostas Krikellas	86a88d735f	Fix synthetic source field names for multi-fields (#112850 ) * Fix synthetic source field names for multi-fields * enable logsdb in randomized tests * Revert "enable logsdb in randomized tests" This reverts commit `2e2c22e2bb`. * Update docs/changelog/112850.yaml * fix	2024-09-13 15:00:55 +03:00
Oleksandr Kolomiiets	082e7211b3	Use fallback synthetic source for copy_to and doc_values: false cases (#112294 )	2024-09-10 12:12:51 -07:00
Kostas Krikellas	f3bc281978	Refactor build params for FieldMapper, adding SourceKeepMode (#112455 ) * Refactor build params for FieldMapper * more mappers and tests * more mappers * more mappers * spotless * spotless * stored by default * Revert "stored by default" This reverts commit `bbd247d64b`. * restore storeIgnored * sync * list valid values for SourceKeepMode * small refactoring * spotless	2024-09-06 14:16:17 +03:00
Oleksandr Kolomiiets	38adbb0724	Prevent synthetic field loaders accessing stored fields from using stale data (#112173 )	2024-08-27 14:55:00 -07:00
Luca Cavanna	915e4a50c5	Rename Mapper#name to Mapper#fullPath (#110040 ) This addresses a long standing TODO that caused quite a few bugs over time, in that the mapper name does not include its full path, while the MappedFieldType name does. We have renamed Mapper.Builder#name to leafName (#109971) and Mapper#simpleName to leafName (#110030). This commit renames Mapper#name to fullPath for clarity. This required some adjustments in FieldAliasMapper to avoid confusion between the existing path method and fullPath. I renamed path to targetPath for clarity. ObjectMapper already had a fullPath method that returned name, and was effectively a copy of name, so it could be removed.	2024-06-21 22:47:27 +02:00
Luca Cavanna	54e7b4d93b	Rename Mapper#simpleName to Mapper#leafName (#110030 ) This addresses a long standing TODO that caused quite a few bugs over time, in that the mapper name does not include its full path, while the MappedFieldType name does. We have a method called simpleName to signal that, but leafName signals that more clearly and aligns with the name we have recently introduced in Mapper.Builder (renamed from name to leafName). Relates to #109971	2024-06-21 14:28:36 +02:00
Luca Cavanna	15c7abe111	Rename Mapper#name to Mapper#leafName (#109971 ) This addresses a long standing TODO that caused quite a few bugs over time, in that the mapper name does not include its full path, while the MappedFieldType name does.	2024-06-21 11:48:17 +02:00
Oleksandr Kolomiiets	60a34f2a90	Do not use nested arrays as malformed values in scaled_float and unsigned_long synthetic source tests (#109650 ) They don't provide any additional value because arrays are parsed at the level above and tests already cover arrays. Fixes #109649.	2024-06-13 07:15:32 +10:00
Oleksandr Kolomiiets	c847235ed0	Support synthetic source for scaled_float and unsigned_long when ignore_malformed is used (#109506 )	2024-06-12 11:05:23 -07:00

359 commits