elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-04-25 15:47:23 -04:00

Author	SHA1	Message	Date
István Zoltán Szabó	b507537bf0	[DOCS] Expands param descriptions for semantic_text (#114024) (#114055) Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co>	2024-10-04 04:13:11 +10:00
john-wagster	8a8ad1b815	updated rangetype to be more in line with the docs (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html) and added tests to reflect as much (#113872)	2024-10-02 01:40:55 +10:00
Simon Cooper	53d9c3cc6a	Add some information on locale database to the ES docs (#113587)	2024-09-30 09:28:13 +01:00
Kostas Krikellas	7b3d726eca	Revert "Apply auto-flattening to `subobjects: auto` (#112092)" (#113692) (#113760) * Revert "Apply auto-flattening to `subobjects: auto` (#112092)" This reverts commit `fffe8844` * fix DataGenerationHelper (cherry picked from commit `c9f378da29`) # Conflicts: # server/src/main/java/org/elasticsearch/index/mapper/DocumentParserContext.java	2024-09-30 18:19:26 +10:00
István Zoltán Szabó	cf55728d77	[DOCS] Improves semantic text documentation. (#113606) (#113611)	2024-09-27 00:34:37 +10:00
Kostas Krikellas	8539876663	[8.x] Apply auto-flattening to `subobjects: auto` (#113584) * Apply auto-flattening to `subobjects: auto` (#112092) * Introduce mode `subobjects=auto` for objects * Update docs/changelog/110524.yaml * compilation error * tests and fixes * refactor * spotless * more tests * fix nested objects * fix test * update fetch test * add QA coverage * update tests * update tests * update tests * Apply auto-flattening to `subobjects: auto` * Update docs/changelog/112092.yaml * sync * dont flatten subobjects auto * refine test * fix path for nested flattened objects and dynamic * document `subobjects: auto` * Apply suggestions from code review Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com> * comment updates * restore indentation in comment * update comment * update comment * update comment * update comment * rename isFlattenable * add test for dynamic template * fix copy_to and noop dynamic updates * tests * update comment * fix tests * update cluster feature in yaml test * address comments --------- Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com> (cherry picked from commit `fffe8844e9`) # Conflicts: # modules/dot-prefix-validation/build.gradle # rest-api-spec/build.gradle * Update build.gradle	2024-09-26 20:17:11 +10:00
Salvatore Campagna	bac208a154	Introduce an `ignore_above` index-level setting (#113121) (#113414) Here we introduce a new index-level setting, `ignore_above`, similar to what we have for `ignore_malformed`. The setting will apply to all `keyword`, `wildcard` and `flattened` fields. Each field mapping will still be allowed to override the index-level setting using a mapping-level `ignore_above` value. (cherry picked from commit `208a1fe571`)	2024-09-24 06:16:08 +10:00
Felix Barnsteiner	0aebbb53d6	[8.x] Add support for multi-value dimensions (#112645) (#113369) * Add support for multi-value dimensions (#112645) Closes https://github.com/elastic/elasticsearch/issues/110387 Having this in now affords us not having to introduce version checks in the ES exporter later. We can simply use the same serialization logic for metric attributes as we do for other signals. This also enables us to properly map `.ip` fields to the ip field type as ip fields containing a list of IPs are not converted to a comma-separated list. (cherry picked from commit `8d223cbf7a`) # Conflicts: # server/src/main/java/org/elasticsearch/index/mapper/TimeSeriesIdFieldMapper.java Remove skip test for 8.x This was just needed for 8.x to 9.0 compatibility tests	2024-09-24 00:05:25 +10:00
Stef Nestor	a4dba7db8d	(Doc+) Sparse Vectors NA to mapping analyzers (#112523) * retry	2024-09-05 09:19:19 -06:00
Simon Cooper	a36d90cf34	Use CLDR locale provider on JDK 23+ (#110222) JDK 23 removes the COMPAT locale provider, leaving CLDR as the only option. This commit configures Elasticsearch to use the CLDR provider when on JDK 23, but still use the existing COMPAT provider when on JDK 22 and below. This causes some differences in locale behaviour; this also adapts various tests to still work whether run on COMPAT or CLDR.	2024-09-04 13:42:40 +01:00
Ignacio Vera	3747765ab8	[DOC] geo_shape field type supports geo_hex aggregation (#112448)	2024-09-04 11:12:11 +02:00
István Zoltán Szabó	2c29a3ae0a	[DOCS] Highlights auto-chunking in intro of semantic text. (#111836)	2024-08-29 12:43:10 +02:00
Liam Thompson	4034615e29	[DOCS] Clarify copy_to behavior with strict dynamic mappings (#111408) * [DOCS] Clarify copy_to behavior with strict dynamic mappings * Add id * De-verbosify * Delete pesky comma * More info about root and nest * Fixes per review, clarify non-recursive explanation * Skip tests for illustrative example * Fix example syntax * Fix typo	2024-08-01 14:37:17 +02:00
Felix Barnsteiner	3090438037	Add support for boolean dimensions (#111457) Closes #111338	2024-07-31 23:00:32 +10:00
István Zoltán Szabó	1a5b008921	[DOCS] Clarifies semantic query behavior on sparse and dense vector fields (#111339) * [DOCS] Clarifies semantic query behavior on sparse and dense vector fields. * [DOCS] Adds a NOTE to the semantic query docs.	2024-07-26 16:53:38 +02:00
Carlos Delgado	ff3a77ca46	Clarify some semantic_text docs (#111329)	2024-07-26 16:45:29 +02:00
István Zoltán Szabó	22ead8d106	[DOCS] Documents automatic text chunking behavior for semantic text. (#111331)	2024-07-26 12:02:47 +02:00
Tommaso Teofili	9b86fd17aa	Document how to update dense vector field type (#111038)	2024-07-23 09:55:31 +02:00
Ioana Tagirta	e99aaad800	Document how to query for a specific feature within rank_features (#110749)	2024-07-11 16:19:14 +02:00
Oleksandr Kolomiiets	276ae121c2	Reflect latest changes in synthetic source documentation (#109501)	2024-07-04 09:48:04 -07:00
Carlos Delgado	30b32b6a46	semantic_text: Updated copy-to docs (#110350)	2024-07-03 10:18:40 +02:00
Kathleen DeRusso	7a1d532ffb	Pass over Sparse Vector docs for correctness (#110282) * Remove legacy mentions of text expansion queries * Add missing query_vector param to sparse_vector query docs * Fix formatting errors in sparse vector query dsl doc * Remove unnecessary test setup block	2024-07-02 13:37:25 -04:00
Felix Barnsteiner	cdbe092d90	Update docs now that keyword dimensions support ignore_above (#110385) This is a follow-up from https://github.com/elastic/elasticsearch/pull/110337	2024-07-02 17:04:57 +02:00
Benjamin Trent	5add44d7d1	Adds new `bit` element_type for dense_vectors (#110059) This commit adds `bit` vector support by adding `element_type: bit` for vectors. This new element type works for indexed and non-indexed vectors. Additionally, it works with `hnsw` and `flat` index types. No quantization-based codec works with this element type; this is consistent with `byte` vectors. `bit` vectors accept up to `32768` dimensions in size and expect vectors that are being indexed to be encoded either as a hexadecimal string or a `byte[]` array where each element of the `byte` array represents `8` bits of the vector. `bit` vectors support script usage and regular query usage. When indexed, all comparisons done are `xor` and `popcount` summations (aka Hamming distance), and the scores are transformed and normalized given the vector dimensions. Note, indexed bit vectors require `l2_norm` to be the similarity. For scripts, `l1norm` is the same as `hamming` distance and `l2norm` is `sqrt(l1norm)`. `dotProduct` and `cosineSimilarity` are not supported. Note, the dimensions expected by this element_type must always be divisible by `8`, and the `byte[]` vectors provided for indexing must have size `dim/8`, where each byte element represents `8` bits of the vector. closes: https://github.com/elastic/elasticsearch/issues/48322	2024-06-27 04:48:41 +10:00
Mayya Sharipova	5c87eef89d	[DOCS] Vectors with cosine automatically normalized (#110071) PR #99445 introduced automatic normalization of dense vectors with cosine similarity. This adds a note about this in the documentation. Relates to #99445	2024-06-22 22:32:25 +10:00
Oleksandr Kolomiiets	8bc5ecdc31	Support synthetic source together with ignore_malformed in histogram fields (#109882)	2024-06-20 09:09:45 -07:00
Oleksandr Kolomiiets	5440f178aa	Support synthetic source for geo_point when ignore_malformed is used (#109651)	2024-06-18 08:37:27 -07:00
Benjamin Trent	3aed0afb2b	Add new int4 quantization to dense_vector (#109317) This adds a new quantization mechanism for HNSW and flat indices. Here we add `int4` quantization via the `int4_hnsw` and `int4_flat` index types. This quantization methodology further reduces the memory required for fast HNSW, meaning that the memory required is 8x smaller than with regular float32 values. 8x reduction means that 1M 1024-dimension vectors go from requiring 3.8GB to 477MB. Recall stays steady; there is some reduction, which is recoverable via slight oversampling and reranking. For example, over 500k CohereV3 vectors, only 5 extra vectors are required to be gathered to achieve over 0.98 recall in a brute-force scenario.	2024-06-18 00:15:43 +10:00
Carlos Delgado	d10dfb4ac5	Add limitations section to semantic_text field type docs (#109666)	2024-06-13 15:19:00 +02:00
Oleksandr Kolomiiets	c847235ed0	Support synthetic source for scaled_float and unsigned_long when ignore_malformed is used (#109506)	2024-06-12 11:05:23 -07:00
Benjamin Trent	29288d6590	Merge remote-tracking branch 'upstream/main' into lucene_snapshot_9_11	2024-06-11 06:54:23 -04:00
Carlos Delgado	d975997a3a	Add semantic-text warning about inference endpoints removal (#109561)	2024-06-11 18:33:25 +10:00
Oleksandr Kolomiiets	a9f31bd2aa	Support synthetic source for date fields when ignore_malformed is used (#109410)	2024-06-10 10:26:31 -07:00
john-wagster	dd83b5b8d0	Multivalue Sparse Vector Support (#109007) Updated LuceneDocument to take advantage of looking up feature values on existing features and selecting the max when parsing multi-value sparse vectors	2024-06-04 12:50:58 -04:00
István Zoltán Szabó	95ce898436	[DOCS] Adds docs to semantic text (#108311) Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com> Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co> Co-authored-by: Kathleen DeRusso <kathleen.derusso@elastic.co>	2024-05-31 16:56:07 +02:00
Oleksandr Kolomiiets	42f4294a86	Enable fallback synthetic source for token_count (#109044)	2024-05-27 10:22:59 -07:00
Oleksandr Kolomiiets	eea996c172	Add synthetic source support for geo_shape via fallback implementation (#108881) This PR enables geo_shape mapper to use fallback synthetic source infrastructure and as such adds synthetic source support for this field type.	2024-05-24 10:19:22 -07:00
Oleksandr Kolomiiets	8cfdbcc9a4	Documentation for ignore_malformed support with synthetic source for aggregate_metric_double (#108983)	2024-05-24 09:49:38 -07:00
Kathleen DeRusso	7f35f1bed0	Add sparse_vector query (#108254) --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>	2024-05-22 17:06:57 -04:00
Oleksandr Kolomiiets	91d502cec6	Add generic fallback implementation for synthetic source (#108222) This PR uses infrastructure from #107567 to implement a fallback implementation of synthetic source for field mappers that don't support it natively. In that case we will store source of such field as is in a separate stored field.	2024-05-21 11:30:30 -07:00
Oleksandr Kolomiiets	a454ac1987	Do not produce infinity values in synthetic source for range fields (#108699)	2024-05-17 09:19:14 -07:00
Thomas Neirynck	6020bc7e06	[Docs] Add warning kibana has incomplete support for nested fields (#107971)	2024-05-13 08:42:21 -04:00
Oleksandr Kolomiiets	c3d45b99f2	Document binary field defaults in TSDB indices (#108046)	2024-04-30 08:02:16 -07:00
Benjamin Trent	67748cf616	Adding docs about scaled_float saturation with long values (#107966)	2024-04-30 08:25:37 -04:00
eyalkoren	ee262954ee	Adding aggregations support for the `_ignored` field (#101373) Enables aggregations on the _ignored metadata field replacing the stored field with doc values.	2024-04-29 16:41:34 +02:00
Oleksandr Kolomiiets	e1d902d33b	Implement synthetic source support for annotated text field (#107735) This PR adds synthetic source support for annotated_text fields. Existing implementation for text is reused including test infrastructure so the majority of the change is moving and making things accessible. Contributes to #106460, #78744.	2024-04-25 10:31:27 -07:00
Oleksandr Kolomiiets	cde894a5ce	Implement synthetic source support for range fields (#107081) * Implement synthetic source support for range fields This PR adds basic synthetic source support for range fields. There are following notable properties of synthetic source produced: * Ranges are always normalized to be inclusive on both ends (this is how they are stored). * Original order of ranges is not preserved. * Date ranges are always expressed in epoch millis, format is not preserved. * IP ranges are always expressed as a range of IPs while it could have been originally provided as a CIDR. This PR only implements retrieval of data for source reconstruction from doc values.	2024-04-24 11:32:20 -07:00
Oleksandr Kolomiiets	8ed92db288	Add synthetic source support for binary fields (#107549) Add synthetic source support for binary fields	2024-04-22 10:06:39 -07:00
Liam Thompson	33a71e3289	[DOCS] Refactor book-scoped variables in `docs/reference/index.asciidoc` (#107413) * Remove `es-test-dir` book-scoped variable * Remove `plugins-examples-dir` book-scoped variable * Remove `:dependencies-dir:` and `:xes-repo-dir:` book-scoped variables - In `index.asciidoc`, two variables (`:dependencies-dir:` and `:xes-repo-dir:`) were removed. - In `sql/index.asciidoc`, the `:sql-tests:` path was updated to fuller path - In `esql/index.asciidoc`, the `:esql-tests:` path was updated idem * Replace `es-repo-dir` with `es-ref-dir` * Move `:include-xpack: true` to few files that use it, remove from index.asciidoc	2024-04-17 14:37:07 +02:00
Carlos Delgado	f8e516eb9c	Update sparse_vector docs on index version availability (#107315)	2024-04-10 17:41:42 +02:00
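
Commit 5add44d7d1 in the table above describes `bit` vectors as packed eight bits per byte, with dimensions divisible by `8`, a `byte[]` of size `dim/8`, and comparison by `xor` plus `popcount` (Hamming distance). The following is a minimal Python sketch of that idea, not the Elasticsearch implementation: the function names `pack_bits` and `hamming` are hypothetical, and the most-significant-bit-first packing order is an assumption made for illustration.

```python
import math


def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, 8 bits per byte.

    Assumption: the first bit of each group becomes the most significant
    bit of its byte. The commit message does not state the bit order.
    """
    if len(bits) % 8 != 0:
        raise ValueError("bit vector dimensions must be divisible by 8")
    packed = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | (bit & 1)
        packed.append(byte)
    return bytes(packed)


def hamming(a, b):
    """Hamming distance between two packed vectors: popcount of the xor."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same packed length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# A 16-dimension bit vector packs into 16 / 8 = 2 bytes.
v1 = pack_bits([1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0])
v2 = pack_bits([1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1])

hex_form = v1.hex()          # hexadecimal string encoding of the same bytes
l1 = hamming(v1, v2)         # per the commit, l1norm equals Hamming distance
l2 = math.sqrt(l1)           # per the commit, l2norm is sqrt(l1norm)
```

The sketch stops at the raw distance; how indexed scores are "transformed and normalized given the vector dimensions" is not spelled out in the commit message, so it is left out here.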

881 commits
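
The memory figures in commit 3aed0afb2b (int4 quantization) can be checked with simple arithmetic. This is a rough sketch that counts only the raw vector payload and assumes 32 bits per dimension for float32 and 4 bits per dimension for int4; it ignores the HNSW graph and any per-vector overhead, and the helper name `raw_vector_bytes` is hypothetical.

```python
GIB = 1024 ** 3  # bytes per gibibyte


def raw_vector_bytes(num_vectors, dims, bits_per_dim):
    """Bytes needed for the vector values alone (no graph, no metadata)."""
    return num_vectors * dims * bits_per_dim // 8


float32_bytes = raw_vector_bytes(1_000_000, 1024, 32)
int4_bytes = raw_vector_bytes(1_000_000, 1024, 4)

reduction = float32_bytes / int4_bytes   # 32 bits vs 4 bits per dimension
float32_gib = float32_bytes / GIB        # roughly 3.8
int4_gib = int4_bytes / GIB              # roughly 0.477
```

Under these assumptions the payload shrinks by a factor of 8, from about 3.8 to about 0.477 in units of 2^30 bytes, which is close to the 3.8GB and 477MB quoted in the commit message.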