elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-04-25 07:37:19 -04:00

Author	SHA1	Message	Date
Abdon Pijpelink	1612ad1d65	fix typo (#103149) (#103381) Fixed a typo and a small grammatical error in the explanation of the `null_value` option (cherry picked from commit `fa52f82838`) Co-authored-by: Nimrod Dolev <nimrodavid@gmail.com>	2023-12-13 07:17:00 -05:00
Chris Hegarty	ff22c90735	Merge branch 'main' into lucene_snapshot_9_9	2023-12-02 09:42:22 +00:00
Jorge Sanz	c622dad8dd	[Docs] Move coordinate note for geojson/wkt up to the beginning of the geo_shape page (#102857) * Move coordinate note for geojson/wkt up to the beginning of the page * Add links to GeoJSON and WKT specs	2023-12-01 15:20:42 +01:00
Benjamin Trent	f00364aefd	Add byte quantization for float vectors in HNSW (#102093) Adds new `quantization_options` to `dense_vector`. This allows for vectors to be automatically quantized to `byte` when indexed. Example: ``` PUT vectors { "mappings": { "properties": { "my_vector": { "type": "dense_vector", "index": true, "index_options": { "type": "int8_hnsw" } } } } } ``` When querying, the query vector is automatically quantized and used when querying the HNSW graph. This reduces the memory required to only `25%` of what was previously required for `float` vectors at a slight loss of accuracy. This is currently only available when `index: true` and when using `hnsw`	2023-11-29 12:29:55 -05:00
amyjtechwriter	d25435e185	disabling source (#101839)	2023-11-07 13:43:28 +00:00
James Rodewig	4c69746c24	[DOCS] Update tech preview copy (#101606) Updates the copy for tech preview and experimental features in the Elasticsearch docs. Relates to https://github.com/elastic/docs/pull/2807	2023-10-31 10:31:07 -04:00
Carlos Delgado	f2dfbfe8c4	[DOCS] Add sparse-vector field type to docs, changed references (#100348)	2023-10-06 14:25:27 +02:00
Luca Cavanna	689a1e490a	Merge branch 'main' into lucene_snapshot_9_8	2023-10-02 13:56:12 +02:00
Kostas Krikellas	98b9e819ee	Represent histogram value count as long (#99912) * Represent histogram value count as long Histograms currently use integers to store the count of each value, which can overflow. Switch to using long integers to avoid this. TDigestState was updated to use long for centroid value count in #99491 Fixes #99820 * Update docs/changelog/99912.yaml * spotless fix	2023-09-29 12:30:55 +03:00
Benjamin Trent	92cea2797e	Add nested support for dense_vector fields and knn search (#99763) * Nested dense_vector support * Adjust nested support based on new lucene version * fixing after rebase * fixing some code * fixing tests adding transport version * spotless * [Automated] Update Lucene snapshot to 9.9.0-snapshot-b3e67403aaf * Adds new max_inner_product vector similarity function (#99527) Adds new max_inner_product vector similarity function. This differs from dot_product in the following ways: Doesn't require vectors to be normalized Scales the similarity between vectors differently to prevent negative scores * requiring top level filter to be parent filter * adding docs & fixing tests * adding and fixing docs * adding changelog * removing unnecessary file changes * removing unused imports * fixing test * maybe fix doc tests * continue tests in docs * fixing more tests * fixing tests --------- Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>	2023-09-28 11:38:04 -04:00
Luca Cavanna	15c87b681c	Merge branch 'main' into lucene_snapshot_9_8	2023-09-28 12:19:14 +02:00
Kostas Krikellas	137bb45662	Support runtime fields in synthetic source (#99796) * Support runtime fields in synthetic source * Update docs/changelog/99796.yaml * Introduce SyntheticSourceProvider * Address comments * More fixes * Fix checkstyle violation * More unittest updates * Use SourceProvider in MapperServiceTestCase * Remove runtime field from unittest * Update synthetic source doc	2023-09-26 14:29:56 +03:00
Luca Cavanna	b3e769987d	Merge branch 'main' into lucene_snapshot_9_8	2023-09-22 13:11:10 +02:00
Mayya Sharipova	ddf17e6be5	Increase the max vector dims to 4096 (#99682)	2023-09-20 15:43:40 -04:00
Benjamin Trent	dee85de61c	Adds new max_inner_product vector similarity function (#99527) Adds new max_inner_product vector similarity function. This differs from dot_product in the following ways: Doesn't require vectors to be normalized Scales the similarity between vectors differently to prevent negative scores	2023-09-20 20:51:46 +02:00
Benjamin Trent	83b70e37ef	Revert "Auto-normalize dot_product vectors at index & query (#98944)" (#99421) This reverts commit `7b9c367aeb`.	2023-09-11 09:33:17 -04:00
Kathleen DeRusso	258d0cb0be	Automatically map floats as dense vector (#98512)	2023-09-06 16:06:29 -04:00
Benjamin Trent	7b9c367aeb	Auto-normalize dot_product vectors at index & query (#98944) `dot_product` requires vectors to be unit-length. Previously, we would check that vectors were unit-length and throw if they were not. Instead, we will now auto-normalize vectors as they are indexed. `cosine` will continue to behave as usual, not normalizing the vectors. closes: https://github.com/elastic/elasticsearch/issues/98935	2023-08-30 09:50:49 -04:00
Carlos Delgado	2b838ae853	Dense vector field types are indexed by default (#98268) * First version * Spotless, I liked my version better * Fix param default values * Add a supplier for default value to ensure it's calculated correctly * Can't improve this without breaking tests * Added checks for not specifying a body in PUT requests * Fix default provider for enum params * Added yaml test * Changed docs and fix TODO * Removing synonyms changes * Added separate methods for providing default value as suppliers in enums * Fixed test * Add a supplier for default value to ensure it's calculated correctly * Added checks for not specifying a body in PUT requests * Remove synonyms changes * Remove some supplier changes * Better call enumParam with supplier version * Fix compiler error on supplier * Apply validators or requires depending on index version * Solved BWC tests that involved using validators instead of requiresParameters * Add tests * Spotless * Update docs/changelog/98268.yaml * Update changelog * Update docs/changelog/98268.yaml * PR comments * PR feedback * Serialize index only for new index versions --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2023-08-17 10:53:14 -04:00
Kuni Sen	225503a447	Update field-mapping.asciidoc that Epoch format is not supported as dynamic date format (#98338) * Update field-mapping.asciidoc that Epoch format is not supported as dynamic date format Update field-mapping.asciidoc that Epoch format is not supported as dynamic date format * Update docs/reference/mapping/dynamic/field-mapping.asciidoc Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co> --------- Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>	2023-08-10 16:44:44 +09:00
Mayya Sharipova	2076183dee	Move vectors of > 1024 dims out of experimental (#96850) With moving max dims check to codec from Lucene 9.8, we will always have a way to provide our own codec with the max dims defined by us.	2023-08-03 14:30:14 -04:00
Abdon Pijpelink	5947f3b455	[DOCS] Clarify TSDS/synthetic source/runtime field restrictions (#97980)	2023-08-03 18:28:08 +02:00
Craig Taverner	8151092b45	Documentation for time-series geo_line (#97373) * Documentation for time-series geo_line * Fix incorrect ids in geoline docs * Some updates from review Added image of kibana map, improved first example, linked to TSDS and added section on line simplification with link to wikipedia. * Diagrams of truncation versus simplification	2023-07-05 17:53:27 +02:00
Abdon Pijpelink	16aba067a0	[DOCS] Make 2028 dims 'experimental' warning inline (#96369)	2023-05-30 10:13:38 +02:00
debadair	777598d602	[DOCS] Remove redirect pages (#88738) * [DOCS] Remove manual redirects * [DOCS] Removed refs to modules-discovery-hosts-providers * [DOCS] Fixed broken internal refs * Fixing bad cross links in ES book, and adding redirects.asciidoc[] back into docs/reference/index.asciidoc. * Update docs/reference/search/point-in-time-api.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co> * Update docs/reference/setup/restart-cluster.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co> * Update docs/reference/sql/endpoints/translate.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co> * Update docs/reference/snapshot-restore/restore-snapshot.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co> * Update repository-azure.asciidoc * Update node-tool.asciidoc * Update repository-azure.asciidoc --------- Co-authored-by: amyjtechwriter <61687663+amyjtechwriter@users.noreply.github.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Amy Jonsson <amy.jonsson@elastic.co> Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2023-05-24 12:32:46 +01:00
Salvatore Campagna	6b1e0603ce	Test histogram with zero-count buckets and synthetic source (#95400)	2023-05-03 15:23:36 +02:00
Michael Peterson	5169011325	Allow multiple field names/patterns for (path_)(un)match (#66364) (#95558) * Allow multiple field names/patterns for (path_)(un)match (#66364) Arrays of patterns are now allowed for dynamic_templates in the match, unmatch, path_match and path_unmatch fields. DynamicTemplate has been modified to support List<String> for these fields. The patterns can be either simple wildcards or regex. As with previous functionality, when match_pattern="regex", simple wildcards will be flagged with an error, but when match_pattern="simple", using regular expressions in the match will not throw an error. One new error pathway was added: if a user specifies a list of non-strings for one of these pattern fields (e.g., "match": [10, false]) a MapperParserException will be thrown. A dynamic_template yamlRestTest was added. This is a BWC change, so the REST test that uses arrays of patterns is limited to v8.9 and above. Closes #66364.	2023-04-27 12:58:49 -04:00
Martijn van Groningen	49e8ee4269	Remove remaining tsdb tech preview labels (#95563) Remove tech preview label from a number of tsdb settings and mapping attributes.	2023-04-26 12:11:03 +02:00
Mayya Sharipova	4d6e451d8b	Add an experimental label for 2048 vector dims (#95395) Add an experimental label for increased vector dims. Relates to PR#95257	2023-04-20 07:48:12 -04:00
Salvatore Campagna	ec2bdee31b	Add time_series_dimensions param to flattened docs (#95374)	2023-04-20 10:58:12 +02:00
Martijn van Groningen	1f40ced134	Tiny tsdb docs update (#95333) Update definition of metric type counter to include it resets to zero. Just like is defined on the tsdb page: https://www.elastic.co/guide/en/elasticsearch/reference/current/tsds.html#time-series-metric	2023-04-18 11:17:31 -04:00
Mayya Sharipova	32c17d79c5	Increase max number of vector dims to 2048 (#95257) Currently Lucene limits the max number of vector dimensions to 1024. This commit overrides KnnFloatVectorField and KnnByteVectorField classes to increase the limit to 2048 for indexed vectors in ES.	2023-04-17 09:05:49 -04:00
Salvatore Campagna	0eeef45ea2	Synthetic source support for flattened fields (#94842) Here we add synthetic source support for fields whose type is flattened. Note that flattened fields and synthetic source have the following limitations, all arising from the fact that in synthetic source we just see key/value pairs when reconstructing the original object and have no type information in mappings: * flattened fields use sorted set doc values of keywords, which means two things: first we do not allow duplicate values, second we treat all values as keywords * reconstructing array of objects results in nested objects (no array) * reconstructing arrays with just one element results in a single-value field since we have no way to distinguish single-valued from multi-valued fields other than looking at the count of values	2023-04-11 10:54:28 +02:00
Jim Ferenczi	57cbbb3fcd	Minor ann docs update (#94783) Replace the link to the deprecated knn search API and added a link to the nightly benchmarks in Rally.	2023-03-31 17:59:25 +01:00
Alan Woodward	b2cf4757f3	Fix backwards description in runtime fields documentation (#94608) (#94642) `runtime_mappings` is the name of the param in the search request. In the document `put` statement, it's called `runtime` Co-authored-by: Matthew Hinea <matthew.hinea@gmail.com>	2023-03-22 11:53:35 -04:00
Ignacio Vera	397d52e24b	Allow docvalues-only search on geo_shape (#94396) allows searching on a geo_shape field type when the field is not indexed (index: false) but just doc values are enabled.	2023-03-08 16:30:06 +01:00
Hritik Kumar	f5af004117	Support `ignore_malformed` in boolean fields (#93239) This PR enables the `ignore_malformed` parameter to be accepted as an option in boolean field mappings. Support for synthetic source is not added yet, so if `ignore_malformed` is set to true, synthetic source isn't supported. Closes #89542	2023-02-21 18:22:10 +01:00
Przemyslaw Gomulka	b0ba832791	[doc] Mention dates_nanos in dates field type page (#93828)	2023-02-15 16:58:24 +01:00
Benjamin Trent	e8c5ed46c6	Fixing our docs for vector sizing calculation (#93703)	2023-02-13 07:52:53 -05:00
Benjamin Trent	323a13ac3f	Add `term` query support to rank_features mapped field (#93247) This adds term query capabilities for rank_features fields. term queries against rank_features are not scored in the typical way as regular fields. This is because the stored feature values take advantage of the term frequency storage mechanism, and thus regular BM25 does not work. Instead, a term query against a rank_features field is very similar to linear rank_feature query. If more complicated combinations of features and values are required, the rank_feature query should be used.	2023-02-01 13:32:13 -05:00
David Turner	ce736dd0e0	Revert "enhancement: boolean field to support ignore_malformed (#90122)" This was merged in error without a full CI run, and has some issues. This reverts commit `edcdc43519`. This reverts commit `26c0a35558`.	2023-01-25 15:09:59 +00:00
Hritik Kumar	edcdc43519	enhancement: boolean field to support ignore_malformed (#90122) * enhancement: boolean field to support ignore_malformed * fix: changes in current builder for BooleanFieldMappers within tests files. * Updating documentation Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Amy Jonsson <amy.jonsson@elastic.co>	2023-01-25 13:56:50 +00:00
Christos Soulios	a183843893	[DOCS] Fix incorrect statement for `aggregate_metric_double` field type (#92961) Documentation incorrectly states that all aggregations are supported by the `aggregate_metric_double` field. This PR rectifies this error. Closes #92236	2023-01-16 12:33:20 -05:00
Dale Visser	1a9150dddb	[Docs] Differentiate runtime field and indexed field (#91057) Clarify wording of upgrading runtime fields to index field.	2023-01-13 17:05:26 +01:00
Abdon Pijpelink	85e965a35c	[DOCS] Remove experimental flag from index vectors for kNN search docs (#92867)	2023-01-12 15:57:28 +01:00
Nicolas Ruflin	71739416cf	[Docs] Add more details to the `index` option docs (#92606) Docs around the `index` option were not very precise. The term "typical" was used without describing for which fields querying is still available when `index: false` is set. But more precise docs existed in the `doc_values` documentation found here for the index option: https://www.elastic.co/guide/en/elasticsearch/reference/current/doc-values.html These docs were mostly copied over. Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>	2023-01-04 09:09:21 +01:00
Christoph Büscher	8067f01d48	Runtime fields to optionally ignore script errors (#92380) Currently Elasticsearch always returns a shard failure once a runtime error arises from using a runtime field, the exception being script-less runtime fields. This also means that execution of the query for that shard stops, which is okay for development and exploration. In a production scenario, however, it is often desirable to ignore runtime errors and continue with the query execution. This change adds a new `on_script_error` parameter to runtime field definitions similar to the already existing parameter for index-time scripted fields. When `on_script_error` is set to `continue`, errors from script execution are effectively ignored. This means affected documents don't show up in query results, but also don't prevent other matches from the same shard. Runtime fields accessed through the fields API don't return values on errors, aggregations will ignore documents that throw errors. Note that this change affects scripted runtime fields only, while leaving default behaviour untouched. Also, ignored errors are not reported back to users for now. Relates to #72143	2022-12-23 09:29:12 +01:00
Madhusudhan Konda	af65e71114	The exception is inserted in a code block (#90325) * The exception is inserted in a code block * Update docs/reference/mapping/types/text.asciidoc Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>	2022-12-21 17:22:35 +01:00
QY	7b17e1b5dc	[DOCS] Remove outdated note in `Date field type` (#92408) Negative epoch timestamps are supported in 8.2.0 by pr #80208	2022-12-20 14:01:11 +01:00
Nik Everett	b9bb7252be	Docs: synthetic _source can't params._source (#91630) This documents that `params._source` isn't available for synthetic `_source` indices and suggests to instead use `doc['foo']` or `field('foo')`.	2022-11-22 15:23:30 -05:00

821 commits