* Allow multiple field names/patterns for (path_)(un)match (#66364)
Arrays of patterns are now allowed for dynamic_templates in the match,
unmatch, path_match and path_unmatch fields. DynamicTemplate has been modified to
support List<String> for these fields. The patterns can be either simple wildcards
or regex. As with previous functionality, when match_pattern="regex", simple wildcards
will be flagged with an error, but when match_pattern="simple", using regular expressions
in the match will not throw an error.
One new error pathway was added: if a user specifies a list of non-strings for
one of these pattern fields (e.g., "match": [10, false]) a MapperParserException
will be thrown.
A dynamic_template yamlRestTest was added. This is a BWC change, so the REST test
that uses arrays of patterns is limited to v8.9 and above.
Closes #66364.
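A minimal Python sketch of the behaviour described above, not the actual DynamicTemplate code. It assumes a field matches when any `match` pattern hits and no `unmatch` pattern hits, and that a simple wildcard treats only `*` as special. The helper names and the use of `ValueError` as a stand-in for MapperParserException are hypothetical.

```python
import re


def normalize_patterns(field_name, value):
    """Accept one pattern or a list of patterns; reject non-string entries."""
    patterns = value if isinstance(value, list) else [value]
    for pattern in patterns:
        if not isinstance(pattern, str):
            # Stand-in for the MapperParserException mentioned above.
            raise ValueError(f"[{field_name}] values must be strings, got {pattern!r}")
    return patterns


def simple_match(pattern, name):
    """Simple wildcard match where only '*' is special (assumed semantics)."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name) is not None


def template_applies(template, field_name):
    """True if any `match` pattern hits and no `unmatch` pattern hits."""
    match = normalize_patterns("match", template.get("match", "*"))
    unmatch = normalize_patterns("unmatch", template.get("unmatch", []))
    return any(simple_match(p, field_name) for p in match) and not any(
        simple_match(p, field_name) for p in unmatch
    )


# Hypothetical template using arrays of patterns.
template = {"match": ["ip_*", "*_ip"], "unmatch": ["one*", "*two"]}
```

With this template, `template_applies(template, "ip_address")` is true, while `"one_ip"` is excluded by the `unmatch` list, and `normalize_patterns("match", [10, False])` raises.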
Currently Lucene limits the max number of vector dimensions to 1024.
This commit overrides KnnFloatVectorField and KnnByteVectorField
classes to increase the limit to 2048 for indexed vectors in ES.
Here we add synthetic source support for fields whose type is flattened.
Note that flattened fields and synthetic source have the following limitations,
all arising from the fact that in synthetic source we just see key/value pairs
when reconstructing the original object and have no type information in mappings:
* flattened fields use sorted set doc values of keywords, which means two things:
first we do not allow duplicate values, second we treat all values as keywords
* reconstructing array of objects results in nested objects (no array)
* reconstructing arrays with just one element results in a single-value field since we
have no way to distinguish single-valued from multi-valued fields other than looking
at the count of values
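The value-level limitations above can be modelled in a few lines of Python. This is only an illustration of sorted-set keyword semantics, not the mapper code; `synthetic_values` is a hypothetical helper that handles the values of a single key.

```python
def synthetic_values(values):
    """Model how the values of one flattened key come back from synthetic source."""
    if not isinstance(values, list):
        values = [values]
    # Sorted set of keywords: every value becomes a string, duplicates are
    # dropped, and the survivors come back in sorted order.
    keywords = sorted({str(v) for v in values})
    # Only the count of values tells single-valued and multi-valued apart.
    return keywords[0] if len(keywords) == 1 else keywords
```

For example, `["b", "a", "b"]` comes back as `["a", "b"]`, and the one-element array `[7]` comes back as the single string `"7"`.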
`runtime_mappings` is the name of the param in the search request. In the
document `put` statement, it's called `runtime`.
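To make the naming difference concrete, here are the two request bodies written as Python dicts. This assumes the `put` in question is the mapping request; the field name `day_of_week` and the script-less `keyword` definition are hypothetical.

```python
# Hypothetical runtime field definition shared by both requests.
field_definition = {"day_of_week": {"type": "keyword"}}

# Body of a search request: the parameter is `runtime_mappings`.
search_body = {
    "runtime_mappings": field_definition,
    "query": {"match_all": {}},
}

# Body of a mapping `put` request: the section is `runtime`.
mapping_body = {
    "runtime": field_definition,
}
```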
Co-authored-by: Matthew Hinea <matthew.hinea@gmail.com>
This PR enables the `ignore_malformed` parameter to be accepted as an option in
boolean field mappings. Support for synthetic source is not added yet, so if
`ignore_malformed` is set to true, synthetic source isn't supported.
Closes #89542
This adds term query capabilities for rank_features fields. Term queries against rank_features fields are not scored the way they are for regular fields, because the stored feature values reuse the term frequency storage mechanism, so regular BM25 does not work.
Instead, a term query against a rank_features field is very similar to a linear rank_feature query. If more complicated combinations of features and values are required, the rank_feature query should be used.
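A rough sketch of the equivalence, written as Python dicts plus a toy scorer. The field name `topics`, the feature `economics` and the helper are hypothetical, and the scorer only models the idea that a linear function uses the stored feature value as the score. Real scores can differ slightly because feature values are stored with reduced precision.

```python
# Term query against a rank_features field (hypothetical field and feature).
term_query = {"query": {"term": {"topics": "economics"}}}

# Roughly equivalent linear rank_feature query.
rank_feature_query = {
    "query": {"rank_feature": {"field": "topics.economics", "linear": {}}}
}


def linear_feature_score(doc, field, feature, boost=1.0):
    """Toy model: the score is the stored feature value times the boost."""
    features = doc.get(field, {})
    if feature not in features:
        return None  # the document does not match
    return boost * features[feature]


doc = {"topics": {"politics": 20.0, "economics": 50.68}}
```

Here `linear_feature_score(doc, "topics", "economics")` gives the stored value 50.68 rather than a BM25 score, and a document without the feature does not match.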
* enhancement: boolean field to support ignore_malformed
* fix: changes in current builder for BooleanFieldMappers within test files.
* Updating documentation
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Amy Jonsson <amy.jonsson@elastic.co>
Documentation incorrectly states that all aggregations are supported by
the `aggregate_metric_double` field.
This PR rectifies this error.
Closes #92236
Docs around the `index` option were not very precise. The term "typical" was used without describing which fields can still be queried when `index: false` is set. More precise docs for the index option already existed in the `doc_values` documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/doc-values.html Those docs were mostly copied over.
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
Currently Elasticsearch always returns a shard failure once a runtime error arises from using a runtime field, the exception being script-less runtime fields. This also means that execution of the query for that shard stops, which is okay for development and exploration. In a production scenario, however, it is often desirable to ignore runtime errors and continue with the query execution.
This change adds a new on_script_error parameter to runtime field definitions, similar to the already existing
parameter for index-time scripted fields. When `on_script_error` is set to `continue`, errors from script execution are effectively ignored. This means affected documents don't show up in query results, but also don't prevent other matches from the same shard. Runtime fields accessed through the fields API don't return values on errors, and aggregations ignore documents that throw errors.
Note that this change affects scripted runtime fields only, while leaving default behaviour untouched. Also, ignored errors are not reported back to users for now.
Relates to #72143
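The difference between the two modes can be simulated in plain Python. This is a toy model of one shard, not the actual query execution path. The value name `fail` for the default mode is an assumption, and `run_shard_query`, the sample documents and the script are hypothetical.

```python
def run_shard_query(docs, script, predicate, on_script_error="fail"):
    """Toy model of evaluating a scripted runtime field on one shard."""
    hits = []
    for doc in docs:
        try:
            value = script(doc)
        except Exception as err:
            if on_script_error == "continue":
                # The document is skipped and the error is not reported back.
                continue
            # Default behaviour: the whole shard fails.
            raise RuntimeError("shard failure") from err
        if predicate(value):
            hits.append(doc["id"])
    return hits


docs = [
    {"id": 1, "price": "10"},
    {"id": 2, "price": "n/a"},  # the script throws on this document
    {"id": 3, "price": "30"},
]


def script(doc):
    return int(doc["price"])
```

With `on_script_error="continue"` the query returns documents 1 and 3. With the default, the error from document 2 fails the whole shard.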
* The exception is inserted in a code block
* Update docs/reference/mapping/types/text.asciidoc
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
Synthetic _source's array flattening activities can remove some arrays
entirely. Specifically:
```
{
  "foo": [
    {
      "bar": 1
    },
    {
      "baz": 2
    }
  ]
}
```
Turns into:
```
{
  "foo": {
    "bar": 1,
    "baz": 2
  }
}
```
See, no more array! It's because the values are flattened to the leaf
fields and didn't have multiple values. This is implied by the docs we
had, but sure wasn't obvious. So now it's documented specifically.
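A small Python model of that flattening, for illustration only. It assumes values are collected per leaf field and that an array is emitted only when a leaf ends up with more than one value. `leaf_values` and `rebuild` are hypothetical helpers, not code from the change.

```python
from collections import defaultdict


def leaf_values(node, path=()):
    """Yield (path, value) for every leaf, looking through arrays."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from leaf_values(child, path + (key,))
    elif isinstance(node, list):
        for child in node:
            yield from leaf_values(child, path)
    else:
        yield path, node


def rebuild(source):
    """Rebuild an object from its leaf values, as synthetic _source would see it."""
    per_leaf = defaultdict(list)
    for path, value in leaf_values(source):
        per_leaf[path].append(value)
    out = {}
    for path, values in per_leaf.items():
        node = out
        for key in path[:-1]:
            node = node.setdefault(key, {})
        # An array survives only if the leaf itself has multiple values.
        node[path[-1]] = values[0] if len(values) == 1 else values
    return out


print(rebuild({"foo": [{"bar": 1}, {"baz": 2}]}))  # {'foo': {'bar': 1, 'baz': 2}}
```

By contrast, `{"foo": [{"bar": 1}, {"bar": 2}]}` keeps an array, but on the leaf: `{"foo": {"bar": [1, 2]}}`.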
This change adds support for fielddata and subsequently scripting for byte vectors. This is a follow up to
#90774 and completes the initial work for #89784.
Before it linked to script_score and approximate kNN separately, but now we have
a single page that describes both approaches. This change also removes a link to
the deprecated _knn_search API.
* Refine geo-point and geo-shape docs
While reviewing the docs for another issue, some deprecated
references to prefix-trees were discovered, leading to interest
in bringing the docs a little more up-to-date.
* Update docs/reference/mapping/types/geo-point.asciidoc
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
* Update docs/reference/mapping/types/geo-shape.asciidoc
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
This change adds an element_type as an optional mapping parameter for dense vector fields as
described in #89784. This also adds a byte element_type for dense vector fields that supports storing
dense vectors using only 8 bits per dimension. This is only supported when the mapping parameter
index is set to true.
The code follows a similar pattern to our NumberFieldMapper where we have an enum for
ElementType, and it has methods that DenseVectorFieldType and DenseVectorMapper can delegate to
to support each available type (just float and byte for now).
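The delegation pattern can be sketched in Python as an enum whose methods carry the per-type behaviour. This is not the Java code from the change: the method names, the signed byte range of -128 to 127, the `float` default and the use of `ValueError` are assumptions made for illustration.

```python
from enum import Enum


class ElementType(Enum):
    FLOAT = "float"
    BYTE = "byte"

    def bytes_per_dimension(self):
        # Assumed sizes: 32-bit floats versus 8-bit bytes.
        return 4 if self is ElementType.FLOAT else 1

    def check_value(self, value):
        """Validate and convert one vector component for this element type."""
        if self is ElementType.BYTE:
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or not -128 <= value <= 127:
                raise ValueError(f"{value!r} is not a valid byte component")
            return value
        return float(value)


def parse_vector(mapping, values):
    """Parse one vector by delegating per-component checks to the element type."""
    element_type = ElementType(mapping.get("element_type", "float"))
    if element_type is ElementType.BYTE and not mapping.get("index", False):
        raise ValueError("element_type byte requires index to be true")
    if len(values) != mapping["dims"]:
        raise ValueError("wrong number of dimensions")
    return [element_type.check_value(v) for v in values]


# Hypothetical mapping for a byte vector field.
byte_mapping = {"type": "dense_vector", "dims": 3, "element_type": "byte", "index": True}
```

Adding another element type would then mean adding an enum constant and its branches, without touching `parse_vector`.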
I got some news this morning that we're going to have to rework how we
handle ignore-above in synthetic _source, which makes me a bit wary of
removing tech-preview in 8.5. I asked a few folks and they felt more
comfortable giving it a little longer in tech preview, I expect until
ignore-above is in.
This adds synthetic `_source` support for `ip` fields with
`ignore_malformed` set to `true`. We save the field values in a hidden
stored field, just like we do for `ignore_above` keyword fields. Then we
read them back when `_source` is loaded.
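A toy Python model of the idea, using the standard `ipaddress` module. It assumes that only the malformed values go to the hidden stored field, that valid addresses come back sorted and deduplicated from doc values, and that malformed values are appended after them. The function names and that ordering are hypothetical.

```python
import ipaddress


def index_ip_values(values):
    """Split raw values into parsed doc values and a 'hidden stored field'."""
    doc_values, malformed = set(), []
    for raw in values:
        try:
            doc_values.add(ipaddress.ip_address(raw))
        except ValueError:
            malformed.append(raw)  # kept verbatim because ignore_malformed is true
    return doc_values, malformed


def synthesize_ip(doc_values, malformed):
    """Rebuild the field for synthetic _source from both storage locations."""
    ordered = sorted(doc_values, key=lambda ip: (ip.version, int(ip)))
    merged = [str(ip) for ip in ordered] + malformed
    return merged[0] if len(merged) == 1 else merged


stored = index_ip_values(["192.168.0.2", "not an ip", "192.168.0.1"])
print(synthesize_ip(*stored))  # ['192.168.0.1', '192.168.0.2', 'not an ip']
```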
I've been hacking on synthetic source for a while now and not seen any
need to break backwards compatibility or any major bugs. I think it's
time to remove the `preview` marker from it so folks can use it without
fear.
It seems that for now we don't have a good use for the histogram and summary metric types.
They had been left as placeholders for a while, but at this point there is no concrete plan forward for them.
This PR removes the histogram and summary metric types. We may add them back in the future.
Also, this PR completely removes the time_series_metric mapping parameter from the histogram field type and only allows the gauge metric type for aggregate_metric_double fields.