707 commits

Author SHA1 Message Date
Julie Tibshirani
3c1b070329
Avoid negative scores with cross_fields type (#89016)
The cross_fields scoring type can produce negative scores when some documents
are missing fields. When blending term document frequencies, we take the maximum
document frequency across all fields. If one field appears in fewer documents
than another, the blended docFreq can exceed that field's docCount, so its IDF
can become negative. This is because IDF is calculated as
`Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))`.

This change adjusts the docFreq for each field to `Math.min(docCount, docFreq)`
so that the IDF can never become negative. It makes sense that the term document
frequency should never exceed the number of documents containing the field.
2022-09-06 13:02:24 -07:00
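The effect described in commit 3c1b070329 can be checked with a small sketch. It transcribes the IDF expression quoted in the message into Python and applies the `min(docCount, docFreq)` cap; the function names and the sample numbers are illustrative assumptions, not code from the change itself.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """IDF exactly as quoted in the commit message."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def clamped_idf(doc_count: int, doc_freq: int) -> float:
    """Same formula, with docFreq capped at docCount as the change describes."""
    return bm25_idf(doc_count, min(doc_count, doc_freq))


# Hypothetical numbers: the field exists in 10 documents, but the blended
# (maximum) document frequency taken from another field is 50.
doc_count, blended_doc_freq = 10, 50
unclamped = bm25_idf(doc_count, blended_doc_freq)   # negative
clamped = clamped_idf(doc_count, blended_doc_freq)  # small but positive
```

When `docFreq` is already at most `docCount`, the cap changes nothing, so fields that are present in every document score as before.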
Abdon Pijpelink
f2257cae89
[DOCS] Adds note about escaping backslashes in regex (#89276)
* [DOCS] Adds note about escaping backslashes in regex

* Fix typo

* Simplify example
2022-08-17 09:40:30 +02:00
Abdon Pijpelink
e4c7febea1
Fix: Update geo-bounding-box-query.asciidoc (#87459) (#89301)
There are some redundant words so I just removed those words. Please accept this change.

(cherry picked from commit e1e5398051)

Co-authored-by: Adnan Ashraf <adnan.ashraff1@gmail.com>
2022-08-12 18:38:05 +09:30
Gonçalo Montalvão Marques
c4bd4d3cbf
Fix typo in geo-distance-query doc (#89148)
2022-08-08 09:59:47 +02:00
Abdon Pijpelink
0eca582326
[DOCS] Remove camel case variations (#88650)
* [DOCS] Remove camel case variations. Closes #73417

* [DOCS] Switch to sentence casing in titles
2022-07-20 17:06:34 +02:00
Mayya Sharipova
1ae209335d
Undeprecate function_score query (#87807)
We had planned to deprecate the function_score query in favor of the
script_score query, but ran into a roadblock of missing
functionality to combine scores from different
functions (particularly "first" script_score).
We have several proposals to address this missing
functionality:
 [scripted_boolean](https://github.com/elastic/elasticsearch/issues/27588#issuecomment-444887726)
 [compound_query](https://github.com/elastic/elasticsearch/issues/51967)
 [first_query](https://github.com/elastic/elasticsearch/issues/52482)

But for now, we decided not to deprecate function_score query,
and hence we need to remove any mention that we are deprecating it.

Relates to #42811
Closes #71934
2022-06-17 11:04:26 -04:00
Luca Cavanna
fe327c6e1a
[DOCS] Clarify index_prefix in prefix query docs (#87450)
The current docs mention that Elasticsearch indexes prefixes between 2 and 5 characters in a separate field. 2 and 5 are default values, and the size of the indexed prefixes depends on the configuration settings.
2022-06-14 14:32:37 +02:00
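As a rough illustration of what commit fe327c6e1a clarifies, the sketch below lists the prefixes of a term whose lengths fall inside a configurable range. The defaults of 2 and 5 come from the commit message; the parameter names mirror the `index_prefixes` mapping options `min_chars` and `max_chars`, and the function itself is a hypothetical model, not Elasticsearch's indexing code.

```python
from typing import List


def edge_prefixes(term: str, min_chars: int = 2, max_chars: int = 5) -> List[str]:
    """Return the prefixes of `term` with lengths in [min_chars, max_chars]."""
    upper = min(max_chars, len(term))  # a prefix cannot be longer than the term
    return [term[:n] for n in range(min_chars, upper + 1)]


default_prefixes = edge_prefixes("quickly")               # default range 2..5
custom_prefixes = edge_prefixes("quickly", min_chars=1, max_chars=3)
```

Changing the range changes which prefix lengths exist in the separate field, which is why the docs should not present 2 and 5 as fixed.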
Ignacio Vera
e6b4097fc8
new geo_grid query to be used with geogrid aggregations (#86596)
Adds a query whose input is the key of a geogrid aggregation bucket and that
matches the documents inside that bucket.
2022-05-23 11:38:07 +02:00
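A request body for the query introduced in commit e6b4097fc8 might be built as follows. This is a minimal sketch: the helper name, the field name `location`, and the bucket key `u0` are assumptions chosen for illustration, and the grid names reflect the grid types the query is understood to accept.

```python
SUPPORTED_GRIDS = {"geohash", "geotile", "geohex"}


def geo_grid_query(field: str, grid: str, bucket_key: str) -> dict:
    """Build a search body that matches documents inside one grid bucket."""
    if grid not in SUPPORTED_GRIDS:
        raise ValueError(f"unsupported grid type: {grid}")
    return {"query": {"geo_grid": {field: {grid: bucket_key}}}}


# Hypothetical usage: take a bucket key returned by a geohash grid
# aggregation and fetch the documents that fall into that bucket.
body = geo_grid_query("location", "geohash", "u0")
```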
Craig Taverner
5f7ea792ac
Soft-deprecation of point/geo_point formats (#86835)
* Soft-deprecation of point/geo_point formats

Since GeoJSON and WKT are now common formats for all three types
(geo_shape, geo_point and point), we decided to soft-deprecate the
other point formats by ordering the formats as follows:
* GeoJSON (object with keys `type` and `coordinates`)
* WKT `POINT(x y)`
* Object with keys `lat` and `lon` (or `x` and `y` for point)
* Array [lon,lat]
* String `"lat,lon"` (or `"x,y"` in point)
* String with geohash (only in `geo_point`)

The geohash is last because it is only in one field type.
The string version is second to last because it is the most controversial,
being the only version to reverse the coordinate order relative to all other
formats (for geo_point only, since the coordinates are not reversed
in point).

In addition we replaced many examples in both documentation and tests
to prioritize WKT over the plain string format.

Many remaining examples of array format or object with keys still exist
and could be replaced by, for example, GeoJSON, if we feel the need.

* Incorrect quote position
2022-05-17 23:46:43 +02:00
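To make the ordering in commit 5f7ea792ac concrete, the sketch below renders one coordinate pair in each of the listed formats, in the same preference order. The helper name and sample coordinates are hypothetical, and the geohash form is left out because encoding it needs more than a few lines.

```python
def geo_point_formats(lat: float, lon: float) -> dict:
    """Return one geo_point value in each accepted format, most preferred first."""
    return {
        "geojson": {"type": "Point", "coordinates": [lon, lat]},  # lon first
        "wkt": f"POINT({lon} {lat})",                            # lon first
        "object": {"lat": lat, "lon": lon},
        "array": [lon, lat],                                      # lon first
        "string": f"{lat},{lon}",                                 # lat first
    }


formats = geo_point_formats(lat=41.12, lon=-71.34)
```

Only the plain string puts latitude first, which is the reversal the commit message calls out.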
Nik Everett
de5ca3cfaa
fixed typo (#84694) (#84726)
Co-authored-by: Mustafa Balila (rootsofnull) <hitsugayatoshiro899@gmail.com>
2022-03-07 14:30:51 -05:00
James Rodewig
1fe2b0d866
[DOCS] Fix percolate query headings (#83988)
Fixes the heading levels for the percolate query doc so the on-page TOC displays correctly.
2022-02-15 15:56:04 -05:00
James Rodewig
c1aba1e109
[DOCS] Move tip for percolate query example (#83972)
Moves a tip for the percolate query to the beginning of the example.
2022-02-15 15:24:33 -05:00
James Rodewig
b552d5cb0e
[DOCS] Re-add network traffic para to term query (#83047)
Re-adds a paragraph about minimizing network traffic for a terms lookup.

This paragraph was erroneously removed as part of https://github.com/elastic/elasticsearch/pull/42889.
2022-01-25 10:27:10 -05:00
James Rodewig
e53ecc3f43
[DOCS] Document missing flag values for regexp query (#82265)
Documents the `EMPTY` and `NONE` `flag` values for the `regexp` query.

Also documents the `""` (empty string) value, which is an alias for `ALL`.

Closes #81978.
2022-01-18 14:15:29 -05:00
jenish jain
13e9a605b8
[DOCS] Fix track_total_hits xref (#82739)
2022-01-18 12:43:17 -05:00
James Rodewig
0a3f6acadd
[DOCS] Clarify nested query behavior for must_not clauses (#82727)
Closes #81052.
2022-01-18 10:14:26 -05:00
James Rodewig
f5f76ff1ca
[DOCS] Note that default_field support wildcards (#81127)
Changes:

* Notes that the query string query's `default_field` and `fields` parameters support wildcards.
* Adds an xref to the `index.query.default_field` docs to the `default_field` parameter.
2022-01-04 08:26:13 -05:00
James Rodewig
dd1ed30731
[DOCS] Fix combined_fields query ref in multi_match query docs (#81456)
The current `multi_match` docs contain an erroneous reference to the `combined_fields` query. This updates the reference to reference the correct query.

Relates to https://github.com/elastic/elasticsearch/pull/76893
2021-12-07 16:47:44 -05:00
James Rodewig
f56a0f4b66
[DOCS] Remove testenv annotations from doc snippet tests (#80023)
Removes `testenv` annotations and related code. These annotations originally let you skip x-pack snippet tests in the docs. However, that's no longer possible.

Relates to #79309, #31619
2021-11-05 18:38:50 -04:00
James Rodewig
0333d89f6e
[DOCS] Add wildcard parameter to wildcard query docs (#79722)
Changes:

* Documents the `wildcard` parameter for the `wildcard` query. This parameter is an alias for the `value` parameter.
* Reorders the parameters alphabetically.

Closes #79711
2021-10-26 12:35:11 -04:00
Alexander Reelsen
19d12f19f5
[DOCS] Add script note to nested query docs (#77431)
Because the script only has access to the nested document, this should be
documented.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-10-05 10:32:20 -04:00
James Rodewig
e729c3f543
[DOCS] Clarify geoshape orientation docs (#75888)
Adds additional information about how Elasticsearch uses polygon orientation. Elasticsearch only uses a polygon's orientation to determine if it crosses the international dateline. If so, Elasticsearch splits the polygon at the dateline.

Closes #74891
2021-09-08 11:10:03 -04:00
Adam Locke
1056c857ee
[DOCS] Update combined fields wording (#76893)
* [DOCS] Update combined fields wording

* Clarifications from review feedback
2021-08-26 13:16:55 -04:00
James Rodewig
22a6c1f0d3
[DOCS] Add search xref tip to query_string docs (#76728)
Adds a tip containing a cross-reference to the "Search your data" docs.
This is the preferred starting point for ES search.
2021-08-20 08:44:20 -04:00
Paweł Krześniak
d2a6c1627f
[DOCS] Fix typo in parent-child example request (#76646)
2021-08-18 08:59:36 -04:00
James Rodewig
8eaf4043e4
[DOCS] Terms lookup doesn't support remote indices (#76371)
Changes:
* Notes that you can't use cross-cluster search to run a terms lookup on a remote index.
* Removes a redundant sentence noting `_source` is enabled by default.

Closes #61364.
2021-08-12 08:33:10 -04:00
James Rodewig
fc0ac1923d
[DOCS] Correct spelling for geo terms (#76028)
Changes:
* Use "geopoint" when not referring to the literal field type
* Use "geoshape" when not referring to the literal field type or query type
* Use "GeoJSON" consistently
2021-08-03 09:55:48 -04:00
David Harsha
ed7a65e053
Allow specifying index in pinned queries (#74873)
The current `ids` option doesn't allow pinning a specific document in a
single index when searching over multiple indices. This introduces a
`documents` option, which is an array of `_id` and `_index`
fields to allow index-specific pins.

Closes https://github.com/elastic/elasticsearch/issues/67855.
2021-07-27 15:55:07 +03:00
Adrien Grand
feb6620d14
indices.query.bool.max_clause_count now limits all query clauses (#75297)
In the upcoming Lucene 9 release, `indices.query.bool.max_clause_count` is
going to apply to the entire query tree rather than per `bool` query. In order
to avoid breaks, the limit has been bumped from 1024 to 4096.

The semantics will effectively change when we upgrade to Lucene 9; this PR
is only about agreeing on a migration strategy and documenting this change.

To avoid further breaks, I am leaning towards keeping the current setting name
even though it contains `bool`. I believe that it still makes sense given that
`bool` queries are typically the main contributors to high numbers of clauses.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-07-21 12:16:30 +02:00
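The difference between a per-`bool` limit and a whole-tree limit, as described in commit feb6620d14, can be modeled roughly as below. This is a simplified sketch under stated assumptions: queries are plain dicts, only leaf queries are counted toward the tree total, and none of it reflects how Lucene actually does its accounting.

```python
BOOL_OCCURS = ("must", "filter", "should", "must_not")


def direct_clauses(bool_body: dict) -> list:
    """Collect the clauses directly attached to one bool query."""
    clauses = []
    for occur in BOOL_OCCURS:
        value = bool_body.get(occur, [])
        clauses.extend(value if isinstance(value, list) else [value])
    return clauses


def max_clauses_per_bool(query: dict) -> int:
    """Largest number of direct clauses in any single bool query (per-bool view)."""
    if "bool" not in query:
        return 0
    clauses = direct_clauses(query["bool"])
    return max([len(clauses)] + [max_clauses_per_bool(c) for c in clauses])


def total_leaf_clauses(query: dict) -> int:
    """Number of leaf queries in the whole tree (assumed whole-tree view)."""
    if "bool" not in query:
        return 1
    return sum(total_leaf_clauses(c) for c in direct_clauses(query["bool"]))


# Hypothetical query: two nested bool queries with three term clauses each.
inner = {"bool": {"should": [{"term": {"tag": t}} for t in ("a", "b", "c")]}}
tree = {"bool": {"must": [inner, inner]}}
per_bool = max_clauses_per_bool(tree)   # no single bool has more than 3
whole_tree = total_leaf_clauses(tree)   # but the tree holds 6 leaf clauses
```

A query made of many small `bool` queries can stay under a per-`bool` limit yet exceed the same number once the whole tree is counted, which is the motivation given for raising the default from 1024 to 4096.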
André Pessanha
bb37e09d92
Rename field_masking_span to span_field_masking (#74718)
`field_masking_span` is the only span query that does not begin with
`span_`.  This commit deprecates the existing name and adds a new
name `span_field_masking` to better fit with the other queries.
2021-07-09 08:56:38 +01:00
James Rodewig
d4ed43c5a4
[DOCS] Remove deprecated geo_shape parameters (#74519)
* Removes docs and references for the following `geo_shape` mapping parameters:
  * `tree`
  * `tree_levels`
  * `strategy`
  * `distance_error_pct`
* Updates a related breaking change.

Relates to #70850
2021-06-29 08:52:05 -04:00
James Rodewig
139eabad2d
[DOCS] Query strings are normalized for fuzzy (~) operator (#73921)
Notes that `fuzzy` queries made using the query string query's `~`
operator are normalized.

Closes #73299
2021-06-28 13:13:41 -04:00
Ignacio Vera
d7ef5b6d21
Remove bounding box query type parameter (#74536)
The parameter was deprecated in 7.14 as it is a no-op.
2021-06-28 07:37:04 +02:00
Ignacio Vera
28b4982df4
Deprecate Bounding box query type parameter (#74493)
This parameter has no effect on the query execution.
2021-06-24 07:34:57 +02:00
Mayya Sharipova
f8215e752c
Add doc on rank_feature(s) negative score impact (#71795)
Add a warning about consequences of negative score impact
for documents that don't have values for rank_feature(s)
fields.

Related to #69994
2021-04-20 06:56:05 -04:00
Julie Tibshirani
318bf14126
Introduce combined_fields query (#71213)
This PR introduces a new query called `combined_fields` for searching multiple
text fields. It takes a term-centric view, first analyzing the query string
into individual terms, then searching for each term in any of the fields as though
they were one combined field. It is based on Lucene's `CombinedFieldQuery`,
which takes a principled approach to scoring based on the BM25F formula.

This query provides an alternative to the `cross_fields` `multi_match` mode. It
has simpler behavior and a more robust approach to scoring.

Addresses #41106.
2021-04-14 13:33:19 -07:00
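A request body for the query introduced in commit 318bf14126 might look like the sketch below. The helper name, field names, and query text are hypothetical; `query`, `fields`, and `operator` are the parameters assumed here.

```python
def combined_fields_query(text: str, fields: list, operator: str = "or") -> dict:
    """Build a search body that treats several text fields as one combined field."""
    if operator not in ("or", "and"):
        raise ValueError("operator must be 'or' or 'and'")
    return {
        "query": {
            "combined_fields": {
                "query": text,
                "fields": list(fields),
                "operator": operator,
            }
        }
    }


# Hypothetical usage: every term must appear in at least one of the fields.
body = combined_fields_query(
    "database systems", ["title", "abstract", "body"], operator="and"
)
```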
Adam Locke
af700f4628
[DOCS] Update runtime fields for script query (#71338)
Fixes typo, moves example out of a NOTE admonition, and puts context before the example.
2021-04-06 10:12:08 -04:00
Nik Everett
5677c6822e
Point script query docs at runtime fields (#71291)
This adds a "note" on the docs for the script query pointing folks to
runtime fields because they are more flexible. It also translates the
request example into runtime fields.

Relates to #69291

Co-authored-by: Adam Locke <adam.locke@elastic.co>
2021-04-05 13:11:29 -04:00
Adam Locke
f06dc219b2
[DOCS] Fixes deprecation message for Geo-polygon query (#71141)
* [DOCS] Fixes deprecation message for Geo-polygon query

* Change deprecation to full block admonition.
2021-03-31 16:37:29 -04:00
James Rodewig
693807a6d3
[DOCS] Fix double spaces (#71082)
2021-03-31 09:57:47 -04:00
James Rodewig
38edcb65ae
[DOCS] Document index.query.default_field index setting (#69922)
2021-03-17 17:11:25 -04:00
Julie Tibshirani
da668e134a
Correct cross_fields docs on how analyzer groups are combined. (#69936)
When performing a multi_match in cross_fields mode, we group fields based on
their analyzer and create a blended query per group. Our docs claimed that the
group scores were combined through a boolean query, but they are actually
combined through a dismax that incorporates the tiebreaker parameter.

This commit updates the docs and adds a test verifying the behavior.
2021-03-08 14:56:17 -08:00
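The distinction drawn in commit da668e134a can be illustrated with the usual dismax formula: the best group score plus the tiebreaker times the remaining scores. The function name and the sample scores are assumptions for illustration only.

```python
def dis_max_score(group_scores: list, tie_breaker: float = 0.0) -> float:
    """Combine per-group scores: best score plus tie_breaker times the rest."""
    if not group_scores:
        return 0.0
    best = max(group_scores)
    return best + tie_breaker * (sum(group_scores) - best)


# Hypothetical scores from three analyzer groups.
scores = [2.0, 1.0, 0.5]
best_only = dis_max_score(scores)                 # tie_breaker 0: best group wins
blended = dis_max_score(scores, tie_breaker=0.5)  # others count at half weight
summed = dis_max_score(scores, tie_breaker=1.0)   # equals a plain sum
```

With a tiebreaker of 1.0 the result equals a plain sum of the group scores; any smaller tiebreaker gives a lower combined score.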
James Rodewig
79828761bc
[DOCS] Fix prefix_length data type (#70075)
2021-03-08 09:19:00 -05:00
James Rodewig
35c02c45f7
[DOCS] Note case_sensitive param was added in 7.10 (#69405) (#69466)
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>

Co-authored-by: Bhavya Gupta <46423346+bhavya121999@users.noreply.github.com>
2021-02-23 13:12:28 -05:00
James Rodewig
31fc59efdf
[DOCS] Fix capitalization for Query DSL (#69236)
2021-02-18 18:57:19 -05:00
James Rodewig
c65615911f
[DOCS] Expand simple query string query's multi-position token section (#68753)
2021-02-09 16:07:02 -05:00
Ignacio Vera
f58d7854c5
Deprecate GeoPolygon query in favour of GeoShape query. (#64227)
2021-02-09 10:21:18 +01:00
James Rodewig
0f5af55258
[DOCS] Update example request description (#68587) (#68658)
The doc is misleading: "The following intervals search returns documents containing `my favorite food` **immediately** followed by `hot water` or `cold porridge`."

`max_gaps` applies only to the match query and is not used for checking proximity with the other match. The example given actually says: `This search would match a my_text value of my favorite food is cold`

Co-authored-by: Julien Guay <guay_j@yahoo.fr>
2021-02-08 08:50:56 -05:00
James Rodewig
7f3a4525a4
[DOCS] Remove outdated deprecated notes (#68246)
2021-02-01 09:30:45 -05:00
Ignacio Vera
808b4e71f1
Add support for Spatial Relationships to geo_point field (#67631)
Lucene 8.8 supports querying LatLonPoint fields using spatial relationships.
2021-01-20 13:18:28 +01:00