Commit graph

735 commits

Author SHA1 Message Date
Kathleen DeRusso
0570b0baaa
Update text expansion/weighted tokens documentation make examples consistent with clients (#103663)
* Update text expansion docs and clarify int/float for token pruning config

* Fix formatting

* Fix tests

* Fix tests
2024-01-02 14:21:45 -05:00
Daniel Mitterdorfer
26115fc151
Exists query also works with only doc_values (#103647)
With this commit we amend the docs for the `exists` query to clarify
that it works with either `index` *or* `doc_values` set to `true` in the
mapping. The `exists` query fails only when both are disabled.
2023-12-21 16:33:42 +01:00
Mayya Sharipova
d6c53e03d2
Improve span queries documentation (#103490)
Improvement includes:
1. Remove reference to Lucene queries (this information is not necessary
for Elastic users, and can be outdated)
2. For `span_field_masking` include a note to use the
"require_field_match" : false parameter for highlighters to work.

Closes #101804
2023-12-19 14:51:19 -05:00
Kathleen DeRusso
3520584aac
Add optional pruning config (weighted terms scoring) to text expansion query (#102862)
Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2023-12-13 14:53:13 -05:00
Mayya Sharipova
b014843078
Return matched_queries in Percolator (#103084)
Return matched_queries for named queries in Percolator.

In a response, each hit together with
a `_percolator_document_slot` field will contain
`_percolator_document_slot_<slotNumber>_matched_queries` fields that will show
which sub-queries matched each percolated document.
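
A minimal sketch of how a client might read these fields, assuming the hit exposes them under a `fields` object. The helper names and the sample hit are hypothetical and only illustrate the field-name pattern described above; they are not real server output.

```python
def matched_queries_field(slot: int) -> str:
    """Build the per-slot field name described in the commit message."""
    return f"_percolator_document_slot_{slot}_matched_queries"


def matched_queries_by_slot(hit: dict) -> dict:
    """Map each percolated document slot to the named sub-queries that matched it."""
    fields = hit.get("fields", {})
    slots = fields.get("_percolator_document_slot", [])
    return {slot: fields.get(matched_queries_field(slot), []) for slot in slots}


# Hypothetical hit, shaped after the description above (an assumption).
example_hit = {
    "_id": "1",
    "fields": {
        "_percolator_document_slot": [0, 2],
        "_percolator_document_slot_0_matched_queries": ["query_a"],
        "_percolator_document_slot_2_matched_queries": ["query_a", "query_b"],
    },
}

result = matched_queries_by_slot(example_hit)
```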

Closes #10163
2023-12-11 09:07:26 -05:00
Kathleen DeRusso
4dd9e2a772
[Query Rules] Add some usability clarifications to docs (#102990)
* [Query Rules] Add some usability clarifications to docs

* Fix typo
2023-12-06 17:16:56 -05:00
Riahiamirreza
f99b4459d7
Remove redundant character in mlt-query.asciidoc (#102945) 2023-12-04 14:44:12 -06:00
Kathleen DeRusso
4567d397fa
Clarify text expansion query docs to not suggest enabling track_total_hits for performance (#102102) 2023-11-20 08:56:26 -05:00
Mayya Sharipova
61c7483fc9
Make knn search a query (#98916)
This introduced a new knn query:
- knn query is executed during the Query phase similar to all other queries.
- No k parameter; k defaults to size
- num_candidates is the size of the queue of candidates to consider while
  searching the graph on each shard
- For aggregations: "size" results are collected with total = size * shards.
   Aggregations will see size * shards results.
- All filters from DSL are applied as post-filters, except: 1) alias filter
 is applied as  pre-filter or 2) a filter provided as a parameter
 inside knn query.
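
A rough sketch of a request body built along these lines, written in Python for illustration. The field name `image-vector`, the sample vector, and both helper functions are assumptions, not part of the commit; the body simply omits `k` and relies on `size`, as the list above describes.

```python
import json


def knn_search_body(field, query_vector, num_candidates, size, filter_query=None):
    """Build a search body that uses knn as a regular query (no k parameter)."""
    knn = {
        "field": field,
        "query_vector": query_vector,
        "num_candidates": num_candidates,  # per-shard candidate queue size
    }
    if filter_query is not None:
        # A filter passed inside the knn query is the one applied as a pre-filter.
        knn["filter"] = filter_query
    return {"size": size, "query": {"knn": knn}}


def results_seen_by_aggregations(size, shards):
    """Aggregations see `size` results from each shard, as stated above."""
    return size * shards


body = knn_search_body(
    field="image-vector",  # hypothetical dense_vector field
    query_vector=[0.1, 0.2, 0.3],
    num_candidates=50,
    size=10,
    filter_query={"term": {"file-type": "png"}},  # hypothetical filter
)
print(json.dumps(body, indent=2))
```

With `size` set to 10 and an index of 3 shards, `results_seen_by_aggregations(10, 3)` gives 30.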
2023-11-01 14:21:40 -04:00
Benjamin Trent
79c0bd277f
Clarify that duplicate _name values for queries in the same request is undefined (#101523)
relates to: #101480
2023-10-30 14:58:20 -04:00
Mayya Sharipova
e2920cfbb0
Add docs on constant_score_blended rewrite (#101494)
PR #94494 introduced a new rewrite method from Lucene in 8.8,
but no documentation changes were added. This adds the new method
to the documentation.
2023-10-30 14:42:37 -04:00
Benjamin Trent
d3e9bf02f8
Updating percolate query docs to account for custom similarity limitation (#101386) 2023-10-27 06:47:13 -04:00
Carlos Delgado
f2dfbfe8c4
[DOCS] Add sparse-vector field type to docs, changed references (#100348) 2023-10-06 14:25:27 +02:00
Ioana Tagirta
7cd1987e5d
Make _index optional for pinned query docs (#97450)
Currently pinned queries require either the `ids` or `docs` parameter.
`docs` allows pinning documents from specific indices. However for
`docs` the `_index` field is always required:

```
GET test/_search
{
  "query": {
    "pinned": {
      "organic": {
        "query_string": {
          "query": "something"
        }
      },
      "docs": [
        { "_id": "1" }
      ]
    }
  }
}
```

returns an error:

```
{
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "[10:22] [pinned] failed to parse field [docs]",
        "line": 10,
        "col": 22
      }
    ],
    "type": "parsing_exception",
    "reason": "[10:22] [pinned] failed to parse field [docs]",
    "line": 10,
    "col": 22,
    "caused_by": {
      "type": "x_content_parse_exception",
      "reason": "[10:22] [pinned] failed to parse field [docs]",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "Required [_index]"
      }
    }
  },
  "status": 400
}
```

The proposal here is to make `_index` optional. I don't think we have a
strong requirement for making `_index` required. When it was initially
introduced in https://github.com/elastic/elasticsearch/pull/74873, we
mostly wanted the ability to pin docs from specific indices.

Making `_index` optional can give more flexibility to use a combination
of pinned documents from specific indices or just document ids. This
change can also help with pinned query rules. Currently pinned query
rules can accept either `ids` or `docs`. If multiple pinned query rules
match and they use a combination of `ids` and `docs`, we cannot build a
pinned query and we would need to return an error. This is because a
pinned query cannot accept both `ids` and `docs`. By making `_index`
optional we would no longer need to return an error when pinned query
rules use a combination of `ids` and `docs`, because we can easily
translate `ids` into `docs`.
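
A minimal sketch of that translation, assuming each matching rule contributes an `actions` object holding either `ids` or `docs`. The function names are hypothetical, and converting ids to strings is a choice made for this illustration rather than something the proposal specifies.

```python
def ids_to_docs(ids):
    """Turn plain document ids into `docs` entries without an `_index`."""
    return [{"_id": str(doc_id)} for doc_id in ids]


def merge_pinned_actions(actions):
    """Combine the actions of all matching rules into a single `docs` list."""
    docs = []
    for action in actions:
        docs.extend(action.get("docs", []))
        docs.extend(ids_to_docs(action.get("ids", [])))
    return docs


# Two matching rules: one pins by index and id, the other by id only.
merged = merge_pinned_actions([
    {"docs": [{"_index": "singers", "_id": "1"}]},
    {"ids": [2]},
])
```

Because every entry ends up in `docs`, the resulting pinned query never has to carry `ids` and `docs` at the same time.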

The following pinned queries would be equivalent:

```
GET test/_search
{
  "query": {
    "pinned": {
      "organic": {
        "query_string": {
          "query": "something"
        }
      },
      "docs": [
        { "_id": "1" }
      ]
    }
  }
}

GET test/_search
{
  "query": {
    "pinned": {
      "organic": {
        "query_string": {
          "query": "something"
        }
      },
      "ids": [1]
    }
  }
}
```

The scores should be consistent when using a combination of `docs` entries that
may or may not specify `_index`; see the example:

<details>   <summary>Example </summary>

```

PUT test-1/_doc/1
{
  "title": "doc 1"
}

PUT test-1/_doc/2
{
  "title": "doc 2"
}

PUT test-2/_doc/1
{
  "title": "doc 1"
}

PUT test-2/_doc/3
{
  "title": "lalala"
}

POST test-1,test-2/_search
{
  "query": {
    "pinned": {
      "organic": {
        "query_string": {
          "query": "lalala"
        }
      },
      "docs": [
        { "_id": "2", "_index": "test-1" },
        { "_id": "1" }
      ]
    }
  }
}

```

response:

```

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 4,
      "relation": "eq"
    },
    "max_score": 1.7014124e+38,
    "hits": [
      {
        "_index": "test-1",
        "_id": "2",
        "_score": 1.7014124e+38,
        "_source": {
          "title": "doc 2"
        }
      },
      {
        "_index": "test-1",
        "_id": "1",
        "_score": 1.7014122e+38, // same score as doc with id 1 from test-2
        "_source": {
          "title": "doc 1"
        }
      },
      {
        "_index": "test-2",
        "_id": "1",
        "_score": 1.7014122e+38, // same score as doc with id 1 from test-1
        "_source": {
          "title": "doc 1"
        }
      },
      {
        "_index": "test-2",
        "_id": "3",
        "_score": 0.8025915, // organic result
        "_source": {
          "title": "lalala"
        }
      }
    ]
  }
}

```

</details>

For query rules, if we have two query rules that both match and use a
combination of `ids` and `docs`:

```
PUT _query_rules/test-ruleset
{
  "ruleset_id": "test-ruleset",
  "rules": [
    {
      "rule_id": "1",
      "type": "pinned",
      "criteria": [
        {
          "type": "exact",
          "metadata": "query_string",
          "value": "country"
        }
      ],
      "actions": {
        "docs": [
          { "_index": "singers", "_id": "1" }
        ]
      }
    },
    {
      "rule_id": "2",
      "type": "pinned",
      "criteria": [
        {
          "type": "exact",
          "metadata": "query_string",
          "value": "country"
        }
      ],
      "actions": {
        "ids": [
          2
        ]
      }
    }
  ]
}
```

and the following query:

```
POST singers/_search
{
    "query": {
        "rule_query": {
            "organic": {
                "query_string": {
                  "default_field": "name",
                  "query": "country"
                }
            },
            "match_criteria": {
                "query_string": "country"
            },
            "ruleset_id": "test-ruleset"
        }
    }
}
```

then this would get translated into the following pinned query:

```
POST singers/_search
{
    "query": {
        "pinned": {
            "organic": {
                "query_string": {
                  "default_field": "name",
                  "query": "country"
                }
            },
            "docs": [
               { "_index": "singers", "_id": "1" },
               {"_id": 2 }
            ]
        }
    }
}
```

I think we can also simplify the pinned query rule so that it only
receives `docs`:

```
PUT _query_rules/test-ruleset
{
  "ruleset_id": "test-ruleset",
  "rules": [
    {
      "rule_id": "1",
      "type": "pinned",
      "criteria": [
        {
          "type": "exact",
          "metadata": "query_string",
          "value": "country"
        }
      ],
      "actions": {
        "docs": [
          { "_id": "1" },
          { "_id": "2", "_index": "singers" }
        ]
      }
    }
  ]
}
```
2023-09-07 04:39:56 -04:00
Abdon Pijpelink
8ac9fef3b7
[DOCS] Add 'boost' parameter to match query (#98108) 2023-08-09 14:28:27 +02:00
Kathleen DeRusso
93dd279dea
Docs: Link to search with query rules page from query rules DSL (#98269)
* Link to search with query rules page from query rules DSL

* Update docs/reference/query-dsl/rule-query.asciidoc

Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>

---------

Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
2023-08-08 08:50:53 -04:00
Kathleen DeRusso
23e35d5687
[Query Rules] Add documentation for rule_query (#97667)
* Add docs for rule query

* Add test

* Fix formatting in rule query dsl

* Remove query string as required from rule query docs

* PR feedback

* Update with API changes

* Expand and clarify 'search using query rules' doc

* Clean up wording

* Update put syntax

* Fix examples after refactor

* Update docs/reference/query-dsl/rule-query.asciidoc

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

* PR feedback + update privilege

* PR feedback

* More PR feedback

* Small correction

---------

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
2023-08-02 15:56:06 -04:00
István Zoltán Szabó
57fd6b84fb
[DOCS] Expands ELSER tutorial with optimization info (#97392)
Co-authored-by: David Kyle <david.kyle@elastic.co>
2023-07-19 10:38:11 +02:00
debadair
777598d602
[DOCS] Remove redirect pages (#88738)
* [DOCS] Remove manual redirects

* [DOCS] Removed refs to modules-discovery-hosts-providers

* [DOCS] Fixed broken internal refs

* Fixing bad cross links in ES book, and adding redirects.asciidoc[] back into docs/reference/index.asciidoc.

* Update docs/reference/search/point-in-time-api.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/setup/restart-cluster.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/sql/endpoints/translate.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/snapshot-restore/restore-snapshot.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

* Update repository-azure.asciidoc

* Update node-tool.asciidoc

* Update repository-azure.asciidoc

---------

Co-authored-by: amyjtechwriter <61687663+amyjtechwriter@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Amy Jonsson <amy.jonsson@elastic.co>
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2023-05-24 12:32:46 +01:00
István Zoltán Szabó
a6ab5ce824
[DOCS] Adds reference documentation to the text expansion query (#96151) 2023-05-17 09:39:23 +02:00
Luzia Kündig
49ccc6c275
Update range-query.asciidoc (#95822)
The example query for "date between today and yesterday" only returns documents from the day before if "lt" is used.
2023-05-05 11:15:40 +02:00
Jim Ferenczi
2f29830cd3
Add the ability to return the score of the named queries (#94564)
This change adds a new REST parameter called `rest_include_named_queries_score` that, when set, includes the score of the named queries that matched the document.
Note that with this change, the score of named queries is always returned when using the transport client. The REST layer can set the format of
the matched_queries section for BWC (kept as is by default).

Closes #65563
2023-03-23 13:17:26 +00:00
apurvsibal
ff6f1bc227
Wrapper query docs refer to transport client and HLRC #89263 (#91149) 2022-12-15 13:44:58 +00:00
Abdon Pijpelink
7d01d768c2
[DOCS] Warn about calling vector functions repeatedly (#91864)
* [DOCS] Add script score vector function clarification

* [DOCS] Warn about calling vector functions repeatedly
2022-12-12 09:43:46 +01:00
Abdon Pijpelink
dceff3f862
[DOCS] Warn about potential overhead of named queries (#91512) 2022-11-14 14:05:40 +01:00
Julie Tibshirani
509b636d25
Correct docs for multi_match scoring (#91430)
The documentation claimed that for the most_fields type, the score is equal to
the sum of all matches divided by the number of matches. This is not correct;
we actually don't divide by the number of matches.

This line in the documentation was added several years ago as part of a large
PR, and was likely just a mistake.
2022-11-10 08:16:56 -08:00
Etki
cec0ab20ff
Added reference to terms_set query in regular terms query documentation (#91204)
* Added reference to terms_set query in regular terms query documentation

* Update docs/reference/query-dsl/terms-query.asciidoc

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
2022-11-09 16:00:52 +01:00
Craig Taverner
c19f642d94
Refine geo-point and geo-shape docs (#90913)
* Refine geo-point and geo-shape docs

While reviewing the docs for another issue, some deprecated
references to prefix-trees were discovered, leading to interest
in bringing the docs a little more up-to-date.

* Update docs/reference/mapping/types/geo-point.asciidoc

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

* Update docs/reference/mapping/types/geo-shape.asciidoc

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
2022-10-26 12:21:34 +02:00
Julie Tibshirani
3c1b070329
Avoid negative scores with cross_fields type (#89016)
The cross_fields scoring type can produce negative scores when some documents
are missing fields. When blending term document frequencies, we take the maximum
document frequency across all fields. If one field appears in fewer documents
than another, this means that its IDF can become negative. This is because IDF
is calculated as `Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))`

This change adjusts the docFreq for each field to `Math.min(docCount, docFreq)`
so that the IDF can never become negative. It makes sense that the term document
frequency should never exceed the number of documents containing the field.
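
The effect can be checked with a few lines of Python. This is only a sketch of the formula quoted above with made-up numbers; the function names are hypothetical and are not taken from the Elasticsearch or Lucene code.

```python
import math


def idf(doc_count, doc_freq):
    """IDF as quoted above: log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def clamped_idf(doc_count, doc_freq):
    """Same formula, with docFreq capped at docCount as in this change."""
    return idf(doc_count, min(doc_count, doc_freq))


# Made-up numbers: the field exists in 10 documents, but the blended
# (maximum) document frequency taken from another field is 50.
unclamped = idf(10, 50)        # negative, because docFreq exceeds docCount
clamped = clamped_idf(10, 50)  # small but positive
print(unclamped, clamped)
```

When `docFreq` is already at most `docCount`, the cap changes nothing and both functions return the same value.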
2022-09-06 13:02:24 -07:00
Abdon Pijpelink
f2257cae89
[DOCS] Adds note about escaping backslashes in regex (#89276)
* [DOCS] Adds note about escaping backslashes in regex

* Fix typo

* Simplify example
2022-08-17 09:40:30 +02:00
Abdon Pijpelink
e4c7febea1
Fix: Update geo-bounding-box-query.asciidoc (#87459) (#89301)
There are some redundant words so I just removed those words. Please accept this change.

(cherry picked from commit e1e5398051)

Co-authored-by: Adnan Ashraf <adnan.ashraff1@gmail.com>
2022-08-12 18:38:05 +09:30
Gonçalo Montalvão Marques
c4bd4d3cbf
Fix typo in geo-distance-query doc (#89148) 2022-08-08 09:59:47 +02:00
Abdon Pijpelink
0eca582326
[DOCS] Remove camel case variations (#88650)
* [DOCS] Remove camel case variations. Closes #73417

* [DOCS] Switch to sentence casing in titles
2022-07-20 17:06:34 +02:00
Mayya Sharipova
1ae209335d
Undeprecate function_score query (#87807)
We had a plan to deprecate function_score query in favor of
script_score query, but ran into a roadblock of missing
functionality to combine scores from different
functions (particularly "first" script_score).
We have several proposals to address this missing
functionality:
 [scripted_boolean](https://github.com/elastic/elasticsearch/issues/27588#issuecomment-444887726)
 [compound_query](https://github.com/elastic/elasticsearch/issues/51967)
 [first_query](https://github.com/elastic/elasticsearch/issues/52482)

But for now, we decided not to deprecate function_score query,
and hence we need to remove any mention that we are deprecating it.

Relates to #42811
Closes #71934
2022-06-17 11:04:26 -04:00
Luca Cavanna
fe327c6e1a
[DOCS] Clarify index_prefix in prefix query docs (#87450)
The current docs mention that Elasticsearch indexes prefixes between 2 and 5 characters in a separate field. 2 and 5 are default values, and the size of the prefixes indexed depends on the configuration settings.
2022-06-14 14:32:37 +02:00
Ignacio Vera
e6b4097fc8
new geo_grid query to be used with geogrid aggregations (#86596)
Query that takes the key of a geogrid aggregation bucket as input and matches the
documents inside that bucket.
2022-05-23 11:38:07 +02:00
Craig Taverner
5f7ea792ac
Soft-deprecation of point/geo_point formats (#86835)
* Soft-deprecation of point/geo_point formats

Since GeoJSON and WKT are now common formats for all three types:
  geo_shape, geo_point and point
We decided to soft-deprecate the other point formats by ordering:
* GeoJSON (object with keys `type` and `coordinates`)
* WKT `POINT(x y)`
* Object with keys `lat` and `lon` (or `x` and `y` for point)
* Array [lon,lat]
* String `"lat,lon"` (or `"x,y"` in point)
* String with geohash (only in `geo_point`)

The geohash is last because it is only in one field type.
The string version is second last because it is the most controversial
being the only version to reverse the coordinate order from all other
formats (for geo_point only, since the coordinates are not reversed
in point).
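
For illustration, here is a small Python sketch that renders one `geo_point` location in the first five formats, in the order listed above. The helper name and the coordinates are made up; the geohash form is left out because it needs an encoder.

```python
def geo_point_formats(lat, lon):
    """Render one location in the listed formats, in documentation order."""
    return {
        "geojson": {"type": "Point", "coordinates": [lon, lat]},
        "wkt": f"POINT({lon} {lat})",
        "object": {"lat": lat, "lon": lon},
        "array": [lon, lat],
        # The plain string is the only format that puts latitude first.
        "string": f"{lat},{lon}",
    }


formats = geo_point_formats(lat=41.12, lon=-71.34)
for name, value in formats.items():
    print(name, value)
```

Every format except the plain string puts longitude first, which is the reversal the message calls controversial.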

In addition we replaced many examples in both documentation and tests
to prioritize WKT over the plain string format.

Many remaining examples of array format or object with keys still exist
and could be replaced by, for example, GeoJSON, if we feel the need.

* Incorrect quote position
2022-05-17 23:46:43 +02:00
Nik Everett
de5ca3cfaa
fixed typo (#84694) (#84726)
Co-authored-by: Mustafa Balila (rootsofnull) <hitsugayatoshiro899@gmail.com>
2022-03-07 14:30:51 -05:00
James Rodewig
1fe2b0d866
[DOCS] Fix percolate query headings (#83988)
Fixes the heading levels for the percolate query doc so the on-page TOC displays correctly.
2022-02-15 15:56:04 -05:00
James Rodewig
c1aba1e109
[DOCS] Move tip for percolate query example (#83972)
Moves a tip for the percolate query to the beginning of the example.
2022-02-15 15:24:33 -05:00
James Rodewig
b552d5cb0e
[DOCS] Re-add network traffic para to term query (#83047)
Re-adds a paragraph about minimizing network traffic for a terms lookup.

This paragraph was erroneously removed as part of https://github.com/elastic/elasticsearch/pull/42889.
2022-01-25 10:27:10 -05:00
James Rodewig
e53ecc3f43
[DOCS] Document missing flag values for regexp query (#82265)
Documents the `EMPTY` and `NONE` `flag` values for the `regexp` query.

Also documents the `""` (empty string) value, which is an alias for `ALL`.

Closes #81978.
2022-01-18 14:15:29 -05:00
jenish jain
13e9a605b8
[DOCS] Fix track_total_hits xref (#82739) 2022-01-18 12:43:17 -05:00
James Rodewig
0a3f6acadd
[DOCS] Clarify nested query behavior for must_not clauses (#82727)
Closes #81052.
2022-01-18 10:14:26 -05:00
James Rodewig
f5f76ff1ca
[DOCS] Note that default_field supports wildcards (#81127)
Changes:

* Notes that the query string query's `default_field` and `fields` parameters support wildcards.
* Adds an xref to the `index.query.default_field` docs to the `default_field` parameter.
2022-01-04 08:26:13 -05:00
James Rodewig
dd1ed30731
[DOCS] Fix combined_fields query ref in multi_match query docs (#81456)
The current `multi_match` docs contain an erroneous reference to the `combined_fields` query. This updates the reference to point to the correct query.

Relates to https://github.com/elastic/elasticsearch/pull/76893
2021-12-07 16:47:44 -05:00
James Rodewig
f56a0f4b66
[DOCS] Remove testenv annotations from doc snippet tests (#80023)
Removes `testenv` annotations and related code. These annotations originally let you skip x-pack snippet tests in the docs. However, that's no longer possible.

Relates to #79309, #31619
2021-11-05 18:38:50 -04:00
James Rodewig
0333d89f6e
[DOCS] Add wildcard parameter to wildcard query docs (#79722)
Changes:

* Documents the `wildcard` parameter for the `wildcard` query. This parameter is an alias for the `value` parameter.
* Reorders the parameters alphabetically.

Closes #79711
2021-10-26 12:35:11 -04:00
Alexander Reelsen
19d12f19f5
[DOCS] Add script note to nested query docs (#77431)
As the script has only access to the nested document, this should be
documented.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-10-05 10:32:20 -04:00
James Rodewig
e729c3f543
[DOCS] Clarify geoshape orientation docs (#75888)
Adds additional information about how Elasticsearch uses polygon orientation. Elasticsearch only uses a polygon's orientation to determine if it crosses the international dateline. If so, Elasticsearch splits the polygon at the dateline.

Closes #74891
2021-09-08 11:10:03 -04:00