mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-06-29 18:03:32 -04:00
In a few previous PR's we restructured the ES|QL docs to make it possible to generate them dynamically. This PR just moves a few files around to make the query languages docs easier to work with, and a little more organized like the ES|QL docs. A bit part of this was setting up redirects to the new locations, so other repo's could correctly link to the elasticsearch docs.
189 lines
8.7 KiB
Markdown
189 lines
8.7 KiB
Markdown
---
|
|
mapped_pages:
|
|
- https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-functions-search.html
|
|
---
|
|
|
|
# Full-Text Search Functions [sql-functions-search]
|
|
|
|
Search functions should be used when performing full-text search, namely when the `MATCH` or `QUERY` predicates are being used. Outside a, so-called, search context, these functions will return default values such as `0` or `NULL`.
|
|
|
|
Elasticsearch SQL optimizes all queries executed against {{es}} depending on the scoring needs. Using [`track_scores`](/reference/elasticsearch/rest-apis/sort-search-results.md#_track_scores) on the search request or [`_doc` sorting](/reference/elasticsearch/rest-apis/sort-search-results.md) that disables scores calculation, Elasticsearch SQL instructs {{es}} not to compute scores when these are not needed. For example, every time a `SCORE()` function is encountered in the SQL query, the scores are computed.
|
|
|
|
## `MATCH` [sql-functions-search-match]
|
|
|
|
```sql
|
|
MATCH(
|
|
field_exp, <1>
|
|
constant_exp <2>
|
|
[, options]) <3>
|
|
```
|
|
|
|
**Input**:
|
|
|
|
1. field(s) to match
|
|
2. matching text
|
|
3. additional parameters; optional
|
|
|
|
|
|
**Description**: A full-text search option, in the form of a predicate, available in Elasticsearch SQL that gives the user control over powerful [match](/reference/query-languages/query-dsl/query-dsl-match-query.md) and [multi_match](/reference/query-languages/query-dsl/query-dsl-multi-match-query.md) {{es}} queries.
|
|
|
|
The first parameter is the field or fields to match against. In case it receives one value only, Elasticsearch SQL will use a `match` query to perform the search:
|
|
|
|
```sql
|
|
SELECT author, name FROM library WHERE MATCH(author, 'frank');
|
|
|
|
author | name
|
|
---------------+-------------------
|
|
Frank Herbert |Dune
|
|
Frank Herbert |Dune Messiah
|
|
Frank Herbert |Children of Dune
|
|
Frank Herbert |God Emperor of Dune
|
|
```
|
|
|
|
However, it can also receive a list of fields and their corresponding optional `boost` value. In this case, Elasticsearch SQL will use a `multi_match` query to match the documents:
|
|
|
|
```sql
|
|
SELECT author, name, SCORE() FROM library WHERE MATCH('author^2,name^5', 'frank dune');
|
|
|
|
author | name | SCORE()
|
|
---------------+-------------------+---------------
|
|
Frank Herbert |Dune |11.443176
|
|
Frank Herbert |Dune Messiah |9.446629
|
|
Frank Herbert |Children of Dune |8.043278
|
|
Frank Herbert |God Emperor of Dune|7.0029488
|
|
```
|
|
|
|
::::{note}
|
|
The `multi_match` query in {{es}} has the option of [per-field boosting](/reference/query-languages/query-dsl/query-dsl-multi-match-query.md) that gives preferential weight (in terms of scoring) to fields being searched in, using the `^` character. In the example above, the `name` field has a greater weight in the final score than the `author` field when searching for `frank dune` text in both of them.
|
|
::::
|
|
|
|
|
|
Both options above can be used in combination with the optional third parameter of the `MATCH()` predicate, where one can specify additional configuration parameters (separated by semicolon `;`) for either `match` or `multi_match` queries. For example:
|
|
|
|
```sql
|
|
SELECT author, name, SCORE() FROM library WHERE MATCH(name, 'to the star', 'operator=OR;fuzziness=AUTO:1,5;minimum_should_match=1')
|
|
ORDER BY SCORE() DESC LIMIT 2;
|
|
|
|
author | name | SCORE()
|
|
-----------------+------------------------------------+---------------
|
|
Douglas Adams |The Hitchhiker's Guide to the Galaxy|3.1756816
|
|
Peter F. Hamilton|Pandora's Star |3.0997515
|
|
```
|
|
|
|
::::{note}
|
|
The allowed optional parameters for a single-field `MATCH()` variant (for the `match` {{es}} query) are: `analyzer`, `auto_generate_synonyms_phrase_query`, `lenient`, `fuzziness`, `fuzzy_transpositions`, `fuzzy_rewrite`, `minimum_should_match`, `operator`, `max_expansions`, `prefix_length`.
|
|
::::
|
|
|
|
|
|
::::{note}
|
|
The allowed optional parameters for a multi-field `MATCH()` variant (for the `multi_match` {{es}} query) are: `analyzer`, `auto_generate_synonyms_phrase_query`, `lenient`, `fuzziness`, `fuzzy_transpositions`, `fuzzy_rewrite`, `minimum_should_match`, `operator`, `max_expansions`, `prefix_length`, `slop`, `tie_breaker`, `type`.
|
|
::::
|
|
|
|
|
|
|
|
## `QUERY` [sql-functions-search-query]
|
|
|
|
```sql
|
|
QUERY(
|
|
constant_exp <1>
|
|
[, options]) <2>
|
|
```
|
|
|
|
**Input**:
|
|
|
|
1. query text
|
|
2. additional parameters; optional
|
|
|
|
|
|
**Description**: Just like `MATCH`, `QUERY` is a full-text search predicate that gives the user control over the [query_string](/reference/query-languages/query-dsl/query-dsl-query-string-query.md) query in {{es}}.
|
|
|
|
The first parameter is basically the input that will be passed as is to the `query_string` query, which means that anything that `query_string` accepts in its `query` field can be used here as well:
|
|
|
|
```sql
|
|
SELECT author, name, SCORE() FROM library WHERE QUERY('name:dune');
|
|
|
|
author | name | SCORE()
|
|
---------------+-------------------+---------------
|
|
Frank Herbert |Dune |2.2886353
|
|
Frank Herbert |Dune Messiah |1.8893257
|
|
Frank Herbert |Children of Dune |1.6086556
|
|
Frank Herbert |God Emperor of Dune|1.4005898
|
|
```
|
|
|
|
A more advanced example, showing more of the features that `query_string` supports, of course possible with Elasticsearch SQL:
|
|
|
|
```sql
|
|
SELECT author, name, page_count, SCORE() FROM library WHERE QUERY('_exists_:"author" AND page_count:>200 AND (name:/star.*/ OR name:duna~)');
|
|
|
|
author | name | page_count | SCORE()
|
|
------------------+-------------------+---------------+---------------
|
|
Frank Herbert |Dune |604 |3.7164764
|
|
Frank Herbert |Dune Messiah |331 |3.4169943
|
|
Frank Herbert |Children of Dune |408 |3.2064917
|
|
Frank Herbert |God Emperor of Dune|454 |3.0504425
|
|
Peter F. Hamilton |Pandora's Star |768 |3.0
|
|
Robert A. Heinlein|Starship Troopers |335 |3.0
|
|
```
|
|
|
|
The query above uses the `_exists_` query to select documents that have values in the `author` field, a range query for `page_count` and regex and fuzziness queries for the `name` field.
|
|
|
|
If one needs to customize various configuration options that `query_string` exposes, this can be done using the second *optional* parameter. Multiple settings can be specified separated by a semicolon `;`:
|
|
|
|
```sql
|
|
SELECT author, name, SCORE() FROM library WHERE QUERY('dune god', 'default_operator=and;default_field=name');
|
|
|
|
author | name | SCORE()
|
|
---------------+-------------------+---------------
|
|
Frank Herbert |God Emperor of Dune|3.6984892
|
|
```
|
|
|
|
::::{note}
|
|
The allowed optional parameters for `QUERY()` are: `allow_leading_wildcard`, `analyze_wildcard`, `analyzer`, `auto_generate_synonyms_phrase_query`, `default_field`, `default_operator`, `enable_position_increments`, `escape`, `fuzziness`, `fuzzy_max_expansions`, `fuzzy_prefix_length`, `fuzzy_rewrite`, `fuzzy_transpositions`, `lenient`, `max_determinized_states`, `minimum_should_match`, `phrase_slop`, `rewrite`, `quote_analyzer`, `quote_field_suffix`, `tie_breaker`, `time_zone`, `type`.
|
|
::::
|
|
|
|
|
|
|
|
## `SCORE` [sql-functions-search-score]
|
|
|
|
```sql
|
|
SCORE()
|
|
```
|
|
|
|
**Input**: *none*
|
|
|
|
**Output**: `double` numeric value
|
|
|
|
**Description**: Returns the [relevance](https://www.elastic.co/guide/en/elasticsearch/guide/2.x/relevance-intro.html) of a given input to the executed query. The higher score, the more relevant the data.
|
|
|
|
::::{note}
|
|
When doing multiple text queries in the `WHERE` clause then, their scores will be combined using the same rules as {{es}}'s [bool query](/reference/query-languages/query-dsl/query-dsl-bool-query.md).
|
|
::::
|
|
|
|
|
|
Typically `SCORE` is used for ordering the results of a query based on their relevance:
|
|
|
|
```sql
|
|
SELECT SCORE(), * FROM library WHERE MATCH(name, 'dune') ORDER BY SCORE() DESC;
|
|
|
|
SCORE() | author | name | page_count | release_date
|
|
---------------+---------------+-------------------+---------------+--------------------
|
|
2.2886353 |Frank Herbert |Dune |604 |1965-06-01T00:00:00Z
|
|
1.8893257 |Frank Herbert |Dune Messiah |331 |1969-10-15T00:00:00Z
|
|
1.6086556 |Frank Herbert |Children of Dune |408 |1976-04-21T00:00:00Z
|
|
1.4005898 |Frank Herbert |God Emperor of Dune|454 |1981-05-28T00:00:00Z
|
|
```
|
|
|
|
However, it is perfectly fine to return the score without sorting by it:
|
|
|
|
```sql
|
|
SELECT SCORE() AS score, name, release_date FROM library WHERE QUERY('dune') ORDER BY YEAR(release_date) DESC;
|
|
|
|
score | name | release_date
|
|
---------------+-------------------+--------------------
|
|
1.4005898 |God Emperor of Dune|1981-05-28T00:00:00Z
|
|
1.6086556 |Children of Dune |1976-04-21T00:00:00Z
|
|
1.8893257 |Dune Messiah |1969-10-15T00:00:00Z
|
|
2.2886353 |Dune |1965-06-01T00:00:00Z
|
|
```
|
|
|
|
|