mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-04-24 23:27:25 -04:00
HLRC API for _termvectors (#33447)
* HLRC API for _termvectors relates to #27205
This commit is contained in:
parent
f19565c3e0
commit
bf4d90a5dc
12 changed files with 1347 additions and 4 deletions
134
docs/java-rest/high-level/document/term-vectors.asciidoc
Normal file
@ -0,0 +1,134 @@
--
:api: term-vectors
:request: TermVectorsRequest
:response: TermVectorsResponse
--

[id="{upid}-{api}"]
=== Term Vectors API

The Term Vectors API returns information and statistics on the terms in the
fields of a particular document. The document can either be stored in the index
or be provided artificially by the user.
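
To make the notion of a term vector concrete, the following is a minimal sketch that is independent of Elasticsearch. It assumes a toy analyzer that splits on whitespace and lowercases, which real analyzers do not necessarily do. The class `TermVectorSketch` and its `Token` type are hypothetical names used for illustration only; they are not part of the client.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustration only: a toy whitespace analyzer, not Elasticsearch code.
class TermVectorSketch {

    // One occurrence of a term: its ordinal position and character offsets.
    static final class Token {
        final int position;
        final int startOffset; // inclusive
        final int endOffset;   // exclusive

        Token(int position, int startOffset, int endOffset) {
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    // Maps each lowercased term of the text to the list of its occurrences.
    static Map<String, List<Token>> termVector(String text) {
        Map<String, List<Token>> vector = new LinkedHashMap<>();
        int position = 0;
        int i = 0;
        while (i < text.length()) {
            // Skip whitespace between terms.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume one run of non-whitespace characters.
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i > start) {
                String term = text.substring(start, i).toLowerCase(Locale.ROOT);
                vector.computeIfAbsent(term, k -> new ArrayList<>())
                      .add(new Token(position++, start, i));
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        termVector("the quick fox jumps over the dog").forEach((term, tokens) -> {
            // The term frequency is the number of occurrences in this one document.
            System.out.println(term + " termFreq=" + tokens.size());
            for (Token t : tokens) {
                System.out.println("  position=" + t.position
                        + " offsets=[" + t.startOffset + "," + t.endOffset + ")");
            }
        });
    }
}
```

In this sketch the term frequency of a term is simply the size of its token list, and each token carries the position and offsets that the API can report per occurrence.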


[id="{upid}-{api}-request"]
==== Term Vectors Request

A +{request}+ expects an `index`, a `type` and an `id` to identify
a document, as well as the fields for which the information is retrieved.

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request]
--------------------------------------------------

Term vectors can also be generated for artificial documents, that is, for
documents not present in the index:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request-artificial]
--------------------------------------------------
<1> An artificial document is provided as an `XContentBuilder` object,
the Elasticsearch built-in helper to generate JSON content.

===== Optional arguments

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-request-optional-arguments]
--------------------------------------------------
<1> Set `fieldStatistics` to `false` (default is `true`) to omit the document
count, the sum of document frequencies and the sum of total term frequencies.
<2> Set `termStatistics` to `true` (default is `false`) to display
the total term frequency and the document frequency.
<3> Set `positions` to `false` (default is `true`) to omit the output of
positions.
<4> Set `offsets` to `false` (default is `true`) to omit the output of
offsets.
<5> Set `payloads` to `false` (default is `true`) to omit the output of
payloads.
<6> Set `filterSettings` to filter the terms that can be returned based
on their tf-idf scores.
<7> Set `perFieldAnalyzer` to specify a different analyzer than
the one that the field has.
<8> Set `realtime` to `false` (default is `true`) to retrieve term vectors
in near real time instead of real time.
<9> Set a routing parameter.
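
The field statistics and term statistics named in the callouts above can be illustrated with a small brute-force computation over a toy corpus. This is a sketch under stated assumptions, not how Elasticsearch computes them: each string stands for the value of one field in one document, terms are split on whitespace and lowercased, and `FieldStatsSketch` is a hypothetical class name used only here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustration only: computes the statistics by brute force over a toy corpus.
class FieldStatsSketch {
    final int docCount;          // documents with at least one term in the field
    final long sumDocFreq;       // sum over all terms of their document frequency
    final long sumTotalTermFreq; // total number of term occurrences in the field
    final Map<String, Integer> docFreq = new HashMap<>();    // per term: documents containing it
    final Map<String, Long> totalTermFreq = new HashMap<>(); // per term: occurrences in all documents

    FieldStatsSketch(List<String> fieldValues) {
        int docs = 0;
        long sumDf = 0;
        long sumTtf = 0;
        for (String value : fieldValues) {
            String trimmed = value.trim();
            if (trimmed.isEmpty()) {
                continue; // a document without terms does not count
            }
            docs++;
            Set<String> seenInThisDoc = new HashSet<>();
            for (String raw : trimmed.split("\\s+")) {
                String term = raw.toLowerCase(Locale.ROOT);
                totalTermFreq.merge(term, 1L, Long::sum);
                sumTtf++;
                // Document frequency grows only on the first occurrence per document.
                if (seenInThisDoc.add(term)) {
                    docFreq.merge(term, 1, Integer::sum);
                    sumDf++;
                }
            }
        }
        this.docCount = docs;
        this.sumDocFreq = sumDf;
        this.sumTotalTermFreq = sumTtf;
    }

    public static void main(String[] args) {
        FieldStatsSketch stats = new FieldStatsSketch(
                List.of("the quick fox", "the lazy dog", "the fox and the dog"));
        System.out.println("docCount=" + stats.docCount
                + " sumDocFreq=" + stats.sumDocFreq
                + " sumTotalTermFreq=" + stats.sumTotalTermFreq);
        System.out.println("docFreq(the)=" + stats.docFreq.get("the")
                + " totalTermFreq(the)=" + stats.totalTermFreq.get("the"));
    }
}
```

The three field-level values correspond to what `fieldStatistics` controls, and the two per-term maps correspond to what `termStatistics` adds.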


include::../execution.asciidoc[]


[id="{upid}-{api}-response"]
==== TermVectorsResponse

The `TermVectorsResponse` contains the following information:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-response]
--------------------------------------------------
<1> The index name of the document.
<2> The type name of the document.
<3> The id of the document.
<4> Indicates whether or not the document was found.


===== Inspecting Term Vectors
If the `TermVectorsResponse` contains a non-null list of term vectors,
more information about each of them can be obtained as follows:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-term-vectors]
--------------------------------------------------
<1> The list of `TermVector` for the document
<2> The name of the current field
<3> Field statistics for the current field - document count
<4> Field statistics for the current field - sum of total term frequencies
<5> Field statistics for the current field - sum of document frequencies
<6> Terms for the current field
<7> The name of the term
<8> Term frequency of the term
<9> Document frequency of the term
<10> Total term frequency of the term
<11> Score of the term
<12> Tokens of the term
<13> Position of the token
<14> Start offset of the token
<15> End offset of the token
<16> Payload of the token
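
The score of a term is tied to tf-idf filtering of the returned terms. The exact formula is not given here, so the following is only a sketch that assumes a textbook variant, `termFreq * ln(docCount / docFreq)`, and a "keep the N best terms" rule. `TfIdfSketch`, `score` and `topTerms` are hypothetical names, and the formula should not be read as the one Elasticsearch applies.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Illustration only: a textbook tf-idf, not necessarily the formula Elasticsearch applies.
class TfIdfSketch {

    // termFreq: occurrences in this document; docFreq: documents containing the term;
    // docCount: documents that have the field at all.
    static double score(long termFreq, long docFreq, long docCount) {
        return termFreq * Math.log((double) docCount / docFreq);
    }

    // Returns at most maxNumTerms terms, highest score first.
    // stats maps each term to the pair {termFreq, docFreq}.
    static List<String> topTerms(Map<String, long[]> stats, long docCount, int maxNumTerms) {
        List<String> terms = new ArrayList<>(stats.keySet());
        terms.sort(Comparator.comparingDouble(
                (String t) -> score(stats.get(t)[0], stats.get(t)[1], docCount)).reversed());
        return new ArrayList<>(terms.subList(0, Math.min(maxNumTerms, terms.size())));
    }

    public static void main(String[] args) {
        Map<String, long[]> stats = Map.of(
                "the", new long[] {4, 100},  // frequent here, but present in every document
                "fox", new long[] {2, 10},   // fairly frequent here and rare in the corpus
                "quick", new long[] {1, 50});
        System.out.println(topTerms(stats, 100, 2));
    }
}
```

Under this variant a term that appears in every document scores zero regardless of how often it occurs, which is why common words drop out first when only the best-scoring terms are kept.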


[id="{upid}-{api}-response"]
==== TermVectorsResponse

The `TermVectorsResponse` contains the following information:

["source","java",subs="attributes,callouts,macros"]
--------------------------------------------------
include-tagged::{doc-tests-file}[{api}-response]
--------------------------------------------------
<1> The index name of the document.
<2> The type name of the document.
<3> The id of the document.
<4> Indicates whether or not the document was found.
<5> Indicates whether or not there are term vectors for this document.
<6> The list of `TermVector` for the document
<7> The name of the current field
<8> Field statistics for the current field - document count
<9> Field statistics for the current field - sum of total term frequencies
<10> Field statistics for the current field - sum of document frequencies
<11> Terms for the current field
<12> The name of the term
<13> Term frequency of the term
<14> Document frequency of the term
<15> Total term frequency of the term
<16> Score of the term
<17> Tokens of the term
<18> Position of the token
<19> Start offset of the token
<20> End offset of the token
<21> Payload of the token
@ -14,6 +14,7 @@ Single document APIs::
* <<{upid}-exists>>
* <<{upid}-delete>>
* <<{upid}-update>>
* <<{upid}-term-vectors>>

[[multi-doc]]
Multi-document APIs::
@ -29,6 +30,7 @@ include::document/get.asciidoc[]
include::document/exists.asciidoc[]
include::document/delete.asciidoc[]
include::document/update.asciidoc[]
include::document/term-vectors.asciidoc[]
include::document/bulk.asciidoc[]
include::document/multi-get.asciidoc[]
include::document/reindex.asciidoc[]
@ -372,4 +374,4 @@ don't leak into the rest of the documentation.
:response!:
:doc-tests-file!:
:upid!:
--
--