[[analysis-stempel]]
=== Stempel Polish analysis plugin

The Stempel analysis plugin integrates Lucene's Stempel analysis
module for Polish into Elasticsearch.

:plugin_name: analysis-stempel
include::install_remove.asciidoc[]

[[analysis-stempel-tokenizer]]
[discrete]
==== `stempel` tokenizer and token filters

The plugin provides the `polish` analyzer and the `polish_stem` and `polish_stop` token filters,
which are not configurable.

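For example, once the plugin is installed you can exercise the `polish` analyzer
directly with the `_analyze` API. The request below is purely illustrative; any
Polish text works:

[source,console]
--------------------------------------------------
GET /_analyze
{
  "analyzer": "polish",
  "text": "Gdzie kucharek sześć, tam nie ma co jeść."
}
--------------------------------------------------

The response contains lowercased, stemmed tokens with Polish stopwords removed,
produced by the same filter chain shown in the next section.
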
==== Reimplementing and extending the analyzers

The `polish` analyzer could be reimplemented as a `custom` analyzer that can
then be extended and configured differently as follows:

[source,console]
----------------------------------------------------
PUT /stempel_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "rebuilt_stempel": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "polish_stop",
            "polish_stem"
          ]
        }
      }
    }
  }
}
----------------------------------------------------
// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: stempel_example, first: polish, second: rebuilt_stempel}\nendyaml\n/]

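To check that the rebuilt analyzer is wired up correctly, you can run it against
the new index with the `_analyze` API. The request is shown for illustration; its
output should match that of the built-in `polish` analyzer:

[source,console]
----------------------------------------------------
GET /stempel_example/_analyze
{
  "analyzer": "rebuilt_stempel",
  "text": "Gdzie kucharek sześć, tam nie ma co jeść."
}
----------------------------------------------------
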
[[analysis-polish-stop]]
==== `polish_stop` token filter

The `polish_stop` token filter removes Polish stopwords (`_polish_`) and any
other custom stopwords specified by the user. This filter only supports the
predefined `_polish_` stopwords list. If you want to use a different predefined
list, use the {ref}/analysis-stop-tokenfilter.html[`stop` token filter] instead.

[source,console]
--------------------------------------------------
PUT /polish_stop_example
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "analyzer_with_stop": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "polish_stop"
            ]
          }
        },
        "filter": {
          "polish_stop": {
            "type": "polish_stop",
            "stopwords": [
              "_polish_",
              "jeść"
            ]
          }
        }
      }
    }
  }
}

GET polish_stop_example/_analyze
{
  "analyzer": "analyzer_with_stop",
  "text": "Gdzie kucharek sześć, tam nie ma co jeść."
}
--------------------------------------------------

The above request returns:

[source,console-result]
--------------------------------------------------
{
  "tokens" : [
    {
      "token" : "kucharek",
      "start_offset" : 6,
      "end_offset" : 14,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "sześć",
      "start_offset" : 15,
      "end_offset" : 20,
      "type" : "<ALPHANUM>",
      "position" : 2
    }
  ]
}
--------------------------------------------------