339 commits

Author SHA1 Message Date
James Rodewig
630604bd45
[DOCS] Fix case sensitivity for elision token filter (#69873) 2021-03-03 09:09:05 -05:00
James Rodewig
9b88ae92e6
[DOCS] Fix typos for duplicate words (#69125) 2021-02-17 10:34:20 -05:00
James Rodewig
c65615911f
[DOCS] Expand simple query string query's multi-position token section (#68753) 2021-02-09 16:07:02 -05:00
James Rodewig
d5d8be9bff [DOCS] Fix typo 2021-02-03 10:45:16 -05:00
James Rodewig
86814df052
[DOCS] Clean up index template xrefs (#67264) 2021-01-11 12:38:09 -05:00
Toast
966189fa6a
[DOCS] Fix typo (#65912) 2020-12-05 10:05:13 -05:00
James Rodewig
fa7c63e6c4
[DOCS] Fix whitespace in pattern replace token filter docs (#64345) 2020-10-29 10:07:10 -04:00
James Rodewig
1ea83359bb
[DOCS] Fix case for 'Boolean' (#64299) 2020-10-29 09:04:43 -04:00
Elasticsearch addict
32c7e08c6d
[DOCS] Fix pattern replace token filter intro (#64189)
Removes an incorrect statement about anchoring regex patterns on tokens.
2020-10-27 09:33:03 -04:00
James Rodewig
39d064d668
[DOCS] Update snowball links (#63351) 2020-10-06 15:29:57 -04:00
James Rodewig
80a828c15f
[DOCS] Update link to Snowball documentation (#63305) (#63347)
The current link points to an obsolete site, which is no longer maintained.

Co-authored-by: Stefan Walter <67258699+rd-stefan-walter@users.noreply.github.com>
2020-10-06 13:40:51 -04:00
James Rodewig
b3e8767a35
[DOCS] Clarify that v2.0+ hyphenation files aren't supported (#60579) (#63072)
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>

Co-authored-by: jgkirschbaum <juergen.kirschbaum@gmail.com>
2020-09-30 09:28:23 -04:00
James Rodewig
a94e5cb7c4
[DOCS] Replace Wikipedia links with attribute (#61171) 2020-08-17 09:44:24 -04:00
James Rodewig
5827d09ba6
[DOCS] Add xref to multiplexer token filter docs (#60431) (#61170)
Co-authored-by: paiboon auengkongkatong <paiboon15721@gmail.com>
2020-08-14 15:10:33 -04:00
James Rodewig
5d9de8ce46
[DOCS] Add missing lang values to snowball token filter (#60489) 2020-08-04 17:26:37 -04:00
Alexander Reelsen
c7ac9e7073
[DOCS] http -> https, remove outdated plugin docs (#60380)
Plugin discovery documentation contained information about installing
Elasticsearch 2.0 and installing an Oracle JDK, both of which are no
longer valid.

After noticing that the instructions used cleartext HTTP to install
packages, this commit replaces HTTP links with HTTPS ones where possible.

In addition, a few community links have been removed, as they no longer
seem to exist.
2020-07-31 15:58:38 -04:00
James Rodewig
441c3a21b1
[DOCS] Update my-index examples (#60132)
Changes the following example index names to `my-index-000001` for consistency:

* `my-index`
* `my_index`
* `myindex`
2020-07-27 14:46:39 -04:00
James Rodewig
2774cd6938
[DOCS] Swap [float] for [discrete] (#60124)
Changes instances of `[float]` in our docs to `[discrete]`.

Asciidoctor prefers the `[discrete]` tag for floating headings:
https://asciidoctor.org/docs/asciidoc-asciidoctor-diffs/#blocks
2020-07-23 11:48:22 -04:00
James Rodewig
80b674fb25
[DOCS] Reformat snippets to use two-space indents (#59973) 2020-07-21 12:24:26 -04:00
malpani
08de504b44
Support ignore_keywords flag for word delimiter graph token filter (#59563)
This commit allows customizing the word delimiter token filters to skip processing
tokens tagged as keywords, using the `ignore_keywords` flag that Lucene's
WordDelimiterGraphFilter already exposes.

Fix for #59491
2020-07-21 16:11:11 +01:00
Rui Almeida
2c450214ac
[DOCS] Fix keyword marker docs (#59834) 2020-07-20 08:54:55 -04:00
James Rodewig
8b6e310070
[DOCS] Reformat predicate_token_filter tokenfilter (#57705) 2020-07-16 13:07:19 -04:00
James Rodewig
2be9db01c8
[DOCS] Replace datatype with data type (#58972) 2020-07-07 13:52:10 -04:00
James Rodewig
8439c888b6
[DOCS] Fix headings for simple analyzer docs (#58910) 2020-07-02 09:28:56 -04:00
James Rodewig
05da3e0e48
[DOCS] Fix analyzer page titles (#58362)
Changes the titles for analyzer pages to sentence case.

Also changes the 'Pattern character filter' page title to sentence case.
2020-06-26 09:30:37 -04:00
James Rodewig
b2b3599012
[DOCS] Fix tokenizer page titles (#58361)
Changes the titles for tokenizer pages to sentence case.

Also moves the 'Path hierarchy tokenizer examples' page within the
'Path hierarchy tokenizer' page and adds a related redirect.
2020-06-26 09:08:44 -04:00
James Rodewig
bb66d594d1
[DOCS] Reformat pattern_replace token filter (#57699)
Changes:

* Rewrites description and adds Lucene link
* Adds analyze example
* Adds parameter definitions
* Adds custom analyzer example
2020-06-11 12:04:22 -04:00
James Rodewig
fd8af38078
[DOCS] Reformat mapping charfilter (#57818)
Changes:

* Adds title abbreviation
* Adds Lucene link to description
* Adds standard headings
* Simplifies analyze example
* Simplifies analyzer example and adds contextual text
2020-06-09 12:23:08 -04:00
James Rodewig
06b41614a2 [DOCS] Fix typo in html_strip char filter docs 2020-06-08 10:37:16 -04:00
James Rodewig
98a64da87c
[DOCS] Reformat html_strip charfilter (#57764)
Changes:

* Converts title to sentence case
* Adds a title abbreviation
* Adds Lucene link to description
* Reformats sections
2020-06-08 08:30:23 -04:00
Tomasz Elendt
66ded59929
Support multiple tokens on LHS in stemmer_override rules (#56113) (#56484)
This commit adds support for rules with multiple tokens on the LHS, also
known as "contraction rules", to the stemmer override token
filter. Contraction rules are handy for translating multiple
inflected words into the same root form. One side effect of this change is
that it brings the stemmer override rules format closer to the synonym rules
format, making it easier to translate one into the other.

This change also makes the stemmer override rules parser stricter, so
it should catch more errors that were previously accepted.

Closes #56113
2020-05-29 22:28:41 +02:00
James Rodewig
16be0e65d3
[DOCS] Reformat min_hash token filter docs (#57181)
Changes:

* Rewrites description and adds a Lucene link
* Reformats the configurable parameters as a definition list
* Changes the `Theory` heading to `Using the min_hash token filter for
  similarity search`
* Adds some additional detail to the analyzer example
2020-05-27 14:55:27 -04:00
James Rodewig
00ab16ff97
[DOCS] Reformat shingle token filter (#57040)
Changes:

* Rewrites description and adds Lucene link
* Adds analyze example
* Rewrites parameter documentation
* Updates custom analyzer and filter examples
* Adds anchor to `index.max_shingle_diff` index-level setting
2020-05-21 13:41:51 -04:00
James Rodewig
2ed91444fe
[DOCS] Reformat hunspell token filter (#56955)
Changes:

* Rewrites description and adds Lucene link
* Adds analyze example
* Rewrites parameter documentation
* Updates custom analyzer example
* Rewrites related setting documentation
2020-05-20 14:29:08 -04:00
Andrei Balici
da31b4b83d
Add max_token_length setting to the CharGroupTokenizer (#56860)
Adds `max_token_length` option to the CharGroupTokenizer.
Also updates the documentation to reflect the change.

Closes #56676
2020-05-20 14:15:57 +02:00
James Rodewig
6fe84e67e9
[DOCS] Fix fingerprint token filter's analyzer example (#56811) (#56944)
Co-authored-by: Abhilash Bolla <2282894+ivssh@users.noreply.github.com>
2020-05-19 09:38:37 -04:00
James Rodewig
36ae8ebfde
[DOCS] Reformat porter_stem token filter (#56053)
Makes the following changes to the `porter_stem` token filter docs:

* Rewrites description and adds a Lucene link
* Adds detailed analyze example
* Adds an analyzer example
2020-05-04 10:03:03 -04:00
Amit Khandelwal
00fef6dfd3
Analysis enhancement - add preserve_original setting in ngram-token-filter (#55432) 2020-05-04 10:06:37 +01:00
James Rodewig
6dbdf879b2 [DOCS] Correct Lucene link in kstem token filter docs 2020-04-29 09:28:05 -04:00
James Rodewig
77a35c641d
[DOCS] Reformat kstem token filter (#55823)
Makes the following changes to the `kstem` token filter docs:

* Rewrites description and adds a Lucene link
* Adds detailed analyze example
* Adds an analyzer example
2020-04-29 08:27:30 -04:00
Amit Khandelwal
9e41feda86
Expose preserve_original in edge_ngram token filter (#55766)
The Lucene `preserve_original` setting is currently not supported in the `edge_ngram`
token filter. This change adds it with a default value of `false`.

Closes #55767
2020-04-28 10:22:59 +02:00
James Rodewig
d67a1b47e4 [DOCS] Correct stemmer token filters anchor 2020-04-27 14:56:25 -04:00
James Rodewig
f08b3c93cb [DOCS] Correct stemmer token filter anchor 2020-04-27 14:49:19 -04:00
James Rodewig
bb9dbcb4c8
[DOCS] Reformat stemmer token filter (#55693)
Makes the following changes to the `stemmer` token filter docs:

* Adds detailed analyze example
* Rewrites parameter definitions
* Adds custom analyzer example
* Adds a `language` value for the `estonian` stemmer
* Reorders the `language` values to show recommended algorithms first,
  followed by other values alphabetically
2020-04-24 11:08:55 -04:00
James Rodewig
1c4e60e86d
[DOCS] Add stemming concept docs (#55156)
Adds conceptual documentation for stemming, including:

* An overview of why stemming is helpful in search
* Algorithmic vs. dictionary stemming
* Token filters used to control stemming, such as `stemmer_override`, `keyword_marker`, and `conditional`
2020-04-24 10:41:50 -04:00
James Rodewig
24160366b8
[DOCS] Reformat flatten_graph token filter (#54268)
Makes the following changes to the `flatten_graph` token filter docs:

* Rewrites description and adds Lucene link
* Adds detailed analyze example
* Adds analyzer example
2020-04-16 08:34:15 -04:00
James Rodewig
e867dfabff
[DOCS] Add token filter reference docs template (#52290)
Creates a reusable template for token filter reference documentation.

Contributors can make a copy of this template and customize it when
documenting new token filters.
2020-04-10 08:44:17 -04:00
markharwood
d83798f237
Add pre-configured “lowercase” normalizer (#53882)
Includes tests verifying that a user-defined "lowercase" normalizer overrides the default one.

Closes #53872
2020-04-03 10:12:06 +01:00
James Rodewig
28cfb8ca69
[DOCS] Reformat keyword_repeat token filter (#54428) 2020-04-01 11:37:25 -04:00
James Rodewig
ba89f7096c [DOCS] Add missing word to keyword marker token filter docs 2020-03-30 10:45:55 -04:00