elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-04-25 07:37:19 -04:00

Author	SHA1	Message	Date
James Rodewig	2774cd6938	[DOCS] Swap `[float]` for `[discrete]` (#60124) Changes instances of `[float]` in our docs for `[discrete]`. Asciidoctor prefers the `[discrete]` tag for floating headings: https://asciidoctor.org/docs/asciidoc-asciidoctor-diffs/#blocks	2020-07-23 11:48:22 -04:00
James Rodewig	80b674fb25	[DOCS] Reformat snippets to use two-space indents (#59973)	2020-07-21 12:24:26 -04:00
malpani	08de504b44	Support ignore_keywords flag for word delimiter graph token filter (#59563) This commit allows customizing the word delimiter token filters to skip processing tokens tagged as keyword, through the `ignore_keywords` flag that Lucene's WordDelimiterGraphFilter already exposes. Fix for #59491	2020-07-21 16:11:11 +01:00
Rui Almeida	2c450214ac	[DOCS] Fix keyword marker docs (#59834)	2020-07-20 08:54:55 -04:00
James Rodewig	8b6e310070	[DOCS] Reformat `predicate_token_filter` tokenfilter (#57705)	2020-07-16 13:07:19 -04:00
James Rodewig	2be9db01c8	[DOCS] Replace `datatype` with `data type` (#58972)	2020-07-07 13:52:10 -04:00
James Rodewig	8439c888b6	[DOCS] Fix headings for simple analyzer docs (#58910)	2020-07-02 09:28:56 -04:00
James Rodewig	05da3e0e48	[DOCS] Fix analyzer page titles (#58362) Changes the titles for analyzer pages to sentence case. Also changes the 'Pattern character filter' page title to sentence case.	2020-06-26 09:30:37 -04:00
James Rodewig	b2b3599012	[DOCS] Fix tokenizer page titles (#58361) Changes the titles for tokenizer pages to sentence case. Also moves the 'Path hierarchy tokenizer examples' page within the 'Path hierarchy tokenizer' page and adds a related redirect.	2020-06-26 09:08:44 -04:00
James Rodewig	bb66d594d1	[DOCS] Reformat `pattern_replace` token filter (#57699) Changes: * Rewrites description and adds Lucene link * Adds analyze example * Adds parameter definitions * Adds custom analyzer example	2020-06-11 12:04:22 -04:00
James Rodewig	fd8af38078	[DOCS] Reformat `mapping` charfilter (#57818) Changes: * Adds title abbreviation * Adds Lucene link to description * Adds standard headings * Simplifies analyze example * Simplifies analyzer example and adds contextual text	2020-06-09 12:23:08 -04:00
James Rodewig	06b41614a2	[DOCS] Fix typo in `html_strip` char filter docs	2020-06-08 10:37:16 -04:00
James Rodewig	98a64da87c	[DOCS] Reformat `html_strip` charfilter (#57764) Changes: * Converts title to sentence case * Adds a title abbreviation * Adds Lucene link to description * Reformats sections	2020-06-08 08:30:23 -04:00
Tomasz Elendt	66ded59929	Support multiple tokens on LHS in stemmer_override rules (#56113) (#56484) This commit adds support for rules with multiple tokens on LHS, also known as "contraction rules", to the stemmer override token filter. Contraction rules are handy for translating multiple inflected words into the same root form. One side effect of this change is that it brings the stemmer override rules format closer to the synonym rules format, which makes it easier to translate one into the other. This change also makes the stemmer override rules parser stricter, so it should catch more errors that were previously accepted. Closes #56113	2020-05-29 22:28:41 +02:00
James Rodewig	16be0e65d3	[DOCS] Reformat `min_hash` token filter docs (#57181) Changes: * Rewrites description and adds a Lucene link * Reformats the configurable parameters as a definition list * Changes the `Theory` heading to `Using the min_hash token filter for similarity search` * Adds some additional detail to the analyzer example	2020-05-27 14:55:27 -04:00
James Rodewig	00ab16ff97	[DOCS] Reformat `shingle` token filter (#57040) Changes: * Rewrites description and adds Lucene link * Adds analyze example * Rewrites parameter documentation * Updates custom analyzer and filter examples * Adds anchor to `index.max_shingle_diff` index-level setting	2020-05-21 13:41:51 -04:00
James Rodewig	2ed91444fe	[DOCS] Reformat `hunspell` token filter (#56955) Changes: * Rewrites description and adds Lucene link * Adds analyze example * Rewrites parameter documentation * Updates custom analyzer example * Rewrites related setting documentation	2020-05-20 14:29:08 -04:00
Andrei Balici	da31b4b83d	Add `max_token_length` setting to the CharGroupTokenizer (#56860) Adds `max_token_length` option to the CharGroupTokenizer. Updates documentation as well to reflect the changes. Closes #56676	2020-05-20 14:15:57 +02:00
James Rodewig	6fe84e67e9	[DOCS] Fix fingerprint token filter's analyzer example (#56811) (#56944) Co-authored-by: Abhilash Bolla <2282894+ivssh@users.noreply.github.com>	2020-05-19 09:38:37 -04:00
James Rodewig	36ae8ebfde	[DOCS] Reformat `porter_stem` token filter (#56053) Makes the following changes to the `porter_stem` token filter docs: * Rewrites description and adds a Lucene link * Adds detailed analyze example * Adds an analyzer example	2020-05-04 10:03:03 -04:00
Amit Khandelwal	00fef6dfd3	Analysis enhancement - add preserve_original setting in ngram-token-filter (#55432)	2020-05-04 10:06:37 +01:00
James Rodewig	6dbdf879b2	[DOCS] Correct Lucene link in `kstem` token filter docs	2020-04-29 09:28:05 -04:00
James Rodewig	77a35c641d	[DOCS] Reformat `kstem` token filter (#55823) Makes the following changes to the `kstem` token filter docs: * Rewrites description and adds a Lucene link * Adds detailed analyze example * Adds an analyzer example	2020-04-29 08:27:30 -04:00
Amit Khandelwal	9e41feda86	Expose `preserve_original` in `edge_ngram` token filter (#55766) The Lucene `preserve_original` setting is currently not supported in the `edge_ngram` token filter. This change adds it with a default value of `false`. Closes #55767	2020-04-28 10:22:59 +02:00
James Rodewig	d67a1b47e4	[DOCS] Correct stemmer token filters anchor	2020-04-27 14:56:25 -04:00
James Rodewig	f08b3c93cb	[DOCS] Correct stemmer token filter anchor	2020-04-27 14:49:19 -04:00
James Rodewig	bb9dbcb4c8	[DOCS] Reformat `stemmer` token filter (#55693) Makes the following changes to the `stemmer` token filter docs: * Adds detailed analyze example * Rewrites parameter definitions * Adds custom analyzer example * Adds a `language` value for the `estonian` stemmer * Reorders the `language` values to show recommended algorithms first, followed by other values alphabetically	2020-04-24 11:08:55 -04:00
James Rodewig	1c4e60e86d	[DOCS] Add stemming concept docs (#55156) Adds conceptual documentation for stemming, including: * An overview of why stemming is helpful in search * Algorithmic vs. dictionary stemming * Token filters used to control stemming, such as `stemmer_override`, `keyword_marker`, and `conditional`	2020-04-24 10:41:50 -04:00
James Rodewig	24160366b8	[DOCS] Reformat `flatten_graph` token filter (#54268) Makes the following changes to the `flatten_graph` token filter docs: * Rewrites description and adds Lucene link * Adds detailed analyze example * Adds analyzer example	2020-04-16 08:34:15 -04:00
James Rodewig	e867dfabff	[DOCS] Add token filter reference docs template (#52290) Creates a reusable template for token filter reference documentation. Contributors can make a copy of this template and customize it when documenting new token filters.	2020-04-10 08:44:17 -04:00
markharwood	d83798f237	Add pre-configured “lowercase” normalizer (#53882) Includes tests that a user-defined "lowercase" normalizer overrides the default one. Closes #53872	2020-04-03 10:12:06 +01:00
James Rodewig	28cfb8ca69	[DOCS] Reformat `keyword_repeat` token filter (#54428)	2020-04-01 11:37:25 -04:00
James Rodewig	ba89f7096c	[DOCS] Add missing word to keyword marker token filter docs	2020-03-30 10:45:55 -04:00
James Rodewig	40067d04dd	[DOCS] Add missing "the" to keyword tokenizer docs	2020-03-30 08:53:55 -04:00
jureaky	4fe8ad357c	[DOCS] Add a lowercase email example to keyword tokenizer docs (#53257)	2020-03-30 08:35:55 -04:00
James Rodewig	4f503bf9df	[DOCS] Reformat `keyword_marker` token filter (#54076) Makes the following changes to the `keyword_marker` token filter docs: * Rewrites description and adds Lucene link * Adds detailed analyze example * Rewrites parameter definitions * Adds custom analyzer and filter example	2020-03-25 09:01:30 -04:00
James Rodewig	0a35f3900d	[DOCS] Remove double space in WDG docs	2020-03-23 17:15:37 -04:00
James Rodewig	747a164fae	[DOCS] Fix "letter case" typo Changes "lettercase" to "letter case" in the `uppercase` token filter docs.	2020-03-23 17:11:39 -04:00
lgypro	7a1502db6c	[Docs] Fix typo in _analyze api docs (#53837)	2020-03-20 11:45:31 +01:00
James Rodewig	8d5478f56c	[DOCS] Add token graph concept docs (#53339) Adds conceptual docs for token graphs. These docs cover: * How a token graph is constructed from a token stream * How synonyms and multi-position tokens impact token graphs * How token graphs are used during search * Why some token filters produce invalid token graphs Also makes the following supporting changes: * Adds anchors to the 'Anatomy of an Analyzer' docs for cross-linking * Adds several SVGs for token graph diagrams	2020-03-19 07:42:26 -04:00
James Rodewig	3a39ed0055	[DOCS] Remove `light_bengali` stemmer (#53697) Only the `bengali` stemmer is available in Lucene and surfaced through Elasticsearch. This removes the incorrect `light_bengali` link in our docs.	2020-03-18 08:33:20 -04:00
James Rodewig	e8ed337b2a	[DOCS] Reformat `remove_duplicates` token filter (#53608) Makes the following changes to the `remove_duplicates` token filter docs: * Rewrites description and adds Lucene link * Adds detailed analyze example * Adds custom analyzer example	2020-03-16 11:21:20 -04:00
Jim Ferenczi	9ad0597617	Remove old Lucene experimental flag from analyzer documentation (#53217) This change removes Lucene's experimental flag from the documentation of the following tokenizers/filters: * Simple Pattern Split Tokenizer * Simple Pattern Tokenizer * Flatten Graph Token Filter * Word Delimiter Graph Token Filter The flag is still present in the Lucene codebase, but we have fully supported these tokenizers/filters in ES for a long time now, so the docs flag is misleading. Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-03-12 21:17:11 +01:00
James Rodewig	d16fe48312	[DOCS] Reformat `word_delimiter` token filter (#53387) Makes the following changes to the `word_delimiter` token filter docs: * Adds a warning admonition recommending the `word_delimiter_graph` filter instead. This warning includes a link to the deprecated Lucene `WordDelimiterFilter`. * Updates the description * Adds detailed analyze snippet * Adds custom analyzer and custom filter snippets * Reorganizes and updates parameter documentation	2020-03-11 08:44:44 -04:00
James Rodewig	377539e055	[DOCS] Use keyword tokenizer in word delimiter graph examples (#53384) In a tip admonition, we recommend using the `keyword` tokenizer with the `word_delimiter_graph` token filter. However, we only use the `whitespace` tokenizer in the example snippets. This updates those snippets to use the `keyword` tokenizer instead. Also corrects several spacing issues for arrays in these docs.	2020-03-11 04:45:26 -04:00
James Rodewig	0089805b68	[DOCS] Correct anchor in word delimiter graph token filter docs	2020-03-10 10:32:00 -04:00
James Rodewig	1c8ab01ee6	[DOCS] Reformat `word_delimiter_graph` token filter (#53170) Makes the following changes to the `word_delimiter_graph` token filter docs: * Updates the Lucene experimental admonition. * Updates description * Adds analyze snippet * Adds custom analyzer and custom filter snippets * Reorganizes and updates parameter list * Expands and updates section re: differences between `word_delimiter` and `word_delimiter_graph`	2020-03-09 06:27:41 -04:00
James Rodewig	10f9a8fd64	[DOCS] Note that `trim` filter doesn't change offsets (#53220) The [word delimiter graph token filter docs][0] note that the `trim` filter changes the length of tokens without changing their offsets. This explicitly mentions that in the `trim` filter docs. [0]: https://www.elastic.co/guide/en/elasticsearch/reference/master/analysis-word-delimiter-graph-tokenfilter.html	2020-03-06 07:27:14 -05:00
James Rodewig	9f641dc07d	[DOCS] Fix several Asciidoctor double arrow replacements (#52827) Per the [Asciidoctor docs][0], Asciidoctor replaces the following syntax with double arrows in the rendered HTML: * => renders as ⇒ * <= renders as ⇐ This escapes several unintended replacements, such as in the Painless docs. Where appropriate, it also replaces some double arrow instances with single arrows for consistency. [0]: https://asciidoctor.org/docs/user-manual/#replacements	2020-03-04 08:42:37 -05:00
James Rodewig	e016864b7d	[DOCS] Reformat `stop` token filter (#53059) Makes the following changes to the `stop` token filter docs: * Updates description * Adds a link to the related Lucene filter * Adds detailed analyze snippet * Updates custom analyzer and custom filter snippets * Adds a list of predefined stop words by language Co-authored-by: ScottieL <36999642+ScottieL@users.noreply.github.com>	2020-03-03 13:05:12 -05:00

322 commits