elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-06-28 17:34:17 -04:00

Author	SHA1	Message	Date
James Rodewig	a7ebddd2f2	[DOCS] Add attribute for Lucene analysis links (#51687 ) Adds a `lucene-analysis-docs` attribute for the Lucene `/analysis/` javadocs directory. This should prevent typos and keep the docs DRY.	2020-01-30 11:22:30 -05:00
James Rodewig	3c28a10b85	[DOCS] Rewrite analysis intro (#51184 ) * [DOCS] Rewrite analysis intro. Move index/search analysis content. * Rewrites 'Text analysis' page intro as high-level definition. Adds guidance on when users should configure text analysis * Rewrites and splits index/search analysis content: * Conceptual content -> 'Index and search analysis' under 'Concepts' * Task-based content -> 'Specify an analyzer' under 'Configure...' * Adds detailed examples for when to use the same index/search analyzer and when not. * Adds new example snippets for specifying search analyzers * clarifications * Add toc. Decrement headings. * Reword 'When to configure' section * Remove sentence from tip	2020-01-30 09:19:53 -05:00
James Rodewig	c99a0e9a5e	[DOCS] Reformat unique token filter docs (#50748 ) * Updates the description * Adds analyze, custom analyzer, and custom filter snippets * Adds parameter documentation	2020-01-28 10:33:45 -05:00
James Rodewig	0189d29c53	[DOCS] Add response snippets to 'Testing analyzers' page (#51427 ) Adds response snippets to the `POST _analyze` snippets in the 'Testing analyzers' page. Co-authored-by: Emmanuel DEMEY <demey.emmanuel@gmail.com>	2020-01-27 08:41:05 -05:00
James Rodewig	0fa6ac0fb9	[DOCS] Add tutorials section to analysis topic (#50809 ) Adds a 'Configure text analysis' page to house tutorial content for the analysis topic. Also relocates the following pages as children of this new page: * 'Test an analyzer' * 'Configuring built-in analyzers' * 'Create a custom analyzer' I plan to add a tutorial for specifying index-time and search-time analyzers to this section as part of a future PR.	2020-01-16 13:11:42 -05:00
James Rodewig	0605eb2078	[DOCS] Add concepts section to analysis topic (#50801 ) This helps the topic better match the structure of our machine learning docs, e.g. https://www.elastic.co/guide/en/machine-learning/7.5/ml-concepts.html This PR only includes the 'Anatomy of an analyzer' page as a 'Concepts' child page, but I plan to add other concepts, such as 'Index time vs. search time', with later PRs.	2020-01-16 13:00:04 -05:00
James Rodewig	8f06f94d9b	[DOCS] Retitle analysis reference pages (#51071 ) * Changes titles to sentence case. * Appends pages with 'reference' to differentiate their content from conceptual overviews. * Moves the 'Normalizers' page to end of the Analysis topic pages.	2020-01-16 12:27:54 -05:00
PND	e16d1e5725	[Docs] Fix example output of edge n-gram token filter. (#51085 )	2020-01-16 11:34:23 +01:00
James Rodewig	14185fbf79	[DOCS] Add section ID to analysis overview page	2020-01-08 14:43:05 -06:00
James Rodewig	495ce1add0	[DOCS] Add overview page to analysis topic (#50515 ) Adds a 'text analysis overview' page to the analysis topic docs. The goals of this page are: * Concisely summarize the analysis process while avoiding in-depth concepts, tutorials, or API examples * Explain why analysis is important, largely through highlighting problems with full-text searches missing analysis * Highlight how analysis can be used to improve search results	2020-01-08 12:53:08 -06:00
James Rodewig	b0ffc60b80	[DOCS] Reformat reverse token filter docs (#50672 ) * Updates the description and adds a Lucene link * Adds analyze and custom analyzer snippets	2020-01-07 10:54:16 -06:00
James Rodewig	2bc37ea4e9	[DOCS] Reformat truncate token filter docs (#50687 ) * Updates the description and adds a Lucene link * Adds analyze, custom analyzer, and custom filter snippets * Adds parameter documentation	2020-01-07 10:32:54 -06:00
James Rodewig	90e139e252	[DOCS] Reformat uppercase token filter docs (#50555 ) * Updates the description and adds a Lucene link * Adds analyze and custom analyzer snippets	2020-01-03 08:34:11 -05:00
James Rodewig	18ee52a5b2	[DOCS] Abbreviate token filter titles (#50511 )	2019-12-27 11:00:51 -05:00
Xiang Dai	432bd0e92c	Fix docs typos (#50365 ) Fixes a few typos in the docs. Signed-off-by: Xiang Dai 764524258@qq.com	2019-12-23 10:35:14 -05:00
James Rodewig	9907b0aab8	[DOCS] Reformat token count limit filter docs (#49835 )	2019-12-13 08:43:35 -05:00
James Rodewig	4dfc07c922	[DOCS] Reformat lowercase token filter docs (#49935 )	2019-12-12 09:39:06 -05:00
James Rodewig	e964a97005	[DOCS] Reformat length token filter docs (#49805 ) * Adds a title abbreviation * Updates the description and adds a Lucene link * Reformats the parameters section * Adds analyze, custom analyzer, and custom filter snippets Relates to #44726.	2019-12-04 09:58:19 -05:00
James Rodewig	6ea54eecf0	[DOCS] Reformat keep types and keep words token filter docs (#49604 ) * Adds title abbreviations * Updates the descriptions and adds Lucene links * Reformats parameter definitions * Adds analyze and custom analyzer snippets * Adds explanations of token types to keep types token filter and tokenizer docs	2019-12-02 09:22:21 -05:00
James Rodewig	1471f34c54	[DOCS] Reformat delimited payload token filter docs (#49380 ) * Adds a title abbreviation * Relocates the older name deprecation warning * Updates the description and adds a Lucene link * Adds a note to explain payloads and how to store them * Adds analyze and custom analyzer snippets * Adds a 'Return stored payloads' example	2019-11-25 15:38:52 -05:00
James Rodewig	642390c3a7	[DOCS] Fix edge n-gram tokenizer nav Adds a missing float tag to the edge n-gram tokenizer docs. This tag ensures the edge n-gram tokenizer docs display on the same page.	2019-11-22 15:51:52 -05:00
James Rodewig	ddf5c0a76a	[DOCS] Reformat n-gram token filter docs (#49438 ) Reformats the edge n-gram and n-gram token filter docs. Changes include: * Adds title abbreviations * Updates the descriptions and adds Lucene links * Reformats parameter definitions * Adds analyze and custom analyzer snippets * Adds notes explaining differences between the edge n-gram and n-gram filters Additional changes: * Switches titles to use "n-gram" throughout. * Fixes a typo in the edge n-gram tokenizer docs * Adds an explicit anchor for the `index.max_ngram_diff` setting	2019-11-22 10:38:01 -05:00
Christoph Büscher	ed86750fa4	Allow custom characters in token_chars of ngram tokenizers (#49250 ) Currently the `token_chars` setting in both `edgeNGram` and `ngram` tokenizers only allows for a list of predefined character classes, which might not fit every use case. For example, including underscore "_" in a token would currently require the `punctuation` class which comes with a lot of other characters. This change adds an additional "custom" option to the `token_chars` setting, which requires an additional `custom_token_chars` setting to be present and which will be interpreted as a set of characters to include in a token. Closes #25894	2019-11-20 10:36:39 +01:00
James Rodewig	3cf6569e0e	[DOCS] Reformat elision token filter docs (#49262 )	2019-11-19 10:54:29 -05:00
James Rodewig	ee6f80b1de	[DOCS] Reformat fingerprint token filter docs (#49311 )	2019-11-19 10:54:16 -05:00
gpaimla	d1ea9910c3	Implement Lucene EstonianAnalyzer, Stemmer (#49149 ) This PR adds a new analyzer and stemmer for the Estonian language. Closes #48895	2019-11-18 17:19:54 +01:00
James Rodewig	2fe9ba53ec	[DOCS] Note limitations of `max_gram` parm in `edge_ngram` tokenizer for index analyzers (#49007 ) The `edge_ngram` tokenizer limits tokens to the `max_gram` character length. Autocomplete searches for terms longer than this limit return no results. To prevent this, you can use the `truncate` token filter to truncate tokens to the `max_gram` character length. However, this could return irrelevant results. This commit adds some advisory text to make users aware of this limitation and outline the tradeoffs for each approach. Closes #48956.	2019-11-13 14:27:10 -05:00
James Rodewig	c4e113ec60	[DOCS] Reformat compound word token filters (#49006 ) * Separates the compound token filters doc pages into separate token filter pages: * Dictionary decompounder token filter * Hyphenation decompounder token filter * Adds analyze API examples for each compound token filter * Adds a redirect for the removed compound token filters page Co-Authored-By: debadair <debadair@elastic.co>	2019-11-13 09:35:00 -05:00
James Rodewig	547f30077c	[DOCS] Reformat condition token filter (#48775 )	2019-11-11 08:49:01 -05:00
Julian Simioni	05bc46e7e4	[Docs] Consolidate single example into a single line (#48904 ) The first example of splitting rules for the `word_delimiter` token filter was spread across two bullet points. This makes it look like they are two separate splitting rules.	2019-11-08 15:13:29 -05:00
James Rodewig	8ce338ee3d	[DOCS] Reformat decimal digit token filter docs (#48722 )	2019-11-01 12:37:24 -04:00
Peter Johnson	65700b6940	[DOCS] Fix typo in synonym token filter docs (#48691 )	2019-10-31 09:13:15 -04:00
James Rodewig	eb9eb927ff	[DOCS] Remove unneeded filter from common grams analyze ex (#48748 )	2019-10-31 09:07:27 -04:00
James Rodewig	60f9de543b	[DOCS] Reformat common grams token filter (#48426 )	2019-10-30 08:40:11 -04:00
James Rodewig	31fc615381	[DOCS] Reformat ASCII folding token filter docs (#48143 )	2019-10-23 15:06:18 -05:00
James Rodewig	a0795163a9	[DOCS] Reformat classic token filter docs (#48314 )	2019-10-23 09:38:22 -05:00
James Rodewig	bb635e5a9e	[DOCS] Reformat CJK bigram and CJK width token filter docs (#48210 )	2019-10-21 09:43:59 -04:00
James Rodewig	c367c5cf75	[DOCS] Reformat apostrophe token filter docs (#48076 )	2019-10-16 08:50:12 -04:00
Wilder Pereira	630bfa1001	[DOCS] Remove unneeded spaces from custom analyzer snippet (#47332 )	2019-10-15 15:52:52 -04:00
James Rodewig	59933abb0e	[DOCS] Sort analyzers, tokenizers, and token filters alphabetically (#48068 )	2019-10-15 15:46:50 -04:00
Alan Woodward	c1f99e2d75	Remove `_type` from SearchHit (#46942 ) This commit removes the `_type` field from all search hit responses. Relates to #41059	2019-09-23 19:14:54 +01:00
James Rodewig	de2c8f7231	Fixed sample code for minhash (#46385 ) The sample code is wrong. Field type is required for the sample field. I guess the intention was to give the sample field the name ```fingerprint```, mapping it as ```text``` using the custom analyzer ```my_analyzer```	2019-09-12 13:29:07 -04:00
Abhilash Bolla	b4c18b9c44	Fixed grammar in pattern replace char filter docs. (#46546 ) Minor grammar fix in the pattern replace char filter docs.	2019-09-10 09:46:06 -07:00
James Rodewig	5772c1c7dd	[DOCS] [2 of 5] Change // CONSOLE comments to [source,console] (#46353 )	2019-09-09 13:13:41 -04:00
James Rodewig	e43be90e6c	[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449 )	2019-09-06 14:05:36 -04:00
James Rodewig	466c59a4a7	[DOCS] Replace "// TESTRESPONSE" magic comments with "[source,console-result]" (#46295 )	2019-09-05 16:47:18 -04:00
James Rodewig	be7b873a43	[DOCS] Correct custom analyzer callouts (#46030 )	2019-08-29 10:07:52 -04:00
MK Swanson	f47886e44a	[DOCS] Modified section headings, edited text for clarity. (#44988 ) * [DOCS] Modified section headings, edited text for clarity. * [DOCS] Modified section headings, edited text for clarity. * [DOCS] Modified section headings, edited text for clarity.	2019-07-30 16:03:05 -04:00
James Rodewig	ea1adb61c2	[DOCS] Update anchors and links for Elasticsearch API relocation (#44500 )	2019-07-19 09:16:35 -04:00
Christoph Büscher	56ee1a5e00	Allow reloading of search time analyzers (#43313 ) Currently changing resources (like dictionaries, synonym files etc...) of search time analyzers is only possible by closing an index, changing the underlying resource (e.g. synonym files) and then re-opening the index for the change to take effect. This PR adds a new API endpoint that allows triggering reloading of certain analysis resources (currently token filters) that will then pick up changes in underlying file resources. To achieve this we introduce a new type of custom analyzer (ReloadableCustomAnalyzer) that uses a ReuseStrategy that allows swapping out analysis components. Custom analyzers that contain filters that are marked as "updateable" will automatically choose this implementation. This PR also adds this capability to `synonym` token filters for use in search time analyzers. Relates to #29051	2019-06-27 18:27:11 +02:00

369 commits