mirror of https://github.com/elastic/elasticsearch.git
synced 2025-06-28 17:34:17 -04:00

[DOCS] Replace Wikipedia links with attribute (#61171)

parent 3b44274373
commit a94e5cb7c4

78 changed files with 164 additions and 164 deletions
@@ -8,7 +8,7 @@ tokens, it also records the following:
 * The `positionLength`, the number of positions that a token spans
 
 Using these, you can create a
-https://en.wikipedia.org/wiki/Directed_acyclic_graph[directed acyclic graph],
+{wikipedia}/Directed_acyclic_graph[directed acyclic graph],
 called a _token graph_, for a stream. In a token graph, each position represents
 a node. Each token represents an edge or arc, pointing to the next position.
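The token-graph idea in the hunk above can be sketched outside Elasticsearch: treat each position as a node and each token as an edge spanning `positionLength` positions. A minimal sketch; the multi-position `dns` synonym input is a hypothetical example, not taken from this diff:

```python
def token_graph(tokens):
    """Build token-graph edges from (term, position, position_length) triples.

    Each position is a node; each token is an edge from its start position
    to the position position_length steps later.
    """
    edges = []
    for term, pos, plen in tokens:
        edges.append((pos, pos + plen, term))
    return edges

# Hypothetical stream for "domain name system" where the synonym "dns"
# spans all three positions (positionLength=3).
tokens = [("dns", 0, 3), ("domain", 0, 1), ("name", 1, 1), ("system", 2, 1)]
print(token_graph(tokens))
# [(0, 3, 'dns'), (0, 1, 'domain'), (1, 2, 'name'), (2, 3, 'system')]
```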
@@ -4,7 +4,7 @@
 <titleabbrev>CJK bigram</titleabbrev>
 ++++
 
-Forms https://en.wikipedia.org/wiki/Bigram[bigrams] out of CJK (Chinese,
+Forms {wikipedia}/Bigram[bigrams] out of CJK (Chinese,
 Japanese, and Korean) tokens.
 
 This filter is included in {es}'s built-in <<cjk-analyzer,CJK language
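The bigram behavior described above amounts to overlapping two-character windows over a run of CJK characters; a minimal sketch, not the filter's actual implementation:

```python
def cjk_bigrams(text):
    # Overlapping two-character windows over a run of CJK characters.
    return [text[i:i + 2] for i in range(len(text) - 1)]

print(cjk_bigrams("東京都"))  # ['東京', '京都']
```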
@@ -161,7 +161,7 @@ All non-CJK input is passed through unmodified.
 `output_unigrams`
 (Optional, boolean)
 If `true`, emit tokens in both bigram and
-https://en.wikipedia.org/wiki/N-gram[unigram] form. If `false`, a CJK character
+{wikipedia}/N-gram[unigram] form. If `false`, a CJK character
 is output in unigram form when it has no adjacent characters. Defaults to
 `false`.
@@ -4,7 +4,7 @@
 <titleabbrev>Common grams</titleabbrev>
 ++++
 
-Generates https://en.wikipedia.org/wiki/Bigram[bigrams] for a specified set of
+Generates {wikipedia}/Bigram[bigrams] for a specified set of
 common words.
 
 For example, you can specify `is` and `the` as common words. This filter then
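The common-words behavior can be sketched as emitting each token plus a joined bigram whenever the token or its neighbor is in the common set. A sketch; the `_`-joined output format is an assumption about the filter's output convention:

```python
def common_grams(tokens, common):
    # Emit every unigram, plus a joined bigram whenever either word is common.
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens) and (tok in common or tokens[i + 1] in common):
            out.append(tok + "_" + tokens[i + 1])
    return out

print(common_grams(["the", "quick", "fox", "is", "brown"], {"the", "is"}))
# ['the', 'the_quick', 'quick', 'fox', 'fox_is', 'is', 'is_brown', 'brown']
```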
@@ -4,7 +4,7 @@
 <titleabbrev>Edge n-gram</titleabbrev>
 ++++
 
-Forms an https://en.wikipedia.org/wiki/N-gram[n-gram] of a specified length from
+Forms an {wikipedia}/N-gram[n-gram] of a specified length from
 the beginning of a token.
 
 For example, you can use the `edge_ngram` token filter to change `quick` to
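The edge n-gram operation reduces to taking prefixes of a token; a minimal sketch, with gram sizes of 1 and 2 assumed as defaults:

```python
def edge_ngrams(token, min_gram=1, max_gram=2):
    # Prefixes of the token, from min_gram up to max_gram characters.
    return [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]

print(edge_ngrams("quick"))  # ['q', 'qu']
```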
@@ -4,7 +4,7 @@
 <titleabbrev>Elision</titleabbrev>
 ++++
 
-Removes specified https://en.wikipedia.org/wiki/Elision[elisions] from
+Removes specified {wikipedia}/Elision[elisions] from
 the beginning of tokens. For example, you can use this filter to change
 `l'avion` to `avion`.
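Conceptually, the filter strips a leading article before an apostrophe; a sketch using a small assumed French-style article set (the filter's real default list may differ):

```python
ARTICLES = {"l", "m", "t", "qu", "n", "s", "j"}  # assumed article set, illustrative only

def elide(token):
    # Drop the part before the first apostrophe if it is a known article.
    head, sep, rest = token.partition("'")
    if sep and head.lower() in ARTICLES and rest:
        return rest
    return token

print(elide("l'avion"))  # 'avion'
```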
@@ -4,7 +4,7 @@
 <titleabbrev>MinHash</titleabbrev>
 ++++
 
-Uses the https://en.wikipedia.org/wiki/MinHash[MinHash] technique to produce a
+Uses the {wikipedia}/MinHash[MinHash] technique to produce a
 signature for a token stream. You can use MinHash signatures to estimate the
 similarity of documents. See <<analysis-minhash-tokenfilter-similarity-search>>.
@@ -95,8 +95,8 @@ locality sensitive hashing (LSH).
 
 Depending on what constitutes the similarity between documents,
 various LSH functions https://arxiv.org/abs/1408.2927[have been proposed].
-For https://en.wikipedia.org/wiki/Jaccard_index[Jaccard similarity], a popular
-LSH function is https://en.wikipedia.org/wiki/MinHash[MinHash].
+For {wikipedia}/Jaccard_index[Jaccard similarity], a popular
+LSH function is {wikipedia}/MinHash[MinHash].
 A general idea of the way MinHash produces a signature for a document
 is by applying a random permutation over the whole index vocabulary (random
 numbering for the vocabulary), and recording the minimum value for this permutation
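The random-permutation idea in the hunk above can be approximated with one seeded hash function per "permutation": the probability that two sets share the same minimum hash equals their Jaccard similarity. A minimal sketch; the hash construction and shingle sets are illustrative, not the filter's implementation:

```python
import hashlib

def h(seed, item):
    # Deterministic 64-bit hash standing in for one random permutation.
    return int.from_bytes(hashlib.sha1(f"{seed}:{item}".encode()).digest()[:8], "big")

def minhash_signature(shingles, num_hashes=256):
    # For each simulated permutation, record the minimum hash over the set.
    return [min(h(seed, s) for s in shingles) for seed in range(num_hashes)]

def estimate_jaccard(sig_a, sig_b):
    # Fraction of positions where the two signatures agree.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

a = {"the", "quick", "brown", "fox"}
b = {"the", "quick", "brown", "dog"}
est = estimate_jaccard(minhash_signature(a), minhash_signature(b))
print(round(est, 2))  # close to the true Jaccard similarity 3/5 = 0.6
```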
@@ -4,7 +4,7 @@
 <titleabbrev>N-gram</titleabbrev>
 ++++
 
-Forms https://en.wikipedia.org/wiki/N-gram[n-grams] of specified lengths from
+Forms {wikipedia}/N-gram[n-grams] of specified lengths from
 a token.
 
 For example, you can use the `ngram` token filter to change `fox` to
@@ -4,7 +4,7 @@
 <titleabbrev>Shingle</titleabbrev>
 ++++
 
-Add shingles, or word https://en.wikipedia.org/wiki/N-gram[n-grams], to a token
+Add shingles, or word {wikipedia}/N-gram[n-grams], to a token
 stream by concatenating adjacent tokens. By default, the `shingle` token filter
 outputs two-word shingles and unigrams.
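With the default of two-word shingles plus unigrams, the output can be sketched as every window of one or two adjacent tokens. A sketch; the space-joined shingle format is an assumption:

```python
def shingles(tokens, max_size=2):
    # Emit every run of 1..max_size adjacent tokens, joined with spaces.
    out = []
    for i in range(len(tokens)):
        for n in range(1, max_size + 1):
            if i + n <= len(tokens):
                out.append(" ".join(tokens[i:i + n]))
    return out

print(shingles(["quick", "brown", "fox"]))
# ['quick', 'quick brown', 'brown', 'brown fox', 'fox']
```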
@@ -4,7 +4,7 @@
 <titleabbrev>Stop</titleabbrev>
 ++++
 
-Removes https://en.wikipedia.org/wiki/Stop_words[stop words] from a token
+Removes {wikipedia}/Stop_words[stop words] from a token
 stream.
 
 When not customized, the filter removes the following English stop words by
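Removing stop words is a simple membership filter; a sketch using a small assumed stop set rather than the filter's full default English list:

```python
STOP = {"a", "an", "and", "is", "it", "of", "the", "to"}  # assumed subset, not the real default list

def remove_stop(tokens):
    # Keep only tokens that are not in the stop set (case-insensitive).
    return [t for t in tokens if t.lower() not in STOP]

print(remove_stop(["the", "quick", "fox", "is", "brown"]))  # ['quick', 'fox', 'brown']
```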
@@ -6,7 +6,7 @@
 
 The `edge_ngram` tokenizer first breaks text down into words whenever it
 encounters one of a list of specified characters, then it emits
-https://en.wikipedia.org/wiki/N-gram[N-grams] of each word where the start of
+{wikipedia}/N-gram[N-grams] of each word where the start of
 the N-gram is anchored to the beginning of the word.
 
 Edge N-Grams are useful for _search-as-you-type_ queries.
@@ -6,7 +6,7 @@
 
 The `ngram` tokenizer first breaks text down into words whenever it encounters
 one of a list of specified characters, then it emits
-https://en.wikipedia.org/wiki/N-gram[N-grams] of each word of the specified
+{wikipedia}/N-gram[N-grams] of each word of the specified
 length.
 
 N-grams are like a sliding window that moves across the word - a continuous
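The sliding-window description above can be sketched as emitting, at each character offset, every window between `min_gram` and `max_gram` characters long (defaults of 1 and 2 assumed):

```python
def ngrams(word, min_gram=1, max_gram=2):
    out = []
    for i in range(len(word)):                   # slide the window across the word
        for n in range(min_gram, max_gram + 1):  # one gram per allowed length
            if i + n <= len(word):
                out.append(word[i:i + n])
    return out

print(ngrams("fox"))  # ['f', 'fo', 'o', 'ox', 'x']
```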