179 commits

Author SHA1 Message Date
Benjamin Trent
ffd409956a
fix bbq memory size estimate (#124022) (#124045)
(cherry picked from commit 76cf99c11e)

Co-authored-by: weizijun <weizijun.wzj@alibaba-inc.com>
2025-03-05 07:09:19 +11:00
Stef Nestor
2a20cc26ff
(Doc+) Flush out Slow Logs (#118518) (#119318)
* (Doc+) Slow Logs

---------

Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
2024-12-28 04:38:12 +11:00
Liam Thompson
087bfd39e6
[DOCS] Rename how-to subsection, move recipes to search relevance (#117044) (#117046)
2024-11-20 04:53:03 +11:00
Benjamin Trent
f9077a09ef
Clarify the vector files utilized for preloading (#116488) (#116622)
Adds clarification for vector preloading, what extension is to what
storage kind, and that quantized vectors are stored in separate files
allowing for individual preload. 

closes: https://github.com/elastic/elasticsearch/issues/116273
2024-11-12 08:58:44 +11:00
Benjamin Trent
47be7e4605
Fixing list for size estimates (#116486) (#116494)
2024-11-08 12:49:11 -05:00
Benjamin Trent
6ba9a6a09c
Updating knn tuning guide and size estimates (#115691) (#115753)
2024-10-28 23:45:42 +11:00
Salvatore Campagna
cb7f9b7be0
Update synthetic source documentation (#112363) (#115097)
2024-10-18 14:40:46 +02:00
Liam Thompson
1b9c39efb7
[DOCS] Fix typo in knn tuning guide (#113880) (#113922)
(cherry picked from commit 9b582c15ff)
2024-10-02 18:24:09 +10:00
Benjamin Trent
6adc854799
Adjust the knn tuning guide (#113566) (#113864)
2024-10-01 23:19:03 +10:00
Stef Nestor
f5de9c00c8
(Doc+) "min_primary_shard_size" for 10-50GB shards (#111574)
👋🏽 howdy, team! 

Expands [10-50GB sharding recommendation](https://www.elastic.co/guide/en/elasticsearch/reference/master/size-your-shards.html#shard-size-recommendation) to include ILM's more recent [`min_primary_shard_size`](https://www.elastic.co/guide/en/elasticsearch/reference/master/ilm-rollover.html) option to avoid small shards.
2024-08-21 11:57:09 +02:00
Stef Nestor
a7470c05b1
(Doc+) How to resolve shards >50GB (#111254)
* (Doc+) How to resolve shards >50GB

---------

Co-authored-by: Ievgen Degtiarenko <ievgen.degtiarenko@gmail.com>
2024-07-25 08:28:46 -06:00
Stef Nestor
cc245c4022
Add link to Max Shards Per Node docs (#110993)
... from exception message
2024-07-22 16:48:51 +01:00
David Turner
3d39baa7c0
Add link to MAX_DOCS exception message (#110911)
Follow-up to #110449
2024-07-16 13:48:23 +01:00
Stef Nestor
512bca8669
(Doc+) Error "number of documents in the index can't exceed" (#110449)
* (Doc+) Error "number of documents in the index can't exceed"

👋 howdy, team! 

This adds a resolution outline for the error ... which induces ongoing, low-key support
```
Number of documents in the index can't exceed [2147483519]
```

* feedback

* feedback

Co-authored-by: David Turner <david.turner@elastic.co>

* feedback

Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>

* feedback

* feedback

* Test change to address docs check failure

* Revert test change

* Test docs check

---------

Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
2024-07-16 11:39:30 +02:00
shainaraskas
900eb82c99
[DOCS] Address local vs. remote storage + shard limits feedback (#109360)
2024-06-12 13:50:23 -04:00
Liam Thompson
33a71e3289
[DOCS] Refactor book-scoped variables in docs/reference/index.asciidoc (#107413)
* Remove `es-test-dir` book-scoped variable

* Remove `plugins-examples-dir` book-scoped variable

* Remove `:dependencies-dir:` and `:xes-repo-dir:` book-scoped variables

- In `index.asciidoc`, two variables (`:dependencies-dir:` and `:xes-repo-dir:`) were removed.
- In `sql/index.asciidoc`, the `:sql-tests:` path was updated to a fuller path.
- In `esql/index.asciidoc`, the `:esql-tests:` path was updated in the same way.

* Replace `es-repo-dir` with `es-ref-dir`

* Move `:include-xpack: true` to the few files that use it, remove from index.asciidoc
2024-04-17 14:37:07 +02:00
shainaraskas
a3794e7584
[DOCS] Remove orphaned cluster issues troubleshooing doc (#106959)
2024-04-01 16:12:48 -04:00
István Zoltán Szabó
6073e748a3
[DOCS] Adds more detail on disk usage of kNN quantized vectors (#105724)
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
2024-02-22 15:54:03 +01:00
David Turner
369096365c
Expand docs about max-shards-per-node (#105607)
Adds a little more detail on what sorts of problems may occur if you
exceed the default limits.
2024-02-20 08:43:18 +00:00
Abdon Pijpelink
7b37d4242e
[DOCS] Mention that vector quantization increases disk usage (#104509)
2024-01-18 14:01:07 +01:00
Abdon Pijpelink
ea4b6fd3ea
[DOCS] Change order on 'tune knn' page (#104036)
2024-01-08 12:04:39 +01:00
Abdon Pijpelink
7d1c342883
[DOCS] Stop recommending dot_product over cosine similarity (#103856)
2024-01-03 14:37:21 +01:00
Benjamin Trent
f00364aefd
Add byte quantization for float vectors in HNSW (#102093)
Adds new `quantization_options` to `dense_vector`. This allows for
vectors to be automatically quantized to `byte` when indexed.

Example:

```
PUT vectors
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "index": true,
        "index_options": {
          "type": "int8_hnsw"
        }
      }
    }
  }
}
```

When querying, the query vector is automatically quantized and used when
querying the HNSW graph. This reduces the memory required to only `25%`
of what was previously required for `float` vectors at a slight loss of
accuracy.

This is currently only available when `index: true` and when using
`hnsw`
2023-11-29 12:29:55 -05:00
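
The `25%` figure in the commit message above follows from element width: a `float` element takes 4 bytes and a `byte` element takes 1. The Python sketch below is only an illustration of that arithmetic, not code from Elasticsearch. The function name and the example corpus size are hypothetical, and the estimate covers raw vector values only, ignoring the HNSW graph and any per-vector metadata.

```python
FLOAT32_BYTES = 4  # width of one `float` vector element
INT8_BYTES = 1     # width of one `byte` (quantized) vector element


def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_element: int) -> int:
    """Bytes needed for the vector values alone (assumption: no graph, no metadata)."""
    return num_vectors * dims * bytes_per_element


# Hypothetical corpus: one million 768-dimensional vectors.
num_vectors, dims = 1_000_000, 768

float_size = raw_vector_bytes(num_vectors, dims, FLOAT32_BYTES)
byte_size = raw_vector_bytes(num_vectors, dims, INT8_BYTES)
ratio = byte_size / float_size

print(f"float vectors: {float_size} bytes")
print(f"byte vectors:  {byte_size} bytes")
print(f"ratio:         {ratio:.0%}")
```

The ratio depends only on the element widths, so it holds for any vector count and dimension under these assumptions.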
James Rodewig
3a91763d27
[DOCS] Deprecate rollups (#101265)
2023-10-25 16:52:25 -04:00
Mayya Sharipova
b582276dd6
Update kNN search guide with knn parallelization (#100705)
Relates to PR #98204
2023-10-11 15:31:03 -04:00
Benjamin Trent
83b70e37ef
Revert "Auto-normalize dot_product vectors at index & query (#98944)" (#99421)
This reverts commit 7b9c367aeb.
2023-09-11 09:33:17 -04:00
Benjamin Trent
7b9c367aeb
Auto-normalize dot_product vectors at index & query (#98944)
`dot_product` requires vectors to be unit-length. Previously, we would
check that vectors were unit-length and throw if they were not. 

Instead, we will now auto-normalize vectors as they are indexed.

`cosine` will continue to behave as usual, not normalizing the vectors.

closes: https://github.com/elastic/elasticsearch/issues/98935
2023-08-30 09:50:49 -04:00
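
To make the normalization described in the commit message above concrete, here is a minimal Python sketch of scaling a vector to unit length. It is not the Elasticsearch implementation (and this change was later reverted in #99421); the helper names are hypothetical. It also shows why unit-length vectors matter for `dot_product`: once both vectors are normalized, their dot product equals the cosine similarity of the originals.

```python
import math


def magnitude(vector):
    """Euclidean length of the vector."""
    return math.sqrt(sum(x * x for x in vector))


def normalize(vector):
    """Return a unit-length copy of the vector (hypothetical helper)."""
    length = magnitude(vector)
    if length == 0.0:
        raise ValueError("cannot normalize a zero-magnitude vector")
    return [x / length for x in vector]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed directly on unnormalized vectors."""
    return dot(a, b) / (magnitude(a) * magnitude(b))


a = [3.0, 4.0]
b = [1.0, 2.0]

unit_a = normalize(a)
unit_b = normalize(b)

print(magnitude(unit_a))    # approximately 1.0
print(dot(unit_a, unit_b))  # matches cosine(a, b) up to rounding
print(cosine(a, b))
```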
David Turner
60935c68cc
Adjust sizing guidance re. doc count (#97831)
In #87246 we describe some reasons why it's a good idea to limit the doc
count of a shard, and we started to do so in #94065, so this commit
adjusts the sizing guidance docs to match.
2023-07-20 14:56:52 +01:00
David Turner
ddd4ba5e30
Fix docs for explaining unassigned shards (#97538)
Today the `current_node` parameter is given in several sample requests
illustrating how to explain an unassigned shard using the cluster
allocation explain API. This doesn't make sense: an unassigned shard has
no `current_node`. This commit removes the misleading parameter in these
cases.
2023-07-11 08:01:12 +01:00
Mayya Sharipova
b366935df8
Add file extensions for vector search for preload (#96955)
In this tuning guide we mentioned preload to warm up
the filesystem cache, but we did not provide file extensions
used in vector search. This adds these extensions.
2023-06-20 13:52:51 -04:00
David Turner
846d640ddf
Suggest capturing a heap dump to diagnose high heap (#96526)
The `high-jvm-memory-pressure.html` troubleshooting docs give some
suggestions, but vitally they omit the advice to capture a heap dump
which is what we really need users to do if they want to understand
their high heap usage. This commit adds a note to the docs to that
effect.
2023-06-02 09:43:52 -04:00
debadair
777598d602
[DOCS] Remove redirect pages (#88738)
* [DOCS] Remove manual redirects

* [DOCS] Removed refs to modules-discovery-hosts-providers

* [DOCS] Fixed broken internal refs

* Fixing bad cross links in ES book, and adding redirects.asciidoc[] back into docs/reference/index.asciidoc.

* Update docs/reference/search/point-in-time-api.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/setup/restart-cluster.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/sql/endpoints/translate.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

* Update docs/reference/snapshot-restore/restore-snapshot.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>

* Update repository-azure.asciidoc

* Update node-tool.asciidoc

* Update repository-azure.asciidoc

---------

Co-authored-by: amyjtechwriter <61687663+amyjtechwriter@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Amy Jonsson <amy.jonsson@elastic.co>
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2023-05-24 12:32:46 +01:00
Stef Nestor
4c5a3fb4da
[+Doc] Troubleshooting / Hot Spotting (#95429)
* [+Doc] Troubleshooting / Hot Spotting

---------

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
2023-04-26 12:29:47 -06:00
Jim Ferenczi
57cbbb3fcd
Minor ann docs update (#94783)
Replace the link to the deprecated knn search API and
added a link to the nightly benchmarks in Rally.
2023-03-31 17:59:25 +01:00
Benjamin Trent
e8c5ed46c6
Fixing our docs for vector sizing calculation (#93703)
2023-02-13 07:52:53 -05:00
Luca Belluccini
7c5b6483a1
[DOCS] Typo in Search speed (#91934)
* [DOCS] Typo in Search speed

PR https://github.com/elastic/elasticsearch/pull/89782 introduced some broken tags that leaked into the text

* Fix tags

* Make all headings discrete

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
2022-11-28 13:55:47 +01:00
Julie Tibshirani
1b249639f1
Remove experimental marking from kNN search (#91065)
This commit removes the experimental tag from kNN search docs and makes some
docs improvements:
* Add a prominent warning about memory usage in the kNN search guide
* Link to the performance tuning guide from the main guide
* Clarify the memory requirements section in the tuning guide
2022-10-27 18:00:56 +02:00
Julie Tibshirani
f4038b3f15
Add guide for tuning kNN search (#89782)
This 'how to' guide explains performance considerations specific to kNN search.
It takes inspiration from the 'tune for search speed' guide.
2022-10-12 14:53:53 -07:00
Ievgen Degtiarenko
4d6d979e0e
Deprecate state field in /_cluster/reroute response (#90399)
2022-10-05 08:18:27 +02:00
Iraklis Psaroudakis
ad8d064de5
Redefine section on sizing data nodes (#90274)
Now that we have the estimated field mappings heap overhead
in nodes stats, we can refer to them in the guide for sizing
data nodes appropriately.

Relates to #86639
2022-09-30 12:37:21 +03:00
Iraklis Psaroudakis
3ed7a04d22
Introduce node mappings stats (#89807)
So that they are visible in NodeIndicesStats only at the node and index (but not shard) levels. Also visible in the _cat/nodes table. And make an exact count yaml REST test.
2022-09-19 15:47:47 +03:00
Iraklis Psaroudakis
34471b1cd2
Introduce max headroom for disk watermark stages (#88639)
Introduce max headroom settings for the low, high, and flood disk watermark stages, similar to the existing max headroom setting for the flood stage of the frozen tier. Introduce new max headrooms in HealthMetadata and in ReactiveStorageDeciderService. Add multiple tests in DiskThresholdDeciderUnitTests, DiskThresholdDeciderTests and DiskThresholdMonitorTests. Also add addition, subtraction, and min operations for ByteSizeValue.
2022-09-19 14:59:18 +03:00
Abdon Pijpelink
56edb88fed
Update disk-usage.asciidoc (#89709) (#89874)
added missing word

(cherry picked from commit 3e35455511)

Co-authored-by: Brady Vidovic <bradvido@users.noreply.github.com>
2022-09-07 23:28:44 +09:30
David Turner
546a2e2898
Add note on per-segment field name overhead (#89152)
We encountered a case where a substantial fraction of the heap usage was
due to per-segment-per-field `FieldInfo` objects, particularly
`FieldInfo#name`. This commit adds a note to the sizing docs about this
overhead.
2022-08-10 08:17:55 +01:00
David Turner
c81f907ad8
Refine size-your-shards wording (#89081)
Clarify that the limits in the docs are absolute maxima that will avoid
things just breaking but won't necessarily give great performance.
2022-08-08 18:36:32 +09:30
Dimitrios Liappis
5056b666de
[DOCS] Warn about impact of large readahead on search (#88007)
When using LVM or software RAID on Linux, the kernel or specific
distribution rules may use higher ergonomic defaults for the
readahead of the resulting block device(s). This can adversely affect
search performance due to heavy page cache thrashing in
search-heavy scenarios when mmap is involved.

Add a clarification section in the docs raising awareness about this
value and preferring the lower default.
2022-06-27 13:00:44 +03:00
Elasticsearch addict
336df7a266
Update disabling _source doc mentioning highlight (#87582)
Closes #87311
2022-06-13 09:11:25 -04:00
Armin Braun
2a5d65c17f
Remove shards per gb of heap guidance (#86223)
This guidance no longer applies.
The overhead per shard has been significantly reduced in recent versions,
and the removed rule of thumb would be too pessimistic in many if not
most cases and might be too optimistic in other specific ones.

=> Replace guidance with rule of thumb per field count on data nodes and
rule of thumb by index count (which is far more relevant nowadays than
shards) for master nodes.

relates #77466

Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Henning Andersen <33268011+henningandersen@users.noreply.github.com>
2022-06-09 15:31:01 +02:00
Leaf-Lin
7bd4708886
Revert "Move fix common issues into troubleshooting"
This reverts commit 4a563e9bfb.
2022-06-07 17:14:38 +10:00
Leaf-Lin
4a563e9bfb
Move fix common issues into troubleshooting
2022-06-07 17:07:03 +10:00