This PR extends the repository integrity health indicator to also cover unknown and invalid repositories. Because these errors are local to a node, we extend the `LocalHealthMonitor` to monitor the repositories and report changes in their health with respect to the unknown or invalid status.
To simplify similar extensions in the future, we introduce the `HealthTracker` abstract class, which can be used to create new local health checks.
Furthermore, we change the severity of the health status when the repository integrity indicator reports unhealthy from `RED` to `YELLOW` because even though this is a serious issue, there is no user impact yet.
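To illustrate the idea of a reusable tracker for node-local health checks, here is a minimal, self-contained sketch. It is not the actual `HealthTracker` from this PR: the names `Tracker`, `pollForChange`, `RepositoriesHealth`, and `RepositoriesTracker`, as well as the supplier-based constructor, are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.function.Supplier;

public class HealthTrackerSketch {

    /** Hypothetical base class: remembers the last reported value and surfaces only changes. */
    abstract static class Tracker<T> {
        private T lastReported; // null until the first report

        /** Subclasses compute the node-local health here. */
        protected abstract T determineCurrentHealth();

        /** Returns the current health if it differs from the last reported one, otherwise empty. */
        public final Optional<T> pollForChange() {
            T current = determineCurrentHealth();
            if (Objects.equals(current, lastReported)) {
                return Optional.empty();
            }
            lastReported = current;
            return Optional.of(current);
        }
    }

    /** Hypothetical payload: names of the unknown and invalid repositories seen on this node. */
    static final class RepositoriesHealth {
        final List<String> unknown;
        final List<String> invalid;

        RepositoriesHealth(List<String> unknown, List<String> invalid) {
            this.unknown = List.copyOf(unknown); // defensive, immutable copies
            this.invalid = List.copyOf(invalid);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof RepositoriesHealth)) {
                return false;
            }
            RepositoriesHealth other = (RepositoriesHealth) o;
            return unknown.equals(other.unknown) && invalid.equals(other.invalid);
        }

        @Override
        public int hashCode() {
            return Objects.hash(unknown, invalid);
        }
    }

    /** Hypothetical concrete check: the suppliers stand in for the node's repository lookups. */
    static final class RepositoriesTracker extends Tracker<RepositoriesHealth> {
        private final Supplier<List<String>> unknownRepos;
        private final Supplier<List<String>> invalidRepos;

        RepositoriesTracker(Supplier<List<String>> unknownRepos, Supplier<List<String>> invalidRepos) {
            this.unknownRepos = unknownRepos;
            this.invalidRepos = invalidRepos;
        }

        @Override
        protected RepositoriesHealth determineCurrentHealth() {
            return new RepositoriesHealth(unknownRepos.get(), invalidRepos.get());
        }
    }

    public static void main(String[] args) {
        List<String> unknown = new ArrayList<>();
        RepositoriesTracker tracker = new RepositoriesTracker(() -> unknown, () -> List.of());
        System.out.println(tracker.pollForChange().isPresent()); // first poll always reports
        System.out.println(tracker.pollForChange().isPresent()); // nothing changed since the last poll
        unknown.add("my-repo");
        System.out.println(tracker.pollForChange().isPresent()); // a new unknown repository appeared
    }
}
```

In this sketch a new local check only has to implement `determineCurrentHealth()`; the change detection lives in the base class, which is the kind of reuse the abstract class is meant to enable.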
Adds links from the stable master health indicator to the relevant
troubleshooting docs, and makes the "contact support" link a versioned
link that points directly to the right subsection of the troubleshooting
docs page.
**Problem:**
For historical reasons, source files for the Elasticsearch Guide's security, watcher, and Logstash API docs are housed in the `x-pack/docs` directory. This can confuse new contributors who expect Elasticsearch Guide docs to be located in `docs/reference`.
**Solution:**
- Move the security, watcher, and Logstash API doc source files to the `docs/reference` directory
- Update doc snippet tests to use security
Rel: https://github.com/elastic/platform-docs-team/issues/208
* [DOCS] Add 'Troubleshooting an unstable cluster' to nav
* Adjust docs links in code
* Revert "Adjust docs links in code"
This reverts commit f3846b1d78.
---------
Co-authored-by: David Turner <david.turner@elastic.co>
* [DOCS] Remote cluster troubleshooting guide
* Fix test failures
* Apply suggestions from code review
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Review feedback
* Group issues under 'common' and 'API key'
* Apply suggestions from code review
Co-authored-by: Yang Wang <ywangd@gmail.com>
---------
Co-authored-by: Yang Wang <ywangd@gmail.com>
Suggest calling `jstack` every 15s to ensure that at least one capture
shows a stuck thread. Also adds a link to this guide to the list on the
troubleshooting overview page.
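The periodic capture suggested above could be automated along these lines. This is only a sketch under stated assumptions: the class name `JstackSampler`, the output file naming, and the default of 20 captures are illustrative choices rather than anything taken from the guide; only the 15s interval and the `jstack <pid>` invocation come from the text and the JDK tooling.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.Locale;

public class JstackSampler {

    /** Builds the command line for one thread dump of the given process. */
    static List<String> jstackCommand(long pid) {
        return List.of("jstack", Long.toString(pid));
    }

    /** Hypothetical naming scheme: one numbered file per capture. */
    static File outputFile(File dir, int index) {
        return new File(dir, String.format(Locale.ROOT, "jstack-%03d.txt", index));
    }

    /** Runs jstack `count` times, pausing `intervalMillis` between captures. */
    static void capture(long pid, File dir, int count, long intervalMillis)
            throws IOException, InterruptedException {
        for (int i = 0; i < count; i++) {
            new ProcessBuilder(jstackCommand(pid))
                .redirectErrorStream(true)          // keep error output in the same file
                .redirectOutput(outputFile(dir, i)) // one file per capture
                .start()
                .waitFor();
            if (i < count - 1) {
                Thread.sleep(intervalMillis);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java JstackSampler <pid>");
            return;
        }
        // 20 captures is an arbitrary illustrative count; 15s is the suggested interval.
        capture(Long.parseLong(args[0]), new File("."), 20, 15_000L);
    }
}
```

Writing each capture to its own file makes it easier to compare consecutive dumps and spot a thread that has not moved between them.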
This troubleshooting guide is what the SLM health indicator returns
when an SLM policy has suffered too many repeated failures without a successful
execution.
Adds some docs giving more detailed background about what data
corruption really means and some suggestions about how to narrow down
the root cause.
Co-authored-by: Henning Andersen <33268011+henningandersen@users.noreply.github.com>
* Adding discovery troubleshooting link
* Add tags to pull in discovery troubleshooting content
* Move discovery troubleshooting to separate page and add redirects
Co-authored-by: Adam Locke <adam.locke@elastic.co>
* Move fix common cluster issues to troubleshooting
* Include fix common cluster issues in the troubleshooting doc
* [DOCS] Remove extra include from How-To
Co-authored-by: Deb Adair <debadair@elastic.co>
This adds troubleshooting documentation for the case when the `ShardsAvailabilityHealthIndicatorService`
reports that there are not enough nodes in the data tier (user action "increase_node_capacity_for_allocations" or
"increase_tier_capacity_for_allocations_"). This covers both the cloud and self-managed environments. For
cloud, we first recommend increasing the number of availability zones (because you cannot directly add nodes), and
decreasing `index.number_of_replicas` if that is not possible. For self-managed, we first recommend adding nodes,
and decreasing `index.number_of_replicas` if that is not possible.
* Adding Getting Help section
Add a getting help section to the troubleshooting guide that the health API can point to when issues are too complicated to address.
This is taken from https://www.elastic.co/guide/en/cloud/current/ec-get-help.html; someone might want to elaborate on it a bit more?
* Fix broken partintro, modify headings, and update wording
Co-authored-by: Adam Locke <adam.locke@elastic.co>
This adds a troubleshooting doc for indices that mix index allocation filtering
with data tier routing.
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>