mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-06-28 09:28:55 -04:00
* (Doc+) Capture Elasticsearch diagnostic * add diagnostic topic to nav, chunk content, style edits * fix test --------- Co-authored-by: shainaraskas <shaina.raskas@elastic.co>
152 lines
No EOL
6.3 KiB
Text
152 lines
No EOL
6.3 KiB
Text
[[diagnostic]]
|
|
== Capturing diagnostics
|
|
++++
|
|
<titleabbrev>Capture diagnostics</titleabbrev>
|
|
++++
|
|
:keywords: Elasticsearch diagnostic, diagnostics
|
|
|
|
The {es} https://github.com/elastic/support-diagnostics[Support Diagnostic] tool captures a point-in-time snapshot of cluster statistics and most settings.
|
|
It works against all {es} versions.
|
|
|
|
This information can be used to troubleshoot problems with your cluster. For examples of issues that you can troubleshoot using Support Diagnostic tool output, refer to https://www.elastic.co/blog/why-does-elastic-support-keep-asking-for-diagnostic-files[the Elastic blog].
|
|
|
|
You can generate diagnostic information using this tool before you contact https://support.elastic.co[Elastic Support] or
|
|
https://discuss.elastic.co[Elastic Discuss] to minimize turnaround time.
|
|
|
|
[discrete]
|
|
[[diagnostic-tool-requirements]]
|
|
=== Requirements
|
|
|
|
- Java Runtime Environment or Java Development Kit v1.8 or higher
|
|
|
|
[discrete]
|
|
[[diagnostic-tool-access]]
|
|
=== Access the tool
|
|
|
|
The Support Diagnostic tool is included as a sub-library in some Elastic deployments:
|
|
|
|
* {ece}: Located under **{ece}** > **Deployment** > **Operations** >
|
|
**Prepare Bundle** > **{es}**.
|
|
* {eck}: Run as https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-take-eck-dump.html[`eck-diagnostics`].
|
|
|
|
You can also directly download the `diagnostics-X.X.X-dist.zip` file for the latest Support Diagnostic release
|
|
from https://github.com/elastic/support-diagnostics/releases/latest[the `support-diagnostic` repo].
|
|
|
|
|
|
[discrete]
|
|
[[diagnostic-capture]]
|
|
=== Capture diagnostic information
|
|
|
|
To capture an {es} diagnostic:
|
|
|
|
. In a terminal, verify that your network and user permissions are sufficient to connect to your {es}
|
|
cluster by polling the cluster's <<cluster-health,health>>.
|
|
+
|
|
For example, with the parameters `host:localhost`, `port:9200`, and `username:elastic`, you'd use the following curl request:
|
|
+
|
|
[source,sh]
|
|
----
|
|
curl -X GET -k -u elastic -p https://localhost:9200/_cluster/health
|
|
----
|
|
// NOTCONSOLE
|
|
+
|
|
If you receive a an HTTP 200 `OK` response, then you can proceed to the next step. If you receive a different
|
|
response code, then <<diagnostic-non-200,diagnose the issue>> before proceeding.
|
|
|
|
. Using the same environment parameters, run the diagnostic tool script.
|
|
+
|
|
For information about the parameters that you can pass to the tool, refer to the https://github.com/elastic/support-diagnostics#standard-options[diagnostic
|
|
parameter reference].
|
|
+
|
|
The following command options are recommended:
|
|
+
|
|
**Unix-based systems**
|
|
+
|
|
[source,sh]
|
|
----
|
|
sudo ./diagnostics.sh --type local --host localhost --port 9200 -u elastic -p --bypassDiagVerify --ssl --noVerify
|
|
----
|
|
+
|
|
**Windows**
|
|
+
|
|
[source,sh]
|
|
----
|
|
sudo .\diagnostics.bat --type local --host localhost --port 9200 -u elastic -p --bypassDiagVerify --ssl --noVerify
|
|
----
|
|
+
|
|
[TIP]
|
|
.Script execution modes
|
|
====
|
|
You can execute the script in three https://github.com/elastic/support-diagnostics#diagnostic-types[modes]:
|
|
|
|
* `local` (default, recommended): Polls the <<rest-apis,{es} API>>,
|
|
gathers operating system info, and captures cluster and GC logs.
|
|
|
|
* `remote`: Establishes an ssh session
|
|
to the applicable target server to pull the same information as `local`.
|
|
|
|
* `api`: Polls the <<rest-apis,{es} API>>. All other data must be
|
|
collected manually.
|
|
====
|
|
|
|
. When the script has completed, verify that no errors were logged to `diagnostic.log`.
|
|
If the log file contains errors, then refer to <<diagnostic-log-errors,Diagnose errors in `diagnostic.log`>>.
|
|
|
|
. If the script completed without errors, then an archive with the format `<diagnostic type>-diagnostics-<DateTimeStamp>.zip` is created in the working directory, or an output directory you have specified. You can review or share the diagnostic archive as needed.
|
|
|
|
[discrete]
|
|
[[diagnostic-non-200]]
|
|
=== Diagnose a non-200 cluster health response
|
|
|
|
When you poll your cluster health, if you receive any response other than `200 0K`, then the diagnostic tool
|
|
might not work as intended. The following are possible error codes and their resolutions:
|
|
|
|
HTTP 401 `UNAUTHENTICATED`::
|
|
Additional information in the error will usually indicate either
|
|
that your `username:password` pair is invalid, or that your `.security`
|
|
index is unavailable and you need to setup a temporary
|
|
<<file-realm,file-based realm>> user with `role:superuser` to authenticate.
|
|
|
|
HTTP 403 `UNAUTHORIZED`::
|
|
Your `username` is recognized but
|
|
has insufficient permissions to run the diagnostic. Either use a different
|
|
username or elevate the user's privileges.
|
|
|
|
HTTP 429 `TOO_MANY_REQUESTS` (for example, `circuit_breaking_exception`)::
|
|
Your username authenticated and authorized, but the cluster is under
|
|
sufficiently high strain that it's not responding to API calls. These
|
|
responses are usually intermittent. You can proceed with running the diagnostic,
|
|
but the diagnostic results might be incomplete.
|
|
|
|
HTTP 504 `BAD_GATEWAY`::
|
|
Your network is experiencing issues reaching the cluster. You might be using a proxy or firewall.
|
|
Consider running the diagnostic tool from a different location, confirming your port, or using an IP
|
|
instead of a URL domain.
|
|
|
|
HTTP 503 `SERVICE_UNAVAILABLE` (for example, `master_not_discovered_exception`)::
|
|
Your cluster does not currently have an elected master node, which is
|
|
required for it to be API-responsive. This might be temporary while the master
|
|
node rotates. If the issue persists, then <<cluster-fault-detection,investigate the cause>>
|
|
before proceeding.
|
|
|
|
[discrete]
|
|
[[diagnostic-log-errors]]
|
|
=== Diagnose errors in `diagnostic.log`
|
|
|
|
The following are common errors that you might encounter when running the diagnostic tool:
|
|
|
|
* `Error: Could not find or load main class com.elastic.support.diagnostics.DiagnosticApp`
|
|
+
|
|
This indicates that you accidentally downloaded the source code file
|
|
instead of `diagnostics-X.X.X-dist.zip` from the releases page.
|
|
|
|
* `Could not retrieve the Elasticsearch version due to a system or network error - unable to continue.`
|
|
+
|
|
This indicates that the diagnostic couldn't run commands against the cluster.
|
|
Poll the cluster's health again, and ensure that you're using the same parameters
|
|
when you run the dianostic batch or shell file.
|
|
|
|
* A `security_exception` that includes `is unauthorized for user`:
|
|
+
|
|
The provided user has insufficient admin permissions to run the diagnostic tool. Use another
|
|
user, or grant the user `role:superuser` privileges. |