This adds a `size` parameter that controls the maximum number of
returned affected resources. The parameter defaults to `1000` and must be
positive and less than `10_000`.
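The constraints on `size` can be sketched as a small validation helper. This is an illustrative Python sketch, not the actual implementation; the names `DEFAULT_SIZE`, `MAX_SIZE_EXCLUSIVE`, `validate_size`, and `limit_affected_resources` are hypothetical.

```python
DEFAULT_SIZE = 1000          # default stated in the description
MAX_SIZE_EXCLUSIVE = 10_000  # size must be strictly less than this


def validate_size(size=None):
    """Return the effective size, or raise ValueError if it is out of range."""
    if size is None:
        return DEFAULT_SIZE
    if not 0 < size < MAX_SIZE_EXCLUSIVE:
        raise ValueError(
            f"size must be positive and less than {MAX_SIZE_EXCLUSIVE}, got {size}"
        )
    return size


def limit_affected_resources(resources, size=None):
    """Truncate a list of affected resources to at most `size` entries."""
    return list(resources)[: validate_size(size)]
```

For example, `limit_affected_resources(["a", "b", "c"], size=2)` keeps only the first two entries, while a `size` of `0` or `10_000` is rejected.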
This renames the `explain` Health API parameter to `verbose`.
We decided to rename `explain` because `verbose` is a more established
term in the industry for "opt-in to get more information" and allows more
flexibility to control what exactly that extra information is (`explain` is already
pushing the limits of what it semantically represents, as it controls both
the diagnosis insights and the raw details information).
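As a rough illustration of how a client might pass the renamed parameter, the sketch below builds only the query string. The helper name `health_query_string` is hypothetical, the endpoint path is deliberately left out, and the default of `verbose=True` is an assumption carried over from the earlier behavior of `explain`.

```python
from urllib.parse import urlencode


def health_query_string(verbose=True, size=None):
    """Build the query string for a health request (illustrative only)."""
    # Booleans are rendered in lowercase, as is usual for query parameters.
    params = {"verbose": "true" if verbose else "false"}
    if size is not None:
        params["size"] = size
    return urlencode(params)
```

Calling `health_query_string(verbose=False)` yields `verbose=false`, which a client would append to the health request URL.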
Currently, we report the count of affected nodes and indices as part of
the disk indicator using a leaky abstraction. Namely, we use the status
we assign to nodes internally based on their disk usage (red,
yellow, green, unknown).
However, these statuses don't have an explicit meaning outside the
implementation details, e.g. a red node probably conveys that the node is
experiencing disk issues, but not what kind.
This proposes being explicit in what we return to our health API users,
e.g.
```
"details": {
  "indices_with_readonly_block": 2,
  "nodes_with_enough_disk_space": 0,
  "nodes_with_unknown_disk_status": 0,
  "nodes_over_high_watermark": 0,
  "nodes_over_flood_watermark": 2
}
```
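One way to derive such explicit counts from the internal statuses is sketched below in Python. The mapping from status to detail key (red to flood watermark, yellow to high watermark, and so on) is an assumption made for illustration and is not stated in this description; `STATUS_TO_DETAIL_KEY` and `disk_details` are hypothetical names.

```python
from collections import Counter

# Assumed mapping from internal node status to an explicit detail key.
STATUS_TO_DETAIL_KEY = {
    "green": "nodes_with_enough_disk_space",
    "unknown": "nodes_with_unknown_disk_status",
    "yellow": "nodes_over_high_watermark",
    "red": "nodes_over_flood_watermark",
}


def disk_details(node_statuses, indices_with_readonly_block):
    """Count nodes per status and report them under explicit key names."""
    counts = Counter(node_statuses)
    details = {"indices_with_readonly_block": indices_with_readonly_block}
    for status, key in STATUS_TO_DETAIL_KEY.items():
        details[key] = counts[status]  # Counter returns 0 for absent statuses
    return details
```

With two red nodes and two read-only indices, `disk_details(["red", "red"], 2)` produces the same counts as the example above.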
Adds a new health indicator that reports problems if indexes have a block placed on them, or if
any nodes in the cluster are running low on disk space.
Part of the stable master history health indicator's results (the
`cluster_formation` section within `details`) used dynamic keys in a
map. This change replaces the map with a list of objects that use fixed keys. So now instead of:
```
"details": {
"current_master": {
"node_id": null,
"name": null
},
"recent_masters": [
{
"node_id": "31WBm9iTTRuMyWnBhWNUGA",
"name": "master-node-3"
}
],
"cluster_formation": {
"31WBm9iTTRuMyWnBhWNUGA": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [nADkAeGsT-q12gw89Ga1FA, 31WBm9iTTRuMyWnBhWNUGA, w8v48JvuRsuDCjwBn8KbRw], have only discovered non-quorum [{master-node-3}{31WBm9iTTRuMyWnBhWNUGA}{lJmGYiTPS_W7AJU7csG_gQ}{master-node-3}{127.0.0.1}{127.0.0.1:9301}{dm}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9302, 127.0.0.1:9303, 127.0.0.1:9304, 127.0.0.1:9305, [::1]:9300, [::1]:9302, [::1]:9303, [::1]:9304, [::1]:9305] from hosts providers and [{master-node-2}{nADkAeGsT-q12gw89Ga1FA}{logzEHuuTpqwJp-RWssBPw}{master-node-2}{127.0.0.1}{127.0.0.1:9300}{dm}, {master-node-3}{31WBm9iTTRuMyWnBhWNUGA}{lJmGYiTPS_W7AJU7csG_gQ}{master-node-3}{127.0.0.1}{127.0.0.1:9301}{dm}] from last-known cluster state; node term 39, last-accepted version 461 in term 39"
}
}
```
We will have:
```
"details": {
"current_master": {
"node_id": null,
"name": null
},
"recent_masters": [
{
"node_id": "31WBm9iTTRuMyWnBhWNUGA",
"name": "master-node-3"
}
],
"cluster_formation": [
{
"node_id": "31WBm9iTTRuMyWnBhWNUGA",
"cluster_formation_message": "master not discovered or elected yet, an election requires at least 2 nodes with ids from [nADkAeGsT-q12gw89Ga1FA, 31WBm9iTTRuMyWnBhWNUGA, w8v48JvuRsuDCjwBn8KbRw], have only discovered non-quorum [{master-node-3}{31WBm9iTTRuMyWnBhWNUGA}{lJmGYiTPS_W7AJU7csG_gQ}{master-node-3}{127.0.0.1}{127.0.0.1:9301}{dm}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9302, 127.0.0.1:9303, 127.0.0.1:9304, 127.0.0.1:9305, [::1]:9300, [::1]:9302, [::1]:9303, [::1]:9304, [::1]:9305] from hosts providers and [{master-node-2}{nADkAeGsT-q12gw89Ga1FA}{logzEHuuTpqwJp-RWssBPw}{master-node-2}{127.0.0.1}{127.0.0.1:9300}{dm}, {master-node-3}{31WBm9iTTRuMyWnBhWNUGA}{lJmGYiTPS_W7AJU7csG_gQ}{master-node-3}{127.0.0.1}{127.0.0.1:9301}{dm}] from last-known cluster state; node term 39, last-accepted version 461 in term 39"
}
]
}
```
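The change in shape amounts to turning each map entry into an object with fixed keys. A minimal Python sketch of that conversion follows; the function name `cluster_formation_as_list` is hypothetical, and the real change is made where the response is built rather than by post-processing.

```python
def cluster_formation_as_list(cluster_formation):
    """Convert {node_id: message} into a list of fixed-key objects."""
    return [
        {"node_id": node_id, "cluster_formation_message": message}
        for node_id, message in cluster_formation.items()
    ]
```

Fixed keys let clients describe the response with a static schema instead of having to handle arbitrary node IDs as field names.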
This commit removes the notion of components from the health API. Components are no longer
a top-level field in the response, and `indicators` is promoted into its place.
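The effect on the response can be sketched as a flattening step. The old response shape used here (a top-level `components` map whose entries each hold an `indicators` map) is an assumption for illustration, and `promote_indicators` is a hypothetical helper, not part of the API.

```python
def promote_indicators(old_response):
    """Drop the `components` level and expose indicators at the top level."""
    new_response = {k: v for k, v in old_response.items() if k != "components"}
    indicators = {}
    for component in old_response.get("components", {}).values():
        # Merge every component's indicators into one top-level map.
        indicators.update(component.get("indicators", {}))
    new_response["indicators"] = indicators
    return new_response
```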
Remove `help_url`, rename `summary` -> `symptom` and `user_actions` -> `diagnosis`.
Split the diagnosis `message` field into `cause` and `action`.
Co-authored-by: Mary Gouseti <mgouseti@gmail.com>
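The removal and the two renames above can be sketched as a simple key migration. This assumes the affected fields sit at the same level of an indicator entry, which is not stated here; `RENAMES`, `REMOVED`, and `migrate_indicator` are hypothetical names. How `message` maps onto `cause` and `action` is not specified, so that split is left out.

```python
RENAMES = {"summary": "symptom", "user_actions": "diagnosis"}
REMOVED = {"help_url"}


def migrate_indicator(old):
    """Apply the field removal and renames to one indicator entry."""
    return {
        RENAMES.get(key, key): value
        for key, value in old.items()
        if key not in REMOVED
    }
```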
This PR adds listings of all the current details that can be returned from the implemented
Health Indicator Services. Response details are unique to each indicator and describe the
state of the system that the indicator is basing its health decisions on.
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
The health API has a notion of details within each health indicator that is returned. These details can sometimes be
expensive to compute or transfer. This change allows a user to specify whether the details are generated and
returned. By default, all details are now generated and returned (previously this was only the case if a component
was specified in the request). This behavior can be changed with the `explain` query parameter.
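The idea of skipping expensive details unless they are requested can be sketched as follows. This is an illustrative Python sketch with hypothetical names (`indicator_result`, `compute_details`), not the actual implementation.

```python
def indicator_result(name, status, compute_details, explain=True):
    """Build an indicator result, computing details only when requested."""
    result = {"name": name, "status": status}
    if explain:
        # compute_details is only called here, so its cost is avoided
        # entirely when the caller passes explain=False.
        result["details"] = compute_details()
    return result
```

Passing the details as a callable rather than a precomputed value is what makes the saving possible: with `explain=False` the callable is never invoked.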
Closes #86215