kibana/docs/api
Mike Côté cb2e28d1e4
Fix task manager polling flow controls (#153491)
Fixes https://github.com/elastic/kibana/issues/151938

In this PR, I'm rewriting the Task Manager poller so it no longer runs
concurrently when timeouts occur. This also fixes the issue where
polling requests would pile up when a poll takes a long time. To support
this, I've also made the following changes:
- Removed the observable monitor and the
`xpack.task_manager.max_poll_inactivity_cycles` setting
- Made the task store `search` and `updateByQuery` functions have no
retries. This prevents the request from retrying 5x whenever a timeout
occurs, which caused each call to take up to 2 1/2 minutes before Kibana
saw the error (now down to 30s each). We have polling to manage retries
in these situations.
- Switched the task poller tests to use `sinon` for faking timers
- Removed the `assertStillInSetup` checks on plugin setup. They felt like
a maintenance burden, and it wasn't necessary to fix them to work with
my code changes.

The main code changes are within these files (to review thoroughly so
the polling cycle doesn't suddenly stop):
- x-pack/plugins/task_manager/server/polling/task_poller.ts
- x-pack/plugins/task_manager/server/polling_lifecycle.ts (easier to
review if you disregard whitespace `?w=1`)
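
As a rough illustration of the non-concurrent polling described above, here is a minimal sketch of a poller that schedules the next cycle only after the current one has settled. It is not the code in `task_poller.ts`; `createSequentialPoller`, `work`, `intervalMs` and `onError` are hypothetical names chosen for this example.

```typescript
// Hypothetical sketch, not Task Manager's actual implementation.
type Work = () => Promise<void>;

interface Poller {
  start(): void;
  stop(): void;
}

function createSequentialPoller(
  work: Work,
  intervalMs: number,
  onError: (err: unknown) => void = () => {}
): Poller {
  let running = false;
  let inFlight = false;
  let timer: ReturnType<typeof setTimeout> | undefined;

  function schedule(delayMs: number): void {
    timer = setTimeout(() => void cycle(), delayMs);
  }

  async function cycle(): Promise<void> {
    timer = undefined;
    inFlight = true;
    try {
      await work();
    } catch (err) {
      // A failed or timed-out cycle is reported, then polling continues.
      onError(err);
    } finally {
      inFlight = false;
    }
    // The next cycle is scheduled only once this one has settled,
    // so slow cycles cannot overlap or pile up.
    if (running) {
      schedule(intervalMs);
    }
  }

  return {
    start() {
      if (running) return;
      running = true;
      // If a previous cycle is still in flight, it will reschedule itself.
      if (!inFlight) schedule(0);
    },
    stop() {
      running = false;
      if (timer !== undefined) {
        clearTimeout(timer);
        timer = undefined;
      }
    },
  };
}
```

Because the timer is re-armed from inside the cycle rather than by a fixed `setInterval`, a cycle that takes longer than `intervalMs` delays the next one instead of running alongside it.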

## To verify
1. Tasks run normally (create a rule or something that goes through task
manager regularly).
2. When the update by query takes a while, the request is cancelled
after 30s or the time manually configured.
3. When the search for claimed tasks query takes a while, the request is
cancelled after 30s or the time manually configured.

**Tips:**
<details><summary>how to slow down search for claimed task
queries</summary>

```
diff --git a/x-pack/plugins/task_manager/server/queries/task_claiming.ts b/x-pack/plugins/task_manager/server/queries/task_claiming.ts
index 07042650a37..2caefd63672 100644
--- a/x-pack/plugins/task_manager/server/queries/task_claiming.ts
+++ b/x-pack/plugins/task_manager/server/queries/task_claiming.ts
@@ -247,7 +247,7 @@ export class TaskClaiming {
         taskTypes,
       });

-    const docs = tasksUpdated > 0 ? await this.sweepForClaimedTasks(taskTypes, size) : [];
+    const docs = await this.sweepForClaimedTasks(taskTypes, size);

     this.emitEvents(docs.map((doc) => asTaskClaimEvent(doc.id, asOk(doc))));

@@ -346,6 +346,13 @@ export class TaskClaiming {
       size,
       sort: SortByRunAtAndRetryAt,
       seq_no_primary_term: true,
+      aggs: {
+        delay: {
+          shard_delay: {
+            value: '40s',
+          },
+        },
+      },
     });

     return docs;
```
</details>

<details><summary>how to slow down update by queries</summary>
Not the cleanest way, but you'll see occasional request timeouts from the
`updateByQuery` calls. I had more luck when creating rules that run every 1s.

```
diff --git a/x-pack/plugins/task_manager/server/task_store.ts b/x-pack/plugins/task_manager/server/task_store.ts
index a06ee7b918a..07aa81e5388 100644
--- a/x-pack/plugins/task_manager/server/task_store.ts
+++ b/x-pack/plugins/task_manager/server/task_store.ts
@@ -126,6 +126,7 @@ export class TaskStore {
       // Timeouts are retried and make requests timeout after (requestTimeout * (1 + maxRetries))
       // The poller doesn't need retry logic because it will try again at the next polling cycle
       maxRetries: 0,
+      requestTimeout: 900,
     });
   }

@@ -458,6 +459,7 @@ export class TaskStore {
           ignore_unavailable: true,
           refresh: true,
           conflicts: 'proceed',
+          requests_per_second: 1,
           body: {
             ...opts,
             max_docs,
```
</details>
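
The code comment in the diff above gives the worst-case wait as `requestTimeout * (1 + maxRetries)`. A small sketch of that arithmetic follows; `worstCaseTimeoutMs` is a hypothetical helper, and the retry count of 4 is an assumption chosen only because five 30s attempts match the 2 1/2 minute figure mentioned earlier. The exact previous retry setting isn't shown here.

```typescript
// Hypothetical helper illustrating the formula from the code comment:
// total wait = requestTimeout * (1 + maxRetries)
function worstCaseTimeoutMs(requestTimeoutMs: number, maxRetries: number): number {
  return requestTimeoutMs * (1 + maxRetries);
}

// With retries disabled, a timeout surfaces after a single attempt.
console.log(worstCaseTimeoutMs(30_000, 0)); // 30000 (30s)

// Assumed previous behaviour: five attempts in total at 30s each.
console.log(worstCaseTimeoutMs(30_000, 4)); // 150000 (2 1/2 minutes)
```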

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
2023-05-03 09:33:10 -04:00
..
actions-and-connectors [DOCS] Create open API specification for run connector (#149274) 2023-01-26 08:53:47 -08:00
alerting Remove alerting_framework_heath from the alerting framework health API response (#154276) 2023-04-20 07:16:27 -04:00
cases [DOCS] Add stub for find case activity API (#152041) 2023-02-28 18:16:33 +01:00
dashboard [Security Solution] [Sourcerer] [Feature Branch] Update to use Kibana Data Views (#114806) 2021-11-04 14:51:32 -06:00
data-views [Docs] Fix curl examples in update data view api docs (#153292) 2023-04-03 16:02:26 -04:00
index-patterns [data views] Allow data view rename via rest api (#141869) 2022-09-27 06:56:46 -05:00
logstash-configuration-management Add a note that if list pipelines exceeds 10k, Kibana faces performan… (#131992) 2022-05-11 16:24:16 -07:00
machine-learning [DOCS] Link to open API specification from ML sync API (#142136) 2022-09-29 08:29:07 -07:00
osquery-manager [Defend Workflows] Fix saved queries 500 (#150426) 2023-02-14 16:11:14 +01:00
role-management Warn & Disallow Creating Role with Existing Name (#132218) 2022-05-25 08:34:41 -07:00
saved-objects Document compatibilityMode query string parameter in Resolve import errors API. (#155696) 2023-04-25 17:36:31 +02:00
session-management Expose session invalidation API. (#92376) 2021-03-24 09:54:08 +01:00
short-urls Short url docs (#113084) 2021-10-12 19:46:58 +02:00
spaces-management Support generating legacy URL aliases for objects that change IDs during import. (#149021) 2023-04-03 10:54:23 +02:00
task-manager Fix task manager polling flow controls (#153491) 2023-05-03 09:33:10 -04:00
upgrade-assistant Fixed some typos (#125802) 2022-03-02 16:40:34 -06:00
actions-and-connectors.asciidoc Add open API specification for list connector types (#145951) 2022-11-24 11:30:51 -07:00
alerting.asciidoc [DOCS] Add prereqs to mute unmute alert APIs (#141337) 2022-09-22 13:48:12 -07:00
cases.asciidoc [DOCS] Add stub for find case activity API (#152041) 2023-02-28 18:16:33 +01:00
dashboard-api.asciidoc [data views] data view api docs - index patterns => data views (#119415) 2021-12-01 07:32:05 -06:00
data-views.asciidoc Adds get all REST API to data views (#131683) 2022-06-02 10:00:19 -07:00
features.asciidoc Timelion App removal (#110255) 2021-09-10 14:53:07 +03:00
index-patterns.asciidoc [DOCS] Add deprecated index pattern APIs (#124065) 2022-01-31 15:47:25 -08:00
logstash-configuration-management.asciidoc [DOCS] Updates API requests and examples (#60695) 2020-03-20 16:33:20 -05:00
machine-learning.asciidoc [DOCS] Add machine learning sync API (#112033) 2021-09-21 08:33:48 -07:00
osquery-manager.asciidoc [Osquery] Add docs for Osquery API (#137162) 2022-08-09 18:43:31 +02:00
role-management.asciidoc [DOCS] Updates API requests and examples (#60695) 2020-03-20 16:33:20 -05:00
saved-objects.asciidoc API docs: Add deprecation warning to all deprecated Saved Object APIs (#150267) 2023-02-07 09:28:13 -07:00
session-management.asciidoc Expose session invalidation API. (#92376) 2021-03-24 09:54:08 +01:00
short-urls.asciidoc Short url docs (#113084) 2021-10-12 19:46:58 +02:00
spaces-management.asciidoc Document update objects spaces API (#145109) 2022-11-15 16:55:58 +00:00
upgrade-assistant.asciidoc [DOCS] Adds missing add default field API (#86332) 2020-12-17 13:59:56 -06:00