Mirror of https://github.com/elastic/kibana.git
Synced 2025-04-24 09:48:58 -04:00
362 commits

Commit log (SHA1 and message):

5264ffad87
[8.16] Fixes Failing test: Jest Integration Tests.x-pack/platform/plugins/shared/task_manager/server/integration_tests - capacity based claiming should claim tasks to full capacity (#201681) (#210232)
Backport of [#201681](https://github.com/elastic/kibana/pull/201681) from `main` to `8.16`. Resolves https://github.com/elastic/kibana/issues/205949 and https://github.com/elastic/kibana/issues/191117. Fixes the flaky integration test by performing a bulk create for the test tasks instead of creating them one by one; after this change, the test ran ~100 times without failure.

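The fix above replaces per-task creation in the test setup with a single bulk create. A minimal sketch of that pattern, assuming hypothetical `create`/`bulkCreate` helpers rather than the actual task manager test utilities:

```
interface TestTask {
  id: string;
  taskType: string;
  params: Record<string, unknown>;
}

// Original approach: one request per task, slower and more prone to timing flakiness.
async function createTasksOneByOne(
  create: (task: TestTask) => Promise<void>,
  tasks: TestTask[]
): Promise<void> {
  for (const task of tasks) {
    await create(task);
  }
}

// Fixed approach: a single bulk request, so all test tasks become claimable together.
async function createTasksInBulk(
  bulkCreate: (tasks: TestTask[]) => Promise<void>,
  tasks: TestTask[]
): Promise<void> {
  await bulkCreate(tasks);
}
```
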
022f0299af
[8.16][Task Manager] Honor config-provided poll_interval (#205064)

a98da9bf50
[8.16] Exclude unrecognized tasks from the task manager aggregate API (#202163) (#202681)
Backport of [#202163](https://github.com/elastic/kibana/pull/202163) from `main` to `8.16`. Removes tasks with status `unrecognized` from the results of all `taskStore.aggregate` calls. Without this, unrecognized recurring tasks were still counted in the task manager capacity calculation under `assumedAverageRecurringRequiredThroughputPerMinutePerKibana`. To verify: create a few ES Query alerting rules running every 1s, capture the task manager health report via `/api/task_manager/_health`, apply a diff that marks ES Query alerting tasks as unrecognized (comment out the ES Query rule type registration and add `alerting:.es-query` to `REMOVED_TYPES`), then capture the health report again and note that `capacity_estimation.value.observed.avg_recurring_required_throughput_per_minute` drops. Co-authored-by: Mike Côté <mikecote@users.noreply.github.com>

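A minimal sketch of the kind of exclusion this describes, assuming a hypothetical helper that wraps whatever filter the caller passes to `aggregate`; the real query construction in the task store may be shaped differently:

```
// Ensure documents with task.status "unrecognized" never contribute to
// aggregate results, regardless of the caller-supplied filter.
function excludeUnrecognizedTasks(callerFilter?: Record<string, unknown>): Record<string, unknown> {
  return {
    bool: {
      ...(callerFilter ? { filter: [callerFilter] } : {}),
      must_not: [{ term: { 'task.status': 'unrecognized' } }],
    },
  };
}
```
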
eb1381bebd
[8.16] Expose values of certain task manager configuration settings in the telemetry (#202511) (#202684)
Backport of [#202511](https://github.com/elastic/kibana/pull/202511) from `main` to `8.16`. Adds settings to the `exposeToUsage` variable so that their values are reported via telemetry instead of `[redacted]`, making it possible to see which values and ratios are in use. Settings now reported by value: `xpack.task_manager.claim_strategy`, `xpack.task_manager.discovery.active_nodes_lookback`, and `xpack.task_manager.unsafe.exclude_task_types`. Co-authored-by: Mike Côté <mikecote@users.noreply.github.com>

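A minimal sketch of a plugin config descriptor marking those settings as reportable by value, assuming Kibana's `exposeToUsage` mechanism works as described above; the actual descriptor in the task manager plugin may be organized differently:

```
// Settings listed under exposeToUsage are reported with their real values in
// telemetry; everything else is reported as "[redacted]".
export const configDescriptor = {
  exposeToUsage: {
    claim_strategy: true,
    discovery: {
      active_nodes_lookback: true,
    },
    unsafe: {
      exclude_task_types: true,
    },
  },
};
```
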
dd31886ad7
[8.16] Fix bug in capacity warning logs when assumedAverageRecurringRequiredThroughputPerMinutePerKibana was sufficient (#200578) (#201444)
Backport of [#200578](https://github.com/elastic/kibana/pull/200578) from `main` to `8.16`. Fixes a bug where the task manager unhealthy logs reported the wrong reason. The if condition checked whether `assumedAverageRecurringRequiredThroughputPerMinutePerKibana` was less than `capacityPerMinutePerKibana` and logged that as the cause, but a smaller value means task manager is healthy from a recurring-task perspective. Changing the condition lets the next branch run, which logs `assumedRequiredThroughputPerMinutePerKibana` as the real reason task manager is unhealthy. Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com> and Mike Côté <mikecote@users.noreply.github.com>

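A minimal sketch of the corrected branching, using the field names from the commit message but a hypothetical `unhealthyReason` helper; the real logic lives in the capacity estimation code:

```
// Only blame recurring-task throughput when it actually exceeds capacity;
// otherwise fall through and report the overall required throughput instead.
function unhealthyReason(
  assumedRequiredThroughputPerMinutePerKibana: number,
  assumedAverageRecurringRequiredThroughputPerMinutePerKibana: number,
  capacityPerMinutePerKibana: number
): string {
  if (assumedAverageRecurringRequiredThroughputPerMinutePerKibana >= capacityPerMinutePerKibana) {
    return `recurring required throughput ${assumedAverageRecurringRequiredThroughputPerMinutePerKibana} exceeds capacity ${capacityPerMinutePerKibana}`;
  }
  return `required throughput ${assumedRequiredThroughputPerMinutePerKibana} exceeds capacity ${capacityPerMinutePerKibana}`;
}
```
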
6e35221e29
[8.16] Reduce noisy logs about claiming on all partitions (#199405) (#199530)
Backport of [#199405](https://github.com/elastic/kibana/pull/199405) from `main` to `8.16`. Resolves https://github.com/elastic/response-ops-team/issues/257. Throttles the `Background task node "${taskPartitioner.getPodName()}" has no assigned partitions, claiming against all partitions` warning to once per minute, and adds an info log once the node has assigned partitions again after a warning was logged. To verify: start Kibana against a fresh Elasticsearch instance and note the warning followed by the info log; then apply a diff that comments out the `upsertCurrentNode` call in `kibana_discovery_service.ts` and note the warning repeating once per minute.

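A minimal sketch of once-per-minute warning throttling with a follow-up info log when the condition clears, using hypothetical state variables and a `logPartitionState` helper rather than the actual task claimer code:

```
const WARNING_THROTTLE_MS = 60 * 1000;

interface Logger {
  warn(message: string): void;
  info(message: string): void;
}

let lastWarningAt = 0;
let warnedSinceLastAssignment = false;

// Warn at most once per minute while the node has no assigned partitions, and
// log an info message once partitions are assigned again after a warning.
function logPartitionState(podName: string, hasAssignedPartitions: boolean, logger: Logger): void {
  const now = Date.now();
  if (!hasAssignedPartitions) {
    if (now - lastWarningAt >= WARNING_THROTTLE_MS) {
      logger.warn(
        `Background task node "${podName}" has no assigned partitions, claiming against all partitions`
      );
      lastWarningAt = now;
      warnedSinceLastAssignment = true;
    }
  } else if (warnedSinceLastAssignment) {
    logger.info(`Background task node "${podName}" now has assigned partitions`);
    warnedSinceLastAssignment = false;
  }
}
```
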
9bdaef80cd
[8.16] [Response Ops][Task Manager] Propagate `msearch` error status code so backpressure mechanism responds correctly (#197501) (#198034)
Backport of [#197501](https://github.com/elastic/kibana/pull/197501) from `main` to `8.16`. Resolves https://github.com/elastic/response-ops-team/issues/240. Adds an `MsearchError` class that preserves the status code of msearch errors. These errors are already piped into the managed configuration observable that watches for and responds to Elasticsearch errors from the update-by-query claim strategy, so that filter now also matches msearch 429 and 503 errors. To verify: use the mget claim strategy (`xpack.task_manager.claim_strategy: 'mget'`), inject a 429 status into an msearch response in `task_store.ts`, and watch task manager log `Failed to poll for work: Unexpected status code from taskStore::msearch: 429` until it temporarily increases the poll interval and reduces capacity. Co-authored-by: Ying Mao <ying.mao@elastic.co>

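A minimal sketch of an error type that preserves the msearch status code and of the 429/503 check the backpressure filter needs; the class name follows the commit message, but this is not the exact Kibana implementation:

```
// Keep the HTTP status returned by the msearch call so downstream backpressure
// logic can distinguish 429/503 from other failures.
class MsearchError extends Error {
  constructor(public readonly statusCode: number) {
    super(`Unexpected status code from taskStore::msearch: ${statusCode}`);
  }
}

function shouldTriggerBackpressure(error: unknown): boolean {
  return error instanceof MsearchError && (error.statusCode === 429 || error.statusCode === 503);
}
```
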
b627c018bc
[8.16] Improve error logs for task manager poller (#197635) (#197803)
Backport of [#197635](https://github.com/elastic/kibana/pull/197635) from `main` to `8.16`. In some scenarios the task poller logged `Failed to poll for work: undefined`, suggesting `err.message` is empty in certain cases. The code now handles plain-string errors by logging `err.message || err` and includes a stack trace when a string is passed in. Co-authored-by: Patrick Mueller <patrick.mueller@elastic.co> and Mike Côté <mikecote@users.noreply.github.com>

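A minimal sketch of the more defensive log formatting described above, with a hypothetical `formatPollerError` helper:

```
// Fall back to the raw thrown value when `message` is empty, and synthesize a
// stack trace when a plain string (rather than an Error) was thrown.
function formatPollerError(err: unknown): { message: string; stack?: string } {
  if (err instanceof Error) {
    return { message: err.message || String(err), stack: err.stack };
  }
  return { message: String(err), stack: new Error(String(err)).stack };
}
```
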
|
6bde9626df
|
[8.16] Apply backpressure to the task poller whenever Elasticsearch requests respond with 503 errors (#196900) (#197705)
# Backport This will backport the following commits from `main` to `8.16`: - [Apply backpressure to the task poller whenever Elasticsearch requests respond with 503 errors (#196900)](https://github.com/elastic/kibana/pull/196900) ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Resolves: #195134

This PR adds a 503 error check to the error filter of the `createManagedConfiguration` function, alongside the existing 501 error check, so backpressure is applied to the task poller for 503 errors as well.

Co-authored-by: Ersin Erdal <92688503+ersin-erdal@users.noreply.github.com> |
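For illustration only, here is a minimal sketch (not the actual `createManagedConfiguration` implementation) of how an error filter can treat 503 responses as a signal to back off the polling interval. The `isBackpressureError` helper, the backoff factor, and the interval numbers are assumptions made up for this example.

```ts
// Hypothetical sketch: classify Elasticsearch errors and stretch the poll
// interval when backpressure-worthy errors (e.g. 503) are observed.
interface EsError {
  statusCode?: number;
}

// Assumption: 503 is now included next to the previously handled status code.
function isBackpressureError(error: EsError): boolean {
  return error.statusCode === 503 || error.statusCode === 501;
}

// Given the current poll interval, return the next one: back off on errors,
// drift back toward the configured default otherwise.
function nextPollInterval(current: number, errors: EsError[], defaultInterval = 3000): number {
  const backoffFactor = 1.2; // made-up factor for this sketch
  if (errors.some(isBackpressureError)) {
    return Math.ceil(current * backoffFactor);
  }
  return Math.max(defaultInterval, Math.floor(current * 0.95));
}

// Example: a 503 during a poll cycle increases the interval from 3000ms to 3600ms.
console.log(nextPollInterval(3000, [{ statusCode: 503 }]));
```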
||
|
0fa7788ea5
|
[8.16] Disable Inference Connector experimental feature (#196036) (#197496)
# Backport This will backport the following commits from `main` to `8.16`: - [Disable Inference Connector experimental feature (#196036)](https://github.com/elastic/kibana/pull/196036) ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport) |
||
|
a4c05e3efb
|
[8.16] Onboard elastic owned ECH clusters to use `mget` task claiming (#196757) (#196855)
# Backport This will backport the following commits from `main` to `8.16`: - [Onboard elastic owned ECH clusters to use `mget` task claiming (#196757)](https://github.com/elastic/kibana/pull/196757) ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Similar to https://github.com/elastic/kibana/pull/196317. In this PR, I'm flipping the `mget` feature flag on for all Elastic-owned ECH clusters. Elastic-owned clusters are determined by looking at `plugins.cloud?.isElasticStaffOwned`.

## To verify
Observe that the PR deployment, whose ID doesn't start with `a` or `b`, still uses the `mget` claim strategy by logging `Using claim strategy mget` on startup.

Co-authored-by: Mike Côté <mikecote@users.noreply.github.com> |
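A minimal sketch of how such a rollout switch could look; the function name, option names, and strategy strings below are assumptions for illustration, not the actual Kibana code.

```ts
// Hypothetical sketch: pick the task claim strategy based on whether the
// deployment is Elastic-staff-owned (as reported by the cloud plugin).
interface CloudSetup {
  isElasticStaffOwned?: boolean;
}

type ClaimStrategy = 'update_by_query' | 'mget';

function resolveClaimStrategy(configured: ClaimStrategy | undefined, cloud?: CloudSetup): ClaimStrategy {
  if (configured) {
    return configured; // explicit config always wins
  }
  // Assumption: Elastic-owned ECH clusters opt into the new strategy by default.
  return cloud?.isElasticStaffOwned ? 'mget' : 'update_by_query';
}

// Example: an Elastic-staff-owned deployment without explicit config gets `mget`.
console.log(`Using claim strategy ${resolveClaimStrategy(undefined, { isElasticStaffOwned: true })}`);
```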
||
|
cb1e03aad1
|
[8.x] Fixes Failing test: Jest Integration Tests.x-pack/plugins/task_manager/server/integration_tests - unrecognized task types should be no workload aggregator errors when there are removed task types (#196179) (#196217)
# Backport This will backport the following commits from `main` to `8.x`: - [Fixes Failing test: Jest Integration Tests.x-pack/plugins/task_manager/server/integration_tests - unrecognized task types should be no workload aggregator errors when there are removed task types (#196179)](https://github.com/elastic/kibana/pull/196179) ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Ying Mao <ying.mao@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> |
||
|
0c1333301d
|
[8.x] [Response Ops][Task Manager] Onboard 12.5% of ECH clusters to use `mget` task claiming (#196317) (#196460)
# Backport This will backport the following commits from `main` to `8.x`: - [[Response Ops][Task Manager] Onboard 12.5% of ECH clusters to use `mget` task claiming (#196317)](https://github.com/elastic/kibana/pull/196317) ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Resolves https://github.com/elastic/response-ops-team/issues/239

## Summary
Deployed to cloud: the deployment ID was `ab4e88d139f93d43024837d96144e7d4`. Since the deployment ID starts with an `a`, it should use `mget`, and the logs from the latest push confirm this.

<img width="2190" alt="Screenshot 2024-10-15 at 2 59 20 PM" src="https://github.com/user-attachments/assets/079bc4d8-365e-4ba6-b7a9-59fe506283d9">

Deployed to serverless: the project ID was `d33d22a94ce246d091220eace2c4e4bb`. The logs show: `Using claim strategy mget as configured for deployment d33d22a94ce246d091220eace2c4e4bb`.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Ying Mao <ying.mao@elastic.co> |
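The PR title and verification notes suggest that deployments whose hex ID starts with `a` or `b` (2 of 16 possible hex prefixes, i.e. 12.5%) are onboarded. The helper below is an illustrative sketch of that idea under that assumption, not the shipped code.

```ts
// Hypothetical sketch: onboard ~12.5% of deployments by looking at the first
// hex character of the deployment ID. Two of the sixteen hex digits ('a' and
// 'b') cover 2/16 = 12.5% of uniformly distributed IDs.
function isInMgetRollout(deploymentId: string): boolean {
  const firstChar = deploymentId.charAt(0).toLowerCase();
  return firstChar === 'a' || firstChar === 'b';
}

// Example: the deployment ID mentioned in the PR starts with 'a', so it is onboarded.
console.log(isInMgetRollout('ab4e88d139f93d43024837d96144e7d4')); // true
console.log(isInMgetRollout('d33d22a94ce246d091220eace2c4e4bb')); // false
```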
||
|
2240eb45bf
|
[8.x] [Response Ops][Task Manager] Stop polling on Kibana shutdown (#195415) (#196159)
# Backport This will backport the following commits from `main` to `8.x`: - [[Response Ops][Task Manager] Stop polling on Kibana shutdown (#195415)](https://github.com/elastic/kibana/pull/195415) ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Resolves https://github.com/elastic/kibana/issues/160329

## Summary
Stop polling when task manager `stop()` is called. When Kibana receives a `SIGTERM` signal, all the plugin stop functions are called. When TM receives this signal, it should immediately stop claiming any new tasks; there is then a grace period before Kubernetes kills the pod, which allows any running tasks to complete.

I experimented with removing the code that prevents the event log from indexing any additional documents after the `stop` signal is received, but I received a bulk indexing error `There are no living connections` even though Elasticsearch was up and running, so it seems that some of the core functionality the event log relies on is already gone at that point.

## To Verify
1. Add a log indicating that polling is occurring:

```
--- a/x-pack/plugins/task_manager/server/polling/task_poller.ts
+++ b/x-pack/plugins/task_manager/server/polling/task_poller.ts
@@ -61,6 +61,7 @@ export function createTaskPoller<T, H>({
   const subject = new Subject<Result<H, PollingError<T>>>();

   async function runCycle() {
+    console.log('polling');
     timeoutId = null;
     const start = Date.now();
     try {
```

2. Start ES and Kibana. Use `ps aux` to determine Kibana's PID.
3. Send a SIGTERM signal to Kibana: `kill -TERM <kibana_pid>`. Task manager should log `Stopping the task poller` and you should no longer see the console logs indicating that TM is polling.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Ying Mao <ying.mao@elastic.co> |
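A minimal sketch of the shutdown behaviour described above, using a plain timer instead of the actual observable-based poller; the class and method names are assumptions for illustration.

```ts
// Hypothetical sketch: a poller that stops claiming work once stop() is called.
class SimpleTaskPoller {
  private timer: ReturnType<typeof setInterval> | null = null;
  private stopped = false;

  constructor(private readonly pollIntervalMs: number, private readonly claimTasks: () => Promise<void>) {}

  start(): void {
    if (this.stopped || this.timer) return;
    this.timer = setInterval(() => {
      // Skip claiming entirely once a shutdown has been requested.
      if (!this.stopped) void this.claimTasks();
    }, this.pollIntervalMs);
  }

  // Called from the plugin's stop() hook when Kibana receives SIGTERM.
  stop(): void {
    this.stopped = true;
    if (this.timer) {
      clearInterval(this.timer);
      this.timer = null;
    }
    console.log('Stopping the task poller');
  }
}

// Example: start polling, then stop as part of shutdown.
const poller = new SimpleTaskPoller(3000, async () => console.log('claiming tasks'));
poller.start();
setTimeout(() => poller.stop(), 10_000);
```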
||
|
0c6ab08a02
|
[8.x] [Connectors][GenAI] Inference Service Kibana connector (#189027) (#196035)
# Backport This will backport the following commits from `main` to `8.x`: - [[Connectors][GenAI] Inference Service Kibana connector (#189027)](https://github.com/elastic/kibana/pull/189027) ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

## Summary
Resolves https://github.com/elastic/kibana/issues/188043

This PR adds a new connector that defines the integration with an Elastic Inference Endpoint via the [Inference APIs](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-apis.html). The lifecycle of the Inference Endpoint is managed by the connector's registered handlers:

- `preSaveHook` - [create](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html) a new Inference Endpoint in connector create mode (`isEdit === false`), and [delete](https://www.elastic.co/guide/en/elasticsearch/reference/current/delete-inference-api.html)+[create](https://www.elastic.co/guide/en/elasticsearch/reference/current/put-inference-api.html) in connector edit mode (`isEdit === true`)
- `postSaveHook` - check whether the connector SO was created/updated and, if not, remove the Inference Endpoint created in `preSaveHook`
- `postDeleteHook` - [delete](https://www.elastic.co/guide/en/elasticsearch/reference/current/delete-inference-api.html) the Inference Endpoint if the connector was deleted

In Kibana Stack Management Connectors, it is represented by a new card (Technical preview badge):

<img width="1261" alt="Screenshot 2024-09-27 at 2 11 12 PM" src="https://github.com/user-attachments/assets/dcbcce1f-06e7-4d08-8b77-0ba4105354f8">

To simplify future integration with AI Assistants, the connector consists of two main UI parts: the provider selector and required provider settings, which are always displayed

<img width="862" alt="Screenshot 2024-10-07 at 7 59 09 AM" src="https://github.com/user-attachments/assets/87bae493-c642-479e-b28f-6150354608dd">

and Additional options, which contains optional provider settings and the Task Type configuration:

<img width="861" alt="Screenshot 2024-10-07 at 8 00 15 AM" src="https://github.com/user-attachments/assets/2341c034-6198-4731-8ce7-e22e6c6fb20f">

subActions correspond to the different task types the Inference API supports. Each task type has its own Inference Perform params. Currently added:

- completion & completionStream
- rerank
- text_embedding
- sparse_embedding

Follow-up work:

1. Collapse/expand Additional options when the connector flyout/modal has an AI Assistant as a context (passed through the extending-context implementation on the connector framework level)
2. Add support for additional params for the Completion subAction to be able to pass functions
3. Add support for a token-usage dashboard once the inference API includes the used token count in the response
4. Add functionality and UX for migration from the existing specific AI connectors to the Inference connector with the proper provider and completion task
5. Integrate the connector with the AI Assistants

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com> Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co> Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> Co-authored-by: Steph Milovic <stephanie.milovic@elastic.co> Co-authored-by: Yuliia Naumenko <jo.naumenko@gmail.com> |
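A simplified sketch of the hook sequence described above; the interfaces, client, and context shape are illustrative assumptions, not the actual connector framework types.

```ts
// Hypothetical sketch of the lifecycle described above: create the inference
// endpoint before the connector saved object is written, clean up if the save
// fails, and delete the endpoint when the connector is deleted.
interface HookContext {
  connectorId: string;
  isEdit: boolean;         // true when editing an existing connector
  wasSuccessful?: boolean; // set by the framework after the save attempt
}

interface InferenceEndpointClient {
  create(id: string): Promise<void>;
  delete(id: string): Promise<void>;
}

function makeInferenceHooks(client: InferenceEndpointClient) {
  return {
    async preSaveHook(ctx: HookContext): Promise<void> {
      if (ctx.isEdit) {
        // Edit mode: recreate the endpoint with the new configuration.
        await client.delete(ctx.connectorId);
      }
      await client.create(ctx.connectorId);
    },
    async postSaveHook(ctx: HookContext): Promise<void> {
      // If the saved-object write failed, remove the endpoint created above.
      if (!ctx.wasSuccessful) {
        await client.delete(ctx.connectorId);
      }
    },
    async postDeleteHook(ctx: HookContext): Promise<void> {
      await client.delete(ctx.connectorId);
    },
  };
}
```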
||
|
1d96fac969
|
[8.x] Fixes Failing test: Jest Integration Tests.x-pack/plugins/task_manager/server/integration_tests - unrecognized task types should be no workload aggregator errors when there are removed task types (#195496) (#195677)
# Backport This will backport the following commits from `main` to `8.x`: - [Fixes Failing test: Jest Integration Tests.x-pack/plugins/task_manager/server/integration_tests - unrecognized task types should be no workload aggregator errors when there are removed task types (#195496)](https://github.com/elastic/kibana/pull/195496) ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Resolves https://github.com/elastic/kibana/issues/194208

## Summary
The original integration test checked for the (non-)existence of any error logs on startup when there are removed task types, which was not specific enough, because there were occasionally unrelated error logs such as:

```
"Task SLO:ORPHAN_SUMMARIES-CLEANUP-TASK \"SLO:ORPHAN_SUMMARIES-CLEANUP-TASK:1.0.0\" failed: ResponseError: search_phase_execution_exception
```

This PR updates the integration test to check specifically for workload aggregator error logs.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Ying Mao <ying.mao@elastic.co> |
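A sketch of the kind of assertion change described: instead of requiring zero error logs, filter down to workload-aggregator messages. The mock-logger call shape and the message prefix are assumptions for illustration.

```ts
// Hypothetical sketch: only fail the test on workload-aggregator errors,
// ignoring unrelated startup errors from other tasks.
type LogCall = [message: string];

function workloadAggregatorErrors(errorCalls: LogCall[]): string[] {
  return errorCalls
    .map(([message]) => message)
    // Assumed prefix used by workload-aggregator error logs.
    .filter((message) => message.startsWith('[WorkloadAggregator]: Error'));
}

// Example usage inside a test (assuming a mocked logger collecting error calls):
const errorCalls: LogCall[] = [
  ['Task SLO:ORPHAN_SUMMARIES-CLEANUP-TASK "..." failed: ResponseError: search_phase_execution_exception'],
];
// The unrelated SLO error is ignored; only aggregator errors would fail the test.
console.assert(workloadAggregatorErrors(errorCalls).length === 0);
```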
||
|
59444d03d6
|
[8.x] Put the auto calculation of capacity behind a feature flag, for now (#195390) (#195486)
# Backport This will backport the following commits from `main` to `8.x`: - [Put the auto calculation of capacity behind a feature flag, for now (#195390)](https://github.com/elastic/kibana/pull/195390) ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

In this PR, I'm preparing for the 8.16 release, where we'd like to start rolling out the `mget` task claiming strategy separately from the added concurrency. To accomplish this, we need to put the capacity calculation behind a feature flag that defaults to false for now, until we do a second rollout with increased concurrency. The increased concurrency can be calculated and adjusted based on experiments where clusters set `xpack.task_manager.capacity` to a higher value and we observe the resource usage.

PR to deploy to Cloud and verify that we always default to 10 normal tasks: https://github.com/elastic/kibana/pull/195392

Co-authored-by: Mike Côté <mikecote@users.noreply.github.com> |
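An illustrative sketch of what gating the capacity calculation behind a default-off flag could look like; the flag name and the heap-based calculation are assumptions, and only the 10-task default and the `xpack.task_manager.capacity` setting come from the PR text.

```ts
// Hypothetical sketch: only auto-calculate capacity when the feature flag is on;
// otherwise fall back to the explicit setting or the default of 10 normal tasks.
const DEFAULT_CAPACITY = 10;

interface TaskManagerConfig {
  capacity?: number;                        // xpack.task_manager.capacity
  autoCalculateDefaultEchCapacity: boolean; // assumed flag name, defaults to false
}

function getDefaultCapacity(config: TaskManagerConfig, heapSizeMb: number): number {
  if (config.capacity !== undefined) {
    return config.capacity;
  }
  if (!config.autoCalculateDefaultEchCapacity) {
    return DEFAULT_CAPACITY;
  }
  // Assumed auto-calculation: scale capacity with available heap (illustrative only).
  return Math.max(DEFAULT_CAPACITY, Math.floor(heapSizeMb / 100));
}

// With the flag off (the current default), capacity stays at 10 regardless of heap size.
console.log(getDefaultCapacity({ autoCalculateDefaultEchCapacity: false }, 2048)); // 10
```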
||
|
e422533e3d
|
[8.x] Add more logs to Task Manager poller (#194741) (#194865)
# Backport This will backport the following commits from `main` to `8.x`: - [Add more logs to Task Manager poller (#194741)](https://github.com/elastic/kibana/pull/194741) ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

In this PR, I'm adding a few more logs to the task poller to indicate critical task poller events.

## To verify
1. Start up Elasticsearch and Kibana (and ensure Elasticsearch data is persisted somewhere: `yarn es snapshot -E path.data=...`)
2. Observe the `Starting the task poller` message on startup
3. Shut down Elasticsearch
4. Observe the following messages:
   - `Stopping the task poller because Elasticsearch and/or saved-objects service became unavailable`
   - `Stopping the task poller`
   - `Task poller finished running its last cycle`
5. Start up Elasticsearch again
6. Wait a while and observe the `Starting the task poller` message

Co-authored-by: Mike Côté <mikecote@users.noreply.github.com> |
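A rough sketch of the behaviour those messages describe: start or stop the poller as Elasticsearch/saved-objects availability changes, emitting the quoted log lines. The availability-callback shape and logger interface are assumptions.

```ts
// Hypothetical sketch: drive poller start/stop from an availability signal and
// emit the log lines mentioned in the verification steps above.
interface Logger {
  info(message: string): void;
}

function createPollerLifecycle(logger: Logger, poller: { start(): void; stop(): void }) {
  let running = false;
  return (isAvailable: boolean): void => {
    if (isAvailable && !running) {
      logger.info('Starting the task poller');
      poller.start();
      running = true;
    } else if (!isAvailable && running) {
      logger.info('Stopping the task poller because Elasticsearch and/or saved-objects service became unavailable');
      logger.info('Stopping the task poller');
      poller.stop();
      logger.info('Task poller finished running its last cycle');
      running = false;
    }
  };
}
```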
||
|
97760d5b59
|
[8.x] [Response Ops][Task Manager] Handle errors in `getCapacity` function during task polling (#194759) (#194823)
# Backport This will backport the following commits from `main` to `8.x`: - [[Response Ops][Task Manager] Handle errors in `getCapacity` function during task polling (#194759)](https://github.com/elastic/kibana/pull/194759) ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

## Summary
* Moves the `getCapacity` call during task polling inside the try/catch, so any errors from this function are caught and logged under the `Failed to poll for work` message and polling continues
* During the `mget` claim strategy, performs a final check after the `bulkGet` for tasks with a null `startedAt` value. If any tasks meet this condition, log some basic info and manually assign the `startedAt`. This is a stop-gap measure to help us understand why we might be seeing tasks with null `startedAt` values; in the future we may choose to filter these tasks out of the current cycle.

Co-authored-by: Ying Mao <ying.mao@elastic.co> |
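A simplified sketch of the first bullet, not the actual poller code: the capacity lookup runs inside the same try/catch as the claim, so a throwing `getCapacity` is logged as a failed poll instead of breaking the cycle.

```ts
// Hypothetical sketch: any error thrown while computing capacity or claiming
// work is caught and logged as a failed poll cycle, and polling continues.
async function runPollCycle(
  getCapacity: () => number,
  claimAvailableTasks: (capacity: number) => Promise<number>,
  logError: (message: string) => void
): Promise<void> {
  try {
    const capacity = getCapacity(); // now inside the try/catch
    if (capacity <= 0) {
      return; // no room to claim anything this cycle
    }
    await claimAvailableTasks(capacity);
  } catch (e) {
    logError(`Failed to poll for work: ${e}`);
    // swallow the error so the next cycle still runs
  }
}
```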
||
|
c24fee09bc
|
[8.x] Hook up discovery service to Task Manager health (#194113) (#194685)
# Backport This will backport the following commits from `main` to `8.x`: - [Hook up discovery service to Task Manager health (#194113)](https://github.com/elastic/kibana/pull/194113) ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Resolves https://github.com/elastic/kibana/issues/192568

In this PR, I'm solving the issue where the task manager health API is unable to determine how many Kibana nodes are running. I'm doing so by leveraging the Kibana discovery service to get a count, instead of calculating it from an aggregation on the `.kibana_task_manager` index that counts the unique number of `ownerId` values, which requires tasks to be running and sufficiently distributed across the Kibana nodes to determine the number properly.

Note: This will only work when `mget` is the task claim strategy.

## To verify
1. Set `xpack.task_manager.claim_strategy: mget` in kibana.yml
2. Start up the PR locally with Elasticsearch and Kibana running
3. Navigate to the `/api/task_manager/_health` route and confirm `observed_kibana_instances` is `1`
4. Apply the following code and restart Kibana:

```
diff --git a/x-pack/plugins/task_manager/server/kibana_discovery_service/kibana_discovery_service.ts b/x-pack/plugins/task_manager/server/kibana_discovery_service/kibana_discovery_service.ts
index 090847032bf..69dfb6d1b36 100644
--- a/x-pack/plugins/task_manager/server/kibana_discovery_service/kibana_discovery_service.ts
+++ b/x-pack/plugins/task_manager/server/kibana_discovery_service/kibana_discovery_service.ts
@@ -59,6 +59,7 @@ export class KibanaDiscoveryService {
     const lastSeen = lastSeenDate.toISOString();
     try {
       await this.upsertCurrentNode({ id: this.currentNode, lastSeen });
+      await this.upsertCurrentNode({ id: `${this.currentNode}-2`, lastSeen });
       if (!this.started) {
         this.logger.info('Kibana Discovery Service has been started');
         this.started = true;
```

5. Navigate to the `/api/task_manager/_health` route and confirm `observed_kibana_instances` is `2`

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Mike Côté <mikecote@users.noreply.github.com> |
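A sketch of the counting idea: each Kibana node periodically upserts a document with its id and a `lastSeen` timestamp, and the health report counts the documents seen recently. The staleness threshold and document shape here are assumptions for illustration.

```ts
// Hypothetical sketch: count Kibana nodes from discovery documents instead of
// aggregating ownerId values on claimed tasks.
interface DiscoveredNode {
  id: string;
  lastSeen: string; // ISO timestamp upserted periodically by each node
}

function countActiveNodes(nodes: DiscoveredNode[], now: Date, staleAfterMs = 30_000): number {
  return nodes.filter((node) => now.getTime() - Date.parse(node.lastSeen) <= staleAfterMs).length;
}

// Example: two fresh nodes and one stale node yield observed_kibana_instances = 2.
const now = new Date('2024-10-02T12:00:00Z');
console.log(
  countActiveNodes(
    [
      { id: 'node-a', lastSeen: '2024-10-02T11:59:50Z' },
      { id: 'node-b', lastSeen: '2024-10-02T11:59:55Z' },
      { id: 'node-c', lastSeen: '2024-10-02T11:50:00Z' },
    ],
    now
  )
); // 2
```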
||
|
e1d68c81b0
|
[8.x] [ResponseOps][TaskManager] Discovery service running after shutdown (next one!) (#193478) (#194225)
# Backport This will backport the following commits from `main` to `8.x`: - [[ResponseOps][TaskManager] Discovery service running after shutdown (next one!) (#193478)](https://github.com/elastic/kibana/pull/193478) ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Resolves https://github.com/elastic/kibana/issues/192505

## Summary
This PR updates the discovery service to not schedule the current-node upsert if Kibana is shutting down, and to also clear the timer. I removed the verification steps because I am not sure of the best way to verify this other than running locally and confirming that `scheduleUpsertCurrentNode()` doesn't schedule anything.

### Checklist
- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

Co-authored-by: Alexi Doak <109488926+doakalexi@users.noreply.github.com> |
||
|
1108910041
|
[8.x] [Response Ops][Task Manager] Adding integration test to ensure no `WorkloadAggregator` errors when there are unrecognized task types. (#193479) (#194016)
# Backport This will backport the following commits from `main` to `8.x`: - [[Response Ops][Task Manager] Adding integration test to ensure no `WorkloadAggregator` errors when there are unrecognized task types. (#193479)](https://github.com/elastic/kibana/pull/193479) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Ying Mao","email":"ying.mao@elastic.co"},"sourceCommit":{"committedDate":"2024-09-25T14:22:11Z","message":"[Response Ops][Task Manager] Adding integration test to ensure no `WorkloadAggregator` errors when there are unrecognized task types. (#193479)\n\nFixes https://github.com/elastic/kibana-team/issues/1036\r\n\r\n## Summary\r\n\r\nAdding integration test as RCA action for incident where unrecognized\r\ntask types was causing issues generating the workload portion of the\r\ntask manager health report.\r\n\r\n## To verify\r\n\r\nAdd this line to your code to that will throw an error when there are\r\nunrecognized task types when generating the health report\r\n\r\n```\r\n--- a/x-pack/plugins/task_manager/server/task_type_dictionary.ts\r\n+++ b/x-pack/plugins/task_manager/server/task_type_dictionary.ts\r\n@@ -128,6 +128,7 @@ export class TaskTypeDictionary {\r\n }\r\n\r\n public get(type: string): TaskDefinition | undefined {\r\n+ this.ensureHas(type);\r\n return this.definitions.get(type);\r\n }\r\n```\r\n\r\nRun the integration test `node scripts/jest_integration.js\r\nx-pack/plugins/task_manager/server/integration_tests/removed_types.test.ts`\r\nand see that it fails because a `WorkloadAggregator` error is logged.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"01eae1556266c8377f6557f4ccacc53e0b4db7fc","branchLabelMapping":{"^v9.0.0$":"main","^v8.16.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Feature:Task Manager","Team:ResponseOps","v9.0.0","backport:prev-minor","v8.16.0"],"title":"[Response Ops][Task Manager] Adding integration test to ensure no `WorkloadAggregator` errors when there are unrecognized task types.","number":193479,"url":"https://github.com/elastic/kibana/pull/193479","mergeCommit":{"message":"[Response Ops][Task Manager] Adding integration test to ensure no `WorkloadAggregator` errors when there are unrecognized task types. 
(#193479)\n\nFixes https://github.com/elastic/kibana-team/issues/1036\r\n\r\n## Summary\r\n\r\nAdding integration test as RCA action for incident where unrecognized\r\ntask types was causing issues generating the workload portion of the\r\ntask manager health report.\r\n\r\n## To verify\r\n\r\nAdd this line to your code to that will throw an error when there are\r\nunrecognized task types when generating the health report\r\n\r\n```\r\n--- a/x-pack/plugins/task_manager/server/task_type_dictionary.ts\r\n+++ b/x-pack/plugins/task_manager/server/task_type_dictionary.ts\r\n@@ -128,6 +128,7 @@ export class TaskTypeDictionary {\r\n }\r\n\r\n public get(type: string): TaskDefinition | undefined {\r\n+ this.ensureHas(type);\r\n return this.definitions.get(type);\r\n }\r\n```\r\n\r\nRun the integration test `node scripts/jest_integration.js\r\nx-pack/plugins/task_manager/server/integration_tests/removed_types.test.ts`\r\nand see that it fails because a `WorkloadAggregator` error is logged.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"01eae1556266c8377f6557f4ccacc53e0b4db7fc"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/193479","number":193479,"mergeCommit":{"message":"[Response Ops][Task Manager] Adding integration test to ensure no `WorkloadAggregator` errors when there are unrecognized task types. (#193479)\n\nFixes https://github.com/elastic/kibana-team/issues/1036\r\n\r\n## Summary\r\n\r\nAdding integration test as RCA action for incident where unrecognized\r\ntask types was causing issues generating the workload portion of the\r\ntask manager health report.\r\n\r\n## To verify\r\n\r\nAdd this line to your code to that will throw an error when there are\r\nunrecognized task types when generating the health report\r\n\r\n```\r\n--- a/x-pack/plugins/task_manager/server/task_type_dictionary.ts\r\n+++ b/x-pack/plugins/task_manager/server/task_type_dictionary.ts\r\n@@ -128,6 +128,7 @@ export class TaskTypeDictionary {\r\n }\r\n\r\n public get(type: string): TaskDefinition | undefined {\r\n+ this.ensureHas(type);\r\n return this.definitions.get(type);\r\n }\r\n```\r\n\r\nRun the integration test `node scripts/jest_integration.js\r\nx-pack/plugins/task_manager/server/integration_tests/removed_types.test.ts`\r\nand see that it fails because a `WorkloadAggregator` error is logged.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"01eae1556266c8377f6557f4ccacc53e0b4db7fc"}},{"branch":"8.x","label":"v8.16.0","branchLabelMappingKey":"^v8.16.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}] BACKPORT--> Co-authored-by: Ying Mao <ying.mao@elastic.co> |
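The diff above triggers failures because `get()` and `ensureHas()` have different contracts: `get()` is intentionally lenient for unrecognized (e.g. removed) task types, while `ensureHas()` throws. A reduced sketch of that contract, with `TaskDefinition` trimmed down for illustration:

```typescript
interface TaskDefinition {
  type: string;
  title: string;
}

// Simplified sketch of the dictionary behaviour the test relies on: get() quietly
// returns undefined for unrecognized (e.g. removed) task types, while ensureHas()
// throws. Calling ensureHas() inside get(), as the diff above does on purpose,
// turns every lookup of a removed type into an error, which is what the
// integration test is designed to catch in the workload aggregation path.
class TaskTypeDictionarySketch {
  private readonly definitions = new Map<string, TaskDefinition>();

  public register(def: TaskDefinition): void {
    this.definitions.set(def.type, def);
  }

  public has(type: string): boolean {
    return this.definitions.has(type);
  }

  public ensureHas(type: string): void {
    if (!this.has(type)) {
      throw new Error(`Unsupported task type "${type}"`);
    }
  }

  public get(type: string): TaskDefinition | undefined {
    // Intentionally no ensureHas() here: unrecognized types must not break
    // workload aggregation, they should simply be skipped by callers.
    return this.definitions.get(type);
  }
}

const dict = new TaskTypeDictionarySketch();
dict.register({ type: 'sampleTask', title: 'Sample task' });
console.log(dict.get('some-removed-task-type')); // undefined, no error thrown
```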
||
|
1eff7df2f0
|
[8.x] [Response Ops][Task Manager] Use ES client to update tasks at end of task run instead of SO client. (#192515) (#193924)
# Backport This will backport the following commits from `main` to `8.x`: - [[Response Ops][Task Manager] Use ES client to update tasks at end of task run instead of SO client. (#192515)](https://github.com/elastic/kibana/pull/192515) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Ying Mao","email":"ying.mao@elastic.co"},"sourceCommit":{"committedDate":"2024-09-24T20:49:59Z","message":"[Response Ops][Task Manager] Use ES client to update tasks at end of task run instead of SO client. (#192515)\n\nResolves https://github.com/elastic/kibana/issues/192398\r\n\r\n## Summary\r\n\r\nUpdates task manager end of run updates to use the ES client update\r\nfunction for a true partial update instead of the saved objects client\r\nupdate function that performs a `GET` then an update.\r\n\r\n\r\n## To verify\r\n\r\nRun ES and Kibana and verify that the task documents are updated as\r\nexpected for both recurring tasks that succeed and fail, and one-time\r\ntasks that succeed and fail.\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"4c3865df19d3d0c7264f5a05bb675c760bd1526f","branchLabelMapping":{"^v9.0.0$":"main","^v8.16.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Feature:Task Manager","Team:ResponseOps","v9.0.0","backport:prev-minor","v8.16.0"],"title":"[Response Ops][Task Manager] Use ES client to update tasks at end of task run instead of SO client.","number":192515,"url":"https://github.com/elastic/kibana/pull/192515","mergeCommit":{"message":"[Response Ops][Task Manager] Use ES client to update tasks at end of task run instead of SO client. (#192515)\n\nResolves https://github.com/elastic/kibana/issues/192398\r\n\r\n## Summary\r\n\r\nUpdates task manager end of run updates to use the ES client update\r\nfunction for a true partial update instead of the saved objects client\r\nupdate function that performs a `GET` then an update.\r\n\r\n\r\n## To verify\r\n\r\nRun ES and Kibana and verify that the task documents are updated as\r\nexpected for both recurring tasks that succeed and fail, and one-time\r\ntasks that succeed and fail.\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"4c3865df19d3d0c7264f5a05bb675c760bd1526f"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/192515","number":192515,"mergeCommit":{"message":"[Response Ops][Task Manager] Use ES client to update tasks at end of task run instead of SO client. 
(#192515)\n\nResolves https://github.com/elastic/kibana/issues/192398\r\n\r\n## Summary\r\n\r\nUpdates task manager end of run updates to use the ES client update\r\nfunction for a true partial update instead of the saved objects client\r\nupdate function that performs a `GET` then an update.\r\n\r\n\r\n## To verify\r\n\r\nRun ES and Kibana and verify that the task documents are updated as\r\nexpected for both recurring tasks that succeed and fail, and one-time\r\ntasks that succeed and fail.\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"4c3865df19d3d0c7264f5a05bb675c760bd1526f"}},{"branch":"8.x","label":"v8.16.0","branchLabelMappingKey":"^v8.16.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}] BACKPORT--> Co-authored-by: Ying Mao <ying.mao@elastic.co> |
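For illustration, a true partial update through the ES client sends only the changed fields and skips the `GET` that the saved objects client performs first. A minimal sketch, assuming the conventional task manager index name, `task:`-prefixed ids, and attributes nested under `task` (this is not the actual `task_store` code):

```typescript
import { Client } from '@elastic/elasticsearch';

const es = new Client({ node: 'http://localhost:9200' });

// Sketch of a true partial update at the end of a task run: only the changed
// fields are sent and no preceding GET is required. The index name, id prefix,
// and nesting of attributes under `task` follow task manager conventions but
// are assumptions here.
async function partialUpdateTask(taskId: string, fields: Record<string, unknown>) {
  return es.update({
    index: '.kibana_task_manager',
    id: `task:${taskId}`,
    doc: { task: fields }, // partial document merge, not a full re-index
    refresh: false,
  });
}

// Example: mark a task idle again after a successful run.
partialUpdateTask('3e5f4bd0-demo', { status: 'idle', attempts: 0 }).catch(console.error);
```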
||
|
faff392eb7
|
[8.x] Fix memory leak in task manager task runner (#193612) (#193626)
# Backport This will backport the following commits from `main` to `8.x`: - [Fix memory leak in task manager task runner (#193612)](https://github.com/elastic/kibana/pull/193612) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Mike Côté","email":"mikecote@users.noreply.github.com"},"sourceCommit":{"committedDate":"2024-09-20T17:52:26Z","message":"Fix memory leak in task manager task runner (#193612)\n\nIn this PR, I'm fixing a memory leak that was introduced in\r\nhttps://github.com/elastic/kibana/pull/190093 where every task runner\r\nclass object wouldn't free up in memory because it subscribed to the\r\n`pollIntervalConfiguration$` observable. To fix this, I moved the\r\nobservable up a class into `TaskPollingLifecycle` which only gets\r\ncreated once on plugin start and then pass down the pollInterval value\r\nvia a function call the task runner class can call.","sha":"cf6e8b5ba971fffe2a57e1a7c573e60cc2fbe280","branchLabelMapping":{"^v9.0.0$":"main","^v8.16.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Feature:Task Manager","Team:ResponseOps","v9.0.0","backport:prev-minor","v8.16.0"],"title":"Fix memory leak in task manager task runner","number":193612,"url":"https://github.com/elastic/kibana/pull/193612","mergeCommit":{"message":"Fix memory leak in task manager task runner (#193612)\n\nIn this PR, I'm fixing a memory leak that was introduced in\r\nhttps://github.com/elastic/kibana/pull/190093 where every task runner\r\nclass object wouldn't free up in memory because it subscribed to the\r\n`pollIntervalConfiguration$` observable. To fix this, I moved the\r\nobservable up a class into `TaskPollingLifecycle` which only gets\r\ncreated once on plugin start and then pass down the pollInterval value\r\nvia a function call the task runner class can call.","sha":"cf6e8b5ba971fffe2a57e1a7c573e60cc2fbe280"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/193612","number":193612,"mergeCommit":{"message":"Fix memory leak in task manager task runner (#193612)\n\nIn this PR, I'm fixing a memory leak that was introduced in\r\nhttps://github.com/elastic/kibana/pull/190093 where every task runner\r\nclass object wouldn't free up in memory because it subscribed to the\r\n`pollIntervalConfiguration$` observable. To fix this, I moved the\r\nobservable up a class into `TaskPollingLifecycle` which only gets\r\ncreated once on plugin start and then pass down the pollInterval value\r\nvia a function call the task runner class can call.","sha":"cf6e8b5ba971fffe2a57e1a7c573e60cc2fbe280"}},{"branch":"8.x","label":"v8.16.0","branchLabelMappingKey":"^v8.16.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}] BACKPORT--> Co-authored-by: Mike Côté <mikecote@users.noreply.github.com> |
||
|
64e538429f
|
[8.x] Consistent scheduling when tasks run within the poll interval of their original time (#190093) (#193022)
# Backport This will backport the following commits from `main` to `8.x`: - [Consistent scheduling when tasks run within the poll interval of their original time (#190093)](https://github.com/elastic/kibana/pull/190093) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Mike Côté","email":"mikecote@users.noreply.github.com"},"sourceCommit":{"committedDate":"2024-09-16T14:10:36Z","message":"Consistent scheduling when tasks run within the poll interval of their original time (#190093)\n\nResolves https://github.com/elastic/kibana/issues/189114\r\n\r\nIn this PR, I'm changing the logic to calculate the task's next run at.\r\nWhenever the gap between the task's runAt and when it was picked up is\r\nless than the poll interval, we'll use the `runAt` to schedule the next.\r\nThis way we don't continuously add time to the task's next run (ex:\r\nrunning every 1m turns into every 1m 3s).\r\n\r\nI've had to modify a few tests to have a more increased interval because\r\nthis made tasks run more frequently (on time), which introduced\r\nflakiness.\r\n\r\n## To verify\r\n1. Create an alerting rule that runs every 10s\r\n2. Apply the following diff to your code\r\n```\r\ndiff --git a/x-pack/plugins/task_manager/server/lib/get_next_run_at.ts b/x-pack/plugins/task_manager/server/lib/get_next_run_at.ts\r\nindex 55d5f85e5d3..4342dcdd845 100644\r\n--- a/x-pack/plugins/task_manager/server/lib/get_next_run_at.ts\r\n+++ b/x-pack/plugins/task_manager/server/lib/get_next_run_at.ts\r\n@@ -31,5 +31,7 @@ export function getNextRunAt(\r\n Date.now()\r\n );\r\n\r\n+ console.log(`*** Next run at: ${new Date(nextCalculatedRunAt).toISOString()}, interval=${newSchedule?.interval ?? schedule.interval}, originalRunAt=${originalRunAt.toISOString()}, startedAt=${startedAt.toISOString()}`);\r\n+\r\n return new Date(nextCalculatedRunAt);\r\n }\r\n```\r\n3. Observe the logs, the gap between runAt and startedAt should be less\r\nthan the poll interval, so the next run at is based on `runAt` instead\r\nof `startedAt`.\r\n4. Stop Kibana for 15 seconds then start it again\r\n5. 
Observe the first logs when the rule runs again and notice now that\r\nthe gap between runAt and startedAt is larger than the poll interval,\r\nthe next run at is based on `startedAt` instead of `runAt` to spread the\r\ntasks out evenly.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"1f673dc9f12e90a6aa41a903fee8b0adafcdcaf9","branchLabelMapping":{"^v9.0.0$":"main","^v8.16.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Feature:Task Manager","Team:ResponseOps","v9.0.0","backport:prev-minor","v8.16.0"],"title":"Consistent scheduling when tasks run within the poll interval of their original time","number":190093,"url":"https://github.com/elastic/kibana/pull/190093","mergeCommit":{"message":"Consistent scheduling when tasks run within the poll interval of their original time (#190093)\n\nResolves https://github.com/elastic/kibana/issues/189114\r\n\r\nIn this PR, I'm changing the logic to calculate the task's next run at.\r\nWhenever the gap between the task's runAt and when it was picked up is\r\nless than the poll interval, we'll use the `runAt` to schedule the next.\r\nThis way we don't continuously add time to the task's next run (ex:\r\nrunning every 1m turns into every 1m 3s).\r\n\r\nI've had to modify a few tests to have a more increased interval because\r\nthis made tasks run more frequently (on time), which introduced\r\nflakiness.\r\n\r\n## To verify\r\n1. Create an alerting rule that runs every 10s\r\n2. Apply the following diff to your code\r\n```\r\ndiff --git a/x-pack/plugins/task_manager/server/lib/get_next_run_at.ts b/x-pack/plugins/task_manager/server/lib/get_next_run_at.ts\r\nindex 55d5f85e5d3..4342dcdd845 100644\r\n--- a/x-pack/plugins/task_manager/server/lib/get_next_run_at.ts\r\n+++ b/x-pack/plugins/task_manager/server/lib/get_next_run_at.ts\r\n@@ -31,5 +31,7 @@ export function getNextRunAt(\r\n Date.now()\r\n );\r\n\r\n+ console.log(`*** Next run at: ${new Date(nextCalculatedRunAt).toISOString()}, interval=${newSchedule?.interval ?? schedule.interval}, originalRunAt=${originalRunAt.toISOString()}, startedAt=${startedAt.toISOString()}`);\r\n+\r\n return new Date(nextCalculatedRunAt);\r\n }\r\n```\r\n3. Observe the logs, the gap between runAt and startedAt should be less\r\nthan the poll interval, so the next run at is based on `runAt` instead\r\nof `startedAt`.\r\n4. Stop Kibana for 15 seconds then start it again\r\n5. 
Observe the first logs when the rule runs again and notice now that\r\nthe gap between runAt and startedAt is larger than the poll interval,\r\nthe next run at is based on `startedAt` instead of `runAt` to spread the\r\ntasks out evenly.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"1f673dc9f12e90a6aa41a903fee8b0adafcdcaf9"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/190093","number":190093,"mergeCommit":{"message":"Consistent scheduling when tasks run within the poll interval of their original time (#190093)\n\nResolves https://github.com/elastic/kibana/issues/189114\r\n\r\nIn this PR, I'm changing the logic to calculate the task's next run at.\r\nWhenever the gap between the task's runAt and when it was picked up is\r\nless than the poll interval, we'll use the `runAt` to schedule the next.\r\nThis way we don't continuously add time to the task's next run (ex:\r\nrunning every 1m turns into every 1m 3s).\r\n\r\nI've had to modify a few tests to have a more increased interval because\r\nthis made tasks run more frequently (on time), which introduced\r\nflakiness.\r\n\r\n## To verify\r\n1. Create an alerting rule that runs every 10s\r\n2. Apply the following diff to your code\r\n```\r\ndiff --git a/x-pack/plugins/task_manager/server/lib/get_next_run_at.ts b/x-pack/plugins/task_manager/server/lib/get_next_run_at.ts\r\nindex 55d5f85e5d3..4342dcdd845 100644\r\n--- a/x-pack/plugins/task_manager/server/lib/get_next_run_at.ts\r\n+++ b/x-pack/plugins/task_manager/server/lib/get_next_run_at.ts\r\n@@ -31,5 +31,7 @@ export function getNextRunAt(\r\n Date.now()\r\n );\r\n\r\n+ console.log(`*** Next run at: ${new Date(nextCalculatedRunAt).toISOString()}, interval=${newSchedule?.interval ?? schedule.interval}, originalRunAt=${originalRunAt.toISOString()}, startedAt=${startedAt.toISOString()}`);\r\n+\r\n return new Date(nextCalculatedRunAt);\r\n }\r\n```\r\n3. Observe the logs, the gap between runAt and startedAt should be less\r\nthan the poll interval, so the next run at is based on `runAt` instead\r\nof `startedAt`.\r\n4. Stop Kibana for 15 seconds then start it again\r\n5. Observe the first logs when the rule runs again and notice now that\r\nthe gap between runAt and startedAt is larger than the poll interval,\r\nthe next run at is based on `startedAt` instead of `runAt` to spread the\r\ntasks out evenly.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"1f673dc9f12e90a6aa41a903fee8b0adafcdcaf9"}},{"branch":"8.x","label":"v8.16.0","branchLabelMappingKey":"^v8.16.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}] BACKPORT--> Co-authored-by: Mike Côté <mikecote@users.noreply.github.com> |
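The scheduling rule described above boils down to choosing the base timestamp for the next run. A small sketch of that decision, with interval parsing simplified to plain seconds and the poll interval passed in explicitly (not the actual `getNextRunAt` implementation):

```typescript
// Sketch of the "consistent scheduling" rule described above: if the task was
// picked up within one poll interval of its intended runAt, key the next run
// off runAt (stable cadence); otherwise key it off startedAt so overdue tasks
// get spread out again. Interval parsing is simplified to plain seconds.
function parseIntervalMs(interval: string): number {
  return Number(interval.replace(/s$/, '')) * 1000;
}

function getNextRunAtSketch(
  runAt: Date,
  startedAt: Date,
  interval: string,
  pollIntervalMs: number
): Date {
  const pickupDelayMs = startedAt.getTime() - runAt.getTime();
  const base = pickupDelayMs < pollIntervalMs ? runAt : startedAt;
  // Never schedule in the past.
  return new Date(Math.max(base.getTime() + parseIntervalMs(interval), Date.now()));
}

const now = Date.now();
const runAt = new Date(now - 2000); // task was due 2s ago
const startedAt = new Date(now); // picked up within the 3s poll interval
console.log(getNextRunAtSketch(runAt, startedAt, '10s', 3000).toISOString());
// next run is runAt + 10s rather than startedAt + 10s, so the cadence does not drift
```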
||
|
0e8f537b0b
|
[8.x] Make task manager code use the same logger (#192574) (#193006)
# Backport This will backport the following commits from `main` to `8.x`: - [Make task manager code use the same logger (#192574)](https://github.com/elastic/kibana/pull/192574) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Mike Côté","email":"mikecote@users.noreply.github.com"},"sourceCommit":{"committedDate":"2024-09-16T12:36:57Z","message":"Make task manager code use the same logger (#192574)\n\nIn this PR, I'm making the sub-loggers within task manager use the main\r\nlogger so we can observe the logs under\r\n`log.logger:\"plugin.taskManager\"`. To preserve separation, I moved the\r\nsub-logger name within a tag so we can still filter the logs via\r\n`tags:\"taskClaimer\"`.\r\n\r\nThe wrapped_logger.ts file is copied from\r\n`x-pack/plugins/alerting/server/task_runner/lib/task_runner_logger.ts`.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"a0973d600212096ac9e530c179a87c14b7409db2","branchLabelMapping":{"^v9.0.0$":"main","^v8.16.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Feature:Task Manager","Team:ResponseOps","v9.0.0","backport:prev-minor","v8.16.0"],"title":"Make task manager code use the same logger","number":192574,"url":"https://github.com/elastic/kibana/pull/192574","mergeCommit":{"message":"Make task manager code use the same logger (#192574)\n\nIn this PR, I'm making the sub-loggers within task manager use the main\r\nlogger so we can observe the logs under\r\n`log.logger:\"plugin.taskManager\"`. To preserve separation, I moved the\r\nsub-logger name within a tag so we can still filter the logs via\r\n`tags:\"taskClaimer\"`.\r\n\r\nThe wrapped_logger.ts file is copied from\r\n`x-pack/plugins/alerting/server/task_runner/lib/task_runner_logger.ts`.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"a0973d600212096ac9e530c179a87c14b7409db2"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/192574","number":192574,"mergeCommit":{"message":"Make task manager code use the same logger (#192574)\n\nIn this PR, I'm making the sub-loggers within task manager use the main\r\nlogger so we can observe the logs under\r\n`log.logger:\"plugin.taskManager\"`. To preserve separation, I moved the\r\nsub-logger name within a tag so we can still filter the logs via\r\n`tags:\"taskClaimer\"`.\r\n\r\nThe wrapped_logger.ts file is copied from\r\n`x-pack/plugins/alerting/server/task_runner/lib/task_runner_logger.ts`.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>","sha":"a0973d600212096ac9e530c179a87c14b7409db2"}},{"branch":"8.x","label":"v8.16.0","branchLabelMappingKey":"^v8.16.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}] BACKPORT--> Co-authored-by: Mike Côté <mikecote@users.noreply.github.com> |
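A rough sketch of the tag-forwarding wrapper idea, using a reduced stand-in for the Kibana `Logger` interface (this is not the actual `wrapped_logger.ts`):

```typescript
// Sketch of the tag-forwarding idea: keep one parent logger so all task manager
// logs share the same log.logger value, and carry the former sub-logger name as
// a tag so logs can still be filtered, e.g. tags:"taskClaimer". The Logger shape
// below is a reduced stand-in, not the real @kbn/logging interface.
interface LogMeta {
  tags?: string[];
}

interface MiniLogger {
  info(message: string, meta?: LogMeta): void;
  error(message: string, meta?: LogMeta): void;
}

function createTaggedLogger(parent: MiniLogger, tag: string): MiniLogger {
  const withTag = (meta?: LogMeta): LogMeta => ({
    ...meta,
    tags: [...(meta?.tags ?? []), tag],
  });
  return {
    info: (message, meta) => parent.info(message, withTag(meta)),
    error: (message, meta) => parent.error(message, withTag(meta)),
  };
}

const rootLogger: MiniLogger = {
  info: (m, meta) => console.log('[info]', m, meta?.tags ?? []),
  error: (m, meta) => console.error('[error]', m, meta?.tags ?? []),
};

const taskClaimerLogger = createTaggedLogger(rootLogger, 'taskClaimer');
taskClaimerLogger.info('claimed 7 tasks'); // [info] claimed 7 tasks [ 'taskClaimer' ]
```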
||
|
4fc88d8eac
|
[8.x] Fix bug in calculating when ad-hoc tasks are out of attempts (#192907) (#192918)
# Backport This will backport the following commits from `main` to `8.x`: - [Fix bug in calculating when ad-hoc tasks are out of attempts (#192907)](https://github.com/elastic/kibana/pull/192907) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Mike Côté","email":"mikecote@users.noreply.github.com"},"sourceCommit":{"committedDate":"2024-09-13T18:40:29Z","message":"Fix bug in calculating when ad-hoc tasks are out of attempts (#192907)\n\nIn this PR, I'm fixing a bug where ad-hoc tasks would have one fewer\r\nattempts to retry in failure scenarios when using mget.\r\n\r\n## To verify\r\n\r\n1. Apply the following diff to your code\r\n```\r\ndiff --git a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\nindex 0275b2bdc2f..d481c3820a1 100644\r\n--- a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\n+++ b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\n@@ -77,6 +77,10 @@ export function getConnectorType(): ServerLogConnectorType {\r\n async function executor(\r\n execOptions: ServerLogConnectorTypeExecutorOptions\r\n ): Promise<ConnectorTypeExecutorResult<void>> {\r\n+\r\n+ console.log('*** Server log execution');\r\n+ throw new Error('Fail');\r\n+\r\n const { actionId, params, logger } = execOptions;\r\n\r\n const sanitizedMessage = withoutControlCharacters(params.message);\r\ndiff --git a/x-pack/plugins/task_manager/server/config.ts b/x-pack/plugins/task_manager/server/config.ts\r\nindex db07494ef4f..07e277f8d16 100644\r\n--- a/x-pack/plugins/task_manager/server/config.ts\r\n+++ b/x-pack/plugins/task_manager/server/config.ts\r\n@@ -202,7 +202,7 @@ export const configSchema = schema.object(\r\n max: 100,\r\n min: 1,\r\n }),\r\n- claim_strategy: schema.string({ defaultValue: CLAIM_STRATEGY_UPDATE_BY_QUERY }),\r\n+ claim_strategy: schema.string({ defaultValue: CLAIM_STRATEGY_MGET }),\r\n request_timeouts: requestTimeoutsConfig,\r\n },\r\n {\r\ndiff --git a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\nindex 278ba18642d..c8fb911d500 100644\r\n--- a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\n+++ b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\n@@ -54,6 +54,7 @@ export function getRetryDate({\r\n }\r\n\r\n export function calculateDelayBasedOnAttempts(attempts: number) {\r\n+ return 10 * 1000;\r\n // Return 30s for the first retry attempt\r\n if (attempts === 1) {\r\n return 30 * 1000;\r\n```\r\n2. Create an always firing rule that runs every hour, triggering a\r\nserver log on check intervals\r\n3. 
Let the rule run and observe the server log action running and\r\nfailing three times (as compared to two)","sha":"36eedc121bff8d83fc6a4590f468397f56d0bd14","branchLabelMapping":{"^v9.0.0$":"main","^v8.16.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Feature:Task Manager","Team:ResponseOps","v9.0.0","backport:prev-minor","v8.16.0"],"title":"Fix bug in calculating when ad-hoc tasks are out of attempts","number":192907,"url":"https://github.com/elastic/kibana/pull/192907","mergeCommit":{"message":"Fix bug in calculating when ad-hoc tasks are out of attempts (#192907)\n\nIn this PR, I'm fixing a bug where ad-hoc tasks would have one fewer\r\nattempts to retry in failure scenarios when using mget.\r\n\r\n## To verify\r\n\r\n1. Apply the following diff to your code\r\n```\r\ndiff --git a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\nindex 0275b2bdc2f..d481c3820a1 100644\r\n--- a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\n+++ b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\n@@ -77,6 +77,10 @@ export function getConnectorType(): ServerLogConnectorType {\r\n async function executor(\r\n execOptions: ServerLogConnectorTypeExecutorOptions\r\n ): Promise<ConnectorTypeExecutorResult<void>> {\r\n+\r\n+ console.log('*** Server log execution');\r\n+ throw new Error('Fail');\r\n+\r\n const { actionId, params, logger } = execOptions;\r\n\r\n const sanitizedMessage = withoutControlCharacters(params.message);\r\ndiff --git a/x-pack/plugins/task_manager/server/config.ts b/x-pack/plugins/task_manager/server/config.ts\r\nindex db07494ef4f..07e277f8d16 100644\r\n--- a/x-pack/plugins/task_manager/server/config.ts\r\n+++ b/x-pack/plugins/task_manager/server/config.ts\r\n@@ -202,7 +202,7 @@ export const configSchema = schema.object(\r\n max: 100,\r\n min: 1,\r\n }),\r\n- claim_strategy: schema.string({ defaultValue: CLAIM_STRATEGY_UPDATE_BY_QUERY }),\r\n+ claim_strategy: schema.string({ defaultValue: CLAIM_STRATEGY_MGET }),\r\n request_timeouts: requestTimeoutsConfig,\r\n },\r\n {\r\ndiff --git a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\nindex 278ba18642d..c8fb911d500 100644\r\n--- a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\n+++ b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\n@@ -54,6 +54,7 @@ export function getRetryDate({\r\n }\r\n\r\n export function calculateDelayBasedOnAttempts(attempts: number) {\r\n+ return 10 * 1000;\r\n // Return 30s for the first retry attempt\r\n if (attempts === 1) {\r\n return 30 * 1000;\r\n```\r\n2. Create an always firing rule that runs every hour, triggering a\r\nserver log on check intervals\r\n3. Let the rule run and observe the server log action running and\r\nfailing three times (as compared to two)","sha":"36eedc121bff8d83fc6a4590f468397f56d0bd14"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/192907","number":192907,"mergeCommit":{"message":"Fix bug in calculating when ad-hoc tasks are out of attempts (#192907)\n\nIn this PR, I'm fixing a bug where ad-hoc tasks would have one fewer\r\nattempts to retry in failure scenarios when using mget.\r\n\r\n## To verify\r\n\r\n1. 
Apply the following diff to your code\r\n```\r\ndiff --git a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\nindex 0275b2bdc2f..d481c3820a1 100644\r\n--- a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\n+++ b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\n@@ -77,6 +77,10 @@ export function getConnectorType(): ServerLogConnectorType {\r\n async function executor(\r\n execOptions: ServerLogConnectorTypeExecutorOptions\r\n ): Promise<ConnectorTypeExecutorResult<void>> {\r\n+\r\n+ console.log('*** Server log execution');\r\n+ throw new Error('Fail');\r\n+\r\n const { actionId, params, logger } = execOptions;\r\n\r\n const sanitizedMessage = withoutControlCharacters(params.message);\r\ndiff --git a/x-pack/plugins/task_manager/server/config.ts b/x-pack/plugins/task_manager/server/config.ts\r\nindex db07494ef4f..07e277f8d16 100644\r\n--- a/x-pack/plugins/task_manager/server/config.ts\r\n+++ b/x-pack/plugins/task_manager/server/config.ts\r\n@@ -202,7 +202,7 @@ export const configSchema = schema.object(\r\n max: 100,\r\n min: 1,\r\n }),\r\n- claim_strategy: schema.string({ defaultValue: CLAIM_STRATEGY_UPDATE_BY_QUERY }),\r\n+ claim_strategy: schema.string({ defaultValue: CLAIM_STRATEGY_MGET }),\r\n request_timeouts: requestTimeoutsConfig,\r\n },\r\n {\r\ndiff --git a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\nindex 278ba18642d..c8fb911d500 100644\r\n--- a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\n+++ b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\n@@ -54,6 +54,7 @@ export function getRetryDate({\r\n }\r\n\r\n export function calculateDelayBasedOnAttempts(attempts: number) {\r\n+ return 10 * 1000;\r\n // Return 30s for the first retry attempt\r\n if (attempts === 1) {\r\n return 30 * 1000;\r\n```\r\n2. Create an always firing rule that runs every hour, triggering a\r\nserver log on check intervals\r\n3. Let the rule run and observe the server log action running and\r\nfailing three times (as compared to two)","sha":"36eedc121bff8d83fc6a4590f468397f56d0bd14"}},{"branch":"8.x","label":"v8.16.0","branchLabelMappingKey":"^v8.16.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}] BACKPORT--> Co-authored-by: Mike Côté <mikecote@users.noreply.github.com> |
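The underlying issue is the classic off-by-one between "attempts used so far" and "total tries allowed". A hedged illustration of that distinction (not the actual task manager fix), assuming `maxAttempts` means the total number of executions permitted:

```typescript
// Illustration of the kind of off-by-one described above, not the actual fix.
// With maxAttempts meaning "total executions allowed", a task that has already
// run `attempts` times still has tries left while attempts < maxAttempts.
function isAdHocTaskOutOfAttempts(attempts: number, maxAttempts: number): boolean {
  return attempts >= maxAttempts;
}

// Buggy variant that strands the last retry, giving one fewer attempt than configured.
function isOutOfAttemptsBuggy(attempts: number, maxAttempts: number): boolean {
  return attempts >= maxAttempts - 1;
}

const maxAttempts = 3;
for (let attempts = 0; attempts <= maxAttempts; attempts++) {
  console.log(
    `attempts=${attempts}`,
    'fixed:', isAdHocTaskOutOfAttempts(attempts, maxAttempts),
    'buggy:', isOutOfAttemptsBuggy(attempts, maxAttempts)
  );
}
// The buggy check gives up after 2 executions even though 3 were allowed, which
// matches the "failing three times (as compared to two)" verification above.
```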
||
|
d858e7d2b3
|
[8.x] Fix bug in overdue task metric script (#192863) (#192899)
# Backport This will backport the following commits from `main` to `8.x`: - [Fix bug in overdue task metric script (#192863)](https://github.com/elastic/kibana/pull/192863) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Mike Côté","email":"mikecote@users.noreply.github.com"},"sourceCommit":{"committedDate":"2024-09-13T15:51:47Z","message":"Fix bug in overdue task metric script (#192863)\n\nFixing a bug from https://github.com/elastic/kibana/pull/192603 where\r\ntasks in idle wouldn't show up in overdue metrics.\r\n\r\n## To verify\r\n1. Set `xpack.task_manager.unsafe.exclude_task_types: ['actions:*']` in\r\nyour kibana.yml\r\n2. Startup Elasticsearch and Kibana\r\n3. Create an always firing rule that logs a server log message\r\n4. Observe the metrics endpoint `/api/task_manager/metrics` and that the\r\noverdue metrics overall, for actions and for server log increase over\r\ntime because the task is skipped","sha":"1854acd557531501ed600204ac766c5670c29f38","branchLabelMapping":{"^v9.0.0$":"main","^v8.16.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Feature:Task Manager","Team:ResponseOps","v9.0.0","backport:prev-minor","v8.16.0"],"title":"Fix bug in overdue task metric script","number":192863,"url":"https://github.com/elastic/kibana/pull/192863","mergeCommit":{"message":"Fix bug in overdue task metric script (#192863)\n\nFixing a bug from https://github.com/elastic/kibana/pull/192603 where\r\ntasks in idle wouldn't show up in overdue metrics.\r\n\r\n## To verify\r\n1. Set `xpack.task_manager.unsafe.exclude_task_types: ['actions:*']` in\r\nyour kibana.yml\r\n2. Startup Elasticsearch and Kibana\r\n3. Create an always firing rule that logs a server log message\r\n4. Observe the metrics endpoint `/api/task_manager/metrics` and that the\r\noverdue metrics overall, for actions and for server log increase over\r\ntime because the task is skipped","sha":"1854acd557531501ed600204ac766c5670c29f38"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/192863","number":192863,"mergeCommit":{"message":"Fix bug in overdue task metric script (#192863)\n\nFixing a bug from https://github.com/elastic/kibana/pull/192603 where\r\ntasks in idle wouldn't show up in overdue metrics.\r\n\r\n## To verify\r\n1. Set `xpack.task_manager.unsafe.exclude_task_types: ['actions:*']` in\r\nyour kibana.yml\r\n2. Startup Elasticsearch and Kibana\r\n3. Create an always firing rule that logs a server log message\r\n4. Observe the metrics endpoint `/api/task_manager/metrics` and that the\r\noverdue metrics overall, for actions and for server log increase over\r\ntime because the task is skipped","sha":"1854acd557531501ed600204ac766c5670c29f38"}},{"branch":"8.x","label":"v8.16.0","branchLabelMappingKey":"^v8.16.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}] BACKPORT--> Co-authored-by: Mike Côté <mikecote@users.noreply.github.com> |
||
|
850cdf0275
|
[ResponseOps][TaskManager] followups from resource based scheduling PR (#192124)
Towards https://github.com/elastic/kibana/issues/190095, https://github.com/elastic/kibana/issues/192183, https://github.com/elastic/kibana/issues/192185 ## Summary This PR updates the following: - heap-to-capacity converter to take into account larger amounts of RAM, updated this to be 16GB - initial `maxAllowedCost` to be the default capacity of 10 - adds `xpack.alerting.maxScheduledPerMinute`, `xpack.discovery.active_nodes_lookback`, `xpack.discovery.interval` configs to docker - updates the TM docs for `xpack.task_manager.capacity` --------- Co-authored-by: Lisa Cawley <lcawley@elastic.co> |
||
|
b5dcc0db8d
|
Fix overdue tasks metric to consider retryAt when the task status isn't idle (#192603)
In this PR, I'm fixing the runtime field used to calculate the number of overdue tasks so it considers `retryAt` field when tasks are in running or claiming status. |
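Combined with the follow-up fix in #192863 above (which restored the idle case), the overdue predicate amounts to roughly the following. The real implementation is an Elasticsearch runtime field, so this TypeScript version is only an illustration of the logic:

```typescript
type TaskStatus = 'idle' | 'running' | 'claiming' | 'failed' | 'unrecognized';

interface TaskDoc {
  status: TaskStatus;
  runAt: Date;
  retryAt?: Date | null;
}

// TypeScript illustration of the overdue predicate (the real implementation is
// an Elasticsearch runtime field): idle tasks are overdue once runAt has passed,
// while running/claiming tasks only count as overdue once retryAt has passed,
// i.e. once the framework would consider them stuck.
function isOverdue(task: TaskDoc, now: Date = new Date()): boolean {
  if (task.status === 'running' || task.status === 'claiming') {
    return task.retryAt != null && task.retryAt.getTime() < now.getTime();
  }
  if (task.status === 'idle') {
    return task.runAt.getTime() < now.getTime();
  }
  return false;
}

console.log(isOverdue({ status: 'idle', runAt: new Date(Date.now() - 60_000) })); // true
console.log(isOverdue({ status: 'running', runAt: new Date(0), retryAt: null })); // false
```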
||
|
34fd9dcbf9
|
Fixes Failing test: Jest Integration Tests.x-pack/plugins/task_manager/server/integration_tests - capacity based claiming should claim tasks to full capacity (#192519)
Resolves https://github.com/elastic/kibana/issues/191117 This test became flaky again after merging this PR https://github.com/elastic/kibana/pull/192261 which changes Kibana to claim tasks against all partitions when the partitions have not yet been assigned. This was due to setting the `runAt` value to `now-1000` which means that all injected tasks were immediately overdue and claimable as soon as possible, which means they might get claimed in different claim cycles. This PR changes the `runAt` value to `now + 5000` so injected tasks won't be claimed until they're all injected so they should be claimed in one cycle. Copied this test 60 times and was able to run the integration tests twice (so 120 times total) with no failures. * https://buildkite.com/elastic/kibana-pull-request/builds/233196#0191dd0b-fe03-40cf-9fab-b211dd662993 * https://buildkite.com/elastic/kibana-pull-request/builds/233236#0191dd94-8010-4b3c-988d-6e7d5655f989 |
||
|
a141818c4f
|
[Response Ops][Task Manager] Add bulk update function that directly updates using the esClient (#191760)
Resolves https://github.com/elastic/kibana/issues/187704 ## Summary Creates a new `bulkPartialUpdate` function in the `task_store` class that uses the ES client to perform bulk partial updates instead of the Saved Objects client. Updates the update in the `mget` task claimer to use this new function. ## To verify Run this branch with the `xpack.task_manager.claim_strategy: 'mget'` and ensure that all tasks are running as expected. --------- Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com> |
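A minimal sketch of what a bulk partial update through the ES client looks like. The index name, id prefix, and attribute nesting follow task manager conventions but are assumptions here; this is not the actual `bulkPartialUpdate` implementation:

```typescript
import { Client } from '@elastic/elasticsearch';

const es = new Client({ node: 'http://localhost:9200' });

interface PartialTaskUpdate {
  id: string;
  fields: Record<string, unknown>;
}

// Sketch of a bulk partial update: one `update` action plus a partial `doc` per
// task, sent in a single request, with no per-document GET beforehand.
async function bulkPartialUpdateSketch(updates: PartialTaskUpdate[]) {
  const operations = updates.flatMap(({ id, fields }) => [
    { update: { _id: `task:${id}`, _index: '.kibana_task_manager' } },
    { doc: { task: fields } },
  ]);
  const result = await es.bulk({ operations, refresh: false });
  // Per-item outcomes (including version conflicts) come back in result.items.
  return result.items.map((item) => item.update);
}

bulkPartialUpdateSketch([
  { id: 'aaaa-1111', fields: { status: 'running', retryAt: new Date().toISOString() } },
])
  .then((items) => console.log(items))
  .catch(console.error);
```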
||
|
e02e8fc1ff
|
Fix bug where mget task update conflicts are not properly handled (#192392)
In this PR, I'm fixing the mget task claimer to properly handle 409 task update conflicts. Previously they were treated as regular errors, following the regular error path and reflecting in the metrics. With this change, the metrics will no longer show errors when update conflicts exist. ## To verify 1. Set the following diff in your code, this will trigger claim conflicts 100% of the time ``` diff --git a/x-pack/plugins/task_manager/server/task_claimers/strategy_mget.ts b/x-pack/plugins/task_manager/server/task_claimers/strategy_mget.ts index 4eae5d9d707..54b386174f3 100644 --- a/x-pack/plugins/task_manager/server/task_claimers/strategy_mget.ts +++ b/x-pack/plugins/task_manager/server/task_claimers/strategy_mget.ts @@ -194,6 +194,7 @@ async function claimAvailableTasks(opts: TaskClaimerOpts): Promise<ClaimOwnershi // omits "enabled" field from task updates so we don't overwrite // any user initiated changes to "enabled" while the task was running ...omit(task, 'enabled'), + version: 'WzUwMDAsMTAwMF0=', scheduledAt: task.retryAt != null && new Date(task.retryAt).getTime() < Date.now() ? task.retryAt @@ -222,6 +223,7 @@ async function claimAvailableTasks(opts: TaskClaimerOpts): Promise<ClaimOwnershi } else { const { id, type, error } = updateResult.error; if (error.statusCode === 409) { + console.log('*** Conflict'); conflicts++; } else { logger.error(`Error updating task ${id}:${type} during claim: ${error.message}`, logMeta); ``` 2. Startup Elasticsearch and Kibana 3. Observe conflict messages logged and not `Error updating task...` 4. Pull the task manager metrics (`/api/task_manager/metrics`) 5. Observe all task claims are successful and `total_errors` is `0` |
||
|
2c5c8adf81
|
Reapply "[Response Ops][Task Manager] Setting task status directly to running in mget claim strategy (#192303)
Re-doing this PR: https://github.com/elastic/kibana/pull/191669
Reverted because it was causing a flaky test. After a lot of
investigation, it looks like the flakiness was caused by interference
from long-running tasks scheduled as part of other tests. The task
partitions test uses task IDs `1`, `2` and `3` and the tasks were being
short circuited when there were other tasks with UUIDs that started with
`1`, `2` or `3` due to the logic in the task runner that tries to
prevent duplicate recurring tasks from running. That logic just used
`startsWith` to test for duplicates where the identifier is
`${task.id}::${task.executionUUID}`. Updated that logic instead to check
for duplicate `task.id` instead of just using `startsWith` in this
commit:
|
||
|
2ace62cd45
|
Revert ba0485eab4 (#192288)
## Summary Reverting PR https://github.com/elastic/kibana/pull/191669 because it is causing a flaky test https://github.com/elastic/kibana/issues/192023 that requires additional investigation. Unskipped the flaky test and ran the flaky test runner against this PR |
||
|
ef6f3c7d90 | skip failing test suite (#191117) | ||
|
904aee8e6c
|
Task Partitioner improvements (#192261)
In this PR, I'm making a few small improvements to the task manager task partitioner. The improvements include: - Making a Kibana node claim against all the partitions when its node has no assigned partitions (and log a warning) - Making the task partitioner skip the cache whenever the previous fetch did not find any partitions - Log an error message whenever the kibana discover service fails the search for active nodes ## To verify **Claiming against all partitions:** 1. Set `xpack.task_manager.claim_strategy: mget` 2. Apply the following diff to your code ``` diff --git a/x-pack/plugins/task_manager/server/lib/assign_pod_partitions.ts b/x-pack/plugins/task_manager/server/lib/assign_pod_partitions.ts index 8639edd0483..d8d2277dbc1 100644 --- a/x-pack/plugins/task_manager/server/lib/assign_pod_partitions.ts +++ b/x-pack/plugins/task_manager/server/lib/assign_pod_partitions.ts @@ -42,10 +42,10 @@ export function assignPodPartitions({ }: AssignPodPartitionsOpts): number[] { const map = getPartitionMap({ kibanasPerPartition, podNames, partitions }); const podPartitions: number[] = []; - for (const partition of Object.keys(map)) { - if (map[Number(partition)].indexOf(podName) !== -1) { - podPartitions.push(Number(partition)); - } - } + // for (const partition of Object.keys(map)) { + // if (map[Number(partition)].indexOf(podName) !== -1) { + // podPartitions.push(Number(partition)); + // } + // } return podPartitions; } diff --git a/x-pack/plugins/task_manager/server/task_claimers/strategy_mget.ts b/x-pack/plugins/task_manager/server/task_claimers/strategy_mget.ts index c0193917f08..fac229d41fb 100644 --- a/x-pack/plugins/task_manager/server/task_claimers/strategy_mget.ts +++ b/x-pack/plugins/task_manager/server/task_claimers/strategy_mget.ts @@ -367,6 +367,7 @@ async function searchAvailableTasks({ filterDownBy(InactiveTasks), partitions.length ? tasksWithPartitions(partitions) : undefined ); + console.log('queryUnlimitedTasks', JSON.stringify(queryUnlimitedTasks, null, 2)); searches.push({ query: queryUnlimitedTasks, sort, // note: we could optimize this to not sort on priority, for this case @@ -394,6 +395,7 @@ async function searchAvailableTasks({ filterDownBy(InactiveTasks), partitions.length ? tasksWithPartitions(partitions) : undefined ); + console.log('queryLimitedTasks', JSON.stringify(query, null, 2)); searches.push({ query, sort, ``` 3. Startup Elasticsearch and Kibana 4. Notice something like the following logged as a warning ``` Background task node "5b2de169-2785-441b-ae8c-186a1936b17d" has no assigned partitions, claiming against all partitions ```` 5. Observe the queries logged and ensure no filter exists for partitions **Skipping the cache:** 1. Set `xpack.task_manager.claim_strategy: mget` 2. 
Apply the following diff to your code ``` diff --git a/x-pack/plugins/task_manager/server/lib/task_partitioner.ts b/x-pack/plugins/task_manager/server/lib/task_partitioner.ts index 9e90d459663..8f1d7b312aa 100644 --- a/x-pack/plugins/task_manager/server/lib/task_partitioner.ts +++ b/x-pack/plugins/task_manager/server/lib/task_partitioner.ts @@ -64,20 +64,18 @@ export class TaskPartitioner { // update the pod partitions cache after 10 seconds or when no partitions were previously found if (now - lastUpdated >= CACHE_INTERVAL || this.podPartitions.length === 0) { try { - const allPodNames = await this.getAllPodNames(); - this.podPartitions = assignPodPartitions({ - kibanasPerPartition: this.kibanasPerPartition, - podName: this.podName, - podNames: allPodNames, - partitions: this.allPartitions, - }); + console.log('Loading partitions from Elasticsearch'); + // const allPodNames = await this.getAllPodNames(); + this.podPartitions = []; this.podPartitionsLastUpdated = now; + return this.podPartitions; } catch (error) { this.logger.error(`Failed to load list of active kibana nodes: ${error.message}`); // return the cached value return this.podPartitions; } } + console.log('Loading partitions from cache'); return this.podPartitions; } ``` 3. Startup Elasticsearch and Kibana 4. Notice we only observe `Loading partitions from Elasticsearch` logs and no `Loading partitions from cache` regardless of the 10s cache threshold. **Error logging:** 1. Set `xpack.task_manager.claim_strategy: mget` 2. Apply the following diff to your code ``` diff --git a/x-pack/plugins/task_manager/server/kibana_discovery_service/kibana_discovery_service.ts b/x-pack/plugins/task_manager/server/kibana_discovery_service/kibana_discovery_service.ts index c532cb755f7..a9314aae88c 100644 --- a/x-pack/plugins/task_manager/server/kibana_discovery_service/kibana_discovery_service.ts +++ b/x-pack/plugins/task_manager/server/kibana_discovery_service/kibana_discovery_service.ts @@ -91,6 +91,7 @@ export class KibanaDiscoveryService { } public async getActiveKibanaNodes() { + throw new Error('no'); const { saved_objects: activeNodes } = await this.savedObjectsRepository.find<BackgroundTaskNode>({ type: BACKGROUND_TASK_NODE_SO_NAME, ``` 3. Observe the following error messages logged ``` Failed to load list of active kibana nodes: no ``` |
||
|
dc0ab43f5a
|
Fixes Failing test: Jest Integration Tests.x-pack/plugins/task_manager/server/integration_tests - capacity based claiming should claim tasks to full capacity (#191125)
Resolves https://github.com/elastic/kibana/issues/191117 ## Summary Made the following changes to reduce flakiness: * Set the `runAt` date slightly in the past * Ensured the partitions are properly set when scheduling the task After making these changes, was able to run the capacity claiming tests [50](https://buildkite.com/elastic/kibana-pull-request/builds/232049#0191be3e-9834-4a8f-b0bf-f84a15cab89a) & [65](https://buildkite.com/elastic/kibana-pull-request/builds/232106) times without failing. |
||
|
866a6c9a81
|
[ResponseOps] Errors during marking tasks as running are not shown in metrics (#191300)
Resolves https://github.com/elastic/kibana/issues/184171
## Summary
Errors are not shown in metrics when Elasticsearch returns an error
during `markAsRunning` (changes status from claiming to running)
operation in TaskManager. This PR updates the TaskManager to throw an
error instead of just logging it.
### Checklist
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
### To verify
1. Create an Always Firing rule.
2. Put the below code in the [try block of TaskStore.bulkUpdate
method](
|
||
|
ba0485eab4
|
[Response Ops][Task Manager] Setting task status directly to running in mget claim strategy (#191669)
Resolves https://github.com/elastic/kibana/issues/184739 ## Summary During the `mget` task claim strategy, we set the task directly to `running` with the appropriate `retryAt` value instead of setting it to `claiming` and letting the task runner set the task to `running`. This removes a task update from the claiming process. Updates the task runner to skip the `markTaskAsRunning` function if the claim strategy is `mget` and move the task directly to "ready to run". --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> |
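A sketch of the shape of that claim-time update: build the `running` state, including `retryAt`, up front so a single write both claims the task and marks it running. Field names follow the task document conventions used elsewhere in this log, and the timeout handling is simplified:

```typescript
interface ClaimedTaskUpdate {
  status: 'running';
  startedAt: string;
  retryAt: string;
  ownerId: string;
}

// Sketch of the idea above: when claiming with the mget strategy, build the
// "running" state (including retryAt) up front so a single write both claims
// the task and marks it running, instead of claiming first and letting the
// task runner mark it as running. The timeout handling is simplified to a
// flat milliseconds value.
function buildDirectToRunningUpdate(ownerId: string, timeoutMs: number): ClaimedTaskUpdate {
  const now = Date.now();
  return {
    status: 'running',
    startedAt: new Date(now).toISOString(),
    // If this node dies mid-run, other nodes can reclaim the task once retryAt passes.
    retryAt: new Date(now + timeoutMs).toISOString(),
    ownerId,
  };
}

console.log(buildDirectToRunningUpdate('kibana:5b2de169', 5 * 60 * 1000));
```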
||
|
ad816b0bf3
|
[Response Ops][Task Manager] Ensure mget claim errors are correctly reflected in task claim metrics (#191309)
Resolves https://github.com/elastic/kibana/issues/190082 ## Summary This PR ensures that any errors during the `mget` task claim process are accurately reflected in the task manager metrics. * Removed try/catch statements within the `mget` claim function so any errors updating/getting the task docs get bubbled up to the polling lifecycle code. This ensures that these errors get properly reporting using existing mechanisms * Reporting any errors inside the `mget` task claim process where individual documents may fail to update even if other bulk operations succeed. ## Verify 1. Verify that errors thrown within the `mget` claim process are reflected in the metrics. To do this, you can throw an error in each of the following functions used during the claim cycle: * `taskStore.msearch` * `taskStore.getDocVersions` * `taskStore.bulkUpdate` * `taskStore.bulkGet` 2. Verify that if `taskStore.bulkUpdate` or `taskStore.bulkGet` return successfully but contain errors within the response, they are reflected as task claim failures in the metrics. --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> |
||
|
dafce9016c
|
[Response Ops][Task Manager] Emitting error metric when task update fails (#191307)
Resolves https://github.com/elastic/kibana/issues/184173 ## Summary Catches errors updating the task from the `taskStore.bulkUpdate` function and emitting an error count so these errors are reflected in the metrics. ## To Verify 1. Add the following to force an error when running an example rule: ``` --- a/x-pack/plugins/task_manager/server/task_store.ts +++ b/x-pack/plugins/task_manager/server/task_store.ts @@ -24,6 +24,7 @@ import { ISavedObjectsRepository, SavedObjectsUpdateResponse, ElasticsearchClient, + SavedObjectsErrorHelpers, } from '@kbn/core/server'; import { RequestTimeoutsConfig } from './config'; @@ -309,6 +310,16 @@ export class TaskStore { this.logger.warn(`Skipping validation for bulk update because excludeLargeFields=true.`); } + const isProcessResult = docs.some( + (doc) => + doc.taskType === 'alerting:example.always-firing' && + doc.status === 'idle' && + doc.retryAt === null + ); + if (isProcessResult) { + throw SavedObjectsErrorHelpers.decorateEsUnavailableError(new Error('test')); + } + const attributesByDocId = docs.reduce((attrsById, doc) => { ``` 2. Create an `example.always-firing` rule and let it run. You should see an error in the logs: ``` [2024-08-26T14:44:07.065-04:00][ERROR][plugins.taskManager] Task alerting:example.always-firing "80b8481d-7bfc-4d38-a31b-7a559fbe846b" failed: Error: test ``` 3. Navigate to https://localhost:5601/api/task_manager/metrics?reset=false and you should see a framework error underneath the overall metrics and the alerting metrics. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> |
||
|
d3fdb7d8d6
|
[ResponseOps][TaskManager] fix limited concurrency starvation in mget task claimer (#187809)
resolves #184937 ## Summary Fixes problem with limited concurrency tasks potentially starving unlimited concurrency tasks, by using `_msearch` to search limited concurrency tasks separately from unlimited concurrency tasks. ### Checklist - [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios - [ ] [Flaky Test Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was used on any tests changed --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Mike Cote <michel.cote@elastic.co> |
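For illustration, the `_msearch` approach sends the limited- and unlimited-concurrency queries side by side in one request, each with its own size budget. A hedged sketch with placeholder queries (the real claim queries are far more involved):

```typescript
import { Client } from '@elastic/elasticsearch';

const es = new Client({ node: 'http://localhost:9200' });

// Sketch of the starvation fix described above: instead of one combined search
// in which limited-concurrency tasks can crowd out everything else, issue two
// searches in a single _msearch so each pool gets its own size budget.
// The queries and the 'reports:execute' task type below are placeholders.
async function searchAvailableTasksSketch(unlimitedSize: number, limitedSize: number) {
  const result = await es.msearch({
    searches: [
      { index: '.kibana_task_manager' },
      {
        size: unlimitedSize,
        query: { bool: { must_not: [{ terms: { 'task.taskType': ['reports:execute'] } }] } },
      },
      { index: '.kibana_task_manager' },
      {
        size: limitedSize,
        query: { bool: { filter: [{ terms: { 'task.taskType': ['reports:execute'] } }] } },
      },
    ],
  });
  // One response per search, in order: [unlimited pool, limited pool].
  return result.responses;
}

searchAvailableTasksSketch(10, 1)
  .then((responses) => console.log(responses.length))
  .catch(console.error);
```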
||
|
1ee45da333
|
[Response Ops][Task Manager] Add new configurable values (#190934)
Resolves https://github.com/elastic/kibana/issues/190734 ## Summary Making the following configurable: - discovery interval for new discovery service - lookback time when querying for active nodes in discovery service - kibanas per partition for new partitioning functionality |
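A sketch of how such settings are typically declared with `@kbn/config-schema`; the key names and defaults below are illustrative assumptions, not the exact ones added by this PR:

```typescript
import { schema, TypeOf } from '@kbn/config-schema';

// Illustrative only: the real settings live in the task manager config schema
// and the exact keys and defaults differ from what is shown here.
export const discoveryConfigSketch = schema.object({
  active_nodes_lookback: schema.duration({ defaultValue: '30s' }),
  interval: schema.duration({ defaultValue: '10s' }),
  kibanas_per_partition: schema.number({ defaultValue: 2, min: 1 }),
});

export type DiscoveryConfigSketch = TypeOf<typeof discoveryConfigSketch>;
```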
||
|
5dfa3ecf58
|
Make stop method of the taskManager plugin async (#191218)
towards: https://github.com/elastic/kibana/issues/189306 This PR fixes the `Deleting current node has failed.`errors mentioned in the above issue. |
||
|
68a924411b
|
Update dependency @elastic/elasticsearch to ^8.15.0 (main) (#190378)
Co-authored-by: elastic-renovate-prod[bot] <174716857+elastic-renovate-prod[bot]@users.noreply.github.com> Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com> Co-authored-by: Alejandro Fernández Haro <alejandro.haro@elastic.co> Co-authored-by: Walter Rafelsberger <walter.rafelsberger@elastic.co> |
||
|
53b5d7c9aa | skip failing test suite (#191117) | ||
|
9653d7e1fc
|
[Response Ops][Task Manager] Adding jest integration test to test capacity based claiming (#189431)
Resolves https://github.com/elastic/kibana/issues/189111 ## Summary Adds jest integration test to test cost capacity based claiming with the `mget` claim strategy. Using this integration test, we can exclude running other tasks other than our test types. We register a normal cost task and an XL cost task. We test both that we can claim tasks up to 100% capacity and that we will stop claiming tasks if the next task puts us over capacity, even if that means we're leaving capacity on the table. --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> |
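The claim rule exercised by that test (fill up to capacity and stop as soon as the next task would overflow, even if that leaves capacity unused) can be sketched as a simple accumulation. Cost and capacity values below are illustrative:

```typescript
interface ClaimCandidate {
  id: string;
  cost: number; // e.g. Normal = 2, ExtraLarge = 10 (illustrative values)
}

// Sketch of capacity-based claiming as exercised by the integration test above:
// walk the candidates in order and stop at the first task that would push the
// claimed cost past capacity, even if that leaves some capacity unused.
function claimToCapacity(candidates: ClaimCandidate[], capacity: number): ClaimCandidate[] {
  const claimed: ClaimCandidate[] = [];
  let usedCost = 0;
  for (const task of candidates) {
    if (usedCost + task.cost > capacity) {
      break; // the next task would overflow, so claiming stops here
    }
    claimed.push(task);
    usedCost += task.cost;
  }
  return claimed;
}

const capacity = 20; // illustrative: capacity of 10 "normal" tasks expressed in cost units
const candidates: ClaimCandidate[] = [
  { id: 'normal-1', cost: 2 },
  { id: 'normal-2', cost: 2 },
  { id: 'xl-1', cost: 10 },
  { id: 'xl-2', cost: 10 },
  { id: 'normal-3', cost: 2 },
];
console.log(claimToCapacity(candidates, capacity).map((t) => t.id));
// -> ['normal-1', 'normal-2', 'xl-1']; xl-2 would overflow, so claiming stops there
```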
||
|
fdae1348df
|
Rename task claimers (#190542)
In this PR, I'm renaming the task managers as we prepare to rollout the `mget` task claiming strategy as the default. Rename: - `unsafe_mget` -> `mget` - `default` -> `update_by_query` --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> |
||
|
06445ee2ce
|
Associate a tiny task cost with action tasks (#190508)
In this PR, I'm registering the action task types with a `cost` of `Tiny` given they are not CPU or memory intensive to run. --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> |
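A hedged sketch of registering a task type with an explicit cost. The `TaskCost` names mirror the Tiny/Normal terminology used in this log, but the enum values and registration shape are assumptions rather than the actual API:

```typescript
// Illustrative sketch of registering a task definition with an explicit cost.
// The enum values and registration shape approximate the terminology used in
// this log; they are not copied from the actual task manager source.
enum TaskCost {
  Tiny = 1,
  Normal = 2,
  ExtraLarge = 10,
}

interface TaskDefinitionSketch {
  title: string;
  maxAttempts?: number;
  cost?: TaskCost;
  createTaskRunner: () => { run: () => Promise<void> };
}

const actionTaskDefinitions: Record<string, TaskDefinitionSketch> = {
  // Action execution is mostly I/O-bound rather than CPU or memory heavy, so it
  // gets the smallest cost and barely counts against claim capacity.
  'actions:.server-log': {
    title: 'Server log connector execution (hypothetical example type)',
    maxAttempts: 3,
    cost: TaskCost.Tiny,
    createTaskRunner: () => ({
      run: async () => {
        /* execute the action */
      },
    }),
  },
};

for (const [type, def] of Object.entries(actionTaskDefinitions)) {
  console.log(`${type} -> cost ${def.cost}`);
}
```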