[8.11] [EDR Workflows][E2E] Increase the timeout of agent check in (#168438) (#168614)

# Backport

This will backport the following commits from `main` to `8.11`:
- [[EDR Workflows][E2E] Increase the timeout of agent check in
(#168438)](https://github.com/elastic/kibana/pull/168438)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Konrad
Szwarc","email":"konrad.szwarc@elastic.co"},"sourceCommit":{"committedDate":"2023-10-11T14:26:45Z","message":"[EDR
Workflows][E2E] Increase the timeout of agent check in (#168438)\n\nThis
pull request extends the agent fleet check timeout from 2 minutes\r\nto
4 minutes. We've identified a number of unreliable tests that
fail\r\nduring the `beforeAll` stage while executing the
`createEndpointHost`\r\ntask. The following logs appear before the
timeout:\r\n\r\n```\r\ninfo Enrolling Elastic Agent with Fleet\r\n |
Installing service....... DONE\r\n | Starting service... DONE\r\n |
Enrolling Elastic Agent with Fleet..........Successfully enrolled the
Elastic Agent.\r\n | Elastic Agent has been successfully installed.\r\n
| info Waiting for Agent to check-in with Fleet\r\n```\r\n\r\nThe error
message we encounter is `> Timed out waiting for
host\r\n[test-host-4981] to appear in Fleet.`\r\n\r\nIt appears that all
the preceding steps are successful, and only the\r\nfinal one fails due
to either the agent not checking in with the fleet\r\nfor 2 minutes or
the agent being unhealthy for two minutes. Since I\r\nhaven't been able
to replicate this behavior locally, and there isn't a\r\nway to inspect
what's happening on the agent, I believe the best course\r\nof action at
this point is to extend the timeout and monitor
the\r\nresults.\r\n\r\nReports of this error:\r\ncloses
https://github.com/elastic/kibana/issues/168427\r\ncloses
https://github.com/elastic/kibana/issues/168394\r\ncloses
https://github.com/elastic/kibana/issues/168393\r\ncloses
https://github.com/elastic/kibana/issues/168390\r\ncloses
https://github.com/elastic/kibana/issues/168363\r\ncloses
https://github.com/elastic/kibana/issues/168362\r\ncloses
https://github.com/elastic/kibana/issues/168361\r\ncloses
https://github.com/elastic/kibana/issues/168360\r\ncloses
https://github.com/elastic/kibana/issues/168359\r\n\r\nAffected CI
runs:\r\nhttps://buildkite.com/elastic/kibana-on-merge/builds/36483\r\nhttps://buildkite.com/elastic/kibana-on-merge/builds/36497\r\nhttps://buildkite.com/elastic/kibana-on-merge/builds/36501\r\nhttps://buildkite.com/elastic/kibana-on-merge/builds/36526\r\n\r\nAnother
time out happens from time to time when previously set 10\r\nminutes
timeout on `createEndpointHost` task is not enough to set up
the\r\nenvironment. Its portrayed below, timeout happens during agent
setup\r\n```\r\n  | default: Running: inline script\r\n  | default:
Reading package lists...\r\n  | default: Building dependency
tree...\r\n  | default: Reading state information...\r\n  | default:
Suggested packages:\r\n  | default: zip\r\n  | default: The following
NEW packages will be installed:\r\n  | default: unzip\r\n  | default: 0
upgraded, 1 newly installed, 0 to remove and 0 not upgraded.\r\n  |
default: Need to get 174 kB of archives.\r\n  | default: After this
operation, 385 kB of additional disk space will be used.\r\n  | default:
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 unzip
amd64 6.0-26ubuntu3.1 [174 kB]\r\n  | default: dpkg-preconfigure: unable
to re-open stdin: No such file or directory\r\n  | default: Fetched 174
kB in 1s (210 kB/s)\r\n  | default: Selecting previously unselected
package unzip.\r\n  | (Reading database ... 63961 files and directories
currently installed.)\r\n  | default: Preparing to unpack
.../unzip_6.0-26ubuntu3.1_amd64.deb ...\r\n  | default: Unpacking unzip
(6.0-26ubuntu3.1) ...\r\n  | default: Setting up unzip (6.0-26ubuntu3.1)
...\r\n  | default: Processing triggers for man-db (2.10.2-1) ...\r\n  |
 \r\n  | CypressError: `cy.task('createEndpointHost')` timed out after
waiting
`600000ms`.\r\n```","sha":"91cdbe2d354100683b5d8670de88e0b2cf665ba9","branchLabelMapping":{"^v8.12.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Team:Defend
Workflows","v8.11.0","v8.12.0"],"number":168438,"url":"https://github.com/elastic/kibana/pull/168438","mergeCommit":{"message":"[EDR
Workflows][E2E] Increase the timeout of agent check in (#168438)\n\nThis
pull request extends the agent fleet check timeout from 2 minutes\r\nto
4 minutes. We've identified a number of unreliable tests that
fail\r\nduring the `beforeAll` stage while executing the
`createEndpointHost`\r\ntask. The following logs appear before the
timeout:\r\n\r\n```\r\ninfo Enrolling Elastic Agent with Fleet\r\n |
Installing service....... DONE\r\n | Starting service... DONE\r\n |
Enrolling Elastic Agent with Fleet..........Successfully enrolled the
Elastic Agent.\r\n | Elastic Agent has been successfully installed.\r\n
| info Waiting for Agent to check-in with Fleet\r\n```\r\n\r\nThe error
message we encounter is `> Timed out waiting for
host\r\n[test-host-4981] to appear in Fleet.`\r\n\r\nIt appears that all
the preceding steps are successful, and only the\r\nfinal one fails due
to either the agent not checking in with the fleet\r\nfor 2 minutes or
the agent being unhealthy for two minutes. Since I\r\nhaven't been able
to replicate this behavior locally, and there isn't a\r\nway to inspect
what's happening on the agent, I believe the best course\r\nof action at
this point is to extend the timeout and monitor
the\r\nresults.\r\n\r\nReports of this error:\r\ncloses
https://github.com/elastic/kibana/issues/168427\r\ncloses
https://github.com/elastic/kibana/issues/168394\r\ncloses
https://github.com/elastic/kibana/issues/168393\r\ncloses
https://github.com/elastic/kibana/issues/168390\r\ncloses
https://github.com/elastic/kibana/issues/168363\r\ncloses
https://github.com/elastic/kibana/issues/168362\r\ncloses
https://github.com/elastic/kibana/issues/168361\r\ncloses
https://github.com/elastic/kibana/issues/168360\r\ncloses
https://github.com/elastic/kibana/issues/168359\r\n\r\nAffected CI
runs:\r\nhttps://buildkite.com/elastic/kibana-on-merge/builds/36483\r\nhttps://buildkite.com/elastic/kibana-on-merge/builds/36497\r\nhttps://buildkite.com/elastic/kibana-on-merge/builds/36501\r\nhttps://buildkite.com/elastic/kibana-on-merge/builds/36526\r\n\r\nAnother
time out happens from time to time when previously set 10\r\nminutes
timeout on `createEndpointHost` task is not enough to set up
the\r\nenvironment. Its portrayed below, timeout happens during agent
setup\r\n```\r\n  | default: Running: inline script\r\n  | default:
Reading package lists...\r\n  | default: Building dependency
tree...\r\n  | default: Reading state information...\r\n  | default:
Suggested packages:\r\n  | default: zip\r\n  | default: The following
NEW packages will be installed:\r\n  | default: unzip\r\n  | default: 0
upgraded, 1 newly installed, 0 to remove and 0 not upgraded.\r\n  |
default: Need to get 174 kB of archives.\r\n  | default: After this
operation, 385 kB of additional disk space will be used.\r\n  | default:
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 unzip
amd64 6.0-26ubuntu3.1 [174 kB]\r\n  | default: dpkg-preconfigure: unable
to re-open stdin: No such file or directory\r\n  | default: Fetched 174
kB in 1s (210 kB/s)\r\n  | default: Selecting previously unselected
package unzip.\r\n  | (Reading database ... 63961 files and directories
currently installed.)\r\n  | default: Preparing to unpack
.../unzip_6.0-26ubuntu3.1_amd64.deb ...\r\n  | default: Unpacking unzip
(6.0-26ubuntu3.1) ...\r\n  | default: Setting up unzip (6.0-26ubuntu3.1)
...\r\n  | default: Processing triggers for man-db (2.10.2-1) ...\r\n  |
 \r\n  | CypressError: `cy.task('createEndpointHost')` timed out after
waiting
`600000ms`.\r\n```","sha":"91cdbe2d354100683b5d8670de88e0b2cf665ba9"}},"sourceBranch":"main","suggestedTargetBranches":["8.11"],"targetPullRequestStates":[{"branch":"8.11","label":"v8.11.0","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.12.0","labelRegex":"^v8.12.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/168438","number":168438,"mergeCommit":{"message":"[EDR
Workflows][E2E] Increase the timeout of agent check in (#168438)\n\nThis
pull request extends the agent fleet check timeout from 2 minutes\r\nto
4 minutes. We've identified a number of unreliable tests that
fail\r\nduring the `beforeAll` stage while executing the
`createEndpointHost`\r\ntask. The following logs appear before the
timeout:\r\n\r\n```\r\ninfo Enrolling Elastic Agent with Fleet\r\n |
Installing service....... DONE\r\n | Starting service... DONE\r\n |
Enrolling Elastic Agent with Fleet..........Successfully enrolled the
Elastic Agent.\r\n | Elastic Agent has been successfully installed.\r\n
| info Waiting for Agent to check-in with Fleet\r\n```\r\n\r\nThe error
message we encounter is `> Timed out waiting for
host\r\n[test-host-4981] to appear in Fleet.`\r\n\r\nIt appears that all
the preceding steps are successful, and only the\r\nfinal one fails due
to either the agent not checking in with the fleet\r\nfor 2 minutes or
the agent being unhealthy for two minutes. Since I\r\nhaven't been able
to replicate this behavior locally, and there isn't a\r\nway to inspect
what's happening on the agent, I believe the best course\r\nof action at
this point is to extend the timeout and monitor
the\r\nresults.\r\n\r\nReports of this error:\r\ncloses
https://github.com/elastic/kibana/issues/168427\r\ncloses
https://github.com/elastic/kibana/issues/168394\r\ncloses
https://github.com/elastic/kibana/issues/168393\r\ncloses
https://github.com/elastic/kibana/issues/168390\r\ncloses
https://github.com/elastic/kibana/issues/168363\r\ncloses
https://github.com/elastic/kibana/issues/168362\r\ncloses
https://github.com/elastic/kibana/issues/168361\r\ncloses
https://github.com/elastic/kibana/issues/168360\r\ncloses
https://github.com/elastic/kibana/issues/168359\r\n\r\nAffected CI
runs:\r\nhttps://buildkite.com/elastic/kibana-on-merge/builds/36483\r\nhttps://buildkite.com/elastic/kibana-on-merge/builds/36497\r\nhttps://buildkite.com/elastic/kibana-on-merge/builds/36501\r\nhttps://buildkite.com/elastic/kibana-on-merge/builds/36526\r\n\r\nAnother
time out happens from time to time when previously set 10\r\nminutes
timeout on `createEndpointHost` task is not enough to set up
the\r\nenvironment. Its portrayed below, timeout happens during agent
setup\r\n```\r\n  | default: Running: inline script\r\n  | default:
Reading package lists...\r\n  | default: Building dependency
tree...\r\n  | default: Reading state information...\r\n  | default:
Suggested packages:\r\n  | default: zip\r\n  | default: The following
NEW packages will be installed:\r\n  | default: unzip\r\n  | default: 0
upgraded, 1 newly installed, 0 to remove and 0 not upgraded.\r\n  |
default: Need to get 174 kB of archives.\r\n  | default: After this
operation, 385 kB of additional disk space will be used.\r\n  | default:
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 unzip
amd64 6.0-26ubuntu3.1 [174 kB]\r\n  | default: dpkg-preconfigure: unable
to re-open stdin: No such file or directory\r\n  | default: Fetched 174
kB in 1s (210 kB/s)\r\n  | default: Selecting previously unselected
package unzip.\r\n  | (Reading database ... 63961 files and directories
currently installed.)\r\n  | default: Preparing to unpack
.../unzip_6.0-26ubuntu3.1_amd64.deb ...\r\n  | default: Unpacking unzip
(6.0-26ubuntu3.1) ...\r\n  | default: Setting up unzip (6.0-26ubuntu3.1)
...\r\n  | default: Processing triggers for man-db (2.10.2-1) ...\r\n  |
 \r\n  | CypressError: `cy.task('createEndpointHost')` timed out after
waiting
`600000ms`.\r\n```","sha":"91cdbe2d354100683b5d8670de88e0b2cf665ba9"}}]}]
BACKPORT-->

Co-authored-by: Konrad Szwarc <konrad.szwarc@elastic.co>
Kibana Machine 2023-10-11 11:58:07 -04:00 committed by GitHub
parent f12738f365
commit 356349b02b
2 changed files with 2 additions and 2 deletions

@@ -17,6 +17,6 @@ export const createEndpointHost = (
 {
 agentPolicyId,
 },
-{ timeout: timeout ?? 600000 }
+{ timeout: timeout ?? 900000 } // 15 minutes, since setup can take 10 minutes and more. Task will time out if is not resolved within this time.
 );
 };

@@ -335,7 +335,7 @@ const enrollHostWithFleet = async ({
 ]);
 }
 log.info(`Waiting for Agent to check-in with Fleet`);
-const agent = await waitForHostToEnroll(kbnClient, vmName, 120000);
+const agent = await waitForHostToEnroll(kbnClient, vmName, 240000);
 return {
 agentId: agent.id,
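
The millisecond literals in the two hunks above are easier to check against the commit message when written in minutes. The following is a minimal sketch only, not code from this commit: `MINUTE_MS`, `minutesToMs`, `resolveTaskTimeout`, and the constant names are hypothetical, and only the numeric values are taken from the diff. It also illustrates how the `timeout ?? 900000` fallback behaves.

```typescript
// Hypothetical helper names; only the millisecond values come from the diff.
const MINUTE_MS = 60 * 1000;

const minutesToMs = (minutes: number): number => minutes * MINUTE_MS;

// `cy.task('createEndpointHost')` timeout: raised from 10 to 15 minutes.
const OLD_TASK_TIMEOUT_MS = minutesToMs(10); // 600000
const NEW_TASK_TIMEOUT_MS = minutesToMs(15); // 900000

// Fleet check-in wait: raised from 2 to 4 minutes.
const OLD_CHECK_IN_TIMEOUT_MS = minutesToMs(2); // 120000
const NEW_CHECK_IN_TIMEOUT_MS = minutesToMs(4); // 240000

// Mirrors the `timeout ?? 900000` expression: the default applies only when
// the caller passes `undefined` or `null`, so an explicit 0 is kept as-is.
const resolveTaskTimeout = (timeout?: number | null): number =>
  timeout ?? NEW_TASK_TIMEOUT_MS;
```

Because `??` only falls back on `null` or `undefined`, a caller that passes its own value, including `0`, overrides the 15-minute default.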