## Summary
Currently, to test something on Kibana's VM image, you have to build a
VM image to -qa, then rewrite all image references to
`elastic-images-qa` for the PR jobs; once testing is done, you have to
revert those references to `elastic-images-prod`.
With this change, we can test WIP VM images by adding a label to the
PR: the jobs will automatically grab the QA images, without any
temporary commits.
Jobs in https://buildkite.com/elastic/kibana-pull-request/builds/289599
have run with an elastic-qa image. ✅
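A minimal sketch of the label switch described above. The label name
`ci:use-qa-image` and both helper names are assumptions made for
illustration; the actual label and implementation may differ.

```typescript
// Hypothetical label name; the label actually used may differ.
const QA_IMAGE_LABEL = 'ci:use-qa-image';

const PROD_IMAGE_PROJECT = 'elastic-images-prod';
const QA_IMAGE_PROJECT = 'elastic-images-qa';

// Picks the image project for PR jobs based on the labels on the PR.
function resolveImageProject(prLabels: string[]): string {
  return prLabels.includes(QA_IMAGE_LABEL) ? QA_IMAGE_PROJECT : PROD_IMAGE_PROJECT;
}

// Rewrites every prod image reference in a pipeline definition before upload,
// which replaces the temporary commits that were needed previously.
function applyImageProject(pipelineYaml: string, prLabels: string[]): string {
  return pipelineYaml.split(PROD_IMAGE_PROJECT).join(resolveImageProject(prLabels));
}
```

Without the label the pipeline text is returned unchanged, so regular PRs
keep using the prod images.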
## Summary
According to:
https://buildkite.com/elastic/kibana-on-merge/builds/65027#0195ca29-b10a-4e20-b00f-c4fbe43689fa
```
Annotate test failures error Request failed with status code 404 AxiosError: Request failed with status code 404
--
| at settle (/opt/buildkite-agent/builds/bk-agent-prod-gcp-1742853500882456889/elastic/kibana-on-merge/kibana/.buildkite/node_modules/axios/lib/core/settle.js:19:12)
...
| at async /opt/buildkite-agent/builds/bk-agent-prod-gcp-1742853500882456889/elastic/kibana-on-merge/kibana/.buildkite/scripts/lifecycle/annotate_test_failures.ts:14:5
| HTTP Error 404/Not Found (https://api.buildkite.com/v2/organizations/elastic/pipelines/kibana-on-merge/builds/65027/artifacts?page=2&per_page=100) { message: 'Not Found' }
```
This points to the client, which collects all artifacts by traversing
the `next` links from Buildkite's API responses. It appears that Axios
is not happy about these absolute URLs, even if the origin is the same.
This PR adjusts the next-link parsing to relativize each link against a
base URL.
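A rough sketch of the relativization, assuming the pagination links
arrive in a `Link` response header and that the client's base URL is
`https://api.buildkite.com`. Both helper names are hypothetical and not
taken from the actual client code.

```typescript
// Extracts the rel="next" target from a Link header, if there is one.
function getNextLink(linkHeader: string | undefined): string | undefined {
  if (!linkHeader) return undefined;
  const pattern = /<([^>]+)>\s*;\s*rel="([^"]+)"/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(linkHeader)) !== null) {
    if (match[2] === 'next') return match[1];
  }
  return undefined;
}

// Turns an absolute URL into a path relative to baseUrl, so the HTTP client
// can resolve it against its own configured base URL. URLs that do not start
// with the base URL are returned untouched.
function relativizeUrl(absoluteUrl: string, baseUrl: string): string {
  const base = baseUrl.endsWith('/') ? baseUrl : baseUrl + '/';
  return absoluteUrl.startsWith(base) ? absoluteUrl.slice(base.length) : absoluteUrl;
}
```

The paging loop would then request `relativizeUrl(next, baseUrl)` instead
of the absolute `next` URL.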
## Summary
Extracts `collectEnvFromLabels` to a separate module, so it can be used
in the flaky test runner. With this, the label `ci:use-chrome-beta` will
be passed along to the flaky test runner, allowing for flaky testing on
chrome beta.
Other labels that we treat as modifiers for PR behavior by setting env
variables should also be added to this set of mappings.
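For illustration, the mapping could look roughly like this. Only the
label `ci:use-chrome-beta` and the function name come from this PR; the
signature, the mapping table and the env variable name `USE_CHROME_BETA`
are assumptions.

```typescript
// Hypothetical label-to-env mapping; the env variable name is an assumption.
const LABEL_ENV_MAPPING: Record<string, Record<string, string>> = {
  'ci:use-chrome-beta': { USE_CHROME_BETA: 'true' },
};

// Collects the env variables implied by the labels on a PR.
// Labels without a mapping entry are ignored.
function collectEnvFromLabels(labels: string[]): Record<string, string> {
  const env: Record<string, string> = {};
  for (const label of labels) {
    Object.assign(env, LABEL_ENV_MAPPING[label] ?? {});
  }
  return env;
}
```

Because the function lives in its own module, both the PR pipeline and
the flaky test runner can call it with the same label list.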
This reverts commit d8f6bd694b.
## Summary
Since this upgrade, we're getting 404 on failed test annotation.
Reverting this while we figure out what's causing it.
## Summary
closes https://github.com/elastic/kibana/issues/211592
This PR improves the way we run scout tests by discovering all the
plugins that have scout tests and running the tests in multiple workers:
<img width="1586" alt="image"
src="https://github.com/user-attachments/assets/4936ab50-fefb-470c-af3a-21263b58143f"
/>
How it works:
1. Run search script to find _all existing_ scout playwright config
files across kibana repo
2. Save results into `.scout/scout_playwright_configs.json` file, that
will be used as source to run configs in individual jobs per plugin.
Upload it as BK artifact.
3. Spin up job for each plugin mentioned in
`scout_playwright_configs.json`
4. In each individual job use `scout_playwright_configs.json` and get
configs for specific plugin, use worker with 8 vcpus where tests are run
in parallel (`usesParallelWorkers` prop)
While running configs one by one, collect each command's exit code with
the following rules:
- config run passes => exit code `0`, final status remains `0`
- config has no tests => exit code `2`, put config name into
`configsWithoutTests` group to push BK annotation later, change exit
status to `0` - we accept configs without tests at current stage
- config run fails => exit code `1`, final status changed to `1` and job
will fail in the end; put config name into `failedConfigs` group to push
BK annotation later
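The exit code rules above can be sketched as follows. The names
`RunSummary` and `summarizeRuns` are hypothetical, and treating any exit
code other than `0` or `2` as a failure is an assumption; the description
only names `0`, `1` and `2`.

```typescript
interface RunSummary {
  finalStatus: number;
  failedConfigs: string[];
  configsWithoutTests: string[];
}

// Folds per-config exit codes into the final job status.
function summarizeRuns(results: Array<{ config: string; exitCode: number }>): RunSummary {
  const summary: RunSummary = { finalStatus: 0, failedConfigs: [], configsWithoutTests: [] };
  for (const { config, exitCode } of results) {
    if (exitCode === 0) {
      continue; // passed, final status is left as is
    }
    if (exitCode === 2) {
      // no tests in this config: accepted for now, but reported in an annotation
      summary.configsWithoutTests.push(config);
    } else {
      // failure: the job fails in the end, but the remaining configs still run
      summary.finalStatus = 1;
      summary.failedConfigs.push(config);
    }
  }
  return summary;
}
```

The two lists are what the Buildkite annotations would be built from once
all configs have run.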
<img width="1564" alt="Screenshot 2025-02-21 at 14 34 16"
src="https://github.com/user-attachments/assets/06e9298d-466c-46bb-8e85-3d691a40850a"
/>
---------
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
## Summary
Similar to https://github.com/elastic/kibana/pull/195581
Adds a pipeline that builds Kibana and starts cloud deployment without
going through the CI test suites (as in normal pull-request pipeline
runs). It can be useful if a developer would like to save time/compute
on re-building/re-testing the whole project before deploying to the
cloud.
Added labels (`ci:cloud-deploy / ci:cloud-redeploy`) are required
similarly to the usual CI flow.
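A small sketch of the label requirement, assuming the PR labels are
available as a list of strings. The helper name `hasDeployLabel` is
hypothetical.

```typescript
// Labels that opt a PR into a cloud deployment, as named above.
const DEPLOY_LABELS = ['ci:cloud-deploy', 'ci:cloud-redeploy'];

// Returns true if at least one of the deploy labels is present on the PR.
function hasDeployLabel(prLabels: string[]): boolean {
  return prLabels.some((label) => DEPLOY_LABELS.includes(label));
}
```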
Related to: https://github.com/elastic/kibana-operations/issues/121
## Summary
In #195581 we added the option to deploy through the clickable
triggers, but in its current state it's broken in several respects.
(1) It's not starting on click. Triggering was resulting in a 422 on
Buildkite's side, and after digging more into it, this was the error:
<img width="1019" alt="Screenshot 2024-10-16 at 16 53 13"
src="https://github.com/user-attachments/assets/f602dde9-2cc4-474f-b432-a3d4f9d5ae91">
Apparently, building PRs needs to be enabled on jobs that want to be
triggered through the PR bot.
(2) It is set up to run regardless of the labels
(3) There's no feedback on runs
## Changes
This PR:
- enables buildability in the pipeline's config
- exits early if deploy labels are missing
- adds a comment on the PR if a deploy job is started or finished
- removes the kibana build step, it's not needed, as we have a step to
build the docker image
TODO:
- [x] Add feedback about a started job (either through a non-required
check, or a github comment)
- [x] Early exit if a label is missing
There are several other builds started right now, because of the logic
that triggers a build when a PR changes from draft to ready. To be fixed
in https://github.com/elastic/buildkite-pr-bot/issues/78
Tested manually by enabling the option on the UI, and triggering
through the checkbox:
https://buildkite.com/elastic/kibana-deploy-project-from-pr/builds/23
https://github.com/elastic/kibana/pull/194768 without the merge
conflicts.
Switches over to the org wide PR bot, with backwards compatibility for
both versions.
Updating the pipeline definition here sets environment variables
globally for all branches, so I intend to merge the backports first to
support both versions, and then proceed with this.
## Summary
Updated `js-yaml` to `4.1.0`.
This PR also introduces a type override for the `js-yaml` load function
to maintain compatibility across the codebase. Specifically, updated
type definition of the load function looks as follows:
```typescript
function load<T = any>(str: string, opts?: jsyaml.LoadOptions): T;
```
The return type of the load function in `js-yaml`'s original type
definition changed from `any` to `unknown`. This change would require
extensive type updates throughout the entire repository to accommodate
the `unknown` type. To avoid widespread type changes and potential
issues in the codebase, the type is overridden back to `any` for now.
This is a temporary measure; we plan to address the necessary type
changes in subsequent PRs, where teams will gradually update the
codebase to work with the `unknown` type.
### Checklist
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
## Release note
Updated `js-yaml` to `4.1.0`.
---------
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Maxim Palenov <maxim.palenov@elastic.co>
Updates our base version to 9.0.0
For reviewers: there are test skips in this pull request. Please assess
whether these failures should block merging as part of your review. If
not, we will track them in
https://github.com/elastic/kibana/issues/192624.
---------
Co-authored-by: Sebastián Zaffarano <sebastian.zaffarano@elastic.co>
## Summary
Updates usage of `js-yaml` `load` and `dump` to `safeLoad` and
`safeDump`, in preparation for a major version update of dependency,
where the default behavior will be that of the safe function variants.
## Note to reviewers
`safeDump` will throw if it encounters invalid types (e.g. `undefined`),
whereas the `dump` function will still write the file including the
invalid types. This may have an effect on your use cases if throwing is
not acceptable or is unhandled. To avoid this, the `skipInvalid` option
can be used (see
https://github.com/nodeca/js-yaml#dump-object---options-); this will
write the file, stripping out any invalid types from the input.
Please consider this when reviewing the changes to your code. If the
`skipInvalid` option is needed, please add it, or let us know to make
the change.
---------
Co-authored-by: Sid <siddharthmantri1@gmail.com>
Co-authored-by: “jeramysoucy” <jeramy.soucy@elastic.co>
Co-authored-by: Elena Shostak <elena.shostak@elastic.co>
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Maxim Palenov <maxim.palenov@elastic.co>
## Summary
Part of #186515
Split the FTR configs manifest into multiple files based on distro
(serverless/stateful) and area of testing (platform/solutions).
Update the CI scripts to support the change, without modifying their
logic.
More context:
With this change we will have a clear split of FTR test configs owned by
Platform and Solutions. It is a starting point for making configs
discoverable and our test pipelines flexible enough to run tests based
on distro/solution.
## Goal
We'd like to introduce a way to run pipelines that have a dependency on
the currently active branch set (managed in
[versions.json](./versions.json)).
With this, we'd like to migrate over the `es-forward` pipelines
(currently:
[this](https://buildkite.com/elastic/kibana-7-dot-17-es-8-dot-15-forward-compatibility),
and
[this](https://buildkite.com/elastic/kibana-7-dot-17-es-8-dot-14-forward-compatibility))
to the new buildkite infra.
## Summary
This PR introduces a new pipeline:
https://buildkite.com/elastic/kibana-trigger-version-dependent-jobs
(through
[trigger-version-dependent-jobs.yml](.buildkite/pipeline-resource-definitions/trigger-version-dependent-jobs.yml)).
The purpose of this new pipeline is to take the name of a "pipelineSet"
that refers to a pipeline, and based on the `versions.json` file, work
out the branches on which the referred pipeline should be triggered.
### Example: `Trigger ES forward compatibility tests`
- a scheduled run on
[kibana-trigger-version-dependent-jobs](https://buildkite.com/elastic/kibana-trigger-version-dependent-jobs)
with the env var `TRIGGER_PIPELINE_SET=es-forward` runs
- the pipeline implementation for
`kibana-trigger-version-dependent-jobs` works out (looking at
`versions.json`), that the `es-forward` set should trigger
https://buildkite.com/elastic/kibana-es-forward (doesn't exist prior to
the PR) for (7.17+8.14) and (7.17+8.15)
- the pipeline implementation uploads two trigger steps, running
https://buildkite.com/elastic/kibana-es-forward in two instances with
the relevant parameterization.
Since the trigger parameters are derived from the `versions.json` file,
if we move on and close `8.14`, and open up `8.16`, this will follow,
without having to update the pipeline resources or schedules.
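A minimal sketch of how the `es-forward` set could derive its trigger
steps. The `VersionEntry` shape, the function name and the env variable
`ES_TARGET_VERSION` are assumptions for illustration, not the actual
implementation; the pairing rule (previous major Kibana branch against
every current major ES version) is inferred from the (7.17+8.14) and
(7.17+8.15) example.

```typescript
// Assumed shape of an entry in versions.json; the real file may differ.
interface VersionEntry {
  version: string;
  branch: string;
  currentMajor?: boolean;
  previousMajor?: boolean;
}

// Minimal shape of a Buildkite trigger step.
interface TriggerStep {
  trigger: string;
  label: string;
  build: { branch: string; env: Record<string, string> };
}

// Builds one trigger step per (previous major Kibana branch, current major ES version) pair.
function getEsForwardTriggers(versions: VersionEntry[]): TriggerStep[] {
  const kibanaBranches = versions.filter((v) => v.previousMajor);
  const esTargets = versions.filter((v) => v.currentMajor);
  const steps: TriggerStep[] = [];
  for (const kibana of kibanaBranches) {
    for (const es of esTargets) {
      // "8.15.0" -> "8.15"
      const esMinor = es.version.split('.').slice(0, 2).join('.');
      steps.push({
        trigger: 'kibana-es-forward',
        label: `Kibana ${kibana.branch} + ES ${esMinor}`,
        // ES_TARGET_VERSION is a hypothetical parameter name
        build: { branch: kibana.branch, env: { ES_TARGET_VERSION: esMinor } },
      });
    }
  }
  return steps;
}
```

Since the steps are computed from the version list on every run, closing
a branch and opening a new one changes the generated steps without any
change to the pipeline resources.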
## Changes
- 2 pipelines created:
[trigger-version-dependent-jobs.yml](.buildkite/pipeline-resource-definitions/trigger-version-dependent-jobs.yml),
[kibana-es-forward.yml](.buildkite/pipeline-resource-definitions/kibana-es-forward.yml)
- [x] add kibana-es-forward.yml
- implementation for `trigger-version-dependent-jobs` added
- branch configuration removed from pipelines (kibana-artifacts-staging,
kibana-artifacts-snapshot, kibana-artifacts-trigger)
- added a script for checking RREs validity (moved a few files)
## Verification
I've used the migration staging pipeline (*) to run this:
-
https://buildkite.com/elastic/kibana-migration-pipeline-staging/builds/130
- Env: `TRIGGER_PIPELINE_SET="artifacts-trigger"`
- Result:
[(success):](https://buildkite.com/elastic/kibana-artifacts-trigger/builds/10806)
it triggered for 8.14 only (as expected)
-
https://buildkite.com/elastic/kibana-migration-pipeline-staging/builds/131
- Env: `TRIGGER_PIPELINE_SET="es-forward"`
- Result: (success): it generated 2 trigger steps, but since the
es-forward pipeline doesn't exist, the upload step failed
-
https://buildkite.com/elastic/kibana-migration-pipeline-staging/builds/132
- Env: `TRIGGER_PIPELINE_SET="artifacts-snapshot"`
- Result: (success): it triggered jobs for all 3 open branches
(main/8.14/7.17)
-
https://buildkite.com/elastic/kibana-migration-pipeline-staging/builds/134
- Env: `TRIGGER_PIPELINE_SET="artifacts-staging"`
- Result: (success): it triggered 8.14 / 7.17, but not for main
(*note: this migration staging pipeline will come in handy even after
the migration, to stage newly created pipelines without creating the
resource up-front)
## Summary
These were used for testing the migration from the kibana-buildkite
infra to the elastic-wide buildkite infra. Now that we're done with most
of the migration, we can clean these up.
## Summary
- Closes https://github.com/elastic/kibana-operations/issues/100
- Utilizes FIPS agent from elastic/ci-agent-images#686
- Adds dynamic agent selection during PR pipeline upload
- FIPS agents can be used with `FTR_ENABLE_FIPS_AGENT` env variable or
`ci:enable-fips-agent` label
- Removes agent image config from individual steps in favor of image
config for the whole pipeline.
- Steps can still override this config by adding `image`, `imageProject`
etc
- Adds a conditional assertion to `Check` CI step which validates that
FIPS is working properly
### Testing
- [Pipeline run using FIPS
agents](https://buildkite.com/elastic/kibana-pull-request/builds/215332)
- Failures are expected and this possibly ran with flaky tests
## Summary
Renames `SLACK_NOTIFICATIONS_ENABLED` =>
`ELASTIC_SLACK_NOTIFICATIONS_ENABLED` to follow up on elastic-wide
buildkite changes.
This should re-enable test failure listing on the Slack errors we post.
After migrating to gobld, the string identifying spot instances has
changed. This updates the check to determine if there are retries
available by filtering on the metadata for both gobld and
buildkite-agent-manager.
## Summary
Extends the flaky-test-runner with the capability to comment on the
flaky test runs on the PR that's being tested.
Closes: https://github.com/elastic/kibana/issues/173129
- chore(flaky-test-runner): Add a step to collect results and comment on
the tested PR
## Summary
While the daily coverage job is migrated
(https://buildkite.com/elastic/kibana-code-coverage-main/) to the new
infra, it will want to make use of the `pick_test_group_run_order` to
schedule jest tests. However, the generated pipeline steps would need
some adjustment on the new infra.
This PR adds a branching function that generates agent targeting rules
that work according to the serving infra.
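A rough sketch of such a branching function. The queue naming scheme
(`n2-4-spot`), the shape of the expanded rule and the `isNewInfra`
parameter are assumptions; the real function may detect the serving
infra differently and emit other fields.

```typescript
type AgentRule =
  | { queue: string }
  | { provider: 'gcp'; imageProject: string; machineType: string; preemptible: boolean };

// On the old infra agents are targeted by queue name; on the new infra the
// queue name is expanded into explicit machine settings.
function getAgentRule(queueName: string, isNewInfra: boolean): AgentRule {
  if (!isNewInfra) {
    return { queue: queueName };
  }
  // e.g. "n2-4-spot" -> kind "n2", cores "4", modifier "spot"
  const [kind, cores, modifier] = queueName.split('-');
  return {
    provider: 'gcp',
    imageProject: 'elastic-images-prod',
    machineType: `${kind}-standard-${cores}`,
    preemptible: modifier === 'spot',
  };
}
```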
Fixes the issue:
https://buildkite.com/elastic/kibana-pull-request/builds/201218
## Summary
Removes resource def and pipeline for:
https://buildkite.com/elastic/kibana-serverless-release
Adds a step after deploy tag creation to trigger a GPCTL release flow
for the commit hash - this step needs to be emitted because some of its
parameters are not available at the time of uploading the original
pipeline file.
- [x] Test without firing off a release
<img width="1250" alt="Screenshot 2024-02-23 at 11 43 08"
src="f8b2fd04-8b44-4beb-949e-5f45a3573bc5">
Closes https://github.com/elastic/kibana-operations/issues/59 - see this
for the full context.
---------
Co-authored-by: Thomas Watson <w@tson.dk>
## Summary
Uses the mechanisms from above to decide what recent commit passed all
three checks successfully (on-merge job, artifact build, FTR tests
containing that commit).
If `AUTO_SELECT_COMMIT` is set to `1/true`, the release candidate will
automatically be selected and tagged. This proceeds to QA automatically
afterward.
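The selection can be sketched like this, assuming the check results have
already been collected per commit and the list is ordered newest first.
`CommitStatus`, `isFlagSet` and `selectReleaseCandidate` are hypothetical
names.

```typescript
interface CommitStatus {
  sha: string;
  onMergePassed: boolean;
  artifactBuildPassed: boolean;
  ftrPassed: boolean;
}

// Treats "1" and "true" as enabled, anything else (including unset) as disabled.
function isFlagSet(value: string | undefined): boolean {
  return value === '1' || value === 'true';
}

// Returns the most recent commit that passed all three checks.
// `commits` is expected to be ordered newest first.
function selectReleaseCandidate(commits: CommitStatus[]): CommitStatus | undefined {
  return commits.find((c) => c.onMergePassed && c.artifactBuildPassed && c.ftrPassed);
}
```

With `isFlagSet(process.env.AUTO_SELECT_COMMIT)` true, the job would tag
the returned commit instead of waiting for a manual choice.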
Builds on #170655
Example run:
https://buildkite.com/elastic/kibana-serverless-release-1/builds/83
Closes: https://github.com/elastic/kibana-operations/issues/30
---------
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Thomas Watson <w@tson.dk>
Co-authored-by: Thomas Watson <watson@elastic.co>
## Summary
Connected to: https://github.com/elastic/kibana-operations/issues/18
Pre-requisite for:
https://github.com/elastic/kibana-operations/issues/30
You can test the current assistant from the branch:
https://buildkite.com/elastic/kibana-serverless-release-1/builds?branch=buildkite-job-for-deployment
- use `DRY_RUN=1` in the runtime params to not trigger an actual release
:)
This PR creates the contents of a Buildkite job to assist the Kibana
Serverless Release initiation process at the very beginning and lay some
groundwork for further additions to the release management.
At the end of the day, we would like to create a tag `deploy@<timestamp>`
which will be picked up by another job that listens to these tags:
https://buildkite.com/elastic/kibana-serverless-release. However,
several parts of the preparation for release require manual research,
collecting information about target releases, running scripts, etc.
Any further addition to what would be useful for someone wanting to
start a release could be contained here.
Furthermore, we could also trigger downstream jobs from here. e.g.:
https://buildkite.com/elastic/kibana-serverless-release is currently set
up to listen for a git tag, but we may as well just trigger the job
after we've created a tag.
Check out an example run at:
https://buildkite.com/elastic/kibana-serverless-release-1/builds/72
(visible only if you're a
member of @ elastic/kibana-release-operators)
Missing features compared to the git action:
- [x] Slack notification about the started deploy
- [x] full "useful links" section
Missing features:
- [x] there's a bit of useful context that should be integrated to the
display of the FTR results (*)
- [x] skip listing and analysis if a commit sha is passed in env
(*) - Currently, we display the next FTR test suite that ran after the
merge of the PR. However, the next FTR run that contains the changes,
and shows useful info related to the changeset, is ONLY the FTR run
after the first successful onMerge following the merge commit. Meaning:
if main is failing when the change is merged, an FTR suite won't pick up
the change right after.
---------
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Thomas Watson <w@tson.dk>
Co-authored-by: Thomas Watson <watson@elastic.co>
## Summary
Some post-build steps are failing because we sometimes outgrow the
Buildkite metadata limits.
This PR prevents the upload of a text block that is too big; instead, it
points towards the build for more info.
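A minimal sketch of the guard. The limit value and the function name are
assumptions; the actual limit should be taken from Buildkite's
documentation.

```typescript
// Hypothetical limit, in characters.
const MAX_BLOCK_LENGTH = 100000;

// Returns the text unchanged if it fits, otherwise a short pointer to the build.
function fitToLimit(text: string, buildUrl: string, maxLength: number = MAX_BLOCK_LENGTH): string {
  if (text.length <= maxLength) {
    return text;
  }
  return `The output is too large to display here, see ${buildUrl} for more info.`;
}
```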
Pipelines running triggered builds against older commits can create ci
stats with out of date metrics. When used as a reference point against
the branch HEAD, the number of selected groups may not accurately
reflect the current state.
## Summary
Allows for env-var controlled filtering of executed test suites in the
ES Serverless verification job, and adjusts existing flags and behaviour
to better-fit future usage.
We're changing the job to not publish the `latest-verified` tag by
default. This is to prevent external calls to the job from accidentally
promoting random tags to `latest-verified`, see changes below.
Flag changes:
- `PUBLISH_DOCKER_TAG`: if set to 1/true, passing runs will promote the
tested ES Serverless tag to `latest-verified`.
- (used to be default, now it requires this flag)
- `PUBLISH_MANIFEST`: if set to 1/true, passing runs will upload the
manifest attesting what (kibana + es) combination was used in the test
- (used to be called `UPLOAD_MANIFEST`)
- `SKIP_CYPRESS`: if set to 1/true, it will skip running the cypress
tests
- new flag
- `FTR_EXTRA_ARGS`: a string argument, if passed, it will be forwarded
verbatim to the FTR run script
- new flag, can be used to control the filtering required for #167611
(eg.: `FTR_EXTRA_ARGS="--include-tag=ml"`)
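The flags above could be read along these lines. `VerifyJobOptions` and
`readVerifyJobOptions` are hypothetical names, and treating an unset
`PUBLISH_MANIFEST` or `SKIP_CYPRESS` as disabled is an assumption.

```typescript
interface VerifyJobOptions {
  publishDockerTag: boolean;
  publishManifest: boolean;
  skipCypress: boolean;
  ftrExtraArgs: string;
}

// Treats "1" and "true" as enabled, anything else (including unset) as disabled.
function isEnabled(value: string | undefined): boolean {
  return value === '1' || value === 'true';
}

// Reads the job flags from an env-like object, e.g. process.env.
function readVerifyJobOptions(env: Record<string, string | undefined>): VerifyJobOptions {
  return {
    publishDockerTag: isEnabled(env['PUBLISH_DOCKER_TAG']), // off unless requested
    publishManifest: isEnabled(env['PUBLISH_MANIFEST']),
    skipCypress: isEnabled(env['SKIP_CYPRESS']),
    ftrExtraArgs: env['FTR_EXTRA_ARGS'] ?? '', // forwarded verbatim
  };
}
```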
Example run:
https://buildkite.com/elastic/kibana-elasticsearch-serverless-verify-and-promote/builds/64#018b1a30-3360-4995-874a-864f18e104d5
Closes: #168376
## Summary
As of last year we stopped supporting FTR configs in the code coverage
buildkite job.
While investigating a flaky test, I noticed the file's presence in the
buildkite artifacts ui.
This pr drops that.
**Note to Reviewers:**
`.buildkite/scripts/steps/test/pick_test_group_run_order.sh` is
currently used in 4 places:
1. `.buildkite/pipelines/code_coverage/daily.yml` **this is where this
pr is concerned**
1. `.buildkite/pipelines/pull_request/base.yml`
1. `.buildkite/pipelines/on_merge.yml`
1. `.buildkite/pipelines/es_snapshots/verify.yml`
This change is small, but this file is shared, so we have to keep this
in mind.
---------
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
FTR groups on CI target a 40 minute runtime. In situations where tests
are updated or moved, and there's no prior data, we're occasionally
hitting the 60 minute timeout. This increases the timeout to 90
minutes.
## Dearest Reviewers 👋
I've been working on this branch with @mistic and @tylersmalley and
we're really confident in these changes. Additionally, this changes code
in nearly every package in the repo so we don't plan to wait for reviews
to get in before merging this. If you'd like to have a concern
addressed, please feel free to leave a review, but assuming that nobody
raises a blocker in the next 24 hours we plan to merge this EOD pacific
tomorrow, 12/22.
We'll be paying close attention to any issues this causes after merging
and work on getting those fixed ASAP. 🚀
---
The operations team is not confident that we'll have the time to achieve
what we originally set out to accomplish by moving to Bazel with the
time and resources we have available. We have also bought ourselves some
headroom with improvements to babel-register, optimizer caching, and
typescript project structure.
In order to make sure we deliver packages as quickly as possible (many
teams really want them), with a usable and familiar developer
experience, this PR removes Bazel for building packages in favor of
using the same JIT transpilation we use for plugins.
Additionally, packages now use `kbn_references` (again, just copying the
dx from plugins to packages).
Because of the complex relationships between packages/plugins and in
order to prepare ourselves for automatic dependency detection tools we
plan to use in the future, this PR also introduces a "TS Project Linter"
which will validate that every tsconfig.json file meets a few
requirements:
1. the chain of base config files extended by each config includes
`tsconfig.base.json` and not `tsconfig.json`
2. the `include` config is used, and not `files`
3. the `exclude` config includes `target/**/*`
4. the `outDir` compiler option is specified as `target/types`
5. none of these compiler options are specified: `declaration`,
`declarationMap`, `emitDeclarationOnly`, `skipLibCheck`, `target`,
`paths`
6. all references to other packages/plugins use their pkg id, ie:
```js
// valid
{
"kbn_references": ["@kbn/core"]
}
// not valid
{
"kbn_references": [{ "path": "../../../src/core/tsconfig.json" }]
}
```
7. only packages/plugins which are imported somewhere in the ts code are
listed in `kbn_references`
This linter is not only validating all of the tsconfig.json files, but
it also will fix these config files to deal with just about any
violation that can be produced. Just run `node scripts/ts_project_linter
--fix` locally to apply these fixes, or let CI take care of
automatically fixing things and pushing the changes to your PR.
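To illustrate the kind of checks involved, here is a simplified sketch
that covers only the requirements which can be verified from a single
parsed config (`include`/`files`, `exclude`, `outDir`, the disallowed
compiler options and the reference format). The `TsConfig` type and
`lintTsConfig` are hypothetical and not the linter's actual API; the
base-config chain and unused-reference checks need file system and
import analysis and are left out.

```typescript
interface TsConfig {
  include?: string[];
  files?: string[];
  exclude?: string[];
  compilerOptions?: Record<string, unknown>;
  kbn_references?: Array<string | { path: string }>;
}

const FORBIDDEN_COMPILER_OPTIONS = [
  'declaration',
  'declarationMap',
  'emitDeclarationOnly',
  'skipLibCheck',
  'target',
  'paths',
];

// Returns a list of human readable violations; an empty list means the config passes.
function lintTsConfig(config: TsConfig): string[] {
  const errors: string[] = [];
  if (!config.include || config.files) {
    errors.push('use `include`, not `files`');
  }
  if (!(config.exclude ?? []).includes('target/**/*')) {
    errors.push('`exclude` must include `target/**/*`');
  }
  const compilerOptions = config.compilerOptions ?? {};
  if (compilerOptions['outDir'] !== 'target/types') {
    errors.push('`outDir` must be `target/types`');
  }
  for (const option of FORBIDDEN_COMPILER_OPTIONS) {
    if (option in compilerOptions) {
      errors.push(`compiler option \`${option}\` must not be specified`);
    }
  }
  for (const ref of config.kbn_references ?? []) {
    if (typeof ref !== 'string') {
      errors.push(`reference to ${ref.path} must use the pkg id`);
    }
  }
  return errors;
}
```

A `--fix` mode would apply the corresponding correction for each
violation instead of only reporting it.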
> **Example:** [`64e93e5`
(#146212)](64e93e5806)
When I merged main into my PR it included a change which removed the
`@kbn/core-injected-metadata-browser` package. After resolving the
conflicts I missed a few tsconfig files which included references to the
now removed package. The TS Project Linter identified that these
references were removed from the code and pushed a change to the PR to
remove them from the tsconfig.json files.
## No bazel? Does that mean no packages??
Nope! We're still doing packages but we're pretty sure now that we won't
be using Bazel to accomplish the 'distributed caching' and 'change-based
tasks' portions of the packages project.
This PR actually makes packages much easier to work with and will be
followed up with the bundling benefits described by the original
packages RFC. Then we'll work on documentation and advocacy for using
packages for any and all new code.
We're pretty confident that implementing distributed caching and
change-based tasks will be necessary in the future, but because of
recent improvements in the repo we think we can live without them for
**at least** a year.
## Wait, there are still BUILD.bazel files in the repo
Yes, there are still three webpack bundles which are built by Bazel: the
`@kbn/ui-shared-deps-npm` DLL, `@kbn/ui-shared-deps-src` externals, and
the `@kbn/monaco` workers. These three webpack bundles are still created
during bootstrap and remotely cached using bazel. The next phase of this
project is to figure out how to get the package bundling features
described in the RFC with the current optimizer, and we expect these
bundles to go away then. Until then any package that is used in those
three bundles still needs to have a BUILD.bazel file so that they can be
referenced by the remaining webpack builds.
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>