Make redhat exhaustive test install command more robust (#17592)

We are seeing a frequent issue in CI where tests fail to install packages via
the package manager on rhel8. The issue appears to be caused by processes that
are terminated and leave behind lock files on the rpm database. This commit
updates the install method so that, when yum reports DB_VERSION_MISMATCH, it
removes the stale lock files, rebuilds the rpm database, and retries the
install once. The fix assumes that, at the time of testing, our tests take
priority over any other package manager operation, so our command can preempt
anything else that may be interacting with the package manager.
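
The retry-once flow described above can be exercised without a real host. The following is a minimal sketch, not the harness code: `Result`, `InstallError`, `FakeRunner`, and `install_with_rebuild` are hypothetical names introduced for illustration, and the stderr text is simulated rather than real yum output.

```ruby
# Hypothetical stand-in for the command result returned by sudo_exec!.
Result = Struct.new(:exit_status, :stderr)

# Hypothetical stand-in for InstallException.
class InstallError < StandardError; end

# Installs a package through `runner` (any object responding to `call` with a
# shell command string). On DB_VERSION_MISMATCH it removes the lock files,
# rebuilds the RPM database, and retries exactly once.
def install_with_rebuild(package, runner, retry_db_mismatch = true)
  cmd = runner.call("yum install -y #{package}")
  return cmd if cmd.exit_status == 0

  if retry_db_mismatch && cmd.stderr.to_s.include?("DB_VERSION_MISMATCH")
    runner.call("rm -f /var/lib/rpm/__db*")
    runner.call("rpm --rebuilddb")
    # Passing false prevents a second rebuild if the retry also fails.
    return install_with_rebuild(package, runner, false)
  end

  raise InstallError, cmd.stderr.to_s
end

# Fake runner: the first `failures` yum calls fail with a simulated mismatch
# error; every other command succeeds. It records each command it receives.
class FakeRunner
  attr_reader :commands

  def initialize(failures)
    @failures = failures
    @commands = []
  end

  def call(command)
    @commands << command
    if command.start_with?("yum") && @failures > 0
      @failures -= 1
      return Result.new(1, "simulated: DB_VERSION_MISMATCH")
    end
    Result.new(0, "")
  end
end

runner = FakeRunner.new(1)
install_with_rebuild("logstash", runner)
puts runner.commands
```

With one simulated failure the runner records the install, the two repair commands, and a second install. With two or more failures the second attempt raises instead of looping, which is the purpose of the `retry_db_mismatch` flag.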
Cas Donoghue 2025-04-29 11:18:13 -07:00 committed by GitHub
parent 8a71eb7de9
commit 01962f6b5d
No known key found for this signature in database
GPG key ID: B5690EEEBB952194

@@ -42,12 +42,21 @@ module ServiceTester
       end
     end
-    def install(package)
-      cmd = sudo_exec!("yum install -y #{package}")
+    def install(package, retry_db_mismatch = true)
+      cmd = sudo_exec!("yum install -y #{package}")
       if cmd.exit_status != 0
+        if retry_db_mismatch && cmd.stderr.to_s.include?("DB_VERSION_MISMATCH")
+          # There appears to be a race condition where lockfiles are left behind by
+          # processes that are not properly terminated. This can cause the RPM database to
+          # be in an inconsistent state. The solution is to remove and rebuild. See
+          # https://github.com/elastic/endgame-create-iso/pull/33 for example in our CI
+          puts "DB_VERSION_MISMATCH detected, fixing RPM database"
+          sudo_exec!("rm -f /var/lib/rpm/__db*")
+          sudo_exec!("rpm --rebuilddb")
+          return install(package, false)
+        end
         raise InstallException.new(cmd.stderr.to_s)
       end
     end
     def uninstall(package)