Make redhat exhaustive test install command more robust (#17592)

We are seeing a frequent issue in CI where tests fail to install packages via the package manager on rhel8. The likely cause is processes that are terminated and leave lock files behind in the RPM database. This commit updates the install method so that, when `yum install` fails with DB_VERSION_MISMATCH, it removes the stale database files, rebuilds the RPM database, and retries the install once. This fix assumes that at the time of testing our tests take priority over any other package manager operation, so our command can take precedence over anything else that may be interacting with the package manager.
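The retry-once behavior described above can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `Result`, `InstallError`, `install_with_rebuild`, and `FakeRunner` are hypothetical names, and the runner is a stub that simulates a single DB_VERSION_MISMATCH failure instead of calling yum.

```ruby
# Hypothetical stand-in for the object returned by sudo_exec! in the diff.
Result = Struct.new(:exit_status, :stderr)

# Hypothetical error class, standing in for InstallException.
class InstallError < StandardError; end

# runner is any callable that takes a command string and returns a Result.
def install_with_rebuild(package, runner, retry_db_mismatch = true)
  result = runner.call("yum install -y #{package}")
  return result if result.exit_status == 0

  if retry_db_mismatch && result.stderr.to_s.include?("DB_VERSION_MISMATCH")
    # Remove stale lock files, rebuild the database, then retry exactly once.
    runner.call("rm -f /var/lib/rpm/__db*")
    runner.call("rpm --rebuilddb")
    return install_with_rebuild(package, runner, false)
  end

  raise InstallError, result.stderr.to_s
end

# Stub runner (assumption for illustration): the first install attempt fails
# with a simulated mismatch, every other command succeeds.
class FakeRunner
  attr_reader :commands

  def initialize
    @commands = []
    @install_attempts = 0
  end

  def call(cmd)
    @commands << cmd
    if cmd.start_with?("yum install")
      @install_attempts += 1
      return Result.new(1, "DB_VERSION_MISMATCH") if @install_attempts == 1
    end
    Result.new(0, "")
  end
end

runner = FakeRunner.new
install_with_rebuild("example-package", runner)
puts runner.commands
```

Passing `false` on the recursive call bounds the recovery to one attempt, so a database that is still broken after the rebuild surfaces as an error rather than looping.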
2025-06-27 17:08:55 -04:00 · 2025-04-29 11:18:13 -07:00 · 2025-04-29 11:18:13 -07:00 · 01962f6b5d
commit 01962f6b5d
parent 8a71eb7de9
1 changed file with 12 additions and 3 deletions
--- a/qa/rspec/commands/redhat.rb
+++ b/qa/rspec/commands/redhat.rb
@@ -42,12 +42,21 @@ module ServiceTester
      end
    end

-    def install(package)
-      cmd = sudo_exec!("yum install -y  #{package}")
+    def install(package, retry_db_mismatch = true)
+      cmd = sudo_exec!("yum install -y #{package}")
      if cmd.exit_status != 0
+        if retry_db_mismatch && cmd.stderr.to_s.include?("DB_VERSION_MISMATCH")
+          # There appears to be a race condition where lockfiles are left behind by
+          # processes that are not properly terminated. This can cause the RPM database to
+          # be in an inconsistent state. The solution is to remove and rebuild. See
+          # https://github.com/elastic/endgame-create-iso/pull/33 for example in our CI
+          puts "DB_VERSION_MISMATCH detected, fixing RPM database"
+          sudo_exec!("rm -f /var/lib/rpm/__db*")
+          sudo_exec!("rpm --rebuilddb")
+          return install(package, false)
+        end
        raise InstallException.new(cmd.stderr.to_s)
      end
-
    end

    def uninstall(package)