Parallel tests by Johan-Liebert1 · Pull Request #2190 · bootc-dev/bootc

Johan-Liebert1 · 2026-05-08T08:36:40Z

No description provided.

gemini-code-assist

Code Review

This pull request refactors the tmt test runner to support parallel execution of test plans using std::thread. It introduces a RunPlanResult struct and extracts the core test logic into a run_plan function. The review feedback identifies several opportunities for improvement, including addressing interleaved console output from concurrent threads, optimizing the thread-joining logic to avoid waiting for entire batches, handling potential panics when querying system parallelism, and removing redundant clones of owned objects.
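One way to address the interleaved console output mentioned in the review summary is to capture each plan's output and emit it as a single block while holding a lock on the output handle. The following is a minimal sketch of that idea, not the code in this PR; the function name `write_plan_block` and the header format are assumptions made for illustration.

```rust
use std::io::{self, Write};

/// Write one plan's captured output as a single contiguous block.
/// The caller is expected to pass a locked handle (for example
/// `io::stdout().lock()`), so other threads cannot interleave lines
/// inside the block. `write_plan_block` is a hypothetical helper.
fn write_plan_block(out: &mut impl Write, plan: &str, captured: &str) -> io::Result<()> {
    writeln!(out, "===== {plan} =====")?;
    out.write_all(captured.as_bytes())?;
    // Make sure the block ends with a newline so the next header starts cleanly.
    if !captured.ends_with('\n') {
        writeln!(out)?;
    }
    out.flush()
}
```

A worker thread would collect its plan's output into a `String` and call this once at the end, for example with `&mut io::stdout().lock()` as the writer.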

cgwalters

Without any kind of deeper review, I think what we really want to do is upstream into https://github.com/teemtee/tmt support for bcvk.

I believe it already has support for concurrency (I mean I'd hope) and such - and that would make it a lot more sustainable for other projects to use tmt+bcvk (and we have many in the ecosystem that would make sense to do so)

Johan-Liebert1 · 2026-05-08T13:09:05Z

One or two test failures, and not as much of a speedup as I saw locally. The GitHub Actions (GA) runners seem to have only 4 CPUs.

cgwalters · 2026-05-08T15:28:53Z

Yes, we can bump to larger runners though. I was experimenting with that previously in bootc-dev/ci-sandbox#1 but it's been a while

Johan-Liebert1 · 2026-05-11T12:41:45Z

Alright, so the composefs integration tests now take ~30-40 min instead of ~1 hr to 1 hr 30 min.

Tests are still quite inconsistent, though. The GC one in particular randomly takes over 1000s to complete; I don't know why yet. Needs some investigation.

Johan-Liebert1 · 2026-05-11T12:47:38Z

The ostree one takes ~1 hr instead of ~2 hr 15 min.

Some tests randomly take far too long, though:

/tmt/plans/integration/plan-32-multi-device-esp: PASSED (1140.54328774s)

I'll try to debug this

`bootc status` for UKIs takes up to 250 MB of memory because we load the entire UKI into memory just to extract the cmdline. In bootc-dev/bootc#2190, the UKI tests get OOM-killed.

Signed-off-by: Johan-Liebert1 <pragyanpoudyal41999@gmail.com>

Johan-Liebert1 · 2026-06-02T08:48:01Z

Interesting

content: error: 
    Building UKI: Computing composefs digest: 
        Reading container root: 
            Reading filtered filesystem: 
                Async reading filesystem from .: 
                    Scanning directory .: 
                        Scanning inode usr: Scanning directory usr: Scanning inode lib: Scanning directory lib: 
                            Getting file stats: Reading extended attributes: Operation not supported (os error 95)

I'll look into this one

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>

Signed-off-by: Johan-Liebert1 <pragyanpoudyal41999@gmail.com>

So we don't spawn VMs for tests like "composefs-gc" just to do nothing and exit

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>

Sort tests in descending order of time taken for completion so longer tests get scheduled together. Also, update to use mpsc channels for communication between threads

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>
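The scheduling described in this commit message could look roughly like the sketch below: plans are sorted by expected duration (descending), worker threads pull from a shared queue, and each result is sent back over an `mpsc` channel as soon as it is ready. This is an illustration under assumptions, not the PR's code: `PlanResult`, `run_longest_first`, the shared `VecDeque`, and the `run` callback are all hypothetical names and design choices.

```rust
use std::collections::VecDeque;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for the PR's per-plan result type.
#[derive(Debug)]
struct PlanResult {
    plan_name: String,
    passed: bool,
}

/// Run `plans` on `workers` threads, starting the longest expected plan first.
/// `run` is a placeholder for whatever executes a single plan.
fn run_longest_first<F>(
    mut plans: Vec<(String, Duration)>,
    workers: usize,
    run: F,
) -> Vec<PlanResult>
where
    F: Fn(&str) -> bool + Send + Sync + 'static,
{
    // Sort descending by expected duration so long plans start early.
    plans.sort_by(|a, b| b.1.cmp(&a.1));
    let queue: VecDeque<String> = plans.into_iter().map(|(name, _)| name).collect();
    let queue = Arc::new(Mutex::new(queue));
    let run = Arc::new(run);
    let (tx, rx) = mpsc::channel();

    let mut handles = Vec::new();
    for _ in 0..workers.max(1) {
        let queue = Arc::clone(&queue);
        let run = Arc::clone(&run);
        let tx = tx.clone();
        handles.push(thread::spawn(move || loop {
            // The lock is held only while popping, never while a plan runs.
            let next = queue.lock().unwrap().pop_front();
            let Some(plan_name) = next else { break };
            let passed = (*run)(&plan_name);
            // A send error only means the receiver is gone; nothing to do then.
            let _ = tx.send(PlanResult { plan_name, passed });
        }));
    }
    // Drop the original sender so the receiver ends once all workers finish.
    drop(tx);

    let results: Vec<PlanResult> = rx.iter().collect();
    for handle in handles {
        let _ = handle.join();
    }
    results
}
```

Because each worker takes the next plan as soon as it is free, no thread waits for a whole batch to finish, which also covers the thread-joining concern raised in the automated review.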

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>

Use a Mutex guard for VM creation as CI sometimes is failing with

```
Error:
   0: Failed to create libvirt domain
   1: Failed to start libvirt domain: error: Failed to start domain 'bootc-..'
      error: failed to create directory '/run/user/1001/libvirt/qemu/run/swtpm': File exists
```

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>

To reduce CI failures due to flakes, rerun failed tests

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>

Johan-Liebert1 · 2026-06-18T08:37:29Z

I think this one's ready for review now

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>

cgwalters · 2026-06-22T20:19:36Z

    // Log disk usage after each test run to help diagnose "no space left on device" failures
    println!("Disk usage after plan {}:", plan);
    let _ = cmd!(sh, "df -h").run();
+    let _ = cmd!(sh, "du -h /var/* -d 1").run();


df is a cheap operation, this one is not; I don't think we should do it by default.

Only added this for testing. I'll remove it

cgwalters · 2026-06-22T20:21:33Z

-            } else {
-                println!("Run ID not available - cannot generate detailed report");
-            }
+    let rerun_suffix = format!("{}-rerun", random_suffix);


Shouldn't this be part of tmt?

I am really uncertain about doing this retry by default too at this level.

Ideally, we basically have retries around all network operations. Everything else - we really do want to drive flakes to zero.

I sometimes see tests fail in CI because the VM never comes back up after a reboot. This should hopefully catch that. I'm not sure how tmt manages VMs, but in that case we would have to create a new VM.
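A middle ground between the two positions in this thread might be to rerun a failed plan once but report "passed on rerun" as its own outcome, so flakes stay visible instead of being silently absorbed. The sketch below only illustrates that idea; `Outcome`, `run_with_one_rerun`, and the `run` callback are hypothetical and do not come from the PR.

```rust
/// Outcome of a plan after at most one rerun (hypothetical type).
#[derive(Debug, PartialEq)]
enum Outcome {
    Passed,
    /// Failed once, then passed: worth reporting as a flake.
    PassedOnRerun,
    Failed,
}

/// Run a plan, retrying exactly once on failure.
/// `run` receives the plan name and the attempt number (1 or 2); it is a
/// placeholder for whatever launches a fresh VM and executes the plan.
fn run_with_one_rerun(plan: &str, mut run: impl FnMut(&str, u32) -> bool) -> Outcome {
    if run(plan, 1) {
        return Outcome::Passed;
    }
    eprintln!("plan {plan} failed on the first attempt; rerunning once");
    if run(plan, 2) {
        Outcome::PassedOnRerun
    } else {
        Outcome::Failed
    }
}
```

A CI summary could then list `PassedOnRerun` plans separately, which keeps the goal of driving flakes to zero measurable.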

cgwalters · 2026-06-22T20:22:30Z

    // Launch VM with bcvk
    let firmware_args_slice = firmware_args.as_slice();
+
+    let guard = match libvirt_lock.lock() {


error: failed to create directory '/run/user/1001/libvirt/qemu/run/swtpm': File exists

That's a bug in libvirt that we're just working around, right?

If we have to do a workaround, I think it'd be better in bcvk.

I'm not sure this would be considered a bug. It only fails intermittently, so I assumed it was a race condition.
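For reference, serializing the VM-creation step behind a mutex can be sketched as below. This is not the PR's code: `with_creation_lock` is a hypothetical helper, and treating a poisoned lock as recoverable is an assumption that only holds because the mutex guards no data.

```rust
use std::sync::{Mutex, PoisonError};

/// Run `create` while holding `lock`, so only one thread at a time
/// executes the VM-creation step. `with_creation_lock` is a hypothetical
/// helper name used for illustration.
fn with_creation_lock<T>(lock: &Mutex<()>, create: impl FnOnce() -> T) -> T {
    // A poisoned lock only means another thread panicked while holding it.
    // The mutex protects no data (unit payload), so continuing is safe.
    let _guard = lock.lock().unwrap_or_else(PoisonError::into_inner);
    create()
    // `_guard` is dropped here, releasing the lock for the next thread.
}
```

Only the creation step runs under the lock; the tests themselves still execute in parallel once each VM is up. As noted in the review, the same workaround may fit better inside bcvk.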

cgwalters · 2026-06-22T20:22:57Z

 def main [] {
    tap begin "bootc-image-builder qcow2 build test"

+    print ">>>>>>>>>>>>>>>>>> the current working directory <<<<<<<<<<<<<<<<<" ($env.PWD)


This seems...verbose

cgwalters · 2026-06-22T20:23:47Z

 const DISTRO_CENTOS_9: &str = "centos-9";

+// Tests sorted by time taken (descending)
+const TESTS_SORTED_BY_TIME: [&str; 23] = [


Not a fan of duplicating all of our test names, this will get out of date quickly.

Yeah... would you prefer adding a key such as `extra-time-taken` (or something similar) so we can sort on that key instead?
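If the duration hint lived in each plan's metadata, the hard-coded list of test names would no longer be needed. A minimal sketch of sorting on such a key follows; `PlanMeta`, `expected_secs`, and `sort_by_expected_time` are hypothetical names, and how the value would be parsed from the plan files is left out.

```rust
use std::cmp::Reverse;

/// Hypothetical per-plan metadata with an optional duration hint.
struct PlanMeta {
    name: String,
    /// Expected run time in seconds, if the plan declares one.
    expected_secs: Option<u64>,
}

/// Sort plans so the longest expected plan comes first.
/// `None` compares lower than any `Some(_)`, so with `Reverse`
/// plans without a hint end up last. The sort is stable, so plans
/// with equal hints keep their original relative order.
fn sort_by_expected_time(plans: &mut [PlanMeta]) {
    plans.sort_by_key(|p| Reverse(p.expected_secs));
}
```

New plans that omit the key still run; they are simply scheduled after the plans with a known duration.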

cgwalters · 2026-06-22T20:24:11Z


+/// For tests that should only run for composefs systems
+/// Ex. composefs-gc
+const FIELD_SKIP_IF_OSTREE: &str = "skip_if_ostree";


Yep, I like it
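The effect of the `skip_if_ostree` field can be sketched as a filter applied before any VM is spawned. The `Plan` struct and `plans_to_run` function below are hypothetical; only the field name comes from the diff above.

```rust
/// Hypothetical plan description; only `skip_if_ostree` mirrors the diff.
struct Plan {
    name: String,
    skip_if_ostree: bool,
}

/// Drop plans marked `skip_if_ostree` when the target system is ostree-based,
/// so no VM is started just for the test to exit immediately.
fn plans_to_run(plans: Vec<Plan>, is_ostree: bool) -> Vec<Plan> {
    plans
        .into_iter()
        .filter(|p| !(is_ostree && p.skip_if_ostree))
        .collect()
}
```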

cgwalters · 2026-06-22T20:24:47Z

 struct RunPlanResult {
    plan_name: String,
    passed: bool,
+    time_taken: Option<Duration>,


I guess, but isn't this part of tmt too?

cgwalters · 2026-06-22T20:25:37Z


+    let mut handles: Vec<JoinHandle<RunPlanResult>> = vec![];
+
+    let num_cpu = std::thread::available_parallelism()


Doesn't tmt support parallelism? I think eventually the right thing to do here is to move the bcvk stuff as a first-class op in tmt.

But I don't object at all to doing it here in the interim.
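On the `available_parallelism` call shown in the diff: it returns an `io::Result<NonZeroUsize>`, so unwrapping it can panic on platforms where the query fails. A small sketch of a non-panicking fallback is below; `worker_count` and the choice of falling back to a single worker are assumptions, not what the PR does.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Number of worker threads to use for running plans concurrently.
/// Falls back to one worker if the parallelism query fails, and never
/// returns more workers than there are plans. `worker_count` is a
/// hypothetical helper name.
fn worker_count(plan_count: usize) -> usize {
    let cpus = thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1);
    // Cap at the number of plans, but always return at least one.
    cpus.min(plan_count).max(1)
}
```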

Johan-Liebert1 added the ci/merge label ("Run full CI suite (all OSes), equivalent to merge queue") May 8, 2026

bootc-bot Bot requested a review from cgwalters May 8, 2026 08:36

gemini-code-assist Bot reviewed May 8, 2026

Comment thread crates/xtask/src/tmt.rs

Comment thread crates/xtask/src/tmt.rs Outdated

Johan-Liebert1 marked this pull request as draft May 8, 2026 08:48

Johan-Liebert1 force-pushed the parallel-tests branch from 122ad1a to 93f4abf May 8, 2026 09:38

cgwalters reviewed May 8, 2026

Comment thread .github/workflows/ci.yml Outdated

Comment thread crates/xtask/src/tmt.rs Outdated

Johan-Liebert1 force-pushed the parallel-tests branch from 93f4abf to f2bbb18 May 8, 2026 13:05

Johan-Liebert1 added the ci/tier-1 label ("Run CI for tier-1 OS (centos-10) only") and removed the ci/merge label ("Run full CI suite (all OSes), equivalent to merge queue") May 9, 2026

Johan-Liebert1 force-pushed the parallel-tests branch 3 times, most recently from 5230a9f to a98f50a May 11, 2026 10:41

Johan-Liebert1 added the ci/merge label ("Run full CI suite (all OSes), equivalent to merge queue") and removed the ci/tier-1 label ("Run CI for tier-1 OS (centos-10) only") May 11, 2026

Johan-Liebert1 force-pushed the parallel-tests branch from a98f50a to cedbd85 May 15, 2026 10:59

cgwalters reviewed May 18, 2026

Comment thread crates/xtask/src/tmt.rs Outdated

Johan-Liebert1 force-pushed the parallel-tests branch from 9e8d647 to a02028c May 19, 2026 09:51

Johan-Liebert1 mentioned this pull request May 25, 2026

Implement buffered readers for UKI composefs/composefs-rs#298

Merged

Johan-Liebert1 force-pushed the parallel-tests branch from a02028c to 2110c6c May 25, 2026 08:20

Johan-Liebert1 force-pushed the parallel-tests branch 2 times, most recently from ff859ce to e437f70 May 30, 2026 09:40

Johan-Liebert1 added the ci/tier-1 label ("Run CI for tier-1 OS (centos-10) only") and removed the ci/merge label ("Run full CI suite (all OSes), equivalent to merge queue") May 30, 2026

Johan-Liebert1 force-pushed the parallel-tests branch 4 times, most recently from 00e2ca9 to d514d82 June 1, 2026 08:25

Johan-Liebert1 force-pushed the parallel-tests branch from 5f10779 to b0d220c June 8, 2026 05:14

Johan-Liebert1 mentioned this pull request Jun 8, 2026

OverlayFS throws EOPNOTSUPP for listxattr until copy-up is performed composefs/composefs-rs#311

Closed

Johan-Liebert1 force-pushed the parallel-tests branch from b0d220c to 8ebd2af June 8, 2026 09:57

Johan-Liebert1 added 6 commits June 17, 2026 15:50

tmt: Test in parallel

a98093c

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>

xtask: Compute time taken by each test

9eb8675

Signed-off-by: Johan-Liebert1 <pragyanpoudyal41999@gmail.com>

tmt: Introduce skip_for_ostree

bfe14e6

So we don't spawn VMs for tests like "composefs-gc" just to do nothing and exit

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>

tmt/test: Sort tests by time taken, use mpsc channels

e5742e6

Sort tests in descending order of time taken for completion so longer tests get scheduled together. Also, update to use mpsc channels for communication between threads

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>

tmt/test/bib-build: Save disk image in /var/output

78bf20a

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>

Johan-Liebert1 force-pushed the parallel-tests branch from 8ebd2af to 2452901 June 17, 2026 10:32

tmt: Rerun failed tests

bdb74ec

To reduce CI failures due to flakes, rerun failed tests

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>

Johan-Liebert1 force-pushed the parallel-tests branch from 2452901 to bdb74ec June 18, 2026 05:43

Johan-Liebert1 marked this pull request as ready for review June 18, 2026 08:37

Johan-Liebert1 requested a review from cgwalters June 18, 2026 08:37

bootc-bot Bot requested a review from henrywang June 18, 2026 08:37

tmt: Add du cmd to check disk usage

4d064c6

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>

Johan-Liebert1 force-pushed the parallel-tests branch from 6a5c236 to 4d064c6 June 22, 2026 10:09

cgwalters requested changes Jun 22, 2026
