Skip to content

fix(ci): Execute newly added or modified scheduled_only tests in pre-submits#4339

Open
mesakhcienet wants to merge 1 commit into
AI-Hypercomputer:mainfrom
CIeNET-International:fix/scheduled-tests-ci
Open

fix(ci): Execute newly added or modified scheduled_only tests in pre-submits#4339
mesakhcienet wants to merge 1 commit into
AI-Hypercomputer:mainfrom
CIeNET-International:fix/scheduled-tests-ci

Conversation

@mesakhcienet

@mesakhcienet mesakhcienet commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

Description

This change makes the pre-submit execute the scheduled_only tests that a PR introduces or modifies, so each is verified at least once before merge, while unrelated scheduled_only tests remain nightly-only (unchanged).

How it works

  • A pytest collection hook in tests/conftest.py computes the set of tests a PR adds or modifies — via git diff --unified=0 origin/<base>...HEAD — and dynamically applies a new newly_added marker to the matching collected tests.
  • The two test-runner workflows change the pre-submit marker from <marker> and not scheduled_only to <marker> and (not scheduled_only or newly_added). Scheduled (nightly) runs are untouched and continue to run everything.

Detection details / robustness

  • Diff parsing lives in a small, stdlib-only, unit-tested module tests/utils/newly_added_detection.py (parse_changed_tests is pure; get_changed_tests wraps git). Keeping it out of conftest.py lets it be tested without importing absl/pytest/jax.
  • Matching is file-qualified(file_path, test_name) — so identically named tests in different files (e.g. test_basic_call appears in 6 files) do not cross-contaminate and pull unrelated scheduled_only tests into pre-submit.
  • .gitattributes sets *.py diff=python so git diff hunk headers expose the enclosing def test_* for edits inside class methods (the suite is overwhelmingly class-based). This enables detection of modified tests, not just brand-new ones.
  • The package-test checkout uses fetch-depth: 0 so the three-dot origin/main...HEAD diff has a reachable merge-base (otherwise the shallow clone forces a less precise fallback).
  • The base branch falls back to GITHUB_BASE_REF (then main), so PRs targeting a non-main branch are handled.

Files changed

File Change
tests/conftest.py Collection hook applies newly_added (file-qualified match)
tests/utils/newly_added_detection.py (new) Pure diff parser + git wrapper
tests/unit/newly_added_detection_test.py (new) Unit tests for the parser
.github/workflows/run_tests_against_package.yml Marker expression + fetch-depth: 0
.github/workflows/run_pathways_tests.yml Marker expression
.gitattributes (new) *.py diff=python
pytest.ini Register newly_added marker

Shortcomings / future work

  • Detection is a no-op when maxtext_installed=true (the runner does cd /deps, which is not a git work tree). Not on the current PR pre-submit path, but worth noting.
  • run_pathways_tests.yml checks out via a hardcoded git clone https://github.com/google/maxtext.git, which does not resolve fork build SHAs (pre-existing, out of scope here).
  • A change that only deletes lines from a test (no added lines) is not flagged as modified.

Tests

New unit tests in tests/unit/newly_added_detection_test.py (marked cpu_only) cover the diff parser across: new top-level function, new class method, modified method detected via the hunk header, cross-file name-collision isolation, non-test / src/ changes ignored, and multi-file diffs.

All commands run locally against the repo .venv (Python 3.12).

1. Unit tests

$ python3 -m pytest tests/unit/newly_added_detection_test.py -q
6 passed, 1 warning in 0.14s

2. Lint gate (same hooks as code_quality.yml)

$ pre-commit run --files tests/conftest.py tests/utils/newly_added_detection.py \
    tests/unit/newly_added_detection_test.py .gitattributes \
    .github/workflows/run_tests_against_package.yml pytest.ini
codespell....................Passed
pylint.......................Passed
pyink........................Passed
yamllint.....................Passed

3. End-to-end through the real conftest.py — a temporary scheduled_only probe test committed on the branch, collected through the actual tests/conftest.py:

# The new scheduled_only test IS rescued by the newly_added marker:
$ python3 -m pytest -m "cpu_only and (not scheduled_only or newly_added)" tests/unit/_probe_test.py -q
1 passed

# Baseline: under the old marker it would have been skipped:
$ python3 -m pytest -m "cpu_only and not scheduled_only" tests/unit/_probe_test.py -q
1 deselected

This confirms the full chain: git diff → file-qualified parse → newly_added marker → survives -m selection → the scheduled_only test runs. The .gitattributes change was separately verified to make git diff --unified=0 emit @@ ... @@ def test_column(...) headers for edits inside class methods (instead of the enclosing class ... line).

4. CI: the two modified workflows exercise this change on the actual TPU / GPU / CPU and Pathways pre-submit runners in this PR.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests tests and provided workload links above if applicable. — N/A: CI-infrastructure change with no ML workload; verified via unit tests, the lint gate, an end-to-end conftest probe (above), and the modified workflows running in this PR's pre-submit.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation. — No user-facing docs affected.

@codecov

codecov Bot commented Jul 2, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@@ -0,0 +1,192 @@
# Copyright 2025 Google LLC

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2026

@@ -0,0 +1,167 @@
# Copyright 2025 Google LLC

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2026

return found


def get_changed_tests(base_ref=None):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to detect both newly added tests and the modified tests, or just new tests? @shralex

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

detecting modified tests sounds great

Comment thread tests/conftest.py
changed_tests = get_changed_tests() # set[(file_path, test_name)]
if changed_tests:
for item in items:
item_file = item.nodeid.split("::", 1)[0]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

item_file = os.path.abspath(item.nodeid.split("::", 1)[0])

Comment thread tests/conftest.py
"external_training: goodput integrations",
"decoupled: marked on tests that are not skipped due to GCP deps, when DECOUPLE_GCLOUD=TRUE",
"skip_on_tpu7x: skip test if running on TPU7x platform",
"newly_added: newly introduced tests in PRs, executed even if scheduled_only",

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add "or modified"

stderr=subprocess.DEVNULL,
check=False,
)
try:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gemini proposes the following (more fallback strategies):

# Diff strategies to try in order of preference.
# Remote specs are tried first (for CI), falling back to local specs (for local development).
diff_strategies = [
    ["origin/{base}...HEAD"],
    ["origin/{base}", "HEAD"],
    ["{base}...HEAD"],
    ["{base}", "HEAD"],
]

diff_text = None
for strategy in diff_strategies:
    try:
        diff_text = subprocess.check_output(
            ["git", "diff", "--unified=0"] + [arg.format(base=base) for arg in strategy],
            text=True,
            stderr=subprocess.DEVNULL,
        )
        break
    except Exception: # pylint: disable=broad-exception-caught
        continue

if diff_text is None:
    return set()

for path, touched_lines in parse_changed_line_map(diff_text).items():
if not _is_test_file(path):
continue
try:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets convert to absolute path here and return that instead of path
# Resolve the absolute path to open the file and return in the set
full_path = os.path.abspath(os.path.join(repo_root, path))

except OSError:
continue
for name in touched_test_names(source, touched_lines):
changed.add((path, name))

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

full_path

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants