Skip to content

contour: batch dask chunk compute to bound client memory#3334

Merged
brendancol merged 2 commits into
xarray-contrib:mainfrom
Melissari1997:deep-sweep-performance-contour-01
Jun 15, 2026
Merged

contour: batch dask chunk compute to bound client memory#3334
brendancol merged 2 commits into
xarray-contrib:mainfrom
Melissari1997:deep-sweep-performance-contour-01

Conversation

@Melissari1997

Copy link
Copy Markdown
Collaborator

Summary

Closes #3333.

The dask and dask+cupy contours() backends previously submitted every delayed chunk task in a single dask.compute(*all_results) call. For large dask-backed rasters this materializes all chunk contour results in the client process at once and creates a large single synchronization point for the scheduler. Peak memory therefore scaled with total contour complexity rather than with chunk size.

This change computes chunk results in batches of _DASK_COMPUTE_BATCH_SIZE (default 64). Batching bounds the amount of intermediate chunk geometry held in memory at one time and reduces scheduler pressure, while preserving the same final deduplication and stitching behavior.

Changes

  • Added a private _DASK_COMPUTE_BATCH_SIZE constant.
  • Consolidated _contours_dask and _contours_dask_cupy into a shared _contours_dask_generic helper so the batching logic is not duplicated.
  • Added TestDaskBatchedCompute::test_dask_compute_called_in_batches to verify that compute is called in multiple batches when the chunk count exceeds the batch size, and that the batched result still matches the numpy backend.

Testing

  • python -m pytest xrspatial/tests/test_contour.py -q passes (61 passed, 30 skipped).
  • The new test specifically monkey-patches the batch size to 3 for a 16-chunk raster and confirms 6 dask.compute calls.

Notes

This is a targeted fix for the HIGH-severity materialization finding from the 2026-06-15 performance sweep against xrspatial/contour.py. The MEDIUM Python-loop-over-chunks finding is related but left as-is to keep the change surgical; future work could vectorize the offset/graph construction if needed.

…rib#3333)

The dask and dask+cupy backends previously submitted every delayed chunk
task in a single dask.compute(*all_results) call. For large rasters this
materializes all chunk contour results at once and creates a large single
synchronization point for the scheduler.

Compute chunk results in batches of _DASK_COMPUTE_BATCH_SIZE (default 64)
instead. This bounds the intermediate geometry held in client memory and
reduces scheduler pressure while preserving the same final deduplication
and stitching behavior.

Also consolidate the two nearly-identical dask backends into a shared
_contours_dask_generic helper so the batching logic is not duplicated.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label Jun 15, 2026
@Melissari1997

Copy link
Copy Markdown
Collaborator Author

PR Review: contour: batch dask chunk compute to bound client memory

Blockers (must fix before merge)

None found.

Suggestions (should fix, not blocking)

  • xrspatial/tests/test_contour.py:640 — The assertion len(compute_calls) == 6 mixes two concerns: the numpy contours() call (which should make zero calls) and the dask call (which should make 6). This is correct now because the test passes explicit levels, skipping the auto-level dask.compute. However, if a future change adds a dask.compute call to the numpy path (e.g., in validation), this test would break in a confusing way. Consider either (a) resetting compute_calls = [] between the two contours() calls, or (b) using separate counters for numpy vs dask paths.

  • benchmarks/benchmarks/ — No contour benchmark exists. This PR is a targeted performance fix, so a benchmark isn't strictly required. However, adding a simple contour benchmark (even a follow-up) would help prevent regressions in the batched-compute path.

Nits (optional improvements)

  • xrspatial/tests/test_contour.py:643 — The batch size and chunk count are chosen so the math works out exactly (16 chunks / batch 3 = 6 calls). If anyone edits either constant without updating the other, the test breaks. Using math.ceil(16 / 3) for the expected value would make the intent clearer.

  • xrspatial/contour.py:472 — No test exercises the path where _DASK_COMPUTE_BATCH_SIZE > number of chunks (single compute call). This is trivially correct but easy to add.

What looks good

  • The batching logic correctly bounds merged.extend() to one batch at a time rather than holding the full chunk_results list.
  • Consolidating _contours_dask and _contours_dask_cupy into _contours_dask_generic eliminates duplicated loop and overlap logic without changing the public API.
  • The new test verifies both mechanism (counting compute calls) and correctness (segment equality with numpy), giving confidence that the optimization doesn't change results.
  • The existing TestBackendEquivalence tests (numpy=dask, numpy=dask_cupy) now implicitly exercise the batched path as well.
  • No changes to the public API, dispatch registration, or the marching squares kernel — the change is cleanly scoped.

Checklist

  • Algorithm matches reference/paper (no algorithmic change)
  • All implemented backends produce consistent results
  • NaN handling is correct (unchanged)
  • Edge cases are covered by tests
  • Dask chunk boundaries handled correctly
  • No premature materialization or unnecessary copies
  • Benchmark exists or is not needed (existing gap, PR is surgical)
  • README feature matrix updated (not applicable — no new public function)
  • Docstrings present and accurate

…st for contour batching

- Reset compute_calls between numpy and dask contours() calls so the
  numpy path cannot pollute the batch count (even though the numpy path
  never calls dask.compute with explicit levels, this makes the test
  robust against future changes).
- Use math.ceil(num_chunks / batch_size) instead of hard-coding 6 so
  the expected count self-documents and won't silently break if
  constants change.
- Add test_single_batch_when_chunks_fit to verify a single dask.compute
  call when all chunks fit in one batch.
@Melissari1997

Copy link
Copy Markdown
Collaborator Author

PR Review: contour: batch dask chunk compute to bound client memory

Blockers (must fix before merge)

None found.

Suggestions (should fix, not blocking)

  • benchmarks/benchmarks/ — No contour benchmark exists. This PR is a targeted
    performance fix, so a benchmark isn't strictly required. Adding a simple
    contour benchmark (even a follow-up) would help prevent regressions in the
    batched-compute path.

Nits (optional improvements)

None.

What looks good

  • The batching logic correctly bounds merged.extend() to one batch at a time
    rather than holding the full chunk_results list.
  • Consolidating _contours_dask and _contours_dask_cupy into
    _contours_dask_generic eliminates duplicated loop and overlap logic without
    changing the public API.
  • test_dask_compute_called_in_batches now resets compute_calls between
    numpy and dask calls, uses math.ceil for the expected count, and includes
    a correctness assertion against the numpy backend.
  • New test_single_batch_when_chunks_fit covers the edge case where all chunks
    fit in one batch.
  • Existing TestBackendEquivalence tests implicitly exercise the batched path.
  • No changes to the public API, dispatch registration, or marching squares
    kernel — the change is cleanly scoped.

Checklist

  • Algorithm matches reference/paper (no algorithmic change)
  • All implemented backends produce consistent results
  • NaN handling is correct (unchanged)
  • Edge cases are covered by tests
  • Dask chunk boundaries handled correctly
  • No premature materialization or unnecessary copies
  • Benchmark exists or is not needed (existing gap, PR is surgical)
  • README feature matrix updated (not applicable — no new public function)
  • Docstrings present and accurate

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Melissari1997 Thanks for the fix and this should help with scaling

@brendancol brendancol merged commit 2212eca into xarray-contrib:main Jun 15, 2026
7 checks passed
@Melissari1997 Melissari1997 deleted the deep-sweep-performance-contour-01 branch June 15, 2026 13:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

contour: dask backends materialize all chunk results in a single compute

2 participants