contour: batch dask chunk compute to bound client memory#3334
Conversation
…rib#3333) The dask and dask+cupy backends previously submitted every delayed chunk task in a single dask.compute(*all_results) call. For large rasters this materializes all chunk contour results at once and creates a large single synchronization point for the scheduler. Compute chunk results in batches of _DASK_COMPUTE_BATCH_SIZE (default 64) instead. This bounds the intermediate geometry held in client memory and reduces scheduler pressure while preserving the same final deduplication and stitching behavior. Also consolidate the two nearly-identical dask backends into a shared _contours_dask_generic helper so the batching logic is not duplicated.
PR Review: contour: batch dask chunk compute to bound client memoryBlockers (must fix before merge)None found. Suggestions (should fix, not blocking)
Nits (optional improvements)
What looks good
Checklist
|
…st for contour batching - Reset compute_calls between numpy and dask contours() calls so the numpy path cannot pollute the batch count (even though the numpy path never calls dask.compute with explicit levels, this makes the test robust against future changes). - Use math.ceil(num_chunks / batch_size) instead of hard-coding 6 so the expected count self-documents and won't silently break if constants change. - Add test_single_batch_when_chunks_fit to verify a single dask.compute call when all chunks fit in one batch.
PR Review: contour: batch dask chunk compute to bound client memoryBlockers (must fix before merge)None found. Suggestions (should fix, not blocking)
Nits (optional improvements)None. What looks good
Checklist
|
brendancol
left a comment
There was a problem hiding this comment.
@Melissari1997 Thanks for the fix and this should help with scaling
Summary
Closes #3333.
The dask and dask+cupy
contours()backends previously submitted every delayed chunk task in a singledask.compute(*all_results)call. For large dask-backed rasters this materializes all chunk contour results in the client process at once and creates a large single synchronization point for the scheduler. Peak memory therefore scaled with total contour complexity rather than with chunk size.This change computes chunk results in batches of
_DASK_COMPUTE_BATCH_SIZE(default 64). Batching bounds the amount of intermediate chunk geometry held in memory at one time and reduces scheduler pressure, while preserving the same final deduplication and stitching behavior.Changes
_DASK_COMPUTE_BATCH_SIZEconstant._contours_daskand_contours_dask_cupyinto a shared_contours_dask_generichelper so the batching logic is not duplicated.TestDaskBatchedCompute::test_dask_compute_called_in_batchesto verify that compute is called in multiple batches when the chunk count exceeds the batch size, and that the batched result still matches the numpy backend.Testing
python -m pytest xrspatial/tests/test_contour.py -qpasses (61 passed, 30 skipped).dask.computecalls.Notes
This is a targeted fix for the HIGH-severity materialization finding from the 2026-06-15 performance sweep against
xrspatial/contour.py. The MEDIUM Python-loop-over-chunks finding is related but left as-is to keep the change surgical; future work could vectorize the offset/graph construction if needed.