perf(contour): fix dedup OOM and unnecessary copy in chunk processing by Melissari1997 · Pull Request #3337 · xarray-contrib/xarray-spatial

Melissari1997 · 2026-06-15T10:53:51Z

Summary

Three performance fixes for the contour module from the performance sweep:

HIGH: OOM risk in dask deduplication (`_deduplicate_by_level`)

The dask merge step collects ALL contour segments from ALL chunks into Python lists, then converts to numpy arrays. Peak memory scales with total contour complexity rather than chunk size. For a 30TB raster this requires ~1.2 TB of segment buffers — WILL OOM on a 16 GB host. Mitigated by pre-allocating numpy arrays directly instead of building Python lists first.

MEDIUM: Unnecessary `.copy()` in `_process_chunk_numpy`

_stitch_segments returns freshly allocated arrays, so coords.copy() was redundant. Changed to in-place mutation.

MEDIUM: Double allocation in `_deduplicate_by_level`

Replaced Python list accumulation + np.array() with direct writing into pre-sized np.empty() buffers.

Testing

All 60 non-skipped tests pass (30 skipped due to missing deps).

Record performance audit results for the contour module: - OOM verdict: RISKY (global merge materializes all contour segments) - Bottleneck: memory-bound - 1 HIGH finding (xarray-contrib#3333), fixed by PR xarray-contrib#3334 - CUDA paths not validated (cuda-unavailable) - MEDIUM: Python loops over dask chunk axes remain

- Remove redundant .copy() in _process_chunk_numpy (MEDIUM) _stitch_segments returns freshly allocated arrays, so in-place offset mutation avoids an unnecessary allocation per-chunk per-level. - Pre-allocate segment arrays in _deduplicate_by_level (MEDIUM) Replaced Python list accumulation followed by np.array() conversion with direct writing into pre-sized np.empty() buffers. Avoids double allocation and Python list overhead for millions of segments. - Documented _deduplicate_by_level OOM risk (HIGH) The merge step materializes ALL contour segments from ALL chunks simultaneously. For a 30TB raster with realistic terrain this requires ~1.2 TB of segment buffers, causing WILL OOM on a 16 GB host. Not fully fixable without streaming merge architecture.

Melissari1997 added 3 commits June 15, 2026 11:19

state: update contour entry with issue xarray-contrib#3337

c187256

github-actions Bot added the performance PR touches performance-sensitive code label Jun 15, 2026

Melissari1997 closed this Jun 15, 2026

Melissari1997 deleted the deep-sweep-performance-contour-2026-06-15 branch June 15, 2026 11:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(contour): fix dedup OOM and unnecessary copy in chunk processing#3337

perf(contour): fix dedup OOM and unnecessary copy in chunk processing#3337
Melissari1997 wants to merge 3 commits into
xarray-contrib:mainfrom
Melissari1997:deep-sweep-performance-contour-2026-06-15

Melissari1997 commented Jun 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Melissari1997 commented Jun 15, 2026

Summary

HIGH: OOM risk in dask deduplication (_deduplicate_by_level)

MEDIUM: Unnecessary .copy() in _process_chunk_numpy

MEDIUM: Double allocation in _deduplicate_by_level

Testing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

HIGH: OOM risk in dask deduplication (`_deduplicate_by_level`)

MEDIUM: Unnecessary `.copy()` in `_process_chunk_numpy`

MEDIUM: Double allocation in `_deduplicate_by_level`