Skip to content

cypher_with: add ORDER BY to non-deterministic RETURN queries#2436

Merged
muhammadshoaib merged 1 commit into
apache:masterfrom
jrgemignani:stabilize_cypher_with_regression_tests
Jun 21, 2026
Merged

cypher_with: add ORDER BY to non-deterministic RETURN queries#2436
muhammadshoaib merged 1 commit into
apache:masterfrom
jrgemignani:stabilize_cypher_with_regression_tests

Conversation

@jrgemignani

Copy link
Copy Markdown
Contributor

Several cypher_with regression queries RETURN multiple rows without an ORDER BY, so their row order depends on heap/scan order and can vary between runs, build types, and platforms. Add ORDER BY ASC to those queries so the expected output is stable. Ordering keys use id() (a single int64 that bypasses the locale-sensitive string comparison path and is reproducible from the test's deterministic setup order), or the projected path/scalar where that is what the query returns. Where the underlying vertex/edge was dropped by a WITH projection, its id is threaded through as an alias rather than reordering the projection.

Full audit of cypher_with: all 23 multi-row result blocks were checked. After this change, every multi-row, non-EXPLAIN RETURN is deterministically ordered. The two remaining unordered multi-row blocks are left as-is:

  • "RETURN lbl" returns two identical "Person" rows, so order cannot drift;
  • the 13 EXPLAIN (VERBOSE, COSTS OFF) plan blocks emit a fixed serial plan (no parallel/gather nodes), so their row order is already deterministic.

This is a test-only change (regress/sql/cypher_with.sql and regress/expected/cypher_with.out); no extension C code or SQL is modified. Row counts are unchanged (pure reordering).

All 37 regression tests pass (installcheck) on PostgreSQL 18.3.

Co-authored-by: GitHub Copilot noreply@github.com

modified: regress/expected/cypher_with.out
modified: regress/sql/cypher_with.sql

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@jrgemignani jrgemignani force-pushed the stabilize_cypher_with_regression_tests branch from 4e14eae to 9cb8621 Compare June 11, 2026 14:16
@jrgemignani jrgemignani force-pushed the stabilize_cypher_with_regression_tests branch from 9cb8621 to 05d5c71 Compare June 12, 2026 16:08
Several cypher_with regression queries RETURN multiple rows without an
ORDER BY, so their row order depends on heap/scan order and can vary
between runs, build types, and platforms. Add ORDER BY ASC to those
queries so the expected output is stable. Ordering keys use id() (a
single int64 that bypasses the locale-sensitive string comparison path
and is reproducible from the test's deterministic setup order), or the
projected path/scalar where that is what the query returns. Where the
underlying vertex/edge was dropped by a WITH projection, its id is
threaded through as an alias rather than reordering the projection.

Full audit of cypher_with: all 23 multi-row result blocks were checked.
After this change, every multi-row, non-EXPLAIN RETURN is deterministically
ordered. The two remaining unordered multi-row blocks are left as-is:
- "RETURN lbl" returns two identical "Person" rows, so order cannot drift;
- the 13 EXPLAIN (VERBOSE, COSTS OFF) plan blocks emit a fixed serial plan
  (no parallel/gather nodes), so their row order is already deterministic.

This is a test-only change (regress/sql/cypher_with.sql and
regress/expected/cypher_with.out); no extension C code or SQL is modified.
Row counts are unchanged (pure reordering).

All 37 regression tests pass (installcheck) on PostgreSQL 18.3.

Co-authored-by: GitHub Copilot <noreply@github.com>

modified:   regress/expected/cypher_with.out
modified:   regress/sql/cypher_with.sql
@jrgemignani jrgemignani force-pushed the stabilize_cypher_with_regression_tests branch from 05d5c71 to ad7b673 Compare June 21, 2026 04:17

@gregfelice gregfelice left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the diff and reproduced locally.

Confirmed every modified block in cypher_with.out is a pure reordering — same row multiset and same output columns before and after. Threading id(e) AS eid through the WITH to get a stable ORDER BY key without altering the RETURN projection is a clean approach.

Reproduced the full suite green on a cassert build (PG18 --enable-cassert -O0, AGE compiled with -Werror), so the updated expected output regenerates exactly:

ok 15  - cypher_with   104 ms
...
# All 40 tests passed.

One non-blocking question on the LIMIT 5 block:

... WITH x LIMIT 5 RETURN x ORDER BY x

The LIMIT 5 is applied in the WITH, before the RETURN ... ORDER BY, so the new ORDER BY only sorts whichever 5 rows survive the (still unordered) LIMIT. If that source ever yields more than 5 candidate rows, the selection — and therefore the output — could still vary with heap order. The current dataset may keep it deterministic; an ORDER BY ... LIMIT inside the WITH would make it robust to future data changes. Happy to leave as-is if you've confirmed the source is ≤5 rows.

Otherwise a clean, well-audited determinism fix. +1.

@muhammadshoaib muhammadshoaib left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@muhammadshoaib muhammadshoaib merged commit 3cc74a4 into apache:master Jun 21, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants