Add reduce() list folding function#2435
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds openCypher reduce(acc = init, var IN list | body) support by desugaring the expression at transform time into a correlated scalar subquery that folds over unnest(list) WITH ORDINALITY using a new ordered aggregate (ag_catalog.age_reduce) and a serialized standalone fold-body expression evaluated inside the aggregate transition function.
Changes:
- Adds a new
cypher_reduceExtensibleNode plus grammar/keyword support for parsingreduce(...). - Implements
transform_cypher_reduce()to rewritereduce()into an ordered aggregate overunnest(... ) WITH ORDINALITY, serializing the fold body with accumulator/element rewritten toPARAM_EXECslots. - Adds
ag_catalog.age_reduce+age_reduce_transfn(C) and a comprehensive regression test suite (age_reduce).
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| Makefile | Registers age_reduce in the regression test list. |
| age--1.7.0--y.y.y.sql | Adds upgrade-time creation of age_reduce_transfn and age_reduce aggregate. |
| regress/expected/age_reduce.out | Expected output for the new reduce() regression coverage. |
| regress/sql/age_reduce.sql | New regression tests covering semantics, order, null/empty behavior, and unsupported constructs. |
| sql/age_aggregate.sql | Defines age_reduce_transfn and age_reduce aggregate in install SQL. |
| src/backend/nodes/ag_nodes.c | Registers cypher_reduce in node name/method tables. |
| src/backend/nodes/cypher_copyfuncs.c | Adds copy support for cypher_reduce. |
| src/backend/nodes/cypher_outfuncs.c | Adds serialization support for cypher_reduce. |
| src/backend/nodes/cypher_readfuncs.c | Adds deserialization support for cypher_reduce. |
| src/backend/parser/cypher_analyze.c | Extends raw-expression walker to traverse cypher_reduce fields. |
| src/backend/parser/cypher_clause.c | Implements transform_cypher_reduce() desugaring and fold-body param rewrite/validation. |
| src/backend/parser/cypher_gram.y | Adds reduce(...) grammar production and marks REDUCE as safe keyword for map keys. |
| src/backend/utils/adt/agtype.c | Implements age_reduce_transfn (aggregate transition fn) that deserializes/compiles and evaluates the fold body. |
| src/include/nodes/ag_nodes.h | Adds cypher_reduce_t tag. |
| src/include/nodes/cypher_copyfuncs.h | Declares copy_cypher_reduce. |
| src/include/nodes/cypher_nodes.h | Defines the cypher_reduce ExtensibleNode struct. |
| src/include/nodes/cypher_outfuncs.h | Declares out_cypher_reduce. |
| src/include/nodes/cypher_readfuncs.h | Declares read_cypher_reduce. |
| src/include/parser/cypher_kwlist.h | Adds reduce keyword token (REDUCE). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
36162be to
6c68874
Compare
6c68874 to
80841fd
Compare
Implement the openCypher reduce(acc = init, var IN list | body) expression,
which folds an arbitrary expression over a list, threading an accumulator
across the elements in list order. This closes a long-standing gap (reduce()
was previously unsupported) and works both at the SQL top level and inside a
cypher() RETURN/WHERE.
Implementation
--------------
reduce() is desugared, at transform time, into a correlated scalar subquery
over a new ordered aggregate rather than a new executor node, so no PostgreSQL
core changes are required:
CASE WHEN list IS NULL THEN NULL
ELSE COALESCE((SELECT ag_catalog.age_reduce(
<init>, '<serialized-body>'::text,
r.elem ORDER BY r.ord)
FROM unnest(<list>) WITH ORDINALITY AS r(elem, ord)),
<init>)
END
- A new cypher_reduce extensible node carries the accumulator/element names
and the init/list/body expressions (grammar production, keyword, and the
copy/out/read serialization plumbing).
- The fold body is transformed against a throwaway two-column agtype
namespace, its accumulator and element Var references are rewritten to
PARAM_EXEC params 0 and 1, and it is serialized with nodeToString() into a
text argument.
- age_reduce_transfn (a custom agtype aggregate transition function)
deserializes and compiles the body once per group with ExecInitExpr, then
evaluates it per element with ExecEvalExpr, rebinding the two params. The
body is normalized to agtype at transform time so a boolean or other
non-agtype result cannot be misread as a by-reference Datum.
Semantics
---------
- List order is preserved (unnest WITH ORDINALITY + aggregate ORDER BY).
- An empty list yields the initial value; a NULL list yields NULL.
- The list and initial value may reference outer-query variables (e.g.
reduce(total = 0, n IN nodes(p) | total + n.age)); the body may reference
only the accumulator and element.
- Arithmetic, string, list-building, boolean/comparison (AND/OR/=/>), CASE,
and element property-access bodies are all supported.
- Outer-variable, query-parameter, nested-reduce, and aggregate references
inside the body raise a clean ERRCODE_FEATURE_NOT_SUPPORTED error.
- reduce is registered as a safe keyword so it remains usable as a property
or map key, preserving backward compatibility.
Tests
-----
Adds the age_reduce regression test (registered in the install SQL and the
upgrade template so age_upgrade passes), covering: arithmetic/product/string
folds; order sensitivity; empty/NULL list; NULL element and NULL init;
list-building and CASE bodies; boolean and comparison bodies; element property
access; multiple and nested (in list/init) reduce(); reduce() in boolean
expressions, WHERE, and list comprehensions; folds over collected nodes and
node list properties; the not-supported rejections; and reduce as a map key.
All multi-row results are ordered. 38/38 installcheck pass.
Future work
-----------
The body restriction (accumulator and element only) is a property of the
standalone expression evaluation and can be relaxed without core changes:
- Allow loop-invariant outer-variable and cypher $parameter references in
the body by capturing them as additional eager aggregate arguments bound
to extra param slots.
- Support a nested reduce() inside the body via an SPI-based evaluation
fallback for subquery-bearing bodies.
Aggregates inside the body remain intentionally unsupported, matching the
openCypher specification.
Co-authored-by: Copilot <copilot@github.com>
modified: Makefile
modified: age--1.7.0--y.y.y.sql
new file: regress/expected/age_reduce.out
new file: regress/sql/age_reduce.sql
modified: sql/age_aggregate.sql
modified: src/backend/nodes/ag_nodes.c
modified: src/backend/nodes/cypher_copyfuncs.c
modified: src/backend/nodes/cypher_outfuncs.c
modified: src/backend/nodes/cypher_readfuncs.c
modified: src/backend/parser/cypher_analyze.c
modified: src/backend/parser/cypher_clause.c
modified: src/backend/parser/cypher_gram.y
modified: src/backend/utils/adt/agtype.c
modified: src/include/nodes/ag_nodes.h
modified: src/include/nodes/cypher_copyfuncs.h
modified: src/include/nodes/cypher_nodes.h
modified: src/include/nodes/cypher_outfuncs.h
modified: src/include/nodes/cypher_readfuncs.h
modified: src/include/parser/cypher_kwlist.h
Resolved Conflicts: age--1.7.0--y.y.y.sql
80841fd to
56fb282
Compare
|
Reviewed the full diff and the changed files in context, with two deep passes on the executor/planner mechanics. I independently verified the two subtle correctness traps this kind of desugaring is prone to — the COALESCE-NULL ambiguity and PARAM_EXEC isolation — and both are genuinely handled. Solid work; no blockers or majors. (Source-level review; I could not run Verified correct
Test coverage gaps (cheap to close, no bug implied)The mechanisms below are correct by inspection, but the suite doesn't lock in the exact semantics that make them correct:
(Large-list / interrupt-path stress and deeper nesting are nice-to-haves; the SubLink check makes nesting depth irrelevant.) Adding #1–#3 would be a few lines each and would pin down precisely the behavior this implementation gets right. Nice piece of work overall — the agtype- |
Design check before this lands — desugar-to-aggregate as the foundation for
|
|
@gregfelice I will review this Monday to see what can be adjusted. BTW, it is -> my opinion <- that reduce is closer to an aggregate than to anything else. Also, this is part 1 of 2, maybe 3 for reduce. Those features not supported are covered in the second PR. It was just going to be too big to create just one and review. This is also true for shortest path. |
Implement the openCypher reduce(acc = init, var IN list | body) expression, which folds an arbitrary expression over a list, threading an accumulator across the elements in list order. This closes a long-standing gap (reduce() was previously unsupported) and works both at the SQL top level and inside a cypher() RETURN/WHERE.
Implementation
reduce() is desugared, at transform time, into a correlated scalar subquery over a new ordered aggregate rather than a new executor node, so no PostgreSQL core changes are required:
Semantics
Tests
Adds the age_reduce regression test (registered in the install SQL and the upgrade template so age_upgrade passes), covering: arithmetic/product/string folds; order sensitivity; empty/NULL list; NULL element and NULL init; list-building and CASE bodies; boolean and comparison bodies; element property access; multiple and nested (in list/init) reduce(); reduce() in boolean expressions, WHERE, and list comprehensions; folds over collected nodes and node list properties; the not-supported rejections; and reduce as a map key. All multi-row results are ordered. 38/38 installcheck pass.
Future work
The body restriction (accumulator and element only) is a property of the standalone expression evaluation and can be relaxed without core changes:
Co-authored-by: Copilot copilot@github.com
modified: Makefile
modified: age--1.7.0--y.y.y.sql
new file: regress/expected/age_reduce.out
new file: regress/sql/age_reduce.sql
modified: sql/age_aggregate.sql
modified: src/backend/nodes/ag_nodes.c
modified: src/backend/nodes/cypher_copyfuncs.c
modified: src/backend/nodes/cypher_outfuncs.c
modified: src/backend/nodes/cypher_readfuncs.c
modified: src/backend/parser/cypher_analyze.c
modified: src/backend/parser/cypher_clause.c
modified: src/backend/parser/cypher_gram.y
modified: src/backend/utils/adt/agtype.c
modified: src/include/nodes/ag_nodes.h
modified: src/include/nodes/cypher_copyfuncs.h
modified: src/include/nodes/cypher_nodes.h
modified: src/include/nodes/cypher_outfuncs.h
modified: src/include/nodes/cypher_readfuncs.h
modified: src/include/parser/cypher_kwlist.h