Fix MERGE16_FIXME: cleanup AO vacuum, PartitionSelector, and qual pushdown#1820
Open
lss602726449 wants to merge 12 commits into
Open
Fix MERGE16_FIXME: cleanup AO vacuum, PartitionSelector, and qual pushdown#1820lss602726449 wants to merge 12 commits into
lss602726449 wants to merge 12 commits into
Conversation
f18cf09 to
968a42d
Compare
AO/AOCO tables have no per-tuple xmin/xmax -- visibility is managed via visibility map at segment level, not per-tuple transaction IDs. The freeze limits computed by vacuum_set_xid_limits are meaningless for AO tables. Worse, passing MultiXactCutoff to vac_update_relstats (vacuum) or swap_relation_files (CLUSTER) incorrectly sets relminmxid on AO tables (whose relminmxid should remain InvalidMultiXactId), causing them to unnecessarily participate in database-wide datminmxid calculation. Fix by: - vacuum_ao.c: remove vacuum_set_xid_limits call, pass Invalid values directly to vac_update_relstats - appendonlyam_handler.c / aocsam_handler.c: remove vacuum_set_xid_limits call in copy_for_cluster, return Invalid values to caller - cluster.c: relax MultiXactId assert to allow InvalidMultiXactId, and reset relminmxid to InvalidMultiXactId for AO tables (matching the existing relfrozenxid override) vacuum_set_xid_limits was a pre-PG16 API kept only for AO callers. With all callers removed, delete the function and its declaration.
The CreatePartitionPruneState() call in nodePartitionSelector.c is correct -- PartitionSelector only needs the pruning data structure, not the initial pruning and subplan map renumbering that ExecInitPartitionPruning() adds on top. Remove the incorrect FIXME. Also remove two dead declarations in execPartition.h: - ExecCreatePartitionPruneState: renamed to CreatePartitionPruneState in PG15 (commit 297daa9), declaration was never cleaned up - ExecFindInitialMatchingSubPlans: folded into ExecFindMatchingSubPlans in the same refactor, declaration was never cleaned up
The subplan check in check_output_expressions was incorrectly using UNSAFE_NOTIN_PARTITIONBY_CLAUSE, which only prevents normal pushdown but still allows the qual to be pushed as a window run condition. Subplans in output expressions should completely block pushdown in Cloudberry's distributed execution model. Add a dedicated UNSAFE_HAS_SUBPLAN flag and include it in the fully unsafe set in qual_is_pushdown_safe, so quals referencing output columns containing subplans are never pushed down.
…ryFileUsage Re-enable temporary file size reporting to pgstat and the associated Assert in FileClose. Fix a SIGSEGV in ReportTemporaryFileUsage that occurred during process exit when the resource owner was already released.
…c_mpp_query The commented-out code stripped write permissions from RTEs on non-root slices. This is unnecessary because InitPlan already skips permission checks on non-writer segments (execMain.c:1821). Additionally, PG16 moved requiredPerms from RTE to RTEPermissionInfo, making the original code incompatible.
The forceoverwrite check is unnecessary here because verify_dir_is_empty_or_create already handles non-empty directories before tar extraction begins.
…d output Enable pg_waldump validation in singlenode prevent_ao_wal test (was disabled by MERGE16_FIXME). Broaden matchignore to handle both "fatal" and "error" messages from pg_waldump. Add missing blank lines in expected output after pg_waldump command output.
Re-enable segwalrep/select_throttle in isolation2_schedule and workfile_mgr_test in singlenode. Fix gpstop -ari to -arf for singlenode mode where immediate shutdown causes crash recovery and blocks connections.
verify_cpu_usage() only used the first sample (all_info[0]) instead of averaging all collected samples, making it sensitive to single-sample fluctuations on busy CI machines. Use the mean of all samples as the function comment originally intended.
a5ed2e3 to
c1d9aa3
Compare
…ouble-release Cloudberry has a built-in scalar 'complex' type whose complex_in doesn't support PG16's soft-error API. Rename to 'complex_t' composite type in rowtypes tests to avoid conflict. Fix record_in double ReleaseTupleDesc bug: remove 6 redundant ReleaseTupleDesc calls before 'goto fail', since the fail label already releases tupdesc. The double-release caused SIGSEGV on malformed composite type input via pg_input_is_valid.
a270b2b to
9ac1d18
Compare
…mirror_down Remove inline CREATE OR REPLACE FUNCTION wait_for_mirror_down and its SELECT calls from vacuum_progress tests. The DDL dispatch to segments created distributed transactions that advanced DistributedLogShared-> oldestXmin past compact XIDs, causing the new vacuum worker in post-cleanup to see stale segments as recyclable (heap_blks_vacuumed=9, index_vacuum_count=2 instead of 0/0). The wait_for_mirror_down function is already defined in setup.sql, so the inline definition was redundant and harmful.
9ac1d18 to
044fcbe
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Remove unnecessary
vacuum_set_xid_limitscalls for AO/AOCO tables: AO tables don't have per-tuple transaction IDs, so freeze/cutoff computation is meaningless. Removed the calls fromvacuum_ao.c,appendonlyam_handler.c(CLUSTER), andaocsam_handler.c(CLUSTER). Fixedcluster.cto properly resetrelminmxidfor AO tables. Deleted the now-unusedvacuum_set_xid_limitsfunction entirely.Remove incorrect FIXME in PartitionSelector: The
CreatePartitionPruneStatecall is correct for Cloudberry's join-based partition pruning (distinct fromExecInitPartitionPruningwhich adds initial pruning + subplan map renumbering). Also removed two dead declarations (ExecCreatePartitionPruneState,ExecFindInitialMatchingSubPlans) left over from PG14.Add
UNSAFE_HAS_SUBPLANflag for qual pushdown safety: Replaced the incorrect reuse ofUNSAFE_NOTIN_PARTITIONBY_CLAUSE(a "soft" unsafe flag that still allows window run conditions) with a properUNSAFE_HAS_SUBPLANflag (hard block) for output columns containing subplans. This prevents unsafe pushdown of subplan-containing quals into subqueries in Cloudberry's distributed execution model.Test plan
--enable-cassertmake installcheckregression tests passisolation2tests pass