Skip to content

[core] Support append writes for MAP shared-shredding#8355

Open
lxy-9602 wants to merge 5 commits into
apache:masterfrom
lxy-9602:append-shredding-write
Open

[core] Support append writes for MAP shared-shredding#8355
lxy-9602 wants to merge 5 commits into
apache:masterfrom
lxy-9602:append-shredding-write

Conversation

@lxy-9602

@lxy-9602 lxy-9602 commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Purpose

This is a subtask of MAP shared-shredding support.

This PR adds the append write path for MAP shared-shredding. When a MAP column is configured with map.storage-layout=shared-shredding, append writers now build a per-file physical write plan, convert logical rows to the shared-shredding physical layout, and write field-level shared-shredding metadata into file metadata before closing the format writer.

Main changes:

  • Add RowDataFileWritePlan, RowDataFileWritePlanFactory, RowDataTransform, and FileMetadataFinalizer to support per-file physical row conversion and metadata finalization.
  • Add MapSharedShreddingWritePlanFactory for append shared-shredding writes.
  • Add MapSharedShreddingCoreUtils to create and restore shared-shredding write context from recent data files.
  • Wire shared-shredding append writes into BaseAppendFileStoreWrite, AppendOnlyWriter, RowDataRollingFileWriter, RowDataFileWriter, and SingleFileWriter.
  • Keep file index and sequence tracking based on logical rows while writing physical rows to data files.
  • Restore adaptive shared-shredding stats from existing file metadata using each data file's actual file format.
  • Reject unsupported blob/vector append writes for shared-shredding for now.
  • Reject nested MAP file index when shared-shredding is enabled.
  • Extend format metadata field-id support for ORC via paimon.id.

This PR currently focuses on append tables. Blob writes, PK tables, and other more complex write paths will be supported in follow-up PRs. The row write plan is kept generic so these scenarios can reuse the same logical-to-physical write flow later.

Tests

  • mvn -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=AppendOnlyWriterTest,MapSharedShreddingCoreUtilsTest,SchemaValidationTest test
  • mvn -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=KeyValueFileReadWriteTest test
  • mvn -pl paimon-format -Pfast-build -DfailIfNoTests=false -Dtest=FormatMetadataUtilsTest,OrcFormatReadWriteTest test
  • mvn -pl paimon-arrow -am -Pfast-build -DfailIfNoTests=false -Dtest=ArrowSchemaMetadataCompatibilityTest test
  • mvn -pl paimon-core -DskipTests spotless:check
  • mvn -pl paimon-format -DskipTests spotless:check
  • mvn -pl paimon-arrow -DskipTests spotless:check
  • git diff --cached --check

writeCols);
writeCols,
sharedShredding
? new MapSharedShreddingWritePlanFactory(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enables writing the shared-shredding physical ROW layout, but I do not see a matching logical read/unshredding path. The normal table scan / format reader still requests the logical MAP schema; for Parquet, clipParquetType(MAP) expects a Parquet map group, so it will not be able to read this physical ROW. Please add an end-to-end test that writes through this path and reads the table back with the logical schema, and wire the reader conversion before enabling the write path.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the careful review. I agree this is an important gap. This PR currently focuses on introducing the write-side physical layout, but the normal scan path still reads with the logical MAP schema, so we should not expose this as a usable end-to-end feature before the reader-side unshredding path is in place.

My original thought was to keep this PR relatively small and reviewable by treating it as internal groundwork. To avoid users accidentally enabling an incomplete path, I can add schema validation to reject the shared-shredding option for now, and then enable the option together with the reader conversion, documentation, and end-to-end logical read tests in a follow-up PR.

While, I am also fine with making this PR larger and adding the reader-side conversion plus the end-to-end test here if you think the feature should only land when the full read/write path is included. Which approach would you prefer?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve added e2e coverage for the append-table read/write paths, including basic shared-shredding MAP read/write, multiple maps, projection, null values including overflow, metadata restore, and schema/data evolution cases. Unsupported paths such as rewrite are guarded to fail fast for now; would appreciate a review when you have time.

options.dataEvolutionEnabled(),
blobContext);
blobContext,
sharedShreddingContext);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing the shared-shredding context only to AppendOnlyWriter leaves the rewrite paths inconsistent. compactRewrite/clusterRewrite below still create a plain RowDataRollingFileWriter, so compaction can rewrite shared-shredded append files back into the default MAP layout without shared-shredding metadata. Please either route those rewrite writers through the same write plan or explicitly disable/guard compaction for shared-shredding append tables until it is supported.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this! I added an explicit guard for now: compact rewrite and cluster rewrite will fail fast when MAP shared-shredding is configured.

columnName,
keyType);
checkArgument(
options.mapStorageLayout(columnName) != MapStorageLayout.SHARED_SHREDDING,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This validation catches nested map indexes, but shared-shredding can still be configured with formats whose writers do not implement SupportsWriterMetadata. In that case the table is accepted and the write fails only when the file is closing. Please reject unsupported file formats during schema validation, or otherwise fail before any data file write starts.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I added schema validation for MAP shared-shredding to reject unsupported file formats early.

@lxy-9602 lxy-9602 force-pushed the append-shredding-write branch from 2db0402 to dc94d68 Compare June 27, 2026 02:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants