[core] Support reading shared-shredding map #8364
 * original logical schema to upper layers by lazily converting only shared-shredding MAP fields
 * when {@link InternalRow#getMap(int)} is called.
 */
public class MapSharedShreddingReader implements FileRecordReader<InternalRow> {
The wrapper is never used by the actual data-file read path. I only see references from this class and its unit test; RawFileSplitRead, DataEvolutionSplitRead, and FormatTableRead still pass the format reader directly into DataFileRecordReader, and nothing reads SupportsReaderFieldMetadata to build these metas before returning rows. As a result, a table containing a shared-shredding MAP would still expose the physical ROW from the format reader instead of this logical MAP wrapper, so the PR does not yet provide real read support outside the unit test. Please wire this reader into the real read paths after recovering the field metadata, and add an end-to-end read/write test that reads a shared-shredding map through the table API.
Thanks for the review. This is a standalone PR that extracts a wrapper for converting physical columns to logical columns. Once this wrapper is merged, I will submit the previously mentioned append read/write end-to-end changes together in #8355. We mainly split this out to keep the PR size manageable and make the review easier. Also, the write-side changes have not been merged yet, so this will not produce data that cannot be read.
Purpose
Add read support for MAP shared-shredding data in paimon-common. This change introduces a reader wrapper that rebuilds logical MAP<STRING, T> values from shared-shredding physical ROW values.
Changes
Add MapSharedShreddingReader
- Implements FileRecordReader<InternalRow>.
- Converts shared-shredding MAP fields lazily, when InternalRow#getMap(pos) is called.
- [...] fieldMapping [...] fieldMapping [...] element [...]
Add shared-shredding utility methods
- getPhysicalColumnIndices
- isOverflowField
- buildSpecificPhysicalStructType
Limitation
This PR currently only supports reading the whole shared-shredding MAP field.
It does not yet support selecting or projecting specific MAP keys during read. Because of this, rebuilt map entries currently follow the physical metadata layout order instead of the user-requested key order. A TODO is left in the reader for future key-level projection support.
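To make the lazy conversion and the ordering limitation concrete, here is a minimal, self-contained sketch. It is not the code in this PR and does not use Paimon's InternalRow or MapSharedShreddingReader. The class LazyShreddedMapSketch, its nested LogicalRow, the rebuildCount counter, and the idea that the physical ROW consists of per-key columns plus one overflow map are all assumptions made for illustration. The sketch also assumes that a null physical value means the key is absent from the row.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of a shared-shredding MAP read; not Paimon code. */
final class LazyShreddedMapSketch {

    /** Wraps the physical values of one row and rebuilds the logical map on demand. */
    static final class LogicalRow {
        private final List<String> dedicatedKeys;    // keys that own a physical column, in layout order
        private final Object[] dedicatedValues;      // one physical value per dedicated key
        private final Map<String, Object> overflow;  // assumed holder for keys without a column
        private Map<String, Object> cached;          // stays null until getMap() is first called
        int rebuildCount = 0;                        // exposed only to observe laziness

        LogicalRow(List<String> dedicatedKeys, Object[] dedicatedValues,
                   Map<String, Object> overflow) {
            if (dedicatedKeys.size() != dedicatedValues.length) {
                throw new IllegalArgumentException("one value is required per dedicated key");
            }
            this.dedicatedKeys = dedicatedKeys;
            this.dedicatedValues = dedicatedValues;
            this.overflow = overflow;
        }

        /** Builds the logical map on first access and reuses it afterwards. */
        Map<String, Object> getMap() {
            if (cached == null) {
                cached = Collections.unmodifiableMap(rebuild());
                rebuildCount++;
            }
            return cached;
        }

        private Map<String, Object> rebuild() {
            // LinkedHashMap keeps insertion order, so entries follow the physical
            // layout order rather than any order requested by the caller.
            Map<String, Object> logical = new LinkedHashMap<>();
            for (int i = 0; i < dedicatedKeys.size(); i++) {
                if (dedicatedValues[i] != null) { // assumption: null means "key absent"
                    logical.put(dedicatedKeys.get(i), dedicatedValues[i]);
                }
            }
            logical.putAll(overflow);
            return logical;
        }
    }

    public static void main(String[] args) {
        LogicalRow row = new LogicalRow(
                List.of("b", "a"), new Object[] {2, 1}, Map.of("z", 26));
        System.out.println(row.getMap().keySet());
    }
}
```

In this sketch the keys come back as b, a, z because the dedicated columns are laid out in that order and the overflow entries are appended last. A caller asking for a, b would still see the layout order, which mirrors the limitation described above.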
Tests
Added unit coverage for:
- fieldMapping and unknown field id handling.
- MapSharedShreddingUtils helper methods.
Verification
mvn -pl paimon-common -Pfast-build -Dtest=MapSharedShreddingReaderTest,MapSharedShreddingUtilsTest test
git diff --check