Skip to content

fix: handle non-contiguous RowRanges when resolving global row IDs#383

Open
zhf999 wants to merge 11 commits into
alibaba:mainfrom
zhf999:fix-rowid
Open

fix: handle non-contiguous RowRanges when resolving global row IDs#383
zhf999 wants to merge 11 commits into
alibaba:mainfrom
zhf999:fix-rowid

Conversation

@zhf999

@zhf999 zhf999 commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Purpose

  • Fix a correctness bug where upper layers assumed returned batch rows are continuous and derived global row IDs as previous_batch_start + offset.
  • Introduce unified usage of GetPreviousBatchGlobalRowId(batch_row_id) to resolve the global row ID for a row index inside the current batch.
  • In PrefetchFileBatchReaderImpl, cache the actual global row IDs for each returned batch and keep row-id mapping aligned when a batch is sliced by read_range.
  • For Parquet, add target row-range union and per-batch row mapping logic:
    • Merge filtered target ranges into a global target row set.
    • Build batch_row -> global_row_id mapping in NextBatch().
    • Keep GetPreviousBatchGlobalRowId() correct under non-contiguous rows caused by predicate + bitmap filtering.
  • Update upper-layer consumers to query per-row global IDs directly (deletion vectors, bitmap index filtering, _ROW_ID field conversion, KeyValue iteration positions).

Tests

  • Updated and adapted reader tests to the new interface and semantics:
    • src/paimon/format/avro/avro_file_batch_reader_test.cpp
    • src/paimon/format/blob/blob_file_batch_reader_test.cpp
    • src/paimon/format/lance/lance_format_reader_writer_test.cpp
    • src/paimon/format/orc/orc_file_batch_reader_test.cpp
    • src/paimon/format/parquet/parquet_file_batch_reader_test.cpp
    • src/paimon/common/reader/prefetch_file_batch_reader_impl_test.cpp
  • Added/extended coverage in Parquet with TestRowMapping to validate global row mapping across non-contiguous ranges.
  • Remaining changes are interface migration plus consistency updates of call sites and assertions.

API and Format

  • Public reader API change: FileBatchReader and implementations now use GetPreviousBatchGlobalRowId(uint64_t batch_row_id).
  • Semantic change: returns the global row ID for the batch_row_id inside the current batch (instead of deriving by batch start + offset under contiguous assumptions).
  • Storage format and on-disk protocol are unchanged.

Documentation

No.

Generative AI tooling

gpt-5.3-codex

Copilot AI review requested due to automatic review settings June 26, 2026 02:55

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

virtual Result<uint64_t> GetPreviousBatchFirstRowNumber() const = 0;
/// Get the global row number of the row in the previously read batch.
virtual Result<uint64_t> GetPreviousBatchGlobalRowId(uint64_t batch_row_id) const = 0;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetPreviousBatchGlobalRowId -> GetPreviousBatchFileRowId

PAIMON_ASSIGN_OR_RAISE(uint64_t global_row_id, reader_->GetPreviousBatchGlobalRowId(i));
if (bitmap_.Contains(global_row_id)) {
is_valid.Add(i);
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks inefficient. Contains starts scanning from the beginning on every call, which can be quite costly for large bitmaps. In this case, since the iterator only moves forward, it should be enough to adjust the filter logic to take advantage of that.

Result<uint64_t> PrefetchFileBatchReaderImpl::GetPreviousBatchFirstRowNumber() const {
return previous_batch_first_row_num_;
Result<uint64_t> PrefetchFileBatchReaderImpl::GetPreviousBatchGlobalRowId(
uint64_t batch_row_id) const {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can’t we just return previous_batch_first_row_num_ + batch_row_id directly?

ASSERT_EQ(reader->GetPreviousBatchFirstRowNumber().value(), 0);
uint64_t global_row_id = 0;
ASSERT_OK_AND_ASSIGN(global_row_id, reader->GetPreviousBatchGlobalRowId(0));
ASSERT_EQ(global_row_id, 0);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using .value() directly should also work here.

ReadResultCollector::CollectResult(
reader.get(), /*max simulated data processing time*/ 100));
ASSERT_EQ(reader->GetPreviousBatchFirstRowNumber().value(), 10);
auto expected_array = std::make_shared<arrow::ChunkedArray>(data_array);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we drop the final validation, this scenario would no longer be exercised by the test. Could we mimic ReadResultCollector::CollectResult inside this test instead? We don’t need to materialize the full result — the goal is just to validate the last row of the final batch.

PAIMON_ASSIGN_OR_RAISE(uint64_t global_row_id,
reader_->GetPreviousBatchGlobalRowId(idx_in_array));
return first_row_id_.value() + global_row_id;
};

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name global row id doesn’t seem quite appropriate in this scenario. It usually implies a row ID at the table level, while here global_row_id actually refers to a row ID within the file.

Result<uint64_t> GetPreviousBatchFirstRowNumber() const override {
return ToReaderRowNumber(previous_batch_first_row_num_);
Result<uint64_t> GetPreviousBatchGlobalRowId(uint64_t batch_row_id) const override {
return previous_batch_first_row_num_ + batch_row_id;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If previous_batch_first_row_num_ can be -1, then the current implementation of GetPreviousBatchGlobalRowId would be incorrect, which means the refactoring is not behaviorally equivalent.


static Result<std::shared_ptr<arrow::ChunkedArray>> CollectResultOneBatch(
BatchReader* batch_reader) {
return CollectResultOneBatch(batch_reader, /*max simulated data processing time*/ 0);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/*max_data_processing_time_in_us=*/


static Result<std::shared_ptr<arrow::ChunkedArray>> CollectResultOneBatch(
BatchReader* batch_reader, int64_t max_data_processing_time_in_us) {
int64_t seed = DateTimeUtils::GetCurrentUTCTimeUs();

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can CollectResultOneBatch just return an Arrow array directly? It doesn’t look like we need a ChunkedArray here.


Result<RowRanges> ParquetFileBatchReader::GetAllTargetRowRanges(
const std::vector<TargetRowGroup>& target_row_groups) {
row_mapping_.clear();

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename this to something like UpdateAllTargetRowRanges. Methods prefixed with Get should ideally avoid modifying member variables. Also, update doesn’t need to return anything here — it can just update all_row_ranges_ directly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants