Skip to content

[flink] Preserve offsets for empty tiering buckets#3505

Open
hutiefang76 wants to merge 1 commit into
apache:mainfrom
hutiefang76:codex/fluss-3413-preserve-empty-bucket-offsets
Open

[flink] Preserve offsets for empty tiering buckets#3505
hutiefang76 wants to merge 1 commit into
apache:mainfrom
hutiefang76:codex/fluss-3413-preserve-empty-bucket-offsets

Conversation

@hutiefang76

Copy link
Copy Markdown

What is changed

Fixes #3413.

This change keeps lake writes limited to non-empty bucket write results, but collects committed Fluss log offsets and max tiered timestamps from all completed bucket write results in a mixed commit round. That preserves offsets for buckets that only advanced through an empty materialized batch while another bucket produced lake data in the same commit.

The existing empty-write-result test now asserts that the empty bucket offset is included in the committed lake snapshot metadata.

Why

Before this change, TieringCommitOperator built the logEndOffsets map from only non-empty write results. In a mixed commit, empty-batch buckets with writeResult() == null were filtered out, so their completed log offsets were not committed even though the commit round had lake data to publish.

Tests

  • Red/green verified TieringCommitOperatorTest#testCommitMeetsEmptyWriteResult; before the production change it failed because bucket 0 offset was missing from the committed snapshot.
  • mvn -s <empty-settings> -pl fluss-flink/fluss-flink-common -am -Dtest=TieringCommitOperatorTest#testCommitMeetsEmptyWriteResult -Dsurefire.failIfNoSpecifiedTests=false test
  • mvn -s <empty-settings> -pl fluss-flink/fluss-flink-common -am -Dtest=TieringCommitOperatorTest -Dsurefire.failIfNoSpecifiedTests=false test
  • git diff --check

I used an empty temporary Maven settings file locally to avoid a private mirror configured in my user Maven settings.

AI assistance

This PR was prepared with assistance from OpenAI Codex. I reviewed the change and verified the tests above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[flink] Preserve tiering offsets for empty-batch buckets

1 participant