Kafka Connect: Fix UUID conversion for Parquet writes by thswlsqls · Pull Request #17079 · apache/iceberg

thswlsqls · 2026-07-03T22:50:31Z

Summary

RecordConverter.convertUUID() returned byte[] for UUID columns when the target file format is Parquet, but the Parquet UUID writer (ParquetValueWriters.uuids()) expects a java.util.UUID and converts to bytes itself, so writes threw ClassCastException: class [B cannot be cast to class java.util.UUID.
Removes the byte[] branch so convertUUID() always returns UUID, matching ORC (GenericOrcWriters.uuids()) and Avro, which already accept UUID directly.
The byte[] conversion matched the writer contract before PR #11904 ("Parquet: Add readers and writers for the internal object model") changed ParquetValueWriters' UUID writer to accept UUID directly; kafka-connect was not updated to follow. This restores the correct contract.
Note: open PR #16654 ("Kafka Connect: Precompute UUID-as-bytes flag in RecordConverter") touches the same method but explicitly preserves the current byte[] behavior, so it does not fix this bug; whichever of the two merges first, the other will need a rebase.
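
The contract mismatch described in the summary can be reproduced without Iceberg on the classpath. The following is a minimal sketch, not the actual patch: `UuidContractSketch`, `writeUuid`, `toBytes`, `convertUuidOld`, and `convertUuidFixed` are hypothetical stand-ins, and the String handling in the converters is an assumption made for illustration. It only shows why handing a pre-encoded byte[] to a writer that casts its input to java.util.UUID fails, and why returning the UUID unchanged does not.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

class UuidContractSketch {

  // Hypothetical stand-in for a value writer whose contract is "pass me a java.util.UUID".
  // It performs the 16-byte encoding itself, as the PR describes for the Parquet UUID writer.
  static byte[] writeUuid(Object value) {
    UUID uuid = (UUID) value; // throws ClassCastException when handed a byte[]
    return toBytes(uuid);
  }

  // Big-endian 16-byte encoding: most significant 8 bytes first, then least significant.
  static byte[] toBytes(UUID uuid) {
    ByteBuffer buf = ByteBuffer.allocate(16);
    buf.putLong(uuid.getMostSignificantBits());
    buf.putLong(uuid.getLeastSignificantBits());
    return buf.array();
  }

  // Old behavior (sketch): pre-convert to bytes when the target format is Parquet.
  static Object convertUuidOld(Object value, boolean targetIsParquet) {
    UUID uuid = value instanceof UUID ? (UUID) value : UUID.fromString(value.toString());
    return targetIsParquet ? toBytes(uuid) : uuid;
  }

  // Fixed behavior (sketch): always return a UUID and let the writer encode it.
  static UUID convertUuidFixed(Object value) {
    return value instanceof UUID ? (UUID) value : UUID.fromString(value.toString());
  }

  public static void main(String[] args) {
    String raw = "f79c3e09-677c-4bbd-a479-3f349cb785e7";

    // Fixed path: the writer receives a UUID and produces 16 bytes.
    System.out.println("fixed path wrote " + writeUuid(convertUuidFixed(raw)).length + " bytes");

    // Old path: the writer receives a byte[] and the cast fails.
    try {
      writeUuid(convertUuidOld(raw, true));
    } catch (ClassCastException e) {
      System.out.println("old path failed with ClassCastException");
    }
  }
}
```

Keeping the encoding in one place (the writer) means the converter can stay format-agnostic, which is the direction this PR takes.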

Testing done

Updated TestRecordConverter#testUUIDConversionWithParquet to assert the field equals the original UUID, replacing the UUIDUtil.convert(UUID_VAL) byte[] expectation.
./gradlew :iceberg-kafka-connect:iceberg-kafka-connect:check passes — TestRecordConverter 59/59, full module 122/122, 0 failures.

RecordConverter.convertUUID() converted UUID values to byte[] when the target file format is Parquet. The Parquet UUID writer (ParquetValueWriters.uuids()) expects a java.util.UUID and converts to bytes internally, so writing a UUID column with the default file format threw ClassCastException: class [B cannot be cast to class java.util.UUID.

The byte[] branch matched the writer contract before apache#11904 changed ParquetValueWriters' UUID writer to accept UUID directly; kafka-connect was not updated to follow. This removes the byte[] conversion so convertUUID always returns a UUID, matching ORC/Avro and the current Parquet writer contract.

Generated-by: Claude Code

github-actions bot added the KAFKACONNECT label on Jul 3, 2026

thswlsqls wants to merge 1 commit into apache:main from thswlsqls:fix/kafka-connect-uuid-parquet-conversion
