feat: support null value messages (tombstones) for compacted topics #304

Open
grishaf wants to merge 3 commits into apache:main from grishaf:feat/null-value-messages

grishaf commented May 18, 2026

Motivation

Currently the Python client cannot send null-value messages, which compacted topics use as tombstones to delete the entry for a specific key. Calling producer.send(None, ...) raises a TypeError because BytesSchema.encode() rejects None, and _build_msg applies a second guard, _check_type(bytes, data, 'data').

The C++ client added MessageBuilder::setNullValue() and Message::hasNullValue() in apache/pulsar-client-cpp#563 (merged April 3, 2026, milestone 4.2.0). The Java client has supported value(null) tombstones since v2.8 via TypedMessageBuilderImpl.beforeSend(), which sets msgMetadata.setNullValue(true) when the value is null.

This PR wraps the new C++ API for Python, using the same pattern as the Java client.
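As a rough illustration of what this enables, the sketch below sends a regular message and then a tombstone for the same key, and reads both back. It assumes a client built from this branch (send(None, ...) and Message.has_null_value() are the additions proposed here); the function name, service URL, topic, and subscription name are hypothetical, and nothing runs until the function is called against a live broker.

```python
def tombstone_roundtrip(service_url="pulsar://localhost:6650",
                        topic="persistent://public/default/null-value-demo"):
    """Send one regular message and one tombstone, then read both back.

    Assumes a pulsar-client build that includes this PR. The URL, topic and
    subscription name are placeholders chosen for illustration.
    """
    import pulsar  # imported lazily so the sketch loads without the client installed

    client = pulsar.Client(service_url)
    try:
        # Subscribe first so both messages are delivered to this subscription.
        consumer = client.subscribe(topic, "null-value-demo-sub")
        producer = client.create_producer(topic)

        producer.send(b"v1", partition_key="k1")
        producer.send(None, partition_key="k1")  # tombstone for k1 (needs this PR)

        results = []
        for _ in range(2):
            msg = consumer.receive(timeout_millis=5000)
            # has_null_value() is the accessor proposed in this PR.
            results.append((msg.partition_key(), msg.has_null_value()))
            consumer.acknowledge(msg)
        return results
    finally:
        client.close()
```

Keying both messages with partition_key matters here because compaction works per key: the tombstone only has meaning relative to earlier messages with the same key.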

Modifications

pybind11 bindings (src/message.cc)

  • Added set_null_value binding on MessageBuilder, wrapping MessageBuilder::setNullValue()
  • Added has_null_value binding on Message, wrapping Message::hasNullValue()

Python wrapper (pulsar/__init__.py)

  • Message.has_null_value() — check if a received message is a tombstone
  • Producer._build_msg() — when content is None, skip schema encoding and call mb.set_null_value() instead of mb.content(data) (same pattern as Java's beforeSend())
  • Updated send() and send_async() docstrings to document None content
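The None branch described above can be sketched roughly as follows. This is not the code in pulsar/__init__.py: FakeMessageBuilder, build_msg, and bytes_encode are hypothetical stand-ins so the snippet runs without the native extension, and only the method names content and set_null_value come from this PR.

```python
class FakeMessageBuilder:
    """Hypothetical stand-in for the pybind11 MessageBuilder; it only records calls."""

    def __init__(self):
        self.payload = None
        self.null_value = False

    def content(self, data):
        self.payload = data

    def set_null_value(self):
        self.null_value = True


def bytes_encode(value):
    """Mimics a bytes schema that rejects anything that is not bytes."""
    if not isinstance(value, bytes):
        raise TypeError("expected bytes")
    return value


def build_msg(encode, content, mb=None):
    """Sketch of the None branch: skip schema encoding and flag the message as null."""
    mb = mb if mb is not None else FakeMessageBuilder()
    if content is None:
        # Tombstone: no schema encoding and no payload, only the null-value flag.
        mb.set_null_value()
        return mb
    data = encode(content)
    if not isinstance(data, bytes):
        raise TypeError("data must be bytes")
    mb.content(data)
    return mb


tombstone = build_msg(bytes_encode, None)
empty = build_msg(bytes_encode, b"")
```

Checking for None before calling the schema encoder is what keeps b"" and None distinct: the empty payload still goes through content(), while the tombstone never reaches the encoder at all.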

Dependencies (dependencies.yaml)

  • Bumped pulsar-cpp from 4.1.0 to 4.2.0

Tests (tests/pulsar_test.py)

  • test_null_value_message — send/receive null and non-null messages, verify has_null_value()
  • test_null_value_vs_empty_bytes — verify b"" and None are distinct
  • test_null_value_compaction — tombstoned keys disappear after topic compaction
  • test_null_value_table_view — tombstoned keys removed from TableView
  • test_null_value_with_properties — properties survive on null-value messages
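The behaviour the compaction and TableView tests check can be modelled in a few lines. This is a simplified in-memory model, not broker code: compact and the sample log are hypothetical, and real compaction operates on ledgers rather than a Python list.

```python
def compact(log):
    """Keep the latest value per key; a None value (tombstone) deletes the key."""
    latest = {}
    for key, value in log:
        if value is None:
            # Tombstone: drop whatever was stored for this key.
            latest.pop(key, None)
        else:
            latest[key] = value
    return latest


log = [
    ("k1", b"v1"),
    ("k2", b"v2"),
    ("k3", b"v3"),
    ("k1", None),  # tombstone: k1 disappears from the compacted view
    ("k2", b""),   # empty payload: k2 stays, with an empty value
]
view = compact(log)
```

In this model k1 is gone after compaction while k2 survives with an empty value, which is the distinction test_null_value_vs_empty_bytes is meant to protect.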

Blocked on

This PR requires pulsar-client-cpp >= 4.2.0, which has not been released yet (v4.1.0 is the latest as of May 2026). The setNullValue()/hasNullValue() APIs are merged into C++ main but not yet in a release. CI will not pass until 4.2.0 ships.

Made with Cursor

grishaf force-pushed the feat/null-value-messages branch 2 times, most recently from 024c74c to dbe4502 (May 18, 2026 20:37)

Add support for sending and detecting null value messages, which are
used as tombstones on compacted topics to delete entries for specific
keys. This wraps the C++ client's MessageBuilder::setNullValue() and
Message::hasNullValue() APIs added in pulsar-client-cpp#563.

Changes:
- Bump pulsar-cpp dependency to 4.2.0
- Add pybind11 bindings for set_null_value and has_null_value
- Allow Producer.send(None) to produce a null value message
- Add Message.has_null_value() to detect tombstone messages
- Skip schema encoding when content is None (mirrors Java client)
- Add integration tests for null values, compaction, and table view

Requires pulsar-client-cpp >= 4.2.0 (not yet released).

Co-authored-by: Cursor <cursoragent@cursor.com>

grishaf force-pushed the feat/null-value-messages branch from dbe4502 to d293511 (May 19, 2026 08:21)

grishaf commented May 19, 2026

Blocked on two upstream dependencies:

  1. Waiting for pulsar-client-cpp >= 4.2.0 release — the setNullValue() / hasNullValue() C++ APIs were merged into main (apache/pulsar-client-cpp#563) but not yet released. dependencies.yaml is set to 4.2.0 anticipating the release.

  2. Waiting for broker fix apache/pulsar#25817 — non-batched null-value messages are not removed during topic compaction due to a bug in extractKeyAndSize(). The Python compaction test (test_null_value_compaction) depends on this broker fix to pass.

grishaf marked this pull request as ready for review June 16, 2026 07:45
grishaf closed this Jun 16, 2026
grishaf reopened this Jun 16, 2026
grishaf and others added 2 commits June 16, 2026 11:53
4.2.0 is not yet released on archive.apache.org, so CI fails downloading
the C++ client .deb. Revert to the released 4.1.0 to unblock CI until
pulsar-client-cpp 4.2.0 ships.

Co-authored-by: Cursor <cursoragent@cursor.com>

4.2.2 is the latest 4.x release and includes the null-value compaction
fix (apache/pulsar#25817, shipped in release/4.2.2). This lets
test_null_value_compaction pass; 4.0.0 predated the fix.

Co-authored-by: Cursor <cursoragent@cursor.com>

grishaf commented Jun 16, 2026

Update: both blockers from my earlier comment are now resolved on this branch.

  1. pulsar-client-cpp dependency: reverted dependencies.yaml to the released 4.1.0 (commit 57e0a9f). The new set_null_value/has_null_value bindings compile and link fine against 4.1.0, and the CI run on that commit confirmed the build succeeds with 92/93 tests passing.
  2. Broker compaction fix: bumped the CI test broker to apachepulsar/pulsar:4.2.2 (commit 94bdd68), the latest 4.x release, which contains the null-value compaction fix from apache/pulsar#25817, "[fix][broker] Fix non-batched null-value messages not removed during topic compaction" (shipped in release/4.2.2; also backported to 4.0.11 and 4.1.4). The previously failing test_null_value_compaction depended on this broker-side fix.

With these two commits the PR should be fully green. CI for the latest commit is currently in action_required (waiting on a maintainer to approve the workflow run on this fork PR) — could a committer kick it off? Thanks!
