[MINOR][CORE] Add TaskContext JNI callback for reading Spark task attempt id from native#12435
Open
taiyang-li wants to merge 1 commit into
…tempt id from native

Instead of extending every JNI entry point (createRuntime / MemoryManager create/hold/release) to plumb the Spark task attempt id from Java down to C++ as an extra parameter, expose a small callback surface that native code uses on demand:

- Java side: org.apache.gluten.task.TaskContextJniWrapper#currentTaskAttemptId() reads TaskContext.get().taskAttemptId() on the current thread and returns -1 when there is no task context.
- C++ side: gluten::getCurrentSparkTaskAttemptId() attaches the current thread to the JVM as a daemon on demand and calls back into the Java helper via JNI. The class ref and method id are cached in function-local statics on first use.

Because Spark's TaskContext is a per-thread ThreadLocal, this returns a meaningful value whenever the native call is running on an executor task thread (or any thread inheriting that ThreadLocal), which is exactly when backends need it.

No signature change to Runtime / MemoryManager / RuntimeJniWrapper / NativeMemoryManagerJniWrapper. No behavior change for existing backends (Velox, ClickHouse) that do not query the task attempt id from native.

Co-Authored-By: Aime <aime@bytedance.com>
Change-Id: I3185249796b0c396813dc39f54bd8e8b8589ca2a
1244485 to 696b750
Background
One of a small series of common-code changes to introduce a new backend — Bolt (ByteDance unified lakehouse analytics acceleration engine) — into the Gluten community. The series minimizes the delta to Gluten common code so Bolt can plug in cleanly, while leaving Velox and ClickHouse backends unaffected.
This PR
The Bolt backend needs the Spark task attempt id at the native layer for task-level identification (memory pool naming, spill directories, logs, etc.). Instead of threading the value through every JNI entry point (which would require changing `RuntimeJniWrapper.createRuntime` and the whole `NativeMemoryManagerJniWrapper` create/hold/release chain, plus every backend's `Runtime` / `MemoryManager` factory signature), this PR adds a small JNI callback that native code invokes on demand:

- `org.apache.gluten.task.TaskContextJniWrapper#currentTaskAttemptId()` returns `TaskContext.get().taskAttemptId()`, or `-1L` when there is no task context on the calling thread.
- `gluten::getCurrentSparkTaskAttemptId()` attaches the current thread to the JVM as a daemon on demand and calls the Java helper; the class ref and method id are cached in function-local statics on first use.

Since Spark's `TaskContext` is a per-thread `ThreadLocal`, this returns a meaningful value whenever the native call runs on an executor task thread, which is exactly when backends need it. Existing backends (Velox, ClickHouse) that do not query the task attempt id from native are unaffected.
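
For reviewers who want to see the thread-locality and the `-1L` fallback in isolation, here is a minimal sketch of the Java-side lookup. It is not the code in this PR: `FakeTaskContext` is a hypothetical stand-in for Spark's `TaskContext` so the example runs without Spark on the classpath, and `TaskAttemptIdSketch` and `idOnFreshThread` are illustrative names only.

```java
public class TaskAttemptIdSketch {

    // Hypothetical stand-in for org.apache.spark.TaskContext: a per-thread
    // holder whose get() returns null when nothing is set on this thread.
    public static final class FakeTaskContext {
        private static final ThreadLocal<FakeTaskContext> CURRENT = new ThreadLocal<>();
        private final long taskAttemptId;

        public FakeTaskContext(long taskAttemptId) {
            this.taskAttemptId = taskAttemptId;
        }

        public static FakeTaskContext get() {
            return CURRENT.get();
        }

        public static void set(FakeTaskContext tc) {
            CURRENT.set(tc);
        }

        public static void unset() {
            CURRENT.remove();
        }

        public long taskAttemptId() {
            return taskAttemptId;
        }
    }

    // Mirrors the contract described for currentTaskAttemptId():
    // the attempt id of the calling thread's context, or -1L if there is none.
    public static long currentTaskAttemptId() {
        FakeTaskContext tc = FakeTaskContext.get();
        return tc == null ? -1L : tc.taskAttemptId();
    }

    // Runs the lookup on a brand-new thread that never had a context set.
    public static long idOnFreshThread() {
        long[] result = new long[1];
        Thread t = new Thread(() -> result[0] = currentTaskAttemptId());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        FakeTaskContext.set(new FakeTaskContext(42L));
        try {
            System.out.println("on task thread:  " + currentTaskAttemptId());
            System.out.println("on fresh thread: " + idOnFreshThread());
        } finally {
            FakeTaskContext.unset();
        }
    }
}
```

The stand-in uses a plain `ThreadLocal`, so a freshly created thread sees no context and gets the `-1L` sentinel. Native callers would presumably need to treat that value as "no task" rather than as a real attempt id.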