fix(dsl): compute cron wait at execution time, not at df.start() time #183
Open
Copilot wants to merge 3 commits into
Copilot AI changed the title from "[WIP] Fix df.wait_for_schedule to compute wait at execution time" to "fix(dsl): compute cron wait at execution time, not at df.start() time" on May 27, 2026
df.wait_for_schedule() previously baked the wait duration at df.start() time via Utc::now(), which meant any delay between start and execution — and critically every iteration of a recurring `@>` loop — woke at the wrong moment (the stale, reused target busy-spun with wait=0 after the first tick). Now the DSL only validates the cron expression and stores it; the next tick is computed inside the orchestration using duroxide's deterministic clock (ctx.utc_now()) plus pure cron math, so it is replay-safe and correct for both single-shot and recurring waits. A NOTE references duroxide issue #34 (absolute-deadline timer) so this can later be simplified to ctx.schedule_timer_until(next).
63cb0ca to 24b0046
Adds 24_wait_for_schedule_exec_time.sql, which fails under the old df.start()-time cron computation and passes with the new execution-time computation. A df.sleep(30) before the wait introduces a start->execution delay; the new code recomputes the next ':00' tick at execution time (fires near the minute boundary, second ~= 0) while the old code reused a fixed offset and would fire ~30s into the minute. Asserts the fire lands before second 15 (the midpoint) to distinguish the two implementations.
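The reasoning behind the second-15 threshold can be sketched with plain integer arithmetic. This is a simplified model, not the test itself: the function names are hypothetical, times are seconds since the Unix epoch, the schedule is assumed to be "every minute at :00", and the start time is an arbitrary value chosen to sit 10 seconds into a minute.

```rust
/// Seconds from `now` until the next minute boundary (the next ':00' tick).
/// Exactly on a boundary, this returns 60, i.e. the tick strictly after `now`.
fn secs_until_next_minute(now: u64) -> u64 {
    60 - (now % 60)
}

/// Model of the old behaviour: the wait is computed once at start time
/// and then applied as a fixed offset after the intervening delay.
fn fire_time_start_baked(start: u64, delay: u64) -> u64 {
    let baked_wait = secs_until_next_minute(start);
    start + delay + baked_wait
}

/// Model of the new behaviour: the wait is computed when the wait node
/// actually executes, i.e. after the delay has elapsed.
fn fire_time_exec_computed(start: u64, delay: u64) -> u64 {
    let exec = start + delay;
    exec + secs_until_next_minute(exec)
}

fn main() {
    let start = 600_010; // hypothetical start, 10 s into a minute
    let delay = 30; // stands in for the df.sleep(30) before the wait

    let old_fire = fire_time_start_baked(start, delay);
    let new_fire = fire_time_exec_computed(start, delay);

    println!("start-time computation fires at second {}", old_fire % 60);
    println!("execution-time computation fires at second {}", new_fire % 60);

    // The threshold at second 15 separates the two behaviours.
    assert!(new_fire % 60 < 15);
    assert!(old_fire % 60 >= 15);
}
```

In this model the start-time computation always lands `delay` seconds past the boundary, whatever the start time is, which is why a 30-second sleep and a threshold at second 15 leave margin on both sides.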
Fixes #130.
`df.wait_for_schedule()` pre-computed `wait_seconds` at graph construction time. Any delay between `df.start()` and when the BGW actually runs the `WAIT_SCHEDULE` node caused the timer to fire early, potentially before the intended cron tick. Worse, in a recurring `@>` loop the offset was baked once and reused on every `continue_as_new` iteration, so after the first tick the wait collapsed to ~0 and the loop busy-spun.

Approach

The next cron tick is a function of "now", so it must be computed when the node executes, not at `df.start()` time. Inside the orchestration we read the current time via `ctx.utc_now()` (duroxide's deterministic clock, whose value is recorded in history and replayed verbatim) and then do pure cron math against it. This is fully replay-safe (the only non-determinism, the clock read, is the recorded syscall) and needs no extra activity.

This supersedes the earlier draft of this PR, which stored a `target_timestamp` at DSL time and added a `compute_cron_wait` activity. That approach was both unnecessary (`ctx.utc_now()` already gives a deterministic clock read inside the orchestration) and incorrect for recurring loops (the timestamp went stale across iterations).

Changes

- `src/dsl.rs`: `df.wait_for_schedule()` now only validates the cron expression eagerly (so a bad expression still fails fast at `df.start()`) and stores just `{"cron_expr": ...}`; removed the DSL-time `Utc::now()` / `wait_seconds` computation.
- `src/orchestrations/execute_function_graph.rs`: `execute_wait_schedule_node` reads `ctx.utc_now()`, computes the next cron tick from that instant, and schedules `schedule_timer(next - now)`. A NOTE points at duroxide #34 (absolute-deadline timer) for a future simplification to `schedule_timer_until(next)`.
- `src/explain.rs`: WAIT_SCHEDULE display is now `WAIT '<cron>'` (the precomputed `(Ns)` is gone, since the wait is no longer known at plan time).
- `src/lib.rs`: unit test asserts the node config keeps `cron_expr` and contains neither `wait_seconds` nor `target_timestamp`.

Config shape change

Before:
{"cron_expr": "*/5 * * * *", "wait_seconds": 142}

After:
{"cron_expr": "*/5 * * * *"}
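To illustrate why the wait has to be derived from "now" on every iteration, here is a minimal sketch of the recurring case. It is not the code in `execute_wait_schedule_node`: it uses no cron parser and no duroxide API, all function names are hypothetical, and the `now` value passed in merely stands in for the recorded `ctx.utc_now()` read. The schedule is assumed to be `*/N * * * *` with N dividing 60, so ticks align with multiples of N minutes since the Unix epoch.

```rust
/// Next tick strictly after `now` for an "every `step_min` minutes" schedule.
/// Assumes `step_min` divides 60; `now` is seconds since the Unix epoch.
fn next_tick(now: u64, step_min: u64) -> u64 {
    let period = step_min * 60;
    (now / period + 1) * period
}

/// Simplified model of a target fixed once and reused across iterations:
/// once the target has passed, the wait saturates at 0 (a busy spin).
fn wait_with_stale_target(now: u64, target: u64) -> u64 {
    target.saturating_sub(now)
}

/// Wait derived from the current instant each time the node executes.
fn wait_recomputed(now: u64, step_min: u64) -> u64 {
    next_tick(now, step_min) - now
}

fn main() {
    let step_min = 5;
    // Hypothetical start, 158 s into a 5-minute period, so the first wait
    // is 142 s, matching the "wait_seconds": 142 in the old config shape.
    let start: u64 = 3_000_158;
    let stale_target = next_tick(start, step_min);

    let mut now = start;
    for i in 1..=3 {
        let stale = wait_with_stale_target(now, stale_target);
        let fresh = wait_recomputed(now, step_min);
        println!("iteration {i}: stale-target wait = {stale}s, recomputed wait = {fresh}s");
        // Advance to the moment the correctly scheduled timer would fire.
        now += fresh;
    }
}
```

Both columns agree on the first iteration (142 s). From the second iteration on, the stale-target wait is 0 while the recomputed wait is a full 300 s, which is the busy-spin described above in miniature.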