Description
Bug summary
When a deployment in Prefect 2.18 is configured with a regular schedule (for example, an hourly IntervalSchedule), the scheduler sometimes emits two flow-run creation events for the same scheduled timestamp. In practice this looks like:
- A flow run is created at the exact top of the interval (e.g. scheduled start time 07:00:00 PM).
- Within a few seconds, often 10–20 seconds later, a second flow run is created with that same scheduled start time.
- Both runs carry the auto-scheduled tag and point to the same deployment.
In the attached examples, each pair of duplicate runs has essentially the same scheduled start time (06:03 and 07:03) but a different actual submission timestamp (…:00 vs. …:15, and …:59 vs. …:13), so we end up with two runs per interval.
This is causing:
- Wasted resources, since one of the two runs repeats tasks that already completed in a successful flow run.
- Data skew, because downstream tasks re-process the same input data, which causes issues.
- Confusion in monitoring, since our dashboards assume exactly one run per interval.
This needs to be investigated, and we would appreciate a fix or a recommended workaround. I have attached some screenshots for reference.
Note
We cannot yet upgrade to Prefect 3 (see our blocker issue: #15751).
Please explain why the duplicate flow run gets scheduled at all, and suggest what we can do about it.
We have written internal code to handle this, using read_flow_runs to cancel a flow run if there are any Running flow runs for the same deployment. Even this workaround does not work reliably, because the response is async.
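For reference, here is a minimal sketch of the duplicate check we are aiming for. It is not our production code and makes no Prefect calls: `RunInfo` and `find_duplicate_runs` are hypothetical names, and `RunInfo` only stands in for the fields we read from each flow run returned by read_flow_runs. It assumes both runs in a pair carry exactly the same scheduled start time; if the times differ slightly, they would need to be rounded to the interval first.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RunInfo:
    """Hypothetical stand-in for the fields read from each flow run."""
    run_id: str
    deployment_id: str
    scheduled_start: datetime  # scheduled start time of the run
    created: datetime          # when the run was actually created


def find_duplicate_runs(runs):
    """Return runs that share a deployment and scheduled start with an earlier run."""
    groups = defaultdict(list)
    for run in runs:
        groups[(run.deployment_id, run.scheduled_start)].append(run)

    duplicates = []
    for group in groups.values():
        # Keep the earliest-created run; ties are broken by run_id so the
        # choice is deterministic even if two runs share a creation time.
        ordered = sorted(group, key=lambda r: (r.created, r.run_id))
        duplicates.extend(ordered[1:])
    return duplicates


# Illustrative data only: two runs for the same 07:00 PM slot, created 15 s apart.
slot = datetime(2024, 1, 1, 19, 0, 0)
runs = [
    RunInfo("run-a", "deployment-1", slot, datetime(2024, 1, 1, 19, 0, 0)),
    RunInfo("run-b", "deployment-1", slot, datetime(2024, 1, 1, 19, 0, 15)),
]
to_cancel = find_duplicate_runs(runs)  # contains only run-b
```

Keying on the scheduled start time rather than on the Running state means the check does not depend on how far either run has progressed when it executes, which is where our current approach seems to race.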
Version info
Version: 2.18.0
API version: 0.8.4
Python version: 3.12.3
Git commit: 1006d2d8
Built: Thu, Apr 18, 2024 4:47 PM
OS/Arch: linux/x86_64
Profile: default
Server type: cloud
Additional context
No response