ggcrunchy

This is a follow-up to #657, dealing with audio.onComplete.


@naveen-pcs pointed out on the forum that there were still audio crashes in 3701.

The stack trace has a telltale ~BaseResourceHandle(). After digging around, it seems the BaseResource in question was basically a way for the audio task to detect if the underlying Lua state has been invalidated and, if so, do nothing. Unfortunately, when the resource is set up, it updates a shared reference count, but on the audio thread and without any synchronization. Sure enough, that count must get out of whack and lead to something like a double free of the shared memory.

The audio thread is started and closed within the Lua state lifetime, and I don't see any way for the tasks to execute outside this range, since the scheduler is similarly contained. What I ended up doing, therefore, was to create a single PlatformNotifier belonging to the Runtime (so the reference count is only touched on the main thread) and to send the tasks through that. (All the accesses are read-only aside from the already-hardened Scheduler methods.)

I'm also now using Lua references rather than memory-allocated objects to track channel usage / onComplete callbacks, and sending them on through the Scheduler. This avoids some memory churn and simplifies cleanup.
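As a rough illustration of the shape described above, here is a minimal sketch in which a completion event is just a pair of integers handed through a queue and dispatched on the main thread by one long-lived notifier. All names (`CompletionTask`, `TaskQueue`, `Notifier`) are hypothetical stand-ins rather than the actual Solar2D classes, `callbackRef` merely plays the role of a Lua registry reference, and a plain `std::mutex` stands in for whatever the Scheduler really uses.

```cpp
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical: a completion event is plain data, so there is no heap
// object whose lifetime has to be managed across threads.
struct CompletionTask {
    int channel;
    int callbackRef; // stand-in for a Lua registry reference
};

// Hypothetical stand-in for the Scheduler: any thread may Post(),
// only the main thread calls Drain().
class TaskQueue {
public:
    void Post(CompletionTask task) {
        std::lock_guard<std::mutex> guard(fMutex);
        fPending.push_back(task);
    }

    // Hands back everything posted so far and leaves the queue empty.
    std::vector<CompletionTask> Drain() {
        std::lock_guard<std::mutex> guard(fMutex);
        std::vector<CompletionTask> out;
        out.swap(fPending);
        return out;
    }

private:
    std::mutex fMutex;
    std::vector<CompletionTask> fPending;
};

// Hypothetical stand-in for the single notifier owned by the runtime.
// It is created, used, and destroyed on the main thread only, so any
// reference counting behind it is never touched from the audio thread.
class Notifier {
public:
    explicit Notifier(std::function<void(const CompletionTask&)> dispatch)
        : fDispatch(std::move(dispatch)) {}

    // Called on the main thread, e.g. once per frame.
    void Process(TaskQueue& queue) {
        for (const CompletionTask& task : queue.Drain()) {
            fDispatch(task);
        }
    }

private:
    std::function<void(const CompletionTask&)> fDispatch;
};
```

In this sketch the audio thread would only ever call `Post()`, while the callback lookup and the call into Lua would happen inside the dispatch function on the main thread.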

For that matter, the channel-to-callback mapping now also has some synchronization, via a per-channel atomic_flag. I forget the exact details at the moment, but the channel management looked shaky in its previous form (state was definitely being touched from more than one thread). I haven't tested whether this was relevant to #19.
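To make the per-channel flag idea concrete, here is a minimal sketch of a channel-to-callback table where each slot is guarded by its own `std::atomic_flag`. The class name, the channel count of 32, and the `Set`/`Take` methods are assumptions for illustration and not the code in this PR; `kNoRef` mirrors the value of Lua's `LUA_NOREF` (-2) so that "no ref" can mean "channel is empty".

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical constants; the real channel count may differ.
constexpr std::size_t kMaxChannels = 32;
constexpr int kNoRef = -2; // same value as LUA_NOREF: slot is empty

class ChannelCallbacks {
public:
    ChannelCallbacks() {
        for (auto& flag : fLocks) {
            flag.clear(); // start every slot unlocked
        }
        fRefs.fill(kNoRef);
    }

    // Record the callback reference for a channel.
    void Set(std::size_t channel, int ref) {
        Lock(channel);
        fRefs[channel] = ref;
        Unlock(channel);
    }

    // Remove and return the reference in one guarded step, so two
    // threads can never both walk away with the same callback.
    int Take(std::size_t channel) {
        Lock(channel);
        int ref = fRefs[channel];
        fRefs[channel] = kNoRef;
        Unlock(channel);
        return ref;
    }

private:
    void Lock(std::size_t channel) {
        assert(channel < kMaxChannels);
        // Spin until this slot's flag was previously clear.
        while (fLocks[channel].test_and_set(std::memory_order_acquire)) {
        }
    }

    void Unlock(std::size_t channel) {
        fLocks[channel].clear(std::memory_order_release);
    }

    std::array<std::atomic_flag, kMaxChannels> fLocks;
    std::array<int, kMaxChannels> fRefs;
};
```

Guarding each slot separately means activity on one channel never waits on another, and the critical sections are only a couple of integer reads and writes.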


I've currently only commented out the old code, not removed it. On request, I believe I can explain most, if not all, of those race bugs described in the comments, in light of my analysis.


The previous mutex implementation in the Scheduler is now also based on atomics.
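For readers unfamiliar with the approach, one common way to build a mutex out of atomics is a spinlock over `std::atomic_flag`. The sketch below is a generic example of that technique, with a hypothetical `SpinLock` name; it is not necessarily how the Scheduler in this PR does it.

```cpp
#include <atomic>
#include <mutex> // std::lock_guard

// Hypothetical minimal spinlock. It satisfies the standard Lockable
// requirements, so it works with std::lock_guard.
class SpinLock {
public:
    SpinLock() { fFlag.clear(); } // start unlocked

    void lock() {
        // test_and_set returns the previous value: keep spinning
        // while some other thread already holds the lock.
        while (fFlag.test_and_set(std::memory_order_acquire)) {
        }
    }

    bool try_lock() {
        return !fFlag.test_and_set(std::memory_order_acquire);
    }

    void unlock() {
        fFlag.clear(std::memory_order_release);
    }

private:
    std::atomic_flag fFlag;
};
```

A spinlock like this is only a reasonable trade when the guarded sections are very short, such as pushing or popping a task, since waiting threads burn CPU instead of sleeping.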


Attached is an Android build for testing: Corona.aar.zip (You can unzip this and replace Corona.aar in your Native directory if you want to test on Android. Might want to save your old one, too.)

The build is based on this PR and #661.

With this I was able to run the test from #296 on Windows, Mac, and Android without problems such as the loading issues mentioned in Discord or the Back button bug from #663.

ggcrunchy and others added 23 commits February 7, 2019 17:09
…nd buffer and assignment logic

Next up, prepare details from graphics.defineEffect()
…from Program

Slight redesign of TimeTransform with caching (to synchronize multiple invocations with a frame) and slightly less heavyweight call site
Err, was calling Modulo for sine method, heh

That said, time transform seems to work in basic test
Fixed error detection for positive number time transform inputs

Relaxed amplitude positivity requirement
Maintenance Revert Test
My mistake adding back changes made
Notes for Delete(), in case we want a thread-safe version later
… (or nil ref, when not needed) and no ref means a channel is empty (for purposes of stopping)

Avoids lots of new and delete

More importantly, one notifier with lifetime outside the thread (no unguarded updating of reference counts)

No need for loop to delete them (with potential race) since will go away with Lua

Array of atomic flags to go with channel-to-callbacks, should avoid remaining race possibilities
@Shchvova Shchvova merged commit 0be9501 into coronalabs:master Jan 2, 2024