-
Notifications
You must be signed in to change notification settings - Fork 38
Description
Hi!
While going through a rabit hole together with @vidocco, we found that the amqp
library uses the dangerous forkIO
.
Our issue is as follows:
We have a worker executable that should watch for certain messages and "deal" with thim.
This worker executable should keep running at all times.
However, consumeMsgs
is non-blocking, so we use something like takeMVar
to block until a certain variable is filled by our connectionClosedHandler
.
So far so good, but this has the following problem:
If the handler ever stops receiving messages without closing the connection, then our worker will hang indefinitely. (Any advice on how to do this more robustly would be appreciated.)
We recently had an incident where the worker stopped processing messages without crashing, so @vidocco and I went went to investigate.
Our hypothesis, after reading the amqp
code, is that the following might have happened:
The channelReceiver
seems to be the function that is responsible for handling incoming messages and is started using forkFinally
, which uses forkIO
.
This means that if this thread ever stops looping (because it throws an exception for example), the main thread will not notice this and the connection will not be closed either.
Here is an example to show that that's how forkIO
works:
#!/usr/bin/env stack
-- stack --resolver lts-15.15 script
{-# LANGUAGE NumericUnderscores #-}
import Control.Concurrent
main :: IO ()
main = do
putStrLn "Starting our 'server'."
forkIO $ do
putStrLn "Serving..."
threadDelay 1_000_000
putStrLn "Oh no, about to crash!"
threadDelay 1_000_000
putStrLn "Aaaargh"
undefined
threadDelay 5_000_000
putStrLn "Still running, eventhough we crashed"
threadDelay 5_000_000
putStrLn "Ok that's enough of that, stopping here."
Which outputs:
$ ./test.hs
Starting our 'server'.
Serving...
Oh no, about to crash!
Aaaargh
test.hs: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
undefined, called at /home/syd/test/test.hs:17:5 in main:Main
Still running, eventhough we crashed
Ok that's enough of that, stopping here.
We haven't found which part of the channelReceiver
might have thrown the exception that may have caused the handler thread to stop, but we found some usages of error
that may have been the culprit:
Line 667 in e0cf5cc
basicReturnToPublishError x = error $ "basicReturnToPublishError fail: "++show x Line 664 in e0cf5cc
num -> error $ "unexpected return error code: " ++ (show num) Line 548 in e0cf5cc
msgFromContentHeaderProperties c _ = error ("Unknown content header properties: " ++ show c) Line 52 in e0cf5cc
intToDeliveryMode n = error ("Unknown delivery mode int: " ++ show n)
In short; forkIO
is dangerous and should be avoided. You can use async
together with link
instead.