Skip to content

Dangerous uses of forkIO #96

@NorfairKing

Description

@NorfairKing

Hi!

While going through a rabit hole together with @vidocco, we found that the amqp library uses the dangerous forkIO.

Our issue is as follows:
We have a worker executable that should watch for certain messages and "deal" with thim.
This worker executable should keep running at all times.
However, consumeMsgs is non-blocking, so we use something like takeMVar to block until a certain variable is filled by our connectionClosedHandler.
So far so good, but this has the following problem:
If the handler ever stops receiving messages without closing the connection, then our worker will hang indefinitely. (Any advice on how to do this more robustly would be appreciated.)

We recently had an incident where the worker stopped processing messages without crashing, so @vidocco and I went went to investigate.
Our hypothesis, after reading the amqp code, is that the following might have happened:

The channelReceiver seems to be the function that is responsible for handling incoming messages and is started using forkFinally, which uses forkIO.
This means that if this thread ever stops looping (because it throws an exception for example), the main thread will not notice this and the connection will not be closed either.

Here is an example to show that that's how forkIO works:

#!/usr/bin/env stack
-- stack --resolver lts-15.15 script
{-# LANGUAGE NumericUnderscores #-}
import Control.Concurrent
main :: IO ()
main = do
  putStrLn "Starting our 'server'."
  forkIO $ do            
    putStrLn "Serving..."
    threadDelay 1_000_000
    putStrLn "Oh no, about to crash!"
    threadDelay 1_000_000
    putStrLn "Aaaargh"
    undefined
  threadDelay 5_000_000
  putStrLn "Still running, eventhough we crashed"
  threadDelay 5_000_000                 
  putStrLn "Ok that's enough of that, stopping here."

Which outputs:

$ ./test.hs
Starting our 'server'.
Serving...
Oh no, about to crash!
Aaaargh
test.hs: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:80:14 in base:GHC.Err
  undefined, called at /home/syd/test/test.hs:17:5 in main:Main
Still running, eventhough we crashed
Ok that's enough of that, stopping here.

We haven't found which part of the channelReceiver might have thrown the exception that may have caused the handler thread to stop, but we found some usages of error that may have been the culprit:

In short; forkIO is dangerous and should be avoided. You can use async together with link instead.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions