

Mutable State in Haskell

Haskell is a purely functional language, which means there are no side-effects and all variables are immutable. But as you probably know this isn’t completely true. All variables are indeed immutable, but there are ways to construct mutable references where we can change what the reference points to.

Without side effects we wouldn’t be able to do much, which is why Haskell gives us the IO monad. In a similar manner, there are several ways to achieve mutable state in Haskell; let’s take a look at them:

IORef

We all know that the IO monad allows us to perform arbitrary effects in the real world, so it probably comes as no surprise that it also allows us to create a mutable reference to any type, called IORef (from Data.IORef.) There is nothing complicated about IORef, as it only takes a single type parameter, which is the type of the value it’s going to contain.

Before we move into specifics, it is important to note that modifying an IORef is not a pure operation, which means every single operation on an IORef will be inside the IO monad.

Let’s take a look at some of the functions available for manipulating IORefs.

data IORef a

newIORef    :: a -> IO (IORef a)
readIORef   :: IORef a -> IO a
writeIORef  :: IORef a -> a -> IO ()
modifyIORef :: IORef a -> (a -> a) -> IO ()

The first thing you’ll probably notice is that in order to create an IORef we need to give it a value. An IORef must always contain a value of a given type; it is impossible to create an empty one. Here’s a simple example.

import Data.IORef

main :: IO ()
main = do
    ref <- newIORef (0 :: Int)

    modifyIORef ref (+1)

    readIORef ref >>= print

I’ve used 0 :: Int instead of just 0 to make it explicit that we’re using Ints. If you don’t do that it won’t affect the program, but you might get a warning from the compiler.
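Alternatively, you could annotate the whole expression rather than just the literal, which amounts to the same thing:

-- inside the same do block as before
ref <- newIORef 0 :: IO (IORef Int)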

There’s not much really happening in this example: we just create a new IORef, increase its value by 1 and then print the result. While this is nice it doesn’t really show much, so let’s make things more complicated.

A common pattern in Haskell is to take an immutable data structure and put it inside a mutable reference, which basically gives you a mutable version of that data structure (let’s ignore the fact that there might be a more efficient way to do this for now.) This will work because we can take any Haskell type and put it into an IORef. Let’s begin by using Maybe Int to represent a mutable box for an Int which can be empty.

magic :: IORef (Maybe Int) -> IO ()
magic ref = do
    value <- readIORef ref

    case value of
        Just _ -> return ()
        Nothing -> writeIORef ref (Just 42)

main :: IO ()
main = do
    ref <- newIORef Nothing
    magic ref

    readIORef ref >>= print

First we define a function which takes an IORef (Maybe Int), that is, a mutable reference which may contain an Int, and produces some side effects. The implementation simply reads the IORef and does nothing if it already has a value, but if it contains Nothing it replaces that with Just 42. Our main function then simply prints the contents of the IORef, which is Just 42.

In-place bubble sort with IORef

If you’ve read this far there’s a fair chance that you know how bubble sort works. The important thing about it is that it works in-place and modifies the array it is sorting. Here’s a simple implementation in Ruby.

def bubble_sort(list)
  list.each_index do |i|
    (list.length - i - 1).times do |j|
      if list[j] > list[j + 1]
        list[j], list[j + 1] = list[j + 1], list[j]
      end
    end
  end
end

The key part here is that we’re swapping the elements of the list as we iterate through it. This is something we can’t do in pure Haskell, but we can attempt to do it using IORefs.

We will use a simple Haskell list where each element is an IORef Int, so that we can swap the values around. The exact type will be [IORef Int].

Disclaimer: I am aware that using a list, which is a linked list, is a horribly inefficient implementation. The point of this article is however to show how IORef can be used, not how to properly sort an array.

Our sorting function will accept a plain list of Ints, wrap them all in IORefs, do the sorting in place, and unwrap the IORefs to return a list of Ints again.

import Control.Monad (forM_, when)
import Data.IORef

bubbleSort :: [Int] -> IO [Int]
bubbleSort input = do
    let ln = length input

    xs <- mapM newIORef input

    forM_ [0..ln - 1] $ \_ -> do
        forM_ [0..ln - 2] $ \j -> do
            let ix = xs !! j
            let iy = xs !! (j + 1)

            x <- readIORef ix
            y <- readIORef iy

            when (x > y) $ do
                writeIORef ix y
                writeIORef iy x

    mapM readIORef xs

Let’s go through the code one step at a time. First we need to calculate the length of the list being sorted and bind that to a variable.

let ln = length input

Next we wrap all of the items in the list inside an IORef. This will allow us to do the sort in-place by swapping around the values of the references.

xs <- mapM newIORef input

Let’s examine the mapM here a little bit. The newIORef function has the type a -> IO (IORef a); if we simply map it over the list, we get back the following.

λ> :t map newIORef
map newIORef :: [a] -> [IO (IORef a)]

This is not very useful for us, since we need a [IORef a]. Fortunately Haskell provides the sequence function, which at this type is [IO a] -> IO [a], and simply pulls the monadic effects out of a list.

λ> :t sequence . map newIORef
sequence . map newIORef :: [a] -> IO [IORef a]

mapM f is simply defined as a shorthand for sequence . map f. There also exists forM, which is exactly like mapM but with the arguments swapped around.

λ> :t mapM
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
λ> :t forM
forM :: Monad m => [a] -> (a -> m b) -> m [b]

One last pair of variants is mapM_ and forM_, which are the same as mapM and forM, except that their return value is discarded.

λ> :t mapM_
mapM_ :: Monad m => (a -> m b) -> [a] -> m ()
λ> :t forM_
forM_ :: Monad m => [a] -> (a -> m b) -> m ()

We chose forM_ because the function we pass in as an argument is quite long, which makes it syntactically more pleasing, and because we only care about the effects produced by the function we apply. The outer [0..ln - 1] repeats the whole pass ln times, while the inner [0..ln - 2] gives us one index j for each pair of adjacent elements, so the comparison runs ln - 1 times per pass.

forM_ [0..ln - 1] $ \_ -> do
    forM_ [0..ln - 2] $ \j -> do

Next we extract two adjacent items from the list; note that these have the type IORef Int.

let ix = xs !! j
let iy = xs !! (j + 1)

We need to read the values from the IORefs in order to be able to compare them

x <- readIORef ix
y <- readIORef iy

and then simply swap the contents if x > y

when (x > y) $ do
    writeIORef ix y
    writeIORef iy x

The last step is to unwrap the IORefs.

mapM readIORef xs

Now that we’ve gone through each of the steps, let’s test our bubble sort implementation.

λ> bubbleSort [1,2,3,4]
[1,2,3,4]
λ> bubbleSort [4,3,2,1]
[1,2,3,4]
λ> bubbleSort [4,99,23,93,17]
[4,17,23,93,99]

It works! Keep in mind that this implementation is horribly slow. If you’re interested in fast arrays in Haskell check out the vector library.
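As a rough sketch of what that might look like, here is the same algorithm written against a mutable vector. This uses Data.Vector and Data.Vector.Mutable from the vector package; the name bubbleSortVec is just for illustration.

import Control.Monad (forM_, when)
import qualified Data.Vector as V
import qualified Data.Vector.Mutable as MV

bubbleSortVec :: V.Vector Int -> IO (V.Vector Int)
bubbleSortVec input = do
    -- thaw the immutable vector into a mutable copy we can modify in place
    vec <- V.thaw input
    let ln = MV.length vec

    forM_ [0 .. ln - 1] $ \_ ->
        forM_ [0 .. ln - 2] $ \j -> do
            x <- MV.read vec j
            y <- MV.read vec (j + 1)
            -- swap the two elements directly in the mutable vector
            when (x > y) $ MV.swap vec j (j + 1)

    -- freeze the mutable vector back into an immutable one
    V.freeze vec

The structure mirrors the IORef version, but indexing and swapping are constant-time operations instead of walking a linked list with !!.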

ST monad

You’ve probably noticed that the only reason why we need to perform our sorting algorithm in the IO monad is to have mutable references, which is not ideal since we’re not really doing any IO.

Luckily for us there is a solution called the state thread monad, ST. I won’t go into great detail since the API for STRef is almost exactly the same as for IORef.

data STRef s a

newSTRef    :: a -> ST s (STRef s a)
readSTRef   :: STRef s a -> ST s a
writeSTRef  :: STRef s a -> a -> ST s ()
modifySTRef :: STRef s a -> (a -> a) -> ST s ()

The key difference is that while we can’t ever escape from the IO monad, we do have the ability to escape from the ST monad with the runST :: ST s a -> a function, making the computation pure.

import Control.Monad.ST
import Data.STRef

magic :: Int -> Int
magic x = runST $ do
    ref <- newSTRef x

    modifySTRef ref (+1)

    readSTRef ref

The only thing worth mentioning here compared to the IORef example is that the type of the function magic is just Int -> Int, because we’re able to escape the ST monad using a call to runST.

If you’re not sure why this is useful, think of the sorting algorithm we developed earlier. There are many algorithms which require mutation, but which are also pure in nature. If the only way to achieve mutation was the IO monad, we wouldn’t be able to implement such algorithms in pure code.
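To make that concrete, here is a sketch of the earlier bubble sort rewritten on top of STRef (the name bubbleSort' is just for illustration). Note that thanks to runST the result type is a plain [Int].

import Control.Monad (forM_, when)
import Control.Monad.ST
import Data.STRef

bubbleSort' :: [Int] -> [Int]
bubbleSort' input = runST $ do
    let ln = length input

    -- wrap every element in an STRef instead of an IORef
    xs <- mapM newSTRef input

    forM_ [0 .. ln - 1] $ \_ ->
        forM_ [0 .. ln - 2] $ \j -> do
            let ix = xs !! j
            let iy = xs !! (j + 1)

            x <- readSTRef ix
            y <- readSTRef iy

            -- swap the contents of the two references if they are out of order
            when (x > y) $ do
                writeSTRef ix y
                writeSTRef iy x

    -- unwrap the STRefs; runST then gives us back a pure list
    mapM readSTRef xs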

MVar

The next type we’re going to take a look at is a little bit more complicated than IORef; it’s called an MVar. As usual most of the API is similar, but there is one huge difference: while an IORef must always have a value, an MVar can be empty.

We have two ways of constructing an MVar.

newMVar :: a -> IO (MVar a)
newEmptyMVar :: IO (MVar a)

We also have an additional operation takeMVar :: MVar a -> IO a, which takes the value out of an MVar and leaves it empty. Now comes the important part: if we try to takeMVar from an empty MVar, it will block the thread until someone else puts a value into the MVar. The same thing happens when you try to putMVar into an MVar that already has a value; it will block until someone takes that value out.
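For reference, here are the types of these two blocking operations.

takeMVar :: MVar a -> IO a
putMVar  :: MVar a -> a -> IO ()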

Try compiling and running the following program.

import Control.Concurrent

main :: IO ()
main = do
    a <- newEmptyMVar
    takeMVar a

After a second or so you’ll get an exception and the program will crash.

*** Exception: thread blocked indefinitely in an MVar operation

The reason for this is that there are no other threads that could possibly modify the MVar, so the runtime kills the thread. If we modify the program to first put a value into the MVar it will work correctly.

main :: IO ()
main = do
    a <- newEmptyMVar
    putMVar a "hello"
    takeMVar a >>= print

Now you might be thinking, how does the runtime know that there are no other threads that could put a value into that MVar? Using garbage collection!

Every MVar knows which threads are currently blocked on it. If a thread that is currently blocked on an MVar is not reachable from any other running thread, it will get killed, since there is no way for it to become unblocked.

If you’re interested in more details about this I recommend reading the amazing Parallel and Concurrent Programming in Haskell book, specifically the chapter on how blocked MVars are handled.

Synchronizing threads using MVar

One of the great benefits of MVars is that they can be used as synchronization primitives for communication between threads.

We can use them as a simple one-item channel: we fork a thread that forever loops, taking a value from the MVar and printing it, while the main thread reads input from the user and puts it into the same MVar.

import Control.Monad
import Control.Concurrent

main :: IO ()
main = do
    a <- newEmptyMVar

    forkIO $ forever $ takeMVar a >>= putStrLn

    forever $ do
        text <- getLine
        putMVar a text

Everything will work as expected since takeMVar will block until we put something into the MVar.

One important thing to note here is that when main returns, the runtime automatically kills all of the other running threads; it doesn’t wait for them to finish. Let’s see a simple example.

import Control.Monad
import Control.Concurrent

main :: IO ()
main = do
    forkIO $ do
        threadDelay 2000000
        putStrLn "Hello World"

    putStrLn "Game over!"

If you run this using runhaskell or by compiling and running the binary you’ll only see the output of Game over!. The second thread will never print Hello World, because by the time it starts waiting the main function will return and the runtime will kill the other thread.

We can fix this by using an MVar to make the main function wait for the other thread to finish.

import Control.Monad
import Control.Concurrent

main :: IO ()
main = do
    a <- newEmptyMVar

    forkIO $ do
        threadDelay 2000000
        putStrLn "Hello World"
        putMVar a ()

    takeMVar a
    putStrLn "Game over!"

The main thread first tries to take a value out of the MVar, which blocks because there’s nothing in there yet; the second thread then sleeps for 2 seconds, prints Hello World and puts a () into the MVar. This causes main to continue, print Game over! and exit the program. We could also do this the other way around by using putMVar on a full MVar in order to block, but the end result is the same.

main :: IO ()
main = do
    a <- newMVar ()

    forkIO $ do
        threadDelay 2000000
        putStrLn "Hello World"
        takeMVar a

    putStrLn "Game over!"
    putMVar a ()

There are many more things to cover with respect to MVar, but I’m not going to go more in depth here, since there already are other great resources on the topic.

Software Transactional Memory - STM

Last on our list is Software Transactional Memory. Much like we had IORef and MVar, STM gives us TVar, which stands for transaction variable. The way STM works is that it builds up a log of actions which are then performed atomically. We won’t be covering STM itself as a method for managing concurrency, since that’s a rather lengthy topic. Instead we’ll just examine the options for achieving mutable state with STM using a TVar.

Every STM operation happens inside the STM monad, which already tells us that we can chain multiple STM operations into one (since the monad instance provides us with >>=.) In order to run the actual STM transaction we must use the function atomically :: STM a -> IO a, which takes any STM operation and performs it in a single atomic step.

The API for creating TVars is almost the same as for IORefs.

data TVar a

newTVar    :: a -> STM (TVar a)
readTVar   :: TVar a -> STM a
writeTVar  :: TVar a -> a -> STM ()
modifyTVar :: TVar a -> (a -> a) -> STM ()

There are also alternatives that work in the IO monad.

newTVarIO   :: a -> IO (TVar a)
readTVarIO  :: TVar a -> IO a

Note that these are just convenience functions that we could have implemented ourselves using the atomically function.

newTVarIO :: a -> IO (TVar a)
newTVarIO = atomically . newTVar

readTVarIO :: TVar a -> IO a
readTVarIO = atomically . readTVar

Now let’s move on to mutations. We’ll use the same example as we did with IORef, but implement it using a TVar. There are many ways to approach it, either by building one big transaction with all the steps, or by splitting it into many small ones.

First let’s do one big atomically with all the steps.

import Control.Concurrent.STM

bigTransaction :: IO ()
bigTransaction = do
    value <- atomically $ do
        var <- newTVar (0 :: Int)
        modifyTVar var (+1)
        readTVar var

    print value

There’s not much interesting going on in here, so let’s split it into smaller chunks. Even though modifyTVar is the perfect function for our use case, we can use a combination of readTVar and writeTVar to achieve the same thing, because atomically will make sure those two happen in a single step.

atomicReadWrite :: IO ()
atomicReadWrite = do
    var <- newTVarIO (0 :: Int)

    atomically $ do
        value <- readTVar var
        writeTVar var (value + 1)

    readTVarIO var >>= print

Since STM is a monad, we can also make this more interesting by combining two STM operations together and running those atomically.

f :: TVar Int -> STM ()
f var = modifyTVar var (+1)

twoCombined :: IO ()
twoCombined = do
    var <- newTVarIO (0 :: Int)

    atomically $ do
        f var
        f var

    readTVarIO var >>= print

There’s a lot more to STM than just TVars which is why I’d encourage you, dear reader, to take a look at the following resources. You might find that it will change the way you think about concurrent programming completely.
