Yes, you read the title correctly. And yes, Ruby still has a GIL. However, Ruby has long supported concurrency for IO-bound calls: a thread that is waiting on IO releases the GIL, so other threads can keep running. In practice, this means that many of our Ruby programs (like web applications) can use Ruby’s threads to serve multiple requests concurrently.
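To make the IO-bound claim concrete, here is a small sketch that uses only the standard library. It assumes that sleep is a fair stand-in for waiting on IO, and the helper names fake_io and run_concurrently are made up for this example.

```ruby
require "benchmark"

# Stand-in for an IO-bound call (hypothetical helper for this sketch).
def fake_io(id)
  sleep 0.2
  id
end

# Runs three fake IO calls on separate threads.
# Returns [results, elapsed_seconds].
def run_concurrently
  results = nil
  elapsed = Benchmark.realtime do
    threads = [1, 2, 3].map { |id| Thread.new { fake_io(id) } }
    # Thread#value waits for the thread and returns the block's result.
    results = threads.map(&:value)
  end
  [results, elapsed]
end

results, elapsed = run_concurrently
puts "results: #{results.inspect}, elapsed: #{elapsed.round(2)}s"
```

Run one after another, the three calls would need about 0.6 seconds. On threads the waits overlap, so the elapsed time should land much closer to a single 0.2 second wait.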
Working directly with Ruby’s threading primitives can be complicated. This is the problem that the concurrent-ruby library aims to solve. The library is mature and comprehensive, but it offers a staggering number of APIs for modeling concurrency in your application.
I’d like to suggest a different path. The dry-monads library exposes the Task
monad, which is built on top of concurrent-ruby. In this post I’ll explore the
Task monad as well as the newly released do syntax.
Tasks Introduction
First, let’s assume you have the following in your Gemfile:
source "https://rubygems.org"
gem "dry-monads", "~> 1.0", require: "dry/monads/all"
We can now write a simple program that uses a Task to perform an asynchronous
computation. We’ll use sleep to represent any kind of long-running IO
operation (think database queries, HTTP requests, etc).
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1).value!
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go
# => 1
First, we include Dry::Monads::Task::Mixin in our class. This gives us access
to the Task constant which we can use to build our Tasks. Next, we use
Task[:io] and provide it a block to start a Task. This creates a Task on
the :io executor. There are three default executors: :io for IO-bound tasks,
:fast for CPU-bound tasks and :immediate which runs on the current thread
(usually used for testing).
As soon as our Task is initialized, it begins running in its own thread. It
will not block the main thread of execution.
Finally, we call the value! method to block until our Task returns a
value.
If we save this in a file named monads.rb and run it, we’ll see it takes a bit
over a second to execute.
$ time ruby monads.rb
1
ruby monads.rb 0.19s user 0.08s system 21% cpu 1.255 total
Note that if we remove the value! call, our code will execute without waiting
on the value of the Task. Let’s try it.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go
# => Task(?)
Now let’s run it.
$ time ruby monads.rb
Task(?)
ruby monads.rb 0.18s user 0.08s system 102% cpu 0.254 total
Our program returned faster than a second. It didn’t wait on the Task to
complete but rather returned the value of the Task itself. See the ? in
there? It means that our program doesn’t know the value of the Task yet
because it hasn’t finished running.
Let’s try updating our code to sleep a bit in order to give the Task time to
finish executing.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

result = Async.new.go
sleep 2
p result
# => Task(value=1)
$ time ruby monads.rb
Task(value=1)
ruby monads.rb 0.17s user 0.09s system 11% cpu 2.257 total
Now that we’ve slept longer than the Task takes to execute, you’ll notice that
our Task does indeed have a value.
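The same pending versus finished distinction exists on plain Ruby threads, which makes for a handy mental model. The sketch below uses only the standard library; start_slow_work is a hypothetical helper, and nothing here claims to mirror how Task is implemented internally.

```ruby
# Start work in the background; the thread begins running immediately.
def start_slow_work
  Thread.new do
    sleep 0.2
    1
  end
end

worker = start_slow_work
puts worker.alive?             # => true (still running, much like Task(?))
puts worker.join(0.01).inspect # => nil (join with a timeout returns nil if the thread is not done)
worker.join                    # wait for completion without guessing a sleep duration
puts worker.alive?             # => false
puts worker.value              # => 1
```

Unlike a fixed sleep, join waits exactly as long as the work takes.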
Idiomatic Task Usage
Using value! isn’t great. First off, it blocks the main thread of execution
while waiting on a value. Second, it doesn’t handle errors. Let’s introduce an
exception in our task and see what happens.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1).value!
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      raise "boom!"
    end
  end
end

p Async.new.go
$ time ruby monads.rb
Traceback (most recent call last):
<...>
monads.rb:15:in `block in slow_task': boom! (RuntimeError)
ruby monads.rb 0.17s user 0.08s system 20% cpu 1.250 total
That’s not good. What should we use instead of value!? There are two primary
methods: bind and fmap. Let’s talk about bind first.
The bind method takes one argument – a block. That block will be called with
the value of the Task when it successfully completes. It is expected that the
block provided to bind return a new Task. Let’s use bind to chain together
multiple Tasks.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
      .bind { |i| slow_task(i + 1) }
      .bind { |i| slow_task(i + 2) }
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go
Ok, let’s run it.
$ time ruby monads.rb
Task(?)
ruby monads.rb 0.17s user 0.10s system 103% cpu 0.254 total
Whoops, what happened? It turns out bind is non-blocking. Our program exits
immediately without waiting for the result of our chain of calls. Let’s add back
our handy sleep to see if we can find out what’s going on here.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
      .bind { |i| slow_task(i + 1) }
      .bind { |i| slow_task(i + 2) }
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

result = Async.new.go
sleep 5
p result
$ time ruby monads.rb
Task(value=4)
ruby monads.rb 0.18s user 0.10s system 5% cpu 5.263 total
There we go. Our first Task produced the value 1 after sleeping for a
second. Our first call to bind waited for that value to be available and then
produced a new Task, adding 1 to it and sleeping for another second.
Finally, our last call to bind waited for this value (2) and returned another
Task, adding 2 more to it. This Task eventually returns 4.
The fmap function behaves exactly like bind except that the provided block
returns a raw value rather than another Task. Let’s add one to our chain.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
      .bind { |i| slow_task(i + 1) }
      .bind { |i| slow_task(i + 2) }
      .fmap { |i| i + 3 }
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

result = Async.new.go
sleep 5
p result
$ time ruby monads.rb
Task(value=7)
ruby monads.rb 0.17s user 0.09s system 5% cpu 5.249 total
Our last call to fmap waits for the aforementioned value of 4 to be produced
and then adds 3 to it, returning the raw value of 7. The fmap method knows
to re-wrap that value in a Task that returns immediately.
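If the difference between bind and fmap still feels abstract, a toy version built on plain threads may help. TinyTask below is a hypothetical class written for this post only. It is a rough sketch of the idea and not how dry-monads or concurrent-ruby implement Task.

```ruby
# TinyTask is a hypothetical, thread-backed stand-in for this sketch only.
class TinyTask
  def initialize(&work)
    @thread = Thread.new(&work) # starts running immediately
  end

  # Blocks until the work finishes and returns its value.
  def value!
    @thread.value
  end

  # The block receives the value and must return another TinyTask.
  def bind(&block)
    TinyTask.new { block.call(value!).value! }
  end

  # The block receives the value and returns a raw value, which gets re-wrapped.
  def fmap(&block)
    TinyTask.new { block.call(value!) }
  end
end

def slow_task(i)
  TinyTask.new do
    sleep 0.1
    i
  end
end

chain = slow_task(1)
  .bind { |i| slow_task(i + 1) }
  .bind { |i| slow_task(i + 2) }
  .fmap { |i| i + 3 }

# Building the chain returns right away; only value! waits.
p chain.value! # => 7
```

The only difference between the two methods is one extra value! call in bind, which unwraps the TinyTask that the block returned.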
Handling Exceptions
An exception raised inside of a Task puts it into an error state and stores
the exception. Both bind and fmap are “error aware” in that they will not
execute their blocks on a Task in the error state. Instead, they will ignore
the provided block and simply return the errored Task.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
      .bind { |i| slow_task(i + 1) }
      .bind { |i| error_task }
      .fmap { |i| i + 3 }
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end

  def error_task
    Task[:fast] do
      raise "boom"
    end
  end
end

result = Async.new.go
sleep 5
p result
$ time ruby monads.rb
Task(error=#<RuntimeError: boom>)
ruby monads.rb 0.19s user 0.09s system 5% cpu 5.261 total
In the above example, the block provided to fmap never executes because our
error_task method returns a Task in the error state.
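Plain Ruby threads show a similar "store the exception, skip what follows" behavior, which can make the error state easier to picture. This is a standard-library sketch with a made-up helper name (run_failing_chain), offered as an analogy and not as a description of Task internals.

```ruby
# By default Ruby prints a report when a thread dies from an exception;
# turn that off to keep the demo output quiet.
Thread.report_on_exception = false

# Returns [steps_that_ran, final_outcome] for a chain whose first step fails.
def run_failing_chain
  steps = []

  failing = Thread.new do
    steps << :first
    raise "boom"
  end

  chained = Thread.new do
    input = failing.value # re-raises the stored "boom" inside this thread
    steps << :second      # never reached
    input + 3
  end

  outcome =
    begin
      chained.value
    rescue RuntimeError => e
      e
    end

  [steps, outcome]
end

steps, outcome = run_failing_chain
p steps           # => [:first]
p outcome.message # => "boom"
```

The second step never runs, and the original exception is what comes out at the end of the chain.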
Parallel Execution
So far we’ve run all our Tasks in serial. Each waits for a value from the
previous Task and returns a new Task. If we have 4 Tasks that sleep for a
second each, the computation will take 4 seconds.
Often we will have Tasks that we prefer to execute in parallel. To achieve
this, we’ll use the List monad and the traverse method.
class Async
  include Dry::Monads::List::Mixin
  include Dry::Monads::Task::Mixin

  def go
    List::Task[slow_task(1), slow_task(2)]
      .traverse
      .bind { |a, b| slow_task(a + b) }
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go.value!
We’re creating three tasks above, but we’re running the first two in parallel.
We achieve this by creating a List monad using the List::Task[...]
invocation, providing our list of Tasks inside of the square brackets. Next,
we call traverse. The traverse method “flips” a List monad. That is, given
a List of Tasks, it will return to us a Task of a List. Said
differently, traverse will wait until each Task in the List successfully
completes and then it will build a new Task with a List of the provided
values. Our next call to bind destructures the list into two block arguments,
which we add together to make a new Task.
Importantly, the two Tasks run in parallel. Each one starts as soon as it is
created, so both are already running by the time traverse collects their
results. Even though each takes a second to complete, that stage of processing
should only take roughly a second. Our next call to bind creates a serial Task
that also takes a second, meaning our total time should be roughly two seconds.
Notice that we reintroduced the value! call to block until the Task result is
available. Let’s run it.
$ time ruby monads.rb
3
ruby monads.rb 0.17s user 0.11s system 12% cpu 2.271 total
As we can see, our code ran in just over two seconds. Yay, parallelism!
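To see the "flip" in isolation, here is a sketch of the same shape using plain threads. The traverse method below is a hypothetical stand-in that I wrote for illustration. It is not the dry-monads implementation, and the helper names are assumptions.

```ruby
require "benchmark"

def slow_thread(i)
  Thread.new do
    sleep 0.3
    i
  end
end

# Hypothetical stand-in for traverse: turns a list of running threads into
# one thread whose value is the list of their results.
def traverse(threads)
  Thread.new { threads.map(&:value) }
end

# Returns [total, elapsed_seconds].
def sum_in_parallel
  total = nil
  elapsed = Benchmark.realtime do
    # Both threads are already running before traverse is called.
    a, b = traverse([slow_thread(1), slow_thread(2)]).value
    # This step depends on both results, so it runs afterwards.
    total = slow_thread(a + b).value
  end
  [total, elapsed]
end

total, elapsed = sum_in_parallel
puts "total: #{total}, elapsed: #{elapsed.round(2)}s"
```

Three 0.3 second steps run back to back would take about 0.9 seconds. Here the first two overlap, so the total should come in around 0.6 seconds.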
The Correct Way to Block
At some point our concurrent code is going to want to return a value back to the
rest of our program. We don’t want to be stuck in async land forever. In order
to return a value, we’ll have to somehow block until our Task has completed.
So far we’ve only seen two ways to do this: call value! or sleep. Both of
these are bad.
Fortunately, because Tasks are monads, we can convert them to other well-known
monadic types. Specifically, we can convert a Task into a Result by calling
to_result. This will also block.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
      .bind { |i| slow_task(i + 1) }
      .bind { |i| slow_task(i + 2) }
      .fmap { |i| i + 3 }
      .to_result
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go
$ time ruby monads.rb
Success(7)
ruby monads.rb 0.20s user 0.16s system 10% cpu 3.342 total
Discussing Result is beyond the scope of this post, but the dry-monads
documentation covers it in detail if you’re interested.
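The appeal of converting to a Result is that the caller gets a plain value to branch on instead of an exception to rescue. Here is a minimal standard-library sketch of that idea. The Success and Failure structs and both helper methods are hypothetical stand-ins; dry-monads ships its own, richer versions.

```ruby
# Silence Ruby's default report for threads that raise.
Thread.report_on_exception = false

# Hypothetical result types for this sketch only.
Success = Struct.new(:value)
Failure = Struct.new(:error)

# Blocks until the thread finishes, then wraps the outcome instead of raising.
def to_result(thread)
  Success.new(thread.value)
rescue StandardError => e
  Failure.new(e)
end

def describe(result)
  case result
  when Success then "ok: #{result.value}"
  when Failure then "failed: #{result.error.message}"
  end
end

puts describe(to_result(Thread.new { 7 }))            # => ok: 7
puts describe(to_result(Thread.new { raise "boom" })) # => failed: boom
```

Both outcomes come back as ordinary return values, so the blocking call cannot blow up the caller the way value! does.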
Using the Do Syntax
Last but not least, we will discuss the so-called “do syntax”. Many languages
that rely heavily on monads recognize that it is not always intuitive or
convenient to create chains of weird looking function calls to access the values
inside of the monad. Calling bind and fmap constantly can be confusing,
especially if nested calls are required. The do syntax helps solve this problem.
In dry-monads, the do syntax is accomplished using the yield keyword. Let’s
take a look at some example code.
class Async
  include Dry::Monads::Task::Mixin
  include Dry::Monads::Do.for(:go)

  def go
    a = yield slow_task(1)
    b = yield slow_task(a + 1)
    c = yield slow_task(b + 2)

    Task[:immediate] do
      c + 3
    end
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go
First, you’ll notice the new line include Dry::Monads::Do.for(:go). This tells
dry-monads to add the do syntax behavior to our go method. Next, within the
body of go, any time we used to call bind we’re now calling our
Task-returning function as normal and then passing it as an argument to
yield. The do syntax has added a block argument to our go method, which we
invoke by calling yield. This block checks the value of the Task to
determine if it is successful. If so, it returns the value unwrapped. If it is
not successful, it raises an error to stop the execution of our method. This
error is caught by another wrapping helper and returned as the result of the
method. Importantly, yield also blocks. Let’s see what happens when we run
this code.
$ time ruby monads.rb
Task(value=7)
ruby monads.rb 0.18s user 0.08s system 8% cpu 3.259 total
Note that any method using the do syntax should always return a monadic type
because any time a call to yield returns an unsuccessful result, that value
will be directly returned. We want to make sure our method always returns the
same type in both success and error cases.
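The mechanics described above (a wrapper that supplies the block behind yield, unwraps successes, and raises to bail out on failure) fit in a few lines of plain Ruby. The following is a simplified, synchronous sketch under my own assumptions. TinyDo, Halt, Calculator and the Success and Failure structs are all hypothetical names, and this is not the dry-monads source.

```ruby
# Hypothetical, synchronous stand-ins; not the dry-monads classes.
Success = Struct.new(:value)
Failure = Struct.new(:error)

module TinyDo
  # Raised to abandon the method body; carries the unsuccessful result.
  class Halt < StandardError
    attr_reader :result

    def initialize(result)
      super("halted")
      @result = result
    end
  end

  # Builds a module that wraps `method_name` and supplies the block behind `yield`.
  def self.for(method_name)
    Module.new do
      define_method(method_name) do |*args|
        begin
          super(*args) do |result|
            raise Halt.new(result) if result.is_a?(Failure)
            result.value # unwrap on success
          end
        rescue Halt => e
          e.result # the failed result becomes the method's return value
        end
      end
    end
  end
end

class Calculator
  # The sketch uses prepend so the wrapper runs before the original method.
  prepend TinyDo.for(:go)

  def go(divisor)
    a = yield Success.new(10)
    b = yield divide(a, divisor)
    Success.new(b + 1)
  end

  def divide(x, y)
    y.zero? ? Failure.new("division by zero") : Success.new(x / y)
  end
end

p Calculator.new.go(2).value # => 6
p Calculator.new.go(0).error # => "division by zero"
```

Note how go returns a Success on the happy path, so callers get the same kind of object whether or not a step failed.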
Let’s see how this works in an error case.
class Async
  include Dry::Monads::Task::Mixin
  include Dry::Monads::Do.for(:go)

  def go
    a = yield slow_task(1)
    b = yield error_task
    c = yield slow_task(b + 2)

    Task[:immediate] do
      c + 3
    end
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end

  def error_task
    Task[:fast] do
      raise "boom"
    end
  end
end

p Async.new.go
$ time ruby monads.rb
Task(error=#<RuntimeError: boom>)
ruby monads.rb 0.18s user 0.13s system 23% cpu 1.292 total
See how the code ran in just over a second? As soon as our invocation of
error_task returned an unsuccessful Task, the execution of the method
stopped and returned the error Task immediately.
The do syntax works with parallel execution too.
class Async
  include Dry::Monads::List::Mixin
  include Dry::Monads::Task::Mixin
  include Dry::Monads::Do.for(:go)

  def go
    a, b = yield List::Task[slow_task(1), slow_task(2)].traverse

    Task[:immediate] do
      a + b
    end
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go
$ time ruby monads.rb
Task(value=3)
ruby monads.rb 0.17s user 0.11s system 22% cpu 1.269 total
Wrap Up
I hope this post gives you a sense of what it is like to develop concurrent code
using the Task monad. It is a powerful abstraction that returns values, allows
conversion to other monadic types and can unwrap async code using the do syntax.
And all of this behavior is built on top of the stable concurrent-ruby
library. I hope you consider trying out dry-monads in a future Ruby project.