Yes, you read the title correctly. And yes, Ruby still has a GIL. However, Ruby has long supported concurrency for IO-bound calls: a thread that is waiting on IO releases the GIL, so other threads can keep running. In practice, this means that many of our Ruby programs (like web applications) can use Ruby’s threads to serve multiple requests concurrently.
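As a rough illustration of that claim, here is a minimal plain-Ruby sketch using only the standard library. The helper name timed_concurrent_sleeps is made up for this example, and sleep stands in for real IO. If waiting threads really do let each other run, the total time should be close to a single wait rather than the sum of all of them.

```ruby
# Runs `count` threads that each sleep for `seconds` (a stand-in for IO)
# and returns the total wall-clock time in seconds.
def timed_concurrent_sleeps(count, seconds)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  threads = Array.new(count) { Thread.new { sleep seconds } }
  threads.each(&:join)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

elapsed = timed_concurrent_sleeps(4, 0.2)
puts format("4 threads x 0.2s took %.2fs", elapsed)
```

Run serially, the four waits would add up to roughly 0.8 seconds, so a total near 0.2 seconds suggests the threads waited concurrently.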
Working directly with Ruby’s threading primitives can be complicated. This is a problem that the concurrent-ruby library aims to solve. This library is mature and comprehensive, but it offers a staggering number of APIs for modeling concurrency in your application.
I’d like to suggest a different path. The dry-monads library exposes the Task monad which is built on top of concurrent-ruby. In this post I’ll explore the Task monad as well as the newly-released do syntax.
Tasks Introduction
First, let’s assume you have the following in your Gemfile:
source "https://rubygems.org"
gem "dry-monads", "~> 1.0", require: "dry/monads/all"
We can now write a simple program that uses a Task to perform an asynchronous computation. We’ll use sleep to represent any kind of long-running IO operation (think database queries, HTTP requests, etc).
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1).value!
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go
# => 1
First, we include Dry::Monads::Task::Mixin in our class. This gives us access to the Task constant which we can use to build our Tasks. Next, we use Task[:io] and provide it a block to start a Task. This creates a Task on the :io executor. There are three default executors: :io for IO-bound tasks, :fast for CPU-bound tasks and :immediate which runs on the current thread (usually used for testing).
As soon as our Task is initialized, it begins running in its own thread. It will not block the main thread of execution.
Finally, we call the value! method to block until our Task returns a value.
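To make the "starts right away, blocks only on value!" behavior concrete, here is a toy sketch built on a plain Thread. MiniTask is a hypothetical class invented for this illustration; it is not how dry-monads implements Task (which runs on concurrent-ruby executors), only a stand-in with the same general shape.

```ruby
# A toy, thread-backed stand-in for Task (not the dry-monads implementation).
class MiniTask
  def initialize(&block)
    # The block starts running on its own thread right away.
    @thread = Thread.new(&block)
  end

  # True once the block has finished running.
  def complete?
    !@thread.alive?
  end

  # Blocks the caller until the thread finishes, then returns the block's value.
  def value!
    @thread.value
  end
end

task = MiniTask.new do
  sleep 0.2
  1
end

p task.complete? # false: the block is still sleeping
p task.value!    # blocks for about 0.2s, then prints 1
```

Creating the task returns immediately; only the call to value! makes the caller wait.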
If we save this in a file named monads.rb and run it, we’ll see it takes a bit over a second to execute.
$ time ruby monads.rb
1
ruby monads.rb 0.19s user 0.08s system 21% cpu 1.255 total
Note that if we remove the value! call, our code will execute without waiting on the value of the Task. Let’s try it.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go
# => Task(?)
Now let’s run it.
$ time ruby monads.rb
Task(?)
ruby monads.rb 0.18s user 0.08s system 102% cpu 0.254 total
Our program returned faster than a second. It didn’t wait on the Task to complete but rather returned the value of the Task itself. See the ? in there? It means that our program doesn’t know the value of the Task yet because it hasn’t finished running.
Let’s try updating our code to sleep a bit in order to give the Task time to finish executing.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

result = Async.new.go
sleep 2
p result
# => Task(value=1)
$ time ruby monads.rb
Task(value=1)
ruby monads.rb 0.17s user 0.09s system 11% cpu 2.257 total
Now that we’ve slept longer than the Task takes to execute, you’ll notice that our Task does indeed have a value.
Idiomatic Task Usage
Using value! isn’t great. First off, it blocks the main thread of execution while waiting on a value. Second, it doesn’t handle errors. Let’s introduce an exception in our task and see what happens.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1).value!
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      raise "boom!"
    end
  end
end

p Async.new.go
$ time ruby monads.rb
Traceback (most recent call last):
<...>
monads.rb:15:in `block in slow_task': boom! (RuntimeError)
ruby monads.rb 0.17s user 0.08s system 20% cpu 1.250 total
That’s not good. What should we use instead of value!? There are two primary methods: bind and fmap. Let’s talk about bind first.
The bind method takes one argument: a block. That block will be called with the value of the Task when it successfully completes. The block provided to bind is expected to return a new Task. Let’s use bind to chain together multiple Tasks.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
      .bind { |i| slow_task(i + 1) }
      .bind { |i| slow_task(i + 2) }
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go
Ok, let’s run it.
$ time ruby monads.rb
Task(?)
ruby monads.rb 0.17s user 0.10s system 103% cpu 0.254 total
Whoops, what happened? It turns out bind is non-blocking. Our program exits immediately without waiting for the result of our chain of calls. Let’s add back our handy sleep to see if we can find out what’s going on here.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
      .bind { |i| slow_task(i + 1) }
      .bind { |i| slow_task(i + 2) }
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

result = Async.new.go
sleep 5
p result
$ time ruby monads.rb
Task(value=4)
ruby monads.rb 0.18s user 0.10s system 5% cpu 5.263 total
There we go. Our first Task produced the value 1 after sleeping for a second. Our first call to bind waited for that value to be available and then produced a new Task, adding 1 to it and sleeping for another second. Finally, our last call to bind waited for this value (2) and returned another Task, adding 2 more to it. This Task eventually returns 4.
The fmap method behaves exactly like bind except that the provided block returns a raw value rather than another Task. Let’s add an fmap call to our chain.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
      .bind { |i| slow_task(i + 1) }
      .bind { |i| slow_task(i + 2) }
      .fmap { |i| i + 3 }
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

result = Async.new.go
sleep 5
p result
$ time ruby monads.rb
Task(value=7)
ruby monads.rb 0.17s user 0.09s system 5% cpu 5.249 total
Our last call to fmap waits for the aforementioned value of 4 to be produced and then adds 3 to it, returning the raw value of 7. The fmap method knows to re-wrap that value in a Task that returns immediately.
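The difference between bind and fmap can be sketched in plain Ruby with a toy, thread-backed future. MiniTask and its methods are hypothetical names for this illustration only: the real Task is scheduled on concurrent-ruby executors, whereas this sketch parks one thread per stage. It covers the success path only.

```ruby
# A toy, thread-backed future used only to illustrate bind and fmap.
class MiniTask
  def initialize(&block)
    @thread = Thread.new(&block)
  end

  # Blocks until the block has finished and returns its value.
  def value!
    @thread.value
  end

  # The block returns another MiniTask. The new task waits for this one,
  # calls the block, then waits for the inner task and takes its value.
  def bind(&block)
    MiniTask.new { block.call(value!).value! }
  end

  # The block returns a plain value, which ends up wrapped in a new MiniTask.
  def fmap(&block)
    MiniTask.new { block.call(value!) }
  end
end

def slow_task(i)
  MiniTask.new do
    sleep 0.1
    i
  end
end

result = slow_task(1)
  .bind { |i| slow_task(i + 1) }
  .bind { |i| slow_task(i + 2) }
  .fmap { |i| i + 3 }

p result.value! # => 7
```

Both methods return a new MiniTask without waiting, which mirrors why the chained examples above print Task(?) unless something blocks on the result.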
Handling Exceptions
An exception raised inside of a Task puts it into an error state and stores the exception. Both bind and fmap are “error aware” in that they will not execute their blocks on a Task in the error state. Instead, they will ignore the provided block and simply return the errored Task.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
      .bind { |i| slow_task(i + 1) }
      .bind { |i| error_task }
      .fmap { |i| i + 3 }
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end

  def error_task
    Task[:fast] do
      raise "boom"
    end
  end
end

result = Async.new.go
sleep 5
p result
$ time ruby monads.rb
Task(error=#<RuntimeError: boom>)
ruby monads.rb 0.19s user 0.09s system 5% cpu 5.261 total
In the above example, the block provided to fmap never executes because our error_task method returns a Task in the error state.
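One way to picture the error state is a future that records either a value or the exception its block raised, and whose fmap skips the block when an exception is stored. The sketch below is plain Ruby with hypothetical names (MiniTask, outcome); it assumes nothing about how dry-monads stores errors internally.

```ruby
# A toy future that records either a value or the exception its block raised.
class MiniTask
  def initialize(&block)
    @thread = Thread.new do
      begin
        [:value, block.call]
      rescue StandardError => e
        [:error, e]
      end
    end
  end

  # Blocks until finished and returns [:value, v] or [:error, exception].
  def outcome
    @thread.value
  end

  # Calls the block only on success; a stored exception is passed along as-is.
  def fmap(&block)
    MiniTask.new do
      state, payload = outcome
      state == :value ? block.call(payload) : raise(payload)
    end
  end
end

calls = []
failed = MiniTask.new { raise "boom" }.fmap { |i| calls << i; i + 3 }
state, error = failed.outcome

p state         # => :error
p error.message # => "boom"
p calls         # => [] (the fmap block never ran)
```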
Parallel Execution
So far we’ve run all our Tasks in serial. Each waits for a value from the previous Task and returns a new Task. If we have 4 Tasks that sleep for a second each, the computation will take 4 seconds.
Often we will have Tasks that we prefer to execute in parallel. To achieve this, we’ll use the List monad and the traverse method.
class Async
  include Dry::Monads::List::Mixin
  include Dry::Monads::Task::Mixin

  def go
    List::Task[slow_task(1), slow_task(2)]
      .traverse
      .bind { |a, b| slow_task(a + b) }
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go.value!
We’re creating three tasks above, but we’re running the first two in parallel. We achieve this by creating a List monad using the List::Task[...] invocation, providing our list of Tasks inside of the square brackets. Next, we call traverse. The traverse method “flips” a List monad. That is, given a List of Tasks, it will return to us a Task of a List. Said differently, traverse will wait until each Task in the List successfully completes and then it will build a new Task with a List of the provided values. Our next call to bind destructures the list into two block arguments, which we add together to make a new Task.
Importantly, the call to traverse allows the Tasks to run in parallel. Even though we’re starting two Tasks that each take a second to complete, that stage of processing should only take roughly a second. Our next call to bind creates a serial Task that also takes a second, meaning our total time should be roughly two seconds. Notice that we reintroduced the value! call to block until the Task result is available. Let’s run it.
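The "flip" performed by traverse can be sketched with plain threads standing in for Tasks. The traverse and slow_task helpers below are hypothetical and only mimic the shape of the idea (a list of futures becomes one future of a list); they are not the dry-monads implementation.

```ruby
# Turns a list of threads into a single thread whose value is the list of
# their results. The input threads are already running, so they overlap.
def traverse(threads)
  Thread.new { threads.map(&:value) }
end

def slow_task(i)
  Thread.new do
    sleep 0.2
    i
  end
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
a, b = traverse([slow_task(1), slow_task(2)]).value
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

p a + b # => 3
puts format("took %.2fs", elapsed)
```

Because both threads are started before anything waits on them, the elapsed time should be close to one 0.2 second wait rather than two.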
$ time ruby monads.rb
3
ruby monads.rb 0.17s user 0.11s system 12% cpu 2.271 total
As we can see, our code ran in just over two seconds. Yay, parallelism!
The Correct Way to Block
At some point our concurrent code is going to want to return a value back to the rest of our program. We don’t want to be stuck in async land forever. In order to return a value, we’ll have to somehow block until our Task has completed. So far we’ve only seen two ways to do this: call value! or sleep. Both of these are bad.
Fortunately, because Tasks are monads, we can convert them to other well-known monadic types. Specifically, we can convert a Task into a Result by calling to_result. This will also block.
class Async
  include Dry::Monads::Task::Mixin

  def go
    slow_task(1)
      .bind { |i| slow_task(i + 1) }
      .bind { |i| slow_task(i + 2) }
      .fmap { |i| i + 3 }
      .to_result
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go
$ time ruby monads.rb
Success(7)
ruby monads.rb 0.20s user 0.16s system 10% cpu 3.342 total
Discussing Result is beyond the scope of this post, but please follow the above link to read more if you’re interested.
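Without going into Result itself, the shape of the conversion can be sketched in plain Ruby: block until the work finishes, then wrap the outcome in a success or failure value instead of letting the exception escape. The Success and Failure structs and the to_result helper below are stand-ins defined for this example, not the dry-monads classes.

```ruby
# Minimal stand-ins for a Result type (hypothetical, not dry-monads).
Success = Struct.new(:value)
Failure = Struct.new(:error)

# Blocks until the thread finishes, then wraps its outcome.
# Thread#value re-raises an exception that ended the thread, so we rescue it.
def to_result(thread)
  Success.new(thread.value)
rescue StandardError => e
  Failure.new(e)
end

ok = Thread.new { 7 }
bad = Thread.new do
  Thread.current.report_on_exception = false # keep the demo output quiet
  raise "boom"
end

p to_result(ok).value          # => 7
p to_result(bad).error.message # => "boom"
```

The caller always gets a value back, and can branch on whether it is a Success or a Failure rather than rescuing exceptions.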
Using the Do Syntax
Last but not least, we will discuss the so-called “do syntax”. Many languages that rely heavily on monads recognize that it is not always intuitive or convenient to create chains of weird looking function calls to access the values inside of the monad. Calling bind and fmap constantly can be confusing, especially if nested calls are required. The do syntax helps solve this problem. In dry-monads, the do syntax is accomplished using the yield keyword. Let’s take a look at some example code.
class Async
  include Dry::Monads::Task::Mixin
  include Dry::Monads::Do.for(:go)

  def go
    a = yield slow_task(1)
    b = yield slow_task(a + 1)
    c = yield slow_task(b + 2)

    Task[:immediate] do
      c + 3
    end
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go
First, you’ll notice the new line include Dry::Monads::Do.for(:go). This tells dry-monads to add the do syntax behavior to our go method. Next, within the body of go, everywhere we used to call bind we now call our Task-returning method as normal and pass its result as an argument to yield. The do syntax has added a block argument to our go method, which we invoke by calling yield. This block checks the value of the Task to determine if it is successful. If so, it returns the value unwrapped. If it is not successful, it raises an error to stop the execution of our method. This error is caught by another wrapping helper and returned as the result of the method. Importantly, yield also blocks. Let’s see what happens when we run this code.
$ time ruby monads.rb
Task(value=7)
ruby monads.rb 0.18s user 0.08s system 8% cpu 3.259 total
Note that any method using the do syntax should always return a monadic type because any time a call to yield returns an unsuccessful result, that value will be directly returned. We want to make sure our method always returns the same type in both success and error cases.
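The mechanism described above (a block that unwraps successes and raises on failures, plus a wrapper that catches the raise and returns the failure) can be sketched in plain Ruby. Everything here is hypothetical and simplified: MiniDo, Halt, Calculator and the Success and Failure structs are invented for the illustration and operate on simple result values rather than Tasks. This is not the dry-monads source.

```ruby
Success = Struct.new(:value)
Failure = Struct.new(:error)

module MiniDo
  # Raised to abort the wrapped method; carries the unsuccessful result.
  class Halt < StandardError
    attr_reader :result

    def initialize(result)
      @result = result
      super("halted")
    end
  end

  # Returns a module that wraps the named methods once it is prepended.
  def self.for(*method_names)
    Module.new do
      method_names.each do |name|
        define_method(name) do |*args|
          begin
            # Call the original method, handing it the block that `yield` invokes.
            super(*args) do |result|
              raise Halt.new(result) if result.is_a?(Failure)
              result.value # unwrap a successful result
            end
          rescue Halt => e
            e.result # the failure becomes the method's return value
          end
        end
      end
    end
  end
end

class Calculator
  prepend MiniDo.for(:add)

  def add(a, b)
    x = yield a
    y = yield b
    Success.new(x + y)
  end
end

calc = Calculator.new
p calc.add(Success.new(1), Success.new(2)).value      # => 3
p calc.add(Success.new(1), Failure.new("boom")).error # => "boom"
```

In the second call the method body stops at the second yield, which is why the method must return a wrapped value on the success path too: callers then see the same kind of object either way.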
Let’s see how this works in an error case.
class Async
  include Dry::Monads::Task::Mixin
  include Dry::Monads::Do.for(:go)

  def go
    a = yield slow_task(1)
    b = yield error_task
    c = yield slow_task(b + 2)

    Task[:immediate] do
      c + 3
    end
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end

  def error_task
    Task[:fast] do
      raise "boom"
    end
  end
end

p Async.new.go
$ time ruby monads.rb
Task(error=#<RuntimeError: boom>)
ruby monads.rb 0.18s user 0.13s system 23% cpu 1.292 total
See how the code ran in just over a second? As soon as our invocation of error_task returned an unsuccessful Task, the execution of the method stopped and returned the error Task immediately.
The do syntax works with parallel execution too.
class Async
  include Dry::Monads::List::Mixin
  include Dry::Monads::Task::Mixin
  include Dry::Monads::Do.for(:go)

  def go
    a, b = yield List::Task[slow_task(1), slow_task(2)].traverse

    Task[:immediate] do
      a + b
    end
  end

  def slow_task(i)
    Task[:io] do
      sleep 1
      i
    end
  end
end

p Async.new.go
$ time ruby monads.rb
Task(value=3)
ruby monads.rb 0.17s user 0.11s system 22% cpu 1.269 total
Wrap Up
I hope this post gives you a sense of what it is like to develop concurrent code using the Task monad. It is a powerful abstraction that returns values, allows conversion to other monadic types and can unwrap async code using the do syntax. And all of this behavior is built on top of the stable concurrent-ruby library. I hope you consider trying out dry-monads in a future Ruby project.