Concurrency in Ruby has always been somewhat of a black box to me. Many of its traits I've just taken for granted and never really dived deeper to understand them better. Its most prominent feature, the GVL (Global VM Lock)*, which forbids true parallelism and makes our threads run only one at a time (I will talk about it briefly in the next section), is particularly interesting. It begs several fundamental questions, like:
- When does Ruby put the currently running thread on the shelf, and start another one?
- Is there some kind of timer that tracks how long a thread has been running and stops it when it exceeds its timeslice?
- Does a thread voluntarily hand control off to the next one, or is it forced to do so by some other entity?
To answer these questions and to fill in some of my knowledge gaps, I’ve decided to put on my detective hat and go on a journey through Ruby internals. I’d be more than happy if you joined me on that trip!
* This article is all about CRuby/MRI 2.x. Other Ruby implementations may have a very different threading scheme.
Concurrency Model in Ruby
Before we hit the road, let's take a step back and quickly go over the concurrency model in Ruby. There are a lot of great articles, books, and other sources [1, 2] that thoroughly explain this topic, so I won't go too deep into the details here. Instead, let me just talk you through the most important parts. It may come in handy during our journey later on.
What happens when you create a new thread in Ruby? Under the hood, Ruby schedules a brand new OS native thread. To illustrate that, I’ve prepared a simple script that starts two threads running an empty, infinite loop:
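A minimal sketch of such a script (the original listing isn't reproduced here; the one-second teardown is my addition so the example terminates):

```ruby
# Start two Ruby threads, each spinning in an empty, infinite loop.
threads = 2.times.map do
  Thread.new do
    loop { } # burn CPU; no I/O, no sleeping
  end
end

sleep 1 # while this runs, inspect the process from another terminal
threads.each(&:kill).each(&:join)
```

While the script sleeps, you can list the process's threads from outside, for example with `top -H -p <pid>` on Linux or `ps -M <pid>` on macOS, and watch both native threads spinning.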
If you ran this program and then looked at the list of active tasks in your system, you'd clearly see that both loops were started as independent OS threads.
Global VM Lock
Does the 1-to-1 mapping between Ruby and OS threads mean that our code can run in parallel on multicore architectures? Unfortunately not. Despite using native threads, there's a mechanism that prevents multiple tasks from executing at the same time. It's called the Global VM Lock, or GVL for short. Only one thread - the one currently possessing the lock - can be active. One question may come to your mind right now - why would anyone want to limit the number of active threads to just one? Well, this is a very good question.
One of the reasons is that the MRI implementation is not thread-safe. Ruby's internal structures could easily get corrupted if two tasks tried to access the same object at the same time. Of course, there are tools, like locks or mutexes, which could be used to prevent that. That being said, this is a much broader topic that deserves a separate article (or several) of its own [3]. For now, we just need to be aware of the GVL's existence and its implications.
Closer Look at the Internals
After this rather short introduction, we are finally ready to let our adventure begin. However, every journey has to start somewhere, right? Where should we kick off ours then? Ruby source code sounds like a good choice, wouldn’t you agree?
The implementation of threads in Ruby lives in the thread.c file. Here you can see, with your own eyes, the guts that make up every thread in your program. If you're looking for a single source of truth, this is the right place.
There’s a short description of the thread design at the beginning of that file:
A thread has mutex (GVL: Global VM Lock or Giant VM Lock) can run. When thread scheduling, running thread release GVL. If running thread try blocking operation, this thread must release GVL and another thread can continue this flow. After blocking operation, thread must check interrupt (RUBY_VM_CHECK_INTS).
This note tells us a few interesting things:
- Thread has to acquire the lock to run
- Thread releases the lock by itself
- The lock is being released during blocking operations
- RUBY_VM_CHECK_INTS has something to do with interrupt handling
First of all, it confirms the existence of the GVL and the multitasking limit we discussed previously. Then, we learn that threads are non-preemptive at the implementation level - they are responsible for releasing the lock to let another thread continue. This answers one of our questions - great!
Another interesting insight is that the lock is released when our code starts a blocking operation, like reading from or writing to a file. Is that the only thing that triggers context switching? If so, a thread not issuing any I/O calls would prevent other threads from running. It would starve them! That doesn't sound good, does it? Let's verify whether that's what happens!
What kind of non-blocking operations can we perform to track the execution order of our threads? We could use an in-memory data structure to store their trails - ideally, something that remembers the order of interactions with it. There is one fundamental data structure with exactly this property: the good old array.
How can we employ it? How about making every thread push its ID to a shared array, many times over? As a result, we'd get an array filled with thread IDs. Since each ID is distinct and belongs to only one thread, we can determine the order in which threads were running by looking at the positions of their IDs in the array. That sounds like a good plan. Below is a simple script that does exactly that:
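A minimal reconstruction, with the thread index standing in for each thread's ID (an assumption on my part, since the original listing isn't shown here):

```ruby
array = []

# Each thread pushes its own ID into the shared array one million times.
threads = 2.times.map do |id|
  Thread.new do
    1_000_000.times { array.push(id) }
  end
end
threads.each(&:join)

puts array.size # 2,000,000 entries in total
```

Note that this is safe in MRI only because the GVL makes each `Array#push` effectively atomic; in a truly parallel runtime you'd need a lock around the shared array.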
If you run this code, you will get an array of size 2,000,000, filled with two values - the IDs of our two threads.
After that, we can group consecutive values and count them. That tells us how many push operations one thread performs before yielding control to the other. Below is a simple visualization of such groupings (the length of a group corresponds to the number of push operations).
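The grouping step can be sketched with Enumerable#chunk, which splits an enumerable into runs of consecutive equal values; the trace below is a made-up miniature of a real run:

```ruby
# A tiny, hypothetical trace of thread IDs (real runs are much longer).
trace = [0, 0, 0, 1, 1, 0, 1, 1, 1]

# Group consecutive equal IDs and measure each run's length.
groups = trace.chunk { |id| id }.map { |id, run| [id, run.size] }
p groups # => [[0, 3], [1, 2], [0, 1], [1, 3]]
```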
As we can see, the threads were definitely taking turns. For some time, T1 was adding elements to the array, then it passed control to T2, and so on. This means that, even without any blocking calls, the threads were being switched from time to time. Is there something else, then, that ensures context switching even in the absence of I/O operations? Apparently, there is! Let's go back to that mysterious RUBY_VM_CHECK_INTS expression to see what's behind this magic formula.
If you look for its definition, you’ll find it in the vm_core.h file:
As you can see, it’s just a macro that invokes another function called rb_threadptr_execute_interrupts, defined in the thread.c file. Its implementation is much more complex, but a few interesting things are going on in there. First of all, as the name suggests, this routine handles a bunch of different interrupts. Secondly, there is one particularly interesting interrupt being dealt with, namely timer_interrupt:
That looks promising! Judging by the name of the routine inside the if block, it has something to do with thread scheduling. Let's take a look at the actual implementation of that function to confirm our suspicions:
Great! It looks like we've finally found the place we were looking for. Without a doubt, this is the control tower of our threads - it stops the current one to make room for the others. The lock is released when two conditions are met:
- There is more than one living thread
- Time given to the current thread has elapsed
That makes sense. We are not really interested in yielding control if there's only one thread within a Ruby process. If there are other threads, though, we need to routinely check whether the time given to the current one has elapsed and, if so, switch to another task.
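To make those two conditions concrete, here is a Ruby-flavored paraphrase of the decision - the function and parameter names are stand-ins of my own, not actual CRuby internals:

```ruby
# Hypothetical paraphrase of the scheduler's check, not real CRuby code.
def maybe_yield_gvl(living_threads, timeslice_elapsed)
  # Condition 1: yielding is pointless with a single living thread.
  return :keep_running if living_threads <= 1
  # Condition 2: the current thread still has time left on its slice.
  return :keep_running unless timeslice_elapsed
  :release_gvl # release the lock so another thread can acquire it
end
```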
We can finally say that the RUBY_VM_CHECK_INTS macro is responsible for thread scheduling. What does that tell us? Well, we can, for example, look at where it's used to understand which operations, other than I/O calls, trigger context switching. It turns out that, besides blocking calls, this macro is used during several other activities, like fiber switching or waiting for a child process to finish. However, there is one particularly interesting place where this macro shows up - the insns.def file, where YARV instructions are defined.
What is YARV? It's an acronym for Yet Another Ruby VM - the virtual machine that executes our Ruby code. Before our code can be interpreted by YARV, it first has to be compiled to a series of low-level instructions. These instructions are defined in the insns.def file.
What kind of YARV commands trigger the RUBY_VM_CHECK_INTS macro? Searching insns.def shows the list is short - essentially the jump and branch family of instructions, plus a few others such as leave and throw.
If you disassemble your Ruby code, you'll notice that these instructions form the basis of fundamental control structures, like loops, conditional statements, function calls, and error raising. Below is a very simple example, showing a disassembled form of a while loop:
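One way to reproduce such a disassembly yourself is with RubyVM::InstructionSequence; addresses and instruction names vary between Ruby versions, so your output may differ slightly from the listing described next:

```ruby
code = <<~RUBY
  while true
    puts "Hello"
  end
RUBY

# Compile the snippet to YARV bytecode and print its disassembly.
puts RubyVM::InstructionSequence.compile(code).disasm
```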
As we can see, the jump command (the last YARV instruction) lies at the heart of our infinite while loop. After calling the puts method, it makes the VM jump back to the instruction at the 0006 address, which is the starting point of our loop. When Ruby processes the jump command, it goes through the scheduling functions we've seen before and makes our thread release the lock.
Now it all makes sense! In Experiment 1, we didn’t have any blocking operations but since we had a loop, it allowed our threads to take turns. Could we then write code that prevents context switching entirely?
If we cannot use any blocking calls, loops, or method calls, that may sound like a cumbersome task. There are certain operations, though, that can be done without these structures.
The << operator needs no introduction. It works like the Array#push method - it appends a given element to the end of an array. However, there is one important difference between them: the << operator is not compiled as a regular method call. Instead, it's represented by a single YARV instruction: opt_ltlt. What's more, it's not one of the commands that invoke the RUBY_VM_CHECK_INTS macro internally. How can we leverage that? For example, we can write a very long routine that appends different values to an array, many times. Something like this:
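In sketch form (a stand-in for the original, much longer listing), it is just the same append statement repeated over and over:

```ruby
array = []
array << 1
array << 1
array << 1
# ...and so on, thousands of times, with no loop in sight
```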
Our test will be very similar to the first experiment. The only difference is that, instead of using a loop to populate the array, we will invoke the << operator for each element explicitly, as a separate statement. However, such a program would be very long, so instead of writing it out by hand, I'll use the eval method, which takes a string and evaluates it as Ruby code:
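Here is a sketch of that experiment, scaled down to 100,000 appends per thread to keep it quick (the original run used one million):

```ruby
array = []

# Generate source code with one explicit `<<` statement per element --
# no loops and no ordinary method calls in the generated code itself.
source_t1 = "array << 1\n" * 100_000
source_t2 = "array << 2\n" * 100_000

# Each thread evaluates its generated source; `array` is visible to
# eval through the block's binding.
t1 = Thread.new { eval(source_t1) }
t2 = Thread.new { eval(source_t2) }
[t1, t2].each(&:join)

puts array.size # 200,000 appends in total
```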
We can use the same strategy as we did during Experiment 1 to determine the order in which the threads were running.
Here is the result:
Wow! That's cool. We were able to turn off thread scheduling completely! As we can see, thread T1 released the lock only after it finished populating the array with one million copies of its value. T2 wasn't able to kick in until then.
That's a really interesting observation. It proves that thread switching can happen only during the specific operations mentioned earlier. If our thread doesn't execute any of them, it never gets interrupted.
After our little investigation, we can summarize what we’ve learned along the way.
The first discovery was that Ruby threads are non-preemptive at the C level - they voluntarily yield the GVL to enable concurrency. Then, we learned that scheduling takes place when the currently running thread performs an I/O call. On top of that, a thread also periodically checks whether its time slice has elapsed and releases the GVL accordingly. These time checks run only during particular operations, like:
- Fibers switching
- Waiting for the termination of a child process
- Processing special YARV instructions
Hopefully, some of the findings in this article were interesting to you and shed more light on how threads work under the hood. That knowledge led us to an interesting experiment, which showed that if we avoid control structures and method calls entirely, we can starve the other threads waiting for their turn. By all means, this is not a very realistic scenario, since it's hard to imagine a program without loops or method calls.
Nevertheless, it feels good to understand how the tools you use on a daily basis work internally. I had a really good time playing around with Ruby's scheduling scheme, and I hope you enjoyed it as well.