In the previous post in this series, we discussed some of the basic concepts of asynchronous programming and how they apply in Scala. In this post, we’re going to dive further into the engine that drives asynchronous programming in Scala: the Execution Context. The Execution Context is the mechanism that coordinates async tasks, and you’ll find it used all over the place, often as an implicit parameter.

Recap of Futures

In the previous blog post, a Future was described as a “photograph” of a completed Promise. Once this photograph is available, we can perform additional operations on it. Take the following, for example:

val someNumber: Future[Int] =
  Future(4)

val someOtherNumber: Future[Int] =
  someNumber.map(_ + 1)

val illegalNumber: Future[Int] =
  someNumber.filter(_ == 30)

val flatMappedNumber: Future[Int] =
  someNumber.flatMap(n => Future(n * 3))

Ignoring the fact that a basic call like `Future(4)` doesn’t accomplish very much, these lines illustrate that Futures can be composed, altered, and transformed. However, in order to actually perform these transformations, they need access to an ExecutionContext. Calls to `Future.map`, `Future.filter`, `Future.flatMap`, and many other methods require an in-scope implicit ExecutionContext. Each transformation creates a small task which is submitted to the execution context to be run later.
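To make that requirement concrete, here’s a minimal sketch: the combinators take the ExecutionContext as an implicit parameter, so the code below only compiles because one is passed in and threaded through implicitly.

import scala.concurrent.{ExecutionContext, Future}

// A minimal sketch: each combinator below submits a task to the implicit
// ExecutionContext received as a parameter.
def combined(implicit ec: ExecutionContext): Future[Int] =
  Future(4)
    .map(_ + 1)
    .flatMap(n => Future(n * 3))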

The Default Executor

If you’re not building full-fledged microservices or backends and only really need to accomplish a simple task for a local program, you can usually do just fine with the default thread executor. The default global executor can be used very easily with just one line:

import scala.concurrent.ExecutionContext.Implicits.global

Providing this import will bring the default thread executor into scope as an implicit value. Calls to `Future.map`, `Future.flatMap`, etc. will behave correctly. The default executor is a fork-join pool under the hood. We’ll explore what that means in a bit, but in short, it’s an efficient, general-purpose execution context for non-blocking operations. In other words, it’ll get the job done for most async tasks. Without overriding the configuration, the pool’s size is equal to the number of available processors. For example, if your CPU has four cores, the executor will use four threads.
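As a quick sketch of how that import plugs in, the following runs with no other setup (`Await` is used here only to keep the example self-contained; avoid it in production code):

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// The imported global executor satisfies the implicit parameter of `map`.
val result: Int =
  Await.result(Future(4).map(_ + 1), 1.second) // 5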

The Akka Dispatcher

If you are in fact building a microservice or backend, or are using Akka or the Play Framework anywhere in your application, you should avoid using the default Scala executor and instead use the Actor System’s dispatcher:

import akka.actor.ActorSystem
import scala.concurrent.ExecutionContext

val system: ActorSystem =
  ActorSystem()

implicit val executionContext: ExecutionContext =
  system.dispatcher

I won’t be diving into Akka in this particular post, but Akka/Play users should be aware of the dispatcher’s existence and use it when applicable.

Similar to the default executor mentioned above, the default Akka dispatcher is backed by a fork-join pool. However, it’s important to note that, unlike the global executor from the Scala standard library, the threads managed by the Akka dispatcher do not participate in Scala’s `BlockContext` machinery. As such, attempts to use `scala.concurrent.blocking` will not work as expected: rather than spawning compensating threads, it silently does nothing. Similarly, using the dispatcher as the `taskSupport` for Scala parallel collections will not work as expected either.
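For contrast, here’s a minimal sketch of what `scala.concurrent.blocking` does when it is supported: on the standard library’s global executor, it signals the pool so a compensating thread can be spawned (the file path is hypothetical).

import scala.concurrent.{Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global

// On the global fork-join pool, `blocking` marks this section as blocking so
// the pool can spawn a compensating thread; on the Akka dispatcher it is a no-op.
val slowRead: Future[String] =
  Future {
    blocking {
      scala.io.Source.fromFile("/tmp/data.txt").mkString // hypothetical path
    }
  }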

Types of Executors

An Executor or ExecutionContext is the mechanism which handles the processing and scheduling of asynchronous computations. It can be implemented in a variety of ways to support different needs. For example, the aforementioned fork-join implementation is excellent for purely async, non-blocking computations. In instances where you absolutely need to run blocking computations, you should instead opt to run them on a fixed thread pool. Here are a few common types of executors:

Fork-Join Pool

import scala.concurrent.ExecutionContext
import java.util.concurrent.Executors
// newWorkStealingPool builds a ForkJoinPool with the given parallelism
// (four worker threads here); omitting the argument defaults to the
// number of available processors.
val forkJoinEc: ExecutionContext =
  ExecutionContext.fromExecutor(
    Executors.newWorkStealingPool(4)
  )

For tasks which execute quickly and are exclusively CPU-bound, a fork-join pool will provide the greatest performance.

A “thread” is essentially a series of steps which must be executed sequentially. When you construct a Future, you define the steps to get to a result. Although Futures themselves are async building blocks, the actual body of a Future is executed in-order. Multiple threads can be active at once on a multi-core processor, with each core executing a particular thread. Threads are managed by the operating system. When the JVM needs a new thread, it requests one from the OS.
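To make that concrete, here’s a minimal sketch: the statements inside one Future body always run in order, while two independent Futures may be scheduled onto separate threads at the same time.

import scala.concurrent.{ExecutionContext, Future}

implicit val ec: ExecutionContext = ExecutionContext.global

// The body of a single Future executes its steps sequentially...
val ordered: Future[Int] = Future {
  val a = 1 + 1 // always runs first
  a * 10        // always runs second
}

// ...while independent Futures may run on different threads concurrently.
val left: Future[Int]  = Future((1 to 1000).sum)
val right: Future[Int] = Future((1 to 1000).count(_ % 7 == 0))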

Threads aren’t catastrophically expensive to make, but they do consume some memory. The memory cost is worth considering, but the larger factor is the expense involved with “context switching”. A CPU may only have four cores, but the operating system could be juggling hundreds of threads at once. Each thread gets a little bit of time on a CPU core before it halts and another thread takes over. Context switches happen all the time on a computer; hundreds will occur before you finish reading this sentence. Each one happens in the blink of an eye, and you blink your eyes all day. But imagine how much harder life would be if you spent so much of it blinking! The less you blink, the more productive you are. The same holds true for computers and context switching.

Instead of running hundreds of different threads on only a handful of CPU cores, it’d be better to run one thread per CPU core to permit each thread to be active for as long as possible. With fewer threads come fewer context switches. In other words, fewer threads means less blinking. Less blinking means less time wasted on managing the…blinking.

A fork-join pool maintains a pool of threads. These threads pull from a shared queue of work. Once a piece of work is completed, the thread gets another piece of work from the queue. At no point does the thread need to give up its time to another thread; it continues working without taking any breaks. This makes for an incredibly efficient workflow because it minimizes context switching.

That holds true for most types of thread pools though. Fork-Join Pools are unique in that they leverage “work stealing” to speed up the processing of tasks. This processing method is very conducive to tasks which split themselves off into smaller sub-tasks. You’ll find this to be the case in a lot of modern async Scala development. If you’re using `Future.map` or `Future.flatMap` a lot, then taking advantage of fork-join pools will be an excellent default.
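As a sketch of the kind of workload that plays to a work-stealing pool’s strengths, the snippet below fans out many short, CPU-bound tasks and recombines them, reusing the `forkJoinEc` defined above:

import scala.concurrent.{ExecutionContext, Future}

implicit val ec: ExecutionContext = forkJoinEc

// Many small, non-blocking tasks: ideal for work stealing, since an idle
// thread can grab sub-tasks queued up by its busier neighbors.
val parts: Seq[Future[Int]] =
  (1 to 100).map(n => Future(n * n))

val total: Future[Int] =
  Future.sequence(parts).map(_.sum)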

Fixed Thread Pool

import scala.concurrent.ExecutionContext
import java.util.concurrent.Executors
// A fixed pool of 20 threads, sized for workloads that block.
val fixedThreadPoolEc: ExecutionContext =
  ExecutionContext.fromExecutor(
    Executors.newFixedThreadPool(20)
  )

While a fork-join pool is great for non-blocking calls, it suffers miserably when dealing with blocking calls. Imagine working on a 4-core CPU where all four cores are occupied with separate file IO requests, all sitting there waiting. Your fork-join pool only has four threads, and those threads will wait on those file IO tasks until they complete. Oh, and by the way, four more file IO requests will be waiting when those finish.

Fork-join pools are designed for maximum efficiency when the number of threads matches the number of cores. However, that proves to be a major drawback when dealing with blocking operations (such as network requests). Instead of a small number of threads, this situation warrants a larger thread pool.

A Fixed Thread Pool maintains a specific number of threads in memory. In many cases, this pool may consist of dozens or even hundreds of threads. As with other types of thread pools, tasks are maintained in a queue, and when work becomes available, it is assigned to an unoccupied thread. The difference between this sort of pool and a fork-join pool is that a fixed pool does not use “work stealing”: threads won’t take over the work of other threads when they’re not busy. This renders them inefficient for the short, self-forking, CPU-bound tasks typical of non-blocking async work. However, as mentioned above, this type of pool is ideal for long-running, blocking tasks such as file reads/writes, certain database calls, and other network requests.
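Here’s a short sketch of that usage, reusing the `fixedThreadPoolEc` defined above for a blocking disk read (the file path is hypothetical):

import scala.concurrent.{ExecutionContext, Future}

// Blocking file IO parks a thread for the whole read, so it runs on the
// larger fixed pool instead of starving a small fork-join pool.
implicit val blockingEc: ExecutionContext = fixedThreadPoolEc

val contents: Future[String] =
  Future(scala.io.Source.fromFile("/tmp/input.txt").mkString) // hypothetical path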

Cached Thread Pool

import scala.concurrent.ExecutionContext
import java.util.concurrent.TimeUnit
import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.ThreadPoolExecutor
val cachedThreadPool: ExecutionContext =
  ExecutionContext.fromExecutor(
    new ThreadPoolExecutor(
      20, // Minimum (core) # of threads
      40, // Maximum # of threads
      10000, // Idle time before an extra (non-core) thread is destroyed
      TimeUnit.MILLISECONDS, // Unit for the previous argument
      // Task queue. The queue must be bounded (the capacity here is
      // arbitrary) for the maximum to take effect: with an unbounded
      // queue, the pool never grows beyond its core size.
      new LinkedBlockingQueue[Runnable](100)
    )
  )

A “cached thread pool” is similar to a Fixed Thread Pool in all ways except its number of threads. A fixed pool will maintain an exact number of threads at all times, while a cached pool will maintain a minimum number of threads but will create new threads when enough demand arises. As demand decreases, extra threads are terminated.

The advantage of a cached pool is that it consumes less memory and maintains fewer system threads when demand is low. At non-peak hours, your system may want to run at lower power, or it may need resources for other tasks. A fixed thread pool always holds onto its threads, regardless of whether they are needed. This can theoretically be wasteful, but in practice, the difference is usually negligible.

The disadvantage of a cached pool is that threads take time to create. Creating one involves a system call instructing the kernel to create a new thread, which temporarily transfers control to the kernel and is not an instantaneous operation. In most cases, this too will be negligible, but if it happens dozens or hundreds of times a second, you may notice a performance decrease.
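If the exact bounds configured above don’t matter to you, note that the JDK also ships a preconfigured cached pool: `Executors.newCachedThreadPool()` keeps zero core threads, creates a new thread whenever no idle one is available, and reaps threads that have been idle for 60 seconds. A minimal sketch:

import scala.concurrent.ExecutionContext
import java.util.concurrent.Executors

// The stock JDK cached pool: unbounded maximum, 60-second idle timeout.
val cachedEc: ExecutionContext =
  ExecutionContext.fromExecutor(Executors.newCachedThreadPool())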

Conclusion

Execution contexts are the backbone of async programming in Scala. They are the machinery which orchestrates background tasks. Work is defined in smaller units, and those tasks are submitted to the execution context. The execution context is responsible for handling when and where those tasks get executed.

Scala Futures provide mechanisms for chaining together async computations, often in the form of monadic transformations such as map and flatMap. These transformation methods produce async tasks to be run on the provided execution context. Again, all the magic happens in the execution context; the standard library just provides simpler mechanisms for interacting with it.

Different executors are suited for different types of tasks. A fork-join pool (also known as a work-stealing pool) is best suited for CPU-bound tasks that are broken up (or will divide themselves) into smaller sub-tasks. It maintains a much smaller thread pool to keep the CPU as busy as possible with minimal context switching. Conversely, a fixed/cached thread pool is best suited for long-running, blocking operations. These pools are much larger than their fork-join counterparts, trading some overhead for resistance to thread starvation.

In general, a fork-join pool serves as the best type of executor for purely async operations that never block a thread. More and more libraries and external services are offering async interfaces. It used to be the case that all network requests and database calls were performed with blocking operations, but nowadays there exist non-blocking HTTP frameworks and non-blocking database libraries. In these cases, a fork-join pool is an excellent choice because even though network requests are involved, they are performed in a non-blocking manner.

There are a few remaining operations for which blocking behavior is still required. While strides are being made toward async local file IO, in most cases it is still performed in a blocking manner. If you perform a lot of disk access in your application, consider using a fixed/cached thread pool for those operations.

Stay tuned for more posts in this series. There is still plenty to cover in the world of async Scala, particularly involving Akka actors and streams.

Other posts in the Scala Series:
Gentle Introduction to Async Scala