Exception handling in Scala

Regardless of the language used, when we write software, it's always with the understanding that at some point, something will go wrong. When things go wrong, we have a few choices of how to deal with them. If we fail to deal with them, the default behaviour tends to be drastic: services going down or ceasing normal function, no visibility into what went wrong and why, and perhaps some confusing log messages or no logging at all.

It's important, then, to consider your error cases when designing the software. If a database operation fails, do we need to retry? What if the failure is persistent? Is it okay to drop a message if we failed to process it, or do we need to be resilient about eventually passing it on? How do we need to communicate a failure? In the worst case scenarios, does it make most sense to shut down the entire application rather than continue to operate in an unpredictable way?

At the lowest level, though, we need to consider how to handle the exceptions themselves in a way that makes acting on those decisions easy. Today we'll discuss best practices for propagating and catching exceptions, and designing clear exceptions. This is the first step in answering the much broader questions highlighted above.

Exceptions vs monads
Clear definitions
The correct stack depth
Nesting exceptions
tl;dr

Exceptions vs monads

Firstly, I'd like to touch upon the use of exceptions in Scala, compared to using monads to report errors in a more functional way. Exceptions happen as a side effect in the JVM, and are therefore not functional: throwing an exception prevents us from obtaining a value of the type we were promised, and instead we have to deal with the possible exception separately.

When designing our own APIs, it's often a judgement call whether to report errors using a monad, such as an Either[SomeError, Result], or to throw exceptions. While using a monad is more functional, the layers can become difficult to deal with as the stack depth increases. We may need to handle multiple layers of possible error, flattening them as we go, and we may end up reaching for a functional framework such as Typelevel Cats to do this in a purely functional way. The result can be confusion and additional complexity, along with a new dependency on an opinionated functional framework.

Where you draw the line is highly subjective, and a major point of contention in Scala, so I'll just state here that I try to be as pragmatic as possible: allow exceptions to propagate for exceptional cases, such as database failures and network failures. In cases where we've processed some data and the result is some kind of permanent failure, I might be more inclined to define a strong type or use Either to reflect this result. In all cases I avoid any approach that becomes so complex that it is difficult to reason about or requires an additional framework.
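As a rough sketch of the "strong type or Either" side of that line, here is how an expected, permanent failure in processed data might be modelled. The names ValidationError, MissingField, InvalidAge, User and parseUser are all made up for illustration, and toIntOption assumes Scala 2.13 or later.

```scala
// Hypothetical error type for illustration; not from any library.
sealed trait ValidationError
final case class MissingField(name: String) extends ValidationError
final case class InvalidAge(raw: String) extends ValidationError

final case class User(name: String, age: Int)

// A permanent, expected failure becomes part of the return type.
// Infrastructure failures (database, network) would still be left
// to propagate as exceptions from wherever the fields were loaded.
def parseUser(fields: Map[String, String]): Either[ValidationError, User] =
  for {
    name   <- fields.get("name").toRight[ValidationError](MissingField("name"))
    rawAge <- fields.get("age").toRight[ValidationError](MissingField("age"))
    age    <- rawAge.toIntOption.filter(_ >= 0).toRight[ValidationError](InvalidAge(rawAge))
  } yield User(name, age)
```

With a single flat error type like this, the for-comprehension stays readable without any extra framework; it is when several such layers have to be combined that the complexity described above tends to appear.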

What's inescapable is that at some point you will have to deal with exceptions in some form:

Java APIs and many Scala APIs will throw exceptions
Future and Try both encapsulate an exception, albeit wrapping it in a monad
At the top level you'll have to acknowledge that an exception may bubble up from somewhere

With that in mind, we'll focus on dealing with and defining clear exceptions. However, a lot of what's covered in this article will be relevant even if you eliminate exceptions early and prefer encapsulating your own errors in monads instead.
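To illustrate the point about Try: it does not remove the exception, it only captures it as a value that we still have to inspect. The sketch below wraps a throwing Java API; parsePort and describe are hypothetical helper names.

```scala
import scala.util.{Failure, Success, Try}

// Integer.parseInt is a Java API that throws NumberFormatException on bad input.
// Try catches it and stores it in a Failure instead of letting it propagate.
def parsePort(raw: String): Try[Int] = Try(Integer.parseInt(raw))

// The exception is still there; we match on its type just as a catch block would.
def describe(raw: String): String =
  parsePort(raw) match {
    case Success(port)                     => s"port $port"
    case Failure(_: NumberFormatException) => s"not a number: $raw"
    case Failure(other)                    => s"unexpected failure: ${other.getMessage}"
  }
```

Note that the quality of the match in describe depends entirely on how specific the underlying exception type is, which is the subject of the next section.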

Clear definitions

An important aspect of reporting errors, whether using exceptions or not, is to make them very clear and specific — to both machines and humans. Often, this means defining your own exception type and throwing it:

// `val` exposes the id so that handlers and tests can read it back
class EntryNotFoundException(val id: String) extends RuntimeException(s"No such entry: $id")

This provides us with a clear type, allowing our eventual error-handling code to identify it and deal with it logically. It also provides some extra information at that stage: we know the problematic ID as well. By providing a readable error message, we also make it clear to humans, so that if this exception appears in logs, we'll know exactly what happened, especially when combined with the stack trace.

An approach I see too often looks like this instead:

throw new Exception(s"No such entrry: $id")

We've at least provided an error message, so we can read it ok in the logs — though we may have to broaden our search of where the error originated since we have a very generic type. However, when this propagates several layers up the stack, we have no way of distinguishing a failure to look up our ID from, say, a network error or a divide by zero. We certainly can't report on which ID failed anymore, either.
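To make the contrast concrete, here is a sketch of what code several layers up can do with each kind of exception. The toStatus function and its status codes are purely illustrative, and the exception is redefined here with id exposed as a val so that the block stands alone.

```scala
// Specific exception type, with the id readable by handlers.
class EntryNotFoundException(val id: String)
  extends RuntimeException(s"No such entry: $id")

// Hypothetical top-level handler mapping failures to a (status, message) pair.
def toStatus(e: Throwable): (Int, String) = e match {
  // The specific type lets us pick a response and report which id failed.
  case notFound: EntryNotFoundException => (404, s"Entry ${notFound.id} does not exist")
  // A generic Exception lands here, indistinguishable from any other failure.
  case _                                => (500, "Internal error")
}
```

Passing new Exception("No such entrry: 1234") to toStatus can only ever produce the generic 500 branch, short of parsing the message text.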

Let's also consider how we test our error cases:

// With a well-defined exception, we know we've failed for the right reason: our bad ID was not found
val err = the[EntryNotFoundException] thrownBy myCode()
err.id shouldBe expectedId

// Lacking that, the best we can do is match on a human-readable message
val genericErr = the[Exception] thrownBy myCode()
genericErr.getMessage shouldBe s"No such entrry: $expectedId"

This is a far weaker test, and worse, overly sensitive: it will fail if someone later decides to fix the typo in the word "entry".

The correct stack depth

One decision to make when dealing with exceptions is the correct depth at which to eliminate them. A common mistake is to eliminate, or even avoid, exceptions too soon:

def supplementaryData(id: String): Option[SupplementaryData] = {
  cache.get(id) orElse {
    logger.error(s"Failed to fetch $id from the cache")
    None
  }
}

Here we've logged a message about the cache miss, so we do see some info in the logs. We return an Option, so the caller can at least decide to report that lack of data as it sees fit.

However:

From a human point of view, we've logged a message and an ID but we have no stack trace, so we only know that we were querying the cache. We have no way of knowing why we were querying the cache, in a scenario where several different processes may need to access this data, and can only guess whether this resulted in a dropped message, an incomplete database entry, or an HTTP 500 back to a user.
From an error handling point of view, the upstream code can see we didn't get the data, but has no idea why. If we had propagated an exception, or provided an Either[Err, SupplementaryData] with a more specific reason, we would have greater power to act on what went wrong.

To preserve some of this information to the caller, we could change our function a bit:

def supplementaryData(id: String): SupplementaryData = {
  cache.get(id) getOrElse {
    throw new CacheMissException(id)
  }
}

// In this case we've decided to deal with a cache miss by simply
// leaving the extra data off the result.
// (This assumes CacheMissException is a subtype of CacheException.)
def get(id: String): Record = {
  val supplement: Option[SupplementaryData] = try {
    Some(supplementaryData(id))
  } catch {
    case e: CacheException =>
      logger.error(s"Error while fetching supplementary data for $id, omitting data", e)
      None
  }

  Record(id, basicData(id), supplement)
}

This can be quite subjective. Catching the error at a low level may be fine while only one process is performing this cache lookup, and you may also consider dealing with errors through retries and cache refills to be part of the API of your cache class. In that scenario, simply reporting that we had a cache miss, and adding information about what we were doing at the time, can solve the problem. The cache would log some more internal information as needed, though we still lose the ability to relate the two events. There's no clear-cut answer here; the correct depth at which to deal with errors may vary from case to case.

This makes it important to consider, early on, how you intend to deal with these exceptions, how much information you'll need to preserve, and what you'll be seeing in the logs (or reported to your dashboard as metrics). This will inform where error handling should happen.
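For comparison, the Either-based alternative mentioned earlier in this section might look something like the following. This is only a sketch: CacheError, CacheMiss and CacheUnavailable are invented names, the shapes of SupplementaryData and Record are guesses, an in-memory Map stands in for the real cache, and println stands in for a logger.

```scala
// Hypothetical data shapes; the article does not define them.
final case class SupplementaryData(notes: String)
final case class Record(id: String, basic: String, supplement: Option[SupplementaryData])

// The reason for the failure is part of the return type.
sealed trait CacheError
final case class CacheMiss(id: String) extends CacheError
final case class CacheUnavailable(reason: String) extends CacheError

// Stand-in for a real cache client.
val cache: Map[String, SupplementaryData] = Map("1" -> SupplementaryData("extra"))

def supplementaryData(id: String): Either[CacheError, SupplementaryData] =
  cache.get(id).toRight[CacheError](CacheMiss(id))

// The caller decides what each failure means in its own context.
def get(id: String): Record = {
  val supplement = supplementaryData(id) match {
    case Right(data) => Some(data)
    case Left(CacheMiss(missing)) =>
      println(s"No supplementary data cached for $missing, omitting data")
      None
    case Left(CacheUnavailable(reason)) =>
      println(s"Cache unavailable ($reason) while building record $id, omitting data")
      None
  }
  Record(id, s"basic-$id", supplement)
}
```

The trade-off against the exception version is that we keep the specific reason and the compiler checks that every case is handled, but we lose the stack trace unless we choose to carry one in the error value.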

Nesting exceptions

A neat feature of exceptions is that you can nest them by providing one exception as a cause to another, which will provide you with a stack trace detailing each exception:

class LookupException(id: String, cause: Throwable)
  extends RuntimeException(s"Failed lookup for id $id", cause)

try {
  cache.supplementaryData(id)
} catch {
  case e: CacheException =>
    throw new LookupException(id, e)
}

If we take this approach, we can see that a cache miss resulted in a lookup failure being thrown, and whatever was performing that lookup can add further information. Our log might look like:

Failed to serve request to lookup for id 1234, responding with 500
com.example.LookupException: Failed lookup for id 1234
      at ...
Caused by: com.example.CacheMissException: Failed to fetch id 1234 from the cache
      at ...
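The same chain that the logger prints is also available to code through Throwable.getCause. The sketch below redefines the two exceptions so that it stands alone (the CacheMissException message is taken from the sample log above), and causeChain is a hypothetical helper name.

```scala
class CacheMissException(id: String)
  extends RuntimeException(s"Failed to fetch id $id from the cache")

class LookupException(id: String, cause: Throwable)
  extends RuntimeException(s"Failed lookup for id $id", cause)

// Walk from the outermost exception down to the root cause.
// getCause returns null once there is no further cause.
def causeChain(e: Throwable): List[Throwable] =
  Iterator.iterate(e)(_.getCause).takeWhile(_ != null).toList

val nested = new LookupException("1234", new CacheMissException("1234"))

// One message per layer, outermost first.
val messages = causeChain(nested).map(_.getMessage)
```

This can be handy when a top-level handler needs to react to the root cause, for example treating a lookup failure caused by a cache miss differently from one caused by a network error.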

tl;dr

To summarise:

Keep in mind the big picture of how different errors will be handled by your application as you write your code, and use that to inform how and where you'll handle them
Consider both what the logs will look like, and how much information is needed to correctly handle problems
Make sure you provide clear, specific error types and clear messaging
Keep in mind how testing will work when defining your exceptions