Regardless of language used, when we write software, it's always with the understanding that at some point, something will go wrong. When things go wrong, we have a few choices of how to deal with them. If we fail to deal with them, the default behaviour tends to be drastic – services going down or ceasing normal function, lack of transparency of what went wrong and why, perhaps some confusing log messages or no logging at all.
It's important, then, to consider your error cases when designing the software. If a database operation fails, do we need to retry? What if the failure is persistent? Is it okay to drop a message if we failed to process it, or do we need to be resilient about eventually passing it on? How do we need to communicate a failure? In the worst case scenarios, does it make most sense to shut down the entire application rather than continue to operate in an unpredictable way?
At the lowest level, though, we need to consider how to handle the exceptions themselves in a way that makes acting on those decisions easy. Today we'll discuss best practices for propagating and catching exceptions, and designing clear exceptions. This is the first step in answering the much broader questions highlighted above.
Firstly, I'd like to touch upon the use of exceptions in Scala, compared to using monads to report error behaviour in a more functional way. Exceptions happen as a side effect in the JVM, and therefore are not functional: throwing an exception prevents us obtaining the type we were promised, and instead we have to deal with the possible exception separately.
When designing our own APIs, it's often a judgement call whether to report errors using a monad, such as an Either[SomeError, Result], or to throw exceptions. While using a monad is more functional, building up layers can become difficult to deal with as the stack depth increases. We may need to deal with multiple layers of possible error, flattening those down as we go, and we may end up reaching for a functional framework such as Typelevel Cats to deal with these in a purely functional way. These problems can result in confusion and additional complexity, along with a new dependency on opinionated functional frameworks.
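To illustrate the layering problem, here is a minimal sketch using only the standard library. The error types (DbError, ValidationError, ServiceError) and the functions load, validate and lookup are hypothetical names invented for this example. Two layers each report their own error type, so combining them means first mapping both onto a common type.

```scala
object LayeredEithers {
  // Hypothetical error types, one per layer.
  final case class DbError(reason: String)
  final case class ValidationError(reason: String)

  // A common type is needed before the two layers can be combined.
  sealed trait ServiceError
  final case class FromDb(e: DbError) extends ServiceError
  final case class FromValidation(e: ValidationError) extends ServiceError

  // Stand-in for a database read.
  def load(id: String): Either[DbError, String] =
    if (id == "1") Right("alice") else Left(DbError(s"no row for $id"))

  // Stand-in for a validation step.
  def validate(name: String): Either[ValidationError, String] =
    if (name.nonEmpty) Right(name.capitalize) else Left(ValidationError("empty name"))

  // Each step has to be lifted into ServiceError before the
  // for-comprehension can flatten the layers together.
  def lookup(id: String): Either[ServiceError, String] =
    for {
      raw  <- load(id).left.map(FromDb(_))
      name <- validate(raw).left.map(FromValidation(_))
    } yield name
}
```

With only two layers this is manageable. The wrapping and lifting grows with every layer added, which is typically the point where a functional framework starts to look attractive.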
Where you draw the line is highly subjective, and a major point of contention in Scala, so I'll just state here that I try to be as pragmatic as possible: allow exceptions to propagate for exceptional cases, such as database failures and network failures. In cases where we've processed some data and the result is some kind of permanent failure, I might be more inclined to define a strong type or use Either to reflect this result. In all cases I avoid letting it result in a degree of complexity which makes the code difficult to reason about or requires an additional framework.
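As a rough sketch of that split, assume a hypothetical parsePercentage function whose failures are permanent and expected, alongside a hypothetical fetchRaw function that simulates a network fault. None of these names come from a real API. The expected failures are returned as values, while the exceptional one is simply thrown and left to propagate.

```scala
import java.io.IOException
import scala.util.Try

object PragmaticErrors {
  // Hypothetical domain errors: permanent, expected failures the caller should handle.
  sealed trait ParseError
  final case class NotANumber(input: String) extends ParseError
  final case class OutOfRange(value: Int) extends ParseError

  // Expected failures are reported as values in an Either.
  def parsePercentage(input: String): Either[ParseError, Int] =
    Try(input.trim.toInt).toOption match {
      case None                        => Left(NotANumber(input))
      case Some(n) if n < 0 || n > 100 => Left(OutOfRange(n))
      case Some(n)                     => Right(n)
    }

  // Exceptional failures (here a simulated network fault) are thrown.
  def fetchRaw(key: String): String =
    if (key.isEmpty) throw new IOException("connection reset")
    else "42"

  // The two styles compose: infrastructure errors propagate as exceptions,
  // domain errors stay visible in the return type.
  def fetchPercentage(key: String): Either[ParseError, Int] =
    parsePercentage(fetchRaw(key))
}
```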
What's inescapable is that at some point you will have to deal with exceptions in some form: Future and Try both encapsulate an exception, albeit wrapping it in a monad.
With that in mind, we'll focus on dealing with and defining clear exceptions. However, a lot of what's covered in this article will be relevant even if you eliminate exceptions early and prefer encapsulating your own errors in monads instead.
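A small sketch of what that encapsulation looks like with Try from the standard library. The risky and describe functions are made up for the example. A failed Future carries its Throwable in much the same way, but is left out here to keep the sketch synchronous.

```scala
import scala.util.{Failure, Success, Try}

object TryExample {
  // Hypothetical operation that throws for one particular input.
  def risky(n: Int): Int =
    if (n == 0) throw new ArithmeticException("division by zero")
    else 100 / n

  // Try captures a thrown (non-fatal) exception as a Failure value,
  // so the exception type is still what we end up matching on.
  def describe(n: Int): String =
    Try(risky(n)) match {
      case Success(v)                      => s"ok: $v"
      case Failure(e: ArithmeticException) => s"arithmetic failure: ${e.getMessage}"
      case Failure(e)                      => s"unexpected: ${e.getClass.getSimpleName}"
    }
}
```

Even in the monadic style, the quality of the handling depends on how specific the wrapped exception is.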
An important aspect of reporting errors, whether using exceptions or not, is to make them very clear and specific — to both machines and humans. Often, this means defining your own exception type and throwing it:
class EntryNotFoundException(val id: String) extends RuntimeException(s"No such entry: $id")
This provides us with a clear type, allowing our eventual error-handling code to identify it and deal with it logically. It also provides some extra information at that stage: we know the problematic ID as well. By providing a readable error message, we also make it clear to humans, so that if this exception appears in logs, we'll know exactly what happened, especially when combined with the stack trace.
An approach I see too often looks like this instead:
throw new Exception(s"No such entrry: $id")
We've at least provided an error message, so we can read it ok in the logs — though we may have to broaden our search of where the error originated since we have a very generic type. However, when this propagates several layers up the stack, we have no way of distinguishing a failure to look up our ID from, say, a network error or a divide by zero. We certainly can't report on which ID failed anymore, either.
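To make the difference concrete, here is a sketch of a handler several layers up the stack. The statusFor function and its HTTP-style status codes are hypothetical, and the exception class is declared with a val so that the ID is readable by the handler.

```scala
import scala.util.control.NonFatal

object HandleByType {
  // The ID is exposed as a val so handlers can report on it.
  class EntryNotFoundException(val id: String)
    extends RuntimeException(s"No such entry: $id")

  // Hypothetical top-level handler mapping failures to status codes.
  def statusFor(action: => String): (Int, String) =
    try {
      (200, action)
    } catch {
      // A specific type lets us respond precisely and name the offending ID.
      case e: EntryNotFoundException => (404, s"Unknown id ${e.id}")
      // Anything generic can only be treated as an unknown failure.
      case NonFatal(_)               => (500, "Internal error")
    }
}
```

Had the lookup thrown a plain Exception, it would fall into the second case and be indistinguishable from any other failure.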
Let's also consider how we test our error cases:
// With a well-defined exception, we know we've failed for the right reason: our bad ID was not found
val err = the[EntryNotFoundException] thrownBy myCode()
err.id shouldBe expectedId
// Lacking that, the best we can do is match on a human-readable message
val genericErr = the[Exception] thrownBy myCode()
genericErr.getMessage shouldBe s"No such entrry: $expectedId"
This is a far weaker test, and worse, overly sensitive: it will fail if someone later decides to fix the typo in the word "entry".
One decision to make when dealing with exceptions is the correct depth at which to eliminate them. A common mistake is to eliminate, or even avoid, exceptions too soon:
def supplementaryData(id: String): Option[SupplementaryData] = {
  cache.get(id) orElse {
    logger.error(s"Failed to fetch $id from the cache")
    None
  }
}
Here we've logged a message about the cache miss, so we do see some info in the logs. We return an Option, so the caller can at least decide to report that lack of data as it sees fit.
However, an Option tells the caller only that the data is absent, not why. With an Either[Err, SupplementaryData] carrying a more specific reason, we have greater power to act on what went wrong.
To preserve some of this information for the caller, we could change our function a bit:
def supplementaryData(id: String): SupplementaryData = {
  cache.get(id) getOrElse {
    throw new CacheMissException(id)
  }
}
// in this case we've decided to deal with a cache miss by simply
// leaving the extra data off a result
// (CacheMissException is assumed to be a subclass of CacheException):
def get(id: String): Record = {
  val supplement: Option[SupplementaryData] = try {
    Some(supplementaryData(id))
  } catch {
    case e: CacheException =>
      logger.error(s"Error while fetching supplementary data for $id, omitting data", e)
      None
  }
  Record(id, basicData(id), supplement)
}
This can be quite subjective: catching the exception at a low level may be fine while only one process is performing this cache lookup, and you may also consider handling errors through retries and cache refills to be part of the API of your cache class. In that scenario, reporting simply that we had a cache miss, and adding info about what we were doing at the time, can solve the problem. The cache would log some more internal information as needed, though we still lose the ability to relate the two events. There's no clear-cut answer here, however; the correct depth at which to deal with errors may vary from case to case.
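For comparison, the Either[Err, SupplementaryData] alternative mentioned earlier could be sketched roughly as follows. The members of Err, the in-memory Map standing in for the cache, and the shape of SupplementaryData are all assumptions made for illustration.

```scala
object EitherLookup {
  // Hypothetical error type describing why supplementary data is unavailable.
  sealed trait Err
  final case class CacheMiss(id: String) extends Err
  // Not produced in this sketch; shown as another reason a real cache might report.
  final case class CacheUnavailable(reason: String) extends Err

  final case class SupplementaryData(value: String)

  // Stand-in for the cache: a plain in-memory map.
  val cache: Map[String, SupplementaryData] =
    Map("1" -> SupplementaryData("extra"))

  // The reason for a missing value now travels with the result.
  def supplementaryData(id: String): Either[Err, SupplementaryData] =
    cache.get(id).toRight(CacheMiss(id))
}
```

The caller can still fold this down to an Option if it only cares about presence, but the decision to discard the reason is then made where the context is known.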
This makes it important to consider, early on, how you intend to deal with these exceptions, how much information you'll need to preserve, and what you'll be seeing in the logs (or reported to your dashboard as metrics). This will inform where error handling should happen.
A neat feature of exceptions is that you can nest them by providing one exception as a cause to another, which will provide you with a stack trace detailing each exception:
class LookupException(id: String, cause: Throwable)
  extends RuntimeException(s"Failed lookup for id $id", cause)
try {
  cache.supplementaryData(id)
} catch {
  case e: CacheException =>
    throw new LookupException(id, e)
}
If we take this approach, we can see that we got a cache miss which resulted in a lookup failure being thrown, and whatever was performing that lookup can add further information. Our log might look like:
Failed to serve request to lookup for id 1234, responding with 500
com.example.LookupException: Failed lookup for id 1234
at ...
Caused by: com.example.CacheMissException: Failed to fetch id 1234 from the cache
at ...
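The same chain is available to code as well as to the logs. As a small sketch, the helper names causes and rootCauseId below are made up, and the two exception classes are simplified stand-ins for the ones above. A handler can walk getCause to find the original failure:

```scala
object CauseChain {
  // Simplified stand-ins for the exceptions used in this article.
  class CacheMissException(val id: String)
    extends RuntimeException(s"Failed to fetch id $id from the cache")
  class LookupException(val id: String, cause: Throwable)
    extends RuntimeException(s"Failed lookup for id $id", cause)

  // Follows getCause links from the outermost exception down to the root cause.
  def causes(t: Throwable): List[Throwable] =
    Iterator.iterate(t)(_.getCause).takeWhile(_ != null).toList

  // Finds the ID reported by the first cache miss in the chain, if there is one.
  def rootCauseId(t: Throwable): Option[String] =
    causes(t).collectFirst { case e: CacheMissException => e.id }
}
```

This only works because each layer passed its cause along when wrapping. An exception rethrown without a cause leaves nothing to walk.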
To summarise: