Typesafe config is a popular, even ubiquitous means of configuring applications in Scala, using a custom format developed by Lightbend, HOCON. HOCON is a JSON superset, meaning any JSON is also valid HOCON – but with many additional features to make it easier for humans to read and write, less verbose, and more flexible.
It's a very popular choice because of how quickly and easily you can configure your Scala applications, so most Scala developers are familiar with it to some extent. However, all too often developers overlook some of the most powerful features of the format.
Today I'll cover some quality of life features which are often overlooked and how I've found them useful, and the mistakes I've often seen made in configuring applications with typesafe config.
As a JSON superset language, HOCON firstly gives you a structured, readable means of writing your config. This includes trimming down unnecessary quotes where not needed, eliminating commas, and adding comments. It also lets you write nested objects as entire nested sections, like JSON, or simply refer to paths using a dotty syntax, or any combination of the two.
db {
  postgres {
    host = localhost
    port = 5432
  }
  redis {
    host = localhost
    port = 6379
  }
}
queues.rabbitmq {
  host = localhost
  port = 5672
}
# Quoted because ':' and '//' are not allowed in unquoted strings
assets.images.stock-photos = "file:///usr/share/stock-photos/"
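To make the path syntax concrete, here is a minimal Scala sketch that parses a config like the one above from a string and reads values back by their dotted paths. It assumes the Typesafe config library (`com.typesafe.config`) is on the classpath; the object name `BasicSyntaxExample` is purely illustrative.

```scala
import com.typesafe.config.{Config, ConfigFactory}

object BasicSyntaxExample {
  // Nested sections and dotted keys both end up as ordinary paths.
  val config: Config = ConfigFactory.parseString(
    """
      |db {
      |  postgres {
      |    host = localhost
      |    port = 5432
      |  }
      |}
      |queues.rabbitmq {
      |  host = localhost
      |  port = 5672
      |}
      |# Quoted, since ':' is not allowed in an unquoted string
      |assets.images.stock-photos = "file:///usr/share/stock-photos/"
      |""".stripMargin)

  // Every value is addressed by its full dotted path, however it was written.
  val postgresHost: String = config.getString("db.postgres.host")
  val rabbitPort: Int = config.getInt("queues.rabbitmq.port")
  val stockPhotos: String = config.getString("assets.images.stock-photos")

  def main(args: Array[String]): Unit =
    println(s"postgres at $postgresHost, rabbitmq on port $rabbitPort, photos at $stockPhotos")
}
```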
These simple features already make HOCON nice to write, but there are several powerful features which are easy to miss. The below is by no means comprehensive, but covers some features I've found useful and some situations in which I've used them. For a more comprehensive guide check out the documentation.
You can declare properties and sections multiple times, and the latest version to appear will be taken:
phrases {
  greeting = hello
  greeting = bonjour
  confirmation = yes
}
phrases {
  confirmation = oui
}
phrases.rejection = non
...resulting in a final selection of bonjour, oui, and non in that section. This means you can easily override values specified in default config by whole section or by individual property.
This also means you could concatenate base.conf with prod.conf so that the values you've specified in prod.conf override those in base.conf without affecting anything else – this can be useful when your deployment is varied. For example, if one copy of your application is for Europe and one is for North America, you might concatenate europe.conf and prod.conf to create your full config file for Europe.
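As a rough illustration of the "latest definition wins" rule, the sketch below concatenates two config texts and parses the result once. The two strings are hypothetical stand-ins for the contents of base.conf and prod.conf, and the object name is made up for the example.

```scala
import com.typesafe.config.{Config, ConfigFactory}

object OverrideExample {
  // Hypothetical contents of base.conf
  val base: String =
    """
      |phrases {
      |  greeting = hello
      |  confirmation = yes
      |}
      |""".stripMargin

  // Hypothetical contents of prod.conf: overrides by property and by section
  val prod: String =
    """
      |phrases.greeting = bonjour
      |phrases {
      |  confirmation = oui
      |}
      |phrases.rejection = non
      |""".stripMargin

  // Parsing the concatenated text: later definitions replace earlier ones,
  // and sections with the same name are merged key by key.
  val config: Config = ConfigFactory.parseString(base + prod)

  val greeting: String = config.getString("phrases.greeting")
  val confirmation: String = config.getString("phrases.confirmation")
  val rejection: String = config.getString("phrases.rejection")

  def main(args: Array[String]): Unit =
    println(s"$greeting, $confirmation, $rejection")
}
```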
You can reuse or embed other properties into different config parameters, allowing you to refactor common values into a section, or concatenate values to build a longer string. For example, building full URLs from components, or reusing a secret to talk to multiple applications:
# You can use something like a meta namespace to indicate these properties aren't directly
# read by your application but allow them to be used in properties which are; this is optional
# but may aid readability
meta {
  app-secret = MY_SECRET
  scraper {
    protocol = https
    host = www.example.com
    port = 8080
    path = /stuff
    root = ${meta.scraper.protocol}"://"${meta.scraper.host}":"${meta.scraper.port}${meta.scraper.path}
  }
}
scraper {
  cars-endpoint = ${meta.scraper.root}/cars
  sofas-endpoint = ${meta.scraper.root}/sofas
  secret = ${meta.app-secret}
}
data-service {
  secret = ${meta.app-secret}
}
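One detail worth knowing when reading such a config from code: substitutions are only expanded once the config is resolved. `ConfigFactory.load()` resolves for you, but a config parsed by hand needs an explicit `resolve()`. The following is a sketch under that assumption, using a trimmed copy of the config above and an illustrative object name.

```scala
import com.typesafe.config.{Config, ConfigFactory}

object SubstitutionExample {
  // No string interpolator is used here, so ${...} reaches the HOCON parser untouched.
  val config: Config = ConfigFactory.parseString(
    """
      |meta {
      |  app-secret = MY_SECRET
      |  scraper {
      |    protocol = https
      |    host = www.example.com
      |    port = 8080
      |    path = /stuff
      |    root = ${meta.scraper.protocol}"://"${meta.scraper.host}":"${meta.scraper.port}${meta.scraper.path}
      |  }
      |}
      |scraper {
      |  cars-endpoint = ${meta.scraper.root}/cars
      |  secret = ${meta.app-secret}
      |}
      |""".stripMargin).resolve() // expands every ${...} reference

  val carsEndpoint: String = config.getString("scraper.cars-endpoint")
  val secret: String = config.getString("scraper.secret")

  def main(args: Array[String]): Unit =
    println(s"cars endpoint: $carsEndpoint")
}
```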
Objects can be merged by writing them one after another as a single value, which means it's easy to merge extra values into a set of defaults.
I've found this very useful in the past when writing a data streaming framework. The framework defined a standard data pipeline to work with multiple different types of data, and you could run multiple pipelines in parallel to handle multiple data feeds. Each pipeline needed several properties to be configured, some of which were more likely to need changing than others. That meant my config looked something like this:
# Provided in the library (reference.conf)
defaults {
  pipeline {
    feed-name = null
    buffer-size = 1000
    alarm-threshold = 10s
    max-packet-size = 10K
  }
}
# In the service (application.conf)
# Note the '=': it may only be omitted when the key is followed directly by '{'
pipelines {
  football = ${defaults.pipeline} {
    feed-name = football
  }
  basketball = ${defaults.pipeline} {
    feed-name = basketball
    alarm-threshold = 30s
    max-packet-size = 1M
  }
}
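Here is a hedged sketch of how such merged pipeline sections might be read from Scala. Two parsed strings stand in for reference.conf and application.conf, layered with `withFallback` before resolving; the object and value names are illustrative, not part of any framework.

```scala
import java.time.Duration
import com.typesafe.config.{Config, ConfigFactory}

object PipelineDefaultsExample {
  // Stand-in for the library's reference.conf
  val reference: Config = ConfigFactory.parseString(
    """
      |defaults.pipeline {
      |  feed-name = null
      |  buffer-size = 1000
      |  alarm-threshold = 10s
      |  max-packet-size = 10K
      |}
      |""".stripMargin)

  // Stand-in for the service's application.conf
  val application: Config = ConfigFactory.parseString(
    """
      |pipelines {
      |  football = ${defaults.pipeline} {
      |    feed-name = football
      |  }
      |  basketball = ${defaults.pipeline} {
      |    feed-name = basketball
      |    alarm-threshold = 30s
      |  }
      |}
      |""".stripMargin)

  // Layer the service config over the library defaults, then expand substitutions.
  val config: Config = application.withFallback(reference).resolve()

  val football: Config = config.getConfig("pipelines.football")
  val basketball: Config = config.getConfig("pipelines.basketball")

  // Untouched keys come from the defaults; overridden keys take the new value.
  val footballBuffer: Int = football.getInt("buffer-size")
  val footballAlarm: Duration = football.getDuration("alarm-threshold")
  val basketballAlarm: Duration = basketball.getDuration("alarm-threshold")

  def main(args: Array[String]): Unit =
    println(s"football alarm: $footballAlarm, basketball alarm: $basketballAlarm")
}
```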
So how do we use HOCON to configure our application? Where does this config live, and how do we configure per environment?
Basics first. The "default" config file of your application will usually live in src/main/resources/application.conf. This might look something like this:
http {
  bind = localhost
  port = 8080
}
db {
  postgres {
    host = localhost
    port = 5432
  }
}
This is the version that is built into your application's .jar file, so this is the version which contains your defaults. In my example, we're configuring a simple application which talks to a postgres database and serves an HTTP interface. Since this is baked into the .jar, our defaults contain sensible values you might use in a non-production environment, making it easy to run during development with no further configuration, where we might be running dependent services locally.
Since libraries which use typesafe config (e.g. Akka) will provide a variety of their own properties to configure, and their own defaults in reference.conf included in the library, you can also configure properties of the library as needed, and use the same approach for your own libraries.
Having all our config baked into the build artifact isn't much use to us, though. How do we configure it in a particular environment, or on a particular host or container? We have a few options.
At runtime, we can override any property we like using runtime arguments as -D parameters with dotty config paths. This might look like:
java -Dhttp.bind=0.0.0.0 -Ddb.postgres.host=postgres.example.com -jar myapp.jar
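To illustrate how system properties are layered over the application config, the sketch below sets a property programmatically to mimic `-Dhttp.bind=0.0.0.0`. It uses a parsed string in place of application.conf and passes it to `ConfigFactory.load(Config)`; setting the property in code is only for demonstration, and the object name is illustrative.

```scala
import com.typesafe.config.{Config, ConfigFactory}

object SystemPropertyOverrideExample {
  // Stand-in for the application.conf baked into the jar
  val defaults: Config = ConfigFactory.parseString(
    """
      |http {
      |  bind = localhost
      |  port = 8080
      |}
      |""".stripMargin)

  // Mimic passing -Dhttp.bind=0.0.0.0 on the command line (demonstration only).
  System.setProperty("http.bind", "0.0.0.0")
  // System properties are cached by the library, so refresh them after the change.
  ConfigFactory.invalidateCaches()

  // load(Config) puts system properties on top of the given config and resolves it.
  val config: Config = ConfigFactory.load(defaults)

  val bind: String = config.getString("http.bind") // overridden by the property
  val port: Int = config.getInt("http.port")       // still the default

  def main(args: Array[String]): Unit =
    println(s"binding to $bind:$port")
}
```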
We can take the value of environment variables by specifying them in the config file. We can also take them if present and fall back to something else otherwise. For example, we could change our example to look like:
http {
  # Default first; the optional substitution replaces it only if the variable is set
  bind = localhost
  bind = ${?MYAPP_BIND}
  port = 8080
}
db {
  postgres {
    host = localhost
    host = ${?POSTGRES_HOST}
    port = 5432
  }
}
Now we can configure MYAPP_BIND and POSTGRES_HOST variables in our environment to override these properties, and they'll keep their previous values if we haven't done so. This can be very useful in a local development environment, as you can simply set some preferred properties in a .bashrc. It can also be useful when running your application in docker, as you can easily inject these env vars using -e when running your container.
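A small sketch of the optional-substitution pattern follows. The default is defined first and `${?MYAPP_BIND}` second, because the later definition wins and an optional substitution that cannot be resolved leaves the earlier value in place. The object name is illustrative; the outcome depends on whether MYAPP_BIND is set in the environment of the process.

```scala
import com.typesafe.config.{Config, ConfigFactory}

object EnvFallbackExample {
  val config: Config = ConfigFactory.parseString(
    """
      |http {
      |  bind = localhost
      |  bind = ${?MYAPP_BIND}
      |  port = 8080
      |}
      |""".stripMargin).resolve() // by default, resolve() also consults environment variables

  // "localhost" unless MYAPP_BIND is set, in which case its value is used.
  val bind: String = config.getString("http.bind")

  def main(args: Array[String]): Unit =
    println(s"http.bind = $bind")
}
```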
We can provide a separate config file, and override only what we need to:
include "application"
http.bind = 0.0.0.0
db.postgres.host = postgres.example.com
Here we include the default application config from the jar file, and then we override the properties we need to change. Note that, in general, you will always want to fall back on this file, as by design you are required to configure every valid property of the application. What you don't want to do is copy out every property again in order to do so, which makes you repeat yourself and exposes you to errors when new config options are added.
We've looked at a few simple, handy ways of configuring your application per environment or use case. I've deliberately left external config files fresh in your mind here because they are written in HOCON, making them by far the most expressive.
Leaving that aside for the moment, let's look at two common mistakes in creating your configuration, which generally boil down to not using HOCON:
The first is configuring everything with -D parameters. These are useful for overriding a parameter or two in development, say -Ddebug=true, but let's imagine a case where you need to configure several properties:
# Default application.conf
http {
  bind = localhost
  port = 8080
}
db.postgres {
  host = localhost
  port = 5432
  username = postgres
  password = postgres
}
auth-service {
  host = localhost
  port = 8081
  secret = DEVSECRET
}
user-blacklist = [hacker1, hacker2, hacker3]
And let's write the line to run the application in production:
java \
-Dhttp.bind=0.0.0.0 \
-Dhttp.port=80 \
-Ddb.postgres.host=postgres.example.com \
-Ddb.postgres.username=myappuser \
-Ddb.postgres.password='pr0dP@ssword!' \
-Dauth-service.host=auth.example.com \
-Dauth-service.port=8080 \
-Dauth-service.secret='pr0d_sh@red_s3cr3t' \
-Duser-blacklist.4=hacker4
...bit of a mouthful, and a fair bit trickier to read than the HOCON we're overriding. We can't use any of the power of HOCON to help us here. This is especially noticeable with the user-blacklist property, where altering lists is variably clunky or impossible.
Another concern is that you're writing your config directly into the command which will be run, which means any user on the running host can use tools like ps to see the full text of the command – including a database password and a secret API key in this case.
You could similarly use environment variables to configure every line of your config, but again you have the same problems of reducing your nice structured config to flat key-value pairs, and the additional problem that you have to add a lot of base config to record the environment variables you want to use.
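For contrast, here is a sketch of the same blacklist change expressed in HOCON, where an override can refer back to the value it replaces and concatenate an extra element onto it. Two parsed strings stand in for the default application.conf and a production override file; the object name is illustrative.

```scala
import com.typesafe.config.{Config, ConfigFactory}

object BlacklistOverrideExample {
  // Stand-in for the default application.conf
  val defaults: Config =
    ConfigFactory.parseString("user-blacklist = [hacker1, hacker2, hacker3]")

  // Stand-in for a production override file: the existing list plus one more entry
  val production: Config =
    ConfigFactory.parseString("user-blacklist = ${user-blacklist} [hacker4]")

  // The self-reference resolves to the lower-priority (default) value.
  val config: Config = production.withFallback(defaults).resolve()

  val blacklist: java.util.List[String] = config.getStringList("user-blacklist")

  def main(args: Array[String]): Unit =
    println(blacklist)
}
```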
Continuing the theme of not using HOCON and losing its benefits, the second mistake I've seen frequently is ignoring HOCON and configuring the application in your infrastructure code. I will demonstrate what I mean using ansible here, but I've seen this same mistake made with ansible, puppet, kubernetes and terraform.
Consider that you're trying to configure the same application as above, and now you have an ansible variables file:
# group_vars/production.yml
http_bind: 0.0.0.0
http_port: 80
postgres_host: postgres.example.com
postgres_username: "{{ secrets.postgres_user }}"
postgres_password: "{{ secrets.postgres_password }}"
auth_host: auth.example.com
auth_port: 80
auth_secret: "{{ secrets.auth_secret }}"
user_blacklist: [user1, user2, user3]
When writing your config, you can then use any of the above methods to configure your application. Let's say you decide to use a template to write a HOCON config file.
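http {
  bind = {{ http_bind }}
  port = {{ http_port }}
}
db.postgres {
  host = {{ postgres_host }}
  username = {{ postgres_username }}
  # Quoted so that characters such as '@' or '!' in the secret stay valid HOCON
  password = "{{ postgres_password }}"
}
auth-service {
  host = {{ auth_host }}
  port = {{ auth_port }}
  secret = "{{ auth_secret }}"
}
# to_json renders the list with double quotes, which HOCON accepts
user-blacklist = {{ user_blacklist | to_json }}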
We're writing HOCON this time, but we still have key-value parameters because our actual configuration is a series of ansible variables, and we have to additionally make sure they all get templated into the config file.
This is certainly better than the version with -D params, as we're using YAML and that gives us some of our features back; we could structure our YAML rather than writing it flat, and we have a concept of lists again, so our blacklist isn't so awkward.
The downside is twofold: we have to update two places with new variables, and we're telling ansible about properties it doesn't need to know about.
The solution, then, is to write HOCON as much as possible, so that you get its full power and, as ever, Don't Repeat Yourself. In my example above, this isn't possible for everything. It's very likely that hostnames will be provided by your infrastructure management tooling, and your secrets will be read from your secret management tooling, meaning you'll still need to template your config file.
Likely, though, several pieces of config simply belong in your application config file. The tidiest approach is to keep your infrastructure config simple:
postgres_user: "{{ secrets.postgres_user }}"
postgres_password: "{{ secrets.postgres_password }}"
auth_host: "{{ get_auth_host() }}" # from service discovery, for example
auth_secret: "{{ secrets.auth_secret }}"
and your application config structured with HOCON:
http.bind = 0.0.0.0
db.postgres {
  host = postgres.example.com
  username = {{ postgres_user }}
  password = "{{ postgres_password }}"
}
user-blacklist = [user1, user2, user3, user4, user5]
Tell ansible to write this template to /etc/myapp/app.conf, make sure -Dconfig.file=/etc/myapp/app.conf is included in your run command, and you're good to go.
To summarise, then: