Are you using the full potential of typesafe config?

Typesafe config is a popular, even ubiquitous means of configuring applications in Scala. It uses HOCON, a custom format developed by Lightbend. HOCON is a JSON superset, meaning any valid JSON is also valid HOCON, but it adds many features that make it easier for humans to read and write, less verbose, and more flexible.

It's a very popular choice because it makes configuring Scala applications quick and easy, so most Scala developers are familiar with it to some extent. However, it's all too common to see developers overlook some of the most powerful features of the format.

Today I'll cover some quality of life features which are often overlooked and how I've found them useful, and the mistakes I've often seen made in configuring applications with typesafe config.

HOCON Basics
Managing config
Common mistakes
- Using -D params
- Leaking app config into infrastructure code
The solution
tl;dr

HOCON Basics

As a JSON superset, HOCON first of all gives you a structured, readable means of writing your config. You can drop unnecessary quotes, leave out commas, and add comments. You can also write nested objects as entire nested sections, as in JSON, refer to paths using a dotty syntax, or combine the two.

db {
  postgres {
    host = localhost
    port = 5432
  }

  redis {
    host = localhost
    port = 6379
  }
}

queues.rabbitmq {
  host = localhost
  port = 5672
}

# Values containing ':' or '//' must be quoted
assets.images.stock-photos = "file:///usr/share/stock-photos/"
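
To make the "JSON superset" point concrete, here is a small sketch of the rabbitmq section written twice: once in JSON-style syntax and once in HOCON style. Both forms can live in the same file, and they describe the same values:

```hocon
# JSON-style syntax, still valid HOCON
"queues": {
  "rabbitmq": {
    "host": "localhost",
    "port": 5672
  }
}

# The same values again in HOCON style: no quotes, no commas, dotty path
queues.rabbitmq {
  host = localhost
  port = 5672
}
```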

These simple features already make HOCON nice to write, but there are several powerful features which are easy to miss. The below is by no means comprehensive, but covers some features I've found useful and some situations in which I've used them. For a more comprehensive guide check out the documentation.

Declaring properties multiple times

You can declare properties and sections multiple times. For a plain value the latest declaration wins, and repeated sections are merged key by key:

phrases {
  greeting = hello
  greeting = bonjour
  confirmation = yes
}

phrases {
  confirmation = oui
}

phrases.rejection = non

...resulting in a final selection of bonjour, oui, and non in that section. This means you can easily override values specified in default config by whole section or by individual property.

This also means you could concatenate base.conf with prod.conf, in that order, so that prod.conf overrides only the values it specifies without affecting anything else. This can be useful when your deployment is varied. For example, if one copy of your application is for Europe and one is for North America, you might concatenate europe.conf and prod.conf to create your full config file for Europe.
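
As a rough sketch, the concatenated file for Europe might look like the following. The hostnames and port are made up for illustration and the two "files" are simply pasted one after the other:

```hocon
# --- europe.conf (hypothetical contents) ---
db.postgres.host = postgres.eu.example.com
queues.rabbitmq.host = rabbitmq.eu.example.com

# --- prod.conf (hypothetical contents), appended after europe.conf ---
db.postgres.port = 6432
# Defined in both files: the later definition wins
queues.rabbitmq.host = rabbitmq-prod.eu.example.com
```

The result keeps db.postgres.host from europe.conf, picks up db.postgres.port from prod.conf, and takes queues.rabbitmq.host from prod.conf because it appears last.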

Substitutions

You can reuse or embed other properties into different config parameters, allowing you to refactor common values into a section, or concatenate values to build a longer string. For example, building full URLs from components, or reusing a secret to talk to multiple applications:

# You can use something like a meta namespace to indicate these properties aren't directly
# read by your application but allow them to be used in properties which are; this is optional
# but may aid readability
meta {
  app-secret = MY_SECRET

  scraper {
    protocol = https
    host = www.example.com
    port = 8080
    path = /stuff
    root = ${meta.scraper.protocol}"://"${meta.scraper.host}":"${meta.scraper.port}${meta.scraper.path}
  }
}

scraper {
  cars-endpoint = ${meta.scraper.root}/cars
  sofas-endpoint = ${meta.scraper.root}/sofas
  secret = ${meta.app-secret}
}

data-service {
  secret = ${meta.app-secret}
}
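
For reference, once the substitutions are resolved, the two sections the application actually reads are equivalent to writing the values out by hand:

```hocon
scraper {
  cars-endpoint = "https://www.example.com:8080/stuff/cars"
  sofas-endpoint = "https://www.example.com:8080/stuff/sofas"
  secret = "MY_SECRET"
}

data-service {
  secret = "MY_SECRET"
}
```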

Merging objects

Objects can be merged by writing them one after another as a single value, which makes it easy to merge extra values into a set of defaults.

I've found this very useful in the past when writing a data streaming framework. The framework defined a standard data pipeline to work with multiple different types of data, and you could run multiple pipelines in parallel to handle multiple data feeds. Each pipeline needed several properties to be configured, some of which were more likely to need changing than others. That meant my config looked something like this:

# Provided in the library (reference.conf)
defaults {
  pipeline {
    feed-name = null
    buffer-size = 1000
    alarm-threshold = 10s
    max-packet-size = 10K
  }
}

# In the service (application.conf)
pipelines {
  # The '=' is needed here: it can only be omitted when the key is followed directly by '{'
  football = ${defaults.pipeline} {
    feed-name = football
  }
  basketball = ${defaults.pipeline} {
    feed-name = basketball
    alarm-threshold = 30s
    max-packet-size = 1M
  }
}
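
After resolution, each pipeline is the defaults object with its own overrides layered on top. Written out in full, the config above is equivalent to:

```hocon
pipelines {
  football {
    feed-name = football
    buffer-size = 1000
    alarm-threshold = 10s
    max-packet-size = 10K
  }
  basketball {
    feed-name = basketball
    buffer-size = 1000
    alarm-threshold = 30s
    max-packet-size = 1M
  }
}
```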

Managing config

So how do we use HOCON to configure our application? Where does this config live, and how do we configure per environment?

The default config file

Basics first. The "default" config file of your application will usually live in src/main/resources/application.conf. This might look something like this:

http {
  bind = localhost
  port = 8080
}

db {
  postgres {
    host = localhost
    port = 5432
  }
}

This is the version that is built into your application's .jar file, so this is the version which contains your defaults. In my example, we're configuring a simple application which talks to a postgres database and serves an HTTP interface. Since this is baked into the .jar, our defaults contain sensible values you might use in a non-production environment, making it easy to run during development with no further configuration, where we might be running dependent services locally.

Libraries which use typesafe config (e.g. Akka) ship their own properties, with defaults, in a reference.conf included in the library. You can override those library properties in your application.conf as needed, and you can use the same reference.conf approach for your own libraries.
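
As a small sketch, an application.conf can mix your own settings with overrides of a library's defaults. Here akka.loglevel is a setting Akka defines in its reference.conf, and DEBUG is just an example value:

```hocon
# application.conf

# Your own settings
http.port = 8080

# Override a default that Akka ships in its reference.conf
akka.loglevel = "DEBUG"
```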

Having all our config baked into the build artifact isn't much use to us, though. How do we configure it in a particular environment, or on a particular host or container? We have a few options.

Runtime arguments

At runtime, we can override any property we like by passing JVM system properties as -D parameters with dotty config paths. This might look like:

java -Dhttp.bind=0.0.0.0 -Ddb.postgres.host=postgres.example.com -jar myapp.jar

Environment variables

We can take the value of environment variables by specifying them in the config file. We can also take them if present and fall back to something else otherwise. For example, we could change our example to look like:

http {
  bind = localhost
  bind = ${?MYAPP_BIND}
  port = 8080
}

db {
  postgres {
    host = localhost
    host = ${?POSTGRES_HOST}
    port = 5432
  }
}

Now we can set the MYAPP_BIND and POSTGRES_HOST variables in our environment to override these properties, and they'll keep their previous values if we haven't. Order matters here: the default comes first and the optional ${?...} substitution second, because a later definition wins, and an optional substitution that can't be resolved leaves the earlier value in place. This can be very useful in a local development environment, as you can simply set some preferred properties in a .bashrc. It can also be useful when running your application in docker, as you can easily inject these env vars using -e when running your container.

An external config file

We can provide a separate config file, and override only what we need to:

include "application"

http.bind = 0.0.0.0
db.postgres.host = postgres.example.com

Here we include the default application config from the jar file, and then we override the properties we need to change. Note that you will generally want to include the default file like this, because the application needs a value for every property it reads. What you don't want to do is copy out every property again in the external file, which makes you repeat yourself and exposes you to errors when new config options are added.

Common mistakes

We've looked at a few simple, handy ways of configuring your application per environment or use case. I've deliberately left external config files fresh in your mind here because they are written in HOCON, making them by far the most expressive.

Leaving that aside for the moment, let's look at two common mistakes in creating your configuration, which generally boil down to not using HOCON:

Using -D params

-D params are useful for overriding a parameter or two in development, say -Ddebug=true, but let's imagine an example where you need to configure several properties:

# Default application.conf
http {
  bind = localhost
  port = 8080
}

db.postgres {
  host = localhost
  port = 5432
  username = postgres
  password = postgres
}

auth-service {
  host = localhost
  port = 8081
  secret = DEVSECRET
}

user-blacklist = [hacker1, hacker2, hacker3]

And let's write the line to run the application in production:

java \
  -Dhttp.bind=0.0.0.0 \
  -Dhttp.port=80 \
  -Ddb.postgres.host=postgres.example.com \
  -Ddb.postgres.username=myappuser \
  -Ddb.postgres.password='pr0dP@ssword!' \
  -Dauth-service.host=auth.example.com \
  -Dauth-service.port=8080 \
  -Dauth-service.secret='pr0d_sh@red_s3cr3t' \
  -Duser-blacklist.4=hacker4 \
  -jar myapp.jar

...bit of a mouthful, and a fair bit trickier to read than the HOCON we're overriding. We can't use any of the power of HOCON to help us here. This is especially noticeable with the user-blacklist property, where altering lists ranges from clunky to impossible.

Another concern is that you're writing your config directly into the command which will be run, which means any user on the running host can use tools like ps to see the full text of the command, including a database password and a secret API key in this case.

You could similarly use environment variables to configure every line of your config, but again you have the same problems of reducing your nice structured config to flat key-value pairs, and the additional problem that you have to add a lot of base config to record the environment variables you want to use.

Leaking app config into infrastructure code

Continuing the theme of not using HOCON and losing its benefits, the second mistake I've seen frequently is ignoring HOCON and configuring the application in your infrastructure code. I will demonstrate what I mean using ansible here, but I've seen this same mistake made with puppet, kubernetes and terraform too.

Consider that you're trying to configure the same application as above, and now you have an ansible variables file:

# group_vars/production.yml
http_bind: 0.0.0.0
http_port: 80
postgres_host: postgres.example.com
# Values that start with {{ must be quoted, or YAML reads them as a mapping
postgres_username: "{{ secrets.postgres_user }}"
postgres_password: "{{ secrets.postgres_password }}"
auth_host: auth.example.com
auth_port: 80
auth_secret: "{{ secrets.auth_secret }}"
user_blacklist: [user1, user2, user3]

When writing your config, you can then use any of the above methods to configure your application. Let's say you decide to use a template to write a HOCON config file.

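http {
  bind = {{ http_bind }}
  port = {{ http_port }}
}

# Quote templated strings: characters such as @ ! : are not allowed in unquoted HOCON values
db.postgres {
  host = "{{ postgres_host }}"
  username = "{{ postgres_username }}"
  password = "{{ postgres_password }}"
}

auth-service {
  host = "{{ auth_host }}"
  port = {{ auth_port }}
  secret = "{{ auth_secret }}"
}

# to_json renders the list with double quotes, which HOCON accepts
user-blacklist = {{ user_blacklist | to_json }}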

We're writing HOCON this time, but we still have key-value parameters because our actual configuration is a series of ansible variables, and we have to additionally make sure they all get templated into the config file.

This is certainly better than the version with -D params, as we're using YAML and that gives us some of our features back; we could structure our YAML rather than writing it flat, and we have a concept of lists again, so our blacklist isn't so awkward.

The downside is twofold: we have to update two places with new variables, and we're telling ansible about properties it doesn't need to know about.

The solution

The solution, then, is to write HOCON as much as possible, so that you get its full power and, as ever, Don't Repeat Yourself. In my example above, this isn't possible for everything. It's very likely that hostnames will be provided by your infrastructure management tooling, and your secrets will be read from your secret management tooling, meaning you'll still need to template your config file.

Likely, though, several pieces of config simply belong in your application config file. The tidiest approach is to keep your infrastructure config simple:

postgres_user: "{{ secrets.postgres_user }}"
postgres_password: "{{ secrets.postgres_password }}"
auth_host: "{{ get_auth_host() }}"  # from service discovery, for example
auth_secret: "{{ secrets.auth_secret }}"

and your application config structured with HOCON:

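# Fall back on the defaults packaged in the jar
include "application"

http.bind = 0.0.0.0

db.postgres {
  host = postgres.example.com
  username = "{{ postgres_user }}"
  password = "{{ postgres_password }}"
}

auth-service {
  host = "{{ auth_host }}"
  secret = "{{ auth_secret }}"
}

user-blacklist = [user1, user2, user3, user4, user5]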

Tell ansible to write this template to /etc/myapp/app.conf, make sure -Dconfig.file=/etc/myapp/app.conf is included in your run command, and you're good to go.
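
Writing the file in HOCON also removes the list awkwardness from the -D example. As a minimal sketch, assuming the external file includes the packaged application.conf so the default user-blacklist is already defined (hacker4 is a made-up entry), you can append to the list rather than restate it:

```hocon
include "application"

# Append to the list inherited from application.conf instead of restating it.
# `a += b` is shorthand for `a = ${?a} [b]`.
user-blacklist += hacker4
```

With the default list of hacker1, hacker2 and hacker3 shown earlier, this resolves to a four-element list ending in hacker4.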

tl;dr

To summarise, then:

- take the time to grasp some of the features of HOCON and read the docs, and you'll make configuring your application much easier
- remember to keep your config where it belongs, and write it in HOCON