Akka actors case study: a multiplayer games backend

Today I'll be sharing the high-level architecture and design features of a project I wrote back in 2018 using Akka actors. It's a reasonably complex project, and a decent illustration of how to build generic, reusable actors that perform useful functions.

This article is the first of a series where we'll look in more depth at individual components of this design. In this article, we'll look at the high-level project design, how we break down a complex problem into simple, reusable components, and how to structure and build the solution in Akka at a high level.


A multiplayer games backend

Firstly, the project goal. A few years ago, a major passion project involved creating multiplayer games which could be played on my website. I had a few of these of varying complexity, from a slightly wacky "enhanced" version of snakes and ladders, to a Yahtzee-inspired "rainbow dice" game, to a complicated card game, Tichu. The latest iteration of these games, around 2016-2018, involved some JavaScript frontend work hosted on my website, and a backend written in Scala with Akka actors. We'll be looking at the backend.

The games had a fair bit of variety, but they were all online multiplayer, turn-based board or card games, so they all worked with a fairly similar model:

  1. Create a lobby to let players join
  2. Start the game with players currently participating in the lobby
  3. Decide turn order
  4. Let players make moves, in turn order, and update the game state according to the results of each move
  5. Publish player moves to all players (sometimes with private details hidden)
  6. Determine when the game is over and who won
  7. End the game, and reward the winner(s) somehow

The details of the latter steps in particular vary greatly from game to game, but clearly there's a lot of common ground here even between very different games.

The basic design, then, started with actors that generically perform common roles. A lobby actor defines the lobby, the logic around when it's ready for a game to start, who has permission to start the game and/or invite people, and other "lobby" functionality. Similarly, turn order is enforced by the TurnController, which we insert in front of our game logic to "guard" it from out-of-order actions, duplicate actions, and races. We also need a feed of the actions taking place, both to inform users and to allow us to adopt an event-driven approach.

Along with those pieces common to all games, we define specific actor relationships and state models for the specific game, and define the logic of our game: which actions can be taken, when does the turn end, what does our scoring and our game board look like, when is the game over?

Ultimately this needs to reach the user in the frontend, so we interface our internal actors with an HTTP interface, providing some endpoints for checking game state and taking actions, and providing a websocket feed to subscribe to real time events.

High-level structure

One excellent feature of Akka is the ability to easily draw what we're building, since we're creating actors which will talk to each other in much the same way as services might talk to each other. At a high level, we can look at how our top-level actors communicate with each other to handle setting up and tearing down games, and at a lower level we can zoom into child actors and see how they work as well.

Let's first take a look at a simplified structure of a game:

[Figure: game-lobby architecture diagram]

In this diagram, each component is an actor or group of actors, and is fully encapsulated. They perform specific, reusable functions and communicate with other actors. Let's take a look at some of the components we need to make this work.

The game daemon

This is the top-level manager actor for one type of game, and deals with creating individual games and holding their references, and providing an interface to manage them. This is what the user will be talking to when creating an initial game lobby or connecting to a game by ID. It supervises all games as well, attempting to recover or replace any lost actors.

When a game is created, it'll spin up a child GameFramework actor, which then handles everything to do with the game.

Ultimately we serve an interface for multiple games, where each has a slug and a daemon actor, so that we can direct our queries to the snakes-and-ladders daemon or the four-in-a-row daemon as appropriate.
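As a rough illustration of the bookkeeping described above, here is a minimal sketch in plain Scala, with the actor machinery left out. All names (Daemon, Registry, FrameworkRef) are hypothetical, and a String stands in for what would really be a reference to a child GameFramework actor.

```scala
object DaemonSketch {
  // Hypothetical stand-in for a reference to a child GameFramework actor.
  type FrameworkRef = String

  final case class GameId(value: String)

  // State held by one daemon: the games of a single type that it manages.
  final case class Daemon(slug: String,
                          games: Map[GameId, FrameworkRef] = Map.empty,
                          nextId: Long = 1L) {

    // Create a new game and return the ID the client uses from then on.
    def create(): (Daemon, GameId) = {
      val id = GameId(nextId.toString)
      val ref: FrameworkRef = s"$slug-framework-${id.value}"
      (copy(games = games + (id -> ref), nextId = nextId + 1), id)
    }

    // Find the framework for an existing game, if the daemon knows about it.
    def lookup(id: GameId): Option[FrameworkRef] = games.get(id)
  }

  // One daemon per game type, keyed by slug.
  final case class Registry(daemons: Map[String, Daemon]) {
    def daemonFor(slug: String): Option[Daemon] = daemons.get(slug)
  }
}
```

In an actor, `create` and `lookup` would be message handlers, and the map would hold real actor references that the daemon supervises.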

The game framework

This is the parent actor encapsulating everything needed for one instance of a game, throughout the lifecycle. It spins up children for all of the components you see beneath it in the diagram. This is a generic component, as every type of game needs a feed, a lobby, and an actor to deal with the game logic; we'll swap out that last piece per game. This actor will perform all the high-level housekeeping functions for any type of game.

The framework understands the lobby and game lifecycle, stores game participants, controls how the feed is hooked up to other components during the lifecycle, and acts as the parent to every other component. This means it'll supervise everything beneath it and handle tasks like ensuring games are torn down if left idle.

The framework is created when the user requests to start a new game by talking to the game daemon, ultimately by making an HTTP POST request like /game-type/create and receiving back an ID to interact with the instance. It sets up the initial components and hooks up the lobby to the feed. Once the lobby completes and the game start is requested, it ensures the game itself is initialised and allows the game to take over control of user interaction instead.
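The lifecycle the framework walks through can be pictured as a small state machine. The sketch below is an assumption about its shape rather than the original code: the phase and signal names are invented for illustration, and the idle timeout is modelled as just another signal.

```scala
object FrameworkSketch {
  // Hypothetical lifecycle phases of one game instance.
  sealed trait Phase
  case object InLobby extends Phase
  final case class InGame(players: Set[String]) extends Phase
  case object Finished extends Phase

  // Hypothetical signals the framework reacts to.
  sealed trait Signal
  final case class LobbyCompleted(players: Set[String]) extends Signal
  case object GameOver extends Signal
  case object IdleTimeout extends Signal

  // Pure transition function; an actor would switch behaviour instead.
  def step(phase: Phase, signal: Signal): Phase = (phase, signal) match {
    case (InLobby, LobbyCompleted(players)) => InGame(players) // game takes over
    case (InGame(_), GameOver)              => Finished
    case (_, IdleTimeout)                   => Finished        // tear down idle games
    case (other, _)                         => other           // ignore signals that don't apply
  }
}
```

Keeping the transitions in one function makes it easy to see that, for example, a GameOver signal arriving while still in the lobby is simply ignored.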

The lobby

The lobby is the initial "room" before the game starts, where we start collecting participants and get ready to launch the game. It provides a few pieces of functionality:

The lobby lifecycle begins when the parent GameFramework begins, and ends when the game is set to begin; the lobby yields a set of users who will be playing, and the game will be initialised with those users.
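To make the lobby's role concrete, here is a minimal sketch of its message handling in plain Scala. The specific rules are assumptions made for illustration: only the host may start the game, and each lobby has minimum and maximum player counts. The real lobby may well use different rules.

```scala
object LobbySketch {
  sealed trait Command
  final case class Join(user: String) extends Command
  final case class Leave(user: String) extends Command
  final case class Start(requestedBy: String) extends Command

  sealed trait Outcome
  final case class Updated(lobby: Lobby) extends Outcome
  final case class Rejected(reason: String) extends Outcome
  // The lobby is done: these are the users the game is initialised with.
  final case class Ready(players: Set[String]) extends Outcome

  // Assumed rules: host-only start, bounded player count.
  final case class Lobby(host: String, members: Set[String], minPlayers: Int, maxPlayers: Int) {
    def handle(cmd: Command): Outcome = cmd match {
      case Join(user) if members.contains(user)  => Rejected(s"$user has already joined")
      case Join(_) if members.size >= maxPlayers => Rejected("lobby is full")
      case Join(user)                            => Updated(copy(members = members + user))
      case Leave(user)                           => Updated(copy(members = members - user))
      case Start(by) if by != host               => Rejected("only the host can start the game")
      case Start(_) if members.size < minPlayers => Rejected("not enough players")
      case Start(_)                              => Ready(members)
    }
  }
}
```

The `Ready` outcome is the hand-off point: the framework takes the set of players it carries and uses it to initialise the game.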

The feed

The feed primarily exists as a broadcast mechanism, allowing subscribers to register when a user connects via websocket. It also assists in adopting an event-driven approach to our game logic, so that in the event of failure, we can recover the game state by replaying the feed into the game. There are a few considerations here:

This means we have a few pieces of logic to make our feed really useful, as well as needing a solid Event model which games can extend, or draw from a pool of common definitions.
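A minimal sketch of the two jobs described above, broadcasting and replaying, might look like the following. The Event shape is an assumption: here an event is either public or visible to a single player, which is one simple way to hide private details. The real Event model is richer and is not shown in this article.

```scala
object FeedSketch {
  // Hypothetical event model: visibleTo = None means everyone may see the event.
  final case class Event(seq: Long, description: String, visibleTo: Option[String] = None)

  final case class Feed(log: Vector[Event] = Vector.empty) {
    // Append an event, assigning the next sequence number.
    def publish(description: String, visibleTo: Option[String] = None): Feed =
      copy(log = log :+ Event(log.size.toLong + 1, description, visibleTo))

    // Events a given user may see, optionally only those after sequence number `since`.
    def viewFor(user: String, since: Long = 0L): Vector[Event] =
      log.filter(e => e.seq > since && e.visibleTo.forall(_ == user))

    // Rebuild state by folding every event, in order, into an initial state.
    def replay[S](initial: S)(applyEvent: (S, Event) => S): S =
      log.foldLeft(initial)(applyEvent)
  }
}
```

A websocket subscriber that reconnects could call `viewFor` with the last sequence number it saw, while recovery after a failure would use `replay` with the game's own event handler.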

The game logic

The game component will itself be an actor, and encodes all the logic of the game. For a simple game like Snakes and Ladders, this may be a single actor which holds all the game state and responds to a couple of potential actions to advance the game. For a complex card game like Tichu, there may be multiple phases to a round, player hands and scores to track, and therefore it will spin up and shut down various groups of child actors to deal with the internals. In most cases it will place itself behind a TurnController to avoid having to worry about whose turn it is, so this behaviour is provided as standard.
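One way to picture the piece that gets swapped out per game is as a small contract that each game implements. The trait below is a hypothetical sketch, not the project's actual interface, and the toy "race to square 10" game exists only to show the contract in use. The roll value is passed in with the action purely to keep the example deterministic.

```scala
object GameLogicSketch {
  // Hypothetical contract implemented once per game type.
  trait GameRules[State, Action] {
    def initial(players: Vector[String]): State
    def applyAction(state: State, player: String, action: Action): Either[String, State]
    def winner(state: State): Option[String]
  }

  // Toy game: the first player to reach the finish square wins.
  final case class RaceState(positions: Map[String, Int])
  final case class Roll(value: Int)

  object RaceRules extends GameRules[RaceState, Roll] {
    val finish = 10

    def initial(players: Vector[String]): RaceState =
      RaceState(players.map(_ -> 0).toMap)

    def applyAction(state: RaceState, player: String, action: Roll): Either[String, RaceState] =
      if (action.value < 1 || action.value > 6) Left("roll must be between 1 and 6")
      else if (!state.positions.contains(player)) Left(s"$player is not in this game")
      else if (winner(state).isDefined) Left("the game is over")
      else {
        // Move forward, stopping at the finish square.
        val moved = math.min(finish, state.positions(player) + action.value)
        Right(RaceState(state.positions.updated(player, moved)))
      }

    def winner(state: RaceState): Option[String] =
      state.positions.collectFirst { case (player, pos) if pos >= finish => player }
  }
}
```

Note that nothing here checks whose turn it is; in the design described in this article, that concern sits in front of the game rather than inside it.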

The turn controller

This component encapsulates the common logic of guarding the game from out-of-order actions and deciding whose turn it is, and is used by any turn-based game. It has the following behaviours:

This effectively makes the TurnController a proxy which both decides and enforces turn order, and which the game can configure to customise its behaviour when needed. The actor is inserted between the interface and the game, which saves the game from having to worry about turn order or races.
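The proxy idea can be sketched in a few lines of plain Scala. This is a simplified assumption of how such a guard might work, with a fixed round-robin order and hypothetical names. It covers out-of-order actions only: a duplicate that arrives after the turn has advanced is rejected for the same reason, but anything subtler is left out.

```scala
object TurnControllerSketch {
  sealed trait Decision
  // The action may pass through to the game.
  final case class Forward(player: String, action: String) extends Decision
  // The action is stopped before it reaches the game.
  final case class Reject(player: String, reason: String) extends Decision

  // Assumed round-robin order; `current` indexes the player whose turn it is.
  final case class TurnController(order: Vector[String], current: Int = 0) {
    require(order.nonEmpty, "need at least one player")

    def whoseTurn: String = order(current)

    // Decide whether an incoming action may reach the game.
    def guard(player: String, action: String): Decision =
      if (player == whoseTurn) Forward(player, action)
      else Reject(player, s"it is ${whoseTurn}'s turn")

    // Called when the game reports that the current turn is complete.
    def endTurn: TurnController = copy(current = (current + 1) % order.size)
  }
}
```

Because an actor processes one message at a time, a guard like this sees actions in a single, well-defined order, which is what lets it settle races between players.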

Sharding

Since these multiplayer games have finite player limits, the service can be scaled up with only very limited exposure to the drawbacks of distributed computing. We can pick a sharding strategy which distributes our individual GameFramework actor groups across our nodes, meaning that everything internal to the game will always take place on a single node, and we don't risk dropping regular messages during the course of the game.

This means that only the top-level GameDaemon will need to send messages between nodes of the application, so message delivery failures will happen early: only a game creation or an attempt to connect to a feed can fail, and both can be retried safely. Akka Cluster provides the functionality we need to send messages to actors on different nodes, but messages sent between nodes unavoidably have weaker delivery guarantees than messages passed between two actors in the same JVM.
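The core of such a sharding strategy is a stable mapping from game ID to shard, so that every message for a given game lands in the same place. The function below is a generic sketch of that mapping using a hash of the ID. The name `shardFor` and the fixed shard count are assumptions, and the actual strategy used in the project is not shown here.

```scala
object ShardingSketch {
  // Map a game ID onto one of `numberOfShards` shards.
  // floorMod keeps the result non-negative even when hashCode is negative.
  def shardFor(gameId: String, numberOfShards: Int): Int = {
    require(numberOfShards > 0, "need at least one shard")
    Math.floorMod(gameId.hashCode, numberOfShards)
  }
}
```

Since the mapping depends only on the game ID, the whole GameFramework group for one game stays together, and only the initial lookup has to cross node boundaries.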

HTTP interface

The heavily structured nature of the actor hierarchy corresponds nicely to a RESTful API with structured URLs. Without going into too much detail, a progressive structure like game/snakes-and-ladders/12345/action/roll/ descends the logical actor hierarchy to deliver actions to a specific game. The route structure therefore reflects the actor system structure fairly transparently, which aids understanding and debugging. Since players also need realtime updates, users subscribe to a websocket feed like game/snakes-and-ladders/12345/feed to receive a continuous stream of game events.
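To show how a URL descends the hierarchy, here is a small sketch that turns the two example paths above into typed requests. It is plain Scala rather than real routing code, and the request types (GameAction, SubscribeFeed) are hypothetical names: the first segment after "game" picks the daemon, the next picks the game instance, and the rest says what to do with it.

```scala
object RouteSketch {
  sealed trait Request
  final case class GameAction(slug: String, gameId: String, action: String) extends Request
  final case class SubscribeFeed(slug: String, gameId: String) extends Request

  // Split the path into segments and match on their shape.
  def parse(path: String): Option[Request] =
    path.split('/').filter(_.nonEmpty).toList match {
      case "game" :: slug :: id :: "action" :: action :: Nil => Some(GameAction(slug, id, action))
      case "game" :: slug :: id :: "feed" :: Nil             => Some(SubscribeFeed(slug, id))
      case _                                                 => None
    }
}
```

Each successful match carries exactly the information needed to address a message: which daemon, which game, and which action or feed.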

This interface is served via Akka HTTP. After authenticating the request and parsing and validating any JSON actions delivered, it communicates with the underlying actor system to deliver the message.

Takeaways

A lot of what I've discussed here is a highly specific use case for Akka actors, but hopefully some of the designs here can be used as inspiration for other challenging projects. Here are some takeaways for designing your own actor systems:

In future articles, I'll dive into several individual components in more depth to see exactly how we build and test actors with the behaviours described here.