Unlocking observability: Structured logging in Spring Boot

In this section we describe how to add structured logs using SLF4J (or Simple Logging Facade for Java). There are three main components that we need to configure:

Log format: By default an application will produce logs in a text format — we have to add structure to these logs by defining a format. In this blog, we will use a structured JSON format.
Log generation: We need to configure our application to include contextual information in addition to the log message, like a user-id who made a request.
Log ingestion: Ultimately we have to ingest these logs into a data store that allows us to interact with the logs. We will use logstash to push application logs into our storage engine Elasticsearch and we will use Kibana as the logging UI.

Step 1: Generating logs

In our demo application, we choose to use SLF4J with Logback, which is a commonly used logging framework. This is defined in the project dependencies.

SLF4J is a logging facade that works with many logging backends, in this article, we demonstrate structured logging with SLF4J and Logback as a backend

Adding logs is a two-step process:

public class PartyController {
private final Logger logger = LoggerFactory.getLogger(getClass().getName());

public Map<String, String> index() {

logger.info("Reading index path");
return Map.of("message", "Welcome to Jamboree!"); }

}

This will generate logs in a text-only format. To add a structure to the logs, we have to use MDC or Mapped Diagnostic Context. MDC will add contextual information that will be present along with the log message. MDC is a map maintained by the logging framework (Logback in the demo project) where the application code provides key-value pairs which can then be inserted by the logging framework in log messages. This key-value pair way of adding logs is what adds structure to the log messages.

The MDC class contains static methods which can be used to add keys to any log message that is subsequently generated by the application:

MDC.put("REQUEST_URI", request.getRequestURI());

if (params.location() == null) {

logger.error("Missing parameter: location");
return Response.error(HttpStatus.BAD_REQUEST, "Location is mandatory.");}

MDC.remove("REQUEST_URI");

With the contextual information being present at the time of log generation, the format of the log can use this to generate rich information that is useful when reading the logs.

Step 2: Defining the log format

In the previous snippet, we added REQUEST_ID with MDC. Let us now configure the log format to introduce a more structured format that will generate log messages as a JSON payload containing keys and values. JSON formatting is required because many tools, including our storage engine, supports it out of the box. To do this, we have to configure logback settings.

Logback configuration

Largely, logback defines three settings for defining the log format:

Logger: This is the application class that will produce the log. This is defined in the source code like this.
Appender: The appender receives the logs produced by all the loggers in an application and stores them in the defined format — this could be a file, for instance. In our example project, we have two appenders — one to print the logs to the console and another “rolling appender” to store logs in the JSON format to a directory.
Layout: The layout defines how a single log should be formatted. For our console appender, the output format contains the log message and the log level, while the rolling appender will save it a JSON payload.

Following is the logback configuration taken from the demo repository and defines an appender called ROLLING, creates one file for all logs per minute using the JSON format. We also have the original console appender which makes logs more human readable.

<appender name="ROLLING" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>logs/jamboree.log</file>

<fileNamePattern>logs/jamboree-%d{yyyy-MM-dd-HH-mm}.log</fileNamePattern>
<maxHistory>60</maxHistory>
<totalSizeCap>20GB</totalSizeCap>
</rollingPolicy>

<jsonFormatter class="ch.qos.logback.contrib.jackson.JacksonJsonFormatter">
<prettyPrint>false</prettyPrint>
</jsonFormatter>
<timestampFormat>yyyy-MM-dd HH:mm:ss.SSS</timestampFormat>
<appendLineSeparator>true</appendLineSeparator>
</layout>
</appender>

With this configuration, the logs are going to be available in the logs directory as indicated by the file pattern:

$ head logs/jamboree-2023-12-04-08-48.log
{"timestamp":"2023-12-04 08:48:00.106","level":"INFO","thread":"http-nio-7123-exec-2","mdc":{"REQUEST_URI":"/party/","TRACE_ID":"e10d91d9-b822-40cb-8794-45a103b6c248","REQUEST_ID":"66481f10-53ee-4c88-b9b2-8547b1416b85","PARTY_ID":"12442","REQUEST_METHOD":"POST"},"logger":"me.mourjo.jamboree.rest.PartyController","message":"Creating a party with PartyRequest[name=Adi Dhakeswari, location=Kolkata, time=null]","context":"default"}
{"timestamp":"2023-12-04 08:48:00.113","level":"ERROR","thread":"http-nio-7123-exec-2","mdc":{"REQUEST_URI":"/party/","TRACE_ID":"e10d91d9-b822-40cb-8794-45a103b6c248","REQUEST_ID":"66481f10-53ee-4c88-b9b2-8547b1416b85","PARTY_ID":"12442","REQUEST_METHOD":"POST"},"logger":"me.mourjo.jamboree.rest.PartyController","message":"Missing parameter: time","context":"default"}

Step 3: Ingestion into logging infrastructure

To make the most out of structured logs, we need to index the JSON logs into a search engine that facilitates the complex search queries and analysis tools we showed above.

To do this, our logging infrastructure will contain the following components as shown below — in Jamboree, all of this is configured to run through [docker compose](https://github.com/mourjo/jamboree/blob/main/docker-compose.yml).

The logging infrastructure for our demo project uses the ELK stack which runs inside docker containers

In the above image, there are three main components involved in the logging process:

Log storage: This is the search index and storage location for our structured logs. For this demo, we are using Elasticsearch
Log ingestor: Application logs are locally stored on files (or streamed to STDERR), these logs need to be aggregated and collected for storage in Elasticsearch. In a production setup, there will be multiple nodes from where logs need to be aggregated. In this demo, we only have one application instance producing these nodes, which are periodically fetched by Logstash.
Logging UI: As logs get aggregated and stored, we need an user-interface to view and analyze the logs. We will use Kibana for this.

This is a very common setup (commonly called ELK or Elastic-Logstash-Kibana).