Gatling: A surprisingly impressive performance testing tool

In this blog post, I share my experience with Gatling on a proof-of-concept project. We’ll explore how to deploy and use Gatling in a cloud environment and look into some of its many features.

Context and project objective

As part of the development team for a proof-of-concept project for the Dutch Transport Inspectorate, I am tasked with creating a backend solution to receive and process real-time taxi data. One of the primary objectives of the project is to verify that the new solution can handle all requests sent by taxis in the Netherlands at any time of the day, which makes performance testing an essential component of the project.

For further information on this project, visit the Dutch Transport Inspectorate website.

Why Gatling?

Gatling has been a popular choice for performance testing for over a decade, but our team is relatively new to it. Previously, we used other tools such as JMeter and LoadNinja. I was reintroduced to Gatling at the Devoxx 2022 conference in Belgium, where I attended a session hosted by Gatling founder Stephane Landelle.

His session highlighted Gatling’s capabilities in various load testing scenarios and impressed me with its well-documented API. The fact that we could write all our tests in Java made it the perfect fit for our Java development team.

Test scenarios

The proof-of-concept is designed to capture all relevant activities of a taxi driver throughout their work day. The backend receives information such as the driver’s location, type of operation, and additional details. The plan is to conduct two tests:

A stress test that simulates a busy time of the day, with a message rate of approximately 600 messages per second.
A soak test that represents an average day, with approximately 3 million messages received in a 24-hour period.

To mimic a typical workday, we’ll inject 10,000 virtual users into the scenarios. How many users a single Gatling instance can handle depends on the capacity of your cloud environment; our team determined that each instance should manage 1,000 users.

This means we will execute our load tests with 10 Gatling instances. During the soak test, we’ll use Gatling’s throttle() method to cap the request rate, so that we don’t send more data than the test requires.
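The sizing above comes down to simple arithmetic. As a rough sketch (the class and method names are ours for illustration, and we assume the soak-test messages are spread evenly over the 24 hours), the numbers can be worked out like this:

```java
import java.time.Duration;

/** Back-of-the-envelope load planning for the stress and soak tests. */
final class LoadPlan {

    /** Number of Gatling instances needed, rounded up so no virtual user is left out. */
    static int instancesNeeded(int totalUsers, int usersPerInstance) {
        return (totalUsers + usersPerInstance - 1) / usersPerInstance;
    }

    /** Average rate in messages per second, assuming an even spread over the period. */
    static double averageRate(long totalMessages, Duration period) {
        return (double) totalMessages / period.getSeconds();
    }

    public static void main(String[] args) {
        int instances = instancesNeeded(10_000, 1_000);
        double soakRate = averageRate(3_000_000L, Duration.ofHours(24));

        System.out.printf("instances: %d%n", instances);
        System.out.printf("soak test: %.1f msg/s overall, %.2f msg/s per instance%n",
            soakRate, soakRate / instances);
        System.out.printf("stress test: %.0f msg/s per instance%n", 600.0 / instances);
    }
}
```

The per-instance soak rate is the kind of figure you could use as a starting point for a throttle() limit, since every instance runs the same simulation independently.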

A closer look at a Gatling scenario

Here is a simplified and redacted version of the code for one of our scenarios:

private ScenarioBuilder buildWorkdayScenario() {
    // 1. Create data feeders
    Iterator<Map<String, Object>> startWorkFeeder = createStartWorkFeeder();
    Iterator<Map<String, Object>> operationFeeder = createOperationFeeder();
    Iterator<Map<String, Object>> stopWorkFeeder = createStopWorkFeeder();
    // 2. Duration from environment variable
    int loadTestDuration =
        Integer.parseInt(System.getenv().getOrDefault("LOAD_TEST_DURATION_IN_SECONDS", "1"));

    // 3. Create scenario
    return scenario("Taxi Driver Work Report Scenario")
        // 4. Add feeder to scenario
        .feed(startWorkFeeder)
        // 5. Create API request (start workday)
        .exec(createWorkRequest("Post Start Work", "start", "#{work.startTime}"))
        // 6. Create a loop for the specified duration
        .during(Duration.ofSeconds(loadTestDuration))
        // 7. Add feeder to the loop. A virtual user pulls a new record each time it reaches this step.
        .on(feed(operationFeeder)
            // 8. Create API requests in loop (start and stop trip)
            .exec(createOperationRequest(
                "Post Start Trip", "start", "#{operation.startTime}", "#{operation.startMessageId}"))
            .exec(createOperationRequest(
                "Post Stop Trip", "stop", "#{operation.stopTime}", "#{operation.endMessageId}")))
        // 9. Add feeder to scenario
        .feed(stopWorkFeeder)
        // 10. Create API request (stop workday)
        .exec(createWorkRequest("Post Stop Work", "stop", "#{work.stopTime}"));
}

This scenario models a typical workday for a taxi driver: the driver starts their shift, sends trip information for the duration of the load test, and finally ends their workday. The two parts worth a closer look are the data feeders and the loop built with the during() method, which we discuss next.

Data Feeders

Virtual users need data to start a scenario. Gatling provides the ability to inject data into virtual users through the use of feeders. Each time a virtual user reaches a feed step, a record from the feeder is injected into the user’s session, resulting in a new instance of the session.

There are several different ways to implement a feeder. In this case, we use a simple in-memory feeder. However, you can also use feeders that serve data from files or databases.

The following code is an example of a feeder. For more information, please check Gatling’s documentation.

private Iterator<Map<String, Object>> createStartWorkFeeder() {
    // The explicit type argument makes the stream yield Map<String, Object> records;
    // otherwise the compiler infers Map<String, String> from the all-String values.
    return Stream.<Map<String, Object>>generate(() -> {
        String workId = UUID.randomUUID().toString();
        String authorizationToken = "90000000-1111-2222-3333-00000" +
            getAuthorizationTokenBaseAndIncrement();
        String startTime = OffsetDateTime.now(UTC).toString();
        return Map.of("authorizationToken", authorizationToken, "work.id", workId,
            "work.startTime", startTime);
    }).iterator();
}
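The feeder relies on a getAuthorizationTokenBaseAndIncrement() helper that is not shown in this post. As a rough sketch of how such a helper could work, and of how records are drawn lazily from an in-memory feeder, here is a stand-alone example without any Gatling dependency. The nextTokenSuffix() method, the class name and the starting value of 1,000,000 are assumptions for illustration, not the project’s actual code.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

final class FeederSketch {

    // Hypothetical stand-in for getAuthorizationTokenBaseAndIncrement().
    // An AtomicInteger ensures concurrent callers never receive the same number.
    private static final AtomicInteger TOKEN_COUNTER = new AtomicInteger(1_000_000);

    static String nextTokenSuffix() {
        return String.valueOf(TOKEN_COUNTER.getAndIncrement());
    }

    /** Infinite in-memory feeder: every next() call builds one fresh record. */
    static Iterator<Map<String, Object>> createFeeder() {
        return Stream.generate(() -> Map.<String, Object>of(
                "authorizationToken", "90000000-1111-2222-3333-00000" + nextTokenSuffix(),
                "work.id", UUID.randomUUID().toString()))
            .iterator();
    }

    public static void main(String[] args) {
        Iterator<Map<String, Object>> feeder = createFeeder();
        // Each virtual user reaching a feed step would receive the next record.
        System.out.println(feeder.next());
        System.out.println(feeder.next());
    }
}
```

Because the stream is generated on demand, nothing is built up front: a record only exists once a virtual user asks for it, which keeps memory use flat no matter how long the test runs.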

During Loop Statement

We needed a loop in this scenario, and Gatling provides several ways to build one; the during() method is one of them. Conveniently, you can also apply a feeder to just this loop, as shown in the example.

These are just a few examples, but they should give you a sense of how well-designed and extensive Gatling’s API is. Once you get familiar with it, you rarely have to write your own logic to create the desired scenario.

Running the tests in a cloud environment

With our scenario in place, let’s explore how to deploy it. We took great inspiration from this excellent article by Richard Hendricksen on deploying concurrent load tests and containerizing Gatling. Basically, we make use of Gatling’s ability to run in both test mode and report-only mode.

Our Dockerfile’s entrypoint refers to a bash file with the following content:

#!/bin/bash

if [ "$REPORT_ONLY" == "false" ]
then
    # Clean reports
    rm -rf target/gatling/*
    rm -rf /mnt/gatling/logs/*

    # Run the performance test
    mvn -Dmaven.repo.local=/.m2/repository -e gatling:test -Dgatling.simulationClass=api.loadtest.WorkdaySimulation

    # Copy the simulation log to the PVC, named after the host so instances don't overwrite each other
    for _dir in target/gatling/*/
    do
        cp "${_dir}simulation.log" "/mnt/gatling/logs/${HOSTNAME}-simulation.log"
    done
else
    # Generate the combined report from the collected logs
    mvn -Dmaven.repo.local=/.m2/repository -e gatling:test -Dgatling.reportsOnly=/mnt/gatling/logs
fi

Initially, Gatling operates in its normal mode, generating a log file for each individual instance. Once the load test is completed, Gatling runs in report-only mode, combining all log files into one comprehensive report.

Example of a Gatling report

All Gatling instances execute as Kubernetes Jobs. The structure of our Job template is organized as follows:

apiVersion: template.openshift.io/v1
kind: Template
metadata:
  annotations:
  name: deployment-template
  namespace: "${PRJNAME}"
objects:
  - apiVersion: batch/v1
    kind: Job
    metadata:
      name: "${APPNAME}"
      namespace: "${PRJNAME}"
      labels:
        app: "${APPNAME}"
    spec:
      backoffLimit: 1
      parallelism: "${{JOBINSTANCES}}"
      completions: "${{JOBINSTANCES}}"
      completionMode: Indexed
      template:
        metadata:
          labels:
            name: "${APPNAME}"
        spec:
          containers:
            - name: "${APPNAME}"
              image: "${IMGURI}:${APPTAG}"
              ports:
                - containerPort: 8080
                  protocol: TCP
              resources:
                limits:
                  cpu: 250m
                  memory: 500Mi
              volumeMounts:
                - mountPath: /mnt/gatling/logs
                  name: "${APPNAME}-gatling-logs-volume"
              imagePullPolicy: Always
          volumes:
            - name: "${APPNAME}-gatling-logs-volume"
              persistentVolumeClaim:
                claimName: "${APPNAME}-gatling-logs-pvc"
          restartPolicy: Never
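Because the Job uses completionMode: Indexed, Kubernetes gives every pod a JOB_COMPLETION_INDEX environment variable (0, 1, 2, and so on). This post does not cover how the instances avoid generating overlapping test data, but one possible approach is to derive a separate number range per instance from that index. The sketch below illustrates the idea; the class name, the base value of 1,000,000 and the block size of 1,000 are assumptions for illustration only.

```java
final class InstanceTokenRange {

    /** Parses JOB_COMPLETION_INDEX, falling back to 0 for local runs outside Kubernetes. */
    static int completionIndex(String rawValue) {
        if (rawValue == null || rawValue.trim().isEmpty()) {
            return 0;
        }
        return Integer.parseInt(rawValue.trim());
    }

    /** First token number for an instance, given a fixed block of users per instance. */
    static int firstTokenNumber(int base, int completionIndex, int usersPerInstance) {
        return base + completionIndex * usersPerInstance;
    }

    public static void main(String[] args) {
        int index = completionIndex(System.getenv("JOB_COMPLETION_INDEX"));
        int usersPerInstance = 1_000;
        int start = firstTokenNumber(1_000_000, index, usersPerInstance);
        System.out.printf("instance %d uses token numbers %d..%d%n",
            index, start, start + usersPerInstance - 1);
    }
}
```

A feeder could then start its counter at the computed value, so that ten instances with 1,000 users each never hand out the same number twice.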

Results

The stress test revealed a database bottleneck in our application, telling us that we needed to optimize the indexes and increase the capacity of the database cluster. With those changes in place, throughput improved substantially and we reached our target. The soak test proceeded without any issues.

Conclusion

This first encounter with Gatling exceeded our expectations. Although there is a learning curve in understanding its terminology, once you become familiar with it, creating scenarios is very straightforward.

The fact that you can write Gatling tests in Java makes it much more accessible for programmers, which is a nice bonus. The reporting functionality in Gatling is top-notch and it is a pleasant surprise to discover the report-only mode, which allows for combining log files from multiple instances into a single comprehensive report.

I definitely look forward to utilizing Gatling again in upcoming projects and exploring its capabilities even further.
