Don't run your acceptance tests' server in-process

At Black Pepper, acceptance testing is part of our DNA. Having a passing suite of automated acceptance tests would typically be part of our definition-of-done for a user story. At times, we might even practise Acceptance Test-Driven Development, where developers sit down with a QA engineer and write the acceptance tests together, in close collaboration with the customer, before a line of code is written. We believe this is one of the best ways to demonstrate that requirements have been met, while helping to protect against unwanted regressions long into the future.

Typically such tests drive a robot web browser to interact with a running instance of the web application we're looking to test. But how do our tests ensure there is an appropriate instance of the application within reach? How should we configure this app's start-up, for maximum convenience and test performance, both on developers' machines and in our Continuous Integration environment?

The possibilities are infinite, but I'm going to group them here into two categories: stand-alone, and in-process; and then I'm going to describe why you should almost always choose the former. This article is about running JUnit tests against a Spring Boot web application, but many of the points made here have broad applicability.

Stand-alone deployment

In a stand-alone deployment, we build the application artifact exactly as if it would be deployed to production before running the tests.

We might deploy it onto a production-like machine first, and run the tests against it there. This might be the best approach for tests running on a Continuous Integration server, but it would likely be too time-consuming for developers who just want to run a test quickly against the latest code on their local machine, and annoying to debug. So we also want to enable developers to run the application quickly via their IDE.

For a simple CI, or for full, local builds, we just want the app to start and stop around our integration-testing build phase. We're probably using Maven for this, because its plugin ecosystem for these sorts of tasks is currently much more mature than Gradle's. Some top contenders to help us here are process-exec-maven-plugin or, if we Docker-ise our build artifact first, docker-maven-plugin. Our Maven configuration could require as little as this:
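As a sketch of what that configuration might look like with process-exec-maven-plugin (the plugin coordinates are real, but the execution names, health-check URL and arguments here are illustrative assumptions for a typical Spring Boot jar):

```xml
<plugin>
  <groupId>com.bazaarvoice.maven.plugins</groupId>
  <artifactId>process-exec-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>start-app</id>
      <phase>pre-integration-test</phase>
      <goals><goal>start</goal></goals>
      <configuration>
        <name>my-app</name>
        <waitForInterrupt>false</waitForInterrupt>
        <!-- block until the app reports itself healthy -->
        <healthcheckUrl>http://localhost:8080/actuator/health</healthcheckUrl>
        <arguments>
          <argument>java</argument>
          <argument>-jar</argument>
          <argument>${project.build.directory}/${project.build.finalName}.jar</argument>
        </arguments>
      </configuration>
    </execution>
    <execution>
      <id>stop-app</id>
      <phase>post-integration-test</phase>
      <goals><goal>stop-all</goal></goals>
    </execution>
  </executions>
</plugin>
```

The app starts before the integration-test phase, Failsafe runs the tests against it, and the process is torn down afterwards.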



In-process deployment

In an in-process deployment, we just let our test start up its application server as part of its own code. At its simplest, this could just look like this:

private static ConfigurableApplicationContext context;

@BeforeClass
public static void setUp() {
    context = new SpringApplicationBuilder(MyApplication.class).run();
}

@Test
public void aTest() {
    // exercise our server...
}

@AfterClass
public static void tearDown() {
    if (context != null) {
        context.close();
    }
}
What is attractive about this approach is its simplicity. It takes only a few lines of code and no Maven plugins. Start the test from your IDE, and it will spin up a server to run the app, run the test, and shut down the server.

For what is unattractive about this approach, please read on...


Design your tests for parallelism

Browser-based tests are slow. It is inherently much slower to fire up a browser, send real HTTP requests, render the responses in a real browser DOM and simulate interaction with the rendered elements than to run your average unit test, which might use a comparatively tiny amount of system resources.

But it's important to developers that their tests run fast, and to achieve this, the tests must be designed to run in parallel. The time it takes to run an acceptance test suite on a single thread might be mere seconds just after a project's inception, but can grow to hours once the project has been running for several years.

Not only that, but single-threaded test runs are not a real test of an application's robustness. Web applications are intended to be used by many people concurrently. That mysterious session synchronisation bug, or that database deadlock, which will never occur due to tests on a single thread, might occur almost immediately once multiple concurrent requests are exercising an app!

The problem is: it's hard to design tests for parallelism. You will have to write your tests so they don't insert data that conflicts with data another thread has just inserted under the same unique key. Each test may have to tear down its data – but only its data. And you may have to write tests so that they tolerate other tests' data creeping into their view of the world. All of this is fiddly at the start of a project, but pure nightmare fuel if you leave it until you have 1,000 tests! So design for parallelism from the start.
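One simple way to sidestep unique-key collisions between parallel tests is to suffix every fixture identifier with a random value. This is a minimal sketch; the class and method names are illustrative assumptions, not from any particular project:

```java
import java.util.UUID;

// Every test that needs a user asks this factory, so two tests running
// concurrently with the same base name still insert distinct rows.
public class TestDataFactory {

    public static String uniqueUsername(String base) {
        // UUID suffix makes a collision with another thread vanishingly unlikely
        return base + "-" + UUID.randomUUID();
    }
}
```

Tear-down then only needs to delete rows matching the identifiers a test itself generated, leaving other threads' data untouched.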

If you design for parallelism, you will need a stand-alone server to run your tests against. Using the Maven Failsafe plugin, it's pretty easy to parallelise your tests:
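A sketch of the kind of Failsafe configuration that achieves this (plugin version omitted; `forkCount` and `reuseForks` are real Failsafe parameters):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <configuration>
    <!-- 1C = one forked JVM per available CPU core -->
    <forkCount>1C</forkCount>
    <reuseForks>true</reuseForks>
  </configuration>
</plugin>
```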


This will spin up as many JVM processes as you have cores, and allocate the tests between them. But you don't want to spin up as many web application servers as you have cores! Just the one would be fine. So launch this separately, stand-alone.

Remove the browser dependency from your tests

A lingering annoyance with browser-based acceptance tests is the need to keep various bits of software up to date on the container or VM running your tests. For instance, in the case of Selenide driving a Chrome browser via ChromeDriver, you will need to make sure all the versions play nicely together – the automatic update of Chrome on a developer's machine may mean she's out of action for the morning while she tries to find the corresponding versions of everything else. It's not easy to roll back automatic Chrome updates!

To free developers from this stress, it's a good idea for them not to be reliant on synchronising versions of software that can update irrevocably without warning. For full builds, using something like the selenium/standalone-chrome Docker image can help with this, but, in order for the browser in this container to be able to see your in-process app, you'd need to run Docker with --network=host, which then means your tests will only be able to run on Linux. Docker on Windows and Mac runs in a VM which makes host networking impossible. If your app were stand-alone, however, you could just run it in a container in the same network, which is supported on all platforms.
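A sketch of the stand-alone, user-defined-network arrangement (image name, tag and ports here are assumptions):

```shell
# Put the app and the browser on the same user-defined Docker network;
# unlike --network=host, this works on Linux, Mac and Windows alike.
docker network create acceptance
docker run -d --network acceptance --name app my-app:latest
docker run -d --network acceptance -p 4444:4444 selenium/standalone-chrome

# Tests on the host talk to WebDriver at http://localhost:4444/wd/hub,
# and the browser inside the container reaches the app at http://app:8080
```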

Anyway, on CI, wouldn't it be nicer to bring up a full Selenium Grid, potentially testing your app with all the browsers you support? We've had success using Zalenium to manage an auto-scaling Selenium Grid, but perhaps you might prefer to run your tests on BrowserStack? This is a really appealing option if you want to run your tests against multiple devices, but it's not at all practical if your tests are running your app in-process.

Improve your developer experience

If browser-based tests are slow to start, then web applications are worse. You may be lucky enough to be testing, in isolation, an extremely slim microservice that is fully operational in milliseconds, but most end-to-end testing doesn't look like this. Once your JPA entity manager factory or your Elasticsearch index has been initialised, and your Spring bean soup has been lazily initialised on that first request, you're seriously slowing down every test invocation you make while developing, if you're running your app in-process.

You might argue that if you've changed the application code, you'll have to restart the app anyway, so this really makes no difference. But for me the benefits of iteratively altering a test and executing it against a running server are obvious, and that line of reasoning is best left to those developers who always get their test code changes right first time. :)

Besides, pointing a test at a running server can open all kinds of avenues in improving an application's debuggability. Want to manually drive the app to get it into a specific state, before executing a test to verify it behaves correctly? Maybe it would take a long time to write the necessary test set-up steps. Easy to do with your app running stand-alone; difficult if your app is being brought up whenever you run a test in-process.
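In practice, this just means making the test's target configurable, so the same test can hit a server launched from the IDE, a local stand-alone jar, or a CI deployment. A minimal sketch; the property name `acceptance.baseUrl` is an assumption for illustration:

```java
// Resolves which running server the tests should drive. Defaults to a local
// instance; override with -Dacceptance.baseUrl=... on the command line.
public class TargetServer {

    public static String baseUrl() {
        return System.getProperty("acceptance.baseUrl", "http://localhost:8080");
    }
}
```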

And good luck remembering to check that the server isn't already running every time you're about to start a test. The in-process workflow: ugh, error, port in use; stop the test, maybe even stop the browser, stop the server, lose the state you were enjoying on the server; rerun the test. The stand-alone workflow: no issue.

It's not actually your app!

Last but not least, another reason not to run your tests' server in-process is that, when you do, you're no longer really testing your app. Your application is no longer running with the same set of dependency classes as it will in production; it's now working with some unknown combination of its own and its tests' dependencies. I find this enormously worrying.

Acceptance test projects typically become pretty complicated. They'll probably use a DI container, with its bytecode manipulation dependencies, and all sorts of parsers to read test DSLs and their utilities; REST clients, JSON serialisers... if you use something like Wiremock, your tests will even have a dependency on an embedded web server!

Imagine all your tests being green, but you get a report that your app is failing in one specific feature on production because of a missing or mismatched dependency. I've seen it happen: it's not pretty, it destroys your confidence in your test suite and, if your app is instead deployed stand-alone before your test suite runs, it's completely avoidable.

I've heard it argued that, unless you're testing against production, all testing is only a simulation, and so you may as well take this kind of shortcut. First, this feels like arguing you may as well build crash test dummies out of chicken wire and papier mâché: admittedly it's not life and death, but testing that your app actually works is the whole reason you're doing this.

And second: well, maybe you should be testing on production then.

I hope you enjoyed this article; I'd be interested to hear your experiences and thoughts!