100000 Tests under 20 Minutes with Meissa Distributed Runner
January 25, 2019
It is relatively easy nowadays to write automated tests. However, the more you have, the more difficult it gets to figure out how to execute them. Test automation is usually used to find regression bugs, not to test your new features, which means that you want all of your tests to be executed as often as possible. But when you have over 1000 heavy tests, it gets quite tricky. By heavy tests, I mean API, UI, DB, and integration tests or, in simpler words, every test that needs more than 1 second to execute. Here I will present an open-source tool that solves the problem. I will not go into much detail about how to use it or how it works internally, since you can find this information on the official website. Instead, I will tell you about a “little” experiment we did to evaluate the capabilities of the app.
What Is Parallel Test Execution?
1. Parallel: Single Machine – Multiple Processes
Some unit test frameworks, such as NUnit, have native support for parallel execution. Say you have 100 tests. If you run them on a single machine with 5 CPU cores, each core gets 20 tests, and the 5 groups run simultaneously. However, not all test frameworks support this option, and there are some major problems related to it, depending on the type of tests you want to run.
2. Distributed Testing: Multiple Machines – Single Process
Your second option is to run your tests at the same time on multiple machines and merge the results at the end. Usually, you need additional complex tooling, for example a Microsoft Test Controller/Agents setup.
3. Parallel Distributed Testing: Multiple Machines – Multiple Processes
You can mix both approaches. In this case, you use the complex tooling and, at the same time, run the tests in parallel on each machine.
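The three modes above differ only in how the tests are spread over machines and processes. Here is a minimal Python sketch of that idea. The round-robin split and the name partition are assumptions made for illustration. They do not show how NUnit, the Microsoft Test Controller, or Meissa actually assign tests.

```python
def partition(tests, machines, processes_per_machine):
    """Split tests round-robin into machines x processes buckets.

    Illustrative only: real runners may balance by duration, class, etc.
    """
    buckets = [[[] for _ in range(processes_per_machine)]
               for _ in range(machines)]
    slots = machines * processes_per_machine
    for index, test in enumerate(tests):
        slot = index % slots
        machine = slot // processes_per_machine
        process = slot % processes_per_machine
        buckets[machine][process].append(test)
    return buckets


tests = [f"test_{i}" for i in range(100)]

# 1. Parallel: one machine, 5 processes (one per core in the example above)
single_machine = partition(tests, machines=1, processes_per_machine=5)
# 2. Distributed: 5 machines, one process each
distributed = partition(tests, machines=5, processes_per_machine=1)
# 3. Parallel distributed: 2 machines, 5 processes each
mixed = partition(tests, machines=2, processes_per_machine=5)

for name, plan in [("parallel", single_machine),
                   ("distributed", distributed),
                   ("mixed", mixed)]:
    sizes = [[len(process) for process in machine] for machine in plan]
    print(name, sizes)
```

In the first two layouts every bucket holds 20 tests. In the mixed layout each bucket holds 10, which is where the extra speed comes from.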
Advantages of Distributed Parallel Test Execution
The most obvious reason is speed. For example, instead of taking 16 hours, the tests complete in under 4 hours. When all required tests can be run a couple of times in a business day, you will be able to release your application every day (if we talk about web projects). Even if it is not a web project, you can improve the quality of your app and shorten the testing cycle by executing all your tests as part of the CI process. This means you will get higher coverage in a shorter throughput time. As you know, each time your tests execute, their ROI (return on investment) increases. Last, the more often you run all your tests, the higher the probability of locating unstable or poorly written tests.
Why Do You Need to Distribute Your Tests?
In our observation, the optimal number of test run processes is 1.5-2.0 x the total number of cores on your machine, which means you are limited by the hardware you use. In the teams where I worked, most of the VMs had up to 2 CPUs. One “big” machine or many smaller ones? To answer this question, I need to explain the difference between horizontal and vertical scaling. Horizontal scaling means you scale by adding more machines to your pool of resources, whereas vertical scaling means you scale by adding more power (CPU, RAM) to an existing one. Typically, many smaller machines are cheaper than one big one. However, cloud VMs are mainstream nowadays and, because of that, the price differences are insignificant.
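The 1.5-2.0 x rule of thumb is easy to turn into numbers. The helper below is only a sketch of that arithmetic. The name process_range and the rounding (down for the lower bound, up for the upper bound) are assumptions, not part of Meissa.

```python
import math


def process_range(cores, low=1.5, high=2.0):
    """Return the (min, max) process count suggested by the rule of thumb."""
    return math.floor(cores * low), math.ceil(cores * high)


for cores in (2, 8):
    lowest, highest = process_range(cores)
    print(f"{cores} cores: {lowest} to {highest} test processes")
```

By this rule, a 2-CPU VM tops out at around 4 processes, which is why a single small machine quickly becomes the bottleneck.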
Meissa is an open-source distributed test runner. It is built using the latest technologies, such as .NET Core and ASP.NET Core, so it is completely cross-platform. It is designed to be programming-language agnostic, which means that it can run tests written in different languages.
The Biggest Experiment
I wanted to test the tool to the maximum. So, I created a test project with 100000 tests. To generate them, I wrote a simple application.
Each of them executes for one second, which means that, if you run them sequentially on a single machine, they will run for about 1,667 minutes, or roughly 27.8 hours. To use Meissa’s maximal capabilities, I decided to create 10 virtual machines in Azure; each of them had 8 CPU cores and 14 GB of RAM. Then I started Meissa in Agent mode on each of them.
Of course, I created a separate VSTS build for the ultimate test. The plan was to execute the tests on the 10 machines and to start 8 separate test processes on each of them, which means the tests should execute about 80 times faster. Below, you can see the results.
The tests executed in less than 20 minutes. I reconfigured the build to use 14 processes. I am happy with the results, which show that the tool can cope with the most extreme situations.
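For context, the best-case arithmetic behind these numbers can be sketched in a few lines of Python. This is a theoretical lower bound that ignores agent startup, test distribution, and result merging. The function name ideal_minutes is hypothetical, and reading “14 processes” as 14 processes on each of the 10 machines is an assumption, since the text does not say.

```python
def ideal_minutes(test_count, seconds_per_test, machines, processes_per_machine):
    """Best-case wall-clock minutes when tests are split evenly across workers."""
    workers = machines * processes_per_machine
    # Ceiling division: the busiest worker determines the total duration.
    tests_per_worker = -(-test_count // workers)
    return tests_per_worker * seconds_per_test / 60


sequential = ideal_minutes(100_000, 1, machines=1, processes_per_machine=1)
with_8 = ideal_minutes(100_000, 1, machines=10, processes_per_machine=8)
# Assumption: 14 processes per machine, i.e. 140 workers in total.
with_14 = ideal_minutes(100_000, 1, machines=10, processes_per_machine=14)

print(f"sequential:        {sequential:.1f} min")
print(f"10 x 8 processes:  {with_8:.1f} min")
print(f"10 x 14 processes: {with_14:.1f} min")
```

Under these assumptions, 80 workers would need just over 20 minutes even with zero overhead, while 140 workers would need about 12.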