I often work on projects where we run load and performance testing for clients. Load testing produces huge amounts of data that need to be analysed, and the results of that analysis published. Much of this analysis is mechanical: calculating means, 90th percentiles, and variances; producing charts of response time over the course of a run, etc.
We like to publish data in an accessible fashion that allows all the stakeholders to see progress and in particular allows everyone to participate. We've found using a Wiki for this to be excellent. However, getting data out of load testing tools and publishing it on a Wiki usually takes a lot of manual effort, so I decided to investigate how to do this as automatically as possible - so that I can actually spend my time adding value.
The tools I use are:
- The Grinder - for the actual performance and load testing.
- JMX - for collecting application server statistics.
- Apache Commons Math Library - for statistical analysis.
- JFreeChart - for charting results.
- MediaWiki - for publishing results.
The Grinder is a simple-to-use open source load testing tool. The useful thing about The Grinder is that it outputs the raw data for a test run as 'comma separated value' (CSV) files. These files include an entry for every test performed, giving a timestamp, the response time and whether it succeeded or not, together with some other useful statistics.
Having the raw data makes it ideal for performing various statistical analyses after a run has completed. In a typical load testing environment I run multiple load injectors across a number of machines, so I have to collate all the output files from all the injector machines and process them together.
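As a rough sketch of the collation step: the class below parses the lines of each injector's CSV output and merges them into a single list ordered by start time. The column order (thread, run, test, start time in milliseconds, response time, error count) and the names `ResultCollator` and `Sample` are assumptions made for illustration, so check them against the header row of your own data files.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ResultCollator {

    /** One test result. The injector name is kept because thread numbers repeat across machines. */
    record Sample(String injector, int thread, int run, int test,
                  long startMillis, long responseMillis, boolean success) {}

    /** Parses one injector's CSV lines, skipping the header row and malformed lines. */
    static List<Sample> parse(String injector, List<String> lines) {
        List<Sample> samples = new ArrayList<>();
        for (String line : lines) {
            String[] f = line.split(",");
            if (f.length < 6 || !f[0].trim().matches("\\d+")) {
                continue; // header, blank or malformed line
            }
            samples.add(new Sample(injector,
                    Integer.parseInt(f[0].trim()),
                    Integer.parseInt(f[1].trim()),
                    Integer.parseInt(f[2].trim()),
                    Long.parseLong(f[3].trim()),
                    Long.parseLong(f[4].trim()),
                    Integer.parseInt(f[5].trim()) == 0)); // assumed: zero errors means success
        }
        return samples;
    }

    /** Merges the samples from all injectors into one list ordered by start time. */
    static List<Sample> collate(List<List<Sample>> perInjector) {
        List<Sample> all = new ArrayList<>();
        perInjector.forEach(all::addAll);
        all.sort(Comparator.comparingLong(Sample::startMillis));
        return all;
    }
}
```

In practice the lines for each injector would come from something like `Files.readAllLines(path)` on each collected data file.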
Slightly more difficult is capturing the server statistics. It's useful to know how much memory is being used, how many threads there are, how often garbage collection is occurring, etc. The Java Virtual Machine exposes these using JMX, so it's fairly simple to write a client that captures this information periodically throughout a test run. One has to be careful to ensure that querying for these statistics doesn't adversely affect the application under test, so I typically make sure that I don't poll too frequently (once every second appears to be reasonable), though it's useful to run comparison tests with and without this data collection to ensure that the scalability and performance of the system aren't unduly affected.
In addition to JVM statistics, depending on the application server, there may be other statistics exposed such as database connection pool usage or session count, and of course the application itself may expose useful statistics.
A couple of years ago I wrote an application that collected data periodically via JMX and wrote it to a CSV file - originally this was to enable JVM monitoring so that it could be easily published via Orca alongside the operating system statistics that Orca captures out of the box. This application allows me to capture data about the server during performance tests so that I can correlate it with the test results.
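To give a flavour of what such a collector does, here is a minimal sketch (not the application described above) that samples heap usage, thread count and garbage collection count through the standard platform MXBeans and writes them as timestamped CSV rows. To keep it self-contained it reads the local JVM; against a remote server the same MXBean interfaces would be obtained over a JMX connection instead. The class name, column names and choice of statistics are illustrative assumptions.

```java
import java.io.IOException;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

final class JvmStatsSampler {

    static final String HEADER = "timestamp,heapUsedBytes,threadCount,gcCount";

    /** Takes one sample of the local JVM's platform MXBeans as a CSV row. */
    static String sampleRow() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long gcCount = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            gcCount += Math.max(0, gc.getCollectionCount()); // -1 means the count is undefined
        }
        return System.currentTimeMillis() + ","
                + memory.getHeapMemoryUsage().getUsed() + ","
                + threads.getThreadCount() + ","
                + gcCount;
    }

    /** Writes a header, then one row per sample, pausing intervalMillis between samples. */
    static void poll(Appendable out, int samples, long intervalMillis)
            throws IOException, InterruptedException {
        out.append(HEADER).append('\n');
        for (int i = 0; i < samples; i++) {
            out.append(sampleRow()).append('\n');
            if (i < samples - 1) {
                Thread.sleep(intervalMillis);
            }
        }
    }
}
```

Calling `JvmStatsSampler.poll(writer, 600, 1000)` would record ten minutes of data at the once-a-second rate mentioned earlier.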
An important point to note is that timestamp information is captured with the data to allow for time based correlation; and for this to work all machines that are capturing statistics must have their clocks reasonably in line. Hence, I can capture the data for the run itself together with application server and operating system statistics, and with the timestamp information I can correlate peaks and troughs from the load test data with any server issues such as low memory conditions.
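One simple way to do this kind of lookup, shown here only as a sketch, is to hold the server readings in a sorted map keyed by timestamp and, for any test result, fetch the most recent reading at or before that result's timestamp. The class and method names are hypothetical, and heap usage stands in for whichever server statistic is of interest.

```java
import java.util.Map;
import java.util.NavigableMap;

final class TimeCorrelation {

    /** Returns the most recent heap reading taken at or before the given time, or null if there is none. */
    static Long heapUsedAt(NavigableMap<Long, Long> heapByTimestamp, long timestampMillis) {
        Map.Entry<Long, Long> reading = heapByTimestamp.floorEntry(timestampMillis);
        return reading == null ? null : reading.getValue();
    }
}
```

With readings held in a `TreeMap`, a slow response at a given timestamp can then be checked directly against the memory in use at that moment. This only works as well as the clock synchronisation between the machines allows.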
Apache Commons Math Library provides tools for statistical analysis of raw data. For performance and load testing I'm typically interested in the mean response time and its standard deviation, and sometimes the 90th percentile response times.
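To make clear what is being calculated, here is a plain-Java sketch of the three figures. It is a stand-in for illustration rather than the library itself: Commons Math offers these through its `DescriptiveStatistics` class (`getMean()`, `getStandardDeviation()`, `getPercentile(double)`), and its percentile estimate may differ slightly from the simple nearest-rank method assumed below.

```java
import java.util.Arrays;

final class ResponseTimeStats {

    /** Arithmetic mean, or NaN for an empty array. */
    static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    /** Sample standard deviation (n - 1 denominator); 0 for fewer than two values. */
    static double standardDeviation(double[] values) {
        if (values.length < 2) {
            return 0.0;
        }
        double m = mean(values);
        double sumOfSquares = 0.0;
        for (double v : values) {
            sumOfSquares += (v - m) * (v - m);
        }
        return Math.sqrt(sumOfSquares / (values.length - 1));
    }

    /** Nearest-rank percentile: the smallest value with at least p% of the samples at or below it. */
    static double percentile(double[] values, double p) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```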
Usually, I'm interested in the statistics only for the period of the run where the load is constant, i.e. after ramp-up and before ramp-down. The data from The Grinder is useful in determining this, since it captures the test, thread and run number of a particular test and so I can work out the time at which all threads had started and the time at which the earliest thread finished and just use data between these times to produce the statistics.
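The window calculation can be sketched as follows, assuming each sample carries an identifier for its thread (in practice injector plus thread number, since thread numbers repeat across machines), a start time and a response time. The window opens when the last thread records its first test and closes when the first thread completes its final test; only samples that fall wholly inside it are kept. All names here are illustrative.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class SteadyState {

    record Sample(String threadId, long startMillis, long responseMillis) {
        long endMillis() {
            return startMillis + responseMillis;
        }
    }

    record Window(long fromMillis, long toMillis) {
        boolean contains(Sample s) {
            return s.startMillis() >= fromMillis && s.endMillis() <= toMillis;
        }
    }

    /** Finds the period during which every thread was running. */
    static Window window(List<Sample> samples) {
        Map<String, Long> firstStart = new HashMap<>();
        Map<String, Long> lastEnd = new HashMap<>();
        for (Sample s : samples) {
            firstStart.merge(s.threadId(), s.startMillis(), Long::min);
            lastEnd.merge(s.threadId(), s.endMillis(), Long::max);
        }
        // Opens when the last thread starts; closes when the earliest thread finishes.
        long from = firstStart.values().stream().mapToLong(Long::longValue).max().orElse(0);
        long to = lastEnd.values().stream().mapToLong(Long::longValue).min().orElse(0);
        return new Window(from, to);
    }

    /** Keeps only the samples that fall entirely within the steady-state window. */
    static List<Sample> steady(List<Sample> samples) {
        Window w = window(samples);
        return samples.stream().filter(w::contains).collect(Collectors.toList());
    }
}
```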
A picture is worth a thousand words, and it's much easier to understand a load and performance profile by referencing a graph of response times plotted against the elapsed time of the test than by looking at the raw numbers. Therefore I generate charts of response times, throughput, server memory and so on to go alongside the summary statistics.
JFreeChart is an excellent charting library for Java and can be used to produce a lot of different charts. I use Time Series charts for plotting data captured during the test against time. This allows me to plot, for example, the response time of each test result against the time that test occurred. This is much easier than attempting to produce the same chart in either Excel or the OpenOffice spreadsheet.
Publishing The Data
With the data captured and analysed the final step is to publish it. We maintain a project wiki, so it's the appropriate place to publish data for the load and performance tests.
MediaWiki provides an HTTP API that supports uploading images and editing content, so I can use that to upload the graphs generated by JFreeChart. One has to remember to use a unique name for each chart; I use the run start time and type of chart as the name, for example responsetime-20090202-1015.png.
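The naming scheme is easy to generate from the run's start time. A small sketch, with the class and method names made up for illustration:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class ChartNames {

    // Date and time to the minute, matching names such as responsetime-20090202-1015.png
    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyyMMdd-HHmm");

    /** Builds a chart file name from the chart type and the start time of the run. */
    static String chartFileName(String chartType, LocalDateTime runStart) {
        return chartType + "-" + runStart.format(STAMP) + ".png";
    }
}
```

Note that a name resolved only to the minute assumes no two runs start within the same minute; add seconds to the pattern if that might happen.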
Next I generate the text of a new wiki page for the test run showing a summary of the run and referencing the charts, and use the API to create the page in the wiki.
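Generating the page text is plain string building. The sketch below produces MediaWiki markup with a heading, a one-row summary table and a file link for each uploaded chart. The layout, the choice of columns and the names `RunPage` and `Summary` are assumptions for illustration; the real page can contain whatever the project needs.

```java
import java.util.List;
import java.util.Locale;

final class RunPage {

    record Summary(String runId, long samples, double meanMillis,
                   double stdDevMillis, double p90Millis) {}

    /** Renders the wiki markup for one run: heading, summary table, then the charts. */
    static String render(Summary s, List<String> chartFiles) {
        StringBuilder page = new StringBuilder();
        page.append("== Load test run ").append(s.runId()).append(" ==\n");
        page.append("{| class=\"wikitable\"\n");
        page.append("! Samples !! Mean (ms) !! Std dev (ms) !! 90th percentile (ms)\n");
        page.append("|-\n");
        page.append(String.format(Locale.ROOT, "| %d || %.1f || %.1f || %.1f\n",
                s.samples(), s.meanMillis(), s.stdDevMillis(), s.p90Millis()));
        page.append("|}\n");
        for (String file : chartFiles) {
            page.append("[[File:").append(file).append("]]\n"); // embeds the uploaded chart
        }
        return page.toString();
    }
}
```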
Finally, to tie it all together I use the API to edit an existing 'results' page and add a link to the new page.
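For the final step, the sketch below builds the form body for an edit request that appends a bullet-point link to the results page. It assumes the `action=edit` module with its `title`, `appendtext` and `token` parameters as found in current MediaWiki releases; older installations may differ, and logging in, fetching the edit token and sending the POST are all left out. The class and method names are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class ResultsIndexEdit {

    /** Builds a URL-encoded form body that appends a link to newPageTitle on the index page. */
    static String appendLinkBody(String indexTitle, String newPageTitle, String editToken) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("action", "edit");
        params.put("format", "json");
        params.put("title", indexTitle);
        params.put("appendtext", "\n* [[" + newPageTitle + "]]");
        params.put("token", editToken); // sent last
        return params.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

Appending rather than replacing the page text means the existing list of runs never has to be downloaded and rewritten.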
I've found that automating the process gives a number of advantages:
- Primarily it frees me up to actually analyse the implications of the results.
- The presentation and analysis of the results for runs is consistent.
- It is reasonably easy to go back and provide new analysis on old runs (provided that you have the raw data), since it's quick and easy to re-run the analysis and publishing. For example, I added throughput analysis and charting midway through a project: it previously hadn't been of interest to the client, and later they decided that it would be useful to see it.