Performance tuning with Flood IO and New Relic, part 1

In this first post we're going to demonstrate some basic principles of performance tuning using Flood IO and New Relic.

Flood IO is a distributed load testing platform that lets you scale out your JMeter or Gatling load test scenarios across the globe within minutes.

New Relic provides an Application Performance Monitoring platform that is fast to deploy and gives immediate visibility into your application.

Setup

Our Application Under Test is an intentionally poorly performing Ruby on Rails app from the New Relic Code Kata, hosted on Heroku with free New Relic monitoring. We use the Ruby-JMeter gem to write a test plan and execute it for free on Flood IO.

New Relic is free for Flood IO users and free for monitoring pre-production apps on Heroku. We're about to demonstrate some of New Relic Professional's features including detailed transaction traces and code level visibility.

Baseline Test

We first run a baseline test on Flood IO. Its results serve as the reference point for all subsequent performance tuning.

Ouch! With just 3 concurrent users, average response times are up to 56s and we also have transactions with errors.

Performance Tuning

Performance tuning is simply the scientific method applied iteratively, correcting and integrating previous knowledge as we go. We start each iteration by defining a question, then propose an explanatory hypothesis that we can test by experiment in a reproducible manner. Flood IO helps us run the test, and New Relic enables us to analyse the data and draw conclusions.

Response Time

This is a primary metric of performance analysis and our baseline shows many_assets has a mean response time of 56s with a standard deviation of 14s. Why is this slow?
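
As a minimal sketch of how the two summary figures are derived, the snippet below computes a mean and a sample standard deviation from a list of response times. The sample values and the summarise helper are hypothetical and chosen only for illustration; they are not the measurements recorded in the baseline.

```python
import statistics

def summarise(response_times_s):
    """Return (mean, sample standard deviation) of response times in seconds."""
    mean = statistics.mean(response_times_s)
    stdev = statistics.stdev(response_times_s)  # sample standard deviation (n - 1)
    return mean, stdev

# Hypothetical response times in seconds, NOT the real baseline samples.
samples = [38.0, 47.0, 52.0, 56.0, 61.0, 66.0, 72.0]

mean, stdev = summarise(samples)
print(f"mean={mean:.1f}s stdev={stdev:.1f}s")
```

A large standard deviation relative to the mean tells us the transaction is not just slow but also inconsistent, which is worth keeping in mind when comparing runs.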

New Relic Transactions shows us the slowest average response times.

We can determine from the routes that our point of interest (POI) is the ManyAssetsController which is linked to our many_assets transaction. It looks like our POI is not the slowest transaction in New Relic, so we need to consider another important metric of performance analysis, throughput.

Throughput

Unlike response time, throughput is a rate: the number of requests completed per unit of time.

New Relic Transactions shows us the highest throughput.

The many_assets transaction in Flood IO has a mean completion rate of 3 requests per minute (rpm), but in New Relic the ManyAssetsController shows 1.72 rpm for the linked index method and another 688 rpm for the display method. Why is the display rate so high?
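
As a rough sanity check on the Flood IO figure, response time and concurrency together put a ceiling on throughput. The sketch below assumes a closed loop in which every user requests the transaction back to back with no think time, which is an assumption and not something the test plan is known to do; the function name is hypothetical.

```python
def max_throughput_rpm(concurrent_users, mean_response_s):
    """Upper bound on completions per minute when each user issues the
    request back to back with no think time (a simplifying assumption)."""
    return concurrent_users * 60.0 / mean_response_s

# 3 concurrent users and a 56 s mean response time, as in the baseline.
ceiling = max_throughput_rpm(3, 56.0)
print(f"at most {ceiling:.2f} rpm")
```

With 3 users and a 56s mean response time, the ceiling is a little over 3 rpm, so the completion rate seen in Flood IO is about as high as this transaction could achieve at that response time.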

The page view itself gives us some hints. Too many images!

Let's have a look at the same transaction using YSlow.

Ouch! 400 image requests are made for this page on an empty cache. Worse, those same 400 images are requested on a primed cache. This would be particularly bad for high latency links.
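
The YSlow count lines up with the New Relic throughput figures above. The sketch below assumes that every call to the index method triggers a fixed number of calls to the display method (one per image), which is an inference from the numbers and not something confirmed by the application code; the function name is hypothetical.

```python
def requests_per_page(asset_rpm, page_rpm):
    """Estimate how many asset requests each page view triggers,
    assuming a fixed number of asset requests per page view."""
    return asset_rpm / page_rpm

# Throughput figures reported by New Relic for ManyAssetsController.
ratio = requests_per_page(688.0, 1.72)
print(f"about {round(ratio)} display requests per index request")
```

The ratio comes out at roughly 400, matching the number of image requests YSlow reports for the page.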

Using observations from Flood IO and New Relic, we can form an explanatory hypothesis for why response times are slow for this transaction: the page makes too many image requests. We can test it by using CSS image sprites to present this collection of images as a single request. This should reduce the number of round trips to the server, which in turn decreases throughput on the controller and hopefully improves response time for that transaction.
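
To illustrate why fewer round trips should help, here is a deliberately crude model of page fetch time. It assumes a fixed round trip time per request and a fixed number of parallel connections, and it ignores server processing time, transfer size and caching. The 100 ms round trip and 6 connections are illustrative assumptions, not measurements from this test.

```python
import math

def fetch_time_s(requests, round_trip_s, parallel_connections):
    """Crude estimate: requests are sent in batches, one batch per round trip.
    Ignores server time, payload size and connection setup."""
    batches = math.ceil(requests / parallel_connections)
    return batches * round_trip_s

# Hypothetical network conditions: 100 ms round trip, 6 parallel connections.
before = fetch_time_s(400, 0.1, 6)  # 400 individual images
after = fetch_time_s(1, 0.1, 6)     # one sprite sheet

print(f"before: {before:.1f}s, after: {after:.1f}s")
```

Even this simple model shows that round trips, not image bytes, dominate when latency is high, which is why the sprite experiment is a reasonable first thing to try.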

Problem Solved

We add CSS image sprites and repeat the baseline.

Our prediction was sound. The many_assets transaction is no longer in the top 5 slowest transactions in Flood IO. New Relic also helps confirm that we've solved the throughput issue: there are no more transactions running at over 600 rpm.

Next Steps

Now that we've introduced the basic concepts of performance tuning and how you can simulate load using Flood IO and analyse performance using New Relic, read our second post in this series to discover how we tune the remaining performance problems present in our application under test.