Guide to JMeter regular expressions

This is a guide to using the JMeter regular expression extractor, to help correlate response values with future request parameters. It introduces you to regular expressions and how they're implemented in JMeter. We'll take a look at common scenarios we see on Flood IO and answer common support questions.

You may have heard the term correlation used by performance testers, but what does it mean?

Correlate: to have a mutual relationship or connection, in which one thing affects or depends on another.

In load testing language, rather than the science of correlation and causation, this usually refers to the act of taking a dynamic value from a response and using it as a value in a future request. If you are converting from LoadRunner to JMeter then you will know what I mean. If not, then read on!

Examples of dynamic data in response bodies

  • CSRF Tokens: the server sends a unique authenticity token in the body of a response, which is then used by the browser when it submits a form in the next request.

    first response:
    <meta content="authenticity_token" name="csrf-param" />
    <meta content="VLDkNA3oFiFP0ap9zOPkWwAwLxmKwFpZ57JlUVZA//E=" name="csrf-token" />

    next request:
    Content-Disposition: form-data; name="authenticity_token"

    VLDkNA3oFiFP0ap9zOPkWwAwLxmKwFpZ57JlUVZA//E=

  • __VIEWSTATE: Microsoft® ASP.NET web applications persist changes to the state of a form in the body of a response, which is then sent back on each subsequent postback to the server.

    first response:
    <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="qUJAF651pJ8WTL0r7dcB+HwCIu5roI89rxbRJhCalaDd5WwuJBR4XnqrSL+1ntHDz4JXmZX3J+uH1Z0yMMqHoN9lwc4qsduHqyB5IkMPTQtH7R7RLf3y+0JrfvE48s10Jo2WZ5X6kc3QM2jbBzG2VR1Fbnn9ZN9IV5nbN7Jc4+UQ3O8PuqpY+vG6hLdWsZOzo6FXAVa5ibL57KW7pPcUDzO+Zzi196o0WTz79HVUf2eQVK9uZEX4kWHOJcNmUcd8kyTU+Xobex3z0jnc29axbnsFZbbRnLLUeZw0Nnycn50qN7pafSBsEe2xG8FRGPdzVi6KNNfCLm7V/FGkJiDbFeopvkBNXXHx/gwJs1UONXqQm/YhTNraTb2B0fKzfbHcJ/2lTco+jQ/fPbNr6rPWrg+DGC5DohH1MdXb9Rtw9LDzJ0SUPC5B7kf+6uswY3jkrQPKYnp9hrhdqwvygjMe55Df0t6Um9voXv/vRR8HK60EZQQ8tYcG0Qnot/fYZw+Kt9lkY47mEZjL9YXoJDgQlNHK2uFBzCKB+L3CU1v7TanITLqWrOf7n6nujpUiQ5J0hbY9iPQpyvsyntA/cHGYr3vjF2OxmvurWAoZFA4f0r1Y2Ig0X16bz6OFJnY5IuJrzJuKTnGHHlTMyRkbbtSqnDN2yxYttHOQfrgXe5Y8pRK05vEHLtk8wBsZHl9xxzRJcBt6Gz+XcIFNa/5BiI8+cbiSP6ssdCYHsqumKevMotKFHdLRwY2OVijtKBrJFSDhKbtBPP3RM8zzc2KtY11+PKQGN58=" />

    next request:
    Content-Disposition: form-data; name="__VIEWSTATE"

    qUJAF651pJ8WTL0r7dcB+HwCIu5roI89rxbRJhCalaDd5WwuJBR4XnqrSL+1ntHDz4JXmZX3J+uH1Z0yMMqHoN9lwc4qsduHqyB5IkMPTQtH7R7RLf3y+0JrfvE48s10Jo2WZ5X6kc3QM2jbBzG2VR1Fbnn9ZN9IV5nbN7Jc4+UQ3O8PuqpY+vG6hLdWsZOzo6FXAVa5ibL57KW7pPcUDzO+Zzi196o0WTz79HVUf2eQVK9uZEX4kWHOJcNmUcd8kyTU+Xobex3z0jnc29axbnsFZbbRnLLUeZw0Nnycn50qN7pafSBsEe2xG8FRGPdzVi6KNNfCLm7V/FGkJiDbFeopvkBNXXHx/gwJs1UONXqQm/YhTNraTb2B0fKzfbHcJ/2lTco+jQ/fPbNr6rPWrg+DGC5DohH1MdXb9Rtw9LDzJ0SUPC5B7kf+6uswY3jkrQPKYnp9hrhdqwvygjMe55Df0t6Um9voXv/vRR8HK60EZQQ8tYcG0Qnot/fYZw+Kt9lkY47mEZjL9YXoJDgQlNHK2uFBzCKB+L3CU1v7TanITLqWrOf7n6nujpUiQ5J0hbY9iPQpyvsyntA/cHGYr3vjF2OxmvurWAoZFA4f0r1Y2Ig0X16bz6OFJnY5IuJrzJuKTnGHHlTMyRkbbtSqnDN2yxYttHOQfrgXe5Y8pRK05vEHLtk8wBsZHl9xxzRJcBt6Gz+XcIFNa/5BiI8+cbiSP6ssdCYHsqumKevMotKFHdLRwY2OVijtKBrJFSDhKbtBPP3RM8zzc2KtY11+PKQGN58=

How do I know if I need to correlate response values with future requests?

An experienced tester will have an eye for this. When looking at network activity in their favourite proxy recorder or network debug tool, well-known parameters such as VIEWSTATE or JSESSIONID, along with timestamps and tokens, should stand out like the proverbial. Other parameters may be more subtle to detect, especially single characters or obfuscated parameters that result from some form of JavaScript parsing or execution.

The only way to know for sure is to adapt the old proverb:

Measure twice and DIFF!

By that we mean take a snapshot or recording of your transaction twice, with the same user and use some form of file comparison tool to detect differences. If you can't do that, then often the Mark I Human Eyeball will suffice.
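
The diff step can also be sketched in code. The following is only an illustration and is not part of JMeter: it assumes you have already parsed the request parameters of each recording into a map, and the class `RecordingDiff`, the method `dynamicParameters` and the sample parameter values are all made up for this example. Any parameter whose value changes between two recordings of the same transaction by the same user is a candidate for correlation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: compares the parameters captured in two recordings
// of the same transaction and reports the ones whose values differ.
class RecordingDiff {

    static Set<String> dynamicParameters(Map<String, String> first, Map<String, String> second) {
        Set<String> changed = new TreeSet<>();
        for (Map.Entry<String, String> entry : first.entrySet()) {
            String other = second.get(entry.getKey());
            // A value that differs (or is missing) in the second recording is likely dynamic.
            if (!entry.getValue().equals(other)) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, String> first = new LinkedHashMap<>();
        first.put("username", "buster");
        first.put("authenticity_token", "VLDkNA3oFiFP0ap9zOPkWwAwLxmKwFpZ57JlUVZA//E=");

        Map<String, String> second = new LinkedHashMap<>();
        second.put("username", "buster");
        second.put("authenticity_token", "a-different-token-from-the-second-run");

        System.out.println(dynamicParameters(first, second)); // prints [authenticity_token]
    }
}
```

In practice a file comparison tool does the same job on the raw recordings; the point is simply that static values such as the username drop out and dynamic ones remain.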

Correlation using JMeter

Using the above CSRF token example, let's take a look at how it's done in JMeter.

We'd make our first request with an HTTP Request sampler:

Thread Group  
  HTTP Request
    Server Name or IP: flood.io
    Path: /

To that request, we'd add a Regular Expression Extractor:

Thread Group  
  HTTP Request
    -> Regular Expression Extractor
      Reference Name: authenticity_token
      Regular Expression: content="(.+?)" name="csrf-token"
      Template: $1$
      Match No. (0 for Random): 1
      Default Value:

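The Reference Name is the name of the JMeter variable that stores the result of the expression, in this case ${authenticity_token}. Any later request can then use ${authenticity_token} wherever the token is needed.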

Your First Regular Expression

If you don't know any regex, take heart, you can get started with some basics. Consider the following expression:

content="(.+?)" name="csrf-token"

Regular expressions are patterns that are matched against text. This expression says:

  1. match the characters content=" literally (case sensitive)
  2. then capture the 1st group, which is the part in parentheses (.+?)
  3. inside that group, match any character (by default, anything other than a line break) between one and unlimited times, as few times as possible, expanding as needed (otherwise known as lazy)
  4. then match the characters " name="csrf-token" literally (case sensitive)
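
If you want to try the expression outside JMeter, here is a minimal sketch using Java's own `java.util.regex`. JMeter's extractor uses a Perl5-style engine of its own, so treat this as an approximation that should agree for a simple pattern like this one. The class and method names are hypothetical, and the second pattern, which swaps `.+?` for the negated character class `[^"]+`, is our own variation rather than something the extractor above uses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of JMeter.
class CsrfTokenRegex {
    // The expression from the extractor above: a lazy capture group.
    static final Pattern LAZY = Pattern.compile("content=\"(.+?)\" name=\"csrf-token\"");
    // Variation (an assumption of this sketch): the group cannot run past a double quote.
    static final Pattern NEGATED = Pattern.compile("content=\"([^\"]+)\" name=\"csrf-token\"");

    // Returns the first captured group of the first match, or null if nothing matches.
    static String extract(Pattern pattern, String body) {
        Matcher m = pattern.matcher(body);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String body = "<meta content=\"authenticity_token\" name=\"csrf-param\" />\n"
                + "<meta content=\"VLDkNA3oFiFP0ap9zOPkWwAwLxmKwFpZ57JlUVZA//E=\" name=\"csrf-token\" />";

        // Tags on separate lines: '.' stops at the line break, so the lazy group
        // can only match inside the second tag.
        System.out.println(extract(LAZY, body)); // prints VLDkNA3oFiFP0ap9zOPkWwAwLxmKwFpZ57JlUVZA//E=

        // Same tags on one line: the lazy group starts at the first content="
        // and expands across both tags, capturing more than the token.
        String oneLine = body.replace("\n", " ");
        System.out.println(extract(LAZY, oneLine));

        // The negated class is not affected by the layout of the page.
        System.out.println(extract(NEGATED, oneLine)); // prints VLDkNA3oFiFP0ap9zOPkWwAwLxmKwFpZ57JlUVZA//E=
    }
}
```

The takeaway is that a lazy group still begins at the first place the literal prefix matches, so it is worth checking what the extractor actually captured in your own response.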

Now that we have a regular expression, we tell JMeter how to use the result with the template $1$. This means the variable ${authenticity_token} will be populated with the first captured group only.

Our Match No. is set to 1 because we want the first instance of this matched string, as there could be multiple matches on a page.

Some Variations on Regex

The JMeter manual has some useful regular expressions which you can familiarise yourself with. The following example fleshes out some of these concepts.

Often you will need to extract multiple attributes from an HTML tag, for example the ID and Name attributes associated with a particular class.

<input class="cats" id="meow" name="Buster">  
<input class="cats" id="purr" name="Mac">  
<input class="cats" id="roar" name="Sooky">  

Consider the following regular expression extractor:

Thread Group  
  HTTP Request
    -> Regular Expression Extractor
      Reference Name: cat
      Regular Expression: class="cats" id="(.+?)" name="(.+?)"
      Template: $2$ says $1$
      Match No. (0 for Random): 1
      Default Value:

This would yield the following results:

cat=Buster says meow  
cat_g=2  
cat_g0=class="cats" id="meow" name="Buster"  
cat_g1=meow  
cat_g2=Buster  

What does that all mean? We stored the result of the expression class="cats" id="(.+?)" name="(.+?)" in a JMeter variable called ${cat}. The template we used was the 2nd group, followed by the string says, followed by the 1st group. So in effect ${cat} now equals Buster says meow.

JMeter also breaks the match up into other variables, which is handy when we only want parts of the matched expression. For example, ${cat_g1} holds the 1st group, meow, and likewise ${cat_g2} holds the 2nd group, Buster. ${cat_g} holds the number of groups, and we can also get the entire matched string via ${cat_g0}.
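
To make the naming concrete, here is a rough sketch that reproduces the variables listed above for the first match only. It is our own emulation written with `java.util.regex`, not JMeter's implementation, and the class `TemplateSketch` and its `extract` method are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical emulation of the extractor's variable naming (first match only).
class TemplateSketch {

    static Map<String, String> extract(String refName, String regex, String template, String body) {
        Map<String, String> vars = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regex).matcher(body);
        if (!m.find()) {
            return vars; // no match: nothing is stored in this sketch
        }
        // Fill the template: $1$ becomes group 1, $2$ becomes group 2, and so on.
        String result = template;
        for (int g = 0; g <= m.groupCount(); g++) {
            String value = m.group(g) == null ? "" : m.group(g);
            result = result.replace("$" + g + "$", value);
        }
        vars.put(refName, result);
        vars.put(refName + "_g", String.valueOf(m.groupCount()));
        for (int g = 0; g <= m.groupCount(); g++) {
            vars.put(refName + "_g" + g, m.group(g));
        }
        return vars;
    }

    public static void main(String[] args) {
        String body = "<input class=\"cats\" id=\"meow\" name=\"Buster\">\n"
                + "<input class=\"cats\" id=\"purr\" name=\"Mac\">\n"
                + "<input class=\"cats\" id=\"roar\" name=\"Sooky\">";
        Map<String, String> vars = extract("cat",
                "class=\"cats\" id=\"(.+?)\" name=\"(.+?)\"", "$2$ says $1$", body);
        // Prints cat=Buster says meow, cat_g=2, cat_g0=..., cat_g1=meow, cat_g2=Buster
        vars.forEach((name, value) -> System.out.println(name + "=" + value));
    }
}
```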

What happens if we wanted all the matches on a page? This is where Match No. comes into play. In previous examples we used the 1st match found on the page.

Match No. is not a zero based index! If you specify 0 then you will get a random match from the page.

So this expression:

Thread Group  
  HTTP Request
    -> Regular Expression Extractor
      Reference Name: cat
      Regular Expression: class="cats" id=".+?" name="(.+?)"
      Template: $1$
      Match No. (0 for Random): 0
      Default Value:

Would yield the following results:

cat=Sooky  
cat_g=1  
cat_g0=class="cats" id="roar" name="Sooky"  
cat_g1=Sooky  

In this case we captured only one group and asked for a random match on the page, so in this iteration ${cat} equals Sooky.

In further iterations, we would see any random value from Buster, Sooky and Mac.
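
The way Match No. selects a result can be sketched as follows. This is an illustration of the behaviour described above, not JMeter's code: `MatchNumberSketch`, `allMatches` and `pick` are hypothetical names, and negative values are deliberately left out of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of how Match No. chooses among several matches.
class MatchNumberSketch {

    // Collects group 1 of every match in the body, in page order.
    static List<String> allMatches(String regex, String body) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(body);
        while (m.find()) {
            matches.add(m.group(1));
        }
        return matches;
    }

    // matchNo >= 1 picks that match (counting from 1, not 0); 0 picks a random match.
    // Returns null when there is no such match. Negative values are not handled here.
    static String pick(List<String> matches, int matchNo, Random random) {
        if (matches.isEmpty()) {
            return null;
        }
        if (matchNo == 0) {
            return matches.get(random.nextInt(matches.size()));
        }
        if (matchNo < 1 || matchNo > matches.size()) {
            return null;
        }
        return matches.get(matchNo - 1);
    }

    public static void main(String[] args) {
        String body = "<input class=\"cats\" id=\"meow\" name=\"Buster\">\n"
                + "<input class=\"cats\" id=\"purr\" name=\"Mac\">\n"
                + "<input class=\"cats\" id=\"roar\" name=\"Sooky\">";
        List<String> names = allMatches("class=\"cats\" id=\".+?\" name=\"(.+?)\"", body);
        System.out.println(pick(names, 1, new Random())); // prints Buster
        System.out.println(pick(names, 0, new Random())); // one of Buster, Mac or Sooky
    }
}
```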

The other trick we might like to do is return all matches on the page, which we do by setting Match No. to -1.

This expression:

Thread Group  
  HTTP Request
    -> Regular Expression Extractor
      Reference Name: cat
      Regular Expression: class="cats" id="(.+?)" name="(.+?)"
      Template: $1$
      Match No. (0 for Random): -1
      Default Value:

Yields the following results:

cat=Mac  
cat_1=meow  
cat_1_g=2  
cat_1_g0=class="cats" id="meow" name="Buster"  
cat_1_g1=meow  
cat_1_g2=Buster  
cat_2=purr  
cat_2_g=2  
cat_2_g0=class="cats" id="purr" name="Mac"  
cat_2_g1=purr  
cat_2_g2=Mac  
cat_3=roar  
cat_3_g=2  
cat_3_g0=class="cats" id="roar" name="Sooky"  
cat_3_g1=roar  
cat_3_g2=Sooky  
cat_matchNr=3  

Note that with a negative Match No. the extractor fills in the numbered variables rather than ${cat} itself, so the cat=Mac entry above is most likely a value left over from an earlier extraction.

There's a new variable present called ${cat_matchNr} which equals 3. As the name suggests, this is the total number of matches on the page. This can be quite handy for a number of reasons. For example, if we wanted to loop through all the matches in the response and include them in the next request, we could do something like this using a BeanShell PreProcessor.

The following BeanShell script executes a basic for loop, from 1 up to cat_matchNr (which equals 3), and on each iteration of the loop adds an HTTP request parameter as follows:

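Thread Group  
  HTTP Request
    -> BeanShell PreProcessor
      Script:
        // Total number of matches saved by the extractor (Match No. -1)
        int count = Integer.parseInt(vars.get("cat_matchNr"));

        for (int i = 1; i <= count; i++)
        {
          // Group 1 is the id (what the cat says), group 2 is the name
          String says = vars.get("cat_" + i + "_g1");
          String cat = vars.get("cat_" + i + "_g2");
          // Adds a request parameter such as Buster_says=meow
          sampler.addArgument(cat + "_says", says);
        }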

This yields the following request:

GET http://wheres.my.kitten.com/?Buster_says=meow&Mac_says=purr&Sooky_says=roar  
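
If you would like to check the loop logic without running a test plan, the sketch below emulates both halves in plain Java: it builds the numbered group variables and cat_matchNr from the sample HTML, then runs the same loop to build a query string. Everything here is an assumption made for illustration. `AllMatchesSketch`, `extractAll` and `buildQuery` are hypothetical names, a plain map stands in for JMeter's `vars`, and a `StringBuilder` stands in for `sampler.addArgument`. No URL encoding is applied.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical emulation of the "all matches" variables and the PreProcessor loop.
class AllMatchesSketch {

    // Stores refName_N_gM for every match N and group M, plus refName_matchNr.
    static Map<String, String> extractAll(String refName, String regex, String body) {
        Map<String, String> vars = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regex).matcher(body);
        int count = 0;
        while (m.find()) {
            count++;
            for (int g = 0; g <= m.groupCount(); g++) {
                vars.put(refName + "_" + count + "_g" + g, m.group(g));
            }
        }
        vars.put(refName + "_matchNr", String.valueOf(count));
        return vars;
    }

    // Same loop as the BeanShell script, building a query string instead of
    // adding arguments to a sampler.
    static String buildQuery(Map<String, String> vars) {
        int count = Integer.parseInt(vars.get("cat_matchNr"));
        StringBuilder query = new StringBuilder();
        for (int i = 1; i <= count; i++) {
            String says = vars.get("cat_" + i + "_g1");
            String cat = vars.get("cat_" + i + "_g2");
            if (i > 1) {
                query.append('&');
            }
            query.append(cat).append("_says=").append(says);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        String body = "<input class=\"cats\" id=\"meow\" name=\"Buster\">\n"
                + "<input class=\"cats\" id=\"purr\" name=\"Mac\">\n"
                + "<input class=\"cats\" id=\"roar\" name=\"Sooky\">";
        Map<String, String> vars = extractAll("cat",
                "class=\"cats\" id=\"(.+?)\" name=\"(.+?)\"", body);
        System.out.println(buildQuery(vars)); // prints Buster_says=meow&Mac_says=purr&Sooky_says=roar
    }
}
```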

Extracting from the Header

Sometimes the value you are after does not exist in the response body but in the response headers instead.

This is easy to handle: just change Response Field to Check to Headers.

So this expression:

Thread Group  
  HTTP Request
    -> Regular Expression Extractor
      Response Field to Check: Headers
      Reference Name: auth_token
      Regular Expression: token":"([^"]+)"
      Template: $1$
      Match No. (0 for Random): 1

Would yield the following results:

auth_token=abcd1234  
auth_token_g=1  
auth_token_g0=token":"abcd1234"  
auth_token_g1=abcd1234  

Extracting values from JSON

Flood IO supports the use of JMeter plugins which make it very simple to extract JSON values from a response body.

Consider the following response body from a typical HTTP response containing JSON:

{
  "ok" : true,
  "status" : 200,
  "name" : "Armory",
  "version" : {
    "number" : "0.19.8",
    "snapshot_build" : false
  },
  "tagline" : "You Know, for Search"
}

This expression:

Thread Group  
  HTTP Request
    -> jp@gc - JSON Path Extractor
      Name: name
      JSON Path: $.name

    -> jp@gc - JSON Path Extractor
      Name: version_number
      JSON Path: $.version.number

Would yield the following results:

name=Armory  
version_number=0.19.8  

It doesn't get much simpler than that!

The examples we've shown on this page are probably enough to get you started. Feel free to contact support with more specific questions.