A/B Testing with NGINX in 40 Lines of Code

A/B testing is the tool that has given designers and product managers deep insight into user behavioural patterns. On the one hand, it has allowed product managers more flexibility while conceptualizing user journeys; on the other, it has become a developer's nightmare, since someone has to build two versions of the same component.

“The general concept behind A/B testing is to create an experiment with a control group and one or more experimental groups (called “cells” within Netflix) which receive alternative treatments. Each member belongs exclusively to one cell within a given experiment, with one of the cells always designated the “default cell”. This cell represents the control group, which receives the same experience as all Netflix members not in the test.”— Netflix blog

What does the present ecosystem offer?

Presently, companies like Mixpanel, VWO, and Optimizely provide client SDKs (JavaScript code) which have to be added in the head tag of the page HTML. Tests can then be created via a dashboard. Although these tools give you a lot of options when it comes to button colours and component heights (CSS attributes), they don't really allow you to create two separate flows altogether. Some of these libraries can also hamper your page load times and create a jittery, laggy experience for users.

Presenting NGINX

Nginx is a lightweight web server that offers a bunch of functionality such as load balancing, reverse proxying, and HTML compression. It's easy to set up and offers a lot of control to developers.

Nginx is a terrific tool for distributing traffic for split tests. It’s stable, it’s blazingly fast, and configurations for typical use cases are prevalent online. More complex configuration can be accomplished after just a couple hours exploring the documentation. Small companies may not have resources to spend on paid software for A/B testing, but nginx provides an option to carry out some form of A/B testing.
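To set the stage, here is the shape of a bare-bones reverse-proxy config (a sketch; the port 7770 is the one used for the app server later in this article):

```nginx
# minimal nginx.conf: a bare reverse proxy.
# nginx listens on port 80 and forwards every request
# to a single app server at 127.0.0.1:7770.
events {}

http {
    server {
        listen 80;
        location / {
            proxy_set_header Host $host;
            proxy_pass http://127.0.0.1:7770;
        }
    }
}
```

The split-testing configs that follow are extensions of exactly this pattern.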

Suppose you want to see which of two signup forms will have better conversion: Version A, a six-field form, and Version B, a two-field form.

Your hypothesis: fewer form fields means less data for the user to input, leading to more conversions.

So we define two buckets, Version A and Version B. The former is the control group, shown to 80% of the traffic, whereas the latter is the test group, which receives the remaining 20%. Port 7770 will host one bucket of the code and port 7777 will host the other.


Over to some Code Now

Your nginx.conf file can be modified as shown below.

http {
    # ...
    # application version a
    upstream version_a {
        server 127.0.0.1:7770;  # can be an external IP too
    }
    # application version b
    upstream version_b {
        server 127.0.0.1:7777;  # can be an external IP too
    }
    split_clients "app${remote_addr}${http_user_agent}${date_gmt}" $appversion {
        80%     version_a;
        *       version_b;
    }
    server {
        # ...
        listen 80;
        location / {
            proxy_set_header Host $host;
            proxy_pass http://$appversion;
        }
    }
}

Create 2 upstreams, one for each bucket.

The split_clients directive diverts traffic to a particular upstream according to the specified weights. The string "app${remote_addr}${http_user_agent}${date_gmt}" is hashed, and the hash determines which bucket the $appversion variable resolves to for a given request. Preferably these parameters are ones that pertain solely to a user, like $http_user_agent and $remote_addr.
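As a variation, if you want the assignment to stay stable for a given visitor, the hashed string can be limited to per-user fields only (a sketch; note that dropping ${date_gmt} from the key is my change, not part of the config above):

```nginx
# Hash only per-user fields, so the same IP + user agent
# combination maps to the same bucket on every request.
split_clients "app${remote_addr}${http_user_agent}" $appversion {
    80%     version_a;
    *       version_b;
}
```

This is still not fully reliable (the assignment changes if the user's IP or browser changes), which motivates the cookie-based approach described next.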

OK, so this will work, but it doesn't give the user a persistent experience. If I refresh my page, there is a chance I switch between buckets, and that can be a horrid user experience. In the case above, imagine trying to fill in a six-field form and then suddenly, on refreshing, seeing a two-field form.

New Approach

  1. Proxy pass the request to either bucket.
  2. Set a cookie with an expiration time equal to the duration of the test.
  3. Check for the cookie's existence and proxy pass to the correct bucket, to ensure a uniform user experience.

We will use nginx's map directive and map the $cookie_split_test_version variable to the different buckets that we have created.

http {
    # ...
    # application version a
    upstream version_a {
        server 127.0.0.1:7770;  # can be an external IP too
    }
    # application version b
    upstream version_b {
        server 127.0.0.1:7777;  # can be an external IP too
    }
    split_clients "app${remote_addr}${http_user_agent}${date_gmt}" $appversion {
        80%     version_a;
        *       version_b;
    }
    # route by the cookie if it exists, otherwise by the split
    map $cookie_split_test_version $upstream_group {
        default     $appversion;
        "version_a" "version_a";
        "version_b" "version_b";
    }
    server {
        # ...
        listen 80;
        location / {
            # persist the assigned bucket for 6 days
            add_header Set-Cookie "split_test_version=$upstream_group;Path=/;Max-Age=518400;";
            proxy_set_header Host $host;
            if ($upstream_group = "version_a") {
                proxy_pass http://127.0.0.1:7770;
                break;
            }
            if ($upstream_group = "version_b") {
                proxy_pass http://127.0.0.1:7777;
                break;
            }
            proxy_pass http://$appversion;
        }
    }
}


Conclusion

  1. Nginx provides a very simple API for creating an A/B test environment.
  2. It allows multiple buckets to be created; the example above shows two buckets, but we can split traffic further and create more.
  3. As the same code is hosted on two ports, deployment can become tricky (presently I have two branches, master and a test branch), whether it is done off a different branch or from the same one.
  4. Carrying out more than one A/B test at a time can become tricky. Yes, you can use the location directive and set different cookies based on the required tests, but running two overlapping tests (Test 1: control 80 / test 20, and Test 2: control 50 / test 50) is impossible with this setup. That said, you should not run more than one A/B test per page at a time; otherwise you end up with 2^n versions of your page, where n is the number of tests, and tracking conversions will be hell.
  5. Tracking can now be done at a very granular level, as the code bases are effectively separate.
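On point 2, splitting into more than two buckets is just a matter of more lines in split_clients. A sketch, assuming a third upstream named version_c has been defined the same way as the other two:

```nginx
# Three-way split: 50% control, 25% to each of two test cells.
# The listed percentages must not exceed 100%; "*" catches the rest.
split_clients "app${remote_addr}${http_user_agent}" $appversion {
    50%     version_a;
    25%     version_b;
    *       version_c;   # the remaining 25%
}
```

The map and location blocks would need a matching "version_c" entry for the cookie-based routing to cover the new bucket.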

Do let me know if I've made any mistakes in the above. Happy to correct and learn. Hope you liked the article.

PS: Did anyone notice it's less than 40 lines of code?

