Performance Testing 101 - 5 min intro & example
When developing and deploying web services, apps or sites, the following questions come up: "How will it perform?", "How many concurrent users will it support?", "If I tweak this setting, will it be faster?", "Do these new features affect performance?". The list could go on and on. Performance questions are common; solid answers are not.
Performance testing can take many different shapes, from dead-simple one-liners to complex setups, tests, tear-downs and analysis. While this article focuses on quick, easy and straightforward testing, future articles will address more advanced topics.
There are some great, easy tools for getting first ballpark answers to performance questions, such as how many concurrent users a service supports and how response time changes as load increases. Here I'll give a short intro to ApacheBench.
Let us begin with some basic terminology. First, let's refer to our machine under test as the host; this can be any kind of HTTP-accessible server you have. Second, we will want an agent machine to drive our tests from.
When performance testing, it is key to limit the number of variables that could distort our results. Ideally your agent is a separate, dedicated machine, as close as possible (network-distance-wise) to your host system, in order to minimize the amount of networking you test. This is especially relevant when you are testing applications hosted in a shared environment (i.e. the cloud). The performance impact of noisy neighbors can be surprising, but that is a topic we will explore in detail in the future.
ApacheBench
ApacheBench is a command line tool (ab) which allows for simple load driving against HTTP hosts. It's great at producing large numbers of REST requests, and is capable of producing thousands of requests per second. Generally I find ApacheBench most useful for getting a rough idea of how many requests an application can handle. It's extremely simple to use and therefore a great tool while debugging configurations.
To install on Debian/Ubuntu:
sudo apt-get install apache2-utils
To install on RHEL/CentOS/Fedora:
sudo yum install httpd-tools
Usage:
Usage: ab [options] [http[s]://]hostname[:port]/path
Options are:
    -n requests     Number of requests to perform
    -c concurrency  Number of multiple requests to make at a time
    -t timelimit    Seconds to max. to spend on benchmarking
                    This implies -n 50000
    -s timeout      Seconds to max. wait for each response
                    Default is 30 seconds
    -b windowsize   Size of TCP send/receive buffer, in bytes
    -B address      Address to bind to when making outgoing connections
    -p postfile     File containing data to POST. Remember also to set -T
    -u putfile      File containing data to PUT. Remember also to set -T
    -T content-type Content-type header to use for POST/PUT data, eg.
                    'application/x-www-form-urlencoded'
                    Default is 'text/plain'
    -v verbosity    How much troubleshooting info to print
    -w              Print out results in HTML tables
    -i              Use HEAD instead of GET
    -x attributes   String to insert as table attributes
    -y attributes   String to insert as tr attributes
    -z attributes   String to insert as td or th attributes
    -C attribute    Add cookie, eg. 'Apache=1234'. (repeatable)
    -H attribute    Add Arbitrary header line, eg. 'Accept-Encoding: gzip'
                    Inserted after all normal header lines. (repeatable)
    -A attribute    Add Basic WWW Authentication, the attributes
                    are a colon separated username and password.
    -P attribute    Add Basic Proxy Authentication, the attributes
                    are a colon separated username and password.
    -X proxy:port   Proxyserver and port number to use
    -V              Print version number and exit
    -k              Use HTTP KeepAlive feature
    -d              Do not show percentiles served table.
    -S              Do not show confidence estimators and warnings.
    -q              Do not show progress when doing more than 150 requests
    -l              Accept variable document length (use this for dynamic pages)
    -g filename     Output collected data to gnuplot format file.
    -e filename     Output CSV file with percentages served
    -r              Don't exit on socket receive errors.
    -m method       Method name
    -h              Display usage information (this message)
    -Z ciphersuite  Specify SSL/TLS cipher suite (See openssl ciphers)
    -f protocol     Specify SSL/TLS protocol
                    (TLS1, TLS1.1, TLS1.2 or ALL)
Example usage:
ab -c 1 -n 1000 http://example.com/
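If you find yourself rerunning ab while tweaking settings, a small wrapper can save some copying and pasting. The following is a minimal Python sketch (Python 3.7 or newer), not part of ApacheBench itself: the function names build_ab_command, parse_ab_report and run_ab are made up for illustration, and the parser assumes the report contains the "Requests per second" and "Time per request" lines that ab prints in its summary.

```python
import re
import subprocess


def build_ab_command(url, requests=1000, concurrency=1):
    """Return the ab command line as an argument list (hypothetical helper)."""
    return ["ab", "-c", str(concurrency), "-n", str(requests), url]


def parse_ab_report(report):
    """Pull the two headline numbers out of ab's text report.

    Assumes the summary lines look like:
        Requests per second:    28.06 [#/sec] (mean)
        Time per request:       35.642 [ms] (mean)
    """
    rps = re.search(r"Requests per second:\s+([\d.]+)", report)
    # re.search returns the first match, which is the per-client mean;
    # the second "Time per request" line is averaged across all clients.
    tpr = re.search(r"Time per request:\s+([\d.]+)", report)
    return {
        "requests_per_second": float(rps.group(1)) if rps else None,
        "time_per_request_ms": float(tpr.group(1)) if tpr else None,
    }


def run_ab(url, requests=1000, concurrency=1):
    """Run ab (it must be installed and on PATH) and return the parsed summary."""
    result = subprocess.run(
        build_ab_command(url, requests, concurrency),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if ab exits with an error
    )
    return parse_ab_report(result.stdout)
```

Calling run_ab("http://example.com/", requests=1000, concurrency=1) would mirror the example command above and return a dictionary with the two values. Treat it as a convenience for quick comparisons only; the full report contains more detail than this sketch extracts.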
Putting ApacheBench to use:
The above section should be plenty to get you started, but let's look at a quick example of testing caching. Below I've set up a simple Flask server example which, on each request, calculates the Fibonacci number for a random index between 1 and 30.
#!/usr/bin/env python
#
# To start this server, you must have python and flask installed
# Start server: python testserver-fib.py
#
# To install flask use the pip line below:
# pip install Flask
# or visit: http://flask.pocoo.org/docs/0.12/installation/
from flask import Flask
import random

app = Flask(__name__)


# snagged from: http://stackoverflow.com/a/499245
def F(n):
    """Naive recursive Fibonacci; deliberately slow for larger n."""
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return F(n - 1) + F(n - 2)


@app.route('/')
def hello_world():
    r = random.randint(1, 30)
    fib = F(r)
    # ApacheBench expects constant-length output, so zero-pad both numbers
    return 'fib({0:02}):{1:06}'.format(r, fib)


if __name__ == "__main__":
    app.run(debug=True)
Now let's see how we do performance-wise. We set the concurrency to 1 using -c 1 and the number of requests to 500 using -n 500. Note that we are using the simple Flask dev server, which is single-threaded in the Flask 0.12 release linked above (newer Flask releases serve requests with threads by default).
m@test:~$ ab -c 1 -n 500 http://127.0.0.1:5000/
-- snip --
Concurrency Level: 1
Time taken for tests: 17.821 seconds
Complete requests: 500
Failed requests: 0
Total transferred: 85000 bytes
HTML transferred: 7000 bytes
Requests per second: 28.06 [#/sec] (mean)
Time per request: 35.642 [ms] (mean)
Time per request: 35.642 [ms] (mean, across all concurrent requests)
Transfer rate: 4.66 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     1   35  88.2      1     471
Waiting:        0   35  88.2      1     471
Total:          1   36  88.2      1     471
Percentage of the requests served within a certain time (ms)
50% 1
66% 5
75% 16
80% 27
90% 107
95% 277
98% 447
99% 461
100% 471 (longest request)
In the above example, we see the average request time was about 36ms, the median was 1ms and we had 28rps (requests per second). What happens if instead of a single connection we have 10 concurrent connections (setting -c 10)?
m@test:~$ ab -c 10 -n 500 http://127.0.0.1:5000/
-- snip --
Concurrency Level: 10
Time taken for tests: 18.579 seconds
Complete requests: 500
Failed requests: 0
Total transferred: 85000 bytes
HTML transferred: 7000 bytes
Requests per second: 26.91 [#/sec] (mean)
Time per request: 371.583 [ms] (mean)
Time per request: 37.158 [ms] (mean, across all concurrent requests)
Transfer rate: 4.47 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     3  360 294.4    294    1470
Waiting:        2  360 294.4    294    1470
Total:          3  360 294.4    294    1470
Percentage of the requests served within a certain time (ms)
50% 294
66% 443
75% 529
80% 579
90% 746
95% 1018
98% 1136
99% 1160
100% 1470 (longest request)
While the RPS remains very similar to before at ~27rps, our response times have gone through the roof (mean of 371ms and median of 294ms)! Here we have a situation where multiple parallel connections get serialized and processed one at a time. While the overall rate remains unchanged, the quality of service delivered to each client degrades by a factor roughly equal to the number of concurrent connections.
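The relationship between concurrency, throughput and per-request time can be checked with a little arithmetic. The sketch below applies Little's law (mean time per request = requests in flight / throughput), under the assumption that ab keeps exactly -c requests in flight for the whole run; the function name mean_latency_ms is made up for illustration.

```python
def mean_latency_ms(concurrency, requests_per_second):
    """Little's law: mean time in system = requests in flight / throughput."""
    return concurrency / requests_per_second * 1000.0


# Numbers taken from the two ab runs above.
print(round(mean_latency_ms(1, 28.06), 1))   # 35.6
print(round(mean_latency_ms(10, 26.91), 1))  # 371.6
```

Both values line up with the "Time per request" figures ab reported (35.642ms and 371.583ms; the small difference comes from the rounded RPS value). With throughput pinned by the single-threaded server, ten times the clients means roughly ten times the wait for each of them.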
Let's see if we can do better. Since we repeatedly calculate the same 30 Fibonacci numbers, let's add some caching into the mix. Generally, if you have long-running requests that will always return an unchanging value, it is a good idea to cache these. With the caching in place, the first few requests will still have the same slow response time, but all of the following requests will benefit from the cache and therefore be as fast as our cache lookup. See the modified code below:
#!/usr/bin/env python
#
# * To start this server, you must have python and flask installed
# * Copy this into a file named testserver-fib-cached.py
# * Start server: python testserver-fib-cached.py
#
# * To install flask use the pip line below:
#   `pip install Flask`
#   or visit: http://flask.pocoo.org/docs/0.12/installation/
from flask import Flask
import random

app = Flask(__name__)
# Maps n -> F(n) for every n this process has already computed
cache = {}


# snagged from: http://stackoverflow.com/a/499245
def F(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return F(n - 1) + F(n - 2)


@app.route('/')
def hello_world():
    r = random.randint(1, 30)
    if r in cache:
        print('hit')
        # ApacheBench expects constant-length output
        return 'Cache Hit! fib({0:02}):{1:06}'.format(r, cache[r])
    else:
        fib = F(r)
        cache[r] = fib
        print('miss')
        # The hit and miss messages differ in length by one character;
        # pass -l to ab if it reports length failures
        return 'Cache Miss! fib({0:02}):{1:06}'.format(r, fib)


if __name__ == "__main__":
    app.run(debug=True)
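As an aside, Python's standard library can do this kind of memoization for you. The following is a minimal sketch of an alternative using functools.lru_cache; it is not the code benchmarked in this article, and the lowercase function name fib is my own choice. Note that it behaves differently from the dictionary version: the decorator also caches the recursive calls, so even the first request for a large n returns quickly.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded cache: keep every result ever computed
def fib(n):
    """Recursive Fibonacci; each distinct n is computed only once."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


print(fib(30))           # 832040
print(fib.cache_info())  # hit/miss counters maintained by lru_cache
```

The benchmark numbers in this article were produced with the hand-rolled dictionary version, which only caches the final result of each request and therefore still pays the full recursive cost on every cache miss.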
Now let's run our single connection, 500 request benchmark again:
-- snip --
Concurrency Level: 1
Time taken for tests: 1.680 seconds
Complete requests: 500
Failed requests: 0
Total transferred: 91000 bytes
HTML transferred: 13000 bytes
Requests per second: 297.55 [#/sec] (mean)
Time per request: 3.361 [ms] (mean)
Time per request: 3.361 [ms] (mean, across all concurrent requests)
Transfer rate: 52.89 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     1    3  27.7      1     497
Waiting:        0    3  27.7      1     497
Total:          1    3  27.7      1     497
Percentage of the requests served within a certain time (ms)
50% 1
66% 1
75% 1
80% 1
90% 1
95% 1
98% 7
99% 85
100% 497 (longest request)
The results are quite impressive: the mean is down to 3.3ms, the median stays at 1ms and the request rate is at 297rps! That is roughly 10x the throughput. Once the cache is initialized and our benchmark no longer includes the cache seeding time, we get even higher performance, which at this point is likely limited only by the cache lookups. My local testing gets me up to 1100rps with median and mean both less than 1ms. While this is a simple example for demonstration, it is important to note that part of what we are seeing is a misleading flaw in how most load driving tools generate requests and record latencies. This is known as the coordinated-omission problem, but that is a topic for another day.
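One way to keep the cache seeding time mentioned above out of a benchmark is to fill the cache before the server starts accepting requests. Below is a minimal sketch of that idea, assuming the same naive F and the same 1 to 30 range as the example server; the helper name warm_cache is hypothetical. Computing all 30 values the slow way takes a moment at startup.

```python
def F(n):
    """Naive recursive Fibonacci, as in the example server."""
    if n == 0:
        return 0
    elif n == 1:
        return 1
    return F(n - 1) + F(n - 2)


def warm_cache(low=1, high=30):
    """Compute every value the server can be asked for, up front."""
    return {n: F(n) for n in range(low, high + 1)}


# In the server this would replace the empty `cache = {}` line, so that
# every request during the benchmark is a cache hit.
cache = warm_cache()
print(len(cache), cache[30])  # 30 832040
```

Whether to benchmark with a warm or a cold cache depends on the question being asked: a warm cache shows steady-state behavior, while a cold cache shows what the first visitors after a restart experience.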
This concludes our short introduction to performance testing. More articles will follow soon, addressing more complex setups, benchmarking methods and types, metrics to evaluate and considerations for repeatability.