Collecting Web Server Metrics

Metrics are incredibly important to collect and understand. Not only will they be there to support you when “shit hits the fan”, but they also help when you want to improve the resiliency of your service. They play an important role in capacity planning, trending, continuous integration/continuous deployment and performance tuning. Etsy talks more about the importance of collecting metrics in their post “Measure Anything, Measure Everything”.

In this post we are going to examine how to collect metrics for two popular web servers, Nginx and Apache.

Let’s get started…

Nginx

Installation

The ‘HttpStubStatusModule’ is not compiled into Nginx by default. To check whether your build includes it, run:

[root@ip- ~]# /usr/sbin/nginx -V
nginx version: nginx/1.2.5
configure arguments:  --with-http_stub_status_module

Here we can see that the version of Nginx I’m using in AWS, from the ‘amzn-updates’ repository, comes with it compiled in. If you don’t see the flag in the output above and are compiling from source, make sure you pass the following argument to configure: --with-http_stub_status_module
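If you want to script that check across several hosts, here is a minimal sketch in Python. The function names are made up for illustration. The one detail worth knowing is that nginx -V writes to stderr rather than stdout, so both streams are captured.

```python
import subprocess

STUB_STATUS_FLAG = "--with-http_stub_status_module"


def has_stub_status(version_output: str) -> bool:
    """Return True if the 'configure arguments' line lists the stub status flag."""
    for line in version_output.splitlines():
        if line.startswith("configure arguments:"):
            return STUB_STATUS_FLAG in line.split()
    return False


def nginx_version_output(binary: str = "/usr/sbin/nginx") -> str:
    """Run 'nginx -V' and return everything it printed (hypothetical helper)."""
    # nginx -V prints to stderr, so concatenate both streams.
    result = subprocess.run([binary, "-V"], capture_output=True, text=True)
    return result.stdout + result.stderr
```

On a host with Nginx installed, has_stub_status(nginx_version_output()) should tell you whether the module is available.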

More information about the module can be found in the Nginx documentation for the stub status module.

Configuration

The configuration for the Nginx status module is simple. Add the location block below to a server block, replacing “YOUR_IP” with the IP address of your computer. This allows connections to the status page only from localhost and from your computer. Then reload Nginx with /etc/init.d/nginx reload

server {
    server_name www.drewl.org;
    ...

    location /nginx_status {
        stub_status on;
        access_log off;
        allow 127.0.0.1;
        allow YOUR_IP;
        deny all;
    }
}

Results

With either curl or a web browser, go to ‘http://YOUR_SITE.com/nginx_status’

curl http://YOUR_SITE.com/nginx_status

Active connections: 59 
server accepts handled requests
 11939 11939 17784 
Reading: 0 Writing: 1 Waiting: 58

The output can be interpreted as…

active (59)
- currently open client connections, including those in the waiting state
accepts (11939)
- running count of accepted client connections
handled (11939)
- running count of handled connections
- should be the same as accepts unless resource limits were reached
requests (17784)
- running count of client requests
- will be higher than “handled” if keepalive is enabled, since one connection can carry several requests
  - requests per connection = requests / handled
    - 17784/11939 ≈ 1.5 requests per connection
reading (0)
- connections where Nginx is reading the request headers
- expected to be near 0, especially if request headers can fit into one packet
- might not be 0 when large cookies are being sent
writing (1)
- connections where Nginx is reading the request body, processing the request, or writing the response to the client
waiting (58)
- idle keepalive connections waiting for the next request
- waiting = active - (reading + writing)
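To make the arithmetic above concrete, here is a small sketch that parses the status page and derives requests per connection. It runs against the sample output shown earlier. The function name and dictionary keys are my own choices, not anything Nginx defines.

```python
import re

SAMPLE = """Active connections: 59 
server accepts handled requests
 11939 11939 17784 
Reading: 0 Writing: 1 Waiting: 58
"""


def parse_stub_status(text):
    """Parse the stub status page into a dict of integers."""
    active = re.search(r"Active connections:\s+(\d+)", text)
    # The three running counters sit on a line of their own.
    counters = re.search(
        r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE
    )
    states = re.search(
        r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text
    )
    if not (active and counters and states):
        raise ValueError("unexpected stub status format")
    accepts, handled, requests = map(int, counters.groups())
    reading, writing, waiting = map(int, states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


stats = parse_stub_status(SAMPLE)
requests_per_connection = stats["requests"] / stats["handled"]
print(round(requests_per_connection, 1))
```

For the sample above, the waiting count also equals active minus reading and writing, which is a handy sanity check on the parse.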

Apache

Installation

The mod_status module usually ships by default when you install Apache from your distribution’s package manager, but let’s check…

[root@graphite ~]# httpd -V
Server version: Apache/2.2.15 (Unix)
Server built:   Aug 13 2013 17:29:28

[root@graphite ~]# httpd -M | grep status
[Wed Dec 18 00:01:27 2013] [warn] module wsgi_module is already loaded, skipping
 status_module (shared)

If mod_status is not available and you are compiling Apache from source, ensure the --enable-status argument is passed to configure.

Assuming the module is available let’s get started. Before copying and pasting this, review your httpd.conf, as it might already have these options enabled. If not, let’s add the Location directive below.

Replace “YOUR_IP” with the IP address of your computer. Once completed, restart Apache, but before restarting it’s always a good idea to test the config (/etc/init.d/httpd configtest). For security reasons, this Location directive can be moved into a VirtualHost instead of the global httpd.conf if desired.

vi /etc/httpd/conf/httpd.conf

<IfModule !status_module>
    LoadModule status_module modules/mod_status.so
</IfModule>

<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
    Allow from YOUR_IP
</Location>

Results

With a web browser go to “http://YOUR_SITE.com/server-status?refresh=5”. This will refresh the status page every 5 seconds.

[Image: Apache status page]

This shows…

Number of workers serving requests
Number of idle workers
Server uptime
The scoreboard (more on this below)

With curl, add ‘?auto’ to the end of the URI to get a machine-friendly view of the information we saw before. Quote the URL so the shell does not try to interpret the ‘?’.

curl 'http://YOUR_SITE.com/server-status?auto'

BusyWorkers: 1
IdleWorkers: 74
Scoreboard: W________________________..RRR....

ExtendedStatus

The ExtendedStatus directive shows even more information about the requests being processed. But be cautious: according to the docs it can slow the web server down. By how much, I don’t know yet.

In addition to what we saw above we also get…

Total Accesses: Total number of requests
Total kBytes: Total number of kilobytes served
CPULoad: CPU Usage of Apache
ReqPerSec: Number of requests per second
BytesPerSec: Number of bytes per second
BytesPerReq: Average number of bytes per request

Automatically with a curl

curl 'http://YOUR_SITE.com/server-status?auto'

Total Accesses: 28
Total kBytes: 23
CPULoad: .166667
Uptime: 18
ReqPerSec: 1.55556
BytesPerSec: 1308.44
BytesPerReq: 841.143
BusyWorkers: 2
IdleWorkers: 73
Scoreboard: W________________________..RRR....
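The ‘?auto’ output is simple “Key: value” text, so it is easy to parse. The sketch below (the function name is my own) reads the sample above into a dict and recomputes the three derived values. The sample numbers are consistent with them being lifetime averages: totals divided by uptime, with Total kBytes converted to bytes first.

```python
SAMPLE = """Total Accesses: 28
Total kBytes: 23
CPULoad: .166667
Uptime: 18
ReqPerSec: 1.55556
BytesPerSec: 1308.44
BytesPerReq: 841.143
BusyWorkers: 2
IdleWorkers: 73
Scoreboard: W________________________..RRR....
"""


def parse_auto_status(text):
    """Parse 'Key: value' lines; numeric values become floats."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep:
            continue  # skip blank or unexpected lines
        value = value.strip()
        if key == "Scoreboard":
            status[key] = value  # keep the scoreboard as a string
            continue
        try:
            status[key] = float(value)
        except ValueError:
            status[key] = value
    return status


status = parse_auto_status(SAMPLE)

# Recompute the derived values from the raw totals.
total_bytes = status["Total kBytes"] * 1024
req_per_sec = status["Total Accesses"] / status["Uptime"]
bytes_per_sec = total_bytes / status["Uptime"]
bytes_per_req = total_bytes / status["Total Accesses"]
```

Since these are averages over the whole uptime, they move slowly on a long-running server. For a current rate you would compare two samples of Total Accesses taken some seconds apart.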

Manually via a web browser

[Image: Apache extended status page]

This can be helpful if you need to trace down user requests that are causing Apache to use a lot of CPU resources.

Srv    Child Server number - generation
PID    OS process ID
Acc    Number of accesses this connection / this child / this slot
M      Mode of operation
CPU    CPU usage, number of seconds
SS     Seconds since beginning of most recent request
Req    Milliseconds required to process most recent request
Conn   Kilobytes transferred this connection
Child  Megabytes transferred this child
Slot   Total megabytes transferred this slot

Scoreboard

The scoreboard shown above is what’s used by programs such as Monit to gather metrics to determine the health of Apache.

The scoreboard can be decoded as…

‘_’ Waiting for a connection
‘S’ Starting up
‘R’ Reading request
‘W’ Sending reply
‘K’ Keepalive (read)
‘D’ DNS lookup
‘C’ Closing connection
‘L’ Logging
‘G’ Gracefully finishing
‘I’ Idle cleanup of worker
‘.’ Open slot with no current process
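As a rough sketch of how a monitoring script might use this legend, the snippet below tallies a scoreboard string by state. The names are made up for illustration, and the sample string is the one from the ‘?auto’ output above.

```python
from collections import Counter

# Legend copied from the list above.
SCOREBOARD_KEYS = {
    "_": "Waiting for a connection",
    "S": "Starting up",
    "R": "Reading request",
    "W": "Sending reply",
    "K": "Keepalive (read)",
    "D": "DNS lookup",
    "C": "Closing connection",
    "L": "Logging",
    "G": "Gracefully finishing",
    "I": "Idle cleanup of worker",
    ".": "Open slot with no current process",
}


def tally_scoreboard(scoreboard):
    """Count how many slots are in each state, keyed by the raw character."""
    return Counter(scoreboard.strip())


counts = tally_scoreboard("W________________________..RRR....")
for key, count in counts.most_common():
    print(f"{count:3d}  {key}  {SCOREBOARD_KEYS.get(key, 'Unknown state')}")
```

Keeping the counts keyed by the raw character makes it easy to send each one on as its own metric.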

Things to look for

A high number of ‘K’ keys could mean you have your KeepAliveTimeout set too high
A high number of ‘L’ keys could be a sign of IO bottlenecks
A low number of ‘.’ keys could be a sign you need to increase ServerLimit or MaxClients
A low number of ‘idle workers’ could be a sign you need to increase MinSpareThreads (or MinSpareServers if you run the prefork MPM)

Collecting

Now that we have both web servers configured so they can be queried for metrics, it’s time to start collecting.

There are a couple of tools at our disposal.

I’m going to be using a tool called Riemann for collecting the metrics. I’ll leave the setup, configuration and Graphite integration for another post as this one is already lengthy.

Assuming you have Riemann set up along with Riemann Tools, we are ready to start collecting the metrics.

Nginx

Let’s start collecting the Nginx metrics

riemann-nginx-status --host 162.243.251.204 -u http://127.0.0.1/nginx_status --event-host drewl.org

Apache

and the Apache metrics

riemann-apache-status --host 162.243.251.204 -u http://127.0.0.1/server-status --event-host graphite.org
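One thing to keep in mind when graphing these: values such as Nginx’s “requests” or Apache’s “Total Accesses” are running totals, so the raw number only ever climbs until the server restarts. To get a per-second rate you need two samples. The sketch below is a generic illustration of that idea, not a description of what the Riemann tools do internally. The function name and the second sample value are made up.

```python
def per_second_rate(previous, current, interval_seconds):
    """Turn two samples of a running counter into a per-second rate.

    Returns None when the counter went backwards, which usually means
    the server was restarted between the two samples.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = current - previous
    if delta < 0:
        return None
    return delta / interval_seconds


# Hypothetical example: two polls of the Nginx "requests" counter, 10 seconds apart.
rate = per_second_rate(17784, 17934, 10)
print(rate)
```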


And that’s all, folks! Hopefully this will prove useful in helping you start collecting and measuring all the things!

More Areas to Explore

Collect the total number of responses with status codes 1XX, 2XX, 3XX, 4XX and 5XX
SSL Metrics
