Simple Squid access log reporting.
Squid is one of the biggest and most used proxies on the internet. Generating reports from its access logs is a solved problem: there are many commercial and open source apps that support the Squid log format. But I found myself in a situation where I wanted stats but didn’t want to install a web server on my proxy, or use syslog to push my logs to a centralised server running such software, and I also wasn’t in a position to go and buy one of those off-the-shelf whiz-bang Squid reporting and graphing tools.
As a Linux geek I surfed the web to see what others had done. I came across a list provided by the Squid website. Following a couple of links, I found an awk script called ‘proxy_stats.gawk’ written by Richard Huveneers.
I downloaded it and tried it out… unfortunately it didn’t work. Looking at the code, which he had nicely commented, showed that it was set up for access logs from version 1.* of Squid. The access log format in Squid 2.6+ hasn’t changed much from version 1.1; all they have really done is add a “content type” entry at the end of each line.
So, as a good Linux geek does, I upgraded the script. My changes include:
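To make the field layout concrete, here is a small sketch that splits one native-format line into its ten whitespace-separated fields. The sample line is made up for illustration, and the labels follow my reading of Squid's native log format, not anything taken from proxy_stats.gawk itself.

```shell
#!/bin/bash
# A made-up access log line in Squid's native format (illustrative only).
sample='1328000000.123 45 10.1.10.217 TCP_MISS/200 1234 GET http://example.com/ - DIRECT/93.184.216.34 text/html'

describe_line() {
  # Print the interesting fields of one native-format line with labels.
  echo "$1" | awk '{
    split($4, code, "/")   # e.g. TCP_MISS/200 -> cache result and HTTP status
    printf "time=%s elapsed_ms=%s client=%s result=%s status=%s bytes=%s method=%s ident=%s type=%s fields=%d\n",
           $1, $2, $3, code[1], code[2], $5, $6, $8, $NF, NF
  }'
}

describe_line "$sample"
```

The content type is simply the last field, which is why a script written for the older nine-field layout needs its field handling adjusted.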
- Support for squid 2.6+
- Removed the use of deprecated switches that are no longer supported by the sort command.
- Now that there is an actual content type “column”, use it to improve the ‘Object type report’.
- Added a users section, as this was an important report I required which was missing.
- And in a further hacked version, an auto generated size of the first “name” column.
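As a rough illustration of the last two items, the sketch below totals requests and bytes per user and sizes the name column to the longest user name it sees. It is not the code from proxy_stats.gawk; the function name user_report is hypothetical, and it assumes the native log format, where the ident (user) is field 8 and the byte count is field 5.

```shell
#!/bin/bash
# Sketch: per-user request and byte totals, with the name column sized to fit.
# Assumes Squid native log format: ident is field 8, bytes is field 5.
user_report() {
  awk '
    { reqs[$8]++; bytes[$8] += $5
      if (length($8) > width) width = length($8) }
    END {
      if (width < 4) width = 4    # at least as wide as the "user" header
      # Build the format strings once the widest name is known.
      fmt_head = "%-" width "s %8s %12s\n"
      fmt_row  = "%-" width "s %8d %12d\n"
      printf fmt_head, "user", "reqs", "bytes"
      for (u in reqs) printf fmt_row, u, reqs[u], bytes[u]
    }' "$@"
}

# Example: zcat access.log.1.gz | user_report
```

The rows come out in whatever order awk iterates the array, so pipe the result through sort if the order matters.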
Now with the explanation out of the way, let me show you it!
For those who are new to awk, this is how I’ve been running it:
zcat <access log file> | awk -f proxy_stats.gawk > <report-filename>
NOTE: I’ve been using it for some historical analysis, so I’m running it on old rotated files, which are compressed, hence the zcat.
You can pass more than one file at a time and the order doesn’t matter, as each line of an access log contains the date in epoch time. The -f switch lets zcat pass any uncompressed file (such as the live access.log) through unchanged:
zcat -f `find /var/log/squid/ -name "access.log*"` |awk -f proxy_stats.gawk
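One thing to watch with a pattern like "access.log*" is that it also matches the live, uncompressed access.log, which plain zcat refuses to read. A minimal sketch of one way around that, assuming GNU gzip, find and xargs: with -f, zcat copies files that are not gzip-compressed straight through. The function name cat_logs is my own.

```shell
#!/bin/bash
# Sketch: stream every matching log, compressed or not, into one pipeline.
# "zcat -f" copies files that are not gzip-compressed through unchanged.
cat_logs() {
  # $1: directory to search (e.g. /var/log/squid)
  find "$1" -name 'access.log*' -print0 | xargs -0 -r zcat -f
}

# Example: cat_logs /var/log/squid | awk -f proxy_stats.gawk > report.txt
```

Using -print0 with xargs -0 also keeps file names with spaces intact, which the backtick form does not.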
The script produces an ASCII report (see the end of this post for an example), which could be generated and emailed via cron. If you want it to look nice in an HTML-capable email client, I suggest wrapping it in <pre> tags:
<html>
<head><title>Report Title</title></head>
<body>
<h1>Report title</h1>
<pre>
... Report goes here ...
</pre>
</body>
</html>
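If you would rather not paste the wrapper by hand, a small shell function can do it. This is only a sketch: the name wrap_report_html is made up, the title is inserted as given (not escaped), and the sending step is left out because the mail command differs between systems.

```shell
#!/bin/bash
# Sketch: wrap a plain-text report read from stdin in a minimal HTML page.
wrap_report_html() {
  local title="${1:-Squid report}"
  printf '<html>\n<head><title>%s</title></head>\n<body>\n<pre>\n' "$title"
  # Escape the characters that HTML would otherwise interpret.
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
  printf '</pre>\n</body>\n</html>\n'
}

# Example: awk -f proxy_stats.gawk access.log | wrap_report_html "Daily report" > report.html
```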
For experienced Linux sysadmins, cron plus ‘find -mtime’ is a very simple way to get an automated daily, weekly or even monthly report.
But like I said earlier, I was working on historic data: hundreds of files in a single report, hundreds because for business reasons we have been rotating the Squid logs every hour. So I did what I do best and wrote a quick bash script to find all the files I needed to cat into the report:
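As a sketch of the cron plus find idea, the function below lists the logs modified within the last N days. The function name, paths and schedule are placeholders of mine. Note that -mtime works on the file's modification time, which for a rotated log is roughly the time of its last entry.

```shell
#!/bin/bash
# Sketch: list logs modified within the last N days (default 1).
recent_logs() {
  local dir="$1" days="${2:-1}"
  find "$dir" -name 'access.log*' -mtime "-$days"
}

# A hypothetical crontab entry for a daily report at 06:00 might look like:
# 0 6 * * * zcat -f $(find /var/log/squid -name 'access.log*' -mtime -1) | awk -f /path/to/proxy_stats.gawk > /tmp/squid-daily.txt
```

For a weekly report, pass 7 instead of 1 and change the schedule to suit.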
#!/bin/bash
ACCESS_LOG_DIR="/var/log/squid/access.log*"
MONTH="$1"

function getFirstLine() {
  if [ -n "`echo $1 |grep "gz$"`" ]
  then
    zcat $1 |head -n 1
  else
    head -n 1 $1
  fi
}

function getLastLine() {
  if [ -n "`echo $1 |grep "gz$"`" ]
  then
    zcat $1 |tail -n 1
  else
    tail -n 1 $1
  fi
}

for log in `ls $ACCESS_LOG_DIR`
do
  firstLine="`getFirstLine $log`"
  epochStr="`echo $firstLine |awk '{print $1}'`"
  month=`date -d @$epochStr +%m`
  if [ "$month" -eq "$MONTH" ]
  then
    echo $log
    continue
  fi

  #Check the last line
  lastLine="`getLastLine $log`"
  epochStr="`echo $lastLine |awk '{print $1}'`"
  month=`date -d @$epochStr +%m`
  if [ "$month" -eq "$MONTH" ]
  then
    echo $log
  fi
done
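One limitation of matching on the month number alone is that February 2011 and February 2012 look the same. If your archive spans more than a year, a variation like the sketch below compares year and month together. It is a hedged alternative, not the script above: the function name log_in_month is made up, and it assumes GNU date and that every line starts with an epoch timestamp.

```shell
#!/bin/bash
# Sketch: succeed if the first or last entry of a log falls in the given YYYY-MM.
# Assumes each line starts with an epoch timestamp (Squid native format)
# and GNU date (for "-d @epoch").
log_in_month() {
  local log="$1" wanted="$2" line epoch
  for line in "$(zcat -f "$log" | head -n 1)" "$(zcat -f "$log" | tail -n 1)"; do
    epoch="${line%% *}"      # first field, e.g. 1328000000.123
    epoch="${epoch%%.*}"     # drop the fractional part
    [ -n "$epoch" ] || continue
    [ "$(date -d "@$epoch" +%Y-%m)" = "$wanted" ] && return 0
  done
  return 1
}

# Example: for f in /var/log/squid/access.log*; do log_in_month "$f" 2012-02 && echo "$f"; done
```

Like the original, it only looks at the first and last lines, so a single file that starts before the month and ends after it would be missed. With hourly rotation that should not happen.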
So there you go: thanks to the work of Richard Huveneers, there is a script that I think generates a pretty good ASCII report, which can be automated or integrated easily into any Linux/Unix workflow.
If you are interested in getting hold of the most up-to-date version of the script, you can get it from my sysadmin GitHub repo here.
As promised earlier here is an example report:
Parsed lines : 32960
Bad lines    : 0
First request : Mon 30 Jan 2012 12:06:43 EST
Last request : Thu 09 Feb 2012 09:05:01 EST
Number of days: 9.9

Top 10 sites by xfers          reqs  %all %xfers  %hit        MB  %all  %hit    kB/xf     kB/s
------------------------- ------------------------------- ------------------------ -------------------
213.174.155.216                  20  0.1% 100.0%  0.0%       0.0  0.0%  0.0%      1.7      2.5
30.media.tumblr.com               1  0.0% 100.0%  0.0%       0.0  0.0%  0.0%     48.3     77.4
28.media.tumblr.com               1  0.0% 100.0%  0.0%       0.1  0.0%  0.0%     87.1      1.4
26.media.tumblr.com               1  0.0%  0.0%     -       0.0  0.0%     -        -        -
25.media.tumblr.com               2  0.0% 100.0%  0.0%       0.1  0.0%  0.0%     49.2     47.0
24.media.tumblr.com               1  0.0% 100.0%  0.0%       0.1  0.0%  0.0%    106.4    181.0
10.1.10.217                     198  0.6% 100.0%  0.0%      16.9  0.9%  0.0%     87.2   3332.8
3.s3.envato.com                  11  0.0% 100.0%  0.0%       0.1  0.0%  0.0%      7.6     18.3
2.s3.envato.com                  15  0.0% 100.0%  0.0%       0.1  0.0%  0.0%      7.5     27.1
2.media.dorkly.cvcdn.com          8  0.0% 100.0% 25.0%       3.2  0.2%  0.3%    414.1    120.5

Top 10 sites by MB             reqs  %all %xfers  %hit        MB  %all  %hit    kB/xf     kB/s
------------------------- ------------------------------- ------------------------ -------------------
zulu.tweetmeme.com                2  0.0% 100.0% 100.0%       0.0  0.0% 100.0%      3.1    289.6
ubuntu.unix.com                   8  0.0% 100.0% 100.0%       0.1  0.0% 100.0%      7.5    320.0
static02.linkedin.com             1  0.0% 100.0% 100.0%       0.0  0.0% 100.0%     36.0    901.0
solaris.unix.com                  2  0.0% 100.0% 100.0%       0.0  0.0% 100.0%      3.8    223.6
platform.tumblr.com               2  0.0% 100.0% 100.0%       0.0  0.0% 100.0%      1.1    441.4
i.techrepublic.com.com            5  0.0% 60.0% 100.0%       0.0  0.0% 100.0%      6.8   2539.3
i4.zdnetstatic.com                2  0.0% 100.0% 100.0%       0.0  0.0% 100.0%     15.3    886.4
i4.spstatic.com                   1  0.0% 100.0% 100.0%       0.0  0.0% 100.0%      4.7    520.2
i2.zdnetstatic.com                2  0.0% 100.0% 100.0%       0.0  0.0% 100.0%      7.8   2920.9
i2.trstatic.com                   9  0.0% 100.0% 100.0%       0.0  0.0% 100.0%      1.5    794.5

Top 10 neighbor report         reqs  %all %xfers  %hit        MB  %all  %hit    kB/xf     kB/s
------------------------- ------------------------------- ------------------------ -------------------
www.viddler.com                   4  0.0% 100.0%  0.0%       0.0  0.0%     -      0.0      0.0
www.turktrust.com.tr             16  0.0% 100.0%  0.0%       0.0  0.0%     -      0.0      0.0
www.trendmicro.com                5  0.0% 100.0%  0.0%       0.0  0.0%     -      0.0      0.0
www.reddit.com                    2  0.0% 100.0%  0.0%       0.0  0.0%     -      0.0      0.0
www.linkedin.com                  2  0.0% 100.0%  0.0%       0.0  0.0%     -      0.0      0.0
www.google-analytics.com          2  0.0% 100.0%  0.0%       0.0  0.0%     -      0.0      0.0
www.facebook.com                  2  0.0% 100.0%  0.0%       0.0  0.0%     -      0.0      0.0
www.dynamicdrive.com              1  0.0% 100.0%  0.0%       0.0  0.0%     -      0.0      0.0
www.benq.com.au                   1  0.0% 100.0%  0.0%       0.0  0.0%     -      0.0      0.0
wd-edge.sharethis.com             1  0.0% 100.0%  0.0%       0.0  0.0%     -      0.0      0.0

Local code                     reqs  %all %xfers  %hit        MB  %all  %hit    kB/xf     kB/s
------------------------- ------------------------------- ------------------------ -------------------
TCP_CLIENT_REFRESH_MISS        2160  6.6% 100.0%  0.0%       7.2  0.4%  0.0%      3.4     12.9
TCP_HIT                         256  0.8% 100.0% 83.2%      14.0  0.8% 100.0%     56.0   1289.3
TCP_IMS_HIT                     467  1.4% 100.0% 100.0%      16.9  0.9% 100.0%     37.2   1747.4
TCP_MEM_HIT                     426  1.3% 100.0% 100.0%      96.5  5.3% 100.0%    232.0   3680.9
TCP_MISS                      27745 84.2% 97.4%  0.0%    1561.7 85.7%  0.3%     59.2     18.2
TCP_REFRESH_FAIL                 16  0.0% 100.0%  0.0%       0.2  0.0%  0.0%     10.7      0.1
TCP_REFRESH_MODIFIED            477  1.4% 99.8%  0.0%      35.0  1.9%  0.0%     75.3   1399.4
TCP_REFRESH_UNMODIFIED         1413  4.3% 100.0%  0.0%      91.0  5.0%  0.0%     66.0    183.5

Status code                    reqs  %all %xfers  %hit        MB  %all  %hit    kB/xf     kB/s
------------------------- ------------------------------- ------------------------ -------------------
000                             620  1.9% 100.0%  0.0%       0.0  0.0%     -      0.0      0.0
200                           29409 89.2% 100.0%  2.9%    1709.7 93.8%  7.7%     59.5    137.1
204                             407  1.2% 100.0%  0.0%       0.2  0.0%  0.0%      0.4      1.4
206                             489  1.5% 100.0%  0.0%     112.1  6.1%  0.0%    234.7    193.0
301                              82  0.2% 100.0%  0.0%       0.1  0.0%  0.0%      0.7      1.5
302                             356  1.1% 100.0%  0.0%       0.3  0.0%  0.0%      0.8      2.7
303                               5  0.0% 100.0%  0.0%       0.0  0.0%  0.0%      0.7      1.5
304                             862  2.6% 100.0% 31.2%       0.4  0.0% 30.9%      0.4     34.2
400                               1  0.0%  0.0%     -       0.0  0.0%     -        -        -
401                               1  0.0%  0.0%     -       0.0  0.0%     -        -        -
403                              47  0.1%  0.0%     -       0.0  0.0%     -        -        -
404                             273  0.8%  0.0%     -       0.0  0.0%     -        -        -
500                               2  0.0%  0.0%     -       0.0  0.0%     -        -        -
502                              12  0.0%  0.0%     -       0.0  0.0%     -        -        -
503                              50  0.2%  0.0%     -       0.0  0.0%     -        -        -
504                             344  1.0%  0.0%     -       0.0  0.0%     -        -        -

Hierarchie code                reqs  %all %xfers  %hit        MB  %all  %hit    kB/xf     kB/s
------------------------- ------------------------------- ------------------------ -------------------
DIRECT                        31843 96.6% 97.7%  0.0%    1691.0 92.8%  0.0%     55.7     44.3
NONE                           1117  3.4% 100.0% 100.0%     131.6  7.2% 100.0%    120.7   2488.2

Method report                  reqs  %all %xfers  %hit        MB  %all  %hit    kB/xf     kB/s
------------------------- ------------------------------- ------------------------ -------------------
CONNECT                        5485 16.6% 99.2%  0.0%     132.8  7.3%  0.0%     25.0      0.3
GET                           23190 70.4% 97.7%  4.9%    1686.3 92.5%  7.8%     76.2    183.2
HEAD                           2130  6.5% 93.7%  0.0%       0.7  0.0%  0.0%      0.3      1.1
POST                           2155  6.5% 99.4%  0.0%       2.9  0.2%  0.0%      1.4      2.0

Object type report             reqs  %all %xfers  %hit        MB  %all  %hit    kB/xf     kB/s
------------------------- ------------------------------- ------------------------ -------------------
*/*                               1  0.0% 100.0%  0.0%       0.0  0.0%  0.0%      1.6      3.2
application/cache-digest        396  1.2% 100.0% 50.0%      33.7  1.8% 50.0%     87.1   3655.1
application/gzip                  1  0.0% 100.0%  0.0%       0.1  0.0%  0.0%     61.0     30.8
application/javascript          227  0.7% 100.0% 12.3%       2.2  0.1%  7.7%      9.9     91.9
application/json                409  1.2% 100.0%  0.0%       1.6  0.1%  0.0%      4.1      6.0
application/ocsp-response       105  0.3% 100.0%  0.0%       0.2  0.0%  0.0%      1.9      2.0
application/octet-stream        353  1.1% 100.0%  6.8%      81.4  4.5%  9.3%    236.1    406.9
application/pdf                   5  0.0% 100.0%  0.0%      13.5  0.7%  0.0%   2763.3     75.9
application/pkix-crl             96  0.3% 100.0% 13.5%       1.0  0.1%  1.7%     10.6      7.0
application/vnd.google.sa      1146  3.5% 100.0%  0.0%       1.3  0.1%  0.0%      1.1      2.4
application/vnd.google.sa      4733 14.4% 100.0%  0.0%      18.8  1.0%  0.0%      4.1     13.4
application/x-bzip2              19  0.1% 100.0%  0.0%      78.5  4.3%  0.0%   4232.9    225.5
application/x-gzip              316  1.0% 100.0% 59.8%     133.4  7.3% 59.3%    432.4   3398.1
application/x-javascript       1036  3.1% 100.0%  5.8%       9.8  0.5%  3.4%      9.7     52.1
application/xml                  46  0.1% 100.0% 34.8%       0.2  0.0% 35.1%      3.5    219.7
application/x-msdos-progr       187  0.6% 100.0%  0.0%      24.4  1.3%  0.0%    133.7    149.6
application/x-pkcs7-crl          83  0.3% 100.0%  7.2%       1.6  0.1%  0.4%     19.8     10.8
application/x-redhat-pack        13  0.0% 100.0%  0.0%      57.6  3.2%  0.0%   4540.7    156.7
application/x-rpm               507  1.5% 100.0%  6.3%     545.7 29.9%  1.5%   1102.2    842.8
application/x-sdlc                1  0.0% 100.0%  0.0%       0.9  0.0%  0.0%    888.3    135.9
application/x-shockwave-f       109  0.3% 100.0% 11.9%       5.4  0.3% 44.5%     50.6    524.1
application/x-tar                 9  0.0% 100.0%  0.0%       1.5  0.1%  0.0%    165.3     36.4
application/x-www-form-ur        11  0.0% 100.0%  0.0%       0.1  0.0%  0.0%      9.9     15.4
application/x-xpinstall           2  0.0% 100.0%  0.0%       2.5  0.1%  0.0%   1300.6    174.7
application/zip                1802  5.5% 100.0%  0.0%     104.0  5.7%  0.0%     59.1      2.5
Archive                          89  0.3% 100.0%  0.0%       0.0  0.0%     -      0.0      0.0
audio/mpeg                        2  0.0% 100.0%  0.0%       5.8  0.3%  0.0%   2958.2     49.3
binary/octet-stream               2  0.0% 100.0%  0.0%       0.0  0.0%  0.0%      5.5     14.7
font/ttf                          2  0.0% 100.0%  0.0%       0.0  0.0%  0.0%     15.5     12.5
font/woff                         1  0.0% 100.0% 100.0%       0.0  0.0% 100.0%     42.5   3539.6
Graphics                        126  0.4% 100.0%  0.0%       0.1  0.0%  0.0%      0.6      2.5
HTML                             14  0.0% 100.0%  0.0%       0.0  0.0%  0.0%      0.1      0.1
image/bmp                         1  0.0% 100.0%  0.0%       0.0  0.0%  0.0%      1.3      3.9
image/gif                      5095 15.5% 100.0%  2.4%      35.9  2.0%  0.7%      7.2      9.5
image/jpeg                     1984  6.0% 100.0%  4.3%      52.4  2.9%  0.6%     27.0     62.9
image/png                      1684  5.1% 100.0% 10.3%      28.6  1.6%  1.9%     17.4    122.2
image/vnd.microsoft.icon         10  0.0% 100.0% 30.0%       0.0  0.0% 12.8%      1.0      3.3
image/x-icon                     72  0.2% 100.0% 16.7%       0.2  0.0%  6.0%      3.2     15.0
multipart/bag                     6  0.0% 100.0%  0.0%       0.1  0.0%  0.0%     25.2     32.9
multipart/byteranges             93  0.3% 100.0%  0.0%      16.5  0.9%  0.0%    182.0    178.4
text/cache-manifest               1  0.0% 100.0%  0.0%       0.0  0.0%  0.0%      0.7      3.1
text/css                        470  1.4% 100.0%  7.9%       3.4  0.2%  5.8%      7.4     59.7
text/html                      2308  7.0% 70.7%  0.4%       9.6  0.5%  0.6%      6.0     14.7
text/javascript                1243  3.8% 100.0%  2.7%      11.1  0.6%  5.2%      9.1     43.3
text/json                         1  0.0% 100.0%  0.0%       0.0  0.0%  0.0%      0.5      0.7
text/plain                     1445  4.4% 99.4%  1.5%      68.8  3.8%  5.5%     49.0     41.9
text/x-cross-domain-polic        24  0.1% 100.0%  0.0%       0.0  0.0%  0.0%      0.7      1.7
text/x-js                         2  0.0% 100.0%  0.0%       0.0  0.0%  0.0%     10.1      6.4
text/x-json                       9  0.0% 100.0%  0.0%       0.0  0.0%  0.0%      3.0      8.5
text/xml                        309  0.9% 100.0% 12.9%      12.9  0.7% 87.5%     42.8    672.3
unknown/unknown                6230 18.9% 99.3%  0.0%     132.9  7.3%  0.0%     22.0      0.4
video/mp4                         5  0.0% 100.0%  0.0%       3.2  0.2%  0.0%    660.8     62.7
video/x-flv                     117  0.4% 100.0%  0.0%     321.6 17.6%  0.0%   2814.9    308.3
video/x-ms-asf                    2  0.0% 100.0%  0.0%       0.0  0.0%  0.0%      1.1      4.7

Ident (User) Report            reqs  %all %xfers  %hit        MB  %all  %hit    kB/xf     kB/s
------------------------- ------------------------------- ------------------------ -------------------
-                             32960 100.0% 97.8%  3.5%    1822.6 100.0%  7.2%     57.9    129.0

Weekly report                  reqs  %all %xfers  %hit        MB  %all  %hit    kB/xf     kB/s
------------------------- ------------------------------- ------------------------ -------------------
2012/01/26                    14963 45.4% 97.6%  3.6%     959.8 52.7%  1.8%     67.3    104.5
2012/02/02                    17997 54.6% 98.0%  3.4%     862.8 47.3% 13.2%     50.1    149.4

Total report                   reqs  %all %xfers  %hit        MB  %all  %hit    kB/xf     kB/s
------------------------- ------------------------------- ------------------------ -------------------
All requests                  32960 100.0% 97.8%  3.5%    1822.6 100.0%  7.2%     57.9    129.0

Produced by : Mollie's hacked access-flow 0.5
Running time: 2 seconds
Happy squid reporting!
I tried to download your proxy_stats.gawk from GitHub, but it gives me an error:
[root@webserver logs]# zcat access.log | awk -f proxy_stats.gawk > reportawk
awk: proxy_stats.gawk:1:
awk: proxy_stats.gawk:1: ^ syntax error
gzip: access.log: not in gzip format
Where did I go wrong?
The access.log you are using is not in gzip format, so use cat instead of zcat.
Once logrotate rotates your logs they get compressed, so if the file ends in .gz then use zcat; if it just ends in .log then use cat.
So try:
cat access.log | awk -f proxy_stats.gawk > reportawk
When in doubt use the file command to see if the file is compressed:
file access.log
I hope this helps,
Matt
Nice script, very nice. It works very well on Squid 2.7. But I upgraded my Squid to version 3.1.20, and the script doesn’t work anymore.
Could you please make it compatible with this newer Squid version? I’m very pleased with it, and your effort will be VERY appreciated.
Thank you in advance.
just replace " == 9" with " == 10" for newer squid
Very very good