Archive for May, 2009

Combined Web Site Logging splitter for AWStats

Friday, May 29th, 2009

AWStats has an interesting problem when working with combined logging. With 500 domains and combined logfiles of roughly 2 gigabytes a day, AWStats spends a lot of time reading through the entire logfile for every domain to produce its results. The simple solution appeared to be a small Python script that reads the AWStats config directory and splits the combined logfile into one logfile per domain, so that AWStats can run on the individual logfiles. This takes a single pass through the combined logfile to create all of the per-domain logfiles, rather than a pass through the 2 gigabyte logfile for each domain, as happens when AWStats is set up with combined logging.

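#!/usr/bin/python

import os
import re

# Collect the configured domains from files named awstats.<domain>.conf.
# The value becomes an open file handle the first time a domain is seen.
domainlist = {}
for name in os.listdir('/etc/awstats'):
  if re.search(r'\.conf$', name):
    dom = re.sub(r'^awstats\.', '', name)
    dom = re.sub(r'\.conf$', '', dom)
    domainlist[dom] = None

# Single pass over the combined log. Each line is expected to start with
# the domain, followed by whitespace and the rest of the log entry.
loglist = open('/var/log/apache2/combined-access.log.1', 'r')
for line in loglist:
  parts = line.split(None, 1)
  if len(parts) < 2:
    continue  # skip blank or malformed lines
  (domain, logline) = parts
  if domain in domainlist:
    if domainlist[domain] is None:
      domainlist[domain] = open('/var/log/apache2/' + domain + '-access.log.1', 'w')
    domainlist[domain].write(logline)
loglist.close()

# Close the per-domain logfiles so everything is flushed to disk.
for handle in domainlist.values():
  if handle is not None:
    handle.close()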

While the code isn’t particularly earth-shattering, it cut log processing down to roughly 20 minutes per day from the previous 16-30 hours per day.
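To put rough numbers on why the single pass helps, here is a back-of-the-envelope sketch using the figures from this post (500 domains, about 2 gigabytes of combined log per day). The function name is made up for illustration, and it only counts log data read; it ignores parsing cost, disk caching and the time spent writing the per-domain files, and it assumes the per-domain files together are no larger than the combined log.

```python
def daily_read_gb(log_size_gb, domain_count, split_first):
    """Approximate gigabytes of log data read per day (illustrative only)."""
    if split_first:
        # One pass to split the combined log, then awstats reads each
        # per-domain file once; together those files are assumed to be
        # no larger than the combined log itself.
        return 2 * log_size_gb
    # Combined logging: every domain's run scans the whole combined log.
    return log_size_gb * domain_count


print(daily_read_gb(2, 500, split_first=False), daily_read_gb(2, 500, split_first=True))
# prints: 1000 4
```

Roughly 1000 gigabytes of reads against roughly 4 is the same order of difference as the drop in processing time, although the real runtime depends on more than bytes read.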

Varnish proves itself against a DDOS

Saturday, May 2nd, 2009

I’ve worked a lot with Varnish over the last few weeks, and we’ve had a rather persistent hacker who has been sending a small but annoying DDOS at a client on one of our machines. Usually we isolate the client and move their affected sites to a machine where the attack won’t affect other clients. Then we can modify firewall rules, find the issue, wait for the attack to end and move them back. This usually causes a bit of turmoil because not every client is easy to shuffle around: some have multiple databases, and sometimes the application they are running needs more horsepower because of the attack.

In this case, the application wasn’t too badly written and it was just a matter of firewalling certain types of packets and modifying the TCP settings to allow things to time out a bit more quickly while the attack persisted. In order to do this seamlessly, we had to move the physical IP that the client was using to another machine running Varnish.

What we ended up with was Varnish running on a machine where we could freely firewall packets and turn on more verbose packet logging, and which pulled the requests from the original machine. Apart from moving the IP address and making config changes on the existing machine, it was straightforward:

Original Machine
* changed apache config to listen to a different IP address on port 81
* modified the firewall to allow port 81
* adjusted the apache config to listen to port 81 on that IP address
* shut down the virtual ethernet interface
* restarted apache

Varnish Machine
* set up the backend to request files from port 81 on the new IP assigned from the old machine
* copied the firewall rules from the Original Machine to the Varnish Machine
* brought up the IP from the original machine
* restarted varnish

Finally, we cleared the ARP cache in the switches that both machines were connected to.
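Before bringing up the IP on the Varnish Machine, it is worth confirming that apache really is answering on port 81 at its new address. The following is only a minimal sketch of such a check, not something from the original move: the function name is invented for illustration, and the address in the usage note is a documentation placeholder rather than a real host.

```python
import socket


def backend_is_reachable(host, port=81, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout or no route
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling backend_is_reachable("192.0.2.10") from the Varnish Machine, with the placeholder replaced by the new backend address, would show whether the firewall change and the apache listen change both took effect. It only tests that the port accepts connections, not that the right site is served.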

Within seconds, the load on the Original Machine dropped to half of what it was before. Varnish had been running on that machine, but the DDOS was still hitting the firewall rules and causing apache to open connections. Moving both of those pieces of the equation off the machine resulted in an immediate improvement on the Original Machine. The script is still using the same CPU horsepower, since Varnish passes those requests through and we’ve only stopped serving some of the static files from that machine, so I believe we can safely conclude that it wasn’t the application that had the problems. Apache has roughly the same number of processes as it had when we were running Varnish on that machine, so the load reduction appears to be mostly related to the firewall rules or the traffic that was still coming through.

Since moving the traffic over to the other machine, we see the same issues being exhibited there. That machine isn’t doing anything but caching the apache responses, so we can reasonably assume that the firewall is adding quite a bit of overhead. The inbound traffic on the Original Machine was cut almost in half, with a corresponding jump on the Varnish Machine. Because Varnish is dealing with inbound traffic both from the Original Machine and from the DDOS attack, it is difficult to say with certainty how much of the inbound traffic on that machine is attack traffic. However, based on the 90% cache hit rate and the size of the cached pages, legitimate traffic alone should not produce the inbound traffic we see there, so it is evident that the DDOS traffic moved.
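The reasoning about the cache hit rate can be sketched as a quick estimate. Only the 90% hit rate comes from the measurements above; the request rate and average response size below are invented placeholders, and the function name is hypothetical. The point is that only cache misses should generate inbound traffic from the backend, so anything far above this estimate is likely coming from somewhere else.

```python
def expected_backend_inbound(requests_per_sec, hit_rate, avg_response_bytes):
    """Bytes per second Varnish should need to fetch from the backend."""
    # Only cache misses are fetched from apache on the Original Machine.
    misses_per_sec = requests_per_sec * (1.0 - hit_rate)
    return misses_per_sec * avg_response_bytes


# Placeholder figures: 200 requests/s and 20000-byte responses are invented;
# only the 0.90 hit rate is taken from the post.
estimate = expected_backend_inbound(200, 0.90, 20000)
print("expected backend inbound: about %.0f KB/s" % (estimate / 1000.0))
```

Comparing an estimate like this with the interface counters on the Varnish Machine gives a rough idea of how much of the inbound traffic is not explained by cache misses.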

After moving one set of sites and analyzing the Original Machine, it does appear that a second set of the client’s sites is also impacted.
