Mon 22 Jun 2009
If you’re like me (and goodness, who isn’t?), you probably run a few high-bandwidth sites, and you probably worry a lot about reliability, scalability, and fault tolerance. Maybe you’ve looked into hardware load balancers (like F5’s BIG-IP), only to be discouraged by the hefty price tag: 15k and up. Don’t misunderstand: if you’ve got the cash, you can’t beat the BIG-IP for reliability, features, and ease of use. But if your blog isn’t exactly paying for itself yet, you’ll need to consider other options.
A lot of people like the idea of DNS load balancing. That’s where you add several A records for your domain, each pointing to a different IP address. The DNS server then hands out the IPs in round-robin fashion. It’s a good way to spread web traffic across a group of web servers. But if a server goes down, you have to remove it from your DNS and wait for the TTL to expire before things are back up. (Of course, you could power down the machine and alias its IP address to another server on the same network.) It’s clunky, though, because the only balancing is round robin and there are no health checks.
Fortunately, you have alternatives. I’ll present one of them here: load balancing with mod_proxy, an Apache module. In the diagram at the upper right you can see my setup: two load balancers, two web servers, and two databases. On the two load balancers, I’m running Apache with mod_proxy and mod_proxy_balancer. There is a UDP heartbeat between the two. Apache is started on only one of them, and the virtual IP (VIP) is aliased to that one. If that machine goes offline, the second load balancer grabs the VIP and starts Apache.
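To make the failover timing a bit more concrete, here is a rough Python sketch of how the standby might classify its peer based on how long it has been silent. The numbers come from the ha.cf shown further down (keepalive 2, warntime 5, deadtime 10). The function names and the exact boundary handling (`>=`) are my own assumptions for illustration; this is not how heartbeat itself is implemented.

```python
# Timing values taken from the ha.cf in this post (seconds).
KEEPALIVE = 2   # interval between heartbeats
WARNTIME = 5    # silence after which a "late heartbeat" warning is logged
DEADTIME = 10   # silence after which the peer is declared dead


def peer_state(silence_seconds, warntime=WARNTIME, deadtime=DEADTIME):
    """Classify the peer by how long it has been silent (hypothetical helper)."""
    if silence_seconds >= deadtime:
        return "dead"    # the standby would now take over the VIP and start httpd
    if silence_seconds >= warntime:
        return "late"    # worth a warning, but no takeover yet
    return "alive"


def missed_beats(silence_seconds, keepalive=KEEPALIVE):
    """Number of whole heartbeat intervals that have passed in silence."""
    return int(silence_seconds // keepalive)


for silence in (1, 6, 12):
    print(silence, peer_state(silence), missed_beats(silence))
```

With these settings the standby waits roughly five missed heartbeats before it takes over, which is the trade-off between fast failover and false alarms on a busy network.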
Then there are the web servers. The active load balancer hands requests to them in round-robin order. If one of the web servers goes offline, the load balancer detects that and takes it out of the rotation.
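The behaviour described above can be sketched as a toy model in Python. This is only an illustration of round robin with members dropping out of rotation; it is not how mod_proxy_balancer is actually implemented, and the class and method names are hypothetical. The member addresses are the ones from the BalancerMember lines in the Apache config below.

```python
class RoundRobinBalancer:
    """Toy round-robin balancer that skips members marked as down."""

    def __init__(self, members):
        self.members = list(members)
        self.healthy = set(self.members)
        self._next = 0  # index of the member to try next

    def mark_down(self, member):
        """Take a member out of rotation (e.g. after a failed request)."""
        self.healthy.discard(member)

    def mark_up(self, member):
        """Put a known member back into rotation."""
        if member in self.members:
            self.healthy.add(member)

    def pick(self):
        """Return the next healthy member, or None if every member is down."""
        for _ in range(len(self.members)):
            candidate = self.members[self._next]
            self._next = (self._next + 1) % len(self.members)
            if candidate in self.healthy:
                return candidate
        return None


lb = RoundRobinBalancer(["10.3.1.31:80", "10.3.1.32:80"])
first_four = [lb.pick() for _ in range(4)]      # alternates between both members
lb.mark_down("10.3.1.31:80")
after_failure = [lb.pick() for _ in range(3)]   # only the surviving member
print(first_four)
print(after_failure)
```

The point to notice is that a failed member costs nothing after it has been marked down, which is exactly what plain DNS round robin cannot give you.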
And yes, I set it all up and it works splendidly. Yes, you can put the load balancers and web servers on two machines instead of four. Yes, you’ll need to keep sessions in MySQL to avoid losing session info during a failover. You can check which load balancer is active by watching ifconfig or ps aux | grep httpd. If things aren’t working, check /var/log/ha-log; if there are errors, it will tell you the command it’s trying to run. Run that command by hand to see the error message.
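If you would rather script the “which load balancer is active?” check than eyeball ifconfig, something along these lines might do. It is a minimal sketch: the helper names are hypothetical, the VIP is the one from haresources below, and the sample strings are made up for illustration rather than copied from a real machine.

```python
import re
import subprocess

VIP = "10.1.1.19"  # the virtual IP from haresources


def holds_vip(interface_listing, vip=VIP):
    """True if `vip` appears as a whole address in ifconfig-style output."""
    # The lookarounds stop 10.1.1.19 from matching inside 10.1.1.190 or 110.1.1.19.
    pattern = r"(?<![\d.])" + re.escape(vip) + r"(?![\d.])"
    return re.search(pattern, interface_listing) is not None


def is_active_load_balancer(vip=VIP):
    """Run ifconfig on this machine and report whether it currently holds the VIP."""
    listing = subprocess.run(
        ["ifconfig"], capture_output=True, text=True, check=True
    ).stdout
    return holds_vip(listing, vip)


# Made-up sample listings, only to show the matching logic.
sample_active = "eth0:0  inet addr:10.1.1.19  Bcast:10.1.1.255  Mask:255.255.255.0\n"
sample_standby = "eth0    inet addr:10.1.1.190  Bcast:10.1.1.255  Mask:255.255.255.0\n"
print(holds_vip(sample_active), holds_vip(sample_standby))
```

On a real box you would call `is_active_load_balancer()` on each load balancer; exactly one of them should report that it holds the VIP.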
I haven’t talked about how to scale the MySQL tier. There are a number of solutions: master/slave replication, MySQL Proxy, active/active, or MySQL Cluster. More about that later.
There are a number of conf files you’ll need to set up to get this working:
authkeys
auth 2
2 sha1 TOPSECRETWORD
ha.cf
logfile /var/log/ha-log
bcast eth0
keepalive 2
warntime 5
deadtime 10
initdead 20
udpport 694
auto_failback yes
node lb1
node lb2
uuidfrom nodename
respawn hacluster /usr/lib/heartbeat/ipfail
haresources (Must be EXACTLY the SAME ON BOTH lb1 and lb2!!!)
lb1 10.1.1.19 httpd
httpd.conf
Listen 10.1.1.19:80
...
ProxyRequests Off
<Location /balancer-manager>
    SetHandler balancer-manager
    Order deny,allow
    Allow from all
</Location>
<Proxy balancer://mycluster>
    # cluster member 1
    BalancerMember http://10.3.1.31:80 route=lb1
    # cluster member 2
    BalancerMember http://10.3.1.32:80 route=lb2
</Proxy>
ProxyPass /balancer-manager !
ProxyPass / balancer://mycluster/ lbmethod=byrequests
ProxyPassReverse / http://10.3.1.31
ProxyPassReverse / http://10.3.1.32
#ProxyTimeout 10
<VirtualHost *:80>
    ServerName intranet.example.com
    UseCanonicalName On
    DocumentRoot /var/www/html
    <Directory "/var/www/html">
        Options FollowSymLinks
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>
    RewriteEngine On
    RewriteCond %{REMOTE_ADDR} ^(127.0.0.1)
    RewriteRule ^(/server-status) $1 [H=server-status,L]
    # Proxy the rest to the load balancer
    RewriteRule ^/(.*)$ balancer://mycluster%{REQUEST_URI} [P,QSA,L]
    ErrorLog /var/log/httpd/http_log_error
    CustomLog /var/log/httpd/http_log combined
    # Deflate
    AddOutputFilterByType DEFLATE text/html text/plain text/xml application/xml application/xhtml+xml text/javascript text/css
    BrowserMatch ^Mozilla/4 gzip-only-text/html
    BrowserMatch ^Mozilla/4\.0[678] no-gzip
    BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
    SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown downgrade-1.0 force-response-1.0
</VirtualHost>
iptables
...
-A RH-Firewall-1-INPUT -m state --state NEW -m udp -p udp --dport 694 -j ACCEPT
...
ldirectord.cf
checktimeout=10
checkinterval=2
autoreload=no
logfile="/var/log/ldirectord.log"
logfile="local0"
quiescent=yes
virtual=10.1.1.19:80
    fallback=127.0.0.1:80
    real=10.1.1.21:80 masq
    real=10.1.1.22:80 masq
    service=http
    request="index.html"
    receive="Test Page"
    scheduler=rr
    protocol=tcp
    checktype=negotiate
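The negotiate check in ldirectord.cf boils down to “fetch index.html from each real server and look for Test Page in the response”. Here is a rough Python sketch of that idea using only the standard library. The function names are hypothetical, and the plain substring match is a simplification on my part; it is not a reimplementation of ldirectord.

```python
import http.client
import urllib.error
import urllib.request


def body_matches(body, receive="Test Page"):
    """True if the expected marker text appears in the response body."""
    return receive in body


def check_real_server(address, request="index.html", receive="Test Page", timeout=10):
    """Fetch http://<address>/<request> and look for `receive` in the body.

    Any connection problem, timeout, or HTTP error counts as unhealthy.
    """
    url = f"http://{address}/{request}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False
    return body_matches(body, receive)


# The real servers from ldirectord.cf; uncomment to probe them on your own network.
# for server in ("10.1.1.21:80", "10.1.1.22:80"):
#     print(server, check_real_server(server))
```

The default timeout of 10 seconds mirrors checktimeout=10; running this by hand against each real server is a quick way to see why a member was dropped from rotation.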
These reference pages are very useful:
- http://www.johnandcailin.com/blog/john/scaling-drupal-step-three-using-heartbeat-implement-redundant-load-balancer
- http://www.howtoforge.com/high_availability_heartbeat_centos
- http://www.johnandcailin.com/blog/john/scaling-drupal-step-four-database-segmentation-using-mysql-proxy
- http://httpd.apache.org/docs/2.2/mod/mod_proxy.html
- http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html
Isn’t it easier to set up HAProxy? I’d guess it’s more efficient than Apache with mod_proxy. With heartbeat as well, of course.