Using HAProxy with Zope via Buildout
After my post on reducing GIL contention by using fewer Zope threads, Lee Joramo asked for more information on setting up HAProxy, so let me share my configuration. Much of the credit for this goes to Hanno Schlichting and Alex Clark, who offered me much good advice and a sample configuration, respectively.
First, a few words about what HAProxy offers. For the past couple of years I’ve been using Pound to load-balance between multiple backend Zope instances. But recently I’ve been hearing recommendations from people I trust (such as Jarn and Elizabeth Leddy) to try HAProxy instead.
HAProxy offers some nice features:
– Backend health checks
– A choice of load-balancing algorithms for distributing requests to backends
– Sticky sessions, so that an authenticated user always hits the same backend
– Warmup time (don’t send as many requests to a Zope instance while it’s starting up)
– A status page with info on backend status and uptime, # of queued requests, # of active sessions, # of errors, etc.
Some of these are possible with Pound too, but the status page was really the “killer app” for me. It’s fun to watch, and it’s also very useful for doing rolling restarts when new code needs to be deployed without an interruption in service.
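Once HAProxy is up and running with the configuration shown below, the status page is easy to check from the command line too. A quick sketch, assuming the bind address and stats URI from my configuration (127.0.0.1:8080 and /haproxy-status); recent HAProxy releases will also return the same data as CSV if you append “;csv”:

# Human-readable status page (same thing a browser sees)
curl http://127.0.0.1:8080/haproxy-status

# Machine-readable CSV, handy for monitoring scripts
curl 'http://127.0.0.1:8080/haproxy-status;csv'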
Configuration
In my buildout.cfg I added:
[buildout]
...
parts =
    ...
    haproxy-build
    haproxy-conf

[haproxy-build]
recipe = plone.recipe.haproxy
url = http://dist.plone.org/thirdparty/haproxy-1.3.22.zip

[haproxy-conf]
recipe = collective.recipe.template
input = ${buildout:directory}/haproxy.conf.in
output = ${buildout:directory}/etc/haproxy.conf
maxconn = 24000
ulimit-n = 65536
user = zope
group = staff
bind = 127.0.0.1:8080
Here we add two parts: “haproxy-build”, which uses the plone.recipe.haproxy recipe to build HAProxy from source and install a bin/haproxy script for running it; and “haproxy-conf”, which generates the HAProxy configuration file by filling in variables in a template file called haproxy.conf.in.
Be sure to set the user and group variables to the user and group you want HAProxy to run as, and update the bind variable to set the port to which HAProxy should bind.
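Once buildout has run, it’s worth sanity-checking the generated configuration before wiring it into anything else. A minimal sketch, using HAProxy’s standard -c (check configuration) and -db (stay in the foreground) flags:

bin/buildout

# Validate the generated configuration
bin/haproxy -c -f etc/haproxy.conf

# Or run HAProxy in the foreground to watch it start
bin/haproxy -f etc/haproxy.conf -db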
I run most of my Plone stack using supervisord, so I also updated my supervisord configuration in buildout to run HAProxy:
[supervisor]
recipe = collective.recipe.supervisor
...
programs =
    ...
    10 haproxy ${buildout:directory}/bin/haproxy [ -f ${buildout:directory}/etc/haproxy.conf -db ]
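This also makes the rolling restarts mentioned above straightforward: restart one Zope instance at a time via supervisorctl, and watch the status page until its backend shows as up again before moving on. A sketch, assuming the supervisor programs for the Zope instances are named zeoclient1 and zeoclient2:

bin/supervisorctl restart zeoclient1
# wait until plone0101 is marked UP on /haproxy-status, then:
bin/supervisorctl restart zeoclient2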
In a real-life deployment, you’ll probably also want a caching reverse proxy like Squid or Varnish sitting in front of HAProxy.
What about the contents of haproxy.conf.in? Here’s mine:
global
  log 127.0.0.1 local6
  maxconn ${haproxy-conf:maxconn}
  user ${haproxy-conf:user}
  group ${haproxy-conf:group}
  daemon
  nbproc 1

defaults
  mode http
  option httpclose

  # Remove requests from the queue if people press stop button
  option abortonclose

  # Try to connect this many times on failure
  retries 3

  # If a client is bound to a particular backend but it goes down,
  # send them to a different one
  option redispatch

  monitor-uri /haproxy-ping

  timeout connect 7s
  timeout queue 300s
  timeout client 300s
  timeout server 300s

  # Enable status page at this URL, on the port HAProxy is bound to
  stats enable
  stats uri /haproxy-status
  stats refresh 5s
  stats realm Haproxy\ statistics

frontend zopecluster
  bind ${haproxy-conf:bind}
  default_backend zope

# Load balancing over the zope instances
backend zope
  # Use Zope's __ac cookie as a basis for session stickiness if present.
  appsession __ac len 32 timeout 1d

  # Otherwise add a cookie called "serverid" for maintaining session stickiness.
  # This cookie lasts until the client's browser closes, and is invisible to Zope.
  cookie serverid insert nocache indirect

  # If no session found, use the roundrobin load-balancing algorithm to pick a backend.
  balance roundrobin

  # Use / (the default) for periodic backend health checks
  option httpchk

  # Server options:
  # "cookie" sets the value of the serverid cookie to be used for the server
  # "maxconn" is how many connections can be sent to the server at once
  # "check" enables health checks
  # "rise 1" means consider Zope up after 1 successful health check
  server plone0101 127.0.0.1:${zeoclient1:http-address} cookie p0101 check maxconn 2 rise 1
  server plone0102 127.0.0.1:${zeoclient2:http-address} cookie p0102 check maxconn 2 rise 1
This assumes that I have Zope instances built by parts called “zeoclient1” and “zeoclient2” in my buildout; you’ll probably need to update those names.
You may want to adjust the “option httpchk” line to use a different URL for checking whether the Zope instances are up — you want to point at something that can be rendered as quickly as possible (in my case it’s the Zope root information screen, so I’m not too worried).
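For example, to have HAProxy check a specific lightweight page instead of the site root, you could change that line to something like the following (the path here is hypothetical; point it at whatever renders fastest on your site):

option httpchk GET /Plone/some-fast-view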
The maxconn setting for each backend server should be at least the number of threads that Zope instance is running. Laurence Rowe pointed out to me that it should probably not be set to 1, since Zope also serves some things (such as blobs) via file stream iterators, which happen apart from the main ZPublisher threads. (So setting maxconn to 1 would mean that serving a large blob could block other requests to that backend, for instance.)
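To make the relationship concrete, here’s a sketch (the part name and thread count are illustrative): if a Zope instance built with plone.recipe.zope2instance runs two ZPublisher threads, its server line should allow at least two connections:

[zeoclient1]
recipe = plone.recipe.zope2instance
...
zserver-threads = 2

# and the matching line in haproxy.conf.in:
server plone0101 127.0.0.1:${zeoclient1:http-address} cookie p0101 check maxconn 2 rise 1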
See the HAProxy configuration documentation for more details on the settings that can be used in this file.