Reading PHP session from Varnish Cache
In my previous post I showed how to integrate Varnish Cache with a PHP application. That example solves various simple problems, but it might not be enough for complex software. A good example is a multilingual application, where one URL can have multiple cached versions. You might also need to know more about a user (are they logged in? have they received a notification? etc.) to make additional caching decisions.
All of that can be handled with special cookies that flag different scenarios, but in my experience this is a clumsy solution. You need to think of all possible user journeys and make sure that the appropriate cookies are created. Caching is difficult enough on its own, so there is no need to make it even more complicated. A much better approach, in my opinion, is to pull the data directly from the PHP session.
By default PHP stores sessions in files. This might be OK with a single-server architecture, but if you have more than one web server then you need centralised storage. Regardless of your setup, a much better place for PHP sessions is memcached. It improves access time and scalability and, of course, it lets you access the session from Varnish.
Storing session data in memcached is very simple to do with PHP.
    $ sudo apt-get install memcached php5-memcached
    $ sudo /etc/init.d/memcached start
Edit the php.ini file.
    $ sudo vim /etc/php5/apache2/php.ini
Look for session settings
    [Session]
    ; Handler used to store/retrieve data.
    ; http://php.net/session.save-handler
    session.save_handler = files
and change it to
    [Session]
    ; Handler used to store/retrieve data.
    ; http://php.net/session.save-handler
    session.save_handler = memcached
    session.save_path = "localhost:11211"
Now restart Apache and it’s done.
    $ sudo /etc/init.d/apache2 restart
If you like, you can test it with the code below.
    <?php
    session_start();
    $_SESSION['test'] = 'Hello World';

    $m = new Memcached();
    $m->addServer('localhost', 11211);

    foreach( $m->getAllKeys() as $key ) {
        printf( '<h3>%s</h3>', $key );
        var_dump( $m->get( $key ) );
    }
It should return something like this:
    memc.sess.key.lock.78uso0onvumb665c1gm739er36
    string '1' (length=1)

    memc.sess.key.78uso0onvumb665c1gm739er36
    string 'test|s:11:"Hello World";' (length=24)
If it’s all working, let’s create a simple page which will simulate multilingual support.
    <?php
    session_start();

    if( isset( $_POST['lang'] ) ) {
        print_r($_POST);
        $_SESSION['lang'] = $_POST['lang'];
    }

    $lang = isset( $_SESSION['lang'] ) ? $_SESSION['lang'] : 'English';

    printf( "<h1>My language is: %s (%s)</h1><hr />", $lang, time() );
    ?>
    <form action="/" method="post">
        <ul>
            <li><input type="radio" name="lang" value="English"> English</li>
        </ul>
        <ul>
            <li><input type="radio" name="lang" value="Spanish"> Spanish</li>
        </ul>
        <ul>
            <li><input type="radio" name="lang" value="German"> German</li>
        </ul>
        <input type="Submit" name="change" value="Change" />
    </form>
The idea is simple. If a language is submitted, PHP stores it in the session as “lang” and the appropriate content is displayed.
The challenge for Varnish is to create and return the appropriate cached object based on the selected language. The language is saved as part of a serialised string in memcached. It’s stored under “memc.sess.key.UNIQUE_KEY”, where UNIQUE_KEY is the value of the PHPSESSID cookie.
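To make the storage format concrete, here is a minimal PHP sketch of how the memcached key and the stored value relate to the cookie. The helper names `sessionKey()` and `readSessionValue()` are hypothetical (they are not part of PHP or the memcached extension), the key prefix is assumed to be the one seen in the test output earlier, and the parser only handles top-level string values in PHP's default session serialisation format.

```php
<?php
// Hypothetical helpers for illustration only.

// Build the memcached key for a given PHPSESSID value, assuming the
// "memc.sess.key." prefix seen in the test output earlier.
function sessionKey($sessionId)
{
    return 'memc.sess.key.' . $sessionId;
}

// Extract a top-level string value from a raw session payload such as
// 'lang|s:7:"English";'. Returns null when the name is not present.
function readSessionValue($payload, $name)
{
    $pattern = '/(?:^|;)' . preg_quote($name, '/') . '\|s:(\d+):"/';
    if (!preg_match($pattern, $payload, $m, PREG_OFFSET_CAPTURE)) {
        return null;
    }
    // The value starts right after the opening quote and is exactly
    // as many bytes long as the declared length.
    $start = $m[0][1] + strlen($m[0][0]);
    return substr($payload, $start, (int) $m[1][0]);
}

echo sessionKey('78uso0onvumb665c1gm739er36'), "\n";
echo readSessionValue('test|s:11:"Hello World";lang|s:7:"Spanish";', 'lang'), "\n";
```

Reading the declared length instead of searching for the closing quote keeps the sketch correct even when the value itself contains a quote character.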
To access memcached from a Varnish Cache VCL script you have to install VMOD-Memcached. To compile this module you need the Varnish source code.
    $ wget http://repo.varnish-cache.org/source/varnish-3.0.3.tar.gz
    $ tar zxfv varnish-3.0.3.tar.gz
Get the VMOD and all dependencies.
    $ git clone https://github.com/sodabrew/libvmod-memcached
    $ sudo apt-get install libmemcached-dev python-docutils
    $ cd libvmod-memcached
    $ ./autogen.sh
    $ ./configure VARNISHSRC=../varnish-3.0.3/
    $ make
    $ sudo make install
The extension should be copied into your Varnish vmod directory.
    $ ls /usr/local/lib/varnish/vmods/ | grep memcached
    libvmod_memcached.a
    libvmod_memcached.la
    libvmod_memcached.so
The last missing thing is the default.vcl file.
    import std;
    import memcached;

    backend default {
        .host = "127.0.0.1";
        .port = "80";
    }

    sub vcl_init {
        memcached.servers({"--SERVER=localhost:11211 --NAMESPACE="memc.sess.key.""});
        return (ok);
    }

    sub vcl_recv {
        if (req.restarts == 0) {
            if (req.http.x-forwarded-for) {
                set req.http.X-Forwarded-For =
                    req.http.X-Forwarded-For + ", " + client.ip;
            } else {
                set req.http.X-Forwarded-For = client.ip;
            }
        }
        if (req.request != "GET" &&
            req.request != "HEAD" &&
            req.request != "PUT" &&
            req.request != "POST" &&
            req.request != "TRACE" &&
            req.request != "OPTIONS" &&
            req.request != "DELETE") {
            /* Non-RFC2616 or CONNECT which is weird. */
            return (pipe);
        }
        if (req.request != "GET" && req.request != "HEAD") {
            /* We only deal with GET and HEAD by default */
            return (pass);
        }

        set req.http._sess = regsub( regsub( req.http.Cookie, ".*PHPSESSID=", "" ), ";.*", "" );
        std.log( "Cookie: " + req.http._sess );
        set req.http._sess = memcached.get( req.http._sess );
        std.log( "Session: " + req.http._sess );

        return (lookup);
    }

    sub vcl_pipe {
        return (pipe);
    }

    sub vcl_pass {
        return (pass);
    }

    sub vcl_hash {
        hash_data(req.url);
        if (req.http.host) {
            hash_data(req.http.host);
        } else {
            hash_data(server.ip);
        }

        if( req.http._sess && req.http._sess ~ "lang" ) {
            set req.http._lang = regsub( regsub( req.http._sess, ".*lang.*?\x22", "" ), "\x22.*", "" );
            std.log( "Lang: " + req.http._lang );
            hash_data( req.http._lang );
        }

        return (hash);
    }

    sub vcl_hit {
        return (deliver);
    }

    sub vcl_miss {
        return (fetch);
    }

    sub vcl_fetch {
        if( req.url ~ "^/$" ) {
            set beresp.ttl = 30m;
            remove beresp.http.set-cookie;
            return(deliver);
        }

        if (beresp.ttl <= 0s ||
            beresp.http.Set-Cookie ||
            beresp.http.Vary == "*") {
            /*
             * Mark as "Hit-For-Pass" for the next 2 minutes
             */
            set beresp.ttl = 120 s;
            return (hit_for_pass);
        }
        return (deliver);
    }

    sub vcl_deliver {
        return (deliver);
    }

    sub vcl_error {
        set obj.http.Content-Type = "text/html; charset=utf-8";
        set obj.http.Retry-After = "5";
        synthetic {" ERROR "};
        return (deliver);
    }

    sub vcl_fini {
        return (ok);
    }
There are a few interesting things going on here.
    sub vcl_init {
        memcached.servers({"--SERVER=localhost:11211 --NAMESPACE="memc.sess.key.""});
        return (ok);
    }
As you can probably guess, Varnish connects to the memcached server on init.
Now look at the bottom of the vcl_recv function.
    set req.http._sess = regsub( regsub( req.http.Cookie, ".*PHPSESSID=", "" ), ";.*", "" );
    std.log( "Cookie: " + req.http._sess );
    set req.http._sess = memcached.get( req.http._sess );
    std.log( "Session: " + req.http._sess );
VCL doesn’t let you define new variables, but you can set arbitrary headers on a predefined object (here “req.http”) and use them as variables. By the end of this block you should have the whole PHP session stored in req.http._sess.
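If you want to check what the two nested regsub() calls produce without restarting Varnish, the same regular expressions can be tried in PHP. This is only a sketch: `extractSessionId()` is a hypothetical helper, and it assumes that `preg_replace()` with a limit of 1 behaves like regsub(), which replaces the first match only.

```php
<?php
// Hypothetical helper that mimics the two nested regsub() calls from vcl_recv.
function extractSessionId($cookieHeader)
{
    // Drop everything up to and including "PHPSESSID=".
    $withoutPrefix = preg_replace('/.*PHPSESSID=/', '', $cookieHeader, 1);
    // Drop everything from the next semicolon onwards.
    return preg_replace('/;.*/', '', $withoutPrefix, 1);
}

var_dump(extractSessionId('foo=bar; PHPSESSID=78uso0onvumb665c1gm739er36; theme=dark'));
var_dump(extractSessionId('foo=bar; theme=dark'));
```

The second call is worth a look: when the header has no PHPSESSID cookie, the first expression matches nothing, so the result is the first cookie pair ("foo=bar") rather than an empty string. Wrapping the lookup in a check such as `if (req.http.Cookie ~ "PHPSESSID=")` might be a reasonable guard.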
You can use
    $ varnishlog | grep Log
to see the output of the std.log function.
The most important code happens inside the vcl_hash subroutine.
    if( req.http._sess && req.http._sess ~ "lang" ) {
        set req.http._lang = regsub( regsub( req.http._sess, ".*lang.*?\x22", "" ), "\x22.*", "" );
        std.log( "Lang: " + req.http._lang );
        hash_data( req.http._lang );
    }
You can read more about VCL subroutines in the Varnish documentation, but in a nutshell vcl_hash is responsible for building the hash string under which a cached object is saved.
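The nested regsub() calls that pull the language out of the serialised session can be tried in PHP as well. Again this is a sketch under assumptions: `extractLang()` is a hypothetical helper, and the sample payloads are made up to match the format shown in the test output earlier.

```php
<?php
// Hypothetical helper that mimics the nested regsub() calls from vcl_hash.
function extractLang($session)
{
    // Drop everything up to the first double quote (\x22) after "lang".
    $afterOpeningQuote = preg_replace('/.*lang.*?\x22/', '', $session, 1);
    // Drop the closing double quote and everything after it.
    return preg_replace('/\x22.*/', '', $afterOpeningQuote, 1);
}

var_dump(extractLang('lang|s:7:"English";'));
var_dump(extractLang('test|s:11:"Hello World";lang|s:6:"German";'));
```

Because the leading `.*` is greedy, the expression anchors on the last occurrence of "lang" in the payload. That is fine for this demo, but a session that also stores a value containing that substring after the language entry would confuse it, so a stricter pattern may be needed in a real application.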
By default Varnish caches per URL and host, but we have to extend that with the language name. This is exactly what happens here. A full hash string will look more or less like this:
"/" + "localhost:8080" + "English" |
The last thing worth explaining is what happens inside vcl_fetch.
    sub vcl_fetch {
        if( req.url ~ "^/$" ) {
            set beresp.ttl = 30m;
            remove beresp.http.set-cookie;
            return(deliver);
        }
        # ... rest of vcl_fetch as in the full default.vcl
    }
By default, if a cookie is attached to a request, Varnish will never return cached content. This comes from the assumption that if there is a cookie, the page must be dynamic.
The point of this exercise is to handle dynamic content, so we work around this limitation for http://localhost:8080/ requests by removing the Set-Cookie header from the backend response (this happens only within Varnish).
Now you can start the Varnish server (don’t forget to type start).
    $ sudo varnishd -f /usr/local/etc/varnish/default.vcl -s malloc,128M -T 127.0.0.1:2000 -a 0.0.0.0:8080 -d
    Platform: Linux,3.5.0-30-generic,x86_64,-smalloc,-smalloc,-hcritbit
    200 244
    -----------------------------
    Varnish Cache CLI 1.0
    -----------------------------
    Linux,3.5.0-30-generic,x86_64,-smalloc,-smalloc,-hcritbit

    Type 'help' for command list.
    Type 'quit' to close CLI session.
    Type 'start' to launch worker process.

    start
    child (4913) Started
    200 0

    Child (4913) said Child starts
Open two different web browsers, go to http://localhost:8080/ and start changing languages. POST requests are always forwarded to the web server, so the session value gets updated. For every GET, Varnish should return the appropriate content (according to the current language selection) from cache.
It’s a little bit tricky to set up for the first time, but the reward is worth it. Making Varnish Cache aware of the user’s status gives much more flexibility and allows more requests to be handled directly from cache. That can substantially reduce your hosting costs and increase the capacity of your server. Give it a go.