Reading PHP session from Varnish Cache

In my previous post I showed how to integrate Varnish Cache with a PHP application. That example solves various simple problems, but it might not be enough for complex software. A good example is a multilingual application, where one URL can have multiple cached versions. You might also need to know more about a user (are they logged in? have they received a notification? etc.) to make additional caching decisions.

All of that can be handled with special cookies that flag different scenarios, but in my experience this is a clumsy solution. You need to think of all possible user journeys and make sure that the appropriate cookies are created. Caching is difficult enough on its own, so there is no need to make it even more complicated. A much better approach, in my opinion, is to pull data directly from the PHP session.

By default PHP stores sessions in files. This might be OK with a single-server architecture, but if you have more than one web server then you need centralised storage. Regardless of your setup, a much better place for a PHP session is memcached. It improves access time and scalability, and of course you will be able to access the session from Varnish.

Storing session data in memcached is very simple to do with PHP.

$ sudo apt-get install memcached php5-memcached
$ sudo /etc/init.d/memcached start

Edit the php.ini file.

$ sudo vim /etc/php5/apache2/php.ini

Look for session settings

[Session]
; Handler used to store/retrieve data.
; http://php.net/session.save-handler
session.save_handler = files

and change it to

[Session]
; Handler used to store/retrieve data.
; http://php.net/session.save-handler
session.save_handler = memcached
session.save_path = "localhost:11211"

Now restart Apache and it’s done.

$ sudo /etc/init.d/apache2 restart
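
A quick way to confirm the edit is in place is to print the active (uncommented) session settings from the ini file. This is only a minimal sketch: show_session_settings is a hypothetical helper name, and the path in the usage comment is the one edited above.

```shell
# Print the uncommented session.save_handler / session.save_path lines
# from a php.ini file. show_session_settings is a hypothetical helper,
# not something shipped with PHP.
show_session_settings() {
  grep -E '^[[:space:]]*session\.save_(handler|path)[[:space:]]*=' "$1"
}

# Usage (path taken from the step above):
#   show_session_settings /etc/php5/apache2/php.ini
```

Lines that start with a semicolon are comments in php.ini, so they are deliberately skipped.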

If you like, you can test it with the code below.

<?php

$m = new Memcached();
$m->addServer('localhost', 11211);

// Dump every key stored in memcached together with its value.
foreach( $m->getAllKeys() as $key ) {
  printf( '<p>%s</p>', $key );
  var_dump( $m->get( $key ) );
}

It should return something like this:

memc.sess.key.lock.78uso0onvumb665c1gm739er36
string '1' (length=1)

memc.sess.key.78uso0onvumb665c1gm739er36
string 'test|s:11:"Hello World";' (length=24)
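
The second value is the session in PHP's session serialisation format: each entry is the variable name, a pipe, and then a serialised value, which for a string looks like s:<length>:"<text>";. As a rough sketch of reading one string entry back out of such a blob from the shell (session_get is a hypothetical helper, and it assumes plain string values that contain no double quotes):

```shell
# Extract one string value from a PHP session blob such as
#   test|s:11:"Hello World";
# session_get is a hypothetical helper. It only handles string entries
# (s:<len>:"...";) whose value contains no double quote.
session_get() {
  blob=$1
  key=$2
  printf '%s\n' "$blob" |
    sed -n "s/.*${key}|s:[0-9]*:\"\([^\"]*\)\";.*/\1/p"
}

session_get 'test|s:11:"Hello World";' test
# prints: Hello World
```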

If it’s all working, let’s create a simple page which will simulate multilingual support.

<?php

session_start();

if( isset( $_POST['lang'] ) ) {
  print_r($_POST);
  $_SESSION['lang'] = $_POST['lang'];
}

$lang = isset( $_SESSION['lang'] ) ? $_SESSION['lang'] : 'English';

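printf( "<p>My language is: %s (%s)</p>\n", htmlspecialchars( $lang ), time() ); ?>
<form method="post">
  <ul>
    <li><button type="submit" name="lang" value="English">English</button></li>
    <li><button type="submit" name="lang" value="Spanish">Spanish</button></li>
    <li><button type="submit" name="lang" value="German">German</button></li>
  </ul>
</form>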

The idea is simple. If a language is submitted, PHP stores it in the session as “lang” and the appropriate content is displayed.

The challenge for Varnish is to create and return the appropriate cached version based on the selected language. The language is saved as part of a serialised string in memcached. It’s stored under “memc.sess.key.UNIQUE_KEY”, where UNIQUE_KEY is the value of the PHPSESSID cookie.
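
To see by hand the same lookup Varnish is going to perform, you can build the key yourself and ask memcached for it over its plain-text protocol. A minimal sketch, assuming nc (netcat) is installed and memcached listens on localhost:11211; memcached_get_request is a hypothetical helper name, and the session ID in the usage line is the one from the sample output above.

```shell
# Build a memcached text-protocol request for one PHP session:
# a "get" for the prefixed key, followed by "quit" so the server
# closes the connection once it has answered.
memcached_get_request() {
  printf 'get memc.sess.key.%s\r\nquit\r\n' "$1"
}

# Usage (needs a running memcached and nc):
#   memcached_get_request 78uso0onvumb665c1gm739er36 | nc localhost 11211
```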

To access memcached from a VCL script you have to install the libvmod-memcached VMOD. To compile this module you need the Varnish source code.

$ wget http://repo.varnish-cache.org/source/varnish-3.0.3.tar.gz
$ tar zxfv varnish-3.0.3.tar.gz

Get the VMOD and all dependencies.

$ git clone https://github.com/sodabrew/libvmod-memcached
$ sudo apt-get install libmemcached-dev python-docutils
$ cd libvmod-memcached
$ ./autogen.sh 
$ ./configure VARNISHSRC=../varnish-3.0.3/
$ make
$ sudo make install

The extension should be copied into your Varnish vmod directory.

$ ls /usr/local/lib/varnish/vmods/ | grep memcached
libvmod_memcached.a
libvmod_memcached.la
libvmod_memcached.so

The last missing thing is the default.vcl file.

import std;
import memcached;

backend default {
  .host = "127.0.0.1";
  .port = "80";
}

sub vcl_init {
  memcached.servers({"--SERVER=localhost:11211 --NAMESPACE="memc.sess.key.""});
  return (ok);
}

sub vcl_recv {

  if (req.restarts == 0) {
    if (req.http.x-forwarded-for) {
      set req.http.X-Forwarded-For =
      req.http.X-Forwarded-For + ", " + client.ip;
    } else {
      set req.http.X-Forwarded-For = client.ip;
    }
  }

  if (req.request != "GET" &&
      req.request != "HEAD" &&
      req.request != "PUT" &&
      req.request != "POST" &&
      req.request != "TRACE" &&
      req.request != "OPTIONS" &&
      req.request != "DELETE") {
      /* Non-RFC2616 or CONNECT which is weird. */
      return (pipe);
  }

  if (req.request != "GET" && req.request != "HEAD") {
    /* We only deal with GET and HEAD by default */
    return (pass);
  }

  set req.http._sess = regsub( regsub( req.http.Cookie, ".*PHPSESSID=", "" ), ";.*", "" );
  std.log( "Cookie: " + req.http._sess );
  set req.http._sess = memcached.get( req.http._sess );
  std.log( "Session: " + req.http._sess );

  return (lookup);
}


sub vcl_pipe {
  return (pipe);
}

sub vcl_pass {
  return (pass);
}

sub vcl_hash {
  hash_data(req.url);
  if (req.http.host) {
    hash_data(req.http.host);
  } else {
    hash_data(server.ip);
  }

  if( req.http._sess && req.http._sess ~ "lang" ) {
    set req.http._lang = regsub( regsub( req.http._sess, ".*lang.*?\x22", "" ), "\x22.*", "" );
    std.log( "Lang: " + req.http._lang );
    hash_data( req.http._lang );
  }

  return (hash);
}

sub vcl_hit {
  return (deliver);
}

sub vcl_miss {
  return (fetch);
}

sub vcl_fetch {

  if( req.url ~ "^/$" ) {
    set beresp.ttl = 30m;
    remove beresp.http.set-cookie;
    return(deliver);
  }

  if (beresp.ttl <= 0s ||
    beresp.http.Set-Cookie ||
    beresp.http.Vary == "*") {
    /*
    * Mark as "Hit-For-Pass" for the next 2 minutes
    */
      set beresp.ttl = 120 s;
      return (hit_for_pass);
  }

  return (deliver);
}


sub vcl_deliver {
  return (deliver);
}

sub vcl_error {
  set obj.http.Content-Type = "text/html; charset=utf-8";
  set obj.http.Retry-After = "5";
  synthetic {"
  ERROR
  "};
  return (deliver);
}


sub vcl_fini {
  return (ok);
}

There are a few interesting things going on here.

sub vcl_init {
  memcached.servers({"--SERVER=localhost:11211 --NAMESPACE="memc.sess.key.""});
  return (ok);
}

As you probably can guess Varnish will connect to the memcached server on init.

Now look at the bottom of the vcl_recv function.

set req.http._sess = regsub( regsub( req.http.Cookie, ".*PHPSESSID=", "" ), ";.*", "" );
std.log( "Cookie: " + req.http._sess );
set req.http._sess = memcached.get( req.http._sess );
std.log( "Session: " + req.http._sess );

VCL doesn’t allow you to define new variables, but you can store values in request headers (in this example under “req.http”). By the end of this block you should have the whole PHP session stored in req.http._sess.
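
The two nested regsub calls first strip everything up to and including PHPSESSID=, then everything from the next semicolon onwards. The same two substitutions can be tried out in the shell with sed. This only illustrates the regular expressions, not Varnish itself, and extract_sessid is a hypothetical name.

```shell
# Mimic the two regsub() calls from vcl_recv with sed.
extract_sessid() {
  printf '%s\n' "$1" | sed -e 's/.*PHPSESSID=//' -e 's/;.*//'
}

extract_sessid 'foo=bar; PHPSESSID=78uso0onvumb665c1gm739er36; theme=dark'
# prints: 78uso0onvumb665c1gm739er36

# Without a PHPSESSID cookie the first substitution matches nothing,
# so the first cookie pair is left over instead of an empty string.
extract_sessid 'foo=bar; theme=dark'
# prints: foo=bar
```

The second call suggests it may be worth guarding the memcached lookup with a check that the Cookie header actually contains PHPSESSID=.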

You can use

$ varnishlog | grep Log

to see output of the std.log function.

The most important code happens inside the vcl_hash subroutine.

if( req.http._sess && req.http._sess ~ "lang" ) {
  set req.http._lang = regsub( regsub( req.http._sess, ".*lang.*?\x22", "" ), "\x22.*", "" );
  std.log( "Lang: " + req.http._lang );
  hash_data( req.http._lang );
}

You can read more about VCL subroutines in the Varnish documentation, but in a nutshell vcl_hash is responsible for building the hash string under which a cached object is stored.
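
As a rough illustration of what that block feeds into the hash, the sketch below mimics it in the shell. It takes the language from the session string by dropping everything up to the first double quote after lang and everything from the next double quote onwards, then prints the URL, host and language side by side. hash_inputs is a hypothetical helper and the sample session strings are made up.

```shell
# Print the pieces that vcl_hash would hash for a request.
# hash_inputs is a hypothetical helper for illustration only.
hash_inputs() {
  url=$1
  host=$2
  session=$3
  case $session in
    *lang*)
      # Same idea as the nested regsub() calls in vcl_hash.
      lang=$(printf '%s\n' "$session" | sed -e 's/.*lang[^"]*"//' -e 's/".*//')
      printf '%s + %s + %s\n' "$url" "$host" "$lang"
      ;;
    *)
      # No language in the session: only URL and host are hashed.
      printf '%s + %s\n' "$url" "$host"
      ;;
  esac
}

hash_inputs / localhost:8080 'lang|s:7:"English";'
# prints: / + localhost:8080 + English
hash_inputs / localhost:8080 'lang|s:7:"Spanish";'
# prints: / + localhost:8080 + Spanish
```

Two visitors with the same language end up with identical inputs and therefore share one cached object, while a different language produces a separate one.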

By default Varnish caches per URL and host, but we have to extend that with the language name. This is exactly what happens here. A full hash string will look more or less like this:

"/" + "localhost:8080" + "English"

The last thing worth explaining is what happens inside the vcl_fetch.

sub vcl_fetch {

  if( req.url ~ "^/$" ) {
    set beresp.ttl = 30m;
    remove beresp.http.set-cookie;
    return(deliver);
  }

By default Varnish avoids caching when cookies are involved, on the assumption that a page which sets or depends on a cookie must be dynamic. In particular, a backend response that carries a Set-Cookie header is not cached.

The point of this exercise is to handle dynamic content, so we work around this limitation for http://localhost:8080/ requests by removing the Set-Cookie header from the backend response (this happens only within Varnish).

Now you can start Varnish server (don’t forget to type start).

$ sudo varnishd -f /usr/local/etc/varnish/default.vcl -s malloc,128M -T 127.0.0.1:2000 -a 0.0.0.0:8080 -d
Platform: Linux,3.5.0-30-generic,x86_64,-smalloc,-smalloc,-hcritbit
200 244 
-----------------------------
Varnish Cache CLI 1.0
-----------------------------
Linux,3.5.0-30-generic,x86_64,-smalloc,-smalloc,-hcritbit

Type 'help' for command list.
Type 'quit' to close CLI session.
Type 'start' to launch worker process.

start
child (4913) Started
200 0

Child (4913) said Child starts

Open two different web browsers, go to http://localhost:8080/ and start changing languages. POST requests are always forwarded to the web server, so the session value is updated. For every GET, Varnish should return the appropriate content from cache, according to the current language selection.
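
The page prints time(), which gives an easy way to check from the command line whether a response came from cache: two requests with the same session should return the same timestamp. A rough sketch, assuming curl is installed and that you pass a real PHPSESSID value copied from your browser; page_timestamp and is_cached are hypothetical helper names.

```shell
# Pull the timestamp out of a body like "My language is: English (1371234567)".
page_timestamp() {
  printf '%s\n' "$1" | sed -n 's/.*My language is: [^(]*(\([0-9]*\)).*/\1/p'
}

# Fetch the page twice with the same session cookie, one second apart,
# and succeed only if both responses carry the same timestamp.
is_cached() {
  url=$1
  sid=$2
  first=$(page_timestamp "$(curl -s -b "PHPSESSID=$sid" "$url")")
  sleep 1
  second=$(page_timestamp "$(curl -s -b "PHPSESSID=$sid" "$url")")
  [ -n "$first" ] && [ "$first" = "$second" ]
}

# Usage (needs Varnish running on port 8080):
#   is_cached http://localhost:8080/ 78uso0onvumb665c1gm739er36 && echo cached
```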

It’s a little bit tricky to set up for the first time, but the reward is worth it. Making Varnish Cache aware of the user’s status gives much more flexibility and allows more requests to be handled directly from cache. That can reduce your hosting costs and increase the capacity of your server. Give it a go.
