HTTP Caching
with Varnish


PHP Romania
June 22, 2018


© David Buchmann

What is a reverse proxy again?

What could possibly go wrong?

httpstatusdogs.com

Overview




HTTP Refresher

HTTP is simple

Request

GET /path
Accept: text/html
            

Response

HTTP/1.1 200 OK
Content-Type: text/html

<html>...</html>
            

HTTP verbs

HTTP response codes

twitter.com/stevelosh/status/372740571749572610




HTTP Cache Control

Cache control headers

HTTP/1.1, RFC 2616, Sections 13.2 and 13.3 (since superseded by RFC 7234)

Cache Expiration

Cache-Control: s-maxage=3600, max-age=900
Expires: Thu, 15 May 2014 08:00:00 GMT
            
  1. s-maxage
  2. max-age
  3. Expires (HTTP 1.0 - avoid!)
  4. Default to default_ttl if nothing specified
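
The precedence above can be sketched in a few lines of Python. This is only an illustration of the ordering, not Varnish's implementation: the function name `shared_cache_ttl` is made up for this example, the 120 second fallback assumes an unchanged `default_ttl` parameter, and `Expires` is compared against a caller-supplied clock instead of the response's `Date` header.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

DEFAULT_TTL = 120  # assumption: Varnish default_ttl left at its default value


def shared_cache_ttl(headers, now, default_ttl=DEFAULT_TTL):
    """Return the TTL in seconds a shared cache would pick for a response."""
    cache_control = headers.get("Cache-Control", "")
    # 1. s-maxage wins over 2. max-age
    for directive in ("s-maxage", "max-age"):
        pattern = r"(?:^|[\s,])" + re.escape(directive) + r"\s*=\s*(\d+)"
        match = re.search(pattern, cache_control, re.IGNORECASE)
        if match:
            return int(match.group(1))
    # 3. Expires is only consulted when Cache-Control gave no lifetime
    expires = headers.get("Expires")
    if expires:
        try:
            delta = parsedate_to_datetime(expires) - now
        except (TypeError, ValueError):
            return 0  # an unparseable Expires counts as already expired
        return max(0, int(delta.total_seconds()))
    # 4. Nothing specified: fall back to the configured default
    return default_ttl


now = datetime(2014, 5, 15, 7, 0, 0, tzinfo=timezone.utc)
response_headers = {
    "Cache-Control": "s-maxage=3600, max-age=900",
    "Expires": "Thu, 15 May 2014 08:00:00 GMT",
}
print(shared_cache_ttl(response_headers, now))  # 3600: s-maxage takes priority
```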

Cache validation

ETag: "82901821233"

If-None-Match: "82901821233"

304 Not Modified
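
The validation exchange can be sketched from the server's side as follows. This is a simplified illustration: `make_etag` and `respond` are hypothetical helpers, the ETag is derived from a hash of the body (one common choice, not the only one), and weak validators (`W/"..."`) are not handled.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body (hypothetical scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict):
    """Return (status, headers, body), answering 304 when the client copy is current."""
    etag = make_etag(body)
    candidates = [
        tag.strip()
        for tag in request_headers.get("If-None-Match", "").split(",")
        if tag.strip()
    ]
    if etag in candidates or "*" in candidates:
        # The cached copy is still valid: send headers only, no body
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


status, headers, _ = respond(b"<html>...</html>", {})
revalidated, _, payload = respond(
    b"<html>...</html>", {"If-None-Match": headers["ETag"]}
)
print(status, revalidated, payload)  # 200 304 b''
```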

Do not cache

Cache-Control: s-maxage=0, private, no-cache
            

By default, Varnish 3 only looks at s-maxage=0.

Varnish 4: https://github.com/varnishcache/varnish-cache/blob/4.0/bin/varnishd/builtin.vcl
Varnish 3: https://github.com/varnishcache/varnish-cache/blob/3.0/bin/varnishd/default.vcl
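
As a rough illustration of the kind of check the Varnish 4 builtin VCL performs, here is a Python approximation. It is written from a reading of that file and may not match every version, so treat the linked builtin.vcl as the authority; `is_uncacheable` is a name invented for this sketch.

```python
import re


def is_uncacheable(headers: dict, ttl: float) -> bool:
    """Approximate the 'do not cache' decision of the Varnish 4 builtin VCL."""
    surrogate = headers.get("Surrogate-Control")
    cache_control = headers.get("Cache-Control", "")
    return (
        ttl <= 0
        or "Set-Cookie" in headers
        or (surrogate is not None and "no-store" in surrogate)
        # Cache-Control is only inspected when no Surrogate-Control is present
        or (
            surrogate is None
            and re.search(r"no-cache|no-store|private", cache_control) is not None
        )
        or headers.get("Vary") == "*"
    )


print(is_uncacheable({"Cache-Control": "s-maxage=0, private, no-cache"}, 0))  # True
print(is_uncacheable({"Cache-Control": "max-age=900"}, 900))  # False
```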

Default Varnish behaviour

Keep variants apart

Response content depends on request headers

Requests

GET /resource
Accept: application/json
            
GET /resource
Accept: text/xml
            

Response

Vary: Accept
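
One way to picture what `Vary` does to a cache: the request headers it names become part of the lookup key. The sketch below is a conceptual model only (Varnish stores variants under one hash entry and compares them at lookup), and `cache_key` is a hypothetical helper.

```python
def cache_key(url: str, request_headers: dict, vary_header: str):
    """Build a lookup key from the URL plus the request headers named in Vary."""
    names = sorted(n.strip().lower() for n in vary_header.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    # Only headers listed in Vary distinguish one variant from another
    variant = tuple((name, lowered.get(name, "")) for name in names)
    return (url, variant)


json_key = cache_key("/resource", {"Accept": "application/json"}, "Accept")
xml_key = cache_key("/resource", {"Accept": "text/xml"}, "Accept")
print(json_key != xml_key)  # True: the two variants are kept apart
```

This also shows why varying on headers with many distinct values, such as `User-Agent` or `Cookie`, fragments the cache: every distinct value yields its own key.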
            

Varnish does what you tell it



Think carefully and test thoroughly

Varnish Configuration Language

VCL: Debug time to live

sub vcl_backend_response {
    set beresp.http.TTL = beresp.ttl;
}
            

VCL: Two applications

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}
backend legacy {
    .host = "127.0.0.1";
    .port = "8000";
}

sub vcl_recv {
    if (req.url ~ "^/archive/") {
        set req.backend_hint = legacy;
    } else {
        set req.backend_hint = default;
    }
}
            

VCL can do a lot of things


But first make your application behave correctly!


Advanced topics

Cache Invalidation

There are two hard things in computer science:

  1. Naming things
  2. Cache invalidation
  3. Off by one errors

Cache busting

<link rel="stylesheet" href="/css/style.css?v1" type="text/css"/>
...
<script src="/js/scripts.js?v1"></script>
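
Instead of bumping `v1` by hand, the version can be derived from the file content, so the URL changes exactly when the file does. A minimal sketch, assuming a hypothetical helper `versioned_url` that is called when rendering templates:

```python
import hashlib


def versioned_url(path: str, content: bytes) -> str:
    """Append a short content hash so a changed file gets a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    return f"{path}?v={digest}"


old = versioned_url("/css/style.css", b"body { color: black; }")
new = versioned_url("/css/style.css", b"body { color: navy; }")
print(old != new)  # True: caches treat the new URL as a different resource
```

With URLs like these, the assets themselves can be served with a very long `max-age`, since a stale copy is never requested again.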
            

Explicit cache invalidation

Invalidation flavors

Communicating invalidation

Custom configuration for purge

acl invalidators {
    "localhost";
}

sub vcl_recv {
    if (req.method == "PURGE") {
        # Only clients listed in the ACL may invalidate
        if (!client.ip ~ invalidators) {
            return (synth(405, "Not allowed"));
        }
        return (purge);
    }
}

...
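
From the application side, a purge is just an HTTP request with a custom method. The sketch below only builds the request; the address `127.0.0.1:6081`, the host name, and the helper `build_purge_request` are assumptions for illustration, and the request has to originate from an address in the `invalidators` ACL.

```python
import urllib.request


def build_purge_request(varnish_base: str, path: str, host: str):
    """Build (but do not send) a PURGE request for one URL."""
    return urllib.request.Request(
        varnish_base + path,
        method="PURGE",
        # Varnish hashes on URL and Host, so name the site being purged
        headers={"Host": host},
    )


request = build_purge_request("http://127.0.0.1:6081", "/news/42", "example.com")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen(request)` would send it to Varnish.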
            

Custom configuration for refresh

acl invalidators {
    "localhost";
}

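sub vcl_recv {
    # A trusted client sending no-cache forces a fresh fetch
    # that replaces the cached object
    if (req.http.Cache-Control ~ "no-cache"
        && client.ip ~ invalidators
    ) {
        set req.hash_always_miss = true;
    }
}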

...
            

Banning

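sub vcl_backend_response {
    # Copy URL and host onto the object so bans can match on obj.* only
    set beresp.http.X-Url = bereq.url;
    set beresp.http.X-Host = bereq.http.host;
}

sub vcl_recv {
  if (req.method == "BAN") {
    if (!client.ip ~ invalidators) {
      return (synth(405, "Not allowed"));
    }
    ban("obj.http.X-Host ~ " + req.http.X-Host
      + " && obj.http.X-Url ~ " + req.http.X-Url
    );
    # Answer the BAN request instead of passing it to the backend
    return (synth(200, "Banned"));
  }
}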
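
To see which objects such a ban would hit, the matching can be modelled in Python. This is a toy model under stated assumptions: `CachedObject` and `matches_ban` are invented for the example, and `re.search` stands in for the unanchored regular expression match of the VCL `~` operator.

```python
import re
from dataclasses import dataclass


@dataclass
class CachedObject:
    x_host: str  # value stored in the X-Host response header
    x_url: str   # value stored in the X-Url response header


def matches_ban(obj: CachedObject, host_pattern: str, url_pattern: str) -> bool:
    """True if both ban expressions match the headers stored on the object."""
    return (
        re.search(host_pattern, obj.x_host) is not None
        and re.search(url_pattern, obj.x_url) is not None
    )


cache = [
    CachedObject("example.com", "/news/1"),
    CachedObject("example.com", "/news/2"),
    CachedObject("example.com", "/about"),
]
banned = [o.x_url for o in cache if matches_ban(o, r"^example\.com$", r"^/news/")]
print(banned)  # ['/news/1', '/news/2']
```

Because the patterns are unanchored by default, sending `^` and `$` in the BAN request headers is what keeps a ban for `/news/` from also matching `/old/news/`.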
            

Cache Tagging

$response = $response->withHeader('xkey', 'news id42 id44');
            
xkey.purge(req.http.xkey-purge);
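
Conceptually, tagging keeps a secondary index from each tag to the objects that carry it, which is why a tag purge does not need to scan the whole cache. The following is a toy in-memory model, not the xkey module's implementation; `TaggedCache` and its methods are hypothetical.

```python
from collections import defaultdict


class TaggedCache:
    """Toy cache with a tag -> URLs index, mimicking purge by tag."""

    def __init__(self):
        self._objects = {}
        self._by_tag = defaultdict(set)

    def store(self, url: str, body: str, xkey_header: str) -> None:
        self._objects[url] = body
        # The header carries space-separated tags, e.g. "news id42"
        for tag in xkey_header.split():
            self._by_tag[tag].add(url)

    def purge(self, tags_header: str) -> int:
        """Remove every object carrying one of the tags; return how many."""
        purged = 0
        for tag in tags_header.split():
            for url in self._by_tag.pop(tag, set()):
                if self._objects.pop(url, None) is not None:
                    purged += 1
        return purged

    def __contains__(self, url: str) -> bool:
        return url in self._objects


cache = TaggedCache()
cache.store("/news/42", "article 42", "news id42")
cache.store("/news/44", "article 44", "news id44")
cache.store("/about", "about us", "page")
print(cache.purge("id42"))  # 1: only the article tagged id42 is removed
```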
            

You can also use BAN, but it's much less efficient.




Edge Side Includes

Use Edge Side Includes

Like Server Side Includes, but assembled by Varnish instead of the web server.
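
The assembly step can be pictured as a search and replace over the page: each `<esi:include>` tag is swapped for a fragment that is fetched, and cached, on its own. The sketch below is a simplified model with a hypothetical `assemble` function and a dictionary standing in for the backend. In Varnish itself, ESI processing is switched on per response with `set beresp.do_esi = true;`.

```python
import re

# Matches self-closing include tags such as <esi:include src="/user/menu" />
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def assemble(page: str, fetch_fragment) -> str:
    """Replace every ESI include tag with the fragment fetched for its src."""
    return ESI_INCLUDE.sub(lambda match: fetch_fragment(match.group(1)), page)


# Stand-in for separately cached fragments (illustrative data)
fragments = {"/user/menu": "<nav>menu</nav>"}
page = '<html><esi:include src="/user/menu" /><p>cached for an hour</p></html>'
print(assemble(page, fragments.__getitem__))
```

The benefit is that the page and each fragment can have different lifetimes, for example a long-lived article around a short-lived or uncached user menu.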




Wrap-Up

Take-Aways

Outlook: Use libraries

Outlook: There is more than caching

Thank you!


@dbu




Caching and Sessions

Strategies when Caching with Sessions