Love your cache: Optimize for the second load

Dec 22, 2020 17:00 · 6079 words · 29 minute read · Tags: diminishing returns, single round trip

(dramatic music) - Hey there, I’m Sam. I’m here today at Chrome Dev Summit to talk about your cache, how it works, and how to use it to best benefit your users, focusing especially on the second load of your sites. I’m really excited to talk to you today because I know a lot of guidance out there only focuses on the first load. This is an intermediate talk, although I’ll try to avoid getting overly technical; I will be assuming some basic knowledge about how the web and HTTP work.

00:28 - I’ve also written an article to go along with this video, so please check it out if you’d like to follow along. To start off, I’m going to set out some of the goals of this talk, give you a well-lit path, and then talk through some of the classic caching pitfalls. When we talk about the second load and making that work well, there are really two big goals. First, you want to make sure that when you release a new version of your site or update your content, users get the new version almost immediately. Secondly, we want to do as little work as possible to get there: we don’t want to re-download everything when you’ve just made a small change, and that fundamentally means using the browser’s cache. So this is what I’m speaking about today.

01:04 - It’s about the second load, and using the cache really well to make sure that your users have a great experience with your content. But first I’d like to dispel the myth that the web sits at one of two extremes. As I mentioned, you could go to the network for every resource and always ensure the site is up-to-date, albeit a bit slowly. Or you could put everything in a cache and be fast but maybe out of date: the site’s already there, ready to go. But this isn’t really the case. You can be in the middle. I want to start with a case study. I work on Google’s developer documentation site, web.dev,

01:37 - and in July this year, we ran a virtual conference, web.dev/live. The conference site was built as a microsite inside web.dev and included a bunch of interactive elements, including a video player and a chat component. But we had a problem where users might not get the latest version of the CSS and JavaScript needed to run the microsite, and those components would just fail to work. For the rest of the site, for collections and articles and so on, we had gone to one end of the extreme.

02:02 - We’d cached the entire site in the pursuit of loading fast, via our service worker, which used Workbox. Now, to be fair, we always went to the network to load article content, but then we rendered it inside a shell which we’d already cached. This contained our site’s HTML, CSS, JavaScript and so on. I do want to be really clear: this actually worked really well for articles, and we were kind of spoiled by that. We only fetched the actual bytes contained in the raw text, so most loads were only a few kilobytes over the network, and this made our experience really quite fast.

02:32 - These elements lived in harmony, then everything changed: in building the microsite, we realized that our really aggressive caching could cause the user’s cache to get stuck without loading the new required dependencies. Now, to be fair, we eventually found a workaround, and the site works a bit differently now, but in the end, we’d backed ourselves into a corner. The big learning here is that sites don’t just exist in a git repo. They also exist on a server somewhere and, finally, in the caches of all your clients all around the world, and you have to be holistic when you think about shipping updates. Before I continue, I do want to call out again that a lot of guidance out there only focuses on the first-load experience.

03:12 - This is still really valuable for attracting users and making sure they have a great first experience, but if you’re only optimizing that picture, the Lighthouse score for a first load, then you could be causing users who come back to have a worse experience. And it’s important to remember here that Core Web Vitals, which show up in a bunch of places including the Chrome UX Report, include all loads, not just the first load. Lighthouse, despite being a great tool, is entirely designed to empty your cache and show what a first load feels like in a bunch of different environments; it doesn’t have any provision for the second load. What I fundamentally want to propose in this talk is a well-lit path for caching which will benefit users on their second load. We honestly think that the sensible default for caching is actually to not cache by default. There’s nuance to this, of course.

03:57 - I suggest using a CDN, which is a globally distributed way to bring your content close to your users, and using validation to make sure that your site is up-to-date without transferring it completely. This is admittedly a pretty extreme point of view, but to me it’s just the sane, modern default. You can then opt in to better caching rules as you need them. So in this talk, I’ll be talking a lot about immutable assets, which are the other extreme: URLs that tell the browser to cache forever because they have a unique hash in their name. But I think there’s a big problem with the weird middle ground. If you don’t quite understand what your cache is doing and you don’t define really clear rules, this can be quite dangerous, because you might not understand why something is broken or why you’re burning through network traffic.

04:40 - And this leads me into goals for you as developers, and I think it’s important for me to call this out. My goal in the end is to help you be lazy, and the modern web is super complicated, which, turns out, makes that really hard. Many sites have fully integrated pages which only really work if their CSS, JavaScript, and HTML are all in perfect sync, as we saw with web.dev/live. We tend to think of caching as historically a bit fuzzy, and most people today try to patch that rather than going back to a sensible default. And lastly, of course, it’s also about reducing actual network traffic; it’s not just about development complexity for you, but also about saving bandwidth.

05:18 - And of course I want to reduce your stress levels. For web.dev/live, it turns out, this was a pretty stressful moment for me as the developer. I risked my users having a bad time. So what does having a bad time actually mean? I talked about web.dev, but let’s talk about where poor caching strategies show up in the real world. And I confess, this area is kind of the stick of the talk, but I think it’s something we need to talk about.

05:41 - Bad caching rules tend to show up as something known as stale assets. As web developers, we’re so used to doing a hard reload when our sites just don’t look right. I do it habitually: I’ll hit Control + Shift + R or open a tab in incognito. And I have a question for all of you watching: if you’re testing a site on your phone, how do you do a hard reload on mobile? It turns out there’s no button for that. I’ve got to manually clear the cache and do a bunch of stuff.

06:02 - There’s no key combination for that. And here are some examples of what a stale asset really looks like for real-world end users. All of these people are trying to load a site as a second load. They have something broken in their cache from a previous load and they want to come back and get more of whatever they were loading before. And these companies have no real recourse except to tell users about this weird step that they might not really understand, and good luck, your site might come good if you’re lucky. So what’s the cause of these problems? Well, there are a bunch, but a common one is not giving the browser any instructions.

06:33 - Turns out if I don’t give a browser any instructions, if I say absolutely nothing, the browser will actually guess how long to cache a file for. It uses a concept called heuristic freshness; you can search for that term, it’s in the spec. How this works is that if a file was last modified some time ago, but you fetch it today, your browser will still cache it for 10% of that time. So for a year-old file, that’ll be another 36 days or so. This becomes a real problem when I have lots of assets that were created at different times.

06:59 - When I do my first load and they come back with different Last-Modified headers, these assets will be cached for different lengths of time. So when I do my second load, as you can see here, I might actually be getting a cached version of some files but not of others. And even more confusingly, as a bit of an aside, different browsers actually implement this differently. Some browsers will always re-check the HTML, assuming it’s always out of date, but they might skip the CSS and JavaScript.
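
In case it helps to see the arithmetic, here’s a minimal sketch of that heuristic (the exact algorithm and the assets it’s applied to vary by browser; 10% of the age is the figure suggested in the spec):

```ts
// A rough model of heuristic freshness: with no Cache-Control and no Expires,
// a response is treated as fresh for ~10% of its age, measured from Last-Modified.
function heuristicFreshnessMs(lastModified: Date, now = new Date()): number {
  const ageMs = now.getTime() - lastModified.getTime();
  return ageMs * 0.1;
}

// A file last modified a year before it was fetched stays "fresh" for ~36 more days.
const freshDays =
  heuristicFreshnessMs(new Date('2019-12-01'), new Date('2020-12-01')) / 86_400_000;
console.log(freshDays.toFixed(1)); // ≈ 36.6
```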

07:26 - Rules like these might have been fine in the ’90s or early 2000s, but this ambiguity is dangerous for today’s modern web. So I’ve introduced some of the concepts, described a well-lit path, and talked about some second-load failure cases. Let’s move on: I want to talk about some of the concepts that I kind of glossed over, namely CDNs, the headers that make this caching story make sense, and the sorts of files you might be able to make immutable. So let’s start with CDNs. As I mentioned before, CDNs are a way to replicate your content physically close to your users wherever they are geographically around the world, bringing the latency down. We’ve also got a great article on this on web.dev, so please check that out.

08:07 - This fundamentally means that when I disable caching, as I kind of suggested in the well-lit path, I can validate quickly. I can make sure a file is up to date because the content is actually physically quite close to me. For me, I live in Australia, which is, turns out, far away from a lot of the content that I’m really interested in. So if a site isn’t using a CDN, it’s going all the way to the origin host, where the content originally comes from, and it turns out I can really feel that latency.

08:30 - It might add a couple of hundred milliseconds to my round trip. So what I’m trying to say is, while you have a bunch of options in terms of what CDN you might end up using, maybe consider where your users are going to be and let that influence your CDN choice. So let’s talk about the Cache-Control HTTP header. It’s really the master knob that we have as web developers to control how browsers cache assets and how they’ll be used for the second load. Throughout the talk, I’ll give you a few suggestions of how to use this, but first I’m going to talk through the different options and the many knobs this setting gives you.

09:04 - I also mention Expires here largely to let you know that you can ignore it; it’s now completely replaced by Cache-Control. So let’s talk about these knobs. max-age tells the browser how long it can hold onto a file after the time it was served: how long is it valid for after you received it? must-revalidate says that after that same time, you really can’t use this file without checking it again; it makes max-age a strict requirement as opposed to a loose one. public and private go together.

09:33 - For now, let’s just say that private is important when you’re serving something like a logged-in page that you generate on the server and that you don’t want shared caches along the way to store. no-store actively disables caching: the client and any CDN along the way won’t store this file at all. This isn’t just about validating; the browser will never put this in storage. Lastly, immutable, which I’ll explain a little bit more later, is something that arguably should be on by default, but it has a few nuances. So next we have the ETag header. I’ve mentioned validation a few times already in this talk.

10:04 - The way this works is that when you make a network request to a server for the second time, your browser will pass along the ETag from any previous response that you’ve received. If the ETag matches, you’ll get back a 304 Not Modified status code. This means you don’t have to download the file again. You’ve probably seen this all over the internet, and it means that you cut down on your bandwidth cost. It still has a latency cost, though, which is really important: I’ve still got to go to the CDN or the origin to validate whether the file is the same. For small files this has diminishing returns; by the time I’ve made the request to check the file is the same, I could almost have just fetched the file again. As a bit of a side note, one thing to remember is that every HTTP request might, say, cost you about a kilobyte. If your file is vastly smaller than that, I often encourage developers to inline those files as base64 inside CSS or HTML. Having more assets isn’t always a good thing.
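
To make the validation flow concrete, here’s a rough Node sketch of what a server (or CDN) does with an ETag. It isn’t web.dev’s actual server, just an illustration; the Cache-Control value matches the always-revalidate approach discussed next.

```ts
import { createServer } from 'node:http';
import { createHash } from 'node:crypto';
import { readFileSync } from 'node:fs';

createServer((req, res) => {
  const body = readFileSync('./public' + req.url); // naive lookup, for illustration only
  const etag = `"${createHash('sha1').update(body).digest('hex')}"`;

  // Fresh for zero seconds, must be revalidated, but shared caches (CDNs) may store it.
  res.setHeader('Cache-Control', 'public, max-age=0, must-revalidate');
  res.setHeader('ETag', etag);

  // The browser sends the ETag it already has via If-None-Match.
  if (req.headers['if-none-match'] === etag) {
    res.statusCode = 304; // Not Modified: a latency cost, but no body bytes transferred
    res.end();
    return;
  }
  res.statusCode = 200;
  res.end(body);
}).listen(8080);
```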

11:00 - With that background out of the way, let’s go back to the well-lit path that I introduced earlier and the modern, sensible default for caching. HTTP has been around for a long time. I mentioned that Last-Modified default behavior; that’s really not new, having been introduced around 1999, and browsers still don’t implement it consistently. I really do think it’s time for an update. And to restate our goal: we want to disable caching, don’t cache by default, and just validate that resources are up to date via ETags.

11:28 - Then introduce more caching rules while also bringing your content close to your users via CDNs. So I want to talk about how you might configure your CDN. I put ETags on the slide only for completeness; nearly every web server or web host in the world is going to serve these properly. It can be a tiny bit tricky if you’re generating content dynamically rather than just serving real files on disk, but even that’s more or less solvable.

11:53 - And the most important part of this talk is encouraging you to set Cache-Control to public, max-age=0, must-revalidate. This basically means the file is valid for no time at all. You can serve it once, and then after zero seconds, you must go back and check that it’s still valid; and it’s also public, so CDNs along the way can cache it. That last point is tricky. If a file is not valid for any amount of time, then how can your CDN actually cache it? Shouldn’t it throw it away? Well, it turns out CDNs will often have a much clearer and more complete view of your site than your end user’s browser. They’re really just moving your file close to your end users and serving it from there. So now you know how to configure them, I want to mention some great CDN options.

12:34 - Netlify actually implements the behavior I’m talking about today, and they have a great article which describes, much like I’m doing in this talk, why they think it’s a great idea. Google also has a service called Firebase Hosting which acts as a CDN. It doesn’t quite match these semantics by default, but it’s pretty easy to configure, and I recommend you check out my article to see how you might do that. Of course, it turns out that, like everything, just taking the approach I’ve mentioned verbatim has some downsides. Let’s cover some of them and how we can work around them and make progress towards a better caching strategy.

13:08 - The really big one, of course, is that this strategy works best for first-world users on fast connections. For me in Australia, I live in Sydney, where most of Australia’s internet infrastructure tends to live. I get ridiculously fast latency to a lot of CDNs, often under 16 milliseconds, which is less than the time it takes to draw one frame at 60 FPS. That’s not so great for people living in other cities, and definitely not so good for people in developing nations on 3G modems. Next, validation can still cause what’s known as critical request chains.

13:37 - This is something that also gets reported in Lighthouse, and it’s latency-bound, so the fact that validation is cheap in terms of bytes doesn’t really help us. It means that one file might request another, which requests another, and so on, and this latency ends up compounding and slowing down the loads of your site. And finally, like I hinted at before, having lots of small files, even though that’s pretty fast with HTTP/2, turns out there’s still work to do: having fewer files to validate is always going to improve your performance. But the bigger opt-out, which I’ve already mentioned in this talk, is that for most sites, most assets just don’t change every release. They’re immutable.

14:12 - When you do a build and release of your site, you’ll find that the CSS, JavaScript, HTML, whatever, don’t all change at once. So what we recommend you do, as much as you can, is include a hash of the asset’s contents in the file name itself. You can also put it in the query string, and I’ll compare these two options later. Most importantly, and I want to make this clear, you shouldn’t be doing this by hand. We’re now definitely getting into the realm of build tools.

14:35 - But of course, just renaming a file like this doesn’t mean anything to the browser. We need to actually set a header, and we do this by serving these files with a different Cache-Control header. You can match these files with a regular expression (check out the related article); basically, look for long hash strings like the ones you see here. As for the header itself, we say that this file is valid for a long time. This value is a year, and actually, according to the spec, anything over a year is effectively equivalent to forever. We also need to say that it’s immutable.
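
As a rough sketch (hypothetical file patterns; adapt to however your server or CDN lets you set headers), picking the header based on whether the filename contains a hash might look like this:

```ts
// Hashed build output is immutable; everything else must revalidate on each load.
function cacheControlFor(pathname: string): string {
  // e.g. /app-1a2b3c4d5e6f.js: a long hex hash in the name marks the file as immutable.
  const isHashed = /-[0-9a-f]{12,}\.(js|css|woff2)$/.test(pathname);
  return isHashed
    ? 'public, max-age=31536000, immutable' // one year, effectively "forever" per the spec
    : 'public, max-age=0, must-revalidate'; // the well-lit path default
}
```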

15:06 - I mentioned immutable before, but this is the flag on Cache-Control. I find this really confusing, and I think when most people learn about caching, they think it already works like this: if you fetched a file and it’s within this age, you can keep using it. Historically, though, it was actually more of a guideline. So in 2017, Chrome changed its behavior to work as if immutable was always set. There’s a great article explaining why, which I’ve linked to below.

15:30 - So this immutable flag is really just for other browsers, which haven’t caught up. Okay, so you’ve configured a site and now have a bunch of files that look like this. But the real kicker here is that while we can cache these assets at the top, we can’t really cache everything. While the HTML file refers to these hashed files, you really can’t tell users to navigate to some long, awkward, hashed index.html file. That’s clearly a terrible user experience.

15:55 - So the big concession about making assets immutable is that you can’t really change your entrypoints to work like this. They can’t be served forever or made immutable. These files will always have to be validated, or perhaps served with a middle-ground caching strategy. So I want to digress a little bit. Even if you aren’t fully on board with the advice in this talk, a really valuable thing that site authors can do is to try to create more of these immutable assets. Fundamentally, this is something that will help almost every type of web experience.

16:22 - Lastly, before I talk a bit more in the abstract about tooling and the way you lay out your site, I want to talk about the middle ground. I’ve really given you two extreme options, right? Never cache, or cache literally forever. What fits into the middle ground are assets which have a low effect on other files: files that don’t include or change other files. CSS, for example, is a really bad candidate; it has lots of effect on your HTML. Good examples are files like images that are part of your content, because they don’t really pull in other things, and if they change over time, maybe because they’re recompressed, that’s not really a big deal.

16:57 - I can keep the old version around for a bit longer. So you can cache these things for a medium amount of time. So in the second section, I want to talk a bit about how the web is fundamentally a connected graph. What that really means is that your HTML loads images, CSS, or JS, which might even load more content. I’m especially going to be focusing on JavaScript because, byte for byte, it’s the most expensive part of your site’s load in terms of CPU time. So here’s a really simple example.

17:22 - We have an index page which loads two other assets. I’ve marked assets here that aren’t immutable or can’t be treated as immutable in blue. Of course, real sites are lots of pages, often depending on the same assets. I want to mention this to be complete, but this example just assumes a single page. And we want these files on the right, our CSS and JavaScript, to be immutable if they can be.

17:47 - And what I’m showing here is that even though the index page might change or revalidate on the second load, these two files on the right, which I’m showing in red to indicate that they can be immutable, don’t need to be loaded from the network. We’ve got our hash. We know they’re not going to change. So when we get a new version of that index page, that’s all we need. Your browser will just say, great, I’ve already got these dependent resources, I don’t need to fetch them again. Here’s your content. Your second load is amazing. Let’s have a quick digression on JavaScript build systems. One of the reasons I want to talk about JavaScript is that these build systems, tools like Rollup and Webpack, will often generate multiple files.

18:23 - They’ll pull shared code into a common bundle so that there’s no duplication. So to step back a bit, these tools will generate two classes of files: entrypoints, which are loaded directly by your HTML inside script tags, and chunks, which are these anonymous bits of shared code. Nearly everything else on a website tends to be flat and not have this additional loading cost based on these connections. What I find interesting, though, is that these build tools will often hash their anonymous chunks of code, as you can see on the right, but they won’t hash their entrypoints, at least not by default. These files are still blue and can’t be cached forever. They haven’t got a unique name; we don’t know if they’ve changed or not.

19:02 - This means we now have two layers of validation to perform, which, if you remember from before, introduces what we know as critical request chains and an extra round trip. So why is this? Turns out it’s because these tools want to be easy to use and, fundamentally, they just don’t know about your HTML. What it means in practice is that your HTML can look like this in both development and production and just point to a file of the same name. But what you want to do is, of course, rename your entrypoints and add these hashes into your HTML, like you see in this example, putting the hashes into the place where we depend on these files. And this is where your browser makes that critical decision, right? It’ll see this file with the hash and go, great.

19:43 - I’ve already got this file, I don’t need to go to the network. While I’ve kept this talk fairly abstract, I want to call out how we solve this problem on web.dev. We use Eleventy, the static site generator, which uses Nunjucks internally for templating. We have a Nunjucks filter which takes the name of a file and finds its hash before writing that out to the page. In production, this does what it says: it spits out a hash. In our case, we use the query string approach.

20:10 - This is simpler, and I’ll mention why in a little bit, but in development, the filter does nothing at all, so it doesn’t get in the way of our fast reload-and-rebuild cycle. And this works because our web server isn’t configured to generate anything different in terms of its cache headers: the file doesn’t have a hash in its name, so the cache header is exactly the same. So, I’ve mentioned query strings versus filenames. In practice, it doesn’t really matter which one you choose. Your build system might lead you down one path versus another, and that’s totally fine. In our experience, if you have a choice, query strings tend to be marginally safer, especially for entrypoints like your JavaScript. If, for some unknown reason, your users have an old version of your HTML, they’ll still get something if they load this file. The query string won’t match, it’ll be out of whack, but in the end it’s just a hint to the client; the server will still return a file that is on disk.

20:58 - It’s a bit of a safety net. You might think that just renaming a file works the same way (I’ve got an old version of that file and a new version of that file), but in my experience, most build systems and most deployment environments tend not to encourage you to keep those old files around. So in fact it’s kind of dangerous, because if you have an HTML file referencing the old name, that old file is probably now completely gone. So this hashing work is kind of an interesting problem. I’ve obviously focused on JavaScript because I think, as web developers, we’re used to working with it and having build systems around it, even for sites that are just a little bit complex. It can definitely work for other assets too, things like CSS, images, and so on, but as I mentioned before, JavaScript is really quite expensive.
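
As a sketch of how a filter like ours can work (hypothetical names and paths; web.dev’s real implementation differs in the details), an Eleventy config might do something like this:

```ts
// eleventy.config.ts (sketch): append a content hash as a query string, in production only.
import { createHash } from 'node:crypto';
import { readFileSync } from 'node:fs';

export default function (eleventyConfig: any) {
  eleventyConfig.addFilter('hashed', (path: string): string => {
    if (process.env.NODE_ENV !== 'production') return path; // a no-op during development
    const contents = readFileSync(`dist${path}`);            // assumes built output in dist/
    const hash = createHash('sha1').update(contents).digest('hex').slice(0, 12);
    return `${path}?v=${hash}`;                              // the query-string flavour of hashing
  });
}

// In a Nunjucks template: <script src="{{ '/js/app.js' | hashed }}" defer></script>
```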

21:40 - So I want to talk through another approach: how we can avoid changing our JavaScript altogether if we don’t need to. Let’s go back to the simple example. I’ve got the single page, and I’ve compiled my source code into a single file for it to load. I’ve done all the right things: it’s got a nice hashed name and it’s immutable. But if I change even a single byte in my JavaScript source code, my file will get renamed. Something I’ve actually done myself by accident is include the current date and time in the build, and it turns out that’s going to change every time I hit that button, so my hash was always new. Don’t be like me, and don’t do this. Lots of build tools can actually help you extract parts of your code into separate bundles.

22:19 - In web.dev’s case, what we actually do is split out the web component libraries, lit-element and lit-html, into their own dependency file. These are npm dependencies we depend on. They’re pretty stable, we don’t tend to update them very often, and they don’t actually have that many changes. Build tools can help you pull out these files, but you might have to write some code to do it. We fundamentally want to do this because this dependency, as I mentioned, is going to change way less often than our core code.

22:44 - And so when we update our site and change that core code, our updated HTML will point to that new JavaScript entrypoint. We can’t avoid that, right? That file is fundamentally changing. But the bigger dependency of lit-element and lit-html hasn’t changed, so by placing it in its own bundle, we get the benefit that it’s probably coming right out of the cache, and in a lot of browsers, that means not only the bytes saved from the network but also the parsed JavaScript, ready to go and ready to be used. Don’t do this for every single file; we don’t want millions of these, because that’s kind of self-defeating, with a ton of checks you’ve got to do. But for a few core, large dependencies, this can be really valuable. So I’ve talked a lot about build tools and hashing in the abstract.
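
To make it a little more concrete, here’s roughly how both ideas might look in a Rollup config: hashed names for entrypoints as well as chunks, plus a separate long-lived bundle for the lit libraries. (A sketch only; the real web.dev build is more involved.)

```ts
// rollup.config.js (sketch)
export default {
  input: 'src/app.js', // hypothetical entrypoint
  output: {
    dir: 'dist',
    format: 'esm',
    entryFileNames: '[name]-[hash].js', // hash the entrypoint too, not just the chunks
    chunkFileNames: '[name]-[hash].js',
    manualChunks: {
      // Rarely-changing dependencies get their own bundle, so shipping new app code
      // doesn't invalidate the (much larger) cached library code.
      lit: ['lit-element', 'lit-html'],
    },
  },
};
```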

23:23 - What I’d encourage you to do is check out Tooling.Report for a comparison of different build tools. It currently lists Webpack and Rollup, which I’ve mentioned, and of course the really important takeaway here is that hashing is going to be a huge saving for you, because the files you generate can be made immutable. It also includes a tool called Parcel. This is a bit newer than Rollup and Webpack.

23:48 - It’s a bit more holistic: it tends to reach into your HTML and actually change these hashes for you. So a lot of the stuff I’ve mentioned is not so relevant there, but it’s also different from these other tools, so please check out the Tooling.Report site to have a look. Finally, I think a talk from Google isn’t complete without a section on service workers, which are fundamentally things that only run on the second load, so we very much care about them for this talk. They give you a whole extra cache, and you can do what you like with it. A common pattern is to use Workbox to inject a file manifest, everything core you need to run the site, and then a runtime cache for everything else. And then the big question people ask when building a service worker is: are you going network first or cache first? There are a few more strategies than that, but that’s really the big one.
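
For a sense of what the two answers look like in practice, here’s a minimal Workbox sketch (not web.dev’s actual service worker): network first for navigations, cache first for hashed, immutable assets.

```ts
// sw.ts (sketch)
import { registerRoute } from 'workbox-routing';
import { NetworkFirst, CacheFirst } from 'workbox-strategies';

// Network first for HTML navigations: every visit checks for fresh markup,
// falling back to the cached copy if the network is slow or unavailable.
registerRoute(
  ({ request }) => request.mode === 'navigate',
  new NetworkFirst({
    cacheName: 'pages',
    networkTimeoutSeconds: 3, // a "budget" before giving up and using the cache
  })
);

// Cache first for hashed, immutable assets: the hash guarantees the contents
// never change, so serving straight from the cache is always safe.
registerRoute(
  ({ url }) => /-[0-9a-f]{8,}\.(js|css)$/.test(url.pathname),
  new CacheFirst({ cacheName: 'immutable-assets' })
);
```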

24:31 - Cache first is great for apps and tools. In web.dev’s case, however, especially after our little hiccup back at the start of the talk, we decided to go network first, because every time we go to the network, we get a new HTML file. We’re confident that our second load works properly, and most of the advice I’ve given pretty much already applies here. We have a service worker cache that we can hit for the immutable assets we know about along the way, but otherwise, you’ll only get the newest, freshest content. We don’t have any weird UX issues. We don’t need a reload button. And this is what I mean: if you go cache first, it’s very common that you have this UX problem.

25:05 - It’s a big sign of a service-worker-driven site. And when you make the decision to show this UI, you’re fundamentally saying: I want to load fast at all costs. There’s nothing inherently wrong with that, and lots of Google’s guidance actually does encourage it. The issue here is that, for me, most reload UX is just not that good. I suspect the average user kind of ignores it. In many ways, it breaks their mental model. Websites don’t look like this. They don’t need updating. What’s this weird button telling me to do? But this button in the end exists for a single reason: you’re trying to avoid a single round trip to check for an update. And unfortunately the web doesn’t really give you a chance to check for updates while your website is closed. There is a new API, the Periodic Background Sync API, which I talk about in my article, but it only works in Chrome and only if a web app is already installed on the home screen.

25:55 - So in reality, it’s not a general-purpose solution. You only really get to check whether a site is out of date in the extremely brief time between when a user goes to your site, by clicking on the icon or typing in the URL, and when it displays. So considering that time between a user’s intent and a site load, maybe that’s long enough to go to the network. I pose this idea to you: perhaps you don’t need to be entirely cache first. The service worker does give you a bunch of controls.

26:26 - Rather than blindly serving a file from a cache-first experience, you could decide on a budget to check whether a site is up to date. There are some concerns about this approach: why would I block a user from getting to my site? But as this is the second load, we know the user wants to come back. Maybe you can decide on a budget. For me, it’s worth 50 or 100 milliseconds to see whether something is out of date and give the user a great second experience. So thanks for listening. It’s been a long talk, and I hope you’ve learned something.

26:56 - Again, let me say that I’ve written an article which covers a lot of this content, and you can check it out in the description below. To restate my original point, I want to talk about the well-lit path again. I think the defaults from years gone by aren’t that great for modern, intertwined sites. You should try to disable classical caching and instead use a CDN with good ETag validation. You should opt in to caching where you need to and have a better understanding of your caching rules.

27:20 - Having simpler, more clearly defined rules will reduce your complexity and reduce the load on your brain. I think for web.dev, we were backed into a corner by a poor understanding of our cache and a real desire to be as fast as possible, and by planning a better strategy, we’re now in a better position to operate. So let’s finally recap. A modern, sensible default is to not cache by default, and instead opt in where it makes sense, and it makes sense a lot of the time; make sure your site’s layout on disk works well to support these kinds of immutable assets, so you can opt in lots of files. And finally, if you’re building a service-worker-driven site, consider whether going to the network makes sense for you. I think it often does, and a purely cache-first approach often presents users with a confusing or simply ignored experience. Thanks for listening today.

28:12 - I really hope you’ve enjoyed this CDS talk, and check out some of my colleagues’ other great content. Bye bye. (upbeat music)