Upgrading DevTools architecture to the modern web
Dec 23, 2020 17:00 · 6103 words · 29 minute read
(Upbeat music) - Hi, my name is Paul, and I work as part of the Chrome DevTools team. Today, three of us, me, Tim, and Jack, are going to walk you through how we go about upgrading the DevTools’ architecture to use modern web platform primitives and best practices. So if you don’t already know, DevTools is a web app, and this means that, like any other web app or PWA, it’s built with HTML, JavaScript, and CSS. However, it’s also a very large web app, weighing in at around 150,000 lines of first party JavaScript along with a good number of third-party dependencies. The other thing to say is that DevTools is over 10 years old.
00:45 - Now some code was added to DevTools yesterday or even today, but some was added over 10 years ago, and this means, by anybody’s standard, the DevTools code base has a lot of legacy code, and the goal in front of Tim, Jack, me, and some others has been to modernize that legacy code, and the kinds of migrations that we’ve done, and in fact are still doing, are these. We’ve been migrating a custom module system to JavaScript modules, sometimes called ES modules or ECMAScript modules. We’ve been changing from a Closure Compiler-managed type system to a TypeScript-managed one. We’ve been removing bespoke Python build scripts in favor of using industry standard tooling like Rollup, and we’ve been moving from a custom non-standard component system to web components. Each of these migrations and others besides are very challenging to get right, and so what we wanted to share with you is how we approach this kind of work, and we’re going to do that in three sections.
01:42 - Firstly, Tim is going to run you through planning changes. Secondly, Jack is going to talk to you about how you can make the change as well, and finally, I’ll be back to talk about maintaining your changes. The things in this talk, the lessons if you will, they’re hard won, and we hope that you will find that they help with your own code base, but maybe you’ll only be able to adopt one or two of these things. Well, in any case, let me hand over to Tim to talk about planning changes. - So thanks, Paul, for that introduction. Hi there, I’m Tim, a software engineer on Chrome DevTools.
02:16 - As Paul mentioned, we have performed several large migrations in DevTools. In this section, I will be giving you an insight into how we’ve planned these migrations. As you just heard, DevTools is a web application with over 150,000 lines of first-party JavaScript. It’s one of the largest web applications I have ever maintained and has an interesting mix of code patterns that accrued over time. Not only is DevTools a large web application, it also started over a decade ago, when Backbone became the most popular front-end library.
02:46 - I personally was not a web developer back then and I’ve not used Backbone myself. Not only that, DevTools has not seen a complete rewrite in this decade, and we’re still using and maintaining code that was originally written in the first few years. This means we’re working with code patterns that were designed in a different era than today. One interesting observation on working with these code patterns is that DevTools had to build numerous custom solutions that were not part of the platform. Back then, the platform did not have the features that we take for granted today.
03:20 - Therefore, DevTools gives an interesting perspective on what was necessary to build a large web application back then. For example, DevTools designed a custom module format roughly eight years ago, as there was no standard module format available. This module format consisted of module.json files that specified the scripts that were part of a module and any dependencies the module had on other modules. This module format worked for DevTools, but no other web application was using it. As such, DevTools built additional build infrastructure based on Python scripts to process and bundle source files.
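To make the format concrete before we get to that build pipeline, a module.json file looked roughly like this (a sketch based on the description above and on the DevTools engineering blog; the module and file names here are hypothetical):

```javascript
// A hypothetical network/module.json in the legacy format: it lists the
// scripts that belong to the module and the other modules it depends on.
// (The exact schema in DevTools may have differed.)
{
  "dependencies": ["common", "sdk"],
  "scripts": ["NetworkPanel.js", "RequestView.js"]
}
```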
03:57 - The main script for our build pipeline was build_release_applications, which performed tasks comparable to the bundlers that are well known in modern development today. It is important to note that the DevTools maintainers had to maintain this additional build infrastructure, which is outside of our usual work on features for DevTools users. Yet we’re also making the observation that the platform has significantly improved since these custom DevTools solutions were built. The platform now does offer a standardized module format, and there are numerous build tools built on top of these standards. So we came to the conclusion that maintaining unique build infrastructure, module formats, and other bespoke DevTools solutions was expensive, since any time we had to spend on maintaining infrastructure, we could not spend on building features or fixing bugs for DevTools users.
04:49 - So our first takeaway is to use standards, if you can, since maintaining or using a custom solution becomes an expensive choice later. It is important to stress the word can here, as there might not be an appropriate standard available that you can use. However, when there is an appropriate standard available, we advise you to favor that option over a custom solution for the long term. As we said, we’ve migrated away from the custom module format to the JavaScript module standard, which we have written a detailed blog post about on the DevTools engineering blog. We will give you the high-level details about this migration in this presentation.
05:27 - If you’re interested in more detail, I recommend reading the blog post. And so after finishing that migration, we are now able to use and ship DevTools with JavaScript modules. The imports and exports that you see on this slide here are probably more familiar to you than the module.json format as shown before. The migration was difficult at times, and we’ve hit several roadblocks, but we are now able to fully use JavaScript modules. As such, we’re able to use the modern-day tooling that I mentioned before, such as VS Code and Rollup.
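Since the slide isn’t reproduced in this transcript, the imports and exports Tim refers to look roughly like this (illustrative file and symbol names, not DevTools’ actual modules):

```javascript
// common/common.js: exports are now part of the language itself.
export class Color { /* ... */ }

// elements/ElementsPanel.js: dependencies are declared right where they
// are used, instead of in a separate module.json file.
import {Color} from '../common/common.js';
```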
06:01 - And so instead of having the custom Python script to bundle our resources, we now use a small Rollup configuration to bundle our source files instead. This allowed us to remove quite a bit of technical debt in our build pipeline and simplify its approach. Well, thus far, I’ve talked about modules. As Paul alluded to before, we’re doing similar explorations in other areas such as widgets. The widgets here are our UI components, where widgets are the custom DevTools component design.
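Before we look at a widget, here is roughly what such a small Rollup configuration might look like (a minimal sketch with assumed entry points; not DevTools’ actual config):

```javascript
// rollup.config.js: bundles the JavaScript module graph from one entry point.
export default {
  input: 'front_end/main.js',        // hypothetical entry module
  output: {
    file: 'out/devtools_bundle.js',  // hypothetical output location
    format: 'esm',                   // emit standard JavaScript modules
  },
};
```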
06:31 - To give you an example of what a widget looks like, this is a code snippet I copied from our code base. This particular widget describes the ability to add particular classes to a DOMNode in the elements panel. These widgets are not only a custom component design. They also include custom solutions for, for example, CSS, where this registerRequiredCSS call registers an external style sheet that is included in this widget. The registration process is directly integrated with the DevTools runtime that handles how we load and bundle source files in DevTools and is completely custom to DevTools alone.
07:10 - As I explained before, DevTools was built in the Backbone era. In that timeframe, components were primarily built using imperative DOM APIs. In this case, DevTools uses these APIs to, for example, add a class to a DOMNode. And lastly, we have custom methods that live on the element prototype to build a tree of DOMNodes. On this line, we’re using the createChild call, which is added to the element prototype to create a DOMNode and automatically attach it to the parent node.
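Putting those pieces together, a rough sketch of the legacy pattern being described might look like this (approximate names, not the exact code from the slide):

```javascript
// A custom widget: UI.Widget, registerRequiredCSS, and createChild are
// bespoke DevTools primitives, not part of the web platform.
export class ClassesPaneWidget extends UI.Widget {
  constructor() {
    super();
    // Registers an external style sheet through the custom DevTools runtime.
    this.registerRequiredCSS('elements/classesPaneWidget.css');
    // createChild is a custom method patched onto Element.prototype: it
    // creates a node, adds a class, and appends it to the parent in one call.
    const pane = this.contentElement.createChild('div', 'styles-element-classes-pane');
    pane.createChild('span', 'title').textContent = 'Element classes';
  }
}
```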
07:39 - Back in the day, adding methods to the prototype was an appropriate solution, but we have since realized that it is not a maintainable solution in the long term. However, this kind of technical debt is still what we’re facing today in DevTools. And so for any of these large-scale problems that you need to tackle, you need to plan and prototype your changes. Since these widgets are used throughout the code base, you need to figure out a plan to methodically migrate the code base to a more modern pattern. You will need to prototype your changes to make sure that your plan works when you scale it up to the broader code base.
08:16 - So when you’re faced with these large-scale changes, we tend to plan these migrations using design docs. Design docs are common at Google to discuss designs before we start executing them. The design docs contain general information like who performs the migration, which bug tracks its progress, and what the value proposition of the migration is, and they also contain the more detailed design and trade-offs that we made along the way. It allows the team to discuss potential solutions and figure out additional improvements. An important benefit of this process is that it allows us to catch issues earlier and lowers the chances we’ll run into problems later.
08:58 - By writing out the design first, we often figure out incompatibilities or spot problems early, which we can then make tweaks for. Often, performing the migration then merely becomes the execution of the plan rather than having to solve problems you’re facing on the fly. What’s important to note is that it usually requires intermediate steps to perform a large-scale migration. These migrations on their own are very large, which means that splitting up the work into discrete steps makes the project tractable. This is even more important for DevTools, as DevTools is part of Chromium, which ships a new version every day as Canary.
09:36 - This means that every single day, DevTools must work. There cannot be a point where DevTools is broken for a week, as our users and fellow Chromium engineers rely on DevTools working. And so when we perform these migrations, we have to make sure that DevTools remains working. This means that the solutions we develop have to take into account both the old implementation and the new implementation that we are migrating towards. Now, that is easier said than done, because usually the old implementation you’re migrating away from has limitations that warranted the migration in the first place.
10:10 - As such, you’re inherently constrained by the existing implementation, and therefore you are limited in the possibilities of what you can do during the migration. It is often the case that, only after you finish the whole migration, you’re fully able to take advantage of the new solution. For a detailed explanation of what this would look like, you can read the blog post on our engineering blog on migrating towards JavaScript modules. It describes how we made both module formats work in tandem so that we could gradually migrate our code to JavaScript modules while other parts of the code base remained using the module.json format. Now, one catch to all of these migrations is the problem of knowing at what point you actually get there.
10:55 - Since you’re actively migrating, it will take some time until you reach the point where you want to be, but it is difficult to know when exactly you get to that point. And so what we realized is that, with these kinds of migrations, it’s very difficult to know when you’re finished. For our migrations, our estimates have almost always been wrong, sometimes positively, but most of the time negatively. For example, for the JavaScript modules migration, we originally estimated four weeks of migration. This estimate was based on our preparations with scripts that would automatically transform our code to the new module format.
11:34 - Sadly, we got nowhere close to that estimate, and it took us seven months to complete the whole migration. Partly this was because JavaScript modules are a lot stricter and uncovered issues we were previously unaware of. So while we were able to find these issues, we were forced to apply additional fixes to keep DevTools running. And so ultimately, while you can prepare to the best of your ability, you never know what you’re running into, and so the phrase “you don’t know what you don’t know” is appropriate for any work in this area. Do not feel discouraged when an estimate you made turns out to be wrong in hindsight.
12:12 - It is incredibly difficult to make estimations that are completely correct. So for planning your changes, here are the key takeaways listed. Feel free to pause the video and take a screenshot if you prefer. After planning, the next step is to execute the plan. For that, Jack will now explain how we perform these kinds of migrations. - Thanks, Tim.
12:36 - Now that we’ve thought about some of the planning, let’s dive into some of the challenges and things we learned when putting those plans into action. The first thing we found really important is to prove your assumptions in an early piece of work. Throughout the migration, and however well you’ve planned it, you will have made some assumptions about how certain pieces are going to fit together. By finding a little piece of the code base that you can use to prove these, you’ll validate your approach early on or learn that you might need to make changes. So for the web components migration on DevTools, we picked the breadcrumbs element as the element to migrate.
13:05 - This element sits at the bottom of the Elements panel and it shows the current active element all the way up to the HTML element. This isn’t a particularly exciting element, and you’re probably not even gonna notice when we migrate it, but it was the right size for us to prove some assumptions that we made about how this new migration was going to fit in with the legacy system. It had enough complexity and some challenges around overflow and scroll to challenge us and figure out solutions, but it was also simple enough that it could be contained without growing into a much larger piece of work or spiraling into a huge change that would be really hard to land into our code base. So from this, what we’ve learned is to keep your changes small. A migration, by definition, will often touch the majority of a system, but landing all those changes in one go is really risky, and there’s a lot of changes to ship in one new version of an application.
13:51 - If you can break your changes up into small pieces, you’re going to reduce the risk and also make it easier for your colleagues to review those changes, because they will come piece by piece rather than one big overwhelming chunk of code. For the JavaScript modules migration that Tim mentioned earlier, there were 267 individual CLs. CL here is short for change list, and you can think of it like a GitHub pull request. This meant that we could review those changes piece by piece. It also meant that, if anything unexpected went wrong, we could just revert the small change we’d recently made rather than having to undo everything in one go.
14:23 - There is a little bit of extra work here, because you need to plan those changes upfront and make sure that no one conflicts with anyone else when making these small changes, but it will really pay off in terms of reducing the risk and keeping the code review manageable. And if your change does ship and cause problems, it’s actually a really good opportunity to learn about the migration. If it stays in the code base, your approach is validated and you can continue. However, if there’s any problems and you get it reverted, that’s a good opportunity to learn why it was reverted and figure out solutions for the next piece of work. Maybe you realize that you’re missing some test coverage in a part of your application, and you can add that coverage to make sure you catch problems like this in the future.
14:59 - In any migration, there will be situations where this happens. You can’t plan out everything and you can’t predict exactly what’s going to happen. So don’t worry if you have to undo some work, and see it as an opportunity to learn. We also found it really useful to gamify and track our progress over time. Migrations can be long, and it can often feel like a bit of a slog, particularly if you’re in the middle of a massive, months- or even years-long piece of work.
15:19 - If you can track it, you have a much better sense of where you are at and also much better momentum to keep the migration going. No migration is without its roadblocks. You will hit problems. Even if you’ve done the best planning in the world, there’ll be something unexpected thrown at you. It’s most likely that some work lands, then you have to undo it, then you learn from that, then there’s another unexpected blocker that causes you a few days’ worth of delay. There’s always gonna be road bumps along the way, but if you can track your progress, you’ll keep the momentum high and understand how far you’ve got to go on the migration. So for the TypeScript migration, we created charts to show how many lines of code we still had to do.
15:55 - That’s the dark blue line on this slide, and also, we were able to predict an end date, which is the light blue line. This let us visualize our progress, but also made us realize that the predicted end date was far further in the future than we needed it to be, but because we realized this, we were able to prioritize the work a bit more highly, get some more colleagues involved, and we were able to take the end date down from November 2021 to December 2020. We also got a much-needed boost of momentum because everyone was working really hard to bring that end date forward, and we’re all able now to look forward to that end date and keep working on the migration. Without putting that into a chart and having an estimate, none of that would have happened, and we’d probably still be on for finishing late into 2021. Another thing we found really valuable is to use linting to avoid regressions.
16:38 - If you’re putting in the effort of migrating, you want to ensure that at the same time, no one is landing code that makes more work for you to do in the migration later on. On a big team where some of you are migrating code and some of you are working on new features, your teammates are almost certainly going to regress your code back to its pre-migrated state, not on purpose, but just because there’s so much going on. It’s hard for them to keep track of everything. A good example of this is a small migration we did to move from using assert.equal in our tests to either assert.strictEqual, which uses three equals to compare the items, or assert.deepEqual,
17:10 - which compares the structure of objects or arrays. This was a pretty straightforward migration, but it would be really easy for someone to introduce another assert.equal call into the tests that we then need to migrate. You can easily imagine yourself typing assert., your editor suggests just equal, you accept that suggestion, and you move on, not realizing that you forgot to use either strictEqual or deepEqual.
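To see why this matters, here is the difference in behavior (chai-style assertions with illustrative values):

```javascript
assert.equal(1, '1');              // passes: loose (==) comparison
assert.strictEqual(1, '1');        // fails: strict (===) comparison
assert.deepEqual({a: 1}, {a: 1});  // passes: compares structure, not identity
```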
17:31 - We were able to add an ESLint rule that would actually highlight these flaws and fail the build. This meant that no one could introduce this accidentally, and it also means in code review, we don’t have to ask our colleagues to actually look for these problems because we know we have tooling that will do this for us. We’re a big team. We ship code every single day, and it would be very easy for things like this to slip through the net. If you can add ESLint or other tools to capture these problems, you’ll find that’s a really good way to offload work from your colleagues reviewing changes and just trust your tooling. If you do enforce migration with lint rules, make sure you communicate clearly to the team what’s going on and why those new rules are in place.
18:07 - There’s nothing more frustrating than having a change fail a build because of some ESLint rule that you don’t quite understand or you’re not quite sure why it’s been put in place. So what we do is we share documentation. We add READMEs to the code base that document the rules that we’re using and why we’re using them, so that when someone does have an issue and one of those rules fails in their build, they can understand why and fix the problem. Linting is particularly valuable when you want to make sure you catch things that sometimes will work and are very easy to slip in without anyone realizing. assert.equal is a good example of this. Most of the time, it will actually work fine in your tests, but there’s the odd time where you really need to make sure you’re using strictEqual or deepEqual.
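As a concrete sketch, a rule like the following one, using ESLint’s built-in no-restricted-properties rule, would fail the build on any assert.equal call (DevTools’ actual rule may differ):

```javascript
// .eslintrc.js (excerpt): ban assert.equal with a message that points
// people at the replacements, so the failure explains itself.
module.exports = {
  rules: {
    'no-restricted-properties': ['error', {
      object: 'assert',
      property: 'equal',
      message: 'Use assert.strictEqual or assert.deepEqual instead.',
    }],
  },
};
```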
18:45 - So by catching this with an ESLint rule, we stop any tests sneaking under the net that may cause us problems in the future. If you would like to see how we use ESLint and all the rules we apply, you can actually do this online via the link in the slide. Finally, I want to talk about dealing with the roadblocks that we talked about earlier and coming up with solutions for them. You will hit these problems. It doesn’t matter how well you’ve planned. There will always be unexpected problems, but what matters more is how you deal with them and which approach you take.
19:13 - If you hit an unexpected blocker during some work, you’ve got two ways that you can deal with that. You can use a quick fix or a little hack, if you like, to work around the problem or you can pause the work and investigate the root cause and plan a more robust solution. Now, in an ideal perfect world, we would always reach for the robust solution, but that might not always be the best thing to do. Any given solution, you have to weigh up the risk, the reward, and the effort. So if something’s blocking you, you have to weigh up how much risk there is involved.
19:41 - Is fixing that going to mean deep architectural changes across the whole code base? That might be so risky that it’s not worth it, and a quick fix that works around it for now might be the better approach. Similarly, there’s the reward. Is it going to be a lot of effort for a small fix that just unblocks you and maybe causes your colleagues problems? That doesn’t sound like it’s worth it, but if it’s going to unblock loads of issues that all your colleagues have faced, then that might be worth it. And along with the reward and the risk, you also have to weigh up the effort. Is it going to take you weeks and weeks to figure this out or can you do it in a couple of days? Depending on these three things and balancing them, you might decide that a quick fix is actually the better solution here, and it’s important to bear in mind that temporary workarounds are rarely temporary. Every developer has landed a change and added a little TODO or a little comment to come back and fix, but then something changes, different work comes up, priorities change, maybe there’s an urgent bug to be fixed, and that work gets left behind. This is no one’s fault. It’s just part of the work that we do.
20:39 - In the DevTools codebase, for example, there are 783 TODO comments. That’s a lot of work that developers intended to do but didn’t. Often, though, this isn’t a big problem. These haven’t been done because other bits of work that were more important have been done instead, but you should bear in mind that what you think might be a quick fix, only in the code base for a week or so, could actually stick around for months or even years. So consider the impact when you’re landing a quick fix and be prepared for it to sit around in the code base for a long period of time. To unblock your colleagues and yourself when working on the migration, it’s also a really good idea to keep documentation for the common issues that you might hit.
21:16 - On our TypeScript migration, there’s a few issues that crop up time and time again. We’ve created a document that people can refer to when they’re working on the TypeScript migration. This means they don’t have to ask for help, and it means that they can unblock themselves and use this as a handy reference guide. It’s also a really good way of sharing knowledge across the team. When anyone has a problem, they can add it to this document and then we all are aware of it if we hit it again.
21:38 - So here’s a slide of all the things we’ve talked about for making your changes, if you want to screenshot that or take notes of it. And now I’m going to hand over to Paul, who’s going to talk about maintaining your changes once the migration is finished. - Thanks, Jack. So it falls back to me to talk about maintaining your changes, and right out of the gate, we’ve got a takeaway for you: if it’s not tested, it’s broken. Tests are the table stakes for any project. Now, it might be fine to have no tests if, say, it’s a prototype or something, but DevTools isn’t a prototype. It’s a tool that people rely on every day to do their jobs, and even simple code needs tests. I’ll give you an example. Take this code from our code base. It looks fairly straightforward, this ensureEnabled function.
22:25 - If the thing is already enabled, we do an early return, and if it’s not, we call enable on agent, which is an internal variable, and then set the flag to true for next time. And we might test it like this, because we’ve decided we want to test things. So I’ve called it Foo for simplicity, and you can see we describe it and there’s an it block, and we create a new one, call ensureEnabled, and then we assert that enabled is true. But not so fast, because while testing is important, bad tests, if you have them, are even worse than no tests at all. And what do I mean? Well, it turns out that line that I just jumped over before, this agent.enable line, it turns out that, actually, that returns a promise.
23:10 - It looks like it’s synchronous, but is, in fact, asynchronous work. And so what we should have done is we should have written our function like this. It should be an async function, and we should await that agent.enable call, and correspondingly, our test should look like this. Now, if you have complex and complicated code, it really pays to go slowly with writing your tests and examining any assumptions that you’ve got. Let’s talk about our test types. All right.
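Before moving on to test types, here is the fixed function and test, reconstructed from Paul’s description (a sketch, not the exact DevTools code):

```javascript
class Foo {
  constructor(agent) {
    this.agent = agent;   // internal dependency whose enable() returns a promise
    this.enabled = false;
  }

  async ensureEnabled() {
    if (this.enabled) {
      return;             // early return: already enabled
    }
    await this.agent.enable();  // asynchronous work, so we must await it
    this.enabled = true;        // set the flag for next time
  }
}

// The test must be async too, and await the call before asserting:
describe('Foo', () => {
  it('enables the agent on first use', async () => {
    const foo = new Foo({enable: async () => {}});  // stub agent
    await foo.ensureEnabled();
    assert.isTrue(foo.enabled);
  });
});
```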
23:40 - So testing is important, but not all tests are right for all situations. So for us, business logic is something that we want to be able to import, side effect free, to call and confirm that it gives us the right kinds of outcome. This means that our code is predictable and it provides a lot of confidence straight out of the gate. Sometimes you need to architect very specifically for that case, for being able to unit test your logic. End-to-end testing in the DevTools front end is the place where we actually follow our user flows and we literally click the buttons that you click and we observe that we get the right outcomes.
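A minimal sketch of that style of test (the URL and selectors here are hypothetical; the real tests, as Paul explains next, use Puppeteer against DevTools’ hosted mode with their own helpers):

```javascript
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Load the app like any other web page...
  await page.goto('http://localhost:8000/inspector.html');
  // ...click the button a user would click...
  await page.click('#elements-tab');
  // ...and assert on the outcome, not the implementation.
  const title = await page.$eval('.panel-title', el => el.textContent);
  console.assert(title === 'Elements', 'expected the Elements panel to open');
  await browser.close();
})();
```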
24:18 - This protects us from testing the implementation, and instead, we care that the outcomes themselves are correct. So we do this with Puppeteer and a special form of the DevTools front end called hosted mode, which is where we can run the DevTools front end in a browser tab and treat it like any other web app. Now, be careful here, though, as end-to-end testing tests the whole system, so it’s inherently more likely to flake than, say, unit tests. All right, next up, slightly contentious one, boring code is the best code. A fair warning: What follows here is very subjective because I’m going to be showing code and I’m going to be talking about readability, and that’s something that we all have our own individual preferences on, but here it goes anyway.
25:01 - So here’s some code here, again from our code base, and I’m just going to point out a couple of things that might prevent someone from easily reading this, or at least these are things that they would have to keep in their head if they were going to review this code. And the more they have to do that, the more cognitive overhead there will be. So first, there’s this bind call here. That means a copy of the function is being made, and any references to this inside of that function will be set to the instance of the class. Then there’s the function itself, which is defined as an inner function, but because of variable hoisting, that function will be in scope when the bind call is executed. So there’s a few things there. Now, you can think of the takeaway here as: reduce everyone’s cognitive overhead.
25:47 - You might think of that as applying to your teammates in code review, or you can think of it as applying to you in a few weeks or months when you come back to the code, or, in my case, a few minutes. In any case, the longer you spend decoding the intention, the harder it will be for everyone. So what could we have done differently with that code? Well, let’s start here, at the code. We can see that we had a promise at the end of the dispatch event. So that bound function could be an async arrow function instead, which would still return a promise if that’s what we need, and let’s bring in that one line that was in the old bound function, and that’s it done. So, in this case, making it more readable involved relatively few changes.
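A rough reconstruction of that kind of change (the function and event names here are hypothetical, not the actual DevTools code):

```javascript
// Before: a named inner function plus .bind(this), in scope only via hoisting.
element.addEventListener('click', onClick.bind(this));
function onClick() {
  return this.dispatchUpdate();  // returns a promise; `this` only works because of the bind
}

// After: an async arrow function; `this` is lexical, and a promise
// is still returned, so the behavior is preserved with less to decode.
element.addEventListener('click', async () => this.dispatchUpdate());
```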
26:31 - Now I hasten to say this is not a case of code A is bad and code B is good. You likely have your own preferences about what is readable, but this is about tuning the code for team-wide readability. And often, code that doesn’t require specialist knowledge is far more readable. The next one, and in fact another contentious claim: the bigger the code base, the more necessary a type checker. Now, like we said earlier, DevTools has around 150,000 lines of JavaScript written by a range of developers over a long period of time, and there is no way you can keep all of that in your head.
27:08 - So naturally you’ll need as much assistance as you can get. Type checking can help here, and the first thing to say is that type checking is going to help with auto completion. As you type out the code, something like VS Code will pop up prompts that will help you navigate the code more completely. So it’s a good thing, but you have actually got a fair amount of options when it comes to how you do this. One option is to inline your types, like in the case of something like TypeScript.
27:37 - That happens to be our chosen path because we’ve found it to be expressive and that it provides high quality feedback. If you do something like TypeScript, though, a downside is that you’re committing to a build step, but the upside is that the build step will enforce your types, which can catch errors, and that’s, typically speaking, good. But another option is to use JSDoc, and you can still use TypeScript here to keep it in check. Here, you won’t need a build step, but you are reliant on author time checking. So something like the editor actually prompting the developer, and that might be just fine for your project. As I said, there are options.
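As a small sketch of that JSDoc option, adding a // @ts-check comment lets an editor (or tsc with the checkJs flag) type-check plain JavaScript without a build step:

```javascript
// @ts-check

/**
 * @param {string} name
 * @param {number} count
 * @returns {string}
 */
function repeatName(name, count) {
  return new Array(count).fill(name).join(', ');
}

repeatName('devtools', 2);       // OK
// repeatName('devtools', '2');  // flagged at author time: '2' is not a number
```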
28:14 - You could also use the Closure Compiler here too and run that as part of your build step; loads of options. Similar to our linting point earlier, though, having something warn you as you author the code, or as somebody else authors the code, and certainly before somebody else reviews the code, is better for everyone. And certainly for us on the DevTools team, type checking has helped us and our team enormously, and I would go so far as to say it’s actually essential. Next one: today’s modern code is tomorrow’s legacy code. Here’s a set of features that you might recognize from the years 2015 to 2017: arrow functions, classes, async-await, spread operator, and template strings. There are more.
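For illustration, several of those 2015 to 2017 features in one small snippet:

```javascript
class Greeter {              // classes
  constructor(...names) {    // rest parameters
    this.names = [...names]; // spread operator
  }
  async greet() {            // async-await
    const names = await Promise.resolve(this.names);
    return names.map(name => `Hello, ${name}!`);  // arrow function + template string
  }
}
```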
29:01 - You probably have your own set of things that you think of when you think of these kinds of JavaScript features that are more recent. However, JavaScript’s not done, and there are more things being added. So, for example: nullish coalescing, private methods, WeakRefs, finalizers, logical assignment operators. Again, the list goes on and on. So what do I mean then? Well, let’s come back to this code from earlier. This is something that Tim touched on when he was talking.
29:23 - When this code was written, arrow functions weren’t a thing, nor was async-await. So this would be pretty much the right way to write code like this, but things have moved on. So when you can update your code to use newer platform primitives, it is worth considering. If you’re able to use ASTs to take old patterns and upgrade them to newer patterns, then it will be all the better, and that’s certainly something we’ve done as we’ve been migrating the DevTools code. Of course, if the code isn’t broken and it has tests, you might not wish to touch it at all.
29:56 - In fact, if it has tests, you can update the syntax and easily confirm it still works as intended. So there’s something to think on. So there you go. Those are my list items for maintaining your changes, and you can see the list here if you wanna pause the video. Let me move on to the conclusion and finish up this talk. Firstly, there are always risks. Any change to code carries risks. We’re not in the business of removing the risk, but we are in the business of reducing it, and we can do that by using the things that we’ve talked about today. Secondly, I want to say that this is an art, not a science.
30:37 - The things that we’ve talked about today, some of them may work better or less well in various situations and with various teams. Knowing what and when to apply these things that we’ve discussed is part of the skill and something that simply comes through practice. So let me leave this list here. You may want to pause the video, maybe take a screenshot if you’ve found this talk helpful. And with that, I will say thank you to Tim, to Jack, and to you as well. Enjoy the rest of the conference. (Upbeat music)