
Annie Sullivan

Annie Sullivan time travels from Perl-based forums in the ’90s, to AI and gameplay programming, on through Google Docs, Core Web Vitals, and the new INP metric.

Show Notes: https://catchingup.dev/podcasts/annie-sullivan/

Catching Up With Web Performance

Transcript

Tanner
Who knows how this is gonna start? I don’t know.
Annie
Uh, “Hi and welcome to!”
Tanner
Hello, everybody, and welcome to Catching Up With Web Performance, a podcast about stories of people in web performance. Today my very special guest is none other than Annie Sullivan. Annie, welcome to the show! How’s it going?
Annie
Thanks, I’m doing pretty well. How are you?
Tanner
I’m hanging in there. You know, trying not to be nervous, talking to a celebrity and all. Do you, I’ve gotta ask, do you feel like a celebrity? Like now that you’ve done all these podcasts, and you come up on Google I/O, summit, and stuff like, is it?
Annie
I don’t really, like none of my neighbors are in tech, so nobody in my area knows who I am. Every once in a while my eight year old is like, “Mommy, are you famous?” And I’m like, “No, no, I’m not famous.” And he’s like, “I think you’re famous.”
Tanner
Tell him, yeah, he knows. You tell him the truth, you’re famous! Annie Sullivan, of Core Web Vitals fame. We have so many things to talk about, but I want to just, like, it’s stories. All around here it’s just, let’s chat. I want to hear about you. So what’s like, how did you get into web performance? Or just even web, like what’s your first web performance memory?
Annie
Yeah. So, I mean, the first website I worked on when I was in college, and high school, I was really, really into punk rock. And so I made a website for a local record label and a forum for the local hardcore scene. And so that was really my biggest, like, first experience making a production website and having people look at it and things like that.
Annie
And the first web performance experience I had actually started as a quality issue. The forum was World Wide Web Board. It was this Perl script from the nineties. It was like 1998-ish. Yeah, and so if your forum has a lot of controversy and people are posting a lot and they’re yelling at each other, World Wide Web Board has this problem with concurrency, where the way that it decides what is the ID for the next post: it reads a file, it pulls the next post ID out, then it writes the post into like “number.txt” and then it increments the number and it writes it back out to the text file.
Annie
And then you can imagine User A and User B have concurrent requests and they both get the same ID and they overwrite each other’s posts and…
Tanner
Oh god…
Annie
Everybody got really mad about this. So I tried to understand like, “Well, what would be the right way to write a message board?” And I did a bunch of research and it seemed like using a database for the messages would probably be better because it handled concurrency. So I rewrote the thing, instead of generating all the files out, I rewrote it as a script that would query the database and then add the next post to the database and then automatically generate the listing and the post. But it was 1998 and this was very slow because it was all dynamic.
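The ID collision described here can be reproduced in a few lines. This is a minimal sketch, not the original Perl script: the class FileCounterBoard, the file name next_id.txt, and the posts table are hypothetical names chosen for illustration, and the interleaving of User A and User B is simulated by hand rather than with real concurrent requests. The second half shows one way a database can hand out IDs, using SQLite's AUTOINCREMENT as a stand-in for whatever database was actually used.

```python
import sqlite3


class FileCounterBoard:
    """Simulates the read-ID / write-post / increment sequence on an in-memory 'disk'."""

    def __init__(self):
        self.files = {"next_id.txt": "1"}

    def read_next_id(self):
        # Step 1: read the counter file to learn the next post ID.
        return int(self.files["next_id.txt"])

    def write_post(self, post_id, body):
        # Step 2: write the post to "<id>.txt".
        self.files[f"{post_id}.txt"] = body
        # Step 3: increment the counter and write it back.
        self.files["next_id.txt"] = str(post_id + 1)


board = FileCounterBoard()
id_a = board.read_next_id()  # User A reads the counter
id_b = board.read_next_id()  # User B reads it before A has written it back
board.write_post(id_a, "post from A")
board.write_post(id_b, "post from B")  # same ID, so A's post is overwritten
surviving_post = board.files[f"{id_a}.txt"]

# With a database, the ID is assigned as part of the insert itself,
# so two inserts never receive the same ID.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)")
assigned_ids = []
for body in ("post from A", "post from B"):
    cursor = conn.execute("INSERT INTO posts (body) VALUES (?)", (body,))
    assigned_ids.append(cursor.lastrowid)
conn.commit()
```

In the file-based version both users end up with the same ID and only one post survives; in the database version each insert gets its own ID.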
Tanner
Right. Well, and too, I’m so curious, like, how did you… I’m picturing Annie in the library at college. Like, how did you research this? Because this, 1998, that’s pre-Google, right?
Annie
Yeah. Yeah, so how did I figure? I mean, the way that I learned about stuff was just, it was a long time ago. It is, like, it’s hard to emphasize how much it was a long time ago. So I really wanted to write, like, not just a web page in HTML but a dynamic web page.
Annie
And the first way that I found out about this was, like, I don’t remember the name of the guy, JMarshall.com? He had this website called, it had one “CGI Made Really Easy” and one “HTTP Made Really Easy,” and it talked about how these things work, not like how do you use some technology to do it. So he was like, “Basically, the users send a request and it has headers and then it has post form fields and then you run a program and you generate a response that has headers and HTML.” I’m like, “Great!”
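The request and response flow summarized here (form fields in, a program runs, headers plus HTML out) can be sketched as a single function. This is an illustration in Python rather than the C or Perl of the period; the function name handle_guestbook_post and the field names name and message are assumptions made up for the example.

```python
from html import escape
from urllib.parse import parse_qs


def handle_guestbook_post(request_body: str) -> str:
    """Turn URL-encoded POST form fields into a response made of headers and HTML."""
    fields = parse_qs(request_body)
    name = fields.get("name", ["anonymous"])[0]
    message = fields.get("message", [""])[0]

    # Escape user input so it is shown as text instead of being parsed as markup.
    html = (
        "<html><body>"
        f"<p><b>{escape(name)}</b> wrote: {escape(message)}</p>"
        "</body></html>"
    )
    headers = "Content-Type: text/html; charset=utf-8\r\n"
    # A blank line separates the headers from the body.
    return headers + "\r\n" + html


response = handle_guestbook_post("name=Annie&message=Hello+%3Cworld%3E")
```

The escaping step is the kind of security detail that is easy to miss in a first guest book.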
Annie
Okay, so it’s 1998 and I’m in college and I’m all like, “I will write a C program!” And that’s what I did, I wrote like a guest book in C. And so I was a little bit frustrated, though, because C programs they crash a lot. They’re just a little bit… I didn’t really understand the security implications of writing a very simple guest book in C. But it just was very clunky, there weren’t very good string libraries and things like that.
Annie
And so I had this internship, and this was just weird in itself. The internship was at a local company and one of the engineers had written a bunch of tests in Bash. And for readability reasons—and this is hilarious—he wanted me to convert them to Perl.
Tanner
Of course, why not!
Annie
Yeah, because it’s the nineties, right? Anything goes. And so I read “Perl in a Nutshell,” the O’Reilly book. I read about regular expressions, I learned Perl, and I was like, “Oh, okay. So we… You know, it makes more sense, that’s why people are learning Perl, is this web stuff. They want to do scripting in Perl.” And the way that it handled strings, I’m like, “Oh, right, this would be way easier than C++.” So I started reading about how to make websites in Perl, and that’s where I came up with the, like, people use databases. I took a class in databases in college.
Annie
So anyway, I’m working on this, so originally I had taken World Wide Web Board and then learned more Perl, figuring out how it works. And then I think at that point I had heard about this new thing called PHP, and so I used PHP and MySQL to make my new forum. And the problem was that because I dynamically generated the forum from the database every time it was pretty slow. I tried switching back and forth between, like the tree of messages, should that be recursive or should that be, you know, should I come up with an iterative method? It didn’t make a difference in timing. What made a difference was shortening the number of messages per page, and then more than anything else caching. I think that, you know, that’s pretty common knowledge now, that you don’t want to dynamically generate something every time if you can. And there’s lots of different layers of caching, right? You can cache on the client, you can cache on the server, etc.
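The two changes that helped, fewer messages per page and caching the generated output, can be sketched together. This is a hypothetical server-side illustration in Python, not the original PHP and MySQL code; the Forum class, its page_size parameter, and the render_count counter are invented for the example.

```python
class Forum:
    """Caches each rendered page and regenerates it only after a new post arrives."""

    def __init__(self, page_size=2):
        self.posts = []
        self.page_size = page_size  # fewer messages per page means less work per render
        self._page_cache = {}
        self.render_count = 0  # counts how often a page is actually regenerated

    def add_post(self, body):
        self.posts.append(body)
        # The simplest invalidation strategy: drop every cached page on a write.
        self._page_cache.clear()

    def render_page(self, page):
        if page not in self._page_cache:
            self.render_count += 1
            start = page * self.page_size
            visible = self.posts[start:start + self.page_size]
            self._page_cache[page] = "\n".join(f"<li>{post}</li>" for post in visible)
        return self._page_cache[page]


forum = Forum(page_size=2)
for body in ("first", "second", "third"):
    forum.add_post(body)

first_view = forum.render_page(0)   # generated from the post list
second_view = forum.render_page(0)  # served from the cache
renders_before_new_post = forum.render_count

forum.add_post("fourth")
forum.render_page(0)                # cache was cleared, so this regenerates
renders_after_new_post = forum.render_count
```

Clearing the whole cache on every write is deliberately crude; on a busy forum a finer-grained invalidation would keep more pages warm.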
Annie
So that was my very first experience with web performance. And you can see it’s the nineties because there wasn’t anything happening on the frontend. Like everything was on the backend and it was just about optimizing the server which…
Tanner
Which is nothing to speak of just, like, the network, right? Like trying to deal with, because what is it? We had AOL, I don’t know what, where you were. What were the connection speeds? Like was it 56 kilobits?
Annie
I got onto university ethernet, so I was a little bit faster, but a lot of people were on AOL.
Tanner
You got a T1 line! You got a T1!
Annie
Yeah, yeah. But you know, I mean that was another thing with shortening the HTML. We just didn’t have images and things like that because it just wasn’t feasible back then.
Annie
Yeah, so that was my first experience with web performance. I finished college and I took graphics and artificial intelligence and a lot of really interesting classes. And when I graduated, I decided to go make video games. So I spent the next five years, I did AI and gameplay programming at two small game studios. So totally different from the web, not even connected to the internet. Back in those days it was PlayStation 2 and the original Xbox, and you would write a game and it would ship on DVD and that would be done forever. Like whatever bits were on that DVD, they’re shipped.
Tanner
Which that’s already such a novel, like, going back to that and going, “Wait a minute. You had, you’d be done? I couldn’t just fix it?”
Annie
Yeah, yeah. It was, I don’t know. On the one hand, the crunch times were so intense and then the level of testing was so insane, right? Like, Sony did this thing called the “Technical Requirements Checklist,” and it was like 85 pages that your QA team had to go through and verify everything worked. And one of them was that the game had to run for 8 hours. And it’s, I’ll remind you, on 32 megabytes of memory. There is no more memory. And you just have to go forever.
Annie
And so that changed the way we did data structures. All of the data structure design was basically, like, you literally read the bytes off of the disk. And where on the disk are the bytes? They’re all in order and they’re on the outside because it spins faster. It’s very, very performance-focused work, but very low level. I mean, AI is higher level, obviously, than the graphics layer. But we’re still using our data structures and algorithms and thinking about, like, where is the memory and how much memory can we use.
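One way to picture "read the bytes off of the disk" is a fixed-size record layout that needs no parsing step. This is a loose sketch in Python, not console code: the record fields (entity id, x, y, kind) and the names ENTITY, pack_level, and load_level are assumptions made up for illustration.

```python
import struct

# Hypothetical record: entity id (uint16), x and y position (float32), kind (uint8).
# "<" selects little-endian with no padding, so every record is exactly 11 bytes.
ENTITY = struct.Struct("<HffB")


def pack_level(entities):
    """Lay the records out back to back, in the order they will be read."""
    return b"".join(ENTITY.pack(*entity) for entity in entities)


def load_level(blob):
    """Reinterpret the raw bytes as records; no text parsing, no per-field lookup."""
    return list(ENTITY.iter_unpack(blob))


level = [(1, 1.5, 2.0, 0), (2, -4.0, 8.25, 3)]
blob = pack_level(level)
loaded = load_level(blob)
```

Because every record has the same size, the memory needed for a level is known up front, which matters when the budget is fixed.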
Tanner
I mean, just the level of detail that you’ve described in those few short sentences. I feel like, and I don’t know, maybe I’m projecting here, but I feel like that puts you in such a better position. Or I don’t know, I guess what I’m trying to get at is, like, when you have… Performance is so much about that level of detail. To truly improve things, you have to get really down to the nuts and bolts of it. Like there are some easy wins, sure, but when you really want to make something better, you have to fully understand how it works and almost get a physical, yeah…
Annie
You have to really think about how it all works, right? Like I think with web pages, when I started to go back and work on the browser and look at, you know, how are network requests going through, how does the parser work? Like thinking about how does the browser see this web page is really, really different than you as a developer when you’re writing the web page. There’s a lot of layers of abstraction between them. And trying to figure out what’s important and understand, like, you know, are you making a blocking network request? It’s all those back-of-the-envelope calculations. Like a network request takes this much time, a disk read takes that much time. Knowing those things and how they relate to web programming can be really complicated because there’s a lot of layers of abstraction.
Annie
At one point I was speaking, I think it was at performance.now(), with Pat Meenan and he said, “People keep asking me how I do this.” He said, “Be curious!” Like he would just try and understand what’s happening. And I think it’s really, really good advice for learning about these things.
Tanner
So speaking of being curious, let’s go back here. You graduate college, you’ve been experiencing the web through making some hardcore community websites, or made for the record label, dealing with concurrency, switching out a file-based system to a database, and then you go into games.
Tanner
Now it’s totally different. The level of testing is wild because there’s no going back, once this hits the DVD, that’s it. And it’s entirely performance. Like, I need the game to run smoothly, people need to be able to play. Not only that, but we need to be able to play this for 8 hours straight. I’m picturing like when you go and drop your car off at the mechanic, and you wanna do performance testing on a car, you just put it up on the ramp and you let the wheels spin at 60 miles an hour. I’m picturing you guys doing that with a game, where you just set up the game, play and let it run for… Did you actually let the game run for 8 hours or how did you guys test that?
Annie
I mean, yeah, sometimes we’d have a game running for 8 hours. But so game programming, I may be making it sound more formal or mature than it is. We literally, like, we’re working 80 hours a week and you have two test crews and each one works 12 hour days, seven days a week. And so they’re literally playing the game overnight. Like you finish a build and they’ll play all night and they’ll bring you bugs and it’s…
Tanner
A person is playing the game the whole time.
Annie
Yes. And they’re doing crazy stuff, right? Like let’s see if we can… Can you jump through the floor? At any point in this level, can you jump through the floor? It sounds like a fun job, but video game testing is really painful.
Tanner
Yeah, and strenuous, just to go through for that long and see if you can keep breaking it. That’s wild.
Annie
Yeah, but I did want to point out it wasn’t all performance, or things like that. I mean, a lot of it was just really, really, really fun. Like taking the designers, the animators, and their ideas and their use cases and bringing it into reality. I wrote a visual scripting system so that the level designers could, without knowing any programming, make their own levels.
Annie
I have a video on my LinkedIn that a high school kid and an artist made together with no programming experience. They made this level, it had rising and lowering water levels, and the game didn’t have a concept of water. They made up the concept of water. They made switches that turn the water on and turned it off, and the level is just super duper cool, it goes up and down. And we were able to all kind of work together and figure out how to make that happen. And that was what I really loved about game programming.
Tanner
That’s fantastic.
Annie
I mostly focused on AI. I’ve written a bunch of pathfinding systems. But like just sitting with the animators and bringing a character to life, that’s kind of like the why, right? Like why does it have to be performant? So that we can kind of take these ideas and make them alive.
Tanner
Are you still doing any video game, yeah, I was going to say, are you doing any video games on the side?
Annie
I’m so burnt out. It’s been 17, almost 18 years since I left the video game industry and I am still burnt out.
Tanner
Oh my god.
Annie
It’s insane. Have you ever, like, really worked 100 hours a week? It’s a lot.
Tanner
No, I have not. I’m too, yeah, I’ve been blessed with a very easy life. I’m very privileged. I haven’t had to really fight for anything. And 100 hours a week? No. That’s crazy.
Annie
Yeah, it’s a lot. I mean, like I learned so much and did so much cool stuff, but I’m still tired when I think about it.
Tanner
But they are, there are fun memories to look back on and be like, “You know, that was a, I missed that. That was a good time.”
Annie
Yeah.
Tanner
Man. So when did you exit the game and what happened next? Like, did you go straight from gaming to Core Web Vitals? Or like what was in between here?
Annie
Yeah, so about 2005, I was getting really, really tired of working in the game industry, and pulling these crazy all-nighters all the time, and just not seeing a lot of career growth for what I personally really love to do. I think that there’s more, at least at the time, there was a lot more trajectory for engine programmers than people that were on the gameplay side. I don’t have any idea if that’s still true.
Annie
But anyway, I was starting to think about other jobs and I got contacted by a Google recruiter and somehow managed to make it through the interview. And I showed up on the first day and I had no idea, like, “What are they going to put me in?” Because it’s 2005, Google didn’t have anything really close to video game, especially like video game AI.
Annie
They asked if, so there’s this new browser called Firefox and we have this Google toolbar, might I be interested in making a Google toolbar for Firefox? And they sit me down and I’m in the cube area with, like, Ben Goodger and Darin Fisher and Brian Ryner and the folks from Mozilla, and they’re working on new browser ideas. At the same time as we’re making this Google toolbar for Firefox, and we’re trying to make it a Google toolbar, they’re also prototyping new browser features. The anti-phishing feature was prototyped through Google Toolbar for Firefox. We did some early prototyping on protocol handling, so for like Google Docs, right? Like if you click on a document, open it in Google Docs, and things like that.
Annie
So it was really, really fun to do that work on the toolbar and learn about browsers. Like I couldn’t get some XUL to work and I asked Ben Goodger, like, “What do I do?” And he shows me how to use the DOM inspector. So just kind of learning from people that were there. Like one time a guy came by with a CSS question and Ian Hickson, who also sat with us, just like opens up a whiteboard and explains why are CSS selectors like this? It was just really, really exciting and interesting to learn about browsers from first principles, and see what was happening, and all of the things that people were thinking about.
Tanner
Curious, do you have any, are there any of the first principles that you might recall, or like any whiteboard highlights? Because I feel like that’s a dramatic shift from game design, web browser. “What’s going on here? How does this work?” And then, okay, CSS selectors, why they work the way they work is one thing. Are there any other, I don’t know, surprises, revelations of, “Woah, this is how a browser works?” as you’re first getting into it. Do you recall any?
Annie
This one’s a little awkward. Like the biggest surprise for me? Alright, I’m gonna tell this story. Okay, so the biggest surprise for me was, I don’t know if you’ve ever looked at how parsers used to work back then? But like Internet Explorer, I mean, like, why isn’t it XHTML, right? Why isn’t it properly formed? Why is Internet Explorer and Gecko and WebKit, why are they all different? Like, how did they come to some of the similarities in like how they handle overlapping tags and things like that. And I was talking to… and he was like, “Yeah, so what happened was we were really trying, we worked at Netscape, we were really trying to have compatibility with Internet Explorer and we’d keep getting these bugs. My, my friend says he was looking at this site and it doesn’t work in Netscape,” and it was always a porn site. So like, essentially the web browser evolved to make sure that it was compatible on porn sites. We might want to cut this whole thing out.
Tanner
Oh man, that’s hilarious. That’s like the stories of how we got Blu-rays, right?
Annie
Yeah, I mean a lot of the early internet, technology, all that. But I mean, another thing I think that was really interesting is, what I was actually most surprised about, was how much I just didn’t get it. Right? Like people would be talking about the need for new APIs and new features in JavaScript, and I just didn’t kind of really get it. Like, “Why is everybody talking about bind and call and apply?” And how all that mess could be better.
Annie
Like as somebody who’s really more focused on C-type programming, I just kind of didn’t get it.
Tanner
Interesting. Was there ever a moment where you did get it?
Annie
So there was a couple things that happened. The smaller one was I used to, when I was doing interviews, I would ask people, because it was only 2005 at the time, I would ask people, “If you could go back to the early days of Netscape and Internet Explorer and the browser wars, and you could change something about CSS and HTML, what would you change?” And I asked it a bunch of times.
Annie
And then there was this one guy, and he was like, “Well, obviously I would have made it a binary format and I would have…” You know, he had all of these performance changes that would make the web way better. And he’s like, “So I’m really glad they didn’t ask me, because no one ever would have used it!” And I was just like, “Oh, that’s really true.” Right? And that’s true about things like PHP too, people use it because it’s really easy to spin up and get used to. And, like, just realizing how much value there is to that and seeing a little bit about why the world is the way it is.
Annie
And the second thing that happened, which is a much bigger thing for me in my career direction, was as we were deciding, “Okay, we’re going to make a browser, we’re going to have Chrome, it’s going to be this new thing,” I was thinking a lot about this and how I just didn’t understand. Like it seemed like the most important thing that Chrome team wanted to do was provide better APIs on the web, and I just, I felt like I just didn’t get it. Like exactly why?
Annie
And so at the same time that was happening, we acquired this little company called Writely, and we were going to launch Google Docs. So I decided to actually step away from the Chrome work and go work on Writely and try to understand this web thing. Because I’d always wanted to have something exist, like a document editor on the web.
Annie
So I joined Writely and we launched on docs.google.com. I learned a ton about security, performance, internationalization, accessibility, like everything you need to do to scale a website, and just all of the work that they were doing inside Google, like across the app suite. Like I got to see kind of how the Closure Compiler came to be, why the Closure library exists, how at the time we were building dynamic web applications, what the challenges were. And then I really, like really started to understand like, “Okay, yeah.” The web API surface, as of 2005, was definitely not good enough.
Tanner
Right. This is like once you’re in the heat of trying to make this thing, like here’s, “We want to make Google Docs. I need these things to work better.” And then it feels like maybe there wasn’t a particular moment, but that was the time when you said, “I get it.”
Annie
Yeah, yeah. It was just like a long journey of trying to do one thing after the other, and seeing how they all kind of intermingle. Like you want to make something accessible and then it also has to be a bidirectional UI, it has to go back and forth, like for smaller and larger languages, the size of the UI has to go back and forth. And then you want to pre-render it on the server so that it’s fast, and minimize round trips. And you want, you know, sprites on your images were a big thing back then. How do we make rounded corners?!
Tanner
That’s funny. So we went from college, developing games, went into working on browsers, specifically making a toolbar at Google for Firefox. Somewhere along the way, “Let’s make Chrome!” And then roughly the same time, or after, around then, “I’m going to join the Google Docs team because this looks like a fun project. Holy cow, we need better APIs because we need to do all these things and be able to support all of these different things.”
Tanner
What happens next? I mean, we have Google Docs. I know that was a success story, right?
Annie
I worked on Google Docs for a couple of years. I got to make one of the first really big offline applications with Google Gears and learned all about, like, the pros and cons of different API surfaces for offline applications, in particular.
Annie
And then they had a Google-wide performance effort to try and make every single app perform better, and I got to lead that for Google Docs. So that was really, really interesting. I got to work with Steve Souders and I got to learn about infrastructure across Google and help the input into like, “Well, how can we monitor and measure apps? What do we need in terms of infrastructure?” Got to give first input on the Navigation Timing API.
Annie
But yeah, just mostly made Docs a whole lot faster. We went from, so we would measure from the doc list to opening the doc, how long that took, and at the beginning it was 6 seconds and we got it down to 2 seconds at the, I think at the 90th percentile for end-to-end latency for real users in 2008. So that was pretty cool and really fun, I learned a lot. I got to give a little workshop at Velocity that year, or 2010, I got to give a workshop. And so that was all really fun.
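A 90th percentile like the one mentioned here is computed from many real-user timings. The sketch below uses the nearest-rank method, which is one common definition and not necessarily the one used at Google; the function name percentile_nearest_rank and the sample timings are made up for illustration and are not real Docs data.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the smallest sample such that at least pct percent of samples are <= it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical doc-open latencies in seconds, one per page load.
open_latencies = [1.2, 0.8, 6.0, 2.0, 1.5, 0.9, 1.1, 1.7, 1.3, 1.0]
p50 = percentile_nearest_rank(open_latencies, 50)
p90 = percentile_nearest_rank(open_latencies, 90)
```

Tracking a high percentile rather than the average keeps the slowest experiences visible, since a single 6-second load barely moves the median.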
Annie
I think the most exciting thing about Google Docs was actually seeing the usage go up and up and up. Seeing classrooms start using it, and kids start collaborating on their docs, and getting to talk to teachers about how they’re using it, and things like that, was really, really neat.
Tanner
Are there any highlight stories that come to mind? Of like, “Woah, this made such a difference in somebody’s life.”
Annie
I think it’s just all the little ones. Like my brother in law, he didn’t know what I was working on, and I told him and he was like, “Oh, well, I’ve had to travel for work and I’ve been helping my son with his college applications in Google Docs and we collaborate together.” Like all those little ones, right? That you made all these little things possible, that was just really cool.
Tanner
That’s crazy. Just like, I mean something like that, you know, you think of what it would have been like before. Like, “Hey, I’m gonna help my, you know, I’m gonna help him with his college application. Oh, it’s so much easier now because we both can access a website and just jump in and do it together.”
Annie
Yeah, and that’s exactly what I wanted to exist when I joined the project, so that was kind of neat.
Tanner
So we’re getting in here to Google Docs, and there’s a performance initiative at Google, and y’all are coming up with metrics and ways to look at this like, “You know what, how do we know that Google Docs is better? Well, how about we look at opening a document, and the time from the list to the time the document’s actually there? Let’s measure that.” We’ll have some JavaScript in the background, or I don’t actually know enough behind the scenes to how you collect the telemetry on that. But somewhere we’re like, we’re coming up with metrics and saying, “How do we know this is actually better?” Tracking real world usage, improving, seeing the percentiles come down and improve.
Tanner
And then what happens? Like do you keep doing Google Docs forever? Or like what comes after that?
Annie
I worked on Google Docs for about three years. And so the version I worked on was the very first version that came in with the Writely acquisition. Then they decided they were going to make a new version, the Kix editor, which is much better and more robust for a lot of reasons. But they decided to staff that in New York, and so at the time I was in Mountain View. And I had talked a lot to the team that worked on Google Search about frontend performance as we had this big company-wide initiative. And they were just doing some really, really, really cool stuff. They had, basically they would do a live experiment for every change that they were making, and they could quantify down to like five or fewer milliseconds what was the end user impact of this change. And I was really excited about that.
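Quantifying a change to within a few milliseconds comes down to comparing latency between users who got the change and users who did not. The sketch below is a generic difference-of-means calculation with a standard error, not a description of Google Search's actual experiment framework; the function name latency_delta_ms and the sample values are hypothetical.

```python
import math
from statistics import mean, variance


def latency_delta_ms(control, treatment):
    """Return (difference in mean latency, standard error of that difference)."""
    delta = mean(treatment) - mean(control)
    standard_error = math.sqrt(
        variance(control) / len(control) + variance(treatment) / len(treatment)
    )
    return delta, standard_error


# Hypothetical page latencies in milliseconds for each arm of the experiment.
control_ms = [100, 102, 98, 101, 99]
treatment_ms = [105, 107, 103, 106, 104]
delta, standard_error = latency_delta_ms(control_ms, treatment_ms)
```

The standard error shrinks as the number of samples grows, which is why small effects only become measurable at large traffic volumes.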
Annie
So I went and joined Google Search. And it was a really good time to join because they were really trying to move from ten blue links to something more interactive and dynamic. And so I had a lot of knowledge about the infrastructure that already existed at Google and how it might be applied, like the Closure Compiler and, you know, how the Closure library was designed, and what did and didn’t work well with that, and how people are using templates, things like that.
Tanner
You’ve been at, I mean, you’ve been at Google, like, how long has it been? You’ve been at Google 17 years?
Annie
We’re still back in 2010 at this point.
Tanner
Right. Somewhere along the, we’ve got a performance initiative, we’ve got Chrome. Here we are. There’s this, the 2010s, right?
Annie
Yeah so, I mean, I worked a lot on modernizing the Search frontend. Like how should it serve JavaScript? How can it break JavaScript up? How can it deal with the fact that maybe you have two different lazy-loaded modules and they each want a calendar, how do you load the calendar too and only once? Trying to design module loading, trying to figure out better templating systems, and dealing with some really, really interesting problems. Like you have an experiment framework with a thousand experiments that can be running at any one time. How do you serve code? How do you make it performant? How do you know if it’s running?
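The calendar problem (two lazily loaded modules share a dependency that should load only once) can be sketched with a loader that remembers what it has already loaded. This is an illustration in Python, not Google's module system; the ModuleLoader class and the module names flights, hotels, and calendar are invented for the example, and the sketch does not handle circular dependencies.

```python
class ModuleLoader:
    """Loads a module's dependencies first and never loads the same module twice."""

    def __init__(self, dependencies):
        self.dependencies = dependencies  # module name -> list of module names it needs
        self.loaded = {}
        self.load_log = []  # records the order in which modules were actually loaded

    def load(self, name):
        if name in self.loaded:
            return self.loaded[name]  # already loaded, reuse it
        for dependency in self.dependencies.get(name, []):
            self.load(dependency)
        self.load_log.append(name)
        self.loaded[name] = f"<module {name}>"  # stand-in for fetching and running code
        return self.loaded[name]


loader = ModuleLoader({
    "flights": ["calendar"],
    "hotels": ["calendar"],
    "calendar": [],
})
loader.load("flights")  # pulls in calendar first
loader.load("hotels")   # calendar is already loaded, so it is not loaded again
load_order = loader.load_log
```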
Annie
Because when I started on Search, they were really focused obviously on like backend monitoring, right? Like is the server doing, you know, is it running out of memory, are there 500s, etc. And I showed the SREs, like, “Look, you can screw up your CSS and make a blank page. We need to monitor this.” So like, I would do stuff like review all the postmortems and look for commonalities in frontend issues and, you know, try to look through how would we mitigate those? How would we detect those? Those types of things.
Tanner
That feels like such a paradigm shift.
Annie
Yeah, but it was really fun, right? Because I got to learn so much from, like, how Google Search does things that have actual data scientists working on, you know, like how do we decide if something’s better or worse? And learn all about how they were doing things. But also I was able to kind of bring some experience to them. And so it was just a really, really exciting time.
Tanner
Yeah. You have so many different perspectives that you can pull from, so much experience here. We’ve gone from just learning fundamentals of computer science, doing game design, optimizing backend applications, focusing on server, just looking at the browser itself, how it parses HTML, how it does its work, makes network requests. And then you’re on the frontend now too, looking at, “Oh, by the way, CSS can kill your site.” What? It feels like, after going through all these things, there’s another novel concept. Like that feels like… Did it feel as different as it sounds, focusing more on this frontend stuff?
Annie
I mean, it was a shift over years, right? And being able to really slowly move into it. I think especially with Google Docs, like, eventually we have millions and millions of users, but at the beginning it was just early adopters and we would hear things from them like, “This doesn’t work in Opera,” or whatever. And like, we could talk to them. And so you can learn kind of slowly. Like I make this mistake, it has this impact, and actually see what’s happening to users. And then by the time you get up to scale, you can think about, over the course of years, like how might we detect this problem is happening?
Tanner
Yeah. And then too, you’ve got experiments, you know, thousands. Like as soon as I heard that number, I went, “Woah, running a thousand?” Like in my day-to-day, I don’t normally have to think of thousands of experiments. Like we’re doing A/B split testing, right? Like here, I’ve got a couple of tests running, maybe I’ve got a multi-armed bandit, but it’s not that many arms! You know? Curious how you do all of that.
Tanner
You’ve mentioned a couple of things that you find interesting and exciting. Are there other things that you haven’t mentioned yet? That you like, “That’s really cool! I like doing this or solving these problems.”
Annie
I’m just basically interested in user problems. Like how, what are actual, how are people using the software and what is happening? How are the problems in their lives getting solved? I took like an interlude a few years ago and just took an entire summer at the U.S. Digital Service and I went to the Department of Energy. We did a discovery sprint and we interviewed like 40 people about this energy transmission modeling software. I just think it’s really fun to learn about new spaces and new problems.
Tanner
I mean, even to, because I feel like that, to me, I don’t know a ton of people, but I don’t know a lot of engineers who are excited about talking to the people who use their thing. I don’t know if that’s just because I have bad friends or… You get to actually talk to the people who used your thing. I don’t often, you know, at best I look at a graph on a dashboard. That feels really exciting that you actually get to hear first, or even sometimes watch, firsthand somebody using your thing.
Annie
Yeah, I always jump in when I have a chance to do that. Like when I worked in video games, I would be talking to the QA team and learning about things and, you know, sitting down and literally playing the game with people. And then when I worked on the Google Toolbar for Firefox, it had this little support forum, and I would just answer all the questions on the support forum.
Tanner
Nice.
Annie
Because, you know, I just felt like I knew the answers, right? It was pretty easy to help people. And similar for Google Docs, like I would read the support forums. And we had, I guess you would call, they changed the name of the people that lead user support a bunch of times, but I spent a lot of time with them, making sure I understood the user issues and what was happening on Google Docs. And we would sometimes get to do, like, forums with teachers and stuff like that, and hear how they’re using it in the classroom and, of course, I would jump at the chance to do that.
Annie
And then for Search, it’s just so big. Like you can’t spend a ton of time doing user interviews with Search. But like, you would just learn things from looking at the data, which I thought was so interesting. Like there was this big push at one point, for performance reasons, we wanted to make sure we’re always turning on gzip, and so they found all the conditions in which we don’t turn on gzip and they flipped them. This was still before HTTPS Everywhere. So they start to dig to the bottom and they find at one point that all of a city is turning off gzip and it appears to be like a proxy to censor the content. Just like seeing stuff about things that are happening in the world, I thought it was just really, really interesting.
Annie
But also, it depends. I didn’t get to talk much to Search customers, or like people that search the web, but I did spend a lot of time talking to, like… Basically, I worked on the infrastructure for Search frontend, right? And so I spent a lot of time, I actually went at the beginning of my time on Search, I went and interviewed a few dozen people that worked on Search features. And I just asked them, like, “What’s the biggest thing missing? What do you like? What do you think are problems?” And then I presented it to the team to try and help shape their focus, and that was super interesting.
Tanner
I have to ask then, and you’ve started to allude to it with the data, how do you do that for the whole web? Especially now, talking about Core Web Vitals, and I know we wanted to talk about INP, like, you’re coming up with metrics for everything. Like for everything, everyone. How do you… Is that different? How are you looking at this data now and trying to figure, “I need to talk to literally everyone”? Is it like that or is it different?
Annie
Yeah, I mean I do. I talk to a whole lot of people. So there’s quantitative analysis, but it veers into qualitative, right? So we look at the numbers. So for metrics, generally, we have a couple of different ideas, and then we try to figure out how could we compare the ideas. So at first we have a lot of ideas and we just take, like, some sort of test dataset.
Annie
I’ll give CLS as an example. We knew that we had some problems with the original CLS, how it was, it just recorded forever. Like it was a cumulative layout shift forever, so if your page is open a really long time, that wasn’t good. So we wanted to make a change to the metric, and there was lots of different ideas. Should we just use the average layout shift value? Average layout shift over time? Should we have some sort of windowing strategy? If we had a windowing strategy, would we look at the max window or an average window, or like a high percentile window, or how long would the window be? And so we thought about all these things and we came up with, like, I think about 150 permutations of different ideas.
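To make the candidate strategies in this answer concrete, here is a minimal Python sketch of three of them: the original run-forever cumulative sum, a plain average, and a maximum over session windows. The `LayoutShift` type, the function names, and the default gap and window-length values are assumptions chosen for illustration. They are not taken from the conversation and are not Chrome's implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LayoutShift:
    time_ms: float  # when the shift happened, relative to page start
    score: float    # layout shift score reported for that frame


def cumulative(shifts: List[LayoutShift]) -> float:
    """Original-style CLS: every shift counts for as long as the page is open."""
    return sum(s.score for s in shifts)


def average(shifts: List[LayoutShift]) -> float:
    """Candidate strategy: mean score per layout shift."""
    return cumulative(shifts) / len(shifts) if shifts else 0.0


def session_windows(shifts: List[LayoutShift],
                    max_gap_ms: float = 1000.0,
                    max_len_ms: float = 5000.0) -> List[List[LayoutShift]]:
    """Group shifts into windows. A window closes when the gap since the
    previous shift, or the window's total length, exceeds its limit.
    Both limits are illustrative assumptions."""
    windows: List[List[LayoutShift]] = []
    current: List[LayoutShift] = []
    for s in sorted(shifts, key=lambda x: x.time_ms):
        if current and (s.time_ms - current[-1].time_ms > max_gap_ms
                        or s.time_ms - current[0].time_ms > max_len_ms):
            windows.append(current)
            current = []
        current.append(s)
    if current:
        windows.append(current)
    return windows


def max_session_window(shifts: List[LayoutShift], **limits: float) -> float:
    """Candidate strategy: the worst single window instead of the lifetime sum."""
    return max((cumulative(w) for w in session_windows(shifts, **limits)),
               default=0.0)
```

Varying `max_gap_ms`, `max_len_ms`, and the statistic taken over the windows (max, average, a high percentile) is one way a list of permutations like the one described here could be generated.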
Annie
And then we had a bunch of partner feedback. So a bunch of different companies came and told Google like, “Hey, I don’t think this CLS works on this use case.” And then other times we had user feedback like, “I’m really glad somebody cares when I’m scrolling and an ad pops in and shifts down all the content.” So we had a bunch of feedback about what people did and didn’t think was fair about CLS.
Annie
And we recorded videos of, like, we’d just browse the web and have all these user scenarios. And then we had volunteers rank them, pairwise, which was the best and worst, so we had kind of like a ranked set of user scenarios. And then while we’re browsing the web and recording, we’re taking a Chrome trace. We have all the individual layout shifts, and we calculated all 150 different permutations, and we saw which ones came to the top. And we did that with a little bit of a qualitative eye, right? Like we didn’t just basically machine learn the permutations on this very small dataset. We said, “Oh, it looks like windowing in general comes better than average. This type of window, the session window works best. There’s a couple of different window lengths and a couple of different statistics on the window that appear to be best.”
Annie
So we took the top ones, strategy-wise, that floated to the top, and then we put them into Chrome. So on every page load, it’s calculating like four or five different strategies of CLS and regular old CLS. And then we basically just took all the web pages and we ranked them top to bottom. Like what strategy, what rank does this web page have for each strategy? And we looked at the ones that were the most different to start off with. And then we did a qualitative analysis, right? Like is CLS being fair to this page? Obviously something’s weird when it says it’s a really good web page by one strategy and a really bad web page by another strategy, so we tried to figure out where those gaps were. And then we kind of wrote up like, “Okay, here’s the differences between the strategies and this is why we decided to go with this one.”
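The comparison described in this answer, ranking every page under each strategy and then looking first at the pages whose ranks disagree most, can be sketched in a few lines of Python. The function names and the input shape (a dict from page to score, lower meaning better) are hypothetical. This illustrates the idea only and is not the analysis the Chrome team actually ran.

```python
from typing import Dict, List, Tuple


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Map each page to its rank under one strategy (0 = lowest, i.e. best, score).
    Pages with equal scores keep their insertion order."""
    ordered = sorted(scores, key=lambda page: scores[page])
    return {page: i for i, page in enumerate(ordered)}


def biggest_disagreements(scores_a: Dict[str, float],
                          scores_b: Dict[str, float],
                          top_n: int = 3) -> List[Tuple[str, int, int]]:
    """Return (page, rank_a, rank_b) for the pages whose rank differs most
    between two strategies. Only pages scored by both are compared."""
    common = [p for p in scores_a if p in scores_b]
    rank_a = ranks({p: scores_a[p] for p in common})
    rank_b = ranks({p: scores_b[p] for p in common})
    # Largest rank gap first; ties broken by page name so output is stable.
    by_gap = sorted(common, key=lambda p: (-abs(rank_a[p] - rank_b[p]), p))
    return [(p, rank_a[p], rank_b[p]) for p in by_gap[:top_n]]
```

The pages at the top of that list are the natural candidates for the manual "is this strategy being fair to this page?" review.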
Annie
So it’s kind of a big back and forth, where we’re like… And we did quantitative analysis too, right? Like, how much does each strategy correlate with time spent? That’s obviously something we want to change numerically about our strategy. But we also have a lot of qualitative data about, like, what are we telling this web page author when we say either we’re gonna use a strategy that says your page is doing great, or this other strategy that says your page is not good? There’s a lot of bouncing back and forth between, like, doing large scale analysis, starting small scale and very qualitative…
Tanner
Zoom out, zoom in.
Annie
And then, yeah, zooming out to the larger scale. I mean, when there’s a bug with LCP or something like that, we do the same thing, where we try to collect as much data as we can about the specific situation and try to understand, like, how many pages does it affect? What would be the impact of making a change? And again, we talk about running live experiments. You can run a live experiment and understand that, like, how many pages are going to change if we do that? So we have live experiments running and we can look at what pages change scores and things like that. So part of it is a quantitative decision. Like just overall, if you look at the percentage of pages passing in CrUX, what was the shift? Was that what we intended? And then, like, let’s go and look at some filmstrips of pages that changed. Do we agree on it?
Annie
But as far as, like, we talk to a lot of people, right? Google has partner teams that go out and talk to individual websites and they give us a ton of summarized feedback. Like lots of people had problems with carousels and CLS, so we took a look there, for example. Then we have our developer relations team, and they get a lot more feedback on the ground, a lot more, like, they see the sentiment on Twitter, that type of stuff. The tooling teams get a ton of feedback, where people are like, “I don’t like what Lighthouse said,” and, “Lighthouse didn’t really dictate it, it was me.”
Tanner
“Lighthouse lied to me!”
Annie
Yeah, so the Lighthouse team tells us how people are feeling when they make a new scoring change and stuff like that. We give talks. We have the feedback channel where people can rate feedback. Bugs come in. So we get a lot of developer feedback through a lot of different channels and we work really, really hard to listen.
Annie
It’s really interesting because the number one thing we care about is, like, what is the user experience? And that is always how we make the decision. But at the same time, like, you can’t convince websites to change if you don’t listen to them, and understand what their problems are, and try to figure out what the right level is to come up with a solution.
Annie
So I think it’s a lot of back and forth, right? Like if we look at a web page and the author is frustrated with the score they have and they say, “This score isn’t fair.” And then maybe my team says, “You should have written your web page differently.” That’s not actually our job, is to tell people how to write their web page. But we can talk to DevRel and say, “Is that a fair assessment?” And usually DevRel has a lot more nuance about, like, just in general, approaches to writing web pages and why somebody might have done something a certain way.
Tanner
Fairness. We could talk a whole ’nother hour about this notion of fairness and how you deal with that. That’s fascinating. We’re coming close up on time, I know we want to talk about INP, though. So tell me a bit more, what’s that been like? We got this new big metric coming out.
Annie
Yeah, so I think everybody in the web performance community really recognizes kind of the biggest gap with Core Web Vitals right now is JavaScript, right? Like people don’t think that FID is a strong enough signal. And this is something that was extremely controversial throughout the whole, like, the initial launch of the program. We weren’t sure if FID was gonna be a strong enough signal. And I think the one thing that happened is people actually really did improve FID. It was about 83% of sites passed on mobile and now it’s closer to 96%. But there still is obviously a lot of low hanging fruit. There’s still too much JavaScript running and it’s slowing things down.
Annie
And the big problem on our side is that Core Web Vitals are user experience metrics. We really want to make sure that when we’re telling websites, “You should do something,” it’s because it’s affecting users. So we don’t want to just measure the amount of JavaScript that’s running and tell people it’s too much. Partly because we really care about the user experience and because we think that’s, if you think about ways to measure fairly, like trying to get down to the user experience is one of the best ways to be fair. But also because we’re a web browser and we actually, like, run JavaScript for people and we want them to write JavaScript and be applications, right? Like we don’t just want to say, “Don’t run JavaScript,” because if that was the case, we should just turn it off!
Tanner
It almost feels like dealing with health and diet. Like not everybody needs to eat the same number of calories, right? But we want you to be healthy. How do we do that? This is a tricky problem we need to figure out.
Annie
It is. But we really want to make sure that we’re not, that we’re sending a message that, you know, writing applications on the web is a good thing. Running things that are rich in JavaScript is a good thing, you just have to write performant applications. And so that’s where Interaction to Next Paint comes in. It measures the time for each user interaction on the page. We measure the time from when, basically, the finger goes down for a keypress until the next frame is painted and/or when the finger comes up and the next frame is painted. If your page is fast, there’s actually a gap between the down and up and we don’t penalize the page if the user’s finger is slow. But basically it’s from the time the hardware, like the mouse or the touchscreen or the key, sends a signal until the frame is presented to the user.
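The measurement described in this answer can be sketched as follows: each input event is timed from the hardware timestamp to the moment the next frame is presented, and an interaction such as a tap takes the longer of its down and up parts, so the time the finger spends held down is not counted. The class and field names are hypothetical, and reporting the single worst interaction for a page is an assumption of this sketch; the conversation does not say how per-interaction values are combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InputEvent:
    hardware_ts_ms: float   # when the mouse, touchscreen, or key sent its signal
    presented_ts_ms: float  # when the next frame after handling it was shown

    @property
    def duration_ms(self) -> float:
        return self.presented_ts_ms - self.hardware_ts_ms


@dataclass
class Interaction:
    events: List[InputEvent]  # e.g. the down and up halves of one tap

    def latency_ms(self) -> float:
        # Each half is timed separately and the longer one is kept, so the
        # gap between down and up (a slow finger) is not charged to the page.
        return max(e.duration_ms for e in self.events)


def worst_interaction_ms(interactions: List[Interaction]) -> float:
    """Assumed page-level summary: the slowest interaction observed."""
    return max((i.latency_ms() for i in interactions), default=0.0)
```

For a tap whose down event is presented after 40 ms and whose up event, 900 ms later, is presented after 120 ms, the interaction latency under this sketch is 120 ms rather than roughly a second.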
Annie
And that’s only a little, it’s not all of user interactions on the web, right? There’s asynchronous work, it doesn’t count that. But it’s actually pretty stunning if you think about it. That time that we’re measuring, during that time, the browser is completely frozen. So if you had a button and when you tap the button it turns gray, like don’t tap the button again, the button will not turn gray until after that next frame is presented.
Annie
And what we’re seeing in the wild is massively large times. We know from user experience research dating back to 1968 that people expect some sort of visual feedback within 100 milliseconds, and they have for decades. And what we’re seeing on the web is at least 80% of people at least once a week are seeing a full second of freeze when they tap something or click something. 90% of user time on page is spent after the page load, and we’re not measuring that, right? It’s a little bit of a crazy, like a dire situation, because a lot of analytics providers actually aren’t able to measure after onload at all. So your CLS after load, your response to user input, eventually if we get single page app transitions, like, all of that goes unmeasured in many analytics.
Tanner
That’d be like having cars and we’re only measuring the startup time of the car but not the entire drive to wherever you’re going.
Annie
Yeah, and when I look at the values of INP, I can really see that people aren’t measuring this. Like that’s really exciting on one hand because there’s a lot of low hanging fruit. There’s already optimizations happening in Chrome that can do stuff like cut off 10% at the 90th percentile. That’s unheard of for more highly optimized areas of Chrome. So it’s kind of fun that we can find something that really impacts the user and make it faster. And I think that’s what I want everybody to be thinking about when we tell them, like, your INP, it’s probably not very good right now.
Tanner
Right. I mean, you’re like shining a light on something that we haven’t even really thought of, or at least if we’ve thought of it, we haven’t been able to see.
Annie
Yeah. And so I’m personally really excited to see how the community will respond to INP, how they’ll work on fixing things and improving things. I think one thing that I really like about Core Web Vitals is we try as best as we can not to dictate how people should make their web pages. I’ve found a lot of surprises, like interesting and fun stuff that I wouldn’t have guessed people are doing to make web pages better, and I think that’s really interesting and exciting. I’m seeing just a lot already, like in the framework world, of ideas coming around INP, so I can’t wait to see more of those.
Tanner
Do you have any examples that have come up recently? I’m just so curious.
Annie
I mean, with INP, I think I personally need to learn a lot about hydration and how these different things work. But you can see there’s just so much happening in that space right now. And then trying to understand it all and see, like, are we measuring it fully? What is the impact of the different things that people are doing on INP? Things like that, I’m really excited to see where it goes. I think we’re just in super early days.
Tanner
Man, I’m excited too. It’s exciting to know that the journey doesn’t end here. We’re just gonna keep going.
Annie
Yeah, yeah. I mean, we will keep going, right? We’ll land INP, then we’ll have better tooling for INP. Yoav, on my team, is working on more asynchronous task tracking, and understanding the root cause of things that happen on a web page at a much deeper level. So I think we have lots and lots and lots of stones to unturn still.
Tanner
Well, I could keep talking for hours. It’s been so great. We’ve covered so much ground, from game design to INP. It’s been unreal. Annie, thank you so much for coming on and taking the time.
Annie
Of course.
Tanner
I hope we get to do this again.
Annie
Yeah, definitely.
Tanner
Until then, I’ll catch you later.
Annie
Alright, see ya.
