!!Con West 2020 - Jordan Hendricks: Tex-Mex and malloc(3C): Restaurant Hosts and Memory Allocators!

Mar 20, 2020 18:52 · 1897 words · 9 minute read unhappy customers 56 upset watched

Okay. So yeah. This is Tex-Mex and malloc3C(). The problems that restaurant hosts and reallocators share. My name is Jordan. I’m a software engineer. I have a lot of interests in systems software in particular. So operating systems, distributed systems. But my very first job was actually working as a restaurant hostess in a Tex-Mex restaurant. Surprise. And this job taught me a lot, and I got really good at seating people from a wait to tables as people left and I think about this problem all the time, because it’s kind of hard.

01:00 - We don’t take reservations, and this restaurant was really busy. We would have 40 parties, 120 people, waiting for a table at a restaurant where we would only have 35 tables. Large parties did not fit in the restaurant, and it was really customer service oriented. We would not turn down a party regardless of size. You didn’t have to call ahead. Make our lives miserable, basically. And the way we would make tables for larger parties is by pushing smaller tables together.

01:34 - You might be able to see where this is going, if you’re familiar with memory allocation. Yeah. So we also don’t know how long people are gonna stay. We can come up with some metrics to predict it. But as a host, which is what my job was, I had to interface with customers, managers, servers. All these people who kind of had different views of the world. As you might see.

01:56 - So what does this have to do with malloc? Memory allocators, if you’re unfamiliar, take a region of memory and they split it up into pieces and give it to programs requesting some size. So you might request 10 bytes, but you get a 16 byte chunk. It’s not necessarily what you ask for. But that’s how it works internally. You can also do things like combine free memory regions together so you have bigger pieces. This is similar to the problem of taking small tables, pushing them together, et cetera. So this was the literal restaurant layout. This is how much I remember this, even though it’s more than 10 years ago. It doesn’t look like this anymore. I’ve been back. So the smaller ones are what we call two-tops, which means they have two seats. And the larger ones are four-tops, four seats. And everything in this restaurant is small. As you can see. And then we can fit big parties in these sections.

02:51 - So, like, maybe table 10 and 38 can seat 6 to 9 people, this section was for people who would come in with their whole soccer team. The way we would operate when the restaurant was full is like – say maybe tables 8 and 36 sat down at the same time. We know they’re probably gonna get up around the same time. And we can kind of observe that too, by whether they paid, et cetera. And so if table 36 gets up and we have a party of 6 coming up, we might hold table 36 so we can take 8 and 36 and push them together. That’s kind of the idea of how this would work. And this is a wait list. Just an example like – names, sizes, time spent waiting, and so on. So the rules… Rules? No one knows that reference? You guys seem young. We seat parties in order of the wait. It’s a queue. Generally. But people typically understand if they’re a larger party, smaller parties can go ahead of them, because they might be able to fit in different places. But in practice, we would actually announce the party’s table being ready over a loudspeaker.

03:58 - So everyone in the waiting area can hear when people are seated ahead of them, and will maybe notice. So I had to work with managers a lot. They would watch what we were doing. Keep tabs on us. Tell us what to do. And this is kind of like the way that they thought about the world, simplified. They liked to see tables seated if we were on a wait, because that means money. And if they don’t, then we’re losing money. And also, they have this special power, where they can give free guacamole to unhappy customers and make them happy. Which helps.

04:33 - Servers are another group of people I had to deal with. Who were making money by waiting tables. So they also like to have a full section. The tables they’re responsible for. But they also really like high turnover, because that means they’re getting multiple parties, more tips throughout the night. And they don’t like to be double or triple sat, which means sat with two or three tables at the same time, because there’s a lot of work up front when you first get seated, like taking drink orders, saying hello, all those things. This means they really don’t like big tops, because we have to hold their tables and that might mean they’re double or triple sat when we separate those tables later in the night. My incentives, though, as a host, are a little bit different. I’m dealing with customers. And I like them to be happy. I like to not have them yell or be upset.

05:16 - They generally like to see the wait seated in order. But they don’t have the full context. So it’s hard for them to tell if maybe a bunch of parties of two get sat ahead of them – they don’t realize that those are only for tables of two and we can’t push them together and give them that table or something like that. And I don’t have free guacamole, which is a shame. So now we’re getting to the good stuff. Allocation strategies and the properties of the system. So as seated parties leave, we have free tables.

05:44 - How do we allocate those free tables to parties on the wait, and how can we measure that effectiveness? One of the properties you might look at is throughput. Here the number of people served per unit of time. Maybe 400 people during dinner. This can have direct impact on sales for the day, which affects manager pay, because they’re compensated in part for sales and tipped wages for servers, who also get wages based on sales. Utilization. Another memory thing. Percentage of tables seated at a given time. So maybe for the average minute during dinner, 95% of tables are occupied.

06:19 - Lower utilization can probably mean that we’re holding tables for big tops if we’re on a wait, and that could have impact on sales, but also large parties spend more money. They also have to pay an extra fee, because they’re a big group. So maybe that kind of works out. The next is time to leave. Which is how long it takes for a table to leave. So… Expire. Whatever. Or table turnover may be a more reasonable thing to call it. Usually smaller parties leave more quickly. Low TTL is probably higher throughput, at least for that table, and that’s gonna be higher sales. This is a metric that… I don’t know if it’s used anywhere else. I kind of just came up with it for this talk. FIFO fidelity, first in, first out fidelity, as in: How close are we to seating the wait in order? 65% FIFO fidelity might mean that 35% of parties were seated ahead of their spot. And the impact of this is a little bit more indirect, because it’s about customer perception. So that might make my job harder and it might worsen how much customers come back, if they feel like we’re unfair or uneffective. Internal fragmentation. The seats used within a single table.

07:33 - So party of two seated at a table of four is wasting two seats, and this is not necessarily like a bad thing if we only have parties of two. That’s probably the most efficient way to use it. But it was usually better to minimize this. Maybe hold this table for a party of four that’s a little further down and seat this table of two at one that’s about to get up or something like that. External fragmentation is the number of tables that we cannot use for big tops, because the seating times don’t line up well.

08:00 - If table nine has paid and they’re about to leave but 37 just sat down, we’re probably not gonna use that for a big top, because we would have to hold that until the party ordered and ate and left. So this impacts FIFO fidelity, because it makes it harder to seat big tops in particular as the number of them on the wait increases and we have fewer options to put them at. So putting this all together, we know the incentives of each group of people and we can probably guess what the allocation strategy will be. How will this impact the system? Managers want to optimize for high utilization. Seeing lots of tables seated, and high throughput, high sales.

08:40 - And this ends up looking a little bit closer to a first fit strategy. So as soon as there’s a table available for a party waiting, the first one, seat tables as soon as they’re free – but this can lead to higher fragmentation and lower FIFO fidelity. Servers look pretty similar. They like low TTL, high throughput, high utilization, also closer to a first fit and want their tables seated as soon as they’re available, higher fragmentation and lower fidelity. Hosts, on the other hand, are way on the FIFO fidelity side and lower external fragmentation, so we have spots for big tops. We treat them as special. And this ends up looking more like a slab allocation memory strategy.

09:27 - So we hold those tables aggressively, really reserve them for fixed sized parties like 9 to 10 or whatever. And we also double and triple seat to make that happen in the future. So in an hour, they’ll all get up – all these three tables will get up at the same time and we can use it for a 15-top. And that can lead to lower utilization. Maybe lower throughput. It’s all really dependent on what the wait looks like. And customers don’t get to seat people. But they might have a preference on one of these, depending on their party size.

09:57 - So if you’re a big party, you’re definitely gonna benefit from slab, because you’ll probably be seated in about the order that you came in, which is great. Come in with 12 people and get to be sat within an hour. That’s great. But for smaller parties, you’re gonna benefit more from a first fit strategy, because you’ll probably jump in the queue, since you can fit many more tables than big parties. So thank you! This is a good paper about slab allocation by Jeff Bonwick, and a talk from Ryan Zezeski from Papers We Love. I watched it again before this talk to make sure I remember the details. I work at Fastly.

10:29 - That’s the hiring thing, if you’re curious. .