Machine Learning Projects Against COVID-19
Aug 28, 2020 03:31 · 6822 words · 33 minute read
DARIO GIL: Today, I’m very, very pleased to welcome Professor Yoshua Bengio to join us. Yoshua is one of the world’s leading experts in artificial intelligence. He’s a pioneer in deep learning, specifically the rebirth of neural networks. Since 1993, he has been a professor at the University of Montreal in Canada, and he is the founder and scientific director of the Mila-Quebec Artificial Intelligence Institute, which happens to be the world’s largest university-based research group in deep learning. In 2018, Yoshua had more citations than any other computer scientist in the world thanks to his many, many high-impact papers, and that was the year he earned the prestigious Killam Prize.
00:44 - This prize is given to distinguished Canadian scholars who have shown continuous excellence and made a significant impact in their field. Of course, Yoshua is also the recipient of the Turing Award, which he received jointly with Geoff Hinton and Yann LeCun. The award honors conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing and of modern AI. Yoshua is a fellow of both the Royal Society of London and the Royal Society of Canada. What is also very, very special about Yoshua is that he's deeply concerned about the future and the social impact of AI.
01:20 - Over the years, Yoshua has actively contributed to the Montreal Declaration for the Responsible Development of Artificial Intelligence. He supports many AI ethics initiatives and guidelines and he tries to raise awareness of global issues, including the environment, climate change, and diversity and inclusion. Yoshua, as many of you know, is no stranger to IBM Research. We have enjoyed his participation in our annual AI Research Week events. And many of us have been privileged to collaborate with him and his team over the past several years as part of the AI Horizons Network.
01:53 - His research is inspiring and impactful with contributions spanning technical machine learning solutions and high-level forward-looking proposals about human and machine decision-making. Today, Yoshua is going to describe his work on machine learning projects against COVID-19. That's a topic that, as you all know, here in IBM Research, we are also very passionate about. He's going to highlight several projects that his team has been working on. One, for example, is the use of AI to accelerate the discovery of antiviral drugs; another extends cell phone-based contact tracing methods to achieve earlier warning signals of the spread of COVID-19; and more.
02:30 - I would also like to introduce Francesca Rossi, IBM fellow and IBM's AI Ethics Global Leader. Francesca will be moderating the question and answer period at the end of Yoshua's seminar. Please use the Q&A panel on your console to submit any questions that you have throughout today's session. Yoshua, again, welcome. We're really excited to hear about your latest work. >> YOSHUA BENGIO: Thank you. Before I start, I want to go back in time, to what seems like an eternity ago, the end of February and beginning of March, when we started to realize that something bad was coming at us.
03:14 - By one month after that, I think a lot of scientists around the world were struggling, asking themselves what they could do with their expertise to help the fight against this pandemic, and so did we at Mila and many of our collaborators around the world. I must say that through all of these months, I've been really thrilled and my heart has been warmed by the enthusiasm of researchers who were ready to let go of the usual ego issues they grapple with in their research, like who is going to be the first author or the second author or the last author, who is going to get the biggest grant, or who is going to get cited more. These petty competitions that we have with each other, I think, pale in front of the kind of challenges that face humanity. Right now, we are talking about COVID-19, but I'm hoping that that spirit of collaboration between companies, between research centers, between scientists is in some part going to continue after this thing goes away, because we have a lot of other global challenges, including climate change, which is the bigger wave. It's not the second wave. It's the super big wave that's coming very slowly but surely at us.
04:51 - That was a little intro to tell you a little bit about my spirit. Now I’m going to tell you mostly about two projects that I’ve been involved with at Mila, as Dario was saying, one really regarding new drugs and the other regarding how to use phones to help fight the disease. Before I do that, let me say a few words about responsibility since, again, Dario nicely introduced me with a concern for that issue. It used to be the case not so long ago when I was a grad student and then a professor for many years that I didn’t really think about the social impact of my work. It was just not a question, right, because, first of all, our work was not really used that much outside of universities, so we could just live in our ivory tower and not care too much and just focus on our math and our algorithms and our technical questions.
05:56 - But the world is different now and we really, really have to think about the impact of our work. That means we have a new responsibility. That means we can't just do our research, or, if we are engineers, work on products, without thinking about questions that we're not used to. That means we have to educate ourselves. There is zero formal education in our current universities, especially in computer science, to prepare us for what it means in terms of ethics, in terms of society, in terms of democracy, in terms of privacy to have machine learning applications in the world. The thing that actually I'm more scared about in the longer run, not in the next couple of years, is how what we're doing with machine learning is building very powerful tools. These powerful tools obviously could be used for good, but they could also be misused. And so there is this kind of wisdom race.
07:03 - We have to become collectively and individually wiser in the way that we organize our societies, including how we use technology. We have, I think, become wiser. I mean, we have some setbacks I won't talk about. But not fast enough. Technology is getting more powerful every year at a rate which I think outpaces our current ability to improve our collective and individual wisdom. That's the thing I'm most afraid of. It's a little bit like if we were to build tools that could be used to build super powerful bombs, but we allow children to play with those tools, which can become weapons and can destroy each other and potentially the planet. That's why I also care a lot about not just thinking about our responsibility in a negative way, but also how we could use our expertise for goals that are not just profit-driven but are focused on what technology can bring that's best for humanity - AI for social good, or whatever name you'd like for this.
08:26 - The projects I’m going to tell you about basically fit these kinds of things. Whether it has to do with healthcare, education, the environment, humanitarian applications, and so on, I think these are good examples. Okay. Now let me talk about COVID-19 antivirals. There are a lot of research groups around the world who are struggling to find vaccines and to find antivirals. These are different things. The vaccines are the ultimate weapon, but they might take a lot of time or maybe we won’t even find good vaccines. Think about HIV.
09:06 - We haven’t found any vaccine for that yet. A shorter-term goal is to build treatments, most commonly through new drugs or repurposing existing drugs. The first thing you can do is take one existing drug that was meant for something else and then realize that it could be used against the virus. The good thing is there are not that many existing drugs, a few thousand, and so you can actually use high throughput assays to test every single existing drug. So that’s good. But now if you go from one drug, maybe we don’t find the answer there, and it doesn’t look like we found that, the next thing up if you want in terms of complexity and power is two drugs or three drugs.
10:05 - We can take existing drugs, so we know their toxicity - we know what harm they could do and the fact that they don't harm too much - and we're looking for a combination which together somehow makes a difference. We already know, for example, with HIV that where no single drug might cure a disease, sometimes two or three together can work a miracle. And so there are a number of projects going in this direction. I'm involved with one of them. In terms of machine learning, what this involves is putting together a kind of knowledge graph of all the information we have about all the existing drugs and how these drugs are related to proteins. Typically, these small molecule drugs target a particular protein.
11:00 - In a sense, the drug is typically going to prevent the operation of the protein or potentially enhance it, but usually prevent it. And so there is a fairly clear relationship between drugs and proteins, but then those proteins can have an impact on other proteins. The cells are complicated. We don't fully understand them. There is a complicated set of interactions between different proteins. If each drug is somehow doing something to one protein, and the proteins interact with each other, and you want to have multiple drugs, then understanding that network of relationships, at least at a statistical level, could really help us guess good new combinations of drugs. And, of course, in addition, you can collect data.
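To make the knowledge-graph idea concrete, here is a minimal sketch of how drugs, their protein targets, and protein-protein interactions might be represented, and how closeness in the interaction network could crudely score a drug pair. All names are hypothetical and the scoring heuristic is purely illustrative, not the method the project actually uses.

```python
import networkx as nx

# Minimal sketch of a drug-protein knowledge graph (all names hypothetical).
# Edges carry a relation type: a drug "targets" a protein, and proteins
# "interact" with each other inside the cell.
kg = nx.Graph()
kg.add_edge("drug_A", "protein_1", relation="targets")
kg.add_edge("drug_B", "protein_2", relation="targets")
kg.add_edge("protein_1", "protein_3", relation="interacts")
kg.add_edge("protein_2", "protein_3", relation="interacts")

# A crude heuristic for proposing drug pairs: two drugs whose targets sit
# close together in the protein-protein interaction network may act on the
# same pathway, so they might be candidates for a combination.
def combination_score(drug_a: str, drug_b: str) -> float:
    targets_a = [n for n in kg.neighbors(drug_a) if n.startswith("protein")]
    targets_b = [n for n in kg.neighbors(drug_b) if n.startswith("protein")]
    dists = [
        nx.shortest_path_length(kg, s, t)
        for s in targets_a for t in targets_b
        if nx.has_path(kg, s, t)
    ]
    return 1.0 / (1.0 + min(dists)) if dists else 0.0

print(combination_score("drug_A", "drug_B"))  # higher = closer targets
```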
11:47 - Once you have candidates - this is going to be the general strategy, right? You have some sort of machine learning method to propose candidates and then you're going to test those candidates. You can test them first, as I said, in assays where you can use robots, for example, to test hundreds or potentially even thousands of new drugs or drug combinations and see if they bind to the target protein or how they affect some cells. And once you start doing this, you get data that you can incorporate into your set of data points for your machine learning, and then you're going to iterate. You're going to use that data to come up with new candidates, the new candidates are going to be tested by the biologists, and eventually some of those candidates might actually turn out to be reasonable. That's just the beginning of a long pipeline of evaluating the drugs, ultimately leading to clinical tests, which are going to be, of course, slower and more expensive.
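The shape of that propose-test-retrain iteration can be sketched in a few lines. Everything here is a toy stand-in for the pipeline described above: "molecules" are integers, the "assay" is a hidden noisy function, and "retraining" just indexes measurements by a crude feature. The point is the structure of the loop, not the components.

```python
import random

def run_assay(mol):
    """Stand-in for a slow wet-lab measurement (lower = better binding)."""
    return (mol % 97) / 97.0 + random.gauss(0, 0.05)

def propose_candidates(n):
    """Stand-in for the generative / RL proposal step."""
    return [random.randrange(1_000_000) for _ in range(n)]

def select_batch(model, candidates, k):
    """Rank by the model's prediction; unseen molecules get a neutral score.
    A real system would use a trained predictor here."""
    return sorted(candidates, key=lambda m: model.get(m % 97, 0.5))[:k]

def retrain(model, results):
    """Stand-in: index measurements by a crude molecular 'feature' (mol % 97)
    so later rounds can generalize to similar molecules."""
    for mol, y in results:
        model[mol % 97] = y
    return model

model, dataset = {}, []
for round_idx in range(5):
    batch = select_batch(model, propose_candidates(1000), k=20)
    results = [(mol, run_assay(mol)) for mol in batch]   # slow, expensive step
    dataset.extend(results)
    model = retrain(model, results)

print("best molecule found:", min(dataset, key=lambda r: r[1]))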
12:51 - And so there is this whole funnel of selection. Machine learning now is just acting at the top of that funnel to select from a set of candidates that we don't have time to actually test chemically or biologically. The other approach besides combining existing drugs is to create new drugs, new molecules. Now the advantage is we are searching in this huge space of something like 10^60 potential drug-like molecules. The problem, of course, is that it's difficult to search in this huge set. First of all, we can use physical modeling methods that are in silico.
13:38 - We can use machine learning methods to approximate these physical models so that we can do, say, 100 times more evaluations and get an approximate evaluation, and then we can use things like reinforcement learning, active learning, and so on in order to search the space more efficiently. That's the story I'm going to focus on. First of all, just so you get a sense of how big the problem is, or the potential advantage: most small molecules have never even been thought of by a chemist or a human being, much less evaluated for real. There is a huge potential for discovering new drugs, whether it's for COVID-19 or for other things, if we can develop better tools for searching in that space. That's interesting. Now, the potential for using machine learning in pharmacology and drug discovery - I think we're just seeing the tip of the iceberg right now. One of the big issues with the current approach is that it takes a lot of time to discover a reasonably interesting candidate or a lead, anywhere from three to 20 years, and that's one of the reasons why typical drug development is so expensive.
15:06 - In the case of COVID-19, it's clear we can't afford to wait that long. And so it's really, really worth it to invest in methods that have the potential to reduce that search to something like six months. Now, we're not sure that we're going to be able to do that, but it's a bet that's really worth making given that we are so short on time to find a cure. Okay. Now I'm going to get a bit more technical. There is something really interesting about searching in that space of molecules that has to do with the way that we're going to evaluate new candidates.
15:46 - Naively, you could think, well, I propose a candidate and then I have some oracle which tells me how good it is. We want to estimate the binding affinity between the candidate drug and the target protein. If the world was as simple as that, it would be nice, but, actually, the world is more complicated. If you want to actually get the answer how good is this binding, you’re going to need to do an actual chemical experiment, and that’s going to take a lot of time compared to what you could do with in silico experiments. In silico experiments, there are many things you could do at different trade-offs between precision or fidelity and computational time.
16:39 - We enter a really exciting, I think, research area for machine learning, which is how we trade off computation against the quality of a solution. In our case, we are searching for good candidates, right? Each time I take a decision, like calling a particular oracle - that oracle is not perfect. It's going to be an approximation. If I use FEP (free energy perturbation) calculations, I can have a very good oracle, but it's super expensive. It might take hours, or minutes, to get an answer, depending on the kind of calculations you do. You could also use a docking calculation, which is much cheaper than FEP but also has less precision.
17:24 - Or you could use a neural net which has been trained to approximate either at a docking or the FEP or a combination of both. Now what happens if you use an oracle that has less precision is that you’re going to need to look at many more candidates than if you have one that has more precision. This is what this picture is saying. If you just do like random search and you evaluate candidates using different kinds of oracles, the blue would be like the true one, the green is the FEP, and the orange is docking, this is what you should expect in terms of how many candidates you have to look at on the X axis versus how well you are doing. You want to minimize this binding affinity. You can see that the slope depends on the precision of your oracle. Anyway.
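Here is a toy illustration of that fidelity/cost trade-off, with one naive screening policy: spend most of the budget on the cheapest oracle and confirm only a shortlist with the expensive one. The costs and noise levels are made-up numbers, not actual FEP or docking runtimes, and the policy is just one possibility, not the one the talk proposes.

```python
import random

def true_affinity(mol):
    """Hidden ground truth the oracles approximate (lower = better)."""
    return (mol % 101) / 101.0

ORACLES = {
    # name: (cost in arbitrary compute units, noise std of the estimate)
    "fep":     (1000.0, 0.02),   # expensive, precise
    "docking": (10.0,   0.15),   # cheaper, rougher
    "neural":  (0.1,    0.25),   # nearly free, roughest
}

def query(oracle, mol):
    cost, noise = ORACLES[oracle]
    return true_affinity(mol) + random.gauss(0, noise), cost

def screen(mols, budget):
    """Cheap-first screening: score everything with the rough oracle,
    then confirm the best few candidates with the precise one."""
    spent, scored = 0.0, []
    for mol in mols:
        est, cost = query("neural", mol)
        spent += cost
        scored.append((est, mol))
    shortlist = [m for _, m in sorted(scored)[:20]]
    confirmed = []
    for mol in shortlist:
        if spent + ORACLES["fep"][0] > budget:
            break  # out of compute budget
        est, cost = query("fep", mol)
        spent += cost
        confirmed.append((est, mol))
    return sorted(confirmed)[:5]

print(screen(range(10_000), budget=6000))
```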
18:17 - This is raising a lot of interesting questions about deciding at any point in time which oracle I should be calling, given the information I have, and how I should organize this search to make that slope better. We have developed a research program called LambdaZero, whose starting point is MuZero, the reinforcement learning approach that was developed for computer Go. It starts by running about 200 million simulations of physical docking and then using those as training examples. It's interesting that because we're using these simulations to generate data, this data is kind of low quality, right? It's a bit low precision, but it's a lot of data. You might think that when you do drug discovery, you're going to have very little data, because the actual number of candidates you can evaluate for real with chemists and biologists is going to be fairly small, but then you have this huge amount of low-precision data where the target is not perfect. This is also an interesting challenge. We have these two kinds of data.
19:40 - In fact, you can have a whole spectrum of different types of data with different precision. How can you take advantage of that? What's interesting is that if you use some sort of reinforcement learning, instead of just randomly guessing candidates and testing them, you can really improve. Not only can you get an advantage by using a more accurate prediction of what the success of a particular molecule would be, but if you search in that space using better machine learning methods - in particular here, reinforcement learning and active learning methods - you can also, for the same amount of computation, discover molecules that are substantially better in terms of energy. This shows where we currently are with one of the elements in the system, the approximate reward function for the reinforcement learning, which is just a neural net that predicts the outcome of the docking. What's interesting is we can get a fairly good approximation.
20:45 - On the X axis here is the real docking output and on the Y axis is the predicted output from the neural net. It tracks the identity line, but there is uncertainty and noise around it. And we can do this 100 times faster than physical docking. What we want, of course, is to be able to use these search methods in order to virtually scan through a space of molecules which is much larger than would ever be possible by enumerating molecules and then evaluating them separately. One of the components that's very important in this kind of research is synthesizability.
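The regression setup behind that docking proxy can be sketched as follows. The real system uses a neural net over molecular graphs; here, as an illustration only, "fingerprints" are random features, the docking score is a synthetic function, and a random forest stands in for the network.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Sketch of training a fast proxy for docking scores. Features and targets
# are synthetic stand-ins just to show the shape of the regression problem.
rng = np.random.default_rng(0)
X = rng.random((5000, 64))                           # stand-in fingerprints
y = X[:, :8].sum(axis=1) + rng.normal(0, 0.1, 5000)  # stand-in docking score

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
proxy = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# The proxy is orders of magnitude cheaper to evaluate than docking itself,
# at the price of the scatter around the identity line seen on the slide.
print("R^2 on held-out molecules:", proxy.score(X_te, y_te))
```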
21:30 - It’s not enough that the molecule we’re looking for binds well to the target; it should also be something that chemists can actually build at a reasonable price or even is chemically feasible. There are software using all kinds of chemical rules to evaluate a synthesis ability and you can also use machine learning because these also are too slow to run to get the nice super. You can use neural nets, and here it’s graphing neural nets to approximate the result of these synthesizability calculations. Again, you can approximate these things pretty well. Now you have like a double objective which is the binding affinity and the synthesizability.
22:15 - Another one which we haven't incorporated but will become important is toxicity. You want a molecule that will bind to your favorite target but is not going to bind to everything else and destroy the person at the same time as killing the virus. This slide is for chemists; it doesn't mean anything to me. Let me just say a few words about active learning, and then I'm going to move on to the other project. I mentioned already that we're going to have these iterations where we use machine learning to generate candidates, then we use those candidates to obtain new experimental data, and then that experimental data is added to the training set. And so there are all kinds of interesting questions here about how to iterate these things in an optimal way.
23:10 - For example, one way to think about one aspect of it is simply: what are the criteria for selecting which molecules you want to evaluate? You might think in a simple-minded way that it's just the molecules that have the best score in terms of the predicted binding affinity and the predicted toxicity, but you also want to take into account uncertainty. You want to maybe be doing things like Bayesian optimization or other ideas coming from active learning in order to also evaluate molecules for which you have a high uncertainty. Then there are many methods to evaluate uncertainty. Which one is going to be more appropriate? And then another interesting question is when you are going to provide these candidates to the biologists. You don't want to just give them one molecule at a time, because when they're going to do an experiment, they might as well do it with a batch of 100 molecules or something like this.
24:11 - And so you really want to provide a batch of candidates. If all of these, say, 100 candidates are good but they're all kind of the same with some small variations, you're not going to gain a lot of information. You have to think of this whole iterative process as trying to acquire information. With that in mind, you want to take into account the sort of mutual information that these different candidates can bring together. There are things like BatchBALD, for example, which is a method that's been proposed recently to estimate the information gain from a whole batch and not just from a single candidate.
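A greedy sketch of that batch-selection idea follows. This is only in the spirit of methods like BatchBALD - a proper treatment computes mutual information under a Bayesian model - and it approximates "don't pick near-duplicates" with a simple distance-based diversity bonus. Features, predictions, and uncertainties are synthetic.

```python
import numpy as np

# Greedy batch selection balancing predicted score, uncertainty, and
# diversity. All inputs are synthetic stand-ins.
rng = np.random.default_rng(1)
feats = rng.random((2000, 32))       # stand-in molecule features
pred = rng.random(2000)              # predicted reward (higher = better)
unc = rng.random(2000)               # model uncertainty per molecule

def select_batch(k=100, w_unc=0.5, w_div=0.5):
    chosen = []
    for _ in range(k):
        if chosen:
            # distance to nearest already-chosen molecule = diversity bonus
            d = np.min(np.linalg.norm(
                feats[:, None, :] - feats[chosen][None, :, :], axis=-1), axis=1)
        else:
            d = np.ones(len(feats))
        score = pred + w_unc * unc + w_div * d
        score[chosen] = -np.inf      # never pick the same molecule twice
        chosen.append(int(np.argmax(score)))
    return chosen

print(select_batch(k=5))
```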
24:54 - There are really a lot of interesting machine learning questions that come up. Oh, and the other interesting machine learning question is, instead of thinking of each of these filtering steps and prediction steps as independent machine learning problems, really the ultimate goal here is to think of the whole process, with the chemistry in the loop, as one big search problem where we can use machine learning in a way that's optimized with respect to the whole search process, including the feedback coming from the real world. Okay. The work that is going on right now involves a lot of people. I'm not going to name them all, but people from Mila and people from many other organizations. It's really exciting to see the energy that goes into this research.
25:45 - For the next 15 minutes or so, let me tell you about a completely different project which has to do with how machine learning could improve contact tracing as well as epidemiological modeling. Let me say a few words about the way that the virus propagates. What you see in the figure is the viral load evolution during the course of the disease. Zero is the day when you have symptoms appearing, and then you’re probably going to have symptoms for a number of days that can vary quite a lot. But the interesting thing is in the three days that precede having symptoms, you already have a high contagiousness.
26:35 - The viral load is how many viruses you are shedding. A high viral load means you are more likely to be contagious to others. What’s really scary here is that you are contagious for like two or three days before you even know that you are contagious or at least that you have any clue that you are contagious. People, when they realize they have symptoms, they get worried, and they will typically change their behavior. They will become more prudent, go out less, maybe even go in quarantine.
27:12 - The problem is - well, okay, so one problem is there are people who don’t care. That’s a social problem and a political one. But there are also people who simply are not aware that they could be contagious. If we can bring any kind of information to these people so they can change their behavior, just be a little bit more prudent, take a bit more distance, stay at home if you can, work from home, things like that, then we can really save lives and reduce the rate of spread of the virus. Alright. How can we do that? Well, first of all, let me tell you about one of the most powerful tools we have to provide early warning to people, and that’s contact tracing. That’s usually done manually.
28:12 - We ask people who have tested positive to report who they have been in contact with in the last couple of weeks, and then we contact those people and ask them to go into quarantine and potentially be tested. That's the standard way. There are a number of issues with this. The main issue is that it's only when the person has been tested - and that may be like a week after they started having symptoms - that the information can propagate to the people they have been in contact with. By that time, many of these people may already be symptomatic and already know that they might have the disease, so the warning is not going to change as much. If we could know that you are carrying the disease earlier, then we could warn your contacts earlier. That's a little bit of what we're trying to do here.
29:15 - So, yeah, that's where digital contact tracing comes in. First of all, one thing I forgot to mention is that, at least in countries like Canada, it takes a while between a positive test and a contact tracer actually calling all of the people that you name. We are talking about between one and two days. That's an extra delay. If you consider the little time window that we have, which I mentioned is like two to three days, adding one or two days to this is really bad. We really are playing against the clock. Everything we can do to shave time off the way that information propagates, from people who clearly have the disease or might have the disease to people who don't know that they might have it, could have a big impact.
30:02 - Digital tracing means we're trying to do something comparable to manual contact tracing, but we would like to take advantage of your phones to allow the information to propagate, first, faster, and second, even to people you don't remember spending time with. Maybe you were, I don't know, in a queue or in a bar together, and of course you don't know that person, so you can't give the contact tracer that person's phone number and name and so on so that the tracer can find her. That's the promise of digital contact tracing, but usually it's still done by imitating manual contact tracing, in that the information starts to propagate only after a positive test has been obtained. This is where machine learning can come in. Because one of the challenges with just looking at symptoms is that there are many kinds of symptoms.
31:08 - You can easily have a dozen symptoms that we currently know about that may be revealing of having the COVID-19 disease, and these symptoms can have different levels of severity. How do you convert that information into an action? What should you do? Should you warn your contacts? What should you tell them? How prudent should they become? Maybe it's just a false alert. We don't want everyone to go into quarantine because they've been in contact with somebody who is starting to have a cold, right? So there is a trade-off here in general between how much freedom we're going to remove from people by asking them to be more prudent, maybe to go into quarantine, versus how much we can slow down the rate of propagation of the disease. And so it becomes really important to have the right tools to evaluate that trade-off, and that means we need to evaluate the risk - the probability that you're contagious, or the amount of contagiousness, like this viral load I was talking about.
32:16 - If you consider one particular person, you now have many sources of information to provide clues that that person is contagious. I already mentioned symptoms, but we also know that having the disease or not depends on prior medical conditions. One of the things that we've done is build a questionnaire. These questionnaires exist; they're becoming fairly standardized. We know a lot about medical conditions that increase your probability of getting the disease. We also know that things like age make a big difference.
32:53 - You want to ask these kinds of questions ahead of time. And then, when people start having symptoms, they can report them on their phone. Another source of information for a particular person is whether they have been in contact with people who seem to be at risk. What was the risk level of those people? How likely were they to be contagious, and so on? Now we have all of these sources of information. You can see that it would be hard to come up with a hand-crafted heuristic to combine all of those pieces of information.
33:25 - It makes much more sense to use machine learning to combine these pieces of information. And then if we can do that, then we can provide an early warning signal first to the person but also to the people that the person has been in contact with and potentially their own contacts. That’s the idea of machine learning-based digital contact tracing. Okay. Let me go through a little sketch scenario here to illustrate how the early awareness could save lives. We’re looking at this character - I’m sorry if the letters are too small for you to read, so I’m going to explain verbally anyway.
34:11 - But we're looking at three potential histories of the same underlying scenario, in which a character, Jim, is going to get infected and then potentially infect others: a scenario where there is manual tracing going on; a scenario where there is digital tracing, which we call binary tracing because there is a binary decision - if you've been in contact with somebody who was contagious, then suddenly you go into quarantine; versus a machine learning-based approach, the third row, where you can have a graded signal, because you're not completely sure that you're contagious or infected, and so instead of going bang, bang - I behave as usual versus I go into quarantine - you can have intermediate levels of prudence. That's going to change your behavior but not necessarily prevent you from doing any activity. At the top, what we see is that on the Wednesday of the first week, in all of the cases, our character Jim has a contact with a high-risk stranger somewhere. What happens then is that the stranger, a couple of days later, starts showing symptoms.
35:40 - In the manual and regular digital tracing, that information doesn't reach Jim. But if Jim was using machine learning and the other person was using machine learning - an app with these kinds of tools - you could have early warning signals even before the other person gets tested in the second week, even before the other person gets symptoms, because that stranger has the app, and maybe her app has already calculated, because of the contacts that person had with others who were infected, that she might already be at some level of risk, and then that level of risk will propagate to some extent to Jim. And so even on the first day, Jim might already get a signal that he should be a little bit careful. And then, when the stranger starts having symptoms, Jim is going to get a slightly higher-level recommendation to be careful. And then, a few days later, when the stranger's symptoms grow worse, Jim gets an even stronger signal.
36:49 - This is where in the cartoon story there is a difference because in one case Jim decides to go to work, in the other case he decides not to go to work or to go to whatever public place. And then we can see the difference between the manual tracing and digital tracing in the sense that the test results - the time delay between the test result and Jim getting the information might make also a big difference. A few words quickly about some of the really interesting and challenging issues around these kinds of projects. First of all, it’s a lot about privacy. How do we find the right trade-off between privacy considerations and machine learning considerations because, at first sight, these two things are sort of in complete opposition? The privacy considerations say no data because any bit that I send is potentially exploited. Machine learning people want more data, as much data as possible. It looks like it’s hard to resolve this.
38:02 - Once you allow some data to be exchanged, there are many options, and you'd like to find the privacy, security, and communication options that are going to be good from a privacy perspective but also allow enough information to be propagated for machine learning to do its job. When you look at the privacy issues, there are basically two big categories that you have to think of: the Big Brother attacks and the Little Brother attacks. What are those? Big Brother attacks - I think we understand that we don't want governments or large companies to centralize data about everyone so that they can use it for purposes that are not good for us, and particularly threaten our democratic rights or exploit us in any way. The Little Brother attacks are a bit more - I mean, people think less about these things.
39:00 - What it means is your neighbor getting to know that you're infected, or people that you meet on the street knowing that you're infected. Of course, you don't want that, right? You don't want that because you want to protect your dignity. You don't want to be discriminated against. You don't want to be stigmatized. Or you may even be taken advantage of by somebody who could, knowing that you're infected, make money off of you or something. Unfortunately, the different kinds of solutions that exist in terms of privacy tend to either mostly help against the Big Brother problem or mostly help against the Little Brother problem, and it's hard to reconcile both kinds of defenses.
39:45 - But you have to keep that in mind and make some choices, which may depend on what people care most about. From a machine learning perspective - I won't have time to go into a lot of detail, but there are many approaches one could look at. Unfortunately, the approaches we can think of often come into conflict with privacy constraints. For example, it's not easy to do things like - the first thing that came to me when I looked at the problem was, oh, we're just going to do loopy belief propagation or something like this, where the nodes correspond to different people and their phones. Unfortunately, this requires communicating a lot of information very often between all the phones.
40:32 - If you want to do learning on a server, that means the server would have access to the full, what's called, contact graph: who met whom, when, and where. This is something that's really, really bad from the point of view of a Big Brother attack. And so, ideally, we don't want to have any such file that contains the contact graph of everyone. You need to find solutions that avoid that. I see that time is flying, so let me skip a few things. Let me tell you about what we have been exploring. We've explored a solution in which the phone is doing a lot of the calculations, but we're also using a central server to do the learning. Ideally, we would use federated learning, but that raises other challenges. From a machine learning point of view, what's sort of a big challenge is that the input now is not a fixed-size thing. It depends on the contacts that I've had in the last two weeks, and the number of these contacts varies.
41:35 - We wanted to use a machine learning method that can deal with variable length input. One such approach is transformers, which is what we’ve been using. Also, another kind of tricky question is, well, what do we want to predict, exactly? Ideally, we want to predict something which, unfortunately, we can’t measure, which is the contagiousness I had N days ago. Let’s say Alice met Bob five days ago and now Alice has some probability of being infected, what kind of information should Alice send Bob so that Bob optimally changes his behavior? What makes a lot of sense from an epidemiological perspective is, basically, Bob wants to know if he’s infected. The information that’s most relevant to that is how contagious was Alice five days ago when she met Bob.
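A minimal PyTorch sketch of such a predictor is given below: a transformer encoder pools a variable-length set of contact feature vectors and outputs one contagiousness estimate per day of the past two weeks (the per-day output is explained next). The dimensions, feature choices, and architecture details are illustrative assumptions, not the actual app's model.

```python
import torch
import torch.nn as nn

# Sketch of a risk predictor over a variable-length contact history. Each
# contact is encoded as a feature vector (day of contact, duration, the
# contact's reported risk, ...), and a transformer encoder pools the set.
class ContagiousnessPredictor(nn.Module):
    def __init__(self, contact_dim=16, d_model=64, n_days=14):
        super().__init__()
        self.embed = nn.Linear(contact_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # one contagiousness estimate per day of the past two weeks
        self.head = nn.Linear(d_model, n_days)

    def forward(self, contacts, padding_mask):
        # contacts: (batch, max_contacts, contact_dim), zero-padded
        # padding_mask: (batch, max_contacts), True where padded
        h = self.encoder(self.embed(contacts),
                         src_key_padding_mask=padding_mask)
        pooled = h.masked_fill(padding_mask.unsqueeze(-1), 0).sum(1)
        pooled = pooled / (~padding_mask).sum(1, keepdim=True).clamp(min=1)
        return torch.sigmoid(self.head(pooled))  # per-day contagiousness

model = ContagiousnessPredictor()
contacts = torch.randn(2, 7, 16)            # 2 people, up to 7 contacts each
mask = torch.zeros(2, 7, dtype=torch.bool)
mask[1, 5:] = True                          # second person has only 5 contacts
print(model(contacts, mask).shape)          # torch.Size([2, 14])
```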
42:34 - And so the predictor on Alice's phone should be predicting, for each day of the past two weeks, how contagious she was, so that she can send that information to her contacts of those days. That means the output is also not just a single scalar but at least one quantity for each day in the past. Let me skip that. One issue is that the thing we want to predict isn't something we can actually measure. Contagiousness is not something you're going to get from the tests, especially contagiousness in the past. What we want to do is infer it. The approach we have chosen is to infer it using a generative model of the joint distribution between the things you observe, especially on your phone, and the things you don't observe, which are latent variables - training an epidemiological model which captures that joint distribution, on one hand, and on the other hand, training a neural net inference machine that predicts the latent variables given the observed variables.
43:51 - This is something that should sound familiar. It's basically what EM does, right? We've explored approaches based on amortized variational inference, which is what you have in VAEs, but it's essentially similar to what you do in EM. I'm going to skip the details, but a minimal sketch of the idea follows below. The last bit is that part of this research project involves building a good simulator, a good generative model for all of these variables. The best way to do that is to have a really good epidemiological model that is organized at the level of individuals, not the standard compartment-based models where you simulate the proportion of the population which is in different stages.
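Since the talk skips the details, here is a minimal sketch of the amortized-variational-inference idea in PyTorch: an inference network q(z|x) predicts latent variables (the unobserved contagiousness) from observed features, and it is trained jointly with the generative model via the ELBO. All sizes are toy and the networks are single linear layers purely for illustration.

```python
import torch
import torch.nn as nn

# Generative model p(x|z)p(z) over observed features x and a latent z,
# plus an inference network q(z|x). Toy dimensions throughout.
enc = nn.Linear(10, 2 * 3)        # x (10 features) -> mean, log-var of z (3-d)
dec = nn.Linear(3, 10)            # z -> reconstruction of x

def elbo(x):
    mu, logvar = enc(x).chunk(2, dim=-1)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()    # reparameterize
    recon = -((dec(z) - x) ** 2).sum(-1)                    # log p(x|z), Gaussian
    kl = 0.5 * (mu**2 + logvar.exp() - logvar - 1).sum(-1)  # KL(q || N(0, I))
    return (recon - kl).mean()

x = torch.randn(32, 10)           # a batch of observed feature vectors
loss = -elbo(x)
loss.backward()                   # trains enc and dec jointly, EM-style
print(float(loss))
```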
44:35 - Instead of simulating aggregate compartments, here you do this for each person in a population, which raises all kinds of computational challenges. It needs a lot of computational power. In this way, you're able to simulate things like: what if this particular person, given the information that she has on her phone, changed her behavior in such and such a way? How would that affect the overall evolution of the virus? This is the kind of calculation we've been doing; a toy version is sketched below. This is more information about the simulator. Our simulations seem to suggest that, indeed, this early warning signal allows you to reduce the number of infections and to reduce what's called the reproduction number of the virus, which is how many people a person will infect on average.
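The following toy agent-based simulation is in the spirit of that individual-level model: every person is simulated separately, so we can ask counterfactual questions like "what if people with a high app risk score reduce their contacts?". All parameters and the behavior model are made up for illustration; the real simulator is far richer.

```python
import random

random.seed(0)

def simulate(n=5000, days=30, base_contacts=8, p_transmit=0.02,
             app_adoption=0.0):
    infected = {0}                     # person 0 seeds the outbreak
    for _ in range(days):
        for person in list(infected):
            # people whose (hypothetical) app warns them halve their contacts
            contacts = base_contacts // 2 if random.random() < app_adoption \
                       else base_contacts
            for _ in range(contacts):
                other = random.randrange(n)
                if other not in infected and random.random() < p_transmit:
                    infected.add(other)
    return len(infected)               # cumulative infections over the horizon

print("infected, no app:          ", simulate(app_adoption=0.0))
print("infected, 60% app adoption:", simulate(app_adoption=0.6))
```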
45:21 - In the second graph at the bottom, we see that the machine learning-based method is able to reduce this reproduction number by maybe 30%, and that translates into a number of cases which does not grow as quickly compared to other approaches like standard contact tracing. I'm going to stop here and just mention that this is, again, the work of a lot of people. Thank you very much.