Mac Schwager: How engineers are putting the ‘auto’ in autonomous
On this episode of Stanford Engineering’s The Future of Everything podcast, guest Mac Schwager talks safety in multi-robot systems, like those controlling the autonomous vehicles that will soon fill our future. Some engineers are helping robots communicate better among themselves while others are working on “emotionally aware” algorithms able to pick up on subtle cues in how others are driving to help robots make better on-the-road decisions.
Never fear, Schwager says, the future is in good hands. “Autonomous cars will reach a level of safety that surpasses that of human drivers, but it may take a little while,” he tells host Russ Altman on this episode of The Future of Everything podcast.
Transcript
Russ Altman (00:03): This is Stanford Engineering's The Future of Everything, and I'm your host, Russ Altman. Today, Professor Mac Schwager will tell us how autonomous vehicles will interact with one another and with humans. There are key challenges before we can take full advantage of the promise of autonomous vehicles, and safety is a major challenge. It's the future of multi-robot systems. We hear all the time about autonomous vehicles in the future.
(00:29): We hear about self-driving cars, we hear about planes without pilots, and drones. How do we make sure they're safe, and how do we make sure they can coordinate with one another and with humans to make sure that the job gets done? Mac Schwager is a professor of Aeronautics and Astronautics at Stanford University. He studies the coordination of autonomous vehicles with one another and with humans. This requires, obviously, understanding how robots work, but also how humans think. For these systems to be generally accepted, they have to be extraordinarily safe to levels that are very hard to achieve.
(01:08): Mac, we all think about a future of self-driving cars. It's been in the air, it's been in the media, and most of us find it terrifying to imagine being on a road, not only with other crazy humans, but with potentially crazy self-driving cars. And I know you study this, so tell me, should we be terrified, or is there help coming?
Mac Schwager (01:31): Oh, dear. Yeah. You just went right to the heart of the matter. That's a great question, yeah. Okay.
(01:37): So we shouldn't be terrified. Of course not. However, if our streets were suddenly swarmed by autonomous cars in the state of the development of the technology today, yeah, there would be a little bit of reason to be scared. But I think essentially, there's a very careful regulatory sort of system of laws, and rules, and certifications and so on, and we can rely on that to keep us safe. And I think ultimately, the potential of the technology is huge. Ultimately, I do think autonomous cars will reach a level of safety that surpasses that of human drivers, but we're not quite there right now, and I'm happy to talk about where the pain points are. There are a lot of-
Russ Altman (02:21): And that is my goal for the next few minutes, but on that same topic, is it definitely the case that we will have to have human and robotic drivers together, or are there some futures where we get different roads or like a different infrastructure for humans and robots?
Mac Schwager (02:38): Yeah, that's also a great question. I mean, I think that it's very unlikely that we'll overcome the kind of societal inertia that would require us to have two separate road networks, one for autonomous cars and one for human drivers. It just would require a massive, kind of unified investment in infrastructure for that, and I don't see that happening. It's not practical, but that makes our job, as engineers and as researchers, a little bit harder, because it's a lot more difficult to develop an autonomous car that can co-drive with human drivers on the same roadways, but we really have to do that. And there's another factor, which is, I think that as autonomous cars become adopted more and more widely, their job will become easier, because autonomous cars in the future will have the capacity to talk to one another through a communication network directly and to collaborate directly and say, "Okay, you go first, I go second."
(03:39): "You go here, I go there." The human drivers, of course, were not wired for Wi-Fi or whatnot, right?
Russ Altman (03:46): Well, it's interesting that you say that because I know that part of your work ... And we will get to you. You're right, we went right into the deep end, and guilty as charged, but you have written and talked about the fact that, although humans don't, like we don't have a Wi-Fi to one another, there's a lot going on in terms of eye contact, assessing whether the other driver is paying attention. "Is the other driver allowing me to enter a lane merge?" So there are different kinds of communication that a robot or a ... I don't even know if we should call them robots, an autonomous vehicle, may or may not be able to do.
Mac Schwager (04:19): Yeah.
Russ Altman (04:20): Do you imagine that we're going to have to train these autonomous vehicles to take the kind of facial cues?
Mac Schwager (04:25): Yeah.
Russ Altman (04:26): I ride my bicycle to work every day, and so I'm acutely aware of the need to make sure that I'm not going to get killed by a driver who has no idea that I'm about to do something. So is that part of the challenge for autonomous vehicles, is to learn human? And I know this is what your work is, so tell us, what are those challenges?
Mac Schwager (04:43): Yeah. No, that's exactly right. I mean, the challenges are in communicating through motion and inferring intention through motion, and also through other cues, like you went straight to kind of the driver's facial expression and assessing whether or not they're paying attention, and I think that is very important, but I think that's a few steps down the road, actually. I think that our first job is to be able to infer, and predict, and communicate motion based on the vehicle itself, and we do this all the time as human drivers, I think without even realizing it. If a driver on the freeway comes up aggressively behind you, you know that that's a communication cue that you need to move over and let that driver pass, or if a driver is swerving, maybe they're not paying attention, maybe they're looking at their phone, or if the driver is kind of starting to slide over to the edge of the lane, that's a cue that they're probably going to change lanes, or they're looking to get over for an off-ramp, these kinds of cues, or if a driver slows down in front of you, maybe they see a danger that you don't see because the danger lies ahead, right?
Russ Altman (05:49): One thing I might miss ... Sorry to interrupt-
Mac Schwager (05:51): Yeah. No, go ahead.
Russ Altman (05:52): ... is when there is a crazy driver in front of me, one of the strategies is to pass them and get them behind you, and we all know that we always take a look. You always look to the left, sometimes not even to intimidate them or give them any message, just to see like, "Who the heck was driving like that?" And I'm going to miss those days, when I have a crazy self-driving car and I look and I can't see, "Why was that car so crazy?" So forgive me, but I know that you call these multi-robot systems, and so just getting a little bit serious and stepping back for a minute, tell me how you've formulated this problem, and what is the framework that then guides your research program?
Mac Schwager (06:32): Yeah, yeah, that's great. So I work in robotics, and to me, an autonomous car is a robot, and it's of the same nature as an autonomous drone, or autonomous aircraft, or an autonomous spacecraft, or even a manufacturing robot, like robotic arms for assembly and welding and these kinds of things. These are all, to me, examples of the same kind of thing, which is a robot, a machine that can make its own decisions. So with all of these robots, my research focuses on interaction, on cooperation and communication, and when you have groups of robots that work together ... And I think all robots are co-robots.
(07:17): All robots eventually are going to live in a situation where they have to either work with one another or work with human partners, and if we kind of design a robot in isolation and think about the algorithms for a robot in isolation, we're missing most of the delicate challenge, which is, "How should that robot interact with other humans and robots around it?" And the framework that I bring to this in my research is based, ultimately, in what's called mathematical optimization, which is kind of an applied mathematical discipline that spans many areas of engineering and even economics and other areas. The basic idea is that in the context of robotics, you encode something that you want the robot to do as what we call a cost function, which is just a mathematical function that gives you, let's say, a price, or a cost, on a particular activity, and you mathematically try to find the activity or the combination of motions for the robot to minimize that cost. And so there's lots of statistical, mathematical, and computational machinery. Once you've got your problem encoded as a cost function, there's this machinery out there for minimizing that cost function, that then-
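To make the cost-function idea concrete, here is a minimal sketch of encoding a simple robot task as a cost and handing it to off-the-shelf optimization machinery. This is not code from Schwager's lab; the goal position, horizon, dynamics, and weights are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

GOAL = np.array([5.0, 3.0])   # hypothetical goal position the robot should reach
HORIZON = 10                  # number of motion steps to plan
DT = 0.5                      # time step in seconds

def cost(u_flat):
    """Cost = squared distance to the goal at the end, plus a penalty on control effort."""
    u = u_flat.reshape(HORIZON, 2)      # sequence of velocity commands
    pos = np.zeros(2)
    effort = 0.0
    for v in u:
        pos = pos + DT * v              # simple single-integrator dynamics
        effort += DT * float(v @ v)
    return float(np.sum((pos - GOAL) ** 2)) + 0.1 * effort

u0 = np.zeros(HORIZON * 2)              # start from "do nothing"
result = minimize(cost, u0)             # generic machinery for minimizing the encoded cost
print("first velocity command of the optimized plan:", result.x[:2])
```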
Russ Altman (08:40): And I'm guessing that it becomes even more complicated when you're looking at a swarm of robots that are all trying to do something. You still want to get that mathematical optimization, but there's just a lot more moving parts, so to speak.
Mac Schwager (08:52): Yeah, and that's interesting. Yeah. So when you have multiple robots, there are two ways to look at this, and I like to think of this in the context of, "Do the robots have a communication network with one another, or somehow, do they have a way of communicating directly with the human that they're interacting with, or do they not?" So in autonomous driving, currently, there is no communication network. Maybe in the future there will be, but currently, there isn't.
(09:12): In other contexts, there are communication networks, like drones frequently do have communication networks to talk to one another or to talk to a base station and relay messages to one another, and manufacturing robots frequently are connected either through a wired or a wireless network. There are other examples. So when there is an explicit avenue for communication, there are computational methods for, let's say co-optimizing a common objective, a common cost function, right? So suppose you've got a group of robots that are trying to manipulate a heavy object in a manufacturing scenario. So there's some assembly in a car manufacturing plant, say, or an aircraft manufacturing plant, and that assembly is too big to lift by one robot arm, and so you've got five robot arms working together to lift this thing up, right?
(10:02): That problem, you can encapsulate as an optimization problem, and then you can kind of farm out the computational load to all the different robots, and they can, in parallel, optimize that cost function, and in essence, that leads to the coordination of their motions. So one robot arm will lift this way, another one will lift this way and so on. And so there's a package of algorithms that we work on to deal with that kind of problem, and then there's the situation where there is no communication network, in which case, there are a lot of different approaches out there, but the one that we've favored and developed algorithms for in my lab models that as a game, as a mathematical game.
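As a toy illustration of the robots-that-work-together case, the sketch below has five simulated arms co-optimize a shared lifting cost, each adjusting only its own force. The cost, required lift force, and step size are illustrative assumptions, not the lab's actual algorithms.

```python
import numpy as np

N_ROBOTS = 5
F_REQUIRED = 50.0            # total lift force the object needs (hypothetical)
STEP = 0.05                  # gradient step size
forces = np.zeros(N_ROBOTS)  # each robot's decision variable: its own lift force

def shared_cost(f):
    # Common objective: hit the required total lift while keeping individual effort low.
    return (f.sum() - F_REQUIRED) ** 2 + 0.1 * np.sum(f ** 2)

for _ in range(500):
    total = forces.sum()                     # the one quantity the robots exchange
    for i in range(N_ROBOTS):
        # Each robot descends the gradient of the shared cost with respect to its own force.
        grad_i = 2.0 * (total - F_REQUIRED) + 0.2 * forces[i]
        forces[i] -= STEP * grad_i

print("per-robot forces:", np.round(forces, 2), "total:", round(float(forces.sum()), 2))
```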
Russ Altman (10:44): I've seen papers on game theory, and I'm glad you got to that, because I was going to say, "Is this all a game to you?"
Mac Schwager (10:50): Yeah, right. So when I give talks on this research, I usually title them something like Robots That Play Together and Robots That Work Together, and so the robots that play together are playing a game, and the robots that work together are doing this distributed optimization.
Russ Altman (11:07): But when they're not communicating, game might be the way to think about it.
Mac Schwager (11:10): Game is the way to think about it, yeah. And so the idea here is that in a game, every actor, which could be a robot or a human, has its own objective function or cost function in the sense of mathematical optimization, but the challenge is that the cost is not only a function of, let's say, my actions, but also the other person's actions. That's the notion of a game. You have two players or 10 players, whatever, and each player has its own cost function, but that cost is a function, not only of that player's actions, but of the other players' actions as well, and so-
Russ Altman (11:54): Is it also true that you may not know exactly what that other player's cost function is?
Mac Schwager (11:58): Exactly. Yeah.
Russ Altman (11:59): Okay.
Mac Schwager (12:00): So on the road, to ground this in a little bit of context in autonomous driving, on the road, I'm an autonomous car, I know what I want to do. I want to stay in my lane, I want to go straight, and I want to maximize safety and optimize something about ride comfort, something like this, but I don't know exactly what the other driver wants to do, the driver in the lane next to me. Maybe that's a human driver, and maybe she's planning to accelerate and cut in front of me and change lanes, or maybe she's planning to slow down and go to an off-ramp to the right, something like this, right? And I don't know that beforehand, and so somehow, I need to ... In thinking about this as a game, I also have to reason about my uncertainty or lack of knowledge about her, what she's trying to accomplish, what her cost function is.
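One toy way to see the game formulation is iterated best response: two simulated drivers take turns re-optimizing their own cost, and each cost depends on both drivers' actions. The specific cost terms, gap model, and numbers below are illustrative assumptions, not the lab's actual driving games.

```python
import numpy as np
from scipy.optimize import minimize_scalar

GAP0 = 8.0   # initial gap between the two cars in meters (hypothetical)

def predicted_gap(a_lead, a_follow):
    # Crude two-second lookahead on how the gap changes with the chosen accelerations.
    return GAP0 + 2.0 * (a_lead - a_follow)

def cost_lead(a_lead, a_follow):
    # Lead driver wants to hold speed (small acceleration) and keep the gap open.
    gap = predicted_gap(a_lead, a_follow)
    return a_lead ** 2 + 100.0 / max(gap, 0.1)

def cost_follow(a_lead, a_follow):
    # Following driver wants to close the gap to about 5 m, without colliding or accelerating hard.
    gap = predicted_gap(a_lead, a_follow)
    return (gap - 5.0) ** 2 + 100.0 / max(gap, 0.1) + a_follow ** 2

a_lead, a_follow = 0.0, 0.0
for _ in range(20):   # alternate best responses until the pair of actions settles
    a_lead = minimize_scalar(lambda a: cost_lead(a, a_follow), bounds=(-3, 3), method="bounded").x
    a_follow = minimize_scalar(lambda a: cost_follow(a_lead, a), bounds=(-3, 3), method="bounded").x

print("equilibrium accelerations (lead, follower):", round(a_lead, 2), round(a_follow, 2))
```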
Russ Altman (12:50): Yes. And so do you try to learn these on the fly? Like is it a hypothesis generation, and then see if the data is compatible with that hypothesis?
Mac Schwager (12:59): Yeah.
Russ Altman (12:59): If it is, then you go with it, and if not, I need a new hypothesis about what ... Lots of puns in our company-
Mac Schwager (13:05): And that's the driving, yeah.
Russ Altman (13:06): And what drives that person or that vehicle.
Mac Schwager (13:09): Yes. Yeah, that's exactly what we try to do. It's easy to say, but this is computationally a really difficult thing to do.
Russ Altman (13:15): Yeah.
Mac Schwager (13:15): So this lives in a space that we would call inverse reinforcement learning or inverse optimal control, and all that means is ... Okay, so I talked about mathematical optimization, you have this cost function, you're trying to optimize it. So in inverse optimal control, which is also called inverse reinforcement learning, instead, you have the set of actions, which are optimal, but you don't know the cost function that produced those actions. So you're observing the actions, you suppose they're optimal, but you have to back out what the cost function is-
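Here is a bare-bones sketch of that inverse optimal control idea: observe an action, assume it was optimal, and search over a family of candidate cost functions for the one that would have produced it. The one-parameter cost family and the observed speed are illustrative assumptions.

```python
import numpy as np

speeds = np.linspace(10, 40, 301)          # candidate speeds in m/s
observed_speed = 24.0                       # the speed we watched the driver choose (hypothetical)

def cost(speed, w_time):
    time_term = 1000.0 / speed              # faster driving means less travel time
    comfort_term = (speed - 15.0) ** 2      # hypothetical comfort penalty above 15 m/s
    return w_time * time_term + (1.0 - w_time) * comfort_term

best_w, best_err = None, np.inf
for w in np.linspace(0, 1, 101):            # search over candidate cost functions
    optimal_speed = speeds[np.argmin(cost(speeds, w))]
    err = abs(optimal_speed - observed_speed)
    if err < best_err:
        best_w, best_err = w, err

print("inferred time-vs-comfort weight:", round(best_w, 2))
```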
Russ Altman (13:49): Well, that's fun. That's fun, yes. And this is kind of the reverse psychology that we all are doing every day with our coworkers, like, "Why the heck did she do that, or did he do that?", and you try to figure out a theory of their mind that kind of explains their actions.
Mac Schwager (14:05): I'm so glad you said that. Yeah. So that's the ultimate goal, is to have a theory of mind for all robots. And what that means is we want to endow any robot with the capacity to put itself in the seat of the pants of somebody else. This is a perfectly natural capacity for humans, the capacity for empathy, right?
(14:25): So any human can put themselves in the seat of the pants of another human and kind of think about what they might be thinking or what their goals might be, or what they might be seeing and how they might be reacting to what they're seeing. This is really, really hard for robots to do, and that's what we're after, and basically, the mathematical formalism that we use to do that is this game theory idea.
Russ Altman (14:48): It's very interesting to think about the humans too, because the first generation of interactions between cars will give us a sense of what those cars, what makes them tick, so to speak, and then changes to those cars could be very worrisome, because if we get used to certain behaviors, and then the next revision of version 2.0 of the car has a different approach, then all of these intuitions we've developed, interacting with them for the previous months, might no longer be valid, so it's going to be very interesting.
Mac Schwager (15:18): I love that. Yeah, I love that. That's not something that gets a lot of attention, I think, in my research area, but that's super important, which is how the human gets conditioned to behave around the car, and I think, "Okay, this is super interesting." So there's a statistic out there that over the several years that we've had autonomous cars driving on the open roads, their accident rate is actually twice as high as the accident rate on average for human-driven cars on a mile-for-mile basis. I don't remember the statistics exactly, but the likelihood of having an accident for every mile is twice as high for autonomous cars.
(16:01): Now, things are improving, and this is over multiple years. The technology is always improving, but what's really intriguing is that the probability of the autonomous car being found at legal fault is lower than for human drivers. So somehow, they're more dangerous, but not legally culpable. I have a hypothesis for this, which is that autonomous cars are slowpokes, and human drivers get confused because they're so cautious. The autonomous cars are so cautious. Human drivers have a mental model of what driving should look like, and these autonomous cars are at the far end of the spectrum.
Russ Altman (16:36): They're violating it.
Mac Schwager (16:38): Yeah, exactly.
Russ Altman (16:39): I was once on a local road that you know, El Camino, and Google had two cars in front of me that were self-driving cars, going the speed limit. I was infuriated, and to help Google with their learning algorithms, I drove my car as close to the back of the car in front of me as I could, and we could easily have had an accident, and it would definitely have been my fault, but I was enraged. I was enraged by these two self-driving cars.
Mac Schwager (17:04): Yeah, and I think that's the delicate line between social intelligence versus sort of following the letter of the law, and this is a really challenging thing that autonomous car manufacturers have to contend with. You just can't drive according to the letter of the law all the time. It's dangerous. And this is culturally relative, right? What is safe driving in India is not safe driving in California.
Russ Altman (17:31): Yes. Well, this is The Future of Everything. We'll have more with Professor Mac Schwager next. Welcome back to The Future of Everything. This is Russ Altman.
(17:38): I'm speaking with Professor Mac Schwager from Stanford University. So Mac, I understand that you're using AI and deep learning in a lot of this work. People are worried about AI generally, because it takes away a certain amount of autonomy. People are worried about fairness. What is the role that deep learning plays in your work?
Mac Schwager (18:03): Yeah, that's a great question. So deep learning is kind of an engineering revolution. I don't think it's overstating it to say that. So for the listeners who aren't familiar, deep learning is a certain brand of AI that uses, let's say, inspiration from the brain in the form of artificial neural networks. Deep learning has really accelerated in the past few years and has become extremely powerful in fields like computer vision.
(18:38): For example, your kind of face detector on your phone, if you pull out your camera and there's a little box around the faces, that's all deep learning, and deep learning is now inextricably linked up with autonomous driving. There's no way to field an autonomous car without some kind of deep learning in there somewhere. Now, the challenge that I face in my work ... In my work, I'm concerned about safety. This is, really, the underlying sort of exclamation point at the end of all my research, is, "How do we make robots safer? How do we guarantee that they're going to be safe?"
(19:10): And it's really important for robots as embodied AI agents to be safe because they hold human life and safety in their hands, in a sense, right?
Russ Altman (19:20): Yes. Yes.
Mac Schwager (19:20): So autonomous cars hold people, and if they're unsafe, those people are unsafe, and autonomous cars drive around pedestrians and bicyclists. We need to make sure that those lives are safe, and so on. The list goes on. And so what's tricky about deep learning is that, I would say that the practice of deep learning, that practical power has accelerated way beyond the theoretical framework we have for understanding when deep networks go wrong. So when deep networks fail, they fail spectacularly and unpredictably, and so one challenge that we face and one thing we're working on is, "How can you put a box around deep networks? How can you know when they're behaving well and when they're misbehaving?"
(20:07): And one of the challenges is that they misbehave very infrequently and in bizarre ways. I'll give you an example. So there was a case where a fleet of autonomous cars kept stopping on a country road, in the middle of a country road with no intersection, no traffic, they would just stop, and then proceed, and all autonomous cars from this particular fleet were doing that. And engineers went back and looked at what was common about the scenario across all these cars, and it was that there was a billboard for some attorneys, and there was a stop sign as a picture in the billboard, and the autonomous car just saw a stop sign, got to stop. It couldn't reason that this is actually a photograph as part of a billboard for the attorney's advertisement, right?
Russ Altman (20:56): And meanwhile, it's making much more complex decisions correctly while it's driving, but then, it sees a silly billboard, and it's out of its experience, and some engineer probably thought, "When you see a stop sign, better to stop."
Mac Schwager (21:09): Stop, yeah.
Russ Altman (21:10): Wow.
Mac Schwager (21:11): Yeah, and that's the key, is out of its experience. So the way deep learning works is that you train a deep network on a massive volume of data, which represents experience, basically, and if all of that data showed stop signs, stop, and none of the data showed stop signs as a photograph in a billboard, don't stop, then it will never have had that experience, so it won't know what to do. Just like people, we all are able to operate because of our experience, and if you meet a situation where you don't have any experience, you're likely to do poorly until you gain some experience, right?
Russ Altman (21:47): So a big focus is how to kind of make these deep AI networks more robust. How would you characterize the challenge? Is it they have to have more life experience, so to speak?
Mac Schwager (21:58): Yeah. So it's kind of a multi-pronged thing, I think. So I would say that the primary job is to determine when the neural network is operating out of its own experience. There's a technical term for this. It's called out-of-distribution detection, OOD detection.
(22:18): And so these neural networks are trained on a huge quantity of data, so it's not reasonable to, let's say, take a new scenario and then check it against the existing data. There's no time to do that. You couldn't store that existing data anyway, right? So somehow, you've got to just take the neural network as it is, which is just this computational object that's very hard to interrogate, look at its inputs and outputs, and infer whether it's operating out of distribution. And it turns out, there are some pretty good ways to do that, even though it sounds kind of impossible.
(22:48): You don't know what the right answer is, you just see inputs and outputs. How do you know if it's seen that example before, and whether or not it's giving a correct output? The way that you typically do this, or one good way to do this, is to train a bunch of neural networks. It's called a neural network ensemble, maybe five neural networks, to do the same task, but neural networks can accomplish the same task in a lot of different ways, roughly speaking. So you train five different neural networks to accomplish the same task, and you let them vote, and when they disagree, probably that's because they're in a territory that none of them has seen before. It's very much like a human committee, right?
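A minimal sketch of that ensemble idea is below: train a handful of small networks on the same task and use their disagreement as an out-of-distribution warning signal. The data, network sizes, and test points are illustrative assumptions; a real driving stack would apply this to far richer inputs.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train = rng.uniform(-3, 3, size=(500, 1))                      # the training "experience"
y_train = np.sin(X_train).ravel() + 0.05 * rng.standard_normal(500)

# Five networks trained on the same task, differing only in random initialization.
ensemble = [
    MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=s).fit(X_train, y_train)
    for s in range(5)
]

def disagreement(x):
    """Spread of the ensemble's 'votes' at input x; large spread suggests unfamiliar territory."""
    preds = np.array([m.predict(np.array([[x]]))[0] for m in ensemble])
    return preds.std()

print("in-distribution  x=1.0, vote spread:", round(disagreement(1.0), 3))
print("out-of-distribution x=8.0, vote spread:", round(disagreement(8.0), 3))
```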
Russ Altman (23:29): Right.
Mac Schwager (23:30): Yeah.
Russ Altman (23:30): It's going to be very interesting because that example you gave, however simple, at the end of the day, you're going to have to have an autonomous vehicle that sees what's clearly a stop sign, but it has enough contextual knowledge to say, "I'm going to ignore that," and that seems to me like a big deal.
Mac Schwager (23:46): Yeah.
Russ Altman (23:48): There's going to be a lot of capabilities.
Mac Schwager (23:49): Yeah.
Russ Altman (23:49): So that will be something we will be following. But in the last few minutes, I definitely wanted to get to some of your work in non-car settings, including, I know a lot of work on drones.
Mac Schwager (23:58): Yeah.
Russ Altman (23:58): So what's the excitement about drones, and what are the technical challenges, perhaps that are different from what you're seeing in cars and land-based vehicles?
Mac Schwager (24:06): Yeah, yeah, absolutely. Actually, probably most of my research is focused on drones, and there are a number of things that are easier for drones and things that are harder. So one thing that's harder, of course, is drones live in 3D space. They're not stuck on the roadway, so they have more degrees of freedom, and their dynamics are a bit more complicated in terms of the maneuvers that they can accomplish, and so you have to contend with those more complicated dynamics. On the plus side, drones usually are flying up above the tree canopy, or if they're not above the tree canopy, at least they're above people's heads and above the ground traffic, and so there's fewer moving obstacles to contend with, so in that sense, it's a little bit easier.
(24:52): As far as the excitement, it depends on who you talk to, but what I'm really excited about for drones is the ability to use drones to gather information on the ground over large-scale areas to do things like ecological monitoring. So in California, wildfires are a huge problem and a growing problem due to climate change, and so you could imagine that the U.S. Forest Service has a fleet of drones that they just constantly rotate in the skies above the national parks, and those things are just our wildfire monitors, right? They're our eyes in the sky, to give us kind of an early alarm for when and where forest fires break out. And then maybe, in the future, one could imagine that there are forest firefighting drone fleets that actually go take off and autonomously put out those fires-
Russ Altman (25:41): Oh, they deliver the water or the anti-retardant.
Mac Schwager (25:44): For the fire retardant, yeah. [inaudible 00:25:46], so-
Russ Altman (25:48): Now, what are the challenges for you as an engineer in that?
Mac Schwager (25:51): Yeah. So for me, in my particular research focus, the challenges are around coordination and collaboration. So yeah, if you've got all of the Sierra Nevada range and you want to monitor that many million square miles of forest for forest fire risk, how do you do that? You're given a fleet of ... This is totally academic, right?
(26:13): There is no such fleet of forest fire monitoring drones, but in the future, you could imagine that there are 10,000 forest fire monitoring drones. Where do you send them? How do they use partial information to recruit other drones in the fleet to come [inaudible 00:26:31]-
Russ Altman (26:30): Oh, I love that. I love that. I need some backup. I need some backup drones.
Mac Schwager (26:34): Exactly, need some backup. Exactly, yeah. "There's something funny here, but I don't have the battery life or the camera resolution to make sure that this is safe, so friends, gather around and help me out," this kind of thing. And so this is a lot of the algorithmic work that we do in my lab, is, "How do you get this kind of group emergent behavior from groups of robots?"
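As a toy illustration of that recruitment behavior, the sketch below has a patrol drone flag a hotspot and pull in its nearest neighbors that still have enough battery. The fleet size, patrol area, battery threshold, and hotspot location are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
positions = rng.uniform(0, 100, size=(10, 2))   # 10 patrol drones over a 100 x 100 km area
battery = rng.uniform(0.2, 1.0, size=10)        # remaining battery fraction for each drone

hotspot = np.array([60.0, 40.0])                # location of a suspected fire
N_HELPERS = 3                                   # how many backup drones to recruit

# Rank drones by distance to the hotspot, skipping any that are too low on battery.
dist = np.linalg.norm(positions - hotspot, axis=1)
eligible = np.where(battery > 0.4)[0]
helpers = eligible[np.argsort(dist[eligible])[:N_HELPERS]]

for i in helpers:
    print(f"drone {i} rerouted to hotspot (distance {dist[i]:.1f} km, battery {battery[i]:.0%})")
```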
Russ Altman (26:58): Yes. And I could imagine, it's not hard to imagine a million other uses of drones that can be coordinated, and especially if you have a high-level control language that says, "I want a thousand of you. I want you to go here, and I want you to use strategy number seven to cover this area and see what's going on."
Mac Schwager (27:14): Right. Yes.
Russ Altman (27:14): Well, in the last minute, I just wanted to get your perspective on the future outlook on these multi-robot systems. I presume that you believe that this will be a net good.
Mac Schwager (27:27): Yeah. Okay.
Russ Altman (27:27): And so what do you think are the highlights of what we should look for in the future?
Mac Schwager (27:30): Yeah, no doubt. I mean, I wouldn't be in this business if I didn't think there was real promise in this technology. I mean, I think autonomous driving, it will come and it will make our lives easier, safer. We will be able to eat our cereal and read our newsfeed in the backseat of our autonomous car, or have a conversation with our loved ones, whatnot. It is just going to take time.
(27:53): It's common that big technologies are ... The early impact is overestimated, and the late impact is underestimated, and I think we're in that transition for autonomous driving. I think the honeymoon phase is over. We're seeing that there are a lot of long tail problems that need to be solved, but the impact will come, I think, and it will save lives in the sense of reducing accident rates, and it will reclaim a lot of the human hours that are lost in traffic today. And then when it comes to drones, the power that's possible is incredible.
(28:28): Just imagine that you have drones that can, as I said, look out for forest fire risk, drones that can monitor animal populations, drones that can ... We've actually done this. We've deployed drones in Antarctica to count penguin populations in penguin colonies.
Russ Altman (28:44): Oh, fantastic. I also know you have drones or some kind of autonomous vehicle that can shepherd animals.
Mac Schwager (28:51): Yeah, yeah. So we've written papers about, I'll say algorithms for drones to coordinate to shepherd animals. We haven't deployed, though. There's a lot of hardware challenges and engineering challenges in building a drone that can actually fly long enough and far enough to shepherd animals, but algorithmically, yeah, we can do things like that. You can also imagine drones in agriculture, really eking out every last bit of productivity of farmlands, or stopping pest infestations before they spread, replacing pesticides with more targeted and more ecologically and health-friendly methods of avoiding pests. So yeah, I think the technology is extremely promising.
Russ Altman (29:32): So we are going to declare the future of autonomous vehicles, especially in groups and especially when they're communicating. We're going to declare that to be bright. Well, you've been listening to The Future of Everything. Thanks to Professor Mac Schwager. This has been Russ Altman. You can follow me on Twitter, @Rbaltman, and you can follow Stanford Engineering @StanfordEng.