Friday, July 31, 2020

Tesla and The March of Nines to Full Self Driving


“It always seems impossible until it's done.” ~ Nelson Mandela



Tesla is working on full self-driving (FSD) cars. Some have said this is impossible. When it is done, it will be added to the growing list of things Tesla has achieved that were once branded impossible. These once-impossible achievements were not always delivered on the promised timeline, but they nonetheless arrived. Trent Eady (the same person tweeting to Elon Musk in the image above) said it well when they wrote, “If Musk promises you the moon in six months and delivers it in three years, keep things in perspective: you’ve got the moon.” How long will the FSD moon take to be delivered? That's what we'll explore below.

In early July of 2020, at the World Artificial Intelligence Conference Musk said, “I’m extremely confident that Level 5 autonomy, or essentially complete autonomy, will happen, and I think it will happen very quickly. I remain confident that we will have the basic functionality for Level 5 autonomy complete this year.”

There’s a massive amount of work with each order of magnitude of reliability. This is the long 'March of the Nines'.

In the Q2 Financial update later in July, Musk reiterated his confidence in FSD, “The car will seem to have just like a giant improvement. We’ll probably roll it out later this year. [It will] be able to do traffic lights, stops, turns, everything pretty much. Then it will be a long march of nines, essentially. How many nines of reliability are okay? So it’s definitely way better than human, but how much better than human does it need to be? That’s actually going to be the real work. There’s just a massive amount of work with each kind of order of magnitude of reliability.”

What Are Nines

Musk mentioned the "nines of reliability." What are the nines? There are plenty of systems where 99% reliability is sufficient. If a video game crashes occasionally, it might be annoying, but no real damage is done. By contrast, something like a flight control system needs to be 99.999% reliable or better. It can be tedious to say, “ninety-nine point nine nine nine," so the verbal shorthand is to ignore the decimal point and just say the number of nines, e.g., 99.999% is called five-nines. It would be nice if we had 100% reliable systems, but that is an impossibility. Failures occur, components age, cosmic rays flip bits... so you have a backup, but the backup could fail too, so you have a backup for the backup, but that could fail too... Each layer of backup improves the overall system reliability, but, short of an infinite number of backups, it remains possible that all of the backups fail at once, either coincidentally or due to a common cause.
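To make the shorthand concrete, here is a minimal sketch that converts a reliability percentage into a count of nines. The function name `count_nines` is made up for this post, and the logarithm lets it report fractional nines for values that fall between the round numbers.

```python
import math


def count_nines(reliability_percent: float) -> float:
    """Return the number of nines for a reliability given in percent.

    99% -> 2 nines, 99.999% -> 5 nines. Values in between come back
    as fractions (80% is roughly 0.7 of a nine).
    """
    failure_fraction = 1.0 - reliability_percent / 100.0
    if not 0.0 < failure_fraction <= 1.0:
        raise ValueError("reliability must be at least 0% and below 100%")
    # Each nine is one more factor of ten off the failure rate.
    return -math.log10(failure_fraction)


for pct in (80.0, 99.0, 99.9, 99.999):
    print(f"{pct}% is about {count_nines(pct):.2f} nines")
```

Note that 100% is rejected on purpose: a system with zero failures would have an infinite number of nines.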

Why Nines Matter

Here's a simple example of why 99% is not good enough. There are about 150 billion credit card transactions each year totaling about $10 trillion. If these transactions were correct 99% of the time, that would be 1.5 billion transactions (~$100 billion) with errors each year. A system at this scale needs to be better than 99% reliable. Five-nines (99.999%) of reliability would reduce the annual error count to “only” 1.5 million errors per year. Seven-nines would reduce it to 15,000 errors (still $1 million in annual errors). This is a system where it literally pays to improve reliability.
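The arithmetic is easy to check. This sketch uses the round figures from the paragraph above (150 billion transactions, $10 trillion) and assumes errors are spread evenly across transactions and dollars; `annual_errors` is just an illustrative helper.

```python
from typing import Tuple

TRANSACTIONS_PER_YEAR = 150_000_000_000   # about 150 billion
DOLLARS_PER_YEAR = 10_000_000_000_000     # about $10 trillion


def annual_errors(nines: int) -> Tuple[int, int]:
    """Return (bad transactions, dollars affected) per year.

    A system with N nines fails one time in 10**N, so both totals
    are simply divided by 10**N.
    """
    one_failure_in = 10 ** nines
    return (TRANSACTIONS_PER_YEAR // one_failure_in,
            DOLLARS_PER_YEAR // one_failure_in)


for n in (2, 5, 7):
    count, dollars = annual_errors(n)
    print(f"{n} nines: {count:,} bad transactions, ${dollars:,} affected")
```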

What is the March and Why is it So Long?

There are a few ways to look at this and it is different for any effort. Generally speaking, the more complex the system, the more difficult it is to improve its reliability. In a complex system, it can be hard to see the 2nd and 3rd order effects of potential changes.

Let's start with the 80/20 Rule.

The 80/20 Rule or Pareto Principle has many applications. For our purposes, we'll consider software development and we'll call feature-complete the 80% mark of the effort for a highly reliable application. Let's say that 80% effort took 8 months. That's an average of 10% each month, so the project should be 100% complete in just 2 more months, right? Unfortunately, the last 20% does not scale linearly like the first 80%. This last 20% is where all the hard problems live. These are the bugs that only show up intermittently or in full integration testing, the race conditions, the bug fix that would require nearly a complete rewrite, or the scalability problems that only show up at your biggest customer's site...

That remaining 20% becomes its own 100% effort. An additional eight months later you might have 80% of that 20% done, and the cycle repeats. The number of iterations you go through depends on the level of dependability that your application needs. Let's look at a progression and see how long it would take for this fictional application to reach five-nines and just for fun, let's look at ten-nines too.

Cycle    Reliability %       Nines
1        80                  ~1
2        96                  1
3        99                  2
4        99.8                ~3
5        99.97               3
6        99.99               4
7        99.999              5
8        99.9997             5
9        99.99995            6
10       99.99999            7
11       99.999998           ~8
12       99.9999996          8
13       99.9999999          9
14       99.99999998         ~10

According to our 80/20-rule table, it will take 7 development cycles to hit five-nines. In this example, each cycle was 8 months, so that's 56 months or 4 years, 8 months.
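For anyone who wants to play with the numbers, here is a rough sketch of the 80/20 progression, assuming every cycle clears exactly 80% of whatever is still broken. The function names are invented for illustration. Printing fractional nines also shows the rounding involved: cycle 7 comes out at about 4.9 nines, which displays as 99.999%.

```python
import math


def reliability_percent(cycles: int, solved_per_cycle: float = 0.8) -> float:
    """Reliability after a number of cycles, each fixing 80% of what is left."""
    remaining = (1.0 - solved_per_cycle) ** cycles
    return 100.0 * (1.0 - remaining)


def nines_after(cycles: int, solved_per_cycle: float = 0.8) -> float:
    """Fractional nines after a number of cycles.

    Computed from the logarithm directly so tiny remainders do not get
    lost to floating-point rounding.
    """
    return -cycles * math.log10(1.0 - solved_per_cycle)


MONTHS_PER_CYCLE = 8  # the fictional project's time to feature-complete

for cycle in range(1, 15):
    print(f"cycle {cycle:2d}  month {cycle * MONTHS_PER_CYCLE:3d}  "
          f"{reliability_percent(cycle):.8f}%  "
          f"{nines_after(cycle):.2f} nines")
```

Changing `solved_per_cycle` shows how sensitive the schedule is to the rule of thumb: at 0.9 per cycle, each cycle is worth a full nine.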

Imagine the conversation where you were 8 months into a project and you were 80% done and then you told your boss or customer that the last 20% will take 4 more years. They might think you're sandbagging them. It's hard to believe that it could take as long to go from 99.99% to 99.999% as it did to go from zero to 80% but this is why this is often referred to as “the long tail."

The 80/20 rule is straightforward, but as I mentioned at the start, no two projects are the same. Progress is made in fits and starts, and the 80/20 rule is just a rule of thumb and only one possible model.

If the problem that you're tackling has a long tail, then early progress must not be linearly extrapolated to determine a likely completion date.


Another, more academic, way to view the long tail is the Empirical Rule, also known as the "68–95–99.7 rule." You can find tons of equations in project planning books on this, but we'll keep it simple here. With this method, each iteration accounts for one more standard deviation of inputs, defects, system behavior... on a normal probability distribution curve.

Cycle   Reliability %    Nines
1       68               0
2       95               1
3       99.7             2
4       99.99            4
5       99.9999          6
6       99.9999998       ~9

If our hypothetical application follows the empirical rule, we'd achieve five-nines in just 5 cycles or 3 years, 4 months. Remember when it seemed like we could be done in just 10 months? If the problem that you're tackling has a long tail, then early progress, although great, should not be linearly extrapolated to determine a likely completion date.
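The empirical-rule percentages come from the normal distribution, so they can be reproduced with the error function in Python's standard library. This is a sketch under the same assumption as the model itself (each cycle covers one more standard deviation); the helper names are hypothetical.

```python
import math


def coverage(k_sigma: int) -> float:
    """Fraction of a normal distribution within k standard deviations."""
    return math.erf(k_sigma / math.sqrt(2.0))


def nines_at(k_sigma: int) -> float:
    """Fractional nines at k sigma, using the tail (erfc) for precision."""
    tail = math.erfc(k_sigma / math.sqrt(2.0))
    return -math.log10(tail)


def first_sigma_reaching(target_nines: float, max_sigma: int = 10) -> int:
    """Smallest whole number of sigmas that meets the target nines."""
    for k in range(1, max_sigma + 1):
        if nines_at(k) >= target_nines:
            return k
    raise ValueError("target not reached within max_sigma")


for k in range(1, 7):
    print(f"cycle {k}: {100 * coverage(k):.7f}%  {nines_at(k):.2f} nines")

print("cycles needed for five-nines:", first_sigma_reaching(5))
```

Because the normal curve's tails fall off faster than a fixed 80% per cycle, this model jumps from about four nines at cycle 4 to past six nines at cycle 5.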

How Good Are Human Drivers?

The goal is for an AI driving system to be better than human drivers. In our article, “AI Driver: Safer Is Not Enough," we discussed why self-driving cars will need to be more than just a little better than human drivers, but let's just look at human drivers and see where that bar is set.

Despite the accidents that cause mile-long traffic jams and seem to happen all too often, human drivers do a remarkably good job, all things considered. Humans have poor reaction times, are unable to look in multiple directions simultaneously, are distractible, have several blind spots, occasionally fall asleep at the wheel, drink and drive, have medical issues... yet humans are only involved in an injury collision about once every 1 million miles, and a fatal crash only once per 100 million miles or so. Measured per mile driven, this is an injury collision avoidance performance of six-nines and a fatal collision avoidance rating of eight-nines.
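Here is the per-mile conversion as a quick sketch. It treats every mile as an independent pass/fail trial, which is a simplification, and `per_mile_nines` is a name invented for this example.

```python
import math


def per_mile_nines(miles_between_events: float) -> float:
    """Nines of per-mile reliability given average miles between events.

    One event per N miles is a per-mile failure rate of 1/N, so the
    number of nines is log10(N).
    """
    if miles_between_events <= 1:
        raise ValueError("expected more than one mile between events")
    return math.log10(miles_between_events)


print("injury collisions:", per_mile_nines(1_000_000), "nines")
print("fatal crashes:", per_mile_nines(100_000_000), "nines")
```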

Applying the Nines to Tesla Full Self Driving

Musk did not promise FSD by the end of 2020. He said he was confident that they would have “basic functionality for Level 5" by the end of the year; then “the real work" begins. Musk stated, “There’s just a massive amount of work with each kind of order of magnitude of reliability." I think Musk's assessment of the “real work" effort after feature-complete is accurate and an under-appreciated aspect of system development; remember our simple 8 months to feature-complete project that took another 3 to 4 years to reach five-nines. As we see from the human driving data, FSD will need at least six-nines to be as good as a human.

Every Tesla made today has eight cameras, a front-facing radar, and ultrasonic sensors. These sensors are important, but the heart of the system is a deep learning artificial intelligence. All of the various sensor data, GPS info, navigation, speed data, and more are streamed to the AI system where it attempts to make sense of the world around it, make real-time decisions, and get you to your destination without an insurance claim or a hospital visit.

The hard part of a self-driving system is not simply staying in a well-marked lane; it's dealing with all of the edge-cases. Computer systems interacting with each other can have a massive number of edge-cases. Self-driving systems have to interact with the real world, a smorgasbord of edge-cases: occluded signage, rain, snow, dirty cameras, construction, something falling off a truck, potholes, animal crossings, people in costumes, unpredictable human drivers, bicyclists, runners, scooters, skateboarders...

Some have asserted that the tail is so long that it will be impossible for an AI system to drive a car until AI has common sense and understands things like a person looking at their phone is not paying attention and that a person in a costume is still a person. Plus, many situations at intersections are resolved with eye contact and hand waves. How will an AI navigate this? These certainly are difficult problems, but that's what makes engineering interesting. They will be solved without requiring an AI to be conscious; the only question is when.

When Will Tesla Achieve Level 5? 

To know when you're done with a project, you have to know the goal. Going through this, we've established some of the criteria:
  • Hit feature-complete, so the "real work" can begin
  • Better than a human driver (better than six-nines)
  • Able to handle novel situations safely
Using the methods we've outlined above, we need to know how long it took to get to feature-complete. Let's assume Musk is correct and feature-complete will happen in December of 2020. Now we need to know when Autopilot development started. This is a more complicated question. Musk first mentioned Autopilot publicly in May of 2013. Certainly, work had already started if it was being discussed publicly. Using this starting point, it would be 91 months to go from zero to feature-complete. However, Tesla initially worked with Mobileye on the Autopilot 1 system. Autopilot 2 started shipping in October of 2016. This is when the sensor suite that's in production now was first seen. Using this starting point, it would be 50 months from start to feature-complete.

Andrej Karpathy became Tesla's director of artificial intelligence in June 2017. With Karpathy's arrival, the direction of Autopilot development shifted significantly, with more operations moved into a unified neural net backbone with multiple heads (dubbed the hydranet). Using Karpathy's arrival as the starting date would yield 42 months.

In April of 2019, Tesla released Hardware 3 and referred to the inference engine hardware as their FSD computer. This is a Tesla-designed custom system-on-a-chip (SoC) to run their neural network. Tesla claims that the new system was 21 times faster than their previous vendor-supplied solution. This is when Tesla said that they had the hardware platform that they required for FSD to be achieved. Based on this date, the time to feature-complete would be 20 months.
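The month counts for each candidate start date can be tallied with a few lines of code. This sketch counts whole calendar months elapsed up to an assumed December 2020 feature-complete date, without counting the starting month itself; the dictionary labels are my own shorthand for the milestones above.

```python
from typing import Tuple

FEATURE_COMPLETE = (2020, 12)  # assumed: December 2020

# (year, month) for each candidate starting point discussed above
START_DATES = {
    "Autopilot first mentioned": (2013, 5),
    "Autopilot 2 ships": (2016, 10),
    "Karpathy joins": (2017, 6),
    "Hardware 3 released": (2019, 4),
}


def months_between(start: Tuple[int, int], end: Tuple[int, int]) -> int:
    """Calendar months from start to end, each given as (year, month)."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])


for label, start in START_DATES.items():
    print(f"{label}: {months_between(start, FEATURE_COMPLETE)} months")
```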

Now, which of these dates should we select as our start? I don't want to keep "moving the goalpost" and allow any significant event to be a restart point, yet I don't want to allow false starts or work by suppliers to count against the time either. Given these competing goals, I'm selecting Andrej Karpathy's start date as the legitimate beginning of Tesla's current FSD direction. (Let me know which date you'd select.)

Given Karpathy's start date and a possible feature-complete of December 2020, that's 42 months from start to feature-complete. So looking at the two models we have above, how long would it take to reach the goal of better than six-nines?

The 80/20 rule would need 9 iterations (8 more after December). That would be 8 * 42 months or 28 years to get to six-nines. By this model, the steering wheel could be deleted from the parts list in December of 2048. Let's look at the other model.

The empirical model gets to six-nines in only 5 iterations (4 more after December). That would be 4 * 42 months or 14 years before you could fall asleep and wake up safe and sound at your destination.

It's possible that we won't see self-driving cars until 2034, but let's use the more recent HW3 date. You could make a case that until this hardware was available, the AI was severely limited and this bottleneck hampered progress. Using this more optimistic date, it was 20 months from power-on to feature complete. And since we're going for the optimistic model, let's use the empirical model. Five iterations (4 more after December) would be 4 * 20 months or 6 years 8 months. That puts the 'sitting in New York and summon your car from LA' date as August of 2027.
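The three projections above all follow the same recipe: take the number of cycles still needed after feature-complete, multiply by the months per cycle, and add that to December 2020. Here is a small sketch of that recipe; the function name and scenario labels are illustrative, and the cycle counts are the ones read off the two models earlier.

```python
from typing import Tuple

FEATURE_COMPLETE = (2020, 12)  # assumed: December 2020
MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def projected_finish(remaining_cycles: int,
                     months_per_cycle: int) -> Tuple[int, int]:
    """Return (year, month) after the remaining cycles have elapsed."""
    year, month = FEATURE_COMPLETE
    # Work in a zero-based count of months so divmod splits it cleanly.
    total = year * 12 + (month - 1) + remaining_cycles * months_per_cycle
    finish_year, month_index = divmod(total, 12)
    return finish_year, month_index + 1


SCENARIOS = [
    ("80/20 rule, Karpathy start", 8, 42),
    ("empirical rule, Karpathy start", 4, 42),
    ("empirical rule, HW3 start", 4, 20),
]

for label, cycles, months in SCENARIOS:
    year, month = projected_finish(cycles, months)
    print(f"{label}: {MONTH_NAMES[month - 1]} {year}")
```

Swapping in a different start date or cycle count is a one-line change, which makes it easy to see how wide the window gets.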

Before you assume these models are accurate, let me assure you, they. are. not. These are rules-of-thumb based primarily on people debugging complex systems, not deep learning AI. They are based on projects that occur within a handful of years on a single generation of hardware. AI is still a nascent field, and major breakthroughs are still occurring. Moore's Law yields periodic doubling of computing performance. In a mature technology, you don't see a 21x performance boost like Tesla's HW3 effort. And neural nets improve along a non-linear "S"-shaped sigmoid curve, which means they can quickly go from incompetence to mastery.


The point of this long entry was to attempt to determine when we might see robo-taxis on the road. Toward this effort, we've generated estimates from 2027 to 2048. This ~20-year window seems large, but if you're reading this, it is likely to occur within your lifetime. What I can guarantee is that new driver-assist features will continue to roll out and improve each year. And when self-driving cars happen, it will be a step-function in human history. Self-driving cars will join the list of humanity's greatest breakthroughs along with the wheel, electricity, and powered flight; they will save more lives than penicillin, and yet they will be taken for granted as quickly as the self-piloting elevator was.

Disclosure: I'm long Tesla stock.