Tesla AI Day 2021 featured executives presenting the full range of the company’s artificial intelligence efforts, from computer vision to planning and controls to simulation to data infrastructure to automotive supercomputers to data center supercomputers. Elon Musk capped the event by announcing a humanoid robot that Tesla
The individual details are fascinating, often amazing and sometimes unconvincing. But the sum of the parts is simply that Tesla is doing a lot. Again and again, Tesla revealed that it is building in-house software, hardware, tools, and processes that most other companies would outsource to specialist suppliers.
Andrej Karpathy, Senior Director of Artificial Intelligence, took the first slot on stage to dive deep into the architecture of Tesla’s neural networks. Karpathy revealed that four years ago, when he joined the company, the neural networks used on the vehicle each took as input a single image from a single camera.
In the years since his arrival, Karpathy’s team has developed a new neural network architecture that takes advantage of both time and space metadata to improve the system’s performance. Specifically, a single network backbone takes in all of the images from all of the cameras on the car, and uses the combination to better understand the holistic environment.
The new architecture also “remembers” previous image inputs. For example, a traffic sign that the car passed a moment ago could still affect the vehicle’s perception of and predictions about the future, even after that sign is no longer in view.
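The two ideas above, combining all cameras and remembering recent frames, can be illustrated with a toy sketch. This is not Tesla’s architecture: the `FeatureMemory` class, the concatenation step, and the simple averaging window are all hypothetical stand-ins for what would be learned neural network layers in a real system.

```python
from collections import deque


class FeatureMemory:
    """Toy stand-in for spatial and temporal fusion (hypothetical, not Tesla's design)."""

    def __init__(self, window=3):
        # Keep only the most recent `window` fused frames.
        self.frames = deque(maxlen=window)

    def update(self, per_camera_features):
        # Spatial fusion: concatenate every camera's features into one vector.
        fused = [value for camera in per_camera_features for value in camera]
        self.frames.append(fused)
        # Temporal fusion: average each feature over the remembered frames.
        count = len(self.frames)
        return [sum(column) / count for column in zip(*self.frames)]


memory = FeatureMemory(window=2)
print(memory.update([[1.0], [0.0]]))  # front camera sees a sign
print(memory.update([[0.0], [0.0]]))  # sign is out of view but still remembered
print(memory.update([[0.0], [0.0]]))  # sign has aged out of the window
```

In this sketch the “sign” feature still contributes to the output one frame after it disappears from view, which is the behavior the paragraph describes.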
Perhaps most interestingly, Karpathy explained how Tesla changed the system’s outputs. Instead of outputting lane lines or object detections, which are the type of results that humans can understand, the neural networks now generate outputs in “vector space”. That is, the outputs maximize their usefulness to other parts of the AI system, even if those same outputs might not be immediately intelligible to humans.
Planning & Control
Ashok Elluswamy, who leads the Planning & Controls Team at Tesla, took the stage next. Elluswamy walked through a number of the challenges in analytically solving the vehicle planning problem, and then shared Tesla’s application of neural networks to overcome these issues.
Specifically, Tesla applies an algorithm called Monte Carlo tree search, which has previously proven successful in training computers to play games like Go. Training a neural network to execute this algorithm leads to much faster convergence on a viable path, relative to algorithmic graph search. Elluswamy used a parking lot example to illustrate the gap in performance between deterministic search algorithms and neural network approaches.
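For readers unfamiliar with Monte Carlo tree search, here is a minimal sketch of the plain algorithm on a toy grid. Everything in it is an assumption made for illustration: the 4x4 grid, the blocked cells, the reward, and the random rollouts. In a system like the one described, a learned model would presumably guide the search; this sketch does not attempt that.

```python
import math
import random

# Hypothetical toy world: a 4x4 grid with two blocked cells and a goal corner.
SIZE = 4
GOAL = (3, 3)
BLOCKED = {(1, 1), (2, 1)}
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
MAX_DEPTH = 12  # step budget for any single plan


def step(pos, move):
    dx, dy = MOVES[move]
    return (pos[0] + dx, pos[1] + dy)


def legal_moves(pos):
    moves = []
    for name in MOVES:
        x, y = step(pos, name)
        if 0 <= x < SIZE and 0 <= y < SIZE and (x, y) not in BLOCKED:
            moves.append(name)
    return moves


class Node:
    def __init__(self, pos, depth, parent=None, move=None):
        self.pos, self.depth, self.parent, self.move = pos, depth, parent, move
        self.children = []
        self.untried = legal_moves(pos)
        self.visits = 0
        self.total = 0.0

    def terminal(self):
        return self.pos == GOAL or self.depth >= MAX_DEPTH


def rollout(pos, depth, rng):
    # Play random moves until the goal is reached or the step budget runs out.
    while pos != GOAL and depth < MAX_DEPTH:
        pos = step(pos, rng.choice(legal_moves(pos)))
        depth += 1
    # Reaching the goal sooner earns more; missing it earns nothing.
    return 1.0 - depth / (2 * MAX_DEPTH) if pos == GOAL else 0.0


def uct_child(node, c=1.4):
    # Balance average reward (exploitation) against rarely tried children (exploration).
    def score(child):
        bonus = c * math.sqrt(math.log(node.visits) / child.visits)
        return child.total / child.visits + bonus
    return max(node.children, key=score)


def mcts(start, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(start, 0)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes.
        while not node.terminal() and not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one untried child.
        if not node.terminal() and node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(step(node.pos, move), node.depth + 1, node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: estimate the value of the new node with a random rollout.
        reward = rollout(node.pos, node.depth, rng)
        # 4. Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    # Recommend the most visited first move.
    return max(root.children, key=lambda child: child.visits).move


print(mcts((3, 2)))  # one step above the goal
```

The four numbered phases are the standard structure of the algorithm. Swapping the random rollout for a learned value estimate is the usual way a neural network is combined with this kind of search.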
Tesla has moved its data labeling in-house, another task that most companies would outsource. Karpathy returned to the stage to share that Tesla’s data labeling team is now based in the US and focuses on building computational tools to label data in vector space, rather than relying on low-cost offshore human labelers.
As an example of the efficiency that Tesla has achieved, Elluswamy shared that the effort to remove radar dependency required 10,000 labeled video clips. Outsourced labelers would have taken several months to complete that task. Tesla’s in-house, software-centric labeling team completed the task in a week.
Simulation, like so many of Tesla’s software projects, has become heavily reliant on artificial intelligence. The company uses adversarial machine learning techniques to improve the photorealism of its simulator, to the point that the example clip Tesla shared was nearly indistinguishable from a real-life video feed.
Ganesh Venkataramanan, Senior Director of Autopilot Hardware, then presented Project Dojo, Tesla’s effort to build the world’s fastest supercomputer. Venkataramanan carefully explained the building blocks of the computational hardware, starting with 1 teraflop training nodes, 354 of which live together on a matchbox-sized “D1” chip.
Tesla optimized the training nodes for neural network computation, with a focus on parallel matrix multiplication. And the team custom designed the D1 chip with 7 nanometer silicon etching technology.
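To see why matrix multiplication parallelizes well, consider a blocked version of the computation. The sketch below is plain Python written for illustration only; the function name `matmul_blocked` and the block size are hypothetical, and it says nothing about how Tesla’s hardware actually schedules work.

```python
def matmul_blocked(a, b, block=2):
    """Multiply matrices a (n x k) and b (k x m) one output block at a time."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, block):
        for j0 in range(0, m, block):
            # Each (i0, j0) output block reads only a band of rows from `a`
            # and a band of columns from `b`, so different blocks could be
            # handed to different compute nodes with no shared writes.
            for p0 in range(0, k, block):
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, m)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + block, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


print(matmul_blocked([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10], [11, 12]]))
```

Because the output blocks are independent, the work divides cleanly across many small processors, which is the property that hardware built for neural network training tends to exploit.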
Twenty-five D1 chips come together on a pizza box-sized “training tile”, which has 9 petaflops of computational power.
One hundred twenty training tiles will make up the ExaPOD, Tesla’s data center supercomputer.
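The quoted throughput figures can be sanity-checked with a little arithmetic. The sketch below uses only the per-node, per-chip, and per-tile numbers reported above. It assumes that throughput adds up linearly and that an “exa”-scale machine means at least one exaflop, so the derived counts are rough estimates rather than published specifications.

```python
import math

# Figures as quoted in the presentation recap.
NODE_TFLOPS = 1        # teraflops per training node
NODES_PER_CHIP = 354   # training nodes per D1 chip
TILE_PFLOPS = 9        # petaflops per training tile

# Assumption: throughput scales linearly with the number of parts.
chip_tflops = NODE_TFLOPS * NODES_PER_CHIP
chips_per_tile = TILE_PFLOPS * 1000 / chip_tflops
tiles_for_exaflop = math.ceil(1_000_000 / (TILE_PFLOPS * 1000))

print(f"One D1 chip: about {chip_tflops} teraflops")
print(f"Chips implied per tile: about {chips_per_tile:.1f}")
print(f"Tiles needed to reach one exaflop: {tiles_for_exaflop}")
```

Under these assumptions, a 9 petaflop tile corresponds to roughly 25 D1 chips, and a little over a hundred tiles are enough to cross one exaflop.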
During the presentation, Venkataramanan spoke of this hardware architecture in the present tense. However, at the end of the presentation, he revealed that the first training tile had arrived at Tesla only last week, indicating that Project Dojo remains a long way from completion.
Finally, Elon Musk himself took center stage to present the Tesla Bot, a humanoid robot designed to complete dangerous, repetitive, and boring tasks.
Musk explained the logic of applying Tesla’s robotics and AI expertise toward building non-automotive robots, calling Tesla, “arguably the world’s biggest robotics company.” With the progress on Tesla’s neural networks and hardware, Musk said, “it kind of makes sense to put that onto a humanoid form.”
Although he predicted the first Tesla Bot prototype would come together next year, Musk also cautioned several times that the Tesla Bot “is not real.” He promised, however, that it “will be real.”
Tesla’s ambition in artificial intelligence is breathtaking. Few companies have the resources to commit to even one of the several areas in which Tesla is breaking new ground.
In the question and answer session at the end of the event, Musk predicted that Tesla’s computational hardware would outperform competitors’ because Tesla designs its hardware to accomplish one task, not many.
Ironically, that same principle could apply to Tesla itself, as the company works not only on autonomous vehicles, but also on world-leading neural networks, data labeling, simulators, computational hardware, and humanoid robots.