Article: Tesla Dojo Chip

Leeea

Diamond Member
Apr 3, 2020
3,625
5,368
136
The Tesla Dojo Chip Is Impressive, But There Are Some Major Technical Issues
The two things that stood out to me were CFP8 and the magic compiler solution.

The CFP8 data type is odd to me; going to an 8-bit data type now seems completely opposite to the general trend. An 8-bit float just seems baffling. Assuming they do a 5/3 split, that is 0-31 for the significand and 0-7 for the exponent. That just does not seem like much resolution for driving a car. But the lack of memory in the design seems to force CFP8.

Tesla is also claiming they have/will have a compiler that automatically optimizes for their hardware. Having seen this claim made before, and knowing the success rate associated with it, this seems unlikely.
 

dullard

Elite Member
May 21, 2001
25,065
3,413
126
The CFP8 data type is odd to me; going to an 8-bit data type now seems completely opposite to the general trend. An 8-bit float just seems baffling. Assuming they do a 5/3 split, that is 0-31 for the significand and 0-7 for the exponent. That just does not seem like much resolution for driving a car.
1) This is for training AI, not for driving a car. Training is the step where you take in massive amounts of data and let the computer figure out what the data means and how to use it. A whole different chip should be used in the field for driving.

2) AI is not extremely precise. It doesn't have to be. Think about car data. The incoming data is probably things like speed, angle of wheels, etc. What is 8 bits on the speed of a car? Suppose Tesla doesn't train their cars with data going over 100 MPH. 8 bits then gives you a speed resolution of 100 MPH / 256 = 0.39 MPH. When have you ever needed to know or control your car's speed with better than 0.39 MPH resolution? If you are driving at 65.00 MPH instead of 65.38 MPH, would your decision about whether or not to hit the brakes change? Or take wheel angle: the most you can possibly turn most cars' wheels is through 130°. 256 levels thus gives you at worst a 0.5° angle resolution. When have you thought that you need to adjust your steering by less than 0.5°?

If the AI were 16 bits instead, you'd have a speed resolution of 0.002 MPH. Do cars really have that level of speed control? Do they really have that level of accuracy in the speedometer? No. Those extra bits of information are just useless for AI. Useless bits mean more power, more memory, and slower calculations. AI is all about doing as many calculations as you possibly can, even if the results are not perfectly accurate. The inference application (actually driving) can just calculate another speed on the next clock tick.
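To put that arithmetic in one place (the 100 MPH and 130° ranges are just the examples from this post, not anything Tesla has published), here's a quick sketch of the step sizes you get from 8 and 16 bits:

```python
# Step size when a physical range is split across all values of an n-bit integer.
def resolution(full_range, bits):
    return full_range / (2 ** bits)

print(resolution(100, 8))   # speed, 0-100 MPH at 8 bits    -> ~0.39 MPH per step
print(resolution(130, 8))   # wheel angle, 130 deg at 8 bits -> ~0.51 deg per step
print(resolution(100, 16))  # speed at 16 bits               -> ~0.0015 MPH per step
```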
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,349
1,534
136
The CFP8 data type is odd to me; going to an 8-bit data type now seems completely opposite to the general trend.

No, it's perfectly in line with the general trend of ML accelerators reducing precision for more computing power. There is serious discussion of 4-bit types. In general, more nodes with less precision produce better results at lower power use than fewer nodes with more precision.

An 8-bit float just seems baffling. Assuming they do a 5/3 split, that is 0-31 for the significand and 0-7 for the exponent.

It's probably more like 1 or 2 bits for the significand and 6 or 7 for the exponent.
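Tesla hasn't published CFP8's exact bit layout, so both layouts below are guesses from this thread: the OP's 5/3 idea treated as sign + 3 exponent + 4 mantissa, and an exponent-heavy IEEE-style sign + 5 exponent + 2 mantissa. A quick sketch of the normal-number range each would give:

```python
# Normal-number range of an assumed 8-bit float layout (1 sign bit implied).
# Neither layout is confirmed by Tesla; both are guesses from this thread.
def fp8_range(exp_bits, man_bits):
    bias = 2 ** (exp_bits - 1) - 1
    max_exp = 2 ** exp_bits - 2                         # all-ones exponent reserved
    max_man = 1 + (2 ** man_bits - 1) / 2 ** man_bits   # 1.111... in binary
    return 2.0 ** (1 - bias), max_man * 2.0 ** (max_exp - bias)

print(fp8_range(exp_bits=3, man_bits=4))  # ~(0.25, 15.5): tiny dynamic range
print(fp8_range(exp_bits=5, man_bits=2))  # ~(6.1e-05, 57344.0): huge dynamic range
```

The exponent-heavy split gives up per-step precision to buy the dynamic range that weights and gradients actually need.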

That just does not seem like much resolution for driving a car.

ML does not require a lot of precision, regardless of what it's being used for. No result of the system directly maps to any quantity stored in any single weight.

But the lack of memory in the design seems to force CFP8.

It's probably the other way around: they chose CFP8 and then sized the memory to fit their needs. The author of that article seems a bit out of the loop; first he worries about the lack of memory, then seems mystified about why it has so much IO.

The basic idea of these kinds of gigantic-scale accelerators is that you keep the weights stationary in the device and then stream in the inputs and stream out the outputs. If you need to fit more weights, that also means you need to do more compute, so instead of adding SRAM to the design, you scale outwards and buy more hardware. This of course does not work for the kind of general-purpose workloads that, say, nVidia targets, where they want a small system or even a single card to be able to work on large problems by just taking more time. These systems, instead, will always be sized to the models they work on.
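A toy sketch of that weight-stationary pattern (nothing Dojo-specific; the matrix shapes and the 4-node split are made up): each node holds its slice of the weights permanently, activations are streamed through, and you add capacity by adding nodes rather than adding memory per node:

```python
import numpy as np

# Toy weight-stationary layout: each "node" permanently holds one slice of a
# layer's weight matrix; inputs are streamed in and partial outputs streamed out.
class Node:
    def __init__(self, weight_slice):
        self.w = weight_slice                 # stays resident on the node

    def process(self, x):
        return self.w @ x                     # stream in x, stream out our slice

rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 256))          # full weight matrix of one layer
nodes = [Node(ws) for ws in np.split(W, 4, axis=0)]   # scale out: 4 nodes

x = rng.standard_normal(256)                  # one streamed input
y = np.concatenate([n.process(x) for n in nodes])
assert np.allclose(y, W @ x)                  # same math, but no node holds all of W
```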
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,349
1,534
136
The incoming data is probably things like speed, angle of wheels, etc. What is 8 bits on the speed of a car?

No, no, no. In no world would that be a good idea. This isn't being used to literally train an AI to drive a car; it's being used to train an AI that handles parts of the problem of driving a car, most crucially image recognition and visual reasoning. The input data is not quantities like angle of wheels, it's images, and every input probably describes something like the brightness of a single color channel of a single pixel of an image. The ultimate output of the system is a massive array of confidences for things it thinks it sees.

(Such as, one output is how certain the AI is that there is a car in the lane ahead going in the same direction.)
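To make those shapes concrete (a generic classifier sketch; the resolution, labels, and outputs here are all made up, not Tesla's actual network): the input is millions of per-pixel brightness values, and the output is one confidence per thing the net can recognize:

```python
import numpy as np

# Generic sketch of the data flow: per-pixel brightness in, confidences out.
rng = np.random.default_rng(0)
image = rng.random((960, 1280, 3), dtype=np.float32)  # H x W x RGB channels
print(image.size, "scalar inputs")                    # ~3.7 million values per frame

labels = ["car_ahead_same_direction", "pedestrian", "lane_line", "traffic_light"]
logits = rng.standard_normal(len(labels))             # stand-in for the net's raw outputs
confidences = np.exp(logits) / np.exp(logits).sum()   # softmax -> one confidence per label

for name, conf in zip(labels, confidences):
    print(f"{name}: {conf:.2f}")
```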
 

dullard

Elite Member
May 21, 2001
25,065
3,413
126
No, no, no. In no world would that be a good idea. This isn't being used to literally train an AI to drive a car; it's being used to train an AI that handles parts of the problem of driving a car, most crucially image recognition and visual reasoning. The input data is not quantities like angle of wheels, it's images, and every input probably describes something like the brightness of a single color channel of a single pixel of an image. The ultimate output of the system is a massive array of confidences for things it thinks it sees.

(Such as, one output is how certain the AI is that there is a car in the lane ahead going in the same direction.)
So, you want it to output whether or not a car is ahead going in the same direction, but also NOT know which direction either car is going? Think about that.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,349
1,534
136
So, you want it to output whether or not a car is ahead going in the same direction, but also NOT know which direction either car is going?

It's not going to output that. All it's doing is figuring out "this is a car". This is done very often; the rest then happens outside the ML, with the objects from the ML system fed into some kind of simulation.
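A sketch of that handoff, with the detection format and the 0.7 threshold entirely my own assumptions: the ML stage only emits labeled confidences with rough positions, and ordinary non-ML code downstream decides what to do with them:

```python
# Hypothetical handoff from the ML stage to ordinary code. The detection
# format and the threshold are assumptions, not anything Tesla has described.
DETECTION_THRESHOLD = 0.7

ml_outputs = [  # one made-up frame of raw detections
    {"label": "car", "confidence": 0.94, "bbox": (412, 300, 520, 388)},
    {"label": "car", "confidence": 0.31, "bbox": (880, 310, 930, 350)},
    {"label": "pedestrian", "confidence": 0.88, "bbox": (120, 280, 160, 400)},
]

objects = [d for d in ml_outputs if d["confidence"] >= DETECTION_THRESHOLD]
for obj in objects:            # from here on it's tracking/planning, not ML
    print(obj["label"], obj["bbox"])
```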
 

Cogman

Lifer
Sep 19, 2000
10,277
125
106
Tesla is also claiming they have/will have a compiler that automatically optimizes for their hardware. Having seen this claim made before, and knowing the success rate associated with it, this seems unlikely.

Eh... it looks like LLVM is likely doing all the heavy lifting. Languages like Rust do little optimizing in their own compilation step before handing things off to LLVM, yet they get near-C performance.
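As an illustration of that division of labor (using llvmlite, a Python binding to LLVM; nothing here is Tesla's actual toolchain): the frontend only has to emit naive IR, and LLVM's optimizer and backends take it from there:

```python
# Minimal frontend-emits-IR sketch using llvmlite (pip install llvmlite).
# The "frontend" builds straightforward IR for a*b + c; LLVM's passes and
# backends handle optimization and machine-code generation from there.
from llvmlite import ir

fnty = ir.FunctionType(ir.FloatType(), [ir.FloatType()] * 3)
module = ir.Module(name="demo")
func = ir.Function(module, fnty, name="mul_add")
builder = ir.IRBuilder(func.append_basic_block(name="entry"))

a, b, c = func.args
builder.ret(builder.fadd(builder.fmul(a, b), c))

print(module)  # the textual LLVM IR that gets handed off to LLVM proper
```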

It's not going to output that. All it's doing is figuring out "this is a car". This is done very often; the rest then happens outside the ML, with the objects from the ML system fed into some kind of simulation.

So, they've moved a bit beyond the "this is a car" stage and (at least from the presentation) it looks like they are also pumping out "That car is the same car I saw in the last frame" as well as "I think there should be a car there even though my vision is occluded".

Using those properties, they are able to pull out things like speed.
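A back-of-the-envelope sketch of that last step (all the numbers here are made up): once the per-frame associations say "same car, new position", speed is just displacement over frame time:

```python
# Toy example: the same car matched across two consecutive frames, with its
# position in meters already recovered. Frame rate and positions are made up.
FRAME_DT = 1 / 30                            # assume a 30 fps camera

prev = {"id": 7, "position": (42.0, 3.1)}    # (forward, lateral) in meters
curr = {"id": 7, "position": (42.9, 3.1)}    # same track id one frame later

dx = curr["position"][0] - prev["position"][0]
dy = curr["position"][1] - prev["position"][1]
speed = (dx ** 2 + dy ** 2) ** 0.5 / FRAME_DT
print(f"{speed:.0f} m/s")                    # 0.9 m in 1/30 s -> 27 m/s (~60 MPH)
```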