Next Generation Tegra (Xavier SoC): Custom ARM64 + Volta

Sweepr

Diamond Member
May 12, 2006
5,148
1,142
131
7 Billion Transistors - TSMC 16 FF
8 Core Custom ARM64 CPU
512 Core Volta GPU
New Computer Vision Accelerator
Dual 8K HDR Video Processors
20 TOPS DL, 160 SPECINT in 20W
Sampling Q4 2017

14750582635821081937877_575px.jpg


www.anandtech.com/show/10713/gtc-europe-2016-nvidia-keynote-live-blog-with-ceo-jenhsun-huang#comments
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,847
3,387
136
Given it's sampling in over a year's time, it will be interesting to see if they hit their targets.

Other interesting bits will be what memory tech they package it with, and whether it's just a mobile SoC renamed and retargeted, given their lack of penetration in the mobile space, or whether it was built from the ground up.
 

Sweepr

Diamond Member
May 12, 2006
5,148
1,142
131
AnandTech has an article about it now.

Xavier brings 20 Deep Learning Tera-Ops (DL TOPS) at 20W. To give some perspective that's:
- Same as Tegra PX 2 at 1/4 the power (their current 16nm FF product)
- 43% of what NVIDIA’s flagship Tesla P40 can offer in a 250W card
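For a rough sense of what those two comparisons imply, here is a small sketch that works backwards from the quoted figures only. It assumes that "1/4 the power" is exact, that all numbers are peak DL TOPS, and that the P40 figure can be recovered by dividing by 0.43; the function names are made up for illustration.

```python
# Back-of-the-envelope efficiency from the figures quoted in the post.
# Assumptions: all numbers are peak DL TOPS, and "1/4 the power" is exact.

XAVIER_TOPS = 20.0
XAVIER_WATTS = 20.0


def tops_per_watt(tops: float, watts: float) -> float:
    """Peak throughput divided by board/SoC power."""
    return tops / watts


def implied_figures() -> dict:
    px2_watts = XAVIER_WATTS * 4   # same TOPS at 4x the power
    p40_tops = XAVIER_TOPS / 0.43  # Xavier is quoted as 43% of the P40
    return {
        "xavier_tops_per_watt": tops_per_watt(XAVIER_TOPS, XAVIER_WATTS),
        "px2_tops_per_watt": tops_per_watt(XAVIER_TOPS, px2_watts),
        "p40_tops": p40_tops,
        "p40_tops_per_watt": tops_per_watt(p40_tops, 250.0),
    }


if __name__ == "__main__":
    for name, value in implied_figures().items():
        print(f"{name}: {value:.2f}")
```

On those assumptions Xavier lands at about 1 TOPS/W, against roughly 0.25 TOPS/W for Drive PX 2 and a bit under 0.2 TOPS/W for the P40.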

Bodes well for Volta.

So what’s Xavier? In a nutshell, it’s the next generation of Tegra, done bigger and badder. NVIDIA is essentially aiming to capture much of the complete Drive PX 2 system’s computational power (2x SoC + 2x dGPU) on a single SoC. This SoC will have 7 billion transistors – about as many as a GP104 GPU – and will be built on TSMC’s 16nm FinFET+ process. (To put this in perspective, at GP104-like transistor density, we'd be looking at an SoC nearly 300mm2 big)
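The die-size estimate in that last parenthesis can be reproduced with a quick calculation. This is only a sketch: the GP104 figures used here (about 7.2 billion transistors on about 314 mm²) are not from the article quote and are my assumption based on NVIDIA's published specs, and the function name is hypothetical.

```python
# Rough die-area estimate at GP104-like transistor density.
# Assumption (not from the thread): GP104 is about 7.2 billion
# transistors on about 314 mm^2.

GP104_TRANSISTORS = 7.2e9
GP104_AREA_MM2 = 314.0


def area_at_gp104_density(transistors: float) -> float:
    """Return the die area in mm^2 if packed as densely as GP104."""
    density = GP104_TRANSISTORS / GP104_AREA_MM2  # transistors per mm^2
    return transistors / density


if __name__ == "__main__":
    print(f"{area_at_gp104_density(7e9):.0f} mm^2")
```

With those inputs the result comes out in the neighbourhood of 300 mm², in line with the article's ballpark. A real SoC mixes CPU, GPU, cache and I/O blocks with very different densities, so treat it as a ballpark only.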

Under the hood NVIDIA has revealed just a bit of information of what to expect. The CPU will be composed of 8 custom ARM cores. The name “Denver” wasn’t used in this presentation, so at this point it’s anyone’s guess whether this is Denver 3 or another new design altogether. Meanwhile on the GPU side, we’ll be looking at a Volta-generation design with 512 CUDA Cores. Unfortunately we don’t know anything substantial about Volta at this time; the architecture was bumped further down NVIDIA’s previous roadmaps for Pascal, and as Pascal just launched in the last few months, NVIDIA hasn’t said anything further about it.

www.anandtech.com/show/10714/nvidia-teases-xavier-a-highperformance-arm-soc
 

Sweepr

Diamond Member
May 12, 2006
5,148
1,142
131
Do I see 2x SMs with 256 CUDA cores each? That would be twice the number of CC per SM of Pascal (and 4x GP100).
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,847
3,387
136
Given the 20W power draw you'd certainly hope it was a ground up design :)
Not really. Turn everything up to max on a mobile SoC and you can draw some power; remember, the Shield needed active cooling.

Do I see 2x SMs with 256 CUDA cores each? That would be twice the number of CC per SM of Pascal (and 4x GP100).

It's a year away from sampling; you don't "see" anything yet :)
 

Sweepr

Diamond Member
May 12, 2006
5,148
1,142
131
Adding to the post above, that looks like a fairly large 512 CC GPU, given that Xavier should be around 300mm² (7 billion transistors @ 16nm FF). Volta cores might be larger than Pascal's.

It's a year away from sampling; you don't "see" anything yet :)

It's fun to speculate though. :p
 

raghu78

Diamond Member
Aug 23, 2012
4,093
1,475
136
Those specs indicate Volta is a massive improvement in efficiency. Nvidia's pace of innovation is increasing and I doubt we will ever see AMD compete. Volta should make its debut in HPC Tesla in 2017 and then will most probably make its way to desktop Geforce by early 2018.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,847
3,387
136
Those specs indicate Volta is a massive improvement in efficiency. Nvidia's pace of innovation is increasing and I doubt we will ever see AMD compete. Volta should make its debut in HPC Tesla in 2017 and then will most probably make its way to desktop Geforce by early 2018.

Not really. There is no such thing as a free lunch, and GCN is still very (very) good at compute. DL TOPS are 8-bit int ops, which aren't very useful for graphics, and GCN doesn't really have an issue there. You will notice they didn't list 32-bit or 64-bit FP ops; there is probably a reason for that. 8/16-bit ops see far more usage in mobile GPUs than on desktop, so if this product was originally (or still is) mobile targeted, it's a nice alignment with neural nets.

Pascal's and Maxwell's weapons have been their very good clocks and power consumption, not really much else (Polaris has caught up in the front end/geometry area). There's no need to jump right into the hyperbole of AMD being massively behind yet, especially seeing as Vega requires a completely new GFX ISA.

If Vega sucks, then yeah, sure, hyperbole away... :screamcat:
 

dark zero

Platinum Member
Jun 2, 2015
2,655
138
106
Remember that AMD has K12 too. It's the evolution of Seattle and is supposed to have better performance than ARM's A72, maybe near A73 levels.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,847
3,387
136
When/if AMD gets around to K12, it should be significantly higher performance than A72/A73 and target a higher power envelope anyway. K12's OoOE engine is bigger than Zen's (space freed up by a less complex front end than x86). I don't think AMD has a desire to enter the mobile/Android market; K12 will appear if the ARM server market looks like it has a chance of going somewhere.
 

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
It's possible many of these TOPS are coming from some other dedicated IP on the SoC. That'd explain the efficiency.

If not, then they would have had to increase the INT8 throughput by a factor of 4 to reach those numbers (Pascal currently does 8 INT8 ops per CUDA core per cycle, so they would have to increase it to 32).

Pascal (P4):
2560 cores at 1.063 GHz = 8*2560*1.063 = 21,770 GOPS = 21.8 TOPS

Volta (Xavier):
512 cores at 1.2 GHz? = 32*512*1.2 = 19,660 GOPS = 19.7 TOPS
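The same arithmetic as a small sketch, in case anyone wants to plug in other clocks. The 1.2 GHz Xavier clock is a pure guess (hence the question mark above), and the helper names are made up for illustration.

```python
# Peak INT8 throughput = cores * clock (GHz) * INT8 ops per core per cycle.
# The Xavier clock of 1.2 GHz is a guess, not an announced figure.


def int8_tops(cores: int, clock_ghz: float, ops_per_core_per_cycle: float) -> float:
    """Peak INT8 throughput in TOPS (cores * GHz gives giga-cycles/s)."""
    return cores * clock_ghz * ops_per_core_per_cycle / 1000.0


def required_ops_per_cycle(target_tops: float, cores: int, clock_ghz: float) -> float:
    """INT8 ops each core must retire per cycle to hit target_tops."""
    return target_tops * 1000.0 / (cores * clock_ghz)


if __name__ == "__main__":
    print(f"Pascal P4:     {int8_tops(2560, 1.063, 8):.1f} TOPS")
    print(f"Xavier guess:  {int8_tops(512, 1.2, 32):.1f} TOPS")
    print(f"Needed for 20: {required_ops_per_cycle(20, 512, 1.2):.1f} ops/core/cycle")
```

At the guessed clock, hitting the full 20 TOPS on 512 cores would take a little over 32 INT8 ops per core per cycle, so either the clock is slightly higher or some of the TOPS come from elsewhere on the SoC.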