NVidia announces Carmel ARM CPU

Nov 21, 2016
57
0
61
#2
Is it wider than Skylake and Zen?
 

dark zero

Platinum Member
Jun 2, 2015
2,515
8
91
#4
So why doesn't NVIDIA dare to enter the laptop market?
 

Qwertilot

Golden Member
Nov 28, 2013
1,413
37
106
#5
Not worth it, and still tough without an x86 license, Qualcomm notwithstanding. This thing looks at least somewhat specialised for supporting their other computational stuff. Probably very much so.
 

NostaSeronx

Platinum Member
Sep 18, 2011
2,367
156
126
#6
9 billion transistors, 350 mm squared, 12nm FFN... which means that when shrunk it will be sub-200 mm squared.

Is that dual execution mode about AArch32/AArch64 or is it something else?

http://s22.q4cdn.com/364334381/files/doc_presentations/2018/01/JHH_CES2018_FINAL_PRESENTED.PDF


Carmel dual-core cluster zoomed.

Also, Zen is a 10-wide architecture based on Nvidia's definition.
Zen: 4 ALUs, 2 AGUs, 4 FPUs = 10-wide
Denver: 1 JSR, 2 IEUs, 2 FPUs, 2 LSUs => 7-wide
Carmel: 1 JSR, 4 IEUs, 3 FPUs, 2 LSUs => 10-wide (this is a Seronx estimate; edit: 3 IEUs, 3 FPUs, 3 LSUs, 1 JSR is more likely the longer I look at it)
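For anyone who wants to check the tallies above, here is a minimal sketch of the counting rule, assuming "N-wide" simply means the sum of all execution units per core. The unit counts are the ones quoted in this post (the Carmel figures are estimates, not confirmed numbers), and the names `UNITS` and `width` are made up for illustration.

```python
# Execution-unit counts as quoted in the post above.
# The two Carmel entries are die-shot guesses, not official figures.
UNITS = {
    "Zen": {"ALU": 4, "AGU": 2, "FPU": 4},
    "Denver": {"JSR": 1, "IEU": 2, "FPU": 2, "LSU": 2},
    "Carmel (first guess)": {"JSR": 1, "IEU": 4, "FPU": 3, "LSU": 2},
    "Carmel (revised guess)": {"JSR": 1, "IEU": 3, "FPU": 3, "LSU": 3},
}


def width(units):
    """Return the 'width' of a core as the plain sum of its execution units."""
    return sum(units.values())


for name, units in UNITS.items():
    print(f"{name}: {width(units)}-wide")
```

Both Carmel guesses land on the same total, so the "10-wide" label alone cannot tell the two layouts apart.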

Guessing: 512 KB L2 cache? Two 256 KB L1Ds? Two 64 KB L1Is? (L2 could be 1 MB, but that is super dense.)

L2 is the bottom unit, or the top if you flip the image; then there are the L1Ds, L1Is, and the initial branch prediction/fetch logic. The integer registers are right next to the L1Is, the FPU registers are at the very top, with the FPU units around them. The FPU front end is in between the FPU registers, not next to the integer schedulers, etc.
 

NTMBK

Diamond Member
Nov 14, 2011
8,300
280
126
#7
So why doesn't NVIDIA dare to enter the laptop market?
They got burned once before with Windows RT, remember?

Anyway, NVidia don't currently make any SoCs suitable for laptops or tablets. They used to make tablet SoCs, but all their latest ones are more focused on automotive.
 

Nothingness

Golden Member
Jul 3, 2013
1,895
32
106
#8
Anyway, NVidia don't currently make any SoCs suitable for laptops or tablets. They used to make tablet SoCs, but all their latest ones are more focused on automotive.
You mean except for the Tegra X1 used in the Nintendo Switch?
 

NTMBK

Diamond Member
Nov 14, 2011
8,300
280
126
#9
You mean except for the Tegra X1 used in the Nintendo Switch?
The Tegra X1 is the last one that was tablet suitable, and is pretty old now- it's a 20nm SoC in a 10nm world. They announced it three years ago!
 

FIVR

Diamond Member
Jun 1, 2016
3,753
13
106
#10
I wonder if it's susceptible to Meltdown and Spectre.
 

dark zero

Platinum Member
Jun 2, 2015
2,515
8
91
#11
They got burned once before with Windows RT, remember?

Anyway, NVidia don't currently make any SoCs suitable for laptops or tablets. They used to make tablet SoCs, but all their latest ones are more focused on automotive.
But now they have real promise with Parker and Xavier!
And those SoCs now fit perfectly in laptops.
Finally Microsoft would be pleased to add more ARM manufacturers to maintain their domain.
 
Jun 20, 2014
31
2
71
#12
Do we know if this is another Transmeta design like Denver?
 

Jan Olšan

Senior member
Jan 12, 2017
261
6
86
#13
Do we know if this is another Transmeta design like Denver?
They used the same terminology "x-issue superscalar" for Denver, so I would say the architecture principle is the same, still Transmeta.

Is it wider than Skylake and Zen?
Zen has a 6-wide front end, and 8-wide retire.
I don't think you can compare them if Nvidia still uses the VLIW architecture with runtime JIT in software.
The original Denver was also called "7-issue superscalar", and it wasn't a tremendously strong core.




Speaking of Denver: https://twitter.com/FioraAeterna/status/855445075341398017 :)
 

NTMBK

Diamond Member
Nov 14, 2011
8,300
280
126
#14
They used the same terminology "x-issue superscalar" for Denver, so I would say the architecture principle is the same, still Transmeta.



I don't think you can compare them if Nvidia still uses the VLIW architecture with runtime JIT in software.
The original Denver was also called "7-issue superscalar", and it wasn't a tremendously strong core.




Speaking of Denver: https://twitter.com/FioraAeterna/status/855445075341398017 :)
Oh wow, that bug is fun!
 

Qwertilot

Golden Member
Nov 28, 2013
1,413
37
106
#15
It really is :) Just so long as it's someone else's job to figure it out!
 

dark zero

Platinum Member
Jun 2, 2015
2,515
8
91
#16
And here we go again... Another bug?
 

Abwx

Diamond Member
Apr 2, 2011
8,870
213
126
#19
10 wide, even wider than Denver...
They count everything as an exe path. At this rate, how wide is Zen if 4 ALUs, 2 AGUs, the LSU and 4 FP pipes are all counted as exe ports?
 

dark zero

Platinum Member
Jun 2, 2015
2,515
8
91
#20
Some Geekbench results would be useful here to see how it fares against x86 and other ARM processors.
 

Hitman928

Golden Member
Apr 15, 2012
1,774
213
136
#21
Oh wow, that bug is fun!
I'm not really in this space, but I've heard rumblings that the Denver chip was a bug ridden mess. This was just one example that went viral (viral as far as CPU bugs go).
 

Nothingness

Golden Member
Jul 3, 2013
1,895
32
106
#22
Phoronix have got their hands on one of these, and ran a few quick benchmarks:



https://www.phoronix.com/scan.php?page=article&item=nvidia-carmel-quick&num=1

I don't think any of their tests are single threaded, sadly, so tricky to tell how ST performance has scaled.
Yeah, everything is multi-threaded.

Someone posted the same benchmarks with a properly configured Jetson TX2 board: https://openbenchmarking.org/result/1809258-RA-1809248RA57. Read the comments: https://www.phoronix.com/forums/for...quick-test-of-nvidia-s-carmel-cpu-performance

It looks like Phoronix runs the TX2 in its default mode, where only 4 cores (the Cortex-A57s) are used. When the SoC is set to run all 6 of its cores, the TX2 is faster than Xavier.
 

dark zero

Platinum Member
Jun 2, 2015
2,515
8
91
#23
Yeah, everything is multi-threaded.

Someone posted the same benchmarks with a properly configured Jetson TX2 board: https://openbenchmarking.org/result/1809258-RA-1809248RA57. Read the comments: https://www.phoronix.com/forums/for...quick-test-of-nvidia-s-carmel-cpu-performance

It looks like Phoronix runs the TX2 in default mode where only 4 cores are being used (the Cortex-A57). When the SoC is set to run its 6 cores, then TX2 is faster than Xavier.
Something tells me that the Carmel board is not well configured...
 
Apr 27, 2000
11,827
1,042
126
#25
The Tegra X1 is the last one that was tablet suitable, and is pretty old now- it's a 20nm SoC in a 10nm world. They announced it three years ago!
There's a TX2 now. Nintendo was too cheap to use it in the first run of the Switch. Hopefully they'll come to their senses and use the TX2 in a hardware refresh somewhere down the road. Should extend battery life and/or enable Nintendo to use higher clocks when the console isn't docked.
 

