Updated Knights Landing (KNL) Info.

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
http://vr-zone.com/articles/intel-unveils-knights-landing/79686.html

211.jpg

212.jpg


It's still on track to be the first stacked DRAM product we're going to see. dGPUs may follow sometime in 2016 or later.

The Silvermont cores got extended with 512-bit AVX3.2 support and 4 threads per core. There is no TSX support in the cores; otherwise they should be completely compatible with any instructions on the CPUs at the time.
 

NTMBK

Lifer
Nov 14, 2011
10,455
5,842
136
Interesting that they have partnered with Micron. I expected Intel to fab the on-package memory themselves.

Looks like a very nice product: full binary compatibility means that the x86 ISA will actually have a point. The ball's in Nvidia's court now; if they don't want to lose Tesla customers, they need to get stacked RAM out fast.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
ShintaiDK said:
It's still on track to be the first stacked DRAM product we're going to see. dGPUs may follow sometime in 2016 or later.

The Silvermont cores got extended with 512-bit AVX3.2 support and 4 threads per core. There is no TSX support in the cores; otherwise they should be completely compatible with any instructions on the CPUs at the time.

It clearly says on-package; this is the same thing as Iris Pro. Stacked RAM is on-die.
 

NTMBK

Lifer
Nov 14, 2011
10,455
5,842
136
AtenRa said:
It clearly says on-package; this is the same thing as Iris Pro. Stacked RAM is on-die.

Not necessarily. You can have a stack which is on-package, next to the main CPU/GPU die. This is exactly the same as what NVidia are doing in Pascal (and what AMD will probably do too).
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
AtenRa said:
It clearly says on-package; this is the same thing as Iris Pro. Stacked RAM is on-die.

Hynix HBM is portrayed the exact same way. Visual example:
HBM-SoC.jpg


Green=Package.

Not sure where you got the assumption that stacked DRAM had to be on-die.

As NTMBK said, the Nvidia Pascal prototype has on-package stacked DRAM as well:

PascalBoard.jpg
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
The Intel one and the two above (HBM and NVIDIA) are on-interposer.
GlobalFoundries TSV (through-silicon vias) is 3D stacked on-die.

Edit: You can also see that the HBM memory chips are also 3D stacked (one layer upon the other).
 

jdubs03

Golden Member
Oct 1, 2013
1,291
904
136
It's interesting that the uArch is Silvermont/Airmont(?) at 14nm rather than Goldmont, which I expect to be significantly higher performing. Maybe that uArch is for the 2017 product at 10nm.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
AtenRa said:
The Intel one and the two above (HBM and NVIDIA) are on-interposer.
GlobalFoundries TSV (through-silicon vias) is 3D stacked on-die.

Edit: You can also see that the HBM memory chips are also 3D stacked (one layer upon the other).

You seem to be confusing interposer (2.5D) and vertical stacking (3D) with 3D memory stacking. And 3D memory stacking isn't new.

HBM-Memory.JPG


For high-performance devices they will use 2.5D due to thermal issues.

And the reason why it's not coming anytime soon (size):
TSV-Roadmap.png
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
jdubs03 said:
It's interesting that the uArch is Silvermont/Airmont(?) at 14nm rather than Goldmont, which I expect to be significantly higher performing. Maybe that uArch is for the 2017 product at 10nm.

Remember, it's a modified Silvermont: AVX3.2, 4 threads per core, etc.
 

Nothingness

Diamond Member
Jul 3, 2013
3,309
2,382
136
jdubs03 said:
It's interesting that the uArch is Silvermont/Airmont(?) at 14nm rather than Goldmont, which I expect to be significantly higher performing. Maybe that uArch is for the 2017 product at 10nm.

It doesn't matter: what matters are the wide vector units and the multiple threads. It also means they probably didn't keep much from Silvermont; adding SMT to a processor requires many changes all over the place. The interconnect is also surely vastly different. And the memory controllers have nothing to do with Silvermont's.

I guess calling it a "Silvermont Arch" core means little except that it's a two-way superscalar core with OoOE and that it's low power.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
Yes, my bad, I forgot to mention I was talking about 3D stacking.
What I was trying to point out is that Haswell Crystal Well was the first commercial x86 CPU with on-package memory. KNL will also use the same technology.
 

mavere

Member
Mar 2, 2005
196
14
81
They didn't mention latency in their otherwise extremely enthusiastic promotional deck, so I'm going to assume that things are as "bad" as GDDR5 or worse.

I'm guessing that's why HT is now at 4 threads/core even though one wouldn't think there'd be enough resources to support all 4 threads well. The extra threads would hide how each ns of downtime now means more lost potential FLOPS for a task.
 

Nothingness

Diamond Member
Jul 3, 2013
3,309
2,382
136
mavere said:
They didn't mention latency in their otherwise extremely enthusiastic promotional deck, so I'm going to assume that things are as "bad" as GDDR5 or worse.

I'm guessing that's why HT is now at 4 threads/core even though one wouldn't think there'd be enough resources to support all 4 threads well. The extra threads would hide how each ns of downtime now means more lost potential FLOPS for a task.

The 4 threads already exist on the current Xeon Phi, and their use is more to hide instruction latency than external (or on-package) memory latency. See this article for instance.

Also note that memory latency depends a lot on where your core is located due to the memory controllers and the cores being on a ring bus. Getting the most out of such an architecture surely is difficult :)
 

Homeles

Platinum Member
Dec 9, 2011
2,580
0
0
NTMBK said:
Interesting that they have partnered with Micron. I expected Intel to fab the on-package memory themselves.

Eh? Intel announced that they partnered with Micron for HMC years ago.

mavere said:
They didn't mention latency in their otherwise extremely enthusiastic promotional deck, so I'm going to assume that things are as "bad" as GDDR5 or worse.

I'm guessing that's why HT is now at 4 threads/core even though one wouldn't think there'd be enough resources to support all 4 threads well. The extra threads would hide how each ns of downtime now means more lost potential FLOPS for a task.

That's an odd conclusion to come to, given that the memory moves closer to memory controllers.

AMD has an exclusivity agreement for the 4 Gb/8 Gb 2y-nm 4-Hi stack sizes. TSMC will have HBM and HMCC support in place by Q4 2014.
Do you have a source for that? I'd be interested in seeing it. Fits in pretty well with Nvidia's Pascal release date.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Will be interesting to see in what way this technology branch evolves and filters down into mainstream CPU platforms.
 

BrightCandle

Diamond Member
Mar 15, 2007
4,762
0
76
I have a lot of optimism that this is the right route for parallel computing in the future; it's just more widely applicable to the wider software market than GPU architecture.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
Homeles said:
That's an odd conclusion to come to, given that the memory moves closer to memory controllers.

The distance is just one part of the latency. There's also the latency of addressing lines in the arrays inside the memory chips. GDDR5 is not great at that.

However, with so many threads, plus decent amounts of cache, RAM latency is not likely a high priority.
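
A rough back-of-the-envelope way to see this is Little's law: the data that must be in flight equals bandwidth times latency. The sketch below is only an illustration; the bandwidth, latency, core count and cache-line size are made-up assumptions, not figures from the slides, and the function names are hypothetical.

```python
# Little's law sketch: bytes in flight = bandwidth * latency.
# All numbers here are illustrative assumptions, NOT Knights Landing specs.

def requests_in_flight(bandwidth_gb_s: float, latency_ns: float,
                       line_bytes: int = 64) -> float:
    """Outstanding cache-line requests needed to sustain a given bandwidth."""
    # GB/s * ns = (1e9 bytes/s) * (1e-9 s) = bytes
    bytes_in_flight = bandwidth_gb_s * latency_ns
    return bytes_in_flight / line_bytes


def requests_per_thread(bandwidth_gb_s: float, latency_ns: float,
                        cores: int, threads_per_core: int,
                        line_bytes: int = 64) -> float:
    """Average outstanding requests each hardware thread has to cover."""
    total = requests_in_flight(bandwidth_gb_s, latency_ns, line_bytes)
    return total / (cores * threads_per_core)


if __name__ == "__main__":
    # Hypothetical chip: 400 GB/s, 100 ns latency, 64 cores.
    for threads in (1, 4):
        per_thread = requests_per_thread(400, 100, cores=64,
                                         threads_per_core=threads)
        print(f"{threads} thread(s)/core: {per_thread:.2f} requests per thread")
```

With more hardware threads, each one has to keep fewer requests outstanding to cover the same latency, which is why the absolute RAM latency matters less here.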
 

DrMrLordX

Lifer
Apr 27, 2000
22,945
13,028
136
So, according to the slides, this chip isn't restricted to only PCI-e slots. Any chance we'll see it (or a successor) sharing the QPI ring bus with "normal" Xeons?