[WCC] Lake Crest Chip Aims At The DNN/AI Sector – 32 GB HBM2, 1 TB/s Bandwidth, 8 Tb/s Access Speeds

Det0x

Golden Member
Sep 11, 2014
1,028
2,953
136
Don't know if this has been posted in this forum yet, but here is a copy from wccftech.

Intel’s Lake Crest Chip Aims At The DNN/AI Sector – 32 GB HBM2, 1 TB/s Bandwidth, 8 Tb/s Access Speeds, More Raw Power Than Modern GPUs
[Image: Intel Nervana]

Intel has further detailed its Lake Crest chip, which is aimed at the deep neural network sector. The new chip will be based around the Nervana platform, which is claimed to deliver an unprecedented amount of compute density in silicon and more raw power than modern GPUs.

Intel’s Lake Crest DNN Silicon Detailed – Will Feature More Raw Power Than GPUs

With the rise of AI learning in the tech industry, GPU makers such as NVIDIA and AMD have made chips that are specifically designed for DNN (Deep Neural Network) workloads. Intel wants to enter this arena with the Lake Crest silicon, which is said to deliver more raw power than the fastest DNN GPUs available today. The chip will feature technology developed by the deep-learning startup Nervana.
[Image: Intel Xeon / Lake Crest deep learning software optimization]

“We have developed the Nervana hardware especially with regard to deep learning workloads,” said Rao (Intel VP, Data Center Group, and General Manager for AI Solutions). “In this area, two operations are often used: matrix multiplication and convolution.” via Silicon.de
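Those two operations are more closely related than they might look: in practice a convolution is usually lowered to a matrix multiplication (the “im2col” trick), so hardware built around large matrix-multiply units covers both. A minimal NumPy sketch of that lowering, purely illustrative and not taken from the article:

```python
import numpy as np

def im2col(x, k):
    """Unfold every k x k patch of a 2D input into one row of a matrix."""
    h, w = x.shape
    out_h, out_w = h - k + 1, w - k + 1
    cols = np.empty((out_h * out_w, k * k))
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = x[i:i + k, j:j + k].ravel()
    return cols

x = np.random.rand(8, 8)        # input feature map
kernel = np.random.rand(3, 3)   # convolution filter (DL-style, i.e. no kernel flip)

# Convolution expressed as a single matrix-vector product.
conv = (im2col(x, 3) @ kernel.ravel()).reshape(6, 6)

# Cross-check against a direct sliding-window computation.
ref = np.array([[(x[i:i + 3, j:j + 3] * kernel).sum() for j in range(6)]
                for i in range(6)])
assert np.allclose(conv, ref)
```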
[Image: Intel Lake Crest and Knights Crest Nervana]

The software/hardware firm was acquired by Intel in August 2016 for more than $350 million US. The first chip and systems to utilize the new Nervana-based technology will be known as Lake Crest, and Intel has also named the follow-up “Knights Crest”. The Nervana platform consists of an entire range of deep-learning (DL) optimized products, including Lake Crest and the recently announced Arria FPGAs, which can also be programmed for special requirements and tasks such as AI learning. Both Lake Crest and the Arria FPGAs will work together with Intel’s Xeon processors.

Intel Lake Crest Chips Will Feature Unprecedented Amount of Compute Density, 32 GB of HBM2 Memory and 8 Terabits per Second Memory Access Speeds

The Lake Crest chip will operate as a Xeon co-processor but is entirely different from the Xeon Phi hardware. It is specifically designed to accelerate AI workloads at an unprecedented pace. Intel is using a new architecture known as “Flexpoint” inside the arithmetic nodes of the Lake Crest chip, which will increase the parallelism of arithmetic operations by a factor of 10. The chip will also use an MCM (Multi-Chip Module) design.
[Image: Intel Xeon / Lake Crest deep learning features]

AI is still in its early days, Krzanich writes, and the underlying hardware that’s used to execute deep learning tasks is bound to change. “Some scientists have used GPGPUs [general purpose graphical processing units] because they happen to have parallel processing units for graphics, which are opportunistically applied to deep learning,” he writes. “However, GPGPU architecture is not uniquely advantageous for AI, and as AI continues to evolve, both deep learning and machine learning will need highly scalable architectures.” via HPC Wire
[Image: Intel Xeon / Lake Crest deep learning block diagram]

The discrete co-processor will feature a total of 32 GB of HBM2 memory. This will come in the form of four 8-Hi stacks, which together deliver 1 TB/s of memory bandwidth at the rated speed of 2 GHz. The Lake Crest chips will be available for testing during the first half of 2017 and will be sampled to limited partners in the second half of 2017. Also worth noting is that the memory access speed is rated at a whopping 8 terabits per second.
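As a sanity check on those figures (assuming the standard HBM2 parameters of a 1024-bit interface per stack and 2 Gb/s per pin, which is presumably what the quoted “2 GHz” refers to), the 1 TB/s and 8 Tb/s numbers are the same bandwidth expressed in bytes and bits respectively:

```python
# Back-of-envelope check of the quoted memory figures. The per-stack interface
# width (1024 bits) and per-pin data rate (2 Gb/s) are standard HBM2 values
# assumed here, not numbers from the article.
stacks = 4
bits_per_stack = 1024            # HBM2 interface width per stack
gbps_per_pin = 2                 # effective data rate per pin in Gb/s

per_stack_GBps = bits_per_stack * gbps_per_pin / 8    # 256 GB/s per stack
total_GBps = stacks * per_stack_GBps                  # 1024 GB/s ~= 1 TB/s
total_Tbps = total_GBps * 8 / 1000                    # ~8.2 Tb/s

print(per_stack_GBps, total_GBps, total_Tbps)         # 256.0 1024.0 8.192
```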

Intel has also revealed that the chip will be highly scalable, which is something its CEO, Brian Krzanich, has already stated to be the path forward for AI learning. The chip will feature 12 bidirectional high-bandwidth links and seamless data transfer via the interconnects. These proprietary inter-chip links will provide bandwidth up to 20 times higher than PCI Express links.
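The article does not say which PCI Express generation or link width the 20x figure is measured against; assuming the usual comparison point of a PCIe 3.0 x16 link (~15.75 GB/s per direction), the claim works out roughly as follows:

```python
# Rough reading of the "20x PCI Express" claim. The PCIe 3.0 x16 baseline is
# an assumption; the article does not state the comparison point, nor whether
# the 20x applies per inter-chip link or to the aggregate of all 12 links.
pcie3_x16_GBps = 15.75
claimed_multiplier = 20

aggregate_GBps = pcie3_x16_GBps * claimed_multiplier   # ~315 GB/s total
per_link_GBps = aggregate_GBps / 12                    # ~26 GB/s if spread over 12 links

print(round(aggregate_GBps), round(per_link_GBps, 1))  # 315 26.2
```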
[Image: Intel Xeon / Lake Crest architecture block diagram]

A detailed Lake Crest block diagram has been posted by Golem.de which shows the chip in more detail. We can see four 8 GB HBM2 memory blocks that are separate from the main die but sit on the same chip interposer. The chip contains 12 processing clusters, each featuring several cores; the exact number has not yet been disclosed. Each HBM2 stack has its own HBM controller, so there are four in total. There are 12 ICL (inter-chip link) blocks, one per processing cluster. There is also a CPU management controller, SPI, I2C, GPIO, a PCIe controller (x16) and DMA.
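Putting that description together, the layout boils down to roughly the following; the field names are just labels for what the article lists, not Intel terminology:

```python
# Rough summary of the Golem.de block diagram as described above; the names
# are descriptive labels only, not Intel terminology.
lake_crest_layout = {
    "processing_clusters": 12,               # cores per cluster not disclosed
    "hbm2_stacks": 4,                        # 8 GB each, one HBM controller per stack
    "hbm2_total_GB": 4 * 8,                  # 32 GB on the same interposer as the die
    "inter_chip_links": 12,                  # bidirectional ICLs, one per cluster
    "other_blocks": ["management CPU", "SPI", "I2C", "GPIO", "PCIe x16", "DMA"],
}
print(lake_crest_layout["hbm2_total_GB"])    # 32
```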

http://wccftech.com/intel-lake-crest-chip-detailed-32-gb-hbm2-1-tb/
 

CHADBOGA

Platinum Member
Mar 31, 2009
2,135
832
136
They tried that and failed miserably. Usually a good idea to stop doing something if you do it for nearly a decade and prove utterly incompetent at it.
I don't get how they can fail at phones and expect to succeed at IoT in the long term.
 

Nothingness

Platinum Member
Jul 3, 2013
2,400
733
136
I don't get how they can fail at phones and expect to succeed at IoT in the long term.
My feeling is that they'll fail miserably again. At least for what most people call IoT, that is, very small chips that can be put anywhere. What Intel currently calls IoT is based on Atom and much bigger; there's a market for such chips, but it's much smaller.

They really should concentrate on what they do best: big chips with lots of FLOPS :D
 

DrMrLordX

Lifer
Apr 27, 2000
21,617
10,826
136
IoT is poison anyway. Who wants to buy something that's such a massive security liability? Win10 is bad enough.

At least Lake Crest is hitting the other side of the spectrum. Looks um . . . beefy! That's a lot of HBM2. Makes me wonder if something like this might kill Snowy Owl before it has a chance to fly.

I mean, look at the monstrous thing. It's got nearly every buzzword associated with alleged next-gen AMD hardware all in one package. It's using an interposer . . . it has HBM2! High-speed proprietary interconnects!

Instead of something like Snowy Owl which is leaning on GCN cores, this thing has more raw computational power than any dGPU on the market. Er, supposedly. They're a bit iffy on that point, but they do claim more raw compute than "today's state-of-the-art" GPUs. Does that mean one Lake Crest is going to be faster than . . . Volta?
 

zinfamous

No Lifer
Jul 12, 2006
110,562
29,171
146
I keep hearing from various members here:

8 GB HBM2 stacks do not exist,
will not exist,
or, at the very best, will not exist until ~late 2018.

So, which wccftech strange rumor is accurate? The earlier posted timeline for this shows 2017 for Nervana? (holy hell, Intel needs to scrap that name, lol--I mean, what wizard came up with "neural = nirvana = great idea!!!"? :D)
 

DrMrLordX

Lifer
Apr 27, 2000
21,617
10,826
136
Yay for dupes.

Interesting you mentioned the HBM2 stacks. It's also on an interposer. How many users have we had here say that can't/shouldn't be done for an APU?

Intel is basically rolling out a high-end APU, replacing the graphics IP with some nebulously-named processing cluster.
 

lolfail9001

Golden Member
Sep 9, 2016
1,056
353
96
Wait a minute... HBM2? Didn't Intel have their own on-package memory tech in use? Smells fishy if you ask me.
 

DrMrLordX

Lifer
Apr 27, 2000
21,617
10,826
136
It could have something to do with Micron. Intel partnered with them to develop the technology.

HBM2 is (or should be) cheaper, so it may be more realistic for them to implement it now than HMC. Or they may have looked at what AMD and Nvidia are doing/are planning to do with HBM2 and decided to use it themselves.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
It could have something to do with Micron. Intel partnered with them to develop the technology.

HBM2 is (or should be) cheaper, so it may be more realistic for them to implement it now than HMC. Or they may have looked at what AMD and Nvidia are doing/are planning to do with HBM2 and decided to use it themselves.

I think the other reason is that it's not Intel technology. Lake Crest is about as much "Intel" as Arria 10 is Intel. At most, it's Nervana's technology using Intel's 14nm process. More likely it's a rename. Post #3 by Burpo is an indication. Arria 10 is using ARM cores because it's a mere port to 14nm. Look for future Arria products to use Atom cores, or ones with performance similar to what they are using in their Edison products.

Knights Crest, I believe, will be real "Intel" technology, the aftermath of combined Intel/Nervana efforts.

Why do I think so? Because Nervana was acquired by Intel in August of last year. Even if we assume some combined effort existed between the two companies before that, I wouldn't expect a total redesign to be part of it, which is why I say a 14nm port is the most they'd have done. Another reason is that they were caught totally off guard by Nvidia's success in the deep learning field and the rise of deep learning in general.

Knights Mill is another reactive measure to Nvidia's success. Most likely KNM is KNL with its performance bugs fixed thanks to a better-yielding 14nm process, plus 8-bit precision added in to achieve their "4x performance" claim. KNL, as some might know, was originally specced better (lower power and higher performance) and slated a year earlier than the actual product.

Now, make no mistake, I am not claiming they are making a mistake. I'm just saying it's not a totally new product.
 

DrMrLordX

Lifer
Apr 27, 2000
21,617
10,826
136
Hmm, interesting take on it. I had figured FPGAs could be in the picture, though . . . well, I dunno, I know so little about what Altera was doing before Intel bought them that I'd rather not resort to baseless speculation when trying to sort out exactly how this thing is going to pwn Nvidia and AMD in raw compute power.

If there are FPGAs involved, then . . . hmm.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Hmm, interesting take on it. I had figured FPGAs could be in the picture, though . . .

If there are FPGAs involved, then . . . hmm.

I am not saying Lake Crest is an FPGA. From what they are saying, it's a very specific DL-optimized design. An FPGA, while it can be used for this, likely has a more general-purpose target.
 

Hans de Vries

Senior member
May 2, 2008
321
1,018
136
www.chip-architect.com
It is an Altera FPGA: The Stratix 10 MX


The Nervana "Flexpoint" format, used instead of 16-bit floating point, is the same as "block floating point": a single exponent is shared by a whole block of numbers. This is typically done on FPGAs to make use of the hardwired integer building blocks. The Stratix 10 MX has 32-bit floating-point blocks but no 16-bit floating-point blocks, so this is the most effective solution possible.
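To make the block-floating-point idea concrete: one exponent is chosen for the whole block and every value is stored as a plain integer mantissa against it, so multiplies reduce to integer multiplies plus a single exponent addition, which is exactly what an FPGA's hardwired DSP blocks provide. A minimal sketch, assuming 16-bit mantissas (the actual Flexpoint widths are not public here):

```python
import numpy as np

def to_block_fp(x, mant_bits=16):
    """Quantize an array to block floating point: one shared power-of-two
    exponent for the whole block, signed integer mantissas for the values."""
    max_mag = float(np.abs(x).max())
    mant_max = 2 ** (mant_bits - 1) - 1               # e.g. 32767 for 16-bit mantissas
    if max_mag == 0.0:
        return np.zeros(x.shape, dtype=np.int32), 0
    exp = int(np.ceil(np.log2(max_mag / mant_max)))   # shared block exponent
    mant = np.round(x / 2.0 ** exp).astype(np.int32)  # integer mantissas
    return mant, exp

def from_block_fp(mant, exp):
    return mant.astype(np.float64) * 2.0 ** exp

x = np.random.randn(1024)
mant, exp = to_block_fp(x)
err = np.abs(from_block_fp(mant, exp) - x).max()
print(exp, err)    # reconstruction error is bounded by half a mantissa step, 2**exp / 2
```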
 

NTMBK

Lifer
Nov 14, 2011
10,232
5,012
136
They aren't saying anything about the software side, which is a bit troubling to me. The fastest hardware isn't much use if you can't quickly write software for it; just ask AMD how their GPGPU efforts to unseat CUDA are going... They don't seem to be making an effort to get this integrated into existing deep learning libraries: https://www.quora.com/Is-Neon-support-planned-in-Keras
 

Glo.

Diamond Member
Apr 25, 2015
5,705
4,549
136
It could have something to do with Micron.
I was reading this with one eye closed, because of some problem with it, and read at first sight: "It could have something to do with Moron."

lol.

End of off-top.
 