Could you spoon feed us/me?
I'll just talk about everything. All of these are possibilities, so don't take them to heart.
Section 1: Internal Interconnect, HBM, and Integrated Voltage Regulation
Interconnect; Network on Chip
- Delta Compression
- HyperTransport 4.0 and the High Node Count specification
- Per compute unit or per compute unit array
Tahiti;
Aggregate L1 Bandwidth; 2 TB/s ÷ 32 = ~62.5 GB/s per CU.
Aggregate L2 Bandwidth; 710 GB/s ÷ 8 = ~88.75 GB/s per CU array.
Tahiti(per-CU/NoC);
Aggregate NoC Bandwidth; 4 TB/s
Tahiti(per-CU Array/NoC);
Aggregate NoC Bandwidth; 1 TB/s
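The per-CU and per-array figures above are just the aggregate bandwidth divided by the unit count. A quick sketch of the arithmetic (Tahiti's 32 CUs, 8 CU arrays, and the bandwidth figures come from the numbers above; the helper function is only illustration):

```python
# Hypothetical helper: split an aggregate bandwidth evenly across units.
def per_unit_bw(aggregate_gb_s, units):
    return aggregate_gb_s / units

# Tahiti: 32 compute units, grouped into 8 CU arrays.
l1_per_cu = per_unit_bw(2000, 32)   # 2 TB/s aggregate L1 -> 62.5 GB/s per CU
l2_per_array = per_unit_bw(710, 8)  # 710 GB/s aggregate L2 -> 88.75 GB/s per array

print(l1_per_cu)      # 62.5
print(l2_per_array)   # 88.75
```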
NoC is fully coherent and is more energy efficient than the crossbar used today. This method should also be used with Freedom Fabric Gen 2.
High Node Count specification includes both external and internal fabrics.
Memory System; Heterogeneous
- Both HBM and DDR3/DDR4/GDDR5/GDDR5M
4xHBM + 16x DDR3/DDR4/GDDR5/GDDR5M
2xHBM + 8x DDR3/DDR4/GDDR5/GDDR5M
Voltage Regulation; Push-Pull Shunt Regulator
- Adaptive Voltage
- Per CU and Per Part
This gives the best voltage and frequency for a given workload. It is more complex, but it is also largely variation-adaptive, which means more parts can pass selection.
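The variation-adaptive point can be made concrete with a toy model (all numbers below are invented): each CU has a slightly different minimum stable voltage at a target clock due to process variation, so a fixed rail must supply the worst CU's voltage to the whole part, while per-CU regulation supplies each CU only what it needs.

```python
# Toy model of variation-adaptive per-CU voltage selection (numbers invented).
cu_vmin = [0.90, 0.95, 1.02, 0.88]  # hypothetical per-CU Vmin at target clock (V)
GUARDBAND = 0.03                     # hypothetical safety margin (V)

# One worst-case rail for the whole part vs. one rail per CU.
fixed_rail = max(cu_vmin) + GUARDBAND
adaptive_rails = [v + GUARDBAND for v in cu_vmin]

print(round(fixed_rail, 2))                  # 1.05
print([round(v, 2) for v in adaptive_rails]) # only the slow CU needs 1.05
```

Since dynamic power scales roughly with V², the three CUs running below the worst-case voltage save power, and a die with one slow CU can still pass selection instead of being binned out.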
Section 2: Architecture, External Network
- Quality of Service replaces pre-emption. Temporal or Spatial are the only options that I know of.
AMD is leaning towards spatial from what I can find. Spatial is simultaneous multitasking: multiple application workloads running in the same clock cycle. Temporal is multiple application workloads running in different clock cycles.
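To make the distinction concrete, here is a toy sketch (the workload names, 8-CU pool, and scheduling functions are all made up for illustration): spatial QoS partitions the compute units between workloads so both run in the same cycle, while temporal QoS time-slices all units between workloads.

```python
# Toy illustration of spatial vs. temporal multitasking over a pool of CUs.
CUS = list(range(8))  # hypothetical pool of 8 compute units

def spatial_schedule(workloads, cus):
    """Partition the CUs between workloads: all run in the same clock cycle."""
    n = len(cus) // len(workloads)
    return {w: cus[i * n:(i + 1) * n] for i, w in enumerate(workloads)}

def temporal_schedule(workloads, cus, cycles):
    """Time-slice: each cycle, every CU runs a single workload."""
    return {c: {cu: workloads[c % len(workloads)] for cu in cus}
            for c in range(cycles)}

# Spatial: each workload gets its own subset of CUs, active simultaneously.
print(spatial_schedule(["game", "compute"], CUS))
# Temporal: cycle 0 runs "game" on all CUs, cycle 1 runs "compute".
print(temporal_schedule(["game", "compute"], CUS, 2))
```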
- Future CPU/GPU architectures will be semi-homogeneous architectures that run different ISAs.
This starts with Zen/K12/Faraway. The GPU core will be short-pipeline with many MSIMD/MIMD vALUs and few SISD/SIMD sALUs. The CPU cores will be long-pipeline with few MSIMD/MIMD vALUs and many SISD/SIMD sALUs. The building blocks will be similar for the best exploitation of HSA.
- AMD has a solution ready to compete with NVLink.
Both solutions are compatible with PCI Express boards. NVLink is copying an HNC-spec connector. For all we know, it could be the same thing as HNC, just with a fancy title like NVLink, much like how AMD called their version of HyperTransport "Direct Connect".
http://i.imgur.com/b4T07D1.png
AMD's solution is modified: it uses the PCI Express slot for more than just power and mechanical support. The PCI Express section provides backwards compatibility and the physical data interconnect. The second piece is what enables the GPU to use HyperTransport; its main priority is to process the header info.
http://img840.imageshack.us/img840/6254/htxpcie.png
Nvidia can make slides 100x better than that, but the facts rest on action, not on presentation.
Between the three (AMD, Nvidia, Intel), AMD has the highest aggression rate from research to products. Just look at Andy Glew's MCMT*. Even if it can have a bad outcome in certain areas, AMD has a higher chance than anyone of launching what they have researched.
*It was a research project from a single man that became a product within four to six years with Montreal.