@Exist50 I don't know why you think it needs Intel 3 to be competitive with TSMC N4. SemiWiki and WikiChip already say Intel 4 sits between N5 and N3, and is closer to N3. Perhaps you are right on density, but Intel 3 is another 18% gain in performance. That's a gain on par with N7 to N5 and N5 to N3.
Two 600+ mm2 tiles could reach 100 cores with perfect yield. So Foveros is not necessary, but do you really want to commit to making such large tiles on a new process (Intel 3)? I would rather stitch together much smaller (say 100mm2) top tiles using Foveros; EMIB seems to take a lot of space on the SPR die, so I'm a bit wary of that. Going 3D also has advantages. If done properly, it should provide shorter paths to the I/O and memory interfaces on the base die. It also provides an opportunity to place a large cache there.
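To put a rough number on the big-tile worry, here is a minimal sketch using the simple Poisson yield model (yield = exp(-area x defect density)). The defect density `D0` below is a made-up placeholder, not a figure for Intel 3 or any real process, and real fabs use more elaborate models plus core harvesting, so treat this as illustration only.

```python
import math

def poisson_yield(area_mm2, defects_per_cm2):
    """Fraction of dies with zero defects under a simple Poisson model."""
    area_cm2 = area_mm2 / 100.0  # 100 mm2 = 1 cm2
    return math.exp(-area_cm2 * defects_per_cm2)

# ASSUMPTION: hypothetical defect density, chosen only for illustration.
D0 = 0.2  # defects per cm2

for area in (600, 100):
    print(f"{area} mm2 tile: {poisson_yield(area, D0):.1%} defect-free")
```

With that placeholder D0 the model gives roughly 30% defect-free dies for the 600 mm2 tile versus roughly 82% for the 100 mm2 one. The exact numbers mean nothing, but the gap is why small tiles look attractive on a young process.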
Always be conservative with semiconductors, always. The hype goes to the stratosphere every. single. time. I am inclined to believe Falcon Shores is a way different thing from Granite Rapids. Yes, within the Falcon Shores platform you'll get the flexibility, but not something that essentially fits in the same socket. Remember I said Sapphire Rapids HBM is BGA. Optimizing for such a heterogeneous configuration will call for different x86 cores to maximize the configuration. Using exactly GNR would be a waste.
Also, since we don't know how much Raptor Cove is over Golden Cove, never mind how much Redwood Cove is over its predecessors, and never mind again how much Lion Cove (which we speculate is in GNR) is over its predecessors, that estimate of 650mm2 for 100 perfect-yield cores can easily change by plus or minus 30%!
Look at how the FP block shrank by 40% on Intel 4. That's roughly a 67% density gain (1/0.6). We don't know how that applies to the blocks that make server Golden Cove extra large compared with client Golden Cove.
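Quick check on the shrink-to-density arithmetic: if the same block fits in 40% less area, it occupies 0.6x the original area, so density goes up by a factor of 1/0.6. A small sketch (the function name is just for illustration):

```python
def density_gain_from_shrink(area_reduction):
    """Fractional density gain when the same logic shrinks by area_reduction (0 <= x < 1)."""
    if not 0 <= area_reduction < 1:
        raise ValueError("area_reduction must be in [0, 1)")
    # Same logic in (1 - reduction) of the area means density scales by the inverse.
    return 1.0 / (1.0 - area_reduction) - 1.0

print(f"{density_gain_from_shrink(0.40):.0%}")  # prints 67%
```

So a 40% area shrink works out to about a 1.67x density improvement for that block.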
Remember how we talked about Golden Cove being inefficient in die area? I think they'll make this better generation after generation. The older diagram had 60 (sixty) blocks per tile on Intel 4!
Intel also demonstrated a 6T SRAM module that's comparable in power efficiency to the 8T module but far smaller. It is still 20% larger than regular 6T SRAM, but 8T SRAM is a further 40% larger. It's an example of design optimization. They have been using 8T on L1 and L2 caches for over a decade now.
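To see what those percentages imply, here is a rough sketch of the relative cell areas, normalized to regular 6T = 1.0. It ASSUMES "a further 40% larger" means 8T is 40% larger than the new low-power 6T module (so the percentages compound); if the 40% is meant relative to regular 6T, the numbers shift.

```python
# Relative SRAM module areas, normalized to regular 6T = 1.0.
# ASSUMPTION: the 40% for 8T is measured on top of the new 6T module.
regular_6t = 1.0
new_6t = regular_6t * 1.20   # new module: 20% larger than regular 6T
sram_8t = new_6t * 1.40      # 8T: a further 40% larger than the new module

sizes = {"regular 6T": regular_6t, "new 6T": new_6t, "8T": sram_8t}
for name, area in sizes.items():
    print(f"{name}: {area:.2f}x")

# Area saved by swapping 8T for the new 6T module, under the same assumption.
saving_vs_8t = 1.0 - new_6t / sram_8t
print(f"new 6T saves {saving_vs_8t:.1%} of the 8T area")
```

Under that reading, 8T lands at about 1.68x regular 6T, and moving the L1/L2 arrays from 8T to the new module would save a bit under 30% of their area.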
Ian says Granite Rapids is using HD libraries, which don't exist for Intel 4.