The Intel Atom Thread


Bouowmx

Golden Member
Nov 13, 2016
1,138
550
146

On average, Tremont has 1.3x single-thread performance on components of SPEC CPU at iso-frequency relative to Goldmont Plus.

Appears to be about/above Sandy Bridge level, though not sure the increase is strong enough for ARM Cortex-A76.
 

mikeymikec

Lifer
May 19, 2011
17,705
9,566
136
IIRC a while back someone posted a link showing how Atoms perform against mainstream processors, I'd be curious to know what the more recent generations are like against processors I know a bit better (e.g. Core i3/i5), please.
 

Dayman1225

Golden Member
Aug 14, 2017
1,152
974
146

On average, Tremont has 1.3x single-thread performance on components of SPEC CPU at iso-frequency relative to Goldmont Plus.

Appears to be about/above Sandy Bridge level, though not sure the increase is strong enough for ARM Cortex-A76.
1.3x~ IPC, 2x3 Clustered Decoder, one of which can be disabled depending on the product. Configurable L2 1.5 - 4.5 MB, depending on product, the LLC can support inclusive and non-inclusive modes and it supports RDT (typically a Xeon feature). Lots of cool stuff going on here. Looks like a nice core.
 

Dayman1225

Golden Member
Aug 14, 2017
1,152
974
146

Intel's base station chip, codenamed Snow Ridge, appears to use Tremont Atom cores with the mesh architecture seen in Skylake-X/SP, Cascade Lake-X/SP and Xeon Phi.

Looks like Snow Ridge could have up to 24 Tremont cores.
 

Roland00Address

Platinum Member
Dec 17, 2008
2,196
260
126
On average, Tremont has 1.3x single-thread performance on components of SPEC CPU at iso-frequency relative to Goldmont Plus.

Appears to be about/above Sandy Bridge level, though not sure the increase is strong enough for ARM Cortex-A76.

Good, I am still using a Sandy Bridge i7. That runs at 3.5 GHz and also has turbo. Goldmont Plus turbos up to 2.5 to 2.8 GHz. Surely Tremont is going to match that 2.5 to 2.8 GHz level or surpass it. The limitation of the 8-year-old Sandy Bridge is not the CPU but the GPU, along with other hardware components like the RAM and the SSD.

Now of course we have much faster processors than my 8-year-old Sandy Bridge i7, but this is enough for casual to even serious use. People will not be complaining about the CPU; they will instead be complaining that the tablet or other device either uses slow eMMC or does not have enough RAM.
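
For a rough feel of the clock-speed question above, a common first-order model is performance ≈ IPC × clock. The sketch below is only that model. The function name is made up for illustration, and the IPC ratio of 1.0 is an assumption taken from the "about Sandy Bridge level" estimate quoted above, not a measurement. Turbo behavior, memory, and storage are all ignored.

```python
def relative_performance(ipc_ratio: float, clock_ghz: float, ref_clock_ghz: float) -> float:
    """First-order throughput relative to a reference core (perf ~ IPC x clock).

    ipc_ratio      -- IPC of the core under test divided by the reference core's IPC
    clock_ghz      -- clock of the core under test
    ref_clock_ghz  -- clock of the reference core
    """
    if ref_clock_ghz <= 0:
        raise ValueError("reference clock must be positive")
    return ipc_ratio * clock_ghz / ref_clock_ghz


# Assumption: Tremont IPC == Sandy Bridge IPC (ratio 1.0), per the estimate in the thread.
# Clocks are the figures mentioned in the post: 2.5-2.8 GHz vs. a 3.5 GHz Sandy Bridge i7.
ASSUMED_IPC_RATIO = 1.0
SANDY_BRIDGE_GHZ = 3.5

for tremont_ghz in (2.5, 2.8):
    rel = relative_performance(ASSUMED_IPC_RATIO, tremont_ghz, SANDY_BRIDGE_GHZ)
    print(f"Tremont @ {tremont_ghz} GHz: {rel:.0%} of the reference core")
```

Under those assumptions, a 2.8 GHz Tremont would land at about 80% of the 3.5 GHz reference, so matching it outright would need either higher clocks or an IPC ratio above 1.0.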
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,686
1,221
136
Tremont => Clustered Pick/Decode/Dispatch
Sunnycove => Clustered AGUs

All we need now is Willowcove to adopt Clustered Execution plus the above.

I also wonder if TremontX will go SMT2. It has similar enough execution width to Nehalem and an OoO window that competes with Skylake.
 

Bouowmx

Golden Member
Nov 13, 2016
1,138
550
146
Sunny Cove's L1 cache, 5-cycle 32 KB I + 48 KB D vs. Tremont's 3-cycle 32 + 32 KB. Sounds like an obvious thing to improve for Willow Cove.

Decoder clusters that can be fused off sound like a difficult topic to market to normal consumers.
 

Bouowmx

Golden Member
Nov 13, 2016
1,138
550
146
Good, I am still using a Sandy Bridge i7. That runs at 3.5 GHz and also has turbo. Goldmont Plus turbos up to 2.5 to 2.8 GHz. Surely Tremont is going to match that 2.5 to 2.8 GHz level or surpass it. The limitation of the 8-year-old Sandy Bridge is not the CPU but the GPU, along with other hardware components like the RAM and the SSD.

Now of course we have much faster processors than my 8-year-old Sandy Bridge i7, but this is enough for casual to even serious use. People will not be complaining about the CPU; they will instead be complaining that the tablet or other device either uses slow eMMC or does not have enough RAM.

A Core i5-2400 in 10 W would make for a very capable entry-level product, but whether the entry level (e.g. Celeron N4100/J4105) gets the Tremont treatment is uncertain. Likely, Intel would find selling 10 nm in networking equipment (an area in which Intel has high aspirations) more worthwhile than in entry-level PCs.
 

jpiniero

Lifer
Oct 1, 2010
14,591
5,214
136
A Core i5-2400 in 10 W would make for a very capable entry-level product, but whether the entry level (e.g. Celeron N4100/J4105) gets the Tremont treatment is uncertain. Likely, Intel would find selling 10 nm in networking equipment (an area in which Intel has high aspirations) more worthwhile than in entry-level PCs.

Might still happen, since Elkhart Lake looks like it is coming, and Jasper Lake might be the version without any IoT-specific stuff on the die. Either way I assume the specs are the same for both: up to four Tremont cores and 8-32 Gen11 EUs. That might be small enough to viably yield.
 

Roland00Address

Platinum Member
Dec 17, 2008
2,196
260
126
A Core i5-2400 in 10 W would make for a very capable entry-level product, but whether the entry level (e.g. Celeron N4100/J4105) gets the Tremont treatment is uncertain. Likely, Intel would find selling 10 nm in networking equipment (an area in which Intel has high aspirations) more worthwhile than in entry-level PCs.

Whatever is going to happen is going to look self-evident.

Intel has limited 10nm supply (at the moment), so SKUs with higher average selling prices, of all forms and in all markets, are going to get priority over SKUs with lower average selling prices.
In the past, and this goes all the way back to Core 2 Duo and even before that, lower-end SKUs were often a process behind for desktop and laptop. Only once Intel could meet total demand and had excess supply did we finally see Intel use its latest process on things called Celeron, Pentium, and so on.

The exception was when Intel was trying to get Atom into phones and was therefore focusing heavily on Atom in tablets. But even then, in actual reality rather than in PowerPoint slides at conferences, Atom rarely got the latest foundry process until there was some form of excess capacity and Intel was not sacrificing higher ASPs.
 

Brunnis

Senior member
Nov 15, 2004
506
71
91

On average, Tremont has 1.3x single-thread performance on components of SPEC CPU at iso-frequency relative to Goldmont Plus.

Appears to be about/above Sandy Bridge level, though not sure the increase is strong enough for ARM Cortex-A76.
I've run quite a lot of tests on both Goldmont and Goldmont Plus. From my own testing, it appears Goldmont Plus is between Penryn and Nehalem in IPC. A 30% IPC uplift on top of that will result in IPC closer to Haswell.
 

moinmoin

Diamond Member
Jun 1, 2017
4,950
7,659
136

On average, Tremont has 1.3x single-thread performance on components of SPEC CPU at iso-frequency relative to Goldmont Plus.

Appears to be about/above Sandy Bridge level, though not sure the increase is strong enough for ARM Cortex-A76.
Looks like it will be the first Intel chip with Total Memory Encryption, an x86 extension Intel proposed back in 2017 after AMD launched Epyc with SME and SEV.
 

DrMrLordX

Lifer
Apr 27, 2000
21,629
10,841
136
1.3x~ IPC, 2x3 Clustered Decoder, one of which can be disabled depending on the product. Configurable L2 1.5 - 4.5 MB, depending on product, the LLC can support inclusive and non-inclusive modes and it supports RDT (typically a Xeon feature). Lots of cool stuff going on here. Looks like a nice core.

I'll say. I was wondering about the decode clusters but since, as you say, they can disable one on a per-product basis, that makes some sense. 6-way decode is up from Skylake client though. Which is fascinating.
 
Mar 11, 2004
23,074
5,557
146
Wouldn't that just be the most Intel thing if Atom ends up setting some big new architecture shift in the future (kinda like Pentium M leading to Core 2). Almost seems like it might, and might even follow a similar setup as the Pentium M/P4, where P4 was languishing with Intel trying to push clockspeeds to keep them relevant (but that was killing their efficiency), with Atom maybe enabling Intel to pace or exceed in core counts.
 

DrMrLordX

Lifer
Apr 27, 2000
21,629
10,841
136
Wouldn't that just be the most Intel thing if Atom ends up setting some big new architecture shift in the future (kinda like Pentium M leading to Core 2). Almost seems like it might, and might even follow a similar setup as the Pentium M/P4, where P4 was languishing with Intel trying to push clockspeeds to keep them relevant (but that was killing their efficiency), with Atom maybe enabling Intel to pace or exceed in core counts.

Was sort of thinking the same thing. Dothan presaged greatness.
 

DrMrLordX

Lifer
Apr 27, 2000
21,629
10,841
136
I think Tremont will play an important part (emphasis on part), but it won't be the 2020's Banias.

Why not? Banias, in-and-of itself, was a far cry from Conroe. It was essentially Tualatin (sort of) -> Banias -> Dothan -> Yonah -> Conroe. Banias was a 2003 product, and Tualatin initially showed up in 2001. That's arguably 3-5 years it took Intel to finally bring a product like Conroe to market, starting from either Tualatin or Banias (depending on how you look at it).

If someone told me that Tremont might eventually go on to become something great in 2023 or so . . . I might believe them.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
No, I believe this will be the future:


The ironic thing is that Intel was the first to publicly announce it, but due to the tremendous delays to their process (compounded by the stupidity of not porting IP) and to anything based on that process, they are incredibly late. If 10nm had not been late, maybe we'd have seen it in 2016/2017, similar to Apple with the A10 chip.

It's amazing how they missed it when they (specifically Intel Labs, led by Justin Rattner) had the foresight to predict what's now being used in billion-plus devices today. A testament to the huge role bad management plays in everything.
 

DrMrLordX

Lifer
Apr 27, 2000
21,629
10,841
136
No, I believe this will be the future:

In all fairness to Intel, they did (sort of) try going that route with Larrabee. It just didn't work out as well as they had hoped: they didn't tie the little Atom cores in Larrabee (and later Phi variants) to larger general-purpose cores, and they targeted the whole thing at HPC instead of . . . anything else. Now they finally have Lakefield and who-knows-what-else coming in the future. Tremont is a big part of that. If the Cove architectures don't deliver as well as necessary to get Intel back into performance leadership, Tremont and its successors may have to take over.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Now they finally have Lakefield and who-knows-what-else coming in the future. Tremont is a big part of that. If the Cove architectures don't deliver as well as necessary to get Intel back into performance leadership, Tremont and its successors may have to take over.

Xeon Phi is a very different thing. As they say, devil is always in the details.

I think you are mistaken believing that Sunny Cove is horribly inefficient and is a repeat of Netburst with some magic "Banias" core coming in to replace it, and that'll be Tremont.

The Netburst approach didn't fail simply because Netburst uarch was bad. It failed because the belief that clock speed scaling due to process shrinks would continue gripped the best minds in the semi industry.

The problem we see today is still due to process. But it has worsened, with real full nodes now stretching to 3+ years (that includes foundries with misleading marketing numbers).

Tremont, even with a 30% improvement, is still very far behind Sunny Cove and isn't a successor to it at all, because it's slower in the all-important gaming, responsiveness, and light workloads. This is why it needs both cores, since it's a tradeoff. The main market in 2003 was desktop PCs. Now it has shifted to thin, light 2-in-1s.

The shift to low power devices is partly why Lakefield is the future. Not just that, the level of integration will allow battery life to improve a level beyond what current Core devices can offer and be a true answer to WoA devices. You also need the big Cove cores alongside Tremont if they want to be ahead of the ARM camp, because Tremont will likely be behind A76, possibly even A75.

This is their return to the Tablet market they abandoned back in 2015. But it'll be a proper one with performance no longer being anemic using the cheapest smallest cores.
 

Thala

Golden Member
Nov 12, 2014
1,355
653
136
The shift to low power devices is partly why Lakefield is the future. Not just that, the level of integration will allow battery life to improve a level beyond what current Core devices can offer and be a true answer to WoA devices. You also need the big Cove cores alongside Tremont if they want to be ahead of the ARM camp, because Tremont will likely be behind A76, possibly even A75.

At least that's what Intel is hoping. The thing is, by the time Tremont/Lakefield arrives in products, we will have Cortex-A77 in phones! So Lakefield is a mid-range answer to WoA and does not play at the high end at all, performance-wise.
Other than that, Lakefield is indeed a step in the right direction for Intel and its mobile efforts.
 

moinmoin

Diamond Member
Jun 1, 2017
4,950
7,659
136
No, I believe this will be the future:


The ironic thing is that Intel was the first to publicly announce it, but due to the tremendous delays to their process (compounded by the stupidity of not porting IP) and to anything based on that process, they are incredibly late. If 10nm had not been late, maybe we'd have seen it in 2016/2017, similar to Apple with the A10 chip.

It's amazing how they missed it when they (specifically Intel Labs, led by Justin Rattner) had the foresight to predict what's now being used in billion-plus devices today. A testament to the huge role bad management plays in everything.
Let's embed that glorious slide:
evolution.jpg


Funny that that Intel slide pops up in an unrelated discussion about consoles.

Ironic that AMD is now (since two years) simply doing what Intel predicted in 2005 they would be doing by 2015.

So the development of the Atom architecture and its scalability is more in line with their plans from a decade and a half ago than what happened with their Core architecture.
 

DrMrLordX

Lifer
Apr 27, 2000
21,629
10,841
136
I think you are mistaken believing that Sunny Cove is horribly inefficient and is a repeat of Netburst with some magic "Banias" core coming in to replace it, and that'll be Tremont.

I actually think that Sunny Cove is pretty good. 18% IPC over Skylake - if true - would make Intel competitive again, even if it were competing head-to-head with Zen3 (though it would have been better for Intel to pit it against Zen2). The real problem is that Intel has not chosen to release any products featuring Sunny Cove to date that have more than four cores. So yields are bad, clocks are sketchy, etc. etc. And, furthermore, Tremont itself may wind up with similar problems. Intel may not be able to fab dice with more than some unfortunately-low number of Tremont cores on it before suffering such low yields using their 10nm process that they'll have to scratch the entire product.

But then look at Lakefield.

Instead of doing something radical (and potentially useful) like stacking multiple 4c IceLake dice via Foveros (put the I/O die + iGPU in the middle I guess), they chose only one Sunny Cove core and four Tremont cores in a custom die, stacked it with some RAM and an I/O die, and created Lakefield as the "pipecleaner" product for Foveros. Intel is up against the wall here. Their "Netburst" moment of today is that they're relying on high-clockspeed -Lake cores (4 GHz+) operating in numbers as high as 8 or higher, all being fabbed on one monolithic die with the memory controller, iGPU, and other SoC functions. For whatever reason, they can't do it on their new 10nm process. What if they can't do it on 7nm either, even if 7nm yields prove to be less problematic? AMD sidestepped this problem by limiting the number of cores per die to 8 and tying them together with IF. If Intel can't keep the monolithic train rolling, they'll have to do the same thing with their packaging technologies - EMIB and Foveros.

But going back to Lakefield . . . Intel isn't even trying to stack Sunny Cove yet, and there does not seem to be any indication that they'll do so in a followup product. We see Rocket Lake - another monolithic design except for probably the iGPU - in 2021. We see Tiger Lake (Willow Cove) in 4c configurations, carrying on as a successor to IceLake. Unless 6c and 8c TigerLake are a thing? I haven't heard about those. We don't see 8c-16c Sunny or Willow Cove products coming up that rely on chip stacking, or anything of the sort, which is what Lakefield implies that Intel could do, if such a configuration could be made to work at all.

What if such a configuration can't work, but something involving Tremont can? Due to power density or some other factor I can't imagine at the moment. Tremont alone can't save Intel, but something 3 or 4 steps down the line with the same basic TDP targets and better performance, stacking via Foveros in 16c configurations or better, might start hunting bear for Intel while they struggle to make the Coves make sense. Or, to put it differently: what if Intel's Atom team can develop successors to Tremont that by 2023 or so, are competitive on a clock-per-clock basis with Apple's A-series, use about as much power per core as Apple's A-series, and can be stacked in ridiculous quantity? With clockspeeds maybe in the 3-3.5 GHz range? Also consider that I'm talking about A14 or A15, or wherever Apple will be by 2023. Not just A13 today.

The Coves may be relegated to duty in massive server-class packages where they can EMIB a bunch of 4c Cove-based dice. Sadly we haven't seen Intel try that yet, either.

Or Intel's future may be that the Coves will be relegated to the position of one or two per die, with clusters of Atoms supporting, either on the same die or on dice stacked via Foveros. That still shifts a great deal of Intel's future towards Atom, since overall system performance will still be limited by the performance of Tremont and/or its successors.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Funny that that Intel slide pops up in an unrelated discussion about consoles.

Ironic that AMD is now (since two years) simply doing what Intel predicted in 2005 they would be doing by 2015.

That slide also pops up in another of their articles, one about IDF; the console article just happened to be the one I chose. Neither AMD nor Intel is doing that. It's the ARM world that embraced it first.

But going back to Lakefield . . . Intel isn't even trying to stack Sunny Cove yet, and there does not seem to be any indication that they'll do so in a followup product. We see Rocket Lake - another monolithic design except for probably the iGPU - in 2021. We see Tiger Lake (Willow Cove) in 4c configurations, carrying on as a successor to IceLake. Unless 6c and 8c TigerLake are a thing? I haven't heard about those. We don't see 8c-16c Sunny or Willow Cove products coming up that rely on chip stacking, or anything of the sort, which is what Lakefield implies that Intel could do, if such a configuration could be made to work at all.

The GPU won't be fabbed on the same process as the I/O die, because GPUs need high density and the benefits the latest processes bring, such as lower active power. I/O doesn't scale well with process shrinks and doesn't need to be as fast, so it uses ultra-low-leakage, cheaper older processes such as 22FFL.

As for saying Lakefield is the future, it's because low-power laptops and tablets are the main market now. 3D stacking has thermal issues, but future laptops are likely going to move to fanless designs with low-TDP chips, making it a manageable issue (it looks like Tigerlake-Y will be the first to make fanless mainstream, just as Haswell made 15W mainstream). With desktops they'll likely go with MCM with organic interposers or sometimes EMIB.

It's simply a compromise. MCM is easier to make variants of but results in higher power consumption and greater package size. But there's no way around monolithic when designing mobile platforms with idle power in the mW range and battery life of 10+ hours.

Foveros is a solution, but an expensive one that's justified in the ultra-low-power market or in scenarios needing super high bandwidth (TB/s range). From cheapest to most expensive it goes like this: MCM --> EMIB --> Foveros, with AMD using the MCM approach for their Ryzen 3000 and EPYC CPUs.
(Technically they are all a form of MCM, but in this case I'm thinking of traditional organic interposer MCM.)

While Apple's chips are faster in single thread, chips from all vendors are basically categorized by the amount of power they use when it comes to multi-threaded performance. Sensational articles like to compare the 9900K to the A13 for single thread, but simply put, even the 8650U gets very close to it, and in future generations you'll likely see chips like Tigerlake-Y in the same ballpark. That's because a single core has much more headroom before it reaches the TDP limit and can clock itself to the maximum. Try it if you have a modern laptop. 7W is a huge amount for 1 core.
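
The headroom argument can be put as simple arithmetic: whatever the package budget leaves after the non-core parts is shared among the active cores. A minimal sketch follows, with a hypothetical helper name. The 7 W figure is from the post; the 2 W for everything that is not a CPU core and the even split across cores are placeholder assumptions, not measured values.

```python
def per_core_budget_w(package_w: float, uncore_w: float, active_cores: int) -> float:
    """Power available to each active core under a fixed package budget.

    package_w     -- total package power limit (TDP-style budget), in watts
    uncore_w      -- assumed power for everything that is not a CPU core, in watts
    active_cores  -- number of cores running the workload
    """
    if active_cores < 1:
        raise ValueError("need at least one active core")
    if uncore_w > package_w:
        raise ValueError("uncore power exceeds the package budget")
    # Assumption: the remaining budget is split evenly across active cores.
    return (package_w - uncore_w) / active_cores


PACKAGE_W = 7.0  # from the post
UNCORE_W = 2.0   # placeholder assumption

for cores in (1, 2, 4):
    budget = per_core_budget_w(PACKAGE_W, UNCORE_W, cores)
    print(f"{cores} active core(s): {budget:.2f} W each")
```

With these placeholder numbers a lone core gets 5 W while each of four cores gets 1.25 W, which is the gap between single-thread and many-thread behavior the post describes.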

Ideally that's the future we want, where our slimmest Tablets are just as responsive as the liquid-cooled, $5K desktops using 200W just for the CPU. Where the latter crushes the former is in demanding, many-thread applications.
 

VirtualLarry

No Lifer
Aug 25, 2001
56,339
10,044
126
This is their return to the Tablet market they abandoned back in 2015. But it'll be a proper one with performance no longer being anemic using the cheapest smallest cores.
But will the BoM cost be acceptable to the industry / tablet mfgs / etc.? Or will only a version of the MS Surface end up using these chips, because of their (traditionally high) cost?

ARM is everywhere, not because it's faster (although it seems like it is today, at the lower end of consumer computing products), but because it was CHEAP, and continues to be so.

Edit: And what about the SOHO / SMB NAS market? Will Intel create or modify designs for their needs? I think that's a growing market. A NAS-specific SoC fabbed based on their modern Atom cores, with a RAID controller / multiple SATA ports capable of RAID 0/1/5/6, and maybe a 10GbE MAC (requiring just a PHY chip), would be nice to have, at a low cost. Certainly, it should be cheaper and lower-power, than just sticking one of their mobile Core CPUs in there, with all of the associated chipset and peripheral / I/O chips too.
 