^ I stand corrected... I completely ignored the Godtaniums.. D:
And Atom too! hehe! Atoms have had hyperthreading since the beginning. It works pretty well too.
P3s were the mobile CPUs; what do you think the first Dothans and Yonahs were?
http://en.wikipedia.org/wiki/NetBurst_%28microarchitecture)
"Despite these enhancements, the NetBurst architecture created obstacles for engineers trying to scale up its performance. With this microarchitecture, Intel looked to attain clock speeds of 10 GHz, but because of rising clock speeds, Intel faced increasing problems with keeping power dissipation within acceptable limits."
That 10 GHz number I heard is where it was theoretically supposed to work.
And your last comment applies because CPUs started to have more cores.
LOL! That doesn't mean that hyperthreading on NetBurst needs to be at 10 GHz to be worth anything. I like how you tried to pull that though..
This is what I heard from a lot of Intel people..
So unless you have some different views, share them.
Because if I was really off I'm sure JHU would have corrected me, or IDC.
Regardless, they skipped an entire generation of HT on all their processors, so the ones which followed were optimized.
Can you provide a one-line summary for what Intel gets from HT along the same lines of what you stated for AMD and CMT?
and AMD gets 33-59% boost from CMT (chip level multithreading).
I ran a few of these benches on my i3-530 with HT on/off
The problem is how much additional die size is traded for what performance gains. The addition of HT only adds about 5% additional resources to the i3 where AMD is adding what ? 60-70% to the die size to get CMT?
AMD said the 2nd INT core present in a module occupies only 5% of area, not a 60-70% increase.
I only have 3 of the benches they ran on hand to run on my i3-530.
I wasn't sure about the additional die size that the extra integer core adds to the FX -- I just threw out a number based on reading about Bulldozer. I checked and according to an article on Bulldozer by AnandTech in 2009 …
AMD has come back to us with a clarification: the 5% figure was incorrect. AMD is now stating that the additional core in Bulldozer requires approximately an additional 50% die area. That's less than a complete doubling of die size for two cores, but still much more than something like Hyper Threading.
I was thinking along the same lines. If we consider Nehalem
Yeah it really seems the way to make these comparisons is 1 BD Module versus 1 Intel "core"...1 vs 2 threads on each...then compare the mm^2 of the cores (including L2$).
Anand: A single Nehalem core isn't made up of a majority of cache. Approximately 1/3 of the core is L1/L2 cache, 1/3 is the out of order execution engine and the remaining 1/3 is decode, the branch prediction logic, memory ordering and paging.
The advantage of hyperthreading is that it will never cost performance. There are no cases in which a Sandy Bridge without HT will outperform a SB with HT. However, depending on workload, a BD with odd numbered cores disabled can outperform a 2 module/4 core BD. That is why it is a terrible idea.
XtremeSystems ran some benches with 4M/4C and 4M/8C configurations. It's essentially like turning HT (hyperthreading) on/off …
Chess 11800/8813=1.3389 ?
Wprime 13.814/9.531=1.4494
Winrar 4467/3027=1.4757
3d06 5803/4134=1.4037
3dvantage 19215/12102=1.5878
3d11 6340/4289=1.4782
CB R10 20552/15033=1.3671
CB R11.5 6/3.8=1.5789
Blender 9.76/7.16=1.3631
X264 37.23/25.18=1.4786
Transcode (222+210)/(185+135)=1.35
… and AMD gets 33-59% boost from CMT (chip level multithreading).
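For anyone who wants to re-check the ratios above, here is a quick Python sketch. It only re-does the divisions exactly as posted (larger number over smaller number); the variable and function names are mine, and it makes no assumption about which benches are scores and which are times.

```python
# Pairs copied from the list above, in the order they were posted
# (numerator, denominator). Labels are kept as posted.
pairs_as_posted = {
    "Chess": (11800, 8813),
    "Wprime": (13.814, 9.531),
    "Winrar": (4467, 3027),
    "3d06": (5803, 4134),
    "3dvantage": (19215, 12102),
    "3d11": (6340, 4289),
    "CB R10": (20552, 15033),
    "CB R11.5": (6.0, 3.8),
    "Blender": (9.76, 7.16),
    "X264": (37.23, 25.18),
    "Transcode": (222 + 210, 185 + 135),
}


def gain_percent(numerator, denominator):
    """Turn a posted ratio into a percentage gain, e.g. 1.35 -> 35.0."""
    return (numerator / denominator - 1.0) * 100.0


gains = {name: gain_percent(a, b) for name, (a, b) in pairs_as_posted.items()}
lowest = min(gains, key=gains.get)
highest = max(gains, key=gains.get)

# Print the benches from smallest to largest gain.
for name, gain in sorted(gains.items(), key=lambda item: item[1]):
    print(f"{name:10s} {gain:5.1f}%")
print(f"range: {gains[lowest]:.1f}% ({lowest}) to {gains[highest]:.1f}% ({highest})")
```

The smallest and largest entries are what the quoted boost range is built from, so this is just a way to see which benches sit at each end.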
I ran a few of these benches on my i3-530 with HT on/off …
Fritz
FX-8150
11,807 … 4M/8C … 34.0% … (faster)
8813 … 4M/4C
i3-530
5418 … HT on … 31.2%
4129 … HT off
Cinebench 11.5
FX-8150
6.0 … 4M/8C … 57.9%
3.8 … 4M/4C
i3-530
2.32 … HT on … 31.1%
1.77 … HT off
wPrime 32M
FX-8150
9.531 … 4M/8C … 44.9%
13.814 … 4M/4C
i3-530
19.281 … HT on … 33.1%
25.671 … HT off
The problem is how much additional die size is traded for what performance gains. The addition of HT only adds about 5% additional resources to the i3 where AMD is adding what … ? … 60-70% to the die size to get CMT? On the low end of the gains like in Fritz, HT gains 31.2% compared to 34.0% for the FX -- so AMD has added a huge die size penalty to gain marginally more than Intel does from HT. While a lot of those benches gain more than 40% and as high as 59% with CMT on the FX, it's not enough to offset the die size hit the chip takes for the performance gain. If the FX could see at least a 60% gain across the board in multithreaded benches like Fritz the FX would probably be looking like a pretty good chip right now.
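To put a rough number on the "gain versus die size" argument, here is a small Python sketch that divides each measured gain by the added area. The gains are the three benches posted above. The area figures are assumptions taken from numbers quoted in this thread, not measurements: about 5% for HT, and the revised figure of about 50% for the second core in a Bulldozer module. Those percentages may not even be relative to the same baseline (core vs. module vs. die), so treat the output as a crude illustration only.

```python
# Percentage gains from the three benches run on both chips (posted above).
measured_gain_percent = {
    "Fritz": {"CMT": 34.0, "HT": 31.2},
    "Cinebench 11.5": {"CMT": 57.9, "HT": 31.1},
    "wPrime 32M": {"CMT": 44.9, "HT": 33.1},
}

# ASSUMPTION: added area in percent, as quoted in the thread, not measured.
# Swapping in the earlier 60-70% guess for CMT would make CMT look worse.
added_area_percent = {"HT": 5.0, "CMT": 50.0}


def gain_per_area(gain_percent, area_percent):
    """Percent of performance gained per percent of area added."""
    return gain_percent / area_percent


efficiency = {
    bench: {
        tech: gain_per_area(gain, added_area_percent[tech])
        for tech, gain in by_tech.items()
    }
    for bench, by_tech in measured_gain_percent.items()
}

for bench, by_tech in efficiency.items():
    print(f"{bench:15s} CMT {by_tech['CMT']:.2f}  HT {by_tech['HT']:.2f}")
```

With these inputs HT comes out several times ahead per unit of area on all three benches, which is the same point as the paragraph above; the size of the gap depends entirely on which area figures you believe.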
One has to hope the bottleneck on the FX in multithreaded benches can be fixed and AMD can get consistently 60-80% from CMT. This could turn the FX into a fast competitive processor.
I'm going to vote no, they are not as effective as HT, because SB manages to beat BD in pretty much all tests while using half the die space and less power.
Perhaps if AMD knew how to design a CPU core that was as efficient as SB, the whole module thing might work out better, I dunno. But as of right now it seems HT is more effective both in die space (and therefore cost to produce/sell) and power use.
I think what we learned from this, as laymen, is that CMT is not an effective method of improving the performance of a core any more than hyperthreading is.
If the core (base) compute microarchitecture is weak (be it netburst/prescott or bulldozer/zambezi) then expanding the architecture in the direction of multithreading by way of CMT/SMT is essentially a fool's errand because you've merely diluted (shared resources) something that was already weak in the first place.
Making a weak core even weaker by forcing it to share resources is not going to result in stronger performance in a consistent, robust manner. There will be niche apps, corner cases, that can take advantage of it, but it hardly makes for a compelling argument that it is a good general-purpose processor.
