BlueBlazer
Senior member
- Nov 25, 2008
- 555
- 0
- 76
It's not 100% certain, but all the rumors are pointing that way. It's inconceivable to me that AMD has an 8-core CPU with Nehalem-like IPC, superior power consumption, and greater overclocking than SB, and all that for $266. Consider me "pessimistic", because I was here during the Phenom I launch and the same hype followed it too.
8C Zambezi has 4 FlexFP units which work in SMT mode (each internally handling 2 threads via SMT), so a speedup of around 4.5-5x is expected. The problem, however, is the performance of one FlexFP. The brand new 8C 8150 can't beat Thuban. This goes against even what AMD stated for HPC workloads where Interlagos is concerned. They stated: 35% more performance versus the top MC we have today.
Top Interlagos will be 2.3GHz at launch. Correct for clock speed and you are left with 1.35 x 2.5/2.3 = 1.46 speedup versus MC at the same clock (Turbo won't kick in on FPU-heavy workloads). This is SPECfp rate. Note that BD has peak FLOPS that are the same for all 3 ISA targets: legacy SSE, AVX-128 and AVX-256. This is because you have a fixed number of SIMD pipelines that handle both AVX-128 and legacy SIMD, while AVX-256 is done via 2 units (so the same peak FLOPS again).
IMHO that could be IPC issues (or perhaps bugs). If you look at the Cinebench scores (both R10 and R11.5), it shows a multi-CPU speedup of somewhere around 4.7 to 5.2 given 4 modules (or 4 real cores). That's more than 4 real cores would give (an additional 18% to 30% performance per core, a la SMT). The IPC picture in Bulldozer may not be clear-cut (due to longer pipelines), just like this example >> Pentium 4 EE 3.46GHz vs Athlon 64 3400+. That is one of the main reasons for the need for higher clock speeds. :hmm:
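For reference, the "18% to 30% per core" figure is simply the multi-CPU speedup divided by the number of modules. A minimal Python sketch, assuming the 4.7 and 5.2 speedups mentioned in the post; the helper name `smt_uplift` is made up for illustration:

```python
def smt_uplift(multi_cpu_speedup, modules=4):
    """Extra throughput per module beyond one 'real' core.

    multi_cpu_speedup: multi-threaded score divided by single-threaded score.
    modules: number of modules treated as real cores (4 on an 8C Zambezi).
    """
    return multi_cpu_speedup / modules - 1.0


# Speedup range taken from the post above (Cinebench R10 and R11.5 leaks).
for speedup in (4.7, 5.2):
    print(f"{speedup}x over {4} modules -> {smt_uplift(speedup):.1%} extra per module")
```

The lower bound works out to 17.5%, which the post rounds to 18%.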
The figure is for servers and HPC. Quote from my post here....
..... and here.....
I think that will depend on the application tested. Remember David Kanter's analysis? It might vary depending on whether it's server or desktop type applications. On servers, specific compiler optimizations can be used for tuning. On the desktop, however, applications are already pre-compiled (fixed). Thus it would be interesting to see how this turns out. :hmm:
Perhaps you can start to see the difference... :hmm:
You may notice he keeps insisting on being a server guy, and on servers you can use compiler tuning to optimize (for the Bulldozer pipeline architecture) and to enable new features in Bulldozer (such as AES, AVX and FMA4) that the older generation Magny-Cours doesn't have, to claim increased performance. However, on the desktop the situation may be different. The programs are pre-compiled (fixed) and run as-is. You can also check JF-AMD's FAQ, especially these lines....
Final OS optimizations
Final drivers
An app compiled with the latest flags
That may not be accurate, because it's possible Bulldozer's performance behaviour changes when the clocks get higher. Quote from my post here...
Top Interlagos will be 2.3GHz at launch. Correct for clock speed and you are left with 1.35 x 2.5/2.3 = 1.46 speedup versus MC at the same clock (Turbo won't kick in on FPU-heavy workloads). This is SPECfp rate. Note that BD has peak FLOPS that are the same for all 3 ISA targets: legacy SSE, AVX-128 and AVX-256. This is because you have a fixed number of SIMD pipelines that handle both AVX-128 and legacy SIMD, while AVX-256 is done via 2 units (so the same peak FLOPS again).
Now we know that the 8150 will be running at 3.6GHz in Cinebench, right? That is 3.6/3.3 = 1.09, or 9% faster than the 1100T. All combined: 1.46 x 1.09 x Thuban's Cinebench R11.5 score = 9.41 pts. This is a huge difference from what the leaks and the guys who have allegedly seen NDA'd docs say: they say around 6 pts for the 8150, which is practically the same as Thuban.
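The back-of-the-envelope estimate above can be reproduced in a few lines of Python. This is only a sketch of the quoted arithmetic, not a benchmark. The function name and the ~5.9 pt Thuban baseline are assumptions: the baseline is back-derived from the 9.41 figure, not a measured score. It also inherits the quoted post's assumption that a SPECfp rate claim against Magny-Cours carries over to Cinebench against Thuban.

```python
def projected_score(
    baseline_score,          # Thuban (1100T) Cinebench R11.5 score, assumed
    claimed_gain=1.35,       # AMD's stated Interlagos gain over top Magny-Cours
    mc_clock_ghz=2.5,        # top Magny-Cours clock used in the post
    il_clock_ghz=2.3,        # top Interlagos clock at launch, per the post
    bd_clock_ghz=3.6,        # 8150 clock under Cinebench, per the post
    thuban_clock_ghz=3.3,    # 1100T clock
):
    """Scale a Thuban score by the claimed same-clock gain and the clock ratio."""
    # Normalise the claimed gain to equal clocks: 1.35 * 2.5 / 2.3
    same_clock_speedup = claimed_gain * mc_clock_ghz / il_clock_ghz
    # Then scale by how much faster the 8150 is clocked than the 1100T
    clock_ratio = bd_clock_ghz / thuban_clock_ghz
    return same_clock_speedup * clock_ratio * baseline_score


# Assumed baseline of 5.9 pts (hypothetical, back-derived from the post).
print(round(projected_score(5.9), 2))
```

With unrounded intermediate factors this lands at about 9.44 rather than 9.41, because the post truncates the factors to 1.46 and 1.09 before multiplying. Either way it is far from the roughly 6 pts in the leaks.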
You can calculate the slowdown yourself, taking into account the actual clock speeds shown in the SiSoftware results (2.0GHz and 2.75GHz). :hmm:
The scores seem to get lower as the clock frequency increases (when compared taking into account the number of cores and clock speed). :\
I do speculate that perhaps some sections of the module, as well as the rest of the CPU, may be running at a limited clock speed to keep the TDP in check. Running each section at a different clock speed and communicating asynchronously does have its ups and downs. It will depend on the engineering to minimize latency and lost/missed clock cycles. Setting the proper frequency, timing and clock phases at which these domains run together improves performance; setting them improperly can decrease performance. :hmm:
@Blueblazer
"possible Bulldozer's performance behaviour changes when the clocks gets higher."
Are you serious here? This won't happen unless the multi clock domains are run asynchronously inside the module and one FlexFP then runs at half the clock or something along those lines. I seriously doubt that.
Redundant post, by the way.
You didn't read it. It specifically says that the 8.3 GHz achieved by the Intel chip was on a cache-less one. Here you have 2 cores with 8 MB of L3 hitting that speed. That's a significant difference. Even if you point to the 7 GHz by the 990X, you can't discount another GHz on top of it, can you now?
Do you realize that 8MB of L3 cache means half of its 16MB L3 cache is disabled? And as mentioned in the other thread, Bulldozer "cores" are not true cores. The fact is that only 1 "module" is enabled. That's why you do not see 1-core, 3-core, 5-core or 7-core SKUs. Most of us have known about that Celeron for a long time already.
That's 7GHz with all 6 cores running, not with cores disabled. You are also starting to derail this thread. :hmm:
Do you realize that 8MB of L3 cache means half of its 16MB L3 cache is disabled?
If only we were getting release-grade benchmarks; the desire to pore over tech site reviews is building up! Hopefully Anandtech or a similarly reputable site compares the 4, 6, 8 starter clocks and 8 max clocks all in one go within a few days of the NDA lift.
My bad (should be "cache" instead of "L3 cache"), and Vesku is correct. Must have had Interlagos on my mind.
I thought this was the specifications of the 8150. Could you link me to this new information that shows they have now doubled the L3 cache?
Lack of previews at the moment. However, it seems systems are coming >> iBUYPOWER Launches World's Fastest AMD FX-8150 "Bulldozer" System
This. If it were $320-350 for the 8-core and lower-than-3.5GHz clock speed, then I'd believe it. For $265, no way.
That's the GPU market, and AMD's profit there is razor thin (the majority of AMD's profit still stems from the CPU market). Just look at NVIDIA's higher profits from charging higher prices. Sometimes too much undercutting has negative impacts. It's not a question of being an underdog, it's a question of business sense. :hmm:
The 5870 cost $200 less than the GTX 480 when it first came out, while delivering 85% of the performance. What was so bad about it?
Now that AMD has shaped up, the 6000 series is priced completely adjacent to Nvidia's prices though. Ergo, when you are the underdog, you produce a very decent product and shove the price down your opponent's throat. What's so improbable about it?
You're only taking performance into account. Nvidia's IQ, extra features and, in my opinion, their drivers are superior. Besides, people should generally know not to get the first Nvidia product of a generation, mainly because of all the revisions. While the 8800GTX was kind of an exception, the GeForce FX 5900 Ultra was better than the 5800 Ultra. I agree that the GTX 480 was a rip-off, but then so was AMD's offering, regardless of how inexpensive it was. Once Nvidia gets their GL frame rates under control (if they haven't already), there will be no reason to buy AMD.