Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

Status
Not open for further replies.

zebrax2

Senior member
Nov 18, 2007
976
69
91
Does anyone here know if HT increases power consumption, and if it does, by how much? It just occurred to me that the new turbo could possibly be their answer to HT from a performance/power point of view.
 

Mopetar

Diamond Member
Jan 31, 2011
8,460
7,682
136
Does anyone here know if HT increases power consumption, and if it does, by how much? It just occurred to me that the new turbo could possibly be their answer to HT from a performance/power point of view.

It does, but not by terribly much. AMD's answer to HT is their use of CMT. The higher turbo is most likely an answer to their (assumed) lower IPC.
 

itsmydamnation

Diamond Member
Feb 6, 2011
3,058
3,870
136
It does, but not by terribly much. AMD's answer to HT is their use of CMT. The higher turbo is most likely an answer to their (assumed) lower IPC.

HT can add a ton of heat/power depending on the workload, but that's because you're getting a ton more throughput. During summer I disable HT and turn it back on in winter (average summer temp 30C, avg winter temp 0C).
 

zebrax2

Senior member
Nov 18, 2007
976
69
91
It does, but not by terribly much. AMD's answer to HT is their use of CMT. The higher turbo is most likely an answer to their (assumed) lower IPC.

The way I look at it, HT's purpose was to make a single core do more work, which is what AMD is doing with their new turbo by increasing clocks based on TDP. CMT, for me, is just a way of minimizing the die space consumed by individual cores. I'm basing this on the purpose of the feature rather than on why it was made/used. That's my opinion, anyway.
 

HW2050Plus

Member
Jan 12, 2011
168
0
0
Unfortunately, no one has a job that demands that they use their computer to produce excellent SPEC results so it's not as useful as various video rendering or photoshop benchmarks for most professionals. I don't mean to say that SPEC is utterly useless, but regardless of how well a CPU does under that benchmark, I'm going to choose the one that best suits my needs based on the applications I most commonly use.
Yes, but I do it with the SPEC results. So if I want to do e.g. video rendering, I take the SPEC result for that application (POV-Ray). Or I take the H.264 video encoder results if that's what I want to do, or the zip results if I am interested in compression.

The SPEC set of applications is a bit wider than AnandTech's suite; e.g. it also includes chess engines and in general a wider range of applications, with the exception of games, but those are, as we know, GPU-limited anyway.

The difference between SPEC and e.g. AnandTech is that the set of 30 applications contained in SPEC CPU is unbiased, whereas any other set of applications is accidentally or willingly biased. That is basically why the SPEC organization was created: to get unbiased application results.

I agree with you that if another benchmark set hits exactly your application and its version, then it will be perfectly accurate for that use (but for that application and version only). But for a general comparison between different CPUs, the other sets are more or less inaccurate. And if your application is in SPEC, then SPEC is even better for you.
 

bryanW1995

Lifer
May 22, 2007
11,144
32
91
You sure about that? I remember differently.



What about people who represent that they are from companies, and they get a special member title indicating that they are from a company?
Last I heard, that requires some sort of official company confirmation to get that title.

keysplayr was a very special case; iirc he volunteered to put the "focus group member" info in his sig. I was not aware that this was a focus group requirement, however. Don't take my word for it, though; go ask him.
 

Bearach

Senior member
Dec 11, 2010
312
0
0
http://forums.nvidia.com/index.php?showuser=29408&f=0 According to this person, he must have a Focus Group signature, as that is part of their rules. But I could be wrong.

What is this mysterious NVIDIA Focus Group, you ask, and how do we interact with NVIDIA? Good questions. We're a very small team of forum users that receives information, hardware and software from NVIDIA. We don't hide our identity. As a matter of policy, every NVIDIA Focus Group member (we're often called the forum champs) must display the disclaimer that I currently display in their forum signature.
 
Last edited:

bryanW1995

Lifer
May 22, 2007
11,144
32
91
That change was made a few years ago iirc. Either that or Rollo was a rogue operative who thumbed his nose at all the rules yet still maintained enough clout to get NV to convince anand to reinstate him. I'm going with A.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Actually... you're not; there are 13 signs, ignored by astrologers, as they ignore reality.

O/T: Also, there are 13 months in a real year. A lunar month is 28 days; how many days in a week? 7, and 7 divided into 28 = 4 weeks. The Romans knew exactly what they were doing when they changed that: screwed us out of 1 month's pay, ALL of us. This is old, old news to history buffs, but it seems to be news to the rest of the world.
 

CTho9305

Elite Member
Jul 26, 2000
9,214
1
81
Yes, but I do it with the SPEC results. So if I want to do e.g. video rendering, I take the SPEC result for that application (POV-Ray). Or I take the H.264 video encoder results if that's what I want to do, or the zip results if I am interested in compression.

The SPEC set of applications is a bit wider than AnandTech's suite; e.g. it also includes chess engines and in general a wider range of applications, with the exception of games, but those are, as we know, GPU-limited anyway.

The difference between SPEC and e.g. AnandTech is that the set of 30 applications contained in SPEC CPU is unbiased, whereas any other set of applications is accidentally or willingly biased. That is basically why the SPEC organization was created: to get unbiased application results.

I agree with you that if another benchmark set hits exactly your application and its version, then it will be perfectly accurate for that use (but for that application and version only). But for a general comparison between different CPUs, the other sets are more or less inaccurate. And if your application is in SPEC, then SPEC is even better for you.

Be careful with that. h.264 and zip may not look similar at all to a processor. Similarly, POV-ray and video rendering may look very different to a processor. Just because the tasks appear similar at a 10,000ft view ("compression", "rendering") doesn't mean that they have anything in common at all at lower levels. I would actually be shocked if there's a strong correlation between zip and h.264 performance given how different the algorithms are. Also, POV-ray is a raytracer, which has very different behavior (from the standpoint of a processor) from scanline / triangle-pipeline renderers (i.e. most real-time or near-real-time renderers) and has nothing to do with Windows Movie Maker-type performance.

edit: I see I misread, and you're not trying to use h.264 to indicate zip performance (or vice versa)... but the video rendering stuff still stands.
 
Last edited:

Cogman

Lifer
Sep 19, 2000
10,284
138
106
Be careful with that. h.264 and zip may not look similar at all to a processor. Similarly, POV-ray and video rendering may look very different to a processor. Just because the tasks appear similar at a 10,000ft view ("compression", "rendering") doesn't mean that they have anything in common at all at lower levels. I would actually be shocked if there's a strong correlation between zip and h.264 performance given how different the algorithms are. Also, POV-ray is a raytracer, which has very different behavior (from the standpoint of a processor) from scanline / triangle-pipeline renderers (i.e. most real-time or near-real-time renderers) and has nothing to do with Windows Movie Maker-type performance.

edit: I see I misread, and you're not trying to use h.264 to indicate zip performance (or vice versa)... but the video rendering stuff still stands.
zip performance MIGHT indicate H.264 performance to some extent. The problem is more that zipping a file up is pretty much completely hard-drive bound now (which makes it less than perfect).

Where it is similar (and this is strictly speaking about x264 now) is that zipping, like most lossless compression, focuses heavily on integer instructions. So as a test of speed, you'll get somewhat similar results. The difference would come in the usage of SIMD instructions: while most lossless compression schemes could benefit from them, I doubt they have taken to them as fast as x264 has.

You might be shocked at how well some applications can predict the performance of other applications. For the CPU, there are pretty much 3 types of instructions when it comes to application performance: branch instructions, floating-point instructions, and integer instructions. So long as two applications have a similar mix of those instructions, they are going to see roughly the same performance increases and decreases across two different platforms.
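That instruction-mix idea can be sketched numerically. This is a toy model with made-up mixes and throughput figures (none of these numbers come from real CPUs): if two workloads have similar branch/float/integer mixes, a simple weighted-throughput model predicts similar relative speedups on a faster chip.

```python
# Hypothetical instruction mixes (fractions of branch / float / integer ops)
zip_mix  = {"branch": 0.20, "float": 0.00, "int": 0.80}
x264_mix = {"branch": 0.15, "float": 0.05, "int": 0.80}

# Per-type throughput (instructions/cycle) on two imaginary CPUs
cpu_a = {"branch": 1.0, "float": 2.0, "int": 2.0}
cpu_b = {"branch": 1.5, "float": 2.0, "int": 3.0}

def ipc(mix, cpu):
    # Average time per instruction is sum(fraction / rate); IPC is its inverse
    return 1.0 / sum(mix[k] / cpu[k] for k in mix if mix[k] > 0)

for name, mix in (("zip", zip_mix), ("x264", x264_mix)):
    speedup = ipc(mix, cpu_b) / ipc(mix, cpu_a)
    print(f"{name}: CPU B is {speedup:.2f}x CPU A")
```

Because the two mixes are close, the model predicts nearly the same speedup for both workloads, which is Cogman's point; workloads with very different mixes would diverge.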
 

Mopetar

Diamond Member
Jan 31, 2011
8,460
7,682
136
Seems interesting that they've doubled the L2 cache but decreased the L1 cache sizes. Each core now has a 16 kB L1 data cache, although it is 4-way associative. The instruction cache is also shared between the cores.

Has any information been released about the latencies for the caches? There was some expectation that the L2 would be faster on the new process, but I don't recall any confirmation of that.
 

BladeVenom

Lifer
Jun 2, 2005
13,365
16
0
O/T: Also, there are 13 months in a real year. A lunar month is 28 days; how many days in a week? 7, and 7 divided into 28 = 4 weeks. The Romans knew exactly what they were doing when they changed that: screwed us out of 1 month's pay, ALL of us. This is old, old news to history buffs, but it seems to be news to the rest of the world.

On planet Earth a lunar cycle is 29.53 days, so most cultures came up with a 12 month year.
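The arithmetic is easy to check. Using the commonly quoted 29.53-day synodic month:

```python
# A solar year holds a bit more than 12 lunar (synodic) months,
# which is why 12-month calendars won out over 13-month ones.
solar_year = 365.25      # days, average Julian year
synodic_month = 29.53    # days, new moon to new moon
print(solar_year / synodic_month)  # just over 12 lunations per year
```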
 

Triskain

Member
Sep 7, 2009
63
33
91
Has any information been released about the latencies for the caches? There was some expectation that the L2 would be faster on the new process, but I don't recall any confirmation of that.

The L1D Cache has a latency of 4 cycles, the L2 Cache a latency of 18 cycles.
 

Voo

Golden Member
Feb 27, 2009
1,684
0
76
The problem is more that zipping a file up is pretty much completely hard-drive bound now (which makes it less than perfect).
Depends on what you mean by "zipping". If you include some of the better compression algorithms out there, you become CPU-bound quite easily. So in that case that wouldn't be the problem.
 

Cogman

Lifer
Sep 19, 2000
10,284
138
106
Depends on what you mean by "zipping". If you include some of the better compression algorithms out there, you become CPU-bound quite easily. So in that case that wouldn't be the problem.

By zipping, I mean applying the DEFLATE algorithm as it is currently used in the .zip file standard (in other words, none of this LZMA stuff; just standard, zlib-like compression).

I don't include other lossless compression methods when I talk about zipping a file. If I did mean that, then I would have said "lossless compression algorithms".

You are quite right, there are several that will tax the CPU (and even memory) beyond insanity. My favorite being the PAQ compressor.
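The DEFLATE-vs-heavier-scheme distinction is easy to demonstrate with Python's standard library. This sketch uses synthetic input data, and on most machines it shows LZMA spending considerably more CPU time than zlib on the same bytes, which is why the heavier compressors go CPU-bound long before the disk matters:

```python
import lzma
import os
import time
import zlib

# Synthetic input: 1 MB of incompressible noise plus ~1 MB of repetitive text
data = os.urandom(1 << 20) + b"the quick brown fox " * 50_000

for name, compress in (("zlib (DEFLATE)", lambda d: zlib.compress(d, 9)),
                       ("lzma", lambda d: lzma.compress(d))):
    start = time.perf_counter()
    out = compress(data)
    elapsed = time.perf_counter() - start
    print(f"{name}: {len(data)} -> {len(out)} bytes in {elapsed:.2f} s")
```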
 

SickBeast

Lifer
Jul 21, 2000
14,377
19
81
There's always going to be a use for more CPU power. Right now with my Phenom II I'm finding that I could use more single threaded performance in certain games that can only take advantage of one core.

Once programmers get around to optimizing their software, 8 core CPUs are going to be incredibly fast and powerful. There is already a great deal of software that can take advantage of pretty much as many cores as you can give it.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Has any information been released about the latencies for the caches? There was some expectation that the L2 would be faster on the new process, but I don't recall any confirmation of that.

There's always a trade off. They've increased the L2 cache by 4x to 2MB, so that alone will result in higher latency.

New processes usually bring 25% or so faster transistors at equal power consumption. If you want lower power instead, you have to give up some of that speed gain; of course, you can land anywhere between the two.

Another thing is the design target. If they decided to increase the operating frequency by 20%, then everything that is synchronized with the clock has to operate 20% faster too. L2 cache latency is always expressed relative to the CPU clock, so a 20% higher frequency at the same 20 cycles means the L2 cache is 20% faster in absolute terms. That also means that if the CPU is designed to operate at a much higher frequency, there won't necessarily be any reduction in the L2 cache's cycle count.
 

Schmide

Diamond Member
Mar 7, 2002
5,729
1,021
126
There's always a trade off.

I wonder if they're going to reach a 2-cycle L1 or remain at 3 cycles and keep extra headroom for clock speed. It most certainly seems that they traded size for the extra associativity, and snooping vs. shared resources probably played a part in the small L1 as well.
 

Mopetar

Diamond Member
Jan 31, 2011
8,460
7,682
136
There's always a trade off. They've increased the L2 cache by 4x to 2MB, so that alone will result in higher latency.

Technically it's doubled since the L2 cache is shared by both cores on the module. Of course if only one core is running then it's essentially quadrupled.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Technically it's doubled since the L2 cache is shared by both cores on the module. Of course if only one core is running then it's essentially quadrupled.

Which wouldn't matter in latency calculations. :)

Schmide said:
I wonder if they're going to reach a 2-cycle L1 or remain at 3 cycles and keep extra headroom for clock speed. It most certainly seems that they traded size for the extra associativity, and snooping vs. shared resources probably played a part in the small L1 as well.

64KB L1-I: 3 cycles
16KB L1-D : 4 cycles
2MB L2: 18-20 cycles
 

HW2050Plus

Member
Jan 12, 2011
168
0
0
Also, POV-ray is a raytracer, which has very different behavior (from the standpoint of a processor) from scanline / triangle-pipeline renderers (i.e. most real-time or near-real-time renderers) and has nothing to do with Windows Movie Maker-type performance.
Maybe you're mixing up video editing (Windows Movie Maker) with rendering (POV-Ray)? Anyway, of course you look at the suited ones.
 