Search results

  1. H

    Bulldozer ES benchmark is out!

    Fine, now these benchmarks look pretty reasonable. They match exactly what was expected from the design issues I mentioned in other threads here. And they have the same issue of not exceeding 2.8 GHz, which was the reason for the delay. They have one or more speed paths to be fixed in order...
  2. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    Let me try to explain this fundamental misunderstanding of absolute IPC. If you have an IPC with the absolute value x.y, that actually means nothing unless it is used as a relative comparison with exactly the same code. E.g. I have a program that does 20 adds, 5 muls and 1 div in the main loop...
  3. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    As we now have lots of official information from AMD, I do not think we can find a lot in such articles. As you say, it is likely more speculation, because otherwise which sources do they rely on? I did not read the article BTW.
  4. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    Again stuff from April Fools' Day 2010. It was identified as fake back in 2010, and it is not funny to repost this bullshit again and again. It's proven to be fake.
  5. H

    [What If] AMD going out of business?

    Llano is much more powerful than Sandy Bridge. Intel simply has no offering able to compete with Llano at the moment. And we all know that Intel has long been lacking badly in the graphics market. On the other hand, AMD had even less chance than Intel to break into this market. But...
  6. H

    [What If] AMD going out of business?

    Some comments on this: First, AMD has TWO fabs in Dresden, not one (via GlobalFoundries). And they are building another one in New York (via GlobalFoundries). In addition they use e.g. TSMC. AMD is now profitable again; they recovered from the heavy losses they incurred by buying ATI. Next the...
  7. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    I think you have little knowledge. Think again about what an incredible success Intel had by just adding Port 5 and how little it cost. I also named the die sizes of chips where it was done like that and how small they are. And Bulldozer, where it was done vice versa, and how bad the...
  8. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    You make the mistake of forgetting to subtract the die area consumed by the integrated graphics unit. Oh what difference ... Yes, a 4C Bulldozer which competes with dual-core Sandy Bridge regarding performance. I take two similarly performing chips (as far as you can do that) and compare the die...
  9. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    I do see die size utilization as a performance issue. As you know, you cannot make a chip as large as you want. There are many limits: speed paths, TDP and finally cost. AMD will not suffer from the initial Zambezi die size. But they already suffer when it comes to Interlagos and they will...
  10. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    I did a little research on that, but the problem is that they have the same issue with Deneb and did not remove it, so I do not think they will change this with Bulldozer, though I already emphasized in earlier posts that AMD has to fix their uncore die consumption issue. I don't think there will be...
  11. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    That is beyond question and not the problem. The problem arises from the amount of die space AMD Bulldozer needs to surpass the 4-core Sandy Bridge. That is almost twice the die space (excluding graphics on Intel), and that causes the real problem, not in 2011 but in the upcoming years. So let's...
  12. H

    AMD Bulldozer benchmarks leaked

    Gigabyte has confirmed this to be a fake: http://www. rumorpedia/exclusive-amd-bulldozer-details-from-gigabyte/ Remove space in link
  13. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    From my experience with recent CPU releases and benchmark results that leaked beforehand, I can tell that in all cases the final retail performance matched the performance of the pre-release/engineering samples. What might not match is the clock frequency, but everything else is okay. Yes there might...
  14. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    Maybe it does not matter, but false dependencies prevent the scheduler from pulling instructions ahead. If there is no false dependency, it can pull them ahead using shadow registers. The problem is that if BD does not work exactly as you expect but could do more ops/cycle, you might have...
  15. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    It is not so good to use 32-bit operations in 64-bit code since that might cause false dependencies, even though you spread the operations widely over registers. Changed to 64-bit only, no false dependencies; I also changed the immediate operations to register operations to not run into decoding...
  16. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    Sure it is smaller, it is on a new 32 nm SOI process. There is no problem with the absolute die size itself (my estimate for 8-core Zambezi is 280 mm²; from the same die shot others get 292/294). The problem arises from the die size in relation to performance. The key point is that Intel's...
  17. H

    Cinebench score lower than before...WTF?

    Though it would need to be analyzed more precisely, it is likely related to having more interrupts if your only change was switching the graphics card. The interrupts make the drivers run (scheduled by the OS) and those consume some resources, not to speak of the general loss in performance from context...
  18. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    Thank you for the source. But I would like to read more of this document than just this quoted part. As for that statement, it would completely fit with the optimization manual, as the lea/inc capability is in terms of micro-ops. As this doesn't contradict the optimization manual this is about...
  19. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    The roadmaps I have seen showed a Sandy Bridge EN (desktop) with 6 and 8 cores. However, the latest news talked more about the 6-core variant for desktop. I have no idea about actual Intel plans or if they have changed them. According to the optimization manual the two additional...
  20. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    4 MicroOps which are 2 MacroOps. @Dresdenboy: Do you have a link/source for the fused branch? That would be a good feature. And the bad news just doesn't stop: a new detail has emerged, namely that all FPU, SSE and AVX operations cannot use the L1 cache and have to access L2 or worse. I just cannot...
  21. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    Okay hey, the 4-wide decoder and 32-byte fetch are great, but you forget that for one core that means only 2-wide decoding and 16-byte fetch. Decoding is an issue because AMD would need 6-wide decoders, which is nearly impossible with x86. The main reason why obviously IBM was able to get it right...
  22. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    Yes, I know Interlagos. But JF-AMD said it is only issued as a server part and that they will not release it as a desktop part. Therefore in the server market everything seems to be okay, because CPU prices are not so much of an issue and AMD uses MCM. What I mean is that they cannot do a single die...
  23. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    From a scheduling and execution standpoint, for sure yes; that was part of the question clarification on AMD's Bulldozer preview web site. Regarding sustained usage, it depends of course on what the other thread does, and throughput information was missing in the optimization manual.
  24. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    You are absolutely correct on this. What can I say? First I praised the design, because if you can get an 80% advantage with 5-10% of added die space, this is just so much greater than getting only 30% with Intel's SMT. Also, in IBM research this 17 FO4 was analyzed as the optimum performing CPU...
  25. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    I mean here we see another problem of Bulldozer's CMT approach. Okay, 3 AVX instructions issued in two cycles, 4 integer instructions in 2 cycles. Appears to be possible. But in detail it is very strange. All three AVX instructions have a strong dependency. So they cannot even be issued in...
  26. H

    Bulldozer screens/info (pcinlife)

    Frequencies are pretty low on these engineering samples, but that is possibly because of usual ramping issues. The processors which emerge in June are currently in production (apparently B0 or B1) and it is unknown in what speed bins those can be divided. The information gain from that is very...
  27. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    Yes but for address calculation in call and lea e.g. And yes that helps for lea and call as can be seen in the latency tables. The BD architecture is 2 wide, 2 ALUs, 2 x86-Ops / cycle. That is the information we have so far and this information is an official document from AMD. As long as...
  28. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    Are you kidding? Core 2 is superior to K10.5 in practically every design aspect: * Execution width (4 vs. 3) * cache latency * memory bandwidth/latency (since Nehalem) * instruction latency * prefetch * branch prediction * scheduler depth * L/S reordering * SMT * Trace cache (since Sandy Bridge)...
  29. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    Except for Fritz Chess, it is quite obvious that the counts get lower the more you access main memory, and then you are largely limited by memory access, which reduces absolute IPC through memory stalls. Whereas synthetic benchmarks obviously run in L1 cache only. As I said the absolute IPC is more a...
  30. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    K7 -> K8 was the biggest design step in the line K7-K10.5. Regarding performance the very improved reorder capability of the scheduler of K8 compared to K7 cannot be emphasized enough. Only since the K8 scheduler AMD could easily sustain all 3 pipelines at steady work.* *@Riek That is also why...
  31. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    Regarding the faster L/S handling, this is coming from the prefetcher, right. But the latency of L1 increased by 1 cycle. However, I also mean several instructions which have increased latency. Regarding FP I already said that a shared unit is more than two halves because of exactly what you...
  32. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    It is a bit more complicated, but surely the Bulldozer scheduler is no issue and definitely no bottleneck. Scheduler depth defines the out-of-order window size. So the larger it is, the more you can reorder and the earlier you can load data. But you do not bottleneck because of the scheduler. Even...
  33. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    That is just not correct. In fact normally all 3 units are completely busy and only in rare cases are they not (div stall, misprediction stall, memory stall, etc.). Just load any normal code in AMD Code Analyst Tool and start pipeline analysis, you will see that. Load + Calc. This is so...
  34. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    This is quite interesting. In normal FPU code (fadd, fsub, fmul, fdiv) we have similar problems as we have with integer: 4 FPU operations for 2 cores in Stars vs. 2 FPU operations at slower speed (higher latency) with a 2 core Bulldozer module. But most FPU code is handled today by using SSE...
  35. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    Quite easy, because we have this information from the optimization guide. These MicroOps are executed even slower than the MacroOps from Stars before! However the reason for this slower execution comes from the high frequency design (less work per cycle). As I said already, Bulldozer has...
  36. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    The problem here, and why you get to a 12% performance boost, is that the Bulldozer parts will be clocked higher than Interlagos. So core for core and clock for clock BD is slower than Magny Cours. However, because Bulldozer will have a significantly higher clock than Magny Cours it will be able to...
  37. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    If you keep in mind that such a core from a half module cannot work without the other half of the module (second core) then it is okay. Where e.g. with Llano you could build a single (3, 5, 7) core system.
  38. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    Yes that is because: 1.) They reduced the integer core from a 3 pipeline design (PII) to a 2 pipeline design (BD). This is somewhat new information which I assembled from pieces of information about the decoder and the new AMD optimization manual. They have 4 pipelines of NOW yops which are able to...
  39. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    If you break it down like this yes.
  40. H

    Rumour: Bulldozer 50% Faster than Core i7 and Phenom II.

    This 4.5 GHz was the actual clock, meaning Turbo for all cores included and not on top. And despite JFAMD's insistence that IPC increases, I doubt that. I not only doubt that regarding the latest information from AMD from the developer manual and others before I do not see any possibility for an...