http://vr-zone.com/articles/report-amd-to-announce-a-series-fx-series-in-june/12037.html
Looks like E3 is the announce date for Bulldozer and the FX series.
For me, if BD is similar to SB in terms of performance (core vs. core), then the platforms (i.e. the chipsets and their feature sets) will decide who gets my wallet for the next round of upgrades that I'm thinking of doing.
I don't see how anyone can think Bulldozer will match Sandy Bridge in IPC. They'll be doing great to get past Nehalem (with programs not using AVX).
As someone who barely plays any games, just surfs the internet, and wants super low power, I'm actually more excited about the Llano chip.
Going to be great seeing it finally come out.
I'm hoping that AMD will release some hexa-core FX chips with a good base clock speed.
I recall reading somewhere that the FX lineup would launch with two 8-core processors, one 6-core, and one 4-core, with more models (probably higher clocked) following in Q4. I'll try finding the article in the morning.
The return of the $1000 AMD CPU!!
There are three over $1000 available from Newegg, and one just shy of $1000.
Desktop or server?
Well that's a silly rhetorical question. :sneaky:
Has anyone put together a moderately technical effort to compare the decode/execute/retire capabilities (factoring in mispredict penalties and cache latency expectations) of what we are expecting from BD versus what we know of Sandy Bridge?
Have dresdenboy, Hans, or RWT put pencil to paper on such an effort?
I do know, from what was released about the architecture during Hot Chips, that the vast majority of the architecture has been improved from K10.
The front end has been completely overhauled, including the branch prediction, which is probably the most improved part of this architecture. (It was a weakness of the STARS architecture, and since the new architecture has deeper pipelines, how much it has improved will have a big impact on Bulldozer's performance.) The branch target buffer now uses a two-level hierarchy, just like Intel does on Nehalem and Sandy Bridge. Plus, a mispredicted branch will no longer corrupt the entire stack, which means the penalties for a misprediction are far lower than in the STARS architecture. (Nehalem also has this feature, so it brings Bulldozer to parity with Nehalem with respect to branch mispredictions.)
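To put rough numbers on why the predictor matters more once the pipeline gets deeper, here is a back-of-the-envelope sketch. The function name and every figure in it are made up purely for illustration; none of them are measured or published values for STARS, Bulldozer, or Nehalem.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Average cycles per instruction once branch mispredict stalls are added.

    Simple textbook-style model: every mispredicted branch costs a fixed
    number of cycles on top of the base CPI.
    """
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Hypothetical inputs, chosen only to show the trade-off.
BRANCH_FRACTION = 0.20  # assume 1 in 5 instructions is a branch

# Shorter pipeline, weaker predictor.
shallow = effective_cpi(1.0, BRANCH_FRACTION, mispredict_rate=0.05, penalty_cycles=12)

# Deeper pipeline (bigger penalty) with the same predictor.
deep_same_predictor = effective_cpi(1.0, BRANCH_FRACTION, mispredict_rate=0.05, penalty_cycles=20)

# Deeper pipeline with a better predictor.
deep_better_predictor = effective_cpi(1.0, BRANCH_FRACTION, mispredict_rate=0.03, penalty_cycles=20)

for label, cpi in [
    ("shallow pipeline", shallow),
    ("deep, same predictor", deep_same_predictor),
    ("deep, better predictor", deep_better_predictor),
]:
    print(f"{label}: CPI = {cpi:.2f}")
```

With these invented inputs, the deeper pipeline only gets back to the shallow pipeline's CPI if the mispredict rate drops, which is the reason the predictor overhaul matters so much here.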
Decoding has improved, but not nearly as much as fetching. Bulldozer can now decode up to four (4) instructions per cycle (vs. 3 for Istanbul). This brings Bulldozer to parity with Nehalem, which can also decode four (4) instructions per cycle. Bulldozer also brings branch fusion to AMD, a feature that Intel introduced with C2D. This allows some instructions to be decoded together, saving clock cycles. Again, this seems to bring Bulldozer to parity with Nehalem (although this is cloudier, as there are restrictions in both architectures, and since Intel has more experience with this feature, they are likely to have a more robust version of branch fusion).
Bulldozer can now retire up to 4 macro-ops per cycle, up from 3 in the STARS architecture. It is difficult for me to compare the out-of-order engines of STARS and Bulldozer, as they seem so dissimilar. I can say that a lot more seems to have changed than just being able to retire 33% more instructions per cycle. Mostly the difference seems to be the move from dedicated lanes using dedicated ALUs and AGUs to a shared approach.
Another major change is in the memory subsystem. AMD went away from the two-level load-store queue (where different functions were performed in each level) and adopted a simple 40-entry load queue with a 24-entry store queue. This actually increases the number of memory operations in flight by 33% over STARS, but still keeps it ~20% below Nehalem. The new memory subsystem also has an out-of-order pipeline, with a predictor that determines which loads can pass stores. (STARS had a *mostly* in-order memory pipeline.) This brings Bulldozer to parity with Nehalem, as Intel has used this technique since C2D. Another change is that L1 cache is now duplicated in L2 cache (which Intel has been doing for as long as I can remember), although L3 cache is still exclusive.
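The queue percentages above are easy to sanity-check. A quick sketch: the Bulldozer numbers are the ones quoted in this post, the Nehalem load/store buffer sizes (48 and 32) are what I believe Intel has published, and the STARS total is only back-calculated from the 33% claim, so treat it as an assumption rather than a spec.

```python
# Bulldozer queue sizes as quoted in the post.
bd_load, bd_store = 40, 24
bd_total = bd_load + bd_store

# Nehalem load/store buffer entries (to my knowledge: 48 loads, 32 stores).
nehalem_load, nehalem_store = 48, 32
nehalem_total = nehalem_load + nehalem_store

# How far Bulldozer's total sits below Nehalem's.
deficit_vs_nehalem = 1 - bd_total / nehalem_total

# ASSUMPTION: the STARS total is not stated in the post; this just
# back-solves what it would have to be for "33% more" to hold.
implied_stars_total = bd_total / (1 + 1 / 3)

print(f"Bulldozer total entries: {bd_total}")
print(f"Nehalem total entries:   {nehalem_total}")
print(f"Bulldozer vs Nehalem:    {deficit_vs_nehalem:.0%} fewer")
print(f"Implied STARS total:     {implied_stars_total:.0f}")
```

So 64 entries against Nehalem's 80 is where the ~20% figure comes from, and the 33% figure implies roughly 48 entries for STARS.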
Bulldozer now implements true power gating, although unlike Intel, who gates at each core, AMD power gates at the module level. This shouldn't really affect IPC, but it might affect the max frequency, so it is a point to bring up when discussing changes to performance. The ability to completely shut off modules should allow higher turbo frequencies than we saw in Thuban, but we won't know what they are until we see some reviews.
Well, those are the main differences that I know of. Add to that the fact that this processor was actually designed to work on a 32nm process versus a 130nm process like STARS, and you should see additional efficiencies. I expect a good IPC improvement, along with a large clock speed boost, although I can't say how much, and I am really looking more for parity with Nehalem-based processors than with Sandy Bridge-based processors.
References:
Butler, Mike. "Bulldozer: A New Approach to Multithreaded Compute Performance." Hot Chips XXII, August 2010.
http://www.realworldtech.com/page.cf...2610181333&p=1