Could AMD's Jaguar pull a Dothan?

Ancalagon44

Diamond Member
Feb 17, 2010
3,274
202
106
Believe me, a quad-core Kabini at 2GHz is faster than a quad-core Richland at 2GHz. First, Jaguar's IPC is higher than Richland's; second, Jaguar's multithreaded core scaling is better because Richland uses CMT. The only area where Richland wins is the iGPU, because its iGPU is larger.

Pretty embarrassing for AMD if you ask me. But then they have already admitted BD was an unmitigated disaster.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
Is Jaguar's IPC really higher than that of Piledriver? That's a fairly large core size difference.

AMD Kabini A4-5000, Quad core 1.5GHz, 15W TDP
http://www.notebookcheck.net/Short-Review-AMD-A4-5000-APU-Kabini.93173.0.html

AMD Trinity A8-4555M, Quad Core 1.6GHz(turbo 2.4GHz), 19W TDP
http://www.notebookcheck.net/AMD-A-Series-A8-4555M-Notebook-Processor.81873.0.html

Cinebench R11.5 CPU Multi 64Bit

A4-5000 = 1,49 points
A8-4555M = 1,24 points

x264 HD Benchmark 4.0 (2 pass)

A4-5000 = 8,74 fps
A8-4555M = 5,35 fps

True Crypt AES mean

A4-5000 = 0.929 GB/s
A8-4555M = 0.822 GB/s


I believe the x264 score for the A8-4555M is not correct, because the A6-4455M (dual core) scores 5,73 fps in the same benchmark (2 pass). Or perhaps the A8-4555M was throttling.
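
For anyone who wants the gaps as ratios rather than raw scores, here is a minimal sketch that only divides the numbers quoted above. The names `SCORES`, `relative_score` and `ratios` are made up for illustration, and the decimal commas are rewritten as points. It says nothing about clocks or turbo; it is just the arithmetic on the published figures.

```python
# Scores as quoted from notebookcheck; decimal commas rewritten as points.
# Tuple order: (A4-5000 "Kabini", A8-4555M "Trinity").
SCORES = {
    "Cinebench R11.5 multi (points)": (1.49, 1.24),
    "x264 HD 4.0, 2 pass (fps)": (8.74, 5.35),
    "TrueCrypt AES mean (GB/s)": (0.929, 0.822),
}


def relative_score(kabini, trinity):
    """Return the Kabini score as a multiple of the Trinity score."""
    return kabini / trinity


# One ratio per benchmark (hypothetical helper dict, not from the review).
ratios = {name: relative_score(k, t) for name, (k, t) in SCORES.items()}

for name, ratio in ratios.items():
    print(f"{name}: A4-5000 = {ratio:.2f}x A8-4555M")
```

On these figures the A4-5000 comes out at roughly 1.20x, 1.63x and 1.13x the A8-4555M, which shows how far the x264 result sits from the other two.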
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
Pretty embarrassing for AMD if you ask me. But then they have already admitted BD was an unmitigated disaster.

They are different microarchitectures designed for different goals. BD was not designed for low frequencies; Jaguar was designed for high performance per watt at low frequencies.
 

Ventanni

Golden Member
Jul 25, 2011
1,432
142
106
AMD Kabini A4-5000, Quad core 1.5GHz, 15W TDP
http://www.notebookcheck.net/Short-Review-AMD-A4-5000-APU-Kabini.93173.0.html

AMD Trinity A8-4555M, Quad Core 1.6GHz(turbo 2.4GHz), 19W TDP
http://www.notebookcheck.net/AMD-A-Series-A8-4555M-Notebook-Processor.81873.0.html

Cinebench R11.5 CPU Multi 64Bit

A4-5000 = 1,49 points
A8-4555M = 1,24 points

x264 HD Benchmark 4.0 (2 pass)

A4-5000 = 8,74 fps
A8-4555M = 5,35 fps

True Crypt AES mean

A4-5000 = 0.929 GB/s
A8-4555M = 0.822 GB/s


I believe the x264 score for the A8-4555M is not correct, because the A6-4455M (dual core) scores 5,73 fps in the same benchmark (2 pass). Or perhaps the A8-4555M was throttling.

Interesting benchmark scores. Maybe we can convince the Anandtech team to do a comprehensive comparison between the two? But you're right; Bulldozer was meant to scale up, whereas Jaguar was meant to scale down. Redesigning Jaguar to scale up would mean changing the entire design of the chip, but it holds promise.
 

guskline

Diamond Member
Apr 17, 2006
5,338
476
126
IDC: Thank you for your posts and information. It sounds like, to a large degree, GloFo caused AMD to leave a lot of Bulldozer's potential on the design table, and to some degree the same with Piledriver. Is AMD also locked into GloFo for the Steamroller fabrication? :(
 

sm625

Diamond Member
May 6, 2011
8,172
137
106
The A8 ULV almost doubles the Kabini in single-thread performance. It falls behind in 4-thread performance, probably because it's such a huge core that it can't stay under the TDP limit when the entire core is active. Of course, as we all know, single-thread performance is critical in Windows. I bet these Kabini notebooks won't handle simple Flash games like Bloons Super Monkey 2 smoothly.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
AMD Kabini A4-5000, Quad core 1.5GHz, 15W TDP
http://www.notebookcheck.net/Short-Review-AMD-A4-5000-APU-Kabini.93173.0.html

AMD Trinity A8-4555M, Quad Core 1.6GHz(turbo 2.4GHz), 19W TDP
http://www.notebookcheck.net/AMD-A-Series-A8-4555M-Notebook-Processor.81873.0.html

Cinebench R11.5 CPU Multi 64Bit

A4-5000 = 1,49 points
A8-4555M = 1,24 points

x264 HD Benchmark 4.0 (2 pass)

A4-5000 = 8,74 fps
A8-4555M = 5,35 fps

True Crypt AES mean

A4-5000 = 0.929 GB/s
A8-4555M = 0.822 GB/s


I believe the x264 score for the A8-4555M is not correct, because the A6-4455M (dual core) scores 5,73 fps in the same benchmark (2 pass). Or perhaps the A8-4555M was throttling.

But I don't think a well-threaded test comparing a true 4C Jaguar and a 2M Piledriver at similar clock speeds is really representative of IPC. The single-threaded test would be a better proxy.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
The A8 ULV almost doubles the Kabini in single-thread performance.

But I don't think a well-threaded test comparing a true 4C Jaguar and a 2M Piledriver at similar clock speeds is really representative of IPC. The single-threaded test would be a better proxy.


You both miss that Trinity turbos up to 2.4GHz from a 1.6GHz base, while Jaguar has no turbo and always stays at 1.5GHz. The MT performance scores are more representative.
 

Ancalagon44

Diamond Member
Feb 17, 2010
3,274
202
106
I have a relatively unscientific method of comparing the two.

Look at the Cinebench R11.5 single-threaded scores for each:
Kabini @ 1.5GHz = 0.39
8350 (Piledriver) @ 4GHz = 1.1

To simulate scaling, divide 1.1 by 4GHz and multiply by 1.5GHz. You get 0.4125.

However, bear in mind that the 8350 can turbo up to 4.2GHz. If it was running at 4.2GHz, the equivalent score would be 0.3929.
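
The same back-of-the-envelope scaling as a small sketch, assuming (as the post does) that the score scales linearly with clock speed. The function name `scale_score` is hypothetical, and the linearity is an assumption rather than something measured.

```python
def scale_score(score, measured_ghz, target_ghz):
    """Linearly rescale a benchmark score from one clock speed to another.

    Assumes performance is strictly proportional to frequency, which is
    only an approximation.
    """
    return score / measured_ghz * target_ghz


kabini_score = 0.39  # Kabini at 1.5GHz, as quoted above

# FX-8350 score of 1.1, rescaled to Kabini's 1.5GHz.
fx_from_base = scale_score(1.1, measured_ghz=4.0, target_ghz=1.5)
fx_from_turbo = scale_score(1.1, measured_ghz=4.2, target_ghz=1.5)

print(f"Kabini at 1.5GHz:               {kabini_score:.4f}")
print(f"8350 scaled from 4.0GHz base:   {fx_from_base:.4f}")
print(f"8350 scaled from 4.2GHz turbo:  {fx_from_turbo:.4f}")
```

Depending on which clock the 8350 actually ran at, the scaled figure lands just above or almost level with Kabini's 0.39, so the result hinges on that one assumption.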
 

sm625

Diamond Member
May 6, 2011
8,172
137
106
It now costs $5 apiece for 20MB+/sec 64Gbit MLC NAND. That is just $20 for 32GB of NAND @ 80-100 MB/sec. Using basic compression algorithms (similar to SandForce) you can squeeze maybe 150MB/sec and 50-60GB out of that $20 worth of NAND. That's why I say that if AMD had put a NAND controller on their chips, they would have a killer product on their hands. Given the low latency of an integrated controller, as well as compression, Windows would absolutely fly. And the 10GB of wasted space consumed by the WinSxS folder would only consume 2GB of NAND because it is so highly compressible.

Man it was 5 years ago I said they needed to do this. Now AMD and Intel are both being eviscerated by companies who actually did put a somewhat intelligent NAND controller in their SoC.
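
The numbers in the post above can be checked with a quick sketch. It assumes four chips and that throughput adds up linearly across them (ideal striping), which is an idealization; `nand_array` and its parameter names are made up for illustration, and the prices and speeds are the post's own figures.

```python
BITS_PER_BYTE = 8


def nand_array(chips, gbit_per_chip=64, usd_per_chip=5.0, mb_s_per_chip=20.0):
    """Return (capacity in GB, cost in USD, aggregate MB/s) for `chips` NAND chips.

    Assumes throughput scales linearly with the number of chips.
    """
    capacity_gb = chips * gbit_per_chip / BITS_PER_BYTE
    cost_usd = chips * usd_per_chip
    throughput_mb_s = chips * mb_s_per_chip
    return capacity_gb, cost_usd, throughput_mb_s


capacity_gb, cost_usd, throughput_mb_s = nand_array(chips=4)
print(f"{capacity_gb:.0f}GB for ${cost_usd:.0f} at {throughput_mb_s:.0f}MB/sec")

# Compression ratios implied by the claimed 50-60GB of effective capacity.
low_ratio = 50 / capacity_gb
high_ratio = 60 / capacity_gb
print(f"Implied compression ratio: {low_ratio:.2f}x to {high_ratio:.2f}x")

# The WinSxS claim (10GB shrinking to 2GB) implies a much higher ratio.
winsxs_ratio = 10 / 2
print(f"Implied WinSxS compression ratio: {winsxs_ratio:.1f}x")
```

So the $20, 32GB and 80MB/sec figures are consistent with four 64Gbit chips, while the effective-capacity claim needs roughly 1.6x to 1.9x compression across the whole drive.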
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
You both miss that Trinity turbos up to 2.4GHz from a 1.6GHz base, while Jaguar has no turbo and always stays at 1.5GHz. The MT performance scores are more representative.

So, you can disable turbo and fix the clock speed, can't you? I didn't think I had to outright say that I meant testing these parts at the same clock speed.

It doesn't matter if it's more representative than some other bad comparison, it's still itself a bad comparison. Bad comparisons shouldn't be used, at least without first describing how they're bad.

I have a relatively unscientific method of comparing the two.

Look at the Cinebench R11.5 single-threaded scores for each:
Kabini @ 1.5GHz = 0.39
8350 (Piledriver) @ 4GHz = 1.1

To simulate scaling, divide 1.1 by 4GHz and multiply by 1.5GHz. You get 0.4125.

However, bear in mind that the 8350 can turbo up to 4.2GHz. If it was running at 4.2GHz, the equivalent score would be 0.3929.

perf/MHz isn't strictly linear, and even a small non-linearity can accumulate a lot over 4x.
 

Khato

Golden Member
Jul 15, 2001
1,293
373
136
Man it was 5 years ago I said they needed to do this. Now AMD and Intel are both being eviscerated by companies who actually did put a somewhat intelligent NAND controller in their SoC.

Just curious, which SoCs actually have a NAND controller integrated? I'd thought that pretty much all of them use eMMC storage in order to avoid doing so.
 

Exophase

Diamond Member
Apr 19, 2012
4,439
9
81
Just curious, which SoCs actually have a NAND controller integrated? I'd thought that pretty much all of them use eMMC storage in order to avoid doing so.

I don't know about somewhat intelligent NAND controllers, but practically every ARM SoC I've ever seen can interface with NAND flash. The controller's on the NAND chip; you just give it commands over a normal asynchronous bus. Look at some random NAND datasheet, like this one that came up in a quick Google search:

http://download.micron.com/pdf/datasheets/flash/nand/4gb_nand_m40a.pdf

It's not that different from the NOR flash support they all have too, except that you have to give commands to set up a page for reading. A lot of SoCs can boot off NAND; they probably have some ROM code that sets up the NAND chip.

When you said NAND controller did you mean integrated transparent hardware wear leveling and whatever other stuff you might find on a USB stick or SSD so it can safely use a normal filesystem? Usually the OSes running on devices with straight NAND chips will use special filesystems that handle this stuff.
 
Khato

Golden Member
Jul 15, 2001
1,293
373
136
I don't know about somewhat intelligent NAND controllers, but practically every ARM SoC I've ever seen can interface with NAND flash. The controller's on the NAND chip; you just give it commands over a normal asynchronous bus. Look at some random NAND datasheet, like this one that came up in a quick Google search:

http://download.micron.com/pdf/datasheets/flash/nand/4gb_nand_m40a.pdf

It's not that different from the NOR flash support they all have too, except that you have to give commands to set up a page for reading. A lot of SoCs can boot off NAND; they probably have some ROM code that sets up the NAND chip.

When you said NAND controller did you mean integrated transparent hardware wear leveling and whatever other stuff you might find on a USB stick or SSD so it can safely use a normal filesystem? Usually the OSes running on devices with straight NAND chips will use special filesystems that handle this stuff.

Guess I should have quoted sm625's entire post in my response, as you're quite correct: pretty much all SoCs have a very basic NAND controller integrated in order to provide support for SD/MMC.

My comment was about the quoted statement, "AMD and Intel are both being eviscerated by companies who actually did put a somewhat intelligent NAND controller in their SoC", which was made after a brief wishful daydream detailing the integration of a ~100 MB/s NAND controller into AMD's chips. Regardless, pretty much all of the ARM SoCs used in smartphones, tablets and the like use e-MMC for their primary storage, no? That's because it is a more robust and faster option than their integrated SD/MMC controllers.
 

mrmt

Diamond Member
Aug 18, 2012
3,974
0
76
IDC: Thank you for your posts and information. It sounds like, to a large degree, GloFo caused AMD to leave a lot of Bulldozer's potential on the design table, and to some degree the same with Piledriver. Is AMD also locked into GloFo for the Steamroller fabrication? :(

What did GlobalFoundries do to hinder Bulldozer?

GlobalFoundries delivered a process that allowed AMD to bring processors clocked over 4GHz to market, something I find unlikely to happen on GLF's 28nm. This situation is no different from when AMD had the upper hand. They never had a significant manufacturing advantage over Intel; they excelled with the designs they marketed, like the K6-2, K7 and K8. If AMD had a better design (better IPC, lower power consumption), they would be in a far better market situation.

In a sense, it's easy to see why AMD execs are calling Bulldozer an unmitigated disaster. No matter how good the process is, and no matter how power efficient their GPUs are, they cannot make a processor small enough and fast enough to be competitive against Intel. Bulldozer is AMD's own failure; they have no one else to blame.

But if you are arguing that GLF is hindering AMD as a company, then yes, GLF is hindering AMD. The WSA, the delayed 28nm process and the absence of an HP node beyond 28nm are all destroying AMD.
 

wlee15

Senior member
Jan 7, 2009
313
31
91
Jaguar is not even in the same class, if you wish to draw on history. It's only a 2-issue-wide uarch, while Pentium-M was 3-issue wide.

And AMD already killed off the cat cores for servers.

It only has 2 ports dedicated to its execution units, and I believe packed SSE instructions have to go through the complex decoder, so in practice it's quite a bit slower.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
IDC: Thank you for your posts and information. It sounds like, to a large degree, GloFo caused AMD to leave a lot of Bulldozer's potential on the design table, and to some degree the same with Piledriver. Is AMD also locked into GloFo for the Steamroller fabrication? :(

Well, we can't fault GloFo for 32nm, for two reasons. The first is that GloFo, as the company that exists today, had zero input into the decisions that AMD (Ruiz and Meyer) made to team up with IBM and the fab-club for the creation and definition of the 32nm SOI node as it came to exist.

That was more of a situation where AMD got what it intended to get, whereas GloFo ended up being in a situation of having to sell a product (32nm SOI) that they didn't necessarily have much input on.

The second reason is that 32nm SOI probably helped Bulldozer more than it hurt it. What definitely did not help Bulldozer was that the 32nm process was gate-first. That resulted in weaker drive currents versus a gate-last process like Intel's and TSMC's, which has the knock-on effect of limiting clock speeds unless you push the voltages to the sky (which is ultimately what AMD had to resort to in order to get the chips to clock where they clock).

So AMD cooked their own bulldozer goose IMO in partnering with IBM and letting IBM drive the development of 32nm as they did.

It was an inevitable train wreck, and many on the inside saw it coming well in advance (including the BoD).

Steamroller will be 28nm, which will be a node developed for GloFo at the fab-club but at the total discretion of GloFo without any pre-decided AMD factors involved. That may not matter though, time will tell, it is late and that is usually not a good sign.
 

Vesku

Diamond Member
Aug 25, 2005
3,743
28
86
At the very least TSMC's process nodes seem to arrive in a reasonable schedule, it's Q2 2013 now and Glo-Fo still haven't released a 28nm process node yet.

Seems like GF 28nm is finally starting to produce, roughly a year and a half after TSMC.

http://www.digitimes.com/news/a20130618PD221.html

"Global shipments of 28nm HKMG wafers have topped over 850,000 units."

I wonder how many of those are Kaveri. Does AMD now have to jockey for wafer priority, even after paying for not having product made at 32nm?

The second reason is that 32nm SOI probably helped Bulldozer more than it hurt it. What definitely did not help Bulldozer was that the 32nm process was gate-first. That resulted in weaker drive currents versus a gate-last process like Intel's and TSMC's, which has the knock-on effect of limiting clock speeds unless you push the voltages to the sky (which is ultimately what AMD had to resort to in order to get the chips to clock where they clock).

So AMD cooked their own bulldozer goose IMO in partnering with IBM and letting IBM drive the development of 32nm as they did.

It was an inevitable train wreck, and many on the inside saw it coming well in advance (including the BoD).

While I don't have much interest in reading Ruiz's book I'd actually read one from Dirk if he actually described how this train wreck of decisions came about.

Seems that GlobalFoundries is sticking with the IBM alliance through at least 10nm.

http://www.xbitlabs.com/news/other/...l_Tech_for_Development_of_Next_Gen_Nodes.html

Although they have detached themselves from having to wait for IBM to approve everything. Common Platform looks to be less IBM centric than the IBM Technology Development Alliances group.

http://www.commonplatform.com/

Side note: What is with the ridiculous PR time frames? GF saying 20nm will be in production in 2013, said in February 2013 when 28nm was just starting to produce?

http://www.xbitlabs.com/news/other/...2015_7nm_Fabrication_Process_Due_in_2017.html
 