So the Wii U's CPU is a 1.25 GHz tri-core PowerPC 750...


tipoo

Senior member
Oct 4, 2012
245
7
81
It is interesting that they went with a core with such terrible SIMD performance. The 360 core has pretty beefy vector units, and the PS3 obviously has its SPUs. Are Nintendo expecting developers to use GPGPU more for their highly parallel floating-point algorithms?

Even high end GPUs that we get today slow down their 3D performance quite a bit while doing GPGPU work concurrently, and the Wii U has at best a mid range GPU from a few years ago.

That may explain their choice of going with AMD's GCN for the GPU. I hope they do, so that developers make use of it on other platforms. Plus, GPGPU just kills performance on Nvidia Kepler.


I've heard zero talk of GCN in the Wii U, most things pointed to the R700.
I've only heard of GCN in Durango and Orbis.

The timeframes don't really work out for that theory either; the Wii U spec would have been finalized well before that. Pretty sure it's the Radeon HD 4000 series, then.
 

SPBHM

Diamond Member
Sep 12, 2012
5,056
409
126
One thing we are 100% sure of is that the console uses 64-bit DDR3 at no higher than 1600MHz effective.
That should give an indication of what to expect... so yes, an old and slow CPU plus RV730 makes sense.
 

tipoo

One thing we are 100% sure of is that the console uses 64-bit DDR3 at no higher than 1600MHz effective.
That should give an indication of what to expect... so yes, an old and slow CPU plus RV730 makes sense.

Yeah, that makes for 12.8GB/s. The GDDR5 rumored for the PS4 is apparently 170-192GB/s. The Nextbox has slower DDR3, but there's more of it: 8GB vs. 4GB in Orbis and 2GB in the Wii U. Its DDR3 is probably faster than the Wii U's RAM at any rate, and it uses eSRAM, which is faster than the Wii U's eDRAM.
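For anyone who wants to check the 12.8GB/s figure, it is just bus width times transfer rate. A quick sketch in Python, assuming a single 64-bit channel and that the 1600 figure is the effective transfer rate (1600 MT/s); the function name is only for illustration:

```python
def peak_bandwidth_gb_s(bus_width_bits: int, mega_transfers_per_sec: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_sec * 1e6 / 1e9


# Assumed Wii U main memory configuration: 64-bit bus, 1600 MT/s
wii_u = peak_bandwidth_gb_s(64, 1600)
print(f"Wii U DDR3: {wii_u:.1f} GB/s")

# Compare against the rumored PS4 range quoted above (rumors, not confirmed specs)
for rumored in (170, 192):
    print(f"Rumored {rumored} GB/s is roughly {rumored / wii_u:.1f}x the Wii U figure")
```

This is a theoretical peak only; sustained bandwidth in real workloads will be lower.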


I didn't expect it, but this really is looking like a Wii-vs-PS360 power gulf again, although with a more compatible GPU at least.
 

MightyMalus

Senior member
Jan 3, 2013
292
0
0
"if you're playing games just for how much adult content there is"

You wouldn't actually be an adult then,
since there is much more adult content absolutely everywhere else.


And on topic, the HW makes sense to me: make it easy for devs. I'm not getting a Wii U simply because I am tired of Nintendo not making anything actually new. Same old stuff. Game-wise, I would choose Little Big Planet over Mario any day now.

The first day I played LBP, I felt like a kid again; it was just that fun and entertaining, whether playing with someone or alone.
 

Plutonas

Junior Member
Mar 5, 2013
2
0
0
The Wii U CPU is not a PPC model but a POWER model. PPC is different from POWER CPUs.

IBM says that, Nintendo says that. If you check their official Twitter and websites, the Wii U is advertised with an IBM POWER-based CPU. Now, if some people don't believe the companies that made this chip, that's another matter.

That also explains why the middle CPU core has 2MB of cache. That also means the Wii U CPU is way too strong compared to the PS4 and 720 ones... (underpowered AMD ones... doesn't sound thrilling). IBM POWER cores are well known for their very high instructions per cycle (IPC).

I also remember an early rumor that Nintendo got a bargain on the CPU... And just after that, IBM announced that Nintendo is one of its best partners in the gaming industry, which is why they shared some of their newest technologies with the Wii U's CPU.
 

NTMBK

Lifer
Nov 14, 2011
10,237
5,020
136
The Wii U CPU is not a PPC model but a POWER model. PPC is different from POWER CPUs.

IBM says that, Nintendo says that. If you check their official Twitter and websites, the Wii U is advertised with an IBM POWER-based CPU. Now, if some people don't believe the companies that made this chip, that's another matter.

That also explains why the middle CPU core has 2MB of cache. That also means the Wii U CPU is way too strong compared to the PS4 and 720 ones... (underpowered AMD ones... doesn't sound thrilling). IBM POWER cores are well known for their very high instructions per cycle (IPC).

I also remember an early rumor that Nintendo got a bargain on the CPU... And just after that, IBM announced that Nintendo is one of its best partners in the gaming industry, which is why they shared some of their newest technologies with the Wii U's CPU.

PowerPC was renamed to POWER in 2006. The POWER name didn't exist when this CPU core was originally designed... back in 1997. They took a design from the 90s, ramped up the clock speed, and stuck 3 of them on a die. POWER7 this ain't.
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
They unified the specs for the ISA family a while back (it's almost as bad as x86, but better codified). The ISA is POWER, but it is not a monolithic thing. Also, the CPU core Nintendo uses was derived from the PowerPC 750CXe, so it is quite correct to call it a PPC.

POWER encompasses everything from MMU-less microcontrollers to the biggest, baddest record-setting CPUs. It is not a performance qualifier, any more than x86 or ARM is. The Wii U CPU is still basically an SMP Apple G3 on steroids. It may be much improved from the Wii's, which was improved from the GameCube's, which was improved from the PPC 750CXe, but unless Sony really screws something up, the PS4 will have substantially more CPU performance available.

Hmmm...all within the same minute. The planets must have aligned or something. :p
 

inf64

Diamond Member
Mar 11, 2011
3,698
4,018
136
The Wii U CPU is not a PPC model but a POWER model. PPC is different from POWER CPUs.

IBM says that, Nintendo says that. If you check their official Twitter and websites, the Wii U is advertised with an IBM POWER-based CPU. Now, if some people don't believe the companies that made this chip, that's another matter.

That also explains why the middle CPU core has 2MB of cache. That also means the Wii U CPU is way too strong compared to the PS4 and 720 ones... (underpowered AMD ones... doesn't sound thrilling). IBM POWER cores are well known for their very high instructions per cycle (IPC).

I also remember an early rumor that Nintendo got a bargain on the CPU... And just after that, IBM announced that Nintendo is one of its best partners in the gaming industry, which is why they shared some of their newest technologies with the Wii U's CPU.
No.
It's been known for a while what exactly Wii U has inside.
1.243125GHz, exactly. 3 PowerPC 750 type cores (similar to Wii's Broadway, but more cache). GPU core at 549.999755MHz.
The guy is a known "programmer" (if you like to call it that). He hacked the console, and this is what he got from the hardware itself. Still, the core in the Wii U is somewhat better (in integer ops) than the ones in Cell/Xbox 360, as it's an OoO core while Cell and Xenon are in-order.
Unfortunately, when compared to the advanced OoO design that will be in both the PS4 and the next-gen Xbox (Jaguar), Espresso looks very weak. Jaguar not only has much higher IPC in integer and float (vector), it supports the complete x86 ISA and has 8 physical cores (threads) with massive memory BW at its disposal. There is no comparison between the PS4/next Xbox and the Wii U; they are a generation (if not two) apart.
 

Plutonas

The Jaguars are 2 IPC; even the leaked Xbox 720 documentation says that... Even the Wii surpasses that, with 7 IPC. (The official documentation for the PPC750 at 400MHz says 4 IPC.) And the Wii was an overclocked PPC750CL.

So if the Wii U is a 750 as YOU claim, it may be 4 Jaguars per 1 Wii U core at 1.2GHz.

https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/2F33B5691BBB8769872571D10065F7D5/$file/750cldd2x_ds_v2.6_16Oct2009dft.pdf

This section summarizes the features of the 750CL implementation of the PowerPC Architecture™. Major features of the 750CL include the following:
• Branch processing unit
  • Fetches four instructions per clock
  • Processes one branch per cycle and can resolve two speculations
  • Executes single speculative stream during fetch of another speculative stream
  • Has a 512-entry branch history table (BHT) for dynamic prediction
• Dispatch unit
  • Has full hardware detection of dependencies, which are resolved in the execution units
  • Dispatches two instructions to six independent units (system, branch, load/store, fixed-point unit 1, fixed-point unit 2, or floating-point)
  • Has serialization control (predispatch, postdispatch, execution, serialization)

Anyway, this programmer, Marcan, didn't find the model type inside the hardware, only the speeds. He assumed that these are 3x 750s taped together.


Also, Jaguar cores exist in tablets and mobiles, but they are not officially released in desktop PCs. I highly doubt the Xbox and Sony cores are desktop ones. That's a HUGE and spendy investment (to use an unreleased product). Time will tell. Even if the Jaguar cores are desktop-based, they run at 1/3 of their original speed/performance compared to a PC.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,772
3,148
136
seriously stop...

Jaguar isn't "TWO IPC"; that makes no sense (thus, stop talking).

Jaguar is two-issue/two-retire per core, but that means nothing unless you can sustain two ops every cycle, which we don't even get close to on the "best CPU cores" across varying workloads (much like the workloads games have ;)).

Put simply, executing operations on operands is cheap and easy; having data in the right location at the right time is really, really, really, really, really hard.
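To put rough numbers on that, the usual textbook model adds memory stall cycles on top of the ideal cycles per instruction. A minimal Python sketch follows; the memory-reference fraction, miss rate and miss penalty are made-up illustrative values, not measurements of Jaguar, Espresso or any other core, and the model ignores any overlap of misses that an out-of-order core can achieve:

```python
def sustained_ipc(issue_width: int, mem_refs_per_instr: float,
                  miss_rate: float, miss_penalty_cycles: float) -> float:
    """Sustained IPC under a simple stall model.

    CPI = 1 / issue_width                       (ideal, never starved)
        + mem_refs_per_instr * miss_rate * miss_penalty_cycles  (stalls)
    """
    ideal_cpi = 1.0 / issue_width
    stall_cpi = mem_refs_per_instr * miss_rate * miss_penalty_cycles
    return 1.0 / (ideal_cpi + stall_cpi)


# Hypothetical workload: 30% of instructions touch memory,
# 2% of those miss the caches, each miss costs 100 cycles.
two_wide = sustained_ipc(2, 0.3, 0.02, 100)
four_wide = sustained_ipc(4, 0.3, 0.02, 100)

print(f"2-wide core sustains about {two_wide:.2f} IPC")
print(f"4-wide core sustains about {four_wide:.2f} IPC")
```

With these assumed numbers, doubling the issue width buys far less than double the sustained IPC, because the stall term dominates. That is the whole point about data being in the right place at the right time.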

The collective Anand forum would be grateful if you didn't argue but instead listened and learned :whiste:.
Also, learn the difference between CISC and RISC. RISC with the same issue rate as CISC would be a very bad thing for a RISC processor (all other things being equal).
 

inf64

Now I think that guy is just trolling us :). Just ignore him ;).
 

tipoo

Plutonas... No. Just no. Instructions per clock isn't the decode rate. Marcan didn't just reveal the clock speed; he got the core type too, and the instruction set is exactly the same as the PowerPC 750's. You're venturing into delusional territory if you believe 3 of those at 1.2GHz can beat 8 Jaguar cores at 1.6GHz. This is just like the GPU: when we analyzed that at NeoGAF and got a rough estimate of how many GFLOPS it can put out, people started finding all sorts of excuses, magical fixed-function hardware, etc. It's not a big deal. It's less powerful, period.
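A crude upper bound makes the same point. Take cores x clock x instructions per cycle, using only figures quoted in this thread: two instructions dispatched per cycle for the 750 (from the datasheet excerpt), two-issue for Jaguar, 1.243125GHz and 1.6GHz. This Python sketch assumes perfect scaling across cores and ignores SIMD width, caches, memory bandwidth and everything else that decides real performance, so treat it as a ceiling and nothing more:

```python
def peak_ginstr_per_sec(cores: int, clock_ghz: float, per_cycle: int) -> float:
    """Theoretical ceiling in billions of instructions per second."""
    return cores * clock_ghz * per_cycle


# Figures as quoted in this thread; the Jaguar clock was still a rumor at the time
espresso = peak_ginstr_per_sec(cores=3, clock_ghz=1.243125, per_cycle=2)
jaguar = peak_ginstr_per_sec(cores=8, clock_ghz=1.6, per_cycle=2)

print(f"Espresso ceiling: {espresso:.2f} G instr/s")
print(f"Jaguar ceiling:   {jaguar:.2f} G instr/s")
print(f"Ratio: {jaguar / espresso:.2f}x")
```

Neither chip gets anywhere near its ceiling in practice, but even on paper the three 750-class cores start well behind.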
 

Centauri

Golden Member
Dec 10, 2002
1,655
51
91
PowerPC was renamed to POWER in 2006. The POWER name didn't exist when this CPU core was originally designed... back in 1997. They took a design from the 90s, ramped up the clock speed, and stuck 3 of them on a die. POWER7 this ain't.

It's amazing how messed up your history is. PowerPC was the name given to eventual desktop derivatives of the POWER family. IBM's development on POWER began in the late 80s, with POWER1 shipping in 1990, followed by POWER2 in 1993 and POWER3 in 1998.

IBM, Motorola and Apple began collaboratively working on desktop RISC in 1991 but 'PowerPC' wasn't really born, even internally, until 1992. The first shipping PowerPC desktop wasn't available until 1994.

POWER and PowerPC are completely separate development efforts, with POWER being the sole property of IBM. Nothing has been renamed to anything else. Development of both is still completely separate as well as ongoing, and IBM's POWER8 is nearly completed.
 

Cerb

POWER = ISA, as it generally refers to a POWER Architecture(tm) product.

POWERn, where n is as high as 8, as of today = IBM workstation or server CPU based on the Power ISA.

PowerPCX, where X is several numerical digits followed by some letters, usually = blade, desktop, or embedded CPUs supporting some mix of POWER ISA versions and book features. They used to be totally separate, but post-Motorola ones haven't necessarily been. The PPC970, for instance, was derived from the POWER4, and the Cell PPE was apparently very distantly derived from it (much in the same way as Atom came from the Dothan Pentium-M).

In 2006, IBM released specs and guidelines for a common Linux platform, including common firmware guidelines (translation: "if we don't make this stuff easier for FOSS OS developers, all our fancy RAS won't mean jack, next to x86's widespread software support"), which has allowed them to stay in the embedded world successfully (and all they have to really do is collect royalties--can you say win?), and then started branding more server systems with the POWER name, over the next few years.
 

tipoo

Some historical perspective on 750s

The PowerPC 750 (a.k.a., the G3)

The PowerPC 750, known to Apple users as the G3, is a design based heavily on the 603/603e. Its four-stage pipeline is the same as that of the 603/603e, and many of the features of its front-end and back-end will be familiar from the previous article's discussion of the older processor. Nonetheless, the 750 sports a few very powerful improvements over the 603e that made it faster than even the 604e.

PowerPC 750 summary table

Introduction date: November 10, 1997
Process: 0.25 micron
Transistor Count: 6.35 million
Die size: 167mm2
Clock speed at introduction: 233-266MHz
Cache sizes: 64KB unified L1, 512KB L2
First appeared in: Power Macintosh G3/233

The 750's significant improvement in performance over the 603/603e is the result of a number of factors, not the least of which are the improvements that IBM made to the 750's integer and floating-point capabilities.

A quick glance at the 750's layout will reveal that its execution core is wider than that of the 603. More specifically, where the 603 has a single integer unit the 750 has two, a simple integer unit (SIU) and complex integer unit (CIU). The 750's complex integer unit handles all integer instructions, while the simple integer unit handles all integer instructions except multiply and divide. Most of the integer instructions that execute in the SIU are single-cycle instructions.

Like the 603 (and the 604), the 750's floating-point unit can execute all single-precision floating-point operations, including multiply, with a latency of three cycles. Unlike the 603, though, the 750 doesn't have to insert a pipeline bubble after every third instruction in its pipeline. Double-precision floating-point operations, with the exception of operations involving multiplication, also take three cycles on the 750. Double-precision multiply and multiply-add operations take four cycles, because the 750 doesn't have a full double-precision FPU.

The 750's load-store unit and system register unit perform the functions described above for the 603, so they don't merit further comment.

The 750's front end and instruction window

The 750 fetches up to four instructions per cycle into its six-entry instruction queue (c.f. the 603's six-entry IQ), and it dispatches up to two non-branch instructions per cycle from the IQ's two bottom entries. The dispatch logic follows the four dispatch rules described above when deciding when an instruction is eligible to dispatch, and each dispatched instruction is assigned an entry in the 750's six-entry reorder buffer (compare the 603's five-entry ROB).


Figure POWERPC.4: The PowerPC 750

As on the 603 and 604, newly-dispatched instructions enter the reservation station of the execution unit to which they have been dispatched, where they wait for their operands to become available so that they can issue. The 750's reservation station configuration is similar to that of the 603, in that with the exception of the two-entry reservation station attached to the 750's LSU, all of the execution units have a single-entry reservation station. And like the 603, the 750's branch unit has no reservation station.

Because the 750's instruction window is so small, it has half the rename registers of the 604. Nonetheless, the 750's six general-purpose and floating-point rename registers still put it ahead of the 603's number of rename registers (five GPR and four FPR). Like the 603, the 750 has one rename register each for the CR, LR, and CTR.

You would think that the 750's smaller reservation stations and shorter ROB would put it at a disadvantage with respect to the 604, which has a larger instruction window. But the 750's pipeline is shorter than that of the 604, so it needs fewer buffers to track fewer in-flight instructions. Even more importantly, though, the 750 has one very clever trick up its sleeve that it uses to keep its pipeline full.

Branch prediction on the 750

In the previous article's discussion of branch prediction, we talked about how dynamic branch prediction schemes use a branch history table (BHT) in combination with a branch target buffer (BTB) to speculate on the outcome of branch instructions and to redirect the processor's front end to a different point in the code stream based on this speculation. The BHT stores information on the past behavior (i.e., taken or not taken) of the most recently executed branch instructions, so that the processor can determine whether or not it should take these branches if it encounters them again. The target addresses of recently taken branches are stored in the BTB, so that when the branch prediction hardware decides to speculatively take a branch it will have immediate access to that branch's target address without having to recalculate it. The target address of the speculatively taken branch is loaded from the BTB into the instruction register, so that on the next fetch cycle the processor can begin fetching and speculatively executing instructions from the target address.

The 750 improves on this scheme in a very clever way. Instead of storing only the target addresses of recently taken branches in a BTB, the 750's 64-entry branch target instruction cache (BTIC) stores the instruction that's located at the branch's target address. When the 750's branch prediction unit examines the 512-entry BHT and decides to speculatively take a branch, it doesn't have to go to code storage to fetch the first instruction from that branch's target address. Instead, the BPU loads the branch's target instruction directly from the BTIC into the instruction queue, which means that the processor doesn't have to wait around for the fetch logic to go out and fetch the target instruction from code storage. This scheme saves valuable cycles, and it helps keep performance-killing bubbles out of the 750's pipeline.
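The difference between caching a target address (BTB) and caching the target instruction (BTIC) can be shown with a toy model. This Python sketch is not how the 750 implements it: the LRU replacement, the cycle costs and all names here are assumptions chosen only to illustrate the idea that a BTIC hit skips the fetch from code storage.

```python
from collections import OrderedDict

MISS_COST = 2      # assumed: work out the target address, then fetch it
BTB_HIT_COST = 1   # assumed: address is known, instruction still has to be fetched
BTIC_HIT_COST = 0  # assumed: target instruction is already in the cache


class TargetCache:
    """Small LRU cache keyed by branch address (toy model)."""

    def __init__(self, entries: int):
        self.entries = entries
        self.data = OrderedDict()

    def lookup(self, branch_addr):
        if branch_addr in self.data:
            self.data.move_to_end(branch_addr)  # mark as most recently used
            return self.data[branch_addr]
        return None

    def fill(self, branch_addr, value):
        self.data[branch_addr] = value
        self.data.move_to_end(branch_addr)
        if len(self.data) > self.entries:
            self.data.popitem(last=False)  # evict least recently used entry


def bubble_cycles(taken_branches, code, use_btic, entries=64):
    """Sum the assumed bubble cycles over a list of predicted-taken branches.

    taken_branches: list of (branch_addr, target_addr) pairs
    code: dict mapping address -> instruction text
    """
    cache = TargetCache(entries)
    cycles = 0
    for branch_addr, target_addr in taken_branches:
        if cache.lookup(branch_addr) is None:
            cycles += MISS_COST
            # A BTIC stores the instruction itself; a BTB stores only the address
            cache.fill(branch_addr, code[target_addr] if use_btic else target_addr)
        else:
            cycles += BTIC_HIT_COST if use_btic else BTB_HIT_COST
    return cycles


# A loop whose backward branch at 0x100 jumps to 0x80 ten times
code = {0x80: "lwz r3, 0(r4)"}
loop = [(0x100, 0x80)] * 10
btb_cycles = bubble_cycles(loop, code, use_btic=False)
btic_cycles = bubble_cycles(loop, code, use_btic=True)
print(f"BTB-style:  {btb_cycles} bubble cycles")
print(f"BTIC-style: {btic_cycles} bubble cycles")
```

Under these assumed costs, the address-only cache pays a fetch on every iteration of the loop, while the instruction cache pays only on the first miss.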

PowerPC 750 conclusions

In spite of its short pipeline and small instruction window, the 750 packed quite a punch. It managed to outperform the 604, and it was so successful that a 604-derivative was scrapped in favor of just building on the 750. The 750 and its immediate successors, all of which went under the name of "G3," eventually found widespread use both in the embedded arena and across Apple's entire product line, from its portables to its workstations.

The G3 lacked one important feature that separated it from the x86 competition, though: vector computing capabilities. While comparable PC processors supported SIMD in the form of Intel's and AMD's vector extensions to the x86 instruction set, the G3 was stuck in the world of scalar computing. So when Motorola decided to develop the G3 into an even more capable embedded and media workstation chip, this lack was the first thing they addressed.

http://arstechnica.com/features/2004/10/ppc-2/
 

Techhog

Platinum Member
Sep 11, 2013
2,834
2
26
People still argue about this? It's a 13-year-old CPU that proves Nintendo needs to be banned from designing hardware. End of story.