
Would RISC have been better?

Been reading about RISC CPUs; it's suggested that CISC only won because Intel had a ton of cash to throw at it. Would CPUs today have been better had Intel thrown all their cash at RISC instead?

Would RISC even have stayed that way or would it have ended up more like CISC because of chipmakers tacking on extra instructions?

I guess nobody could really know the answers here but speculate away! 🙂
 
Of course RISC would have been better. Intel itself tried to kill x86 more than once. But by then it had long since become bigger than they could control.

Don't think Intel wouldn't push the Itanium button if it were possible.
 
I'm not sure there'd be any appreciable difference.

Almost all architectures have similar baggage now.
X86's flaws also tend to give it some rather unique strengths.

I'd say it's easier to get a high-performance processor out of RISC than out of CISC. However, Intel's Atom and AMD's small-core line do show us that even x86 cores with low monetary investment can hang with or beat non-x86 cores.

it's suggested that CISC only won because Intel had a ton of cash to throw at it.

This is countered by AMD's existence. AMD beat Intel for a crucial portion of the processor wars, and even now AMD's processors compare well against any RISC processors.
 
The last CISC CPU was the Pentium.

http://en.wikipedia.org/wiki/Complex_instruction_set_computing

"A complex instruction set computer (CISC /ˈsɪsk/) is a computer where single instructions can execute several low-level operations (such as a load from memory, an arithmetic operation, and a memory store) and/or are capable of multi-step operations or addressing modes within single instructions. The term was retroactively coined in contrast to reduced instruction set computer (RISC).[1][2]
Examples of CISC instruction set architectures are System/360 through z/Architecture, PDP-11, VAX, Motorola 68k, and x86."

"Before the RISC philosophy became prominent, many computer architects tried to bridge the so-called semantic gap, i.e. to design instruction sets that directly supported high-level programming constructs such as procedure calls, loop control, and complex addressing modes, allowing data structure and array accesses to be combined into single instructions. Instructions are also typically highly encoded in order to further enhance the code density. The compact nature of such instruction sets results in smaller program sizes and fewer (slow) main memory accesses, which at the time (early 1960s and onwards) resulted in a tremendous savings on the cost of computer memory and disc storage, as well as faster execution. It also meant good programming productivity even in assembly language, as high level languages such as Fortran or Algol were not always available or appropriate (microprocessors in this category are sometimes still programmed in assembly language for certain types of critical applications[citation needed])."

The CISC vs RISC idea is defined by the external instruction set, so x86 is still CISC.
ARM used to be RISC, but given how it's evolved in complexity over time, I'd say it could be considered CISC too. As hardware gets cheaper, I'd imagine that the balance shifts in favor of more complex instructions.

Another difference is the usage of cache vs registers. RISC-based architectures have more registers and have register-to-register operations. CISC-based architectures generally work off the stack. In the case of x86 at least, instructions often have implicit operands (saving code size), and the instruction set itself is highly encoded to reduce its size, which cuts cache pressure significantly.
IIRC, the average x86 instruction is <2 bytes. On 32-bit RISC CPUs, it's almost always 4 bytes, creating code significantly larger than x86 on top of just flat out generating more instructions to accomplish the same work.

For instance, the PDP-8, having only 8 fixed-length instructions and no microcode at all, is a CISC because of how the instructions work. PowerPC, which has over 230 instructions (more than some VAXes) and complex internals like register renaming and a reorder buffer, is a RISC. Meanwhile Minimal CISC has 8 instructions, but is clearly a CISC because it combines memory access and computation in the same instructions.
Some of the problems and contradictions in this terminology will perhaps disappear as more systematic terms, such as (non) load/store, become more popular and eventually replace the imprecise and slightly counter-intuitive RISC/CISC terms.

The lines between CISC and RISC are pretty blurred now, but the model of "can operate directly on the stack" and "must load values into registers first" is about the only consistent divide.
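That divide can be made concrete with a toy interpreter. This is just a sketch of the idea, not any real ISA: a CISC-style read-modify-write add is a single instruction, while a load/store machine has to split it into three.

```python
# Toy model of the CISC vs load/store (RISC) divide. A CISC-style
# machine may combine a memory read, an ALU op, and a memory write
# in ONE instruction; a load/store machine must split it in three.

def cisc_add_mem(mem, regs, addr, src):
    """One instruction: mem[addr] += regs[src] (read-modify-write)."""
    mem[addr] = mem[addr] + regs[src]

def risc_add_mem(mem, regs, addr, src, tmp="t0"):
    """Three instructions on a load/store machine."""
    regs[tmp] = mem[addr]               # LOAD  t0, [addr]
    regs[tmp] = regs[tmp] + regs[src]   # ADD   t0, t0, src
    mem[addr] = regs[tmp]               # STORE t0, [addr]

mem = {0x10: 5}
regs = {"r1": 7, "t0": 0}
cisc_add_mem(mem, regs, 0x10, "r1")   # mem[0x10] is now 12
risc_add_mem(mem, regs, 0x10, "r1")   # mem[0x10] is now 19
print(mem[0x10])  # 19
```

Same result either way; the difference is how many instructions the front end has to fetch and decode, and whether memory access and computation can live in one opcode.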
 
We can both agree that Pentium Pro and forward uses RISC internally yes?

I thought that the use of micro-ops (breaking down long instructions into smaller ones) is what makes x86 processors "RISC-like", and micro-ops started with the original Pentium, right?
 
Been reading about RISC CPUs; it's suggested that CISC only won because Intel had a ton of cash to throw at it. Would CPUs today have been better had Intel thrown all their cash at RISC instead?

Would RISC even have stayed that way or would it have ended up more like CISC because of chipmakers tacking on extra instructions?

I guess nobody could really know the answers here but speculate away! 🙂

Well, we still have the Power 7 and Sparc T5 architectures going (in terms of 'high power' CPUs). But I don't know if either is even offered in workstations anymore; I think it's all servers. As big iron, they lead Intel, but Intel is slowly catching up and the Unix server market is shrinking every year.

I always wonder what would have happened if Apple had picked the DEC Alpha over the PowerPC. Alphas rocked, and if DEC hadn't gone bankrupt, I like to think they'd still be top dog. But Apple insisted on IEEE 754 extended-precision format support (80-bit), and DEC hated the idea and wanted to stick with 64-bit DP-FP (among other issues).

I had a workstation running WinNT 4.0 on an Alpha (late '90s); the JIT worked great, and native apps flew! Of course, having noisy-as-all-hell 10K RPM SCSI drives added to the sense of speed - fortunately it was in the lab, which was noisy already.
 
Intel itself tried to kill x86 more than once.

Yes, but of those three tries, only one was RISC. The iAPX 432 was CISC taken to the extreme, with all the problems that brings, and Itanium was VLIW-like. The i860 was RISC.

ARM used to be RISC, but given how it's evolved in complexity over time, I'd say it could be considered CISC too.

I disagree with this. While ARM has plenty of instructions, their scope is mostly limited. The meaning of RISC has slowly evolved from (reduced instruction set) computer into (reduced instruction) set computer. That is, it's okay to have a lot of instructions so long as they are all limited to very narrow scope.

The new 64-bit ISA is definitely RISC.

We can both agree that Pentium Pro and forward uses RISC internally yes?

Yes, but this is only an implementation detail on how the CISC is achieved. It does not make the CPU a RISC one.

I thought that the use of micro-ops (breaking down long instructions into smaller ones) is what makes x86 processors "RISC-like", and micro-ops started with the original Pentium, right?

No, it was the Pentium Pro.

I'm not sure there'd be any appreciable difference.

Ultimately, I agree with this. If the past two decades have shown us anything, it's that the ISA just doesn't matter *that* much, so long as it's not utterly braindead. Microarchitecture is more important.
 
We can both agree that Pentium Pro and forward uses RISC internally yes?

Does it? It translates x86 instructions to some other instruction set. I have no details on what that is, it's a complete black box. Some internal instructions may be RISC like, some may be CISC like. What you feed into the processor is still CISC.
FYI, from what I've heard, even the internal micro ops are getting more CISC like again as time goes on.

I disagree with this. While ARM has plenty of instructions, their scope is mostly limited. The meaning of RISC has slowly evolved from (reduced instruction set) computer into (reduced instruction) set computer. That is, it's okay to have a lot of instructions so long as they are all limited to very narrow scope.

ARM's Thumb encoding and mode switching certainly make it more complicated. You've got three separate decoding schemes supported within the same program. It has multiple sets of registers, with many functions having implicit side effects. And the very idea of SIMD instruction sets doesn't seem very RISC-like to me.

There's also the Jazelle Java bytecode accelerator for ARM, which though it translates to simpler ARM instructions, is converting a more complicated stack based instruction scheme.

Wikipedia states that RISC is more accurately termed "load store architectures." I will agree with that, ARM is still load store / register based, while x86 is stack based.

The CISC vs RISC difference doesn't seem particularly important. How stacks and registers are used is though, especially to compilers. The SPARC cpus had some pretty interesting ideas on stack and register usage imo.
 
Does it? It translates x86 instructions to some other instruction set. I have no details on what that is, it's a complete black box. Some internal instructions may be RISC like, some may be CISC like. What you feed into the processor is still CISC.
FYI, from what I've heard, even the internal micro ops are getting more CISC like again as time goes on.

The decoder translates x86 CISC into RISC uops. All Intel x86 CPUs since the Pentium Pro are RISC inside.
 
From the P4 onwards, CISC is actually a very clear advantage, because the instructions are cached as RISC microcode, and since CISC code is more compact than RISC, it gets pulled out of RAM and put into that microcode cache faster.
 
Wikipedia states that RISC is more accurately termed "load store architectures." I will agree with that, ARM is still load store / register based, while x86 is stack based.

Modern high performance x86 processors have been load/store since 1995 regardless of what the ISA is.

Trick question: is Transmeta's Crusoe or Efficeon RISC or CISC?
 
I think that today, in the age of multi-core and highly parallelised code, you would want the processor instructions to be as light and granular as possible in order to run them at the same time across the different computational units in each core. I'm no chip designer, but it sounds logical to me at least.
x86 was not conceived in an era when engineers were thinking in terms of "branch prediction" or parallelism, and perhaps it's time we think of a new approach to computing.
 
The decoder translates x86 CISC into RISC uops. All Intel x86 CPUs since the Pentium Pro are RISC inside.
Actually I'd argue that they're both RISC and CISC. Micro-ops are distinctly RISC, but then you also have distinct macro-ops and micro-op fusion. As the AT article notes, these are still very much RISC cores, but they have the capability to swing both ways as the situation requires.

It's a neat compromise really. Instead of picking one or the other we just use whatever execution model works best for the instruction at hand.🙂
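The macro-op side of that compromise can be sketched in a toy model (illustrative only, not Intel's documented behavior): a decoder that spots an adjacent compare plus conditional branch and emits one fused op, so the rest of the pipeline tracks fewer, bigger operations.

```python
# Toy sketch of macro-fusion: fuse adjacent (cmp, jcc) pairs into a
# single macro-op. Instruction tuples and opcode names are invented
# for illustration, not a real x86 encoding.

def fuse(instructions):
    """Return the stream with each (cmp, jcc) pair fused into one op."""
    out, i = [], 0
    while i < len(instructions):
        cur = instructions[i]
        if (i + 1 < len(instructions) and cur[0] == "cmp"
                and instructions[i + 1][0].startswith("j")):
            nxt = instructions[i + 1]
            out.append(("cmp+" + nxt[0], cur[1:], nxt[1:]))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out

stream = [("mov", "eax", 0), ("cmp", "eax", 10), ("jl", "loop"),
          ("add", "eax", 1)]
print(fuse(stream))  # 4 instructions in, 3 macro-ops out
```

Four decoded instructions become three tracked ops; the fused compare-and-branch occupies one slot in the out-of-order machinery instead of two.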
 
Wikipedia states that RISC is more accurately termed "load store architectures."

This is about the only thing that "RISC" processors nowadays have in common. Take a look at the POWER and PowerPC ISA: They're not any less complex than what Intel/AMD have bolted on to x86 in the past decade.

The CISC vs RISC difference doesn't seem particularly important. How stacks and registers are used is though, especially to compilers. The SPARC cpus had some pretty interesting ideas on stack and register usage imo.

SPARC's register window seems weird and annoying to me.
 
The decoder translates x86 CISC into RISC uops. All Intel x86 CPUs since the Pentium Pro are RISC inside.

From the Pentium M forward, Intel x86 processors have decoders that output an instruction set Intel calls fused uops. We don't really know what those uops look like; they could literally be represented as two unfused uops, but they could also be represented as something much more compact. They support load + op, which isn't very RISC-like, and while they're probably fixed-width, that loses a lot of relevance after the front-end.

These fused uops are atomic for most of the pipeline, only being turned into simpler unfused uops when dispatched to the execution ports.
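That lifecycle can be sketched as a toy model (Intel's actual uop format is undocumented; the names here are invented): a load + op travels most of the pipeline as one atomic unit and only splits into simpler unfused uops at dispatch.

```python
# Toy model of fused uops: one "load + op" occupies a single slot for
# most of the pipeline and is only unfused when dispatched to the
# execution ports. Port names and register names are invented.

def dispatch(fused_uop):
    """Unfuse at dispatch: the load and the ALU op go to separate ports."""
    op, dst, mem_src = fused_uop
    return [("load", "tmp0", mem_src),   # issued to a load port
            (op, dst, "tmp0")]           # issued to an ALU port

rob = [("add", "eax", "[rsi]")]          # one fused uop, one ROB slot
issued = [u for fused in rob for u in dispatch(fused)]
print(len(rob), "fused uop becomes", len(issued), "unfused uops")
```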

This doesn't apply to the Atom processors, neither Bonnell/Saltwell nor Silvermont. At the very least they keep read-modify-write flowing through the pipeline as an atomic unit.

Other difference is the usage of cache vs registers. RISC based architectures have more registers and have register to register operations. CISC based architectures generally work off of the stack. In the case of x86 at least, instructions often have implicit operands (saving in code size), and the instruction set itself is highly encoded to reduce in size, reducing cache pressure significantly.

x86 was never a stack-based ISA, it has always had register to register operations and while it was pretty register starved and lacked a lot of orthogonality it's hard to really make that argument anymore with x86-64 having 16 registers. While a lot of RISCs have 32 registers they were usually designed to be able to hide latencies that an OoOE CPU can hide with register renaming.

IIRC, the average x86 instruction is <2bytes. On 32 RISC cpus, it's almost always 4 bytes, creating code significantly larger than x86 on top of just flat out generating more instructions to accomplish the same code.

Average size of x86 instructions is much larger than that, it's about 3.5 bytes and AFAIK that's for 32-bit code. A mixed 16/32-bit instruction set like Thumb-2 almost certainly has a lower average instruction size, although that doesn't mean that the code size is smaller if it needs more instructions to compensate. While I'm sure x86 started out with a strong focus on code density for the operations you get over time it lost ground because of the way it had to add extensions.

glugglug said:
From the P4 onwards, CISC is actually a very clear advantage, because the instructions are cached as RISC microcode, and since CISC code is more compact than RISC, it gets pulled out of RAM and put into that microcode cache faster.

Not until Sandy Bridge did the P6 lineage introduce any kind of post-decode cache, unless you count Nehalem's loop buffer. And they all have L1 icaches which store x86 instructions. Netburst's ideas for a trace cache were bold but ultimately not really validated.

This is about the only thing that "RISC" processors nowadays have in common. Take a look at the POWER and PowerPC ISA: They're not any less complex than what Intel/AMD have bolted on to x86 in the past decade.

Just because an instruction "seems" complex doesn't make it so in the RISC vs CISC sense. While some RISC ISAs had mixed-width instructions like 16/32-bit, they're much simpler to determine the width of than x86 instructions, which are 1-15 bytes. And it's not just that x86 merges loads with ops: it has full read-modify-write, calls and returns that implicitly push and pop the stack, and instructions that can take arbitrary amounts of time, like the rep-prefixed memory ops.

I do however agree that most of what Intel and AMD have added to the ISA in the last 10 years isn't very CISCy.
 
Average size of x86 instructions is much larger than that, it's about 3.5 bytes and AFAIK that's for 32-bit code. A mixed 16/32-bit instruction set like Thumb-2 almost certainly has a lower average instruction size, although that doesn't mean that the code size is smaller if it needs more instructions to compensate. While I'm sure x86 started out with a strong focus on code density for the operations you get over time it lost ground because of the way it had to add extensions.

Depends on the source and how you're thinking about it (just looking at instructions, or looking at usages).
http://www.strchr.com/x86_machine_code_statistics
http://www.ijpg.org/index.php/IJACSci/article/view/118

The most typical instruction seems to be 2-3 bytes, but the larger instructions can drive that average up. Looks like it just depends on what code is being analyzed. Actual comparisons between binaries have x86 code as more dense than arm+thumb.
Just looking at the instruction sets irrespective of output code, x86 instructions approach the density of Huffman encoding (and I'd bet that's how they generated the x86 instruction set originally), while any fixed length instruction set has density more along the lines of ASCII.
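As a rough illustration of that density argument, here's a sketch that builds a Huffman code over an invented opcode-frequency table and compares its average length with a fixed 32-bit encoding. The frequencies are made up for illustration, not measured x86 statistics.

```python
# Huffman average code length vs a fixed-width encoding. Uses the
# standard trick that total weighted path length equals the sum of the
# combined weights at each merge step.
import heapq

def huffman_avg_bits(freqs):
    """Average bits per symbol of a Huffman code for these frequencies."""
    heap = list(freqs)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        cost += a + b
        heapq.heappush(heap, a + b)
    return cost / sum(freqs)

# Invented distribution: a few very common opcodes, a long tail.
freqs = [40, 25, 12, 8, 5, 4, 3, 2, 1]
print(huffman_avg_bits(freqs))  # 2.48 bits/opcode for this distribution
print(32)                       # a fixed 4-byte instruction is always 32 bits
```

A frequency-skewed variable-length encoding spends few bits on common opcodes, which is the intuition behind x86's density; a fixed 32-bit format pays the full width every time, like ASCII does for characters.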
 
CISC based architectures generally work off of the stack.

x86 was never a stack-based ISA

Work off the stack is not equivalent to stack-based ISA.

Work off the stack refers to addressing data with the bp (base pointer+displacement) while stack-based ISA refers to a system where operands and data coexist on a stack and are executed in place.
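That distinction can be sketched in a toy model (illustrative only, not real ISA semantics): base+displacement addressing reads a local out of a stack frame through the base pointer, while a true stack-based ISA pops its operands off an evaluation stack and pushes the result in place.

```python
# Two toy models of the distinction above. Addresses and frame layout
# are invented for illustration.

def base_disp_read(memory, bp, disp):
    """x86-style 'work off the stack': read a local at [bp - disp]."""
    return memory[bp - disp]

def stack_machine_add(stack):
    """Stack-based ISA: ADD pops two operands, pushes the result."""
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)
    return stack

memory = {100: 7, 96: 3}               # a frame with two locals, bp = 104
print(base_disp_read(memory, 104, 4))  # 7: the local at [bp - 4]
print(stack_machine_add([3, 7]))       # [10]: operands lived on the stack
```

In the first model the stack is just memory that happens to be addressed relative to a register; in the second, the stack *is* the operand mechanism, which is the sense in which, say, Java bytecode is stack-based and x86 is not.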
 
Fox5 said:
Depends on the source and how you're thinking about it (just looking at instructions, or looking at usages).
http://www.strchr.com/x86_machine_code_statistics
http://www.ijpg.org/index.php/IJACSci/article/view/118

The most typical instruction seems to be 2-3 bytes, but the larger instructions can drive that average up. Looks like it just depends on what code is being analyzed.

Yes, and you said 2-3 bytes on average, not median. The average for this is 3.267 bytes/instruction.

Fox5 said:
Actual comparisons between binaries have x86 code as more dense than arm+thumb.

The fact that you call it "ARM+Thumb" and not Thumb 2 makes me wonder which studies you've looked at with Thumb 2 code.

Mind you, static code sizes are not necessarily precisely representative of dynamic code size.

Fox5 said:
Just looking at the instruction sets irrespective of output code, x86 instructions approach the density of Huffman encoding (and I'd bet that's how they generated the x86 instruction set originally), while any fixed length instruction set has density more along the lines of ASCII.

But the problem is that any initial attempt was based on what they thought x86 code should look like 35 years ago. Since then that original 16-bit ISA has been obsoleted in so many ways.

Work off the stack is not equivalent to stack-based ISA.

Work off the stack refers to addressing data with the bp (base pointer+displacement) while stack-based ISA refers to a system where operands and data coexist on a stack and are executed in place.

Actually he did use the description "stack based":

Fox5 said:
I will agree with that, ARM is still load store / register based, while x86 is stack based

While memory-based operations can address variables allocated on a/the stack I wouldn't consider it at all stack-oriented.
 
Actually he did use the description "stack based":

Yeah he is wrong there.

While memory-based operations can address variables allocated on a/the stack I wouldn't consider it at all stack-oriented.

I disagree. x86 is totally stack-oriented. 25% of its original registers are dedicated to working off the stack, and it includes the ENTER/LEAVE instructions to facilitate those operations. Code would be totally tedious without them and the base+offset operations on memory. Seriously, there are only 4 GP registers, and 2 of those are generally used for indexed addressing and loop operations.
 