A CPU Architecture Question

phisrow

Golden Member
Sep 6, 2004
First off, I think this is sufficiently theoretical to go here; but if it actually belongs over in CPU/Processors, I apologise. In any case:

One often reads about how nobody really likes the x86 architecture, as an architecture, very much. Sure, for market reasons, x86 cores are cheap, fast, common, and run a huge percentage of the code and binaries in the world; but when people go for serious performance they tend to use some other architecture (64-bit extensions at the very least, and often something else entirely, e.g. Itanium2, Power5, SPARC, MIPS, PPC, &c.). Also, even within modern x86s, the internal architecture is usually something else entirely, with x86 conversion bolted on (either in-chip with the Intel, AMD, and VIA offerings, or in software with the Transmeta gear). This leads me to wonder: why has no one designed a system that allows you to run it either in x86 compatibility mode or in the higher-performance (but esoteric) native mode?

I don't imagine that Intel or AMD would be too likely to do this; but imagine the prospect for VIA, Transmeta, or one of the other x86 bit players. Typically the little players are building embedded systems, so they don't have to worry as much about being compatible with every last motherboard out there (you'd have to be one sick puppy to swap the CPU off of an EPIA and into another board, for example). It seems that you could, especially in the case of Transmeta, where the code transform occurs in software, build some extra code into your BIOS to allow the system to boot as either an x86 or as the native architecture (or heck, allow it to emulate x86, PPC, SPARC, and ARM, just for the coolness of it all). You would then either slip the GCC devs a chunk of money, or have some of your people submit the necessary patches to make your platform easy to develop for.

I can't imagine that it would cost all that much more than what they do now, and it would give them a real edge. The people who want to use embedded Windows, or one of the x86 Linux distros, can continue to do so; while the people who are willing to recompile could get a free performance boost and begin to create a supply of code that works on the new architecture, easing it into the market.
Am I missing some serious technical hurdles to this sort of thing, or are the economics actually worse than they look, or what?
 

sao123

Lifer
May 27, 2002
Actually, Intel tried this. One of the Itaniums had an x86 emulation mode, but it just didn't come close to performing like an actual x86 chip.
The real problem is that most software will only run on one platform. Java tried to solve this (all software runs on all Java-supported platforms), but isn't succeeding very well.

So there really isn't a market for any such device now. Each hardware platform has its own purpose in the world, a niche where it outperforms any other platform.
 

Vee

Senior member
Jun 18, 2004
Originally posted by: phisrow
First off, I think this is sufficiently theoretical to go here; but if it actually belongs over in CPU/Processors, I apologise. In any case:

One often reads about how nobody really likes the x86 architecture, as an architecture, very much. Sure, for market reasons, x86 cores are cheap, fast, common, and run a huge percentage of the code and binaries in the world; but when people go for serious performance they tend to use some other architecture (64-bit extensions at the very least, and often something else entirely, e.g. Itanium2, Power5, SPARC, MIPS, PPC, &c.)
IMO, old texts, or texts written by old minds. The only things interesting today are Power and x86. Itanium hangs in there yet, but IME it's pretty dead without knowing it yet. The rest is slow or dead. Any reason to use SPARC, for instance, is probably not serious performance but serious throughput. Eventually, I think x86 is going to intrude there too.
Also, even within modern x86s, the internal architecture is usually something else entirely, with x86 conversion bolted on (either in-chip with the Intel, AMD, and VIA offerings, or in software with the Transmeta gear). This leads me to wonder: why has no one designed a system that allows you to run it either in x86 compatibility mode or in the higher-performance (but esoteric) native mode?
I think this possibility mainly suggests itself to you because of the language and words you use to describe the inner workings. It's not quite like that. It's not "bolted" on; it's much more integrated. Think of a modern x86 CPU as a car factory, with the x86 instructions being the cars to be built. It takes maybe 20 hours to build a car, but even so a factory can spit out a new car every few minutes.

The modern CPU uses similar techniques to commit results from instructions as frequently as possible, even though it takes many clock cycles to do the work of one instruction. In a K8, the incoming instructions are split across three parallel lines for decoding. Each instruction waits in a pool until all the *components* it needs are at hand. It is then scheduled individually, "out of order", into one of six parallel execution lines (3 integer, 3 FP), where the result is computed. The result is then stored in a reorder queue until all *previous* results are ready, and then the results are committed in the correct order; that is, written to registers and memory.

On these lines, instructions can be seen to exist as a larger number of *smaller* instructions, subassemblies if you like; I think this is what is meant by code fission. The P4 follows a roughly similar scheme, but is mostly only 2 lines wide where the K8 is 3, and so on. Its lines are also longer, use *smaller* subassemblies, and move faster.
But there really isn't any internal alternative instruction set, in the way that there is in Transmeta's morphing tech.
And those 'microinstructions' would also be a very inefficient way of getting instructions and data into, and out of, the CPU.
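To make the commit stage concrete, here is a toy C sketch of the reorder-queue idea described above. The instruction names, latencies and queue size are invented purely for illustration; this is not an accurate model of the K8, just the "complete out of order, commit in order" mechanism.

/* Toy reorder queue: results may become ready out of order, but they
 * are committed (made architecturally visible) strictly in program
 * order.  Purely illustrative -- not a model of any real CPU. */
#include <stdio.h>
#include <stdbool.h>

#define ROB_SIZE 8

typedef struct {
    const char *mnemonic;  /* the original instruction          */
    int latency;           /* cycles until its result is ready  */
    bool done;             /* execution finished?               */
} RobEntry;

int main(void)
{
    /* Program order: a mix of short and long operations. */
    RobEntry rob[ROB_SIZE] = {
        {"add  eax, ebx",   1, false},
        {"imul ecx, edx",   4, false},   /* slow: multiplication   */
        {"add  esi, edi",   1, false},   /* finishes before imul   */
        {"mov  [mem], eax", 3, false},
    };
    int n = 4, head = 0;

    for (int cycle = 1; head < n; cycle++) {
        /* "Execution lines": every in-flight op makes progress
         * independently, so results appear out of order. */
        for (int i = 0; i < n; i++)
            if (!rob[i].done && --rob[i].latency == 0) {
                rob[i].done = true;
                printf("cycle %d: result ready      : %s\n", cycle, rob[i].mnemonic);
            }
        /* Commit stage: only the oldest instruction may retire,
         * so architectural state is updated in program order. */
        while (head < n && rob[head].done) {
            printf("cycle %d: committed in order: %s\n", cycle, rob[head].mnemonic);
            head++;
        }
    }
    return 0;
}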

"Other" architectures, you're speaking of, handle instructions in pretty much the same way as modern x86. The point of RISC, (SPARC, PPC, MIPS, ARM) was that all this speed (the car factory) was easier to do, on a limited number of transistors, due to the instruction set being smaller, less flexible, and more restricted. So the whole point of RISC was to accomplish the architecture we now have in x86 anyway.
Today, x86 has enough transistors, and RISC has run out of steam due to a number of reasons. One of them is that RISC is not as efficient as x86 to get things into and out of the factory. Another is that the many registers cause problems (false dependencies) for OoO execution sheduling.

Power is sort of evolved PPC.

EPIC (Itanium) is something different: a RISC foundation, but trying to get the compiler to predicate branches and parallelize work in advance. After many years of failures, Intel now seems to be lowering its compiler ambitions, running the CPU faster, and using humongous caches. Performance in relation to die size and cost is very low. Integer performance is, for instance, no better than Athlon64/Opteron. And while FP performance is much better, the basic K8 architecture can probably fairly straightforwardly be boosted to much higher FP performance than in the current iteration (and K9 or K10 might even do that), without increasing cost to Itanium's level.

x86 is a fairly powerful, flexible, simple and efficient *language*, originally made for human assembler programmers. The complexity is kept inside the CPU, while compiler technology is simpler and very mature. The advantage of this is that the CPU can use any means to increase speed at a brisk pace with every generation, independently, without waiting for decades of expensive compiler research to catch up with the latest iteration of the core.

RISC and EPIC came from the opposite idea: a complex compiler with intimate knowledge of the CPU, handling a horrible instruction set, paired with a simple CPU that would be easy to make run fast. Lots of people were very enthusiastic about this. "Let the compiler do the hard work!" It now seems this wasn't quite such a brilliant idea.

Mind you, not everyone thinks EPIC/RISC and complex compilers are wrong. Intel doesn't seem to. IBM might not either. And I'm sure someone is also going to oppose this, here in this very thread ;). And hell, they might be right. What do I know?

Well, I know that RISC CPUs are neither so simple nor so fast anymore, and software development/porting costs are much, much higher.
 

cquark

Golden Member
Apr 4, 2004
Originally posted by: Vee
Originally posted by: phisrow
One often reads about how nobody really likes the x86 architecture, as an architecture, very much. Sure, for market reasons, x86 cores are cheap, fast, common, and run a huge percentage of the code and binaries in the world; but when people go for serious performance they tend to use some other architecture (64-bit extensions at the very least, and often something else entirely, e.g. Itanium2, Power5, SPARC, MIPS, PPC, &c.)

As an earlier post pointed out, Itanium had an x86 emulation mode. Today, with the push towards dual cores as a common architecture, the best solution might be to offer both processors on a single die. You'd still need a specialized socket and chipset, but it would likely work better than emulation.

The complexity is kept inside the CPU, while compiler technology is simpler and very mature. The advantage of this is that the CPU can use any means to increase speed at a brisk pace with every generation, independently, without waiting for decades of expensive compiler research to catch up with the latest iteration of the core.

We've run into some scaling issues recently that may be as difficult to solve as the compiler problems.

RISC and EPIC came from the opposite idea: a complex compiler with intimate knowledge of the CPU, handling a horrible instruction set, paired with a simple CPU that would be easy to make run fast. Lots of people were very enthusiastic about this. "Let the compiler do the hard work!" It now seems this wasn't quite such a brilliant idea.

Mind you, not everyone thinks EPIC/RISC and complex compilers are wrong. Intel doesn't seem to. IBM might not either. And I'm sure someone is also going to oppose this, here in this very thread ;). And hell, they might be right. What do I know?
How about the EDGE architecture as a compromise between putting too much reliance on the compiler understanding everything at compile time, as EPIC does, and too much on the CPU figuring out how code works at runtime, as modern superscalar processors do?
http://www.cs.utexas.edu/users/mckinley/papers/trips-computer-2004.pdf
 

jhu

Lifer
Oct 10, 1999
I think the NexGen Nx586 was able to operate in both x86 mode and native mode. That was back in the mid-90s.
 

jhu

Lifer
Oct 10, 1999
Just a few corrections:

Originally posted by: Vee
"Other" architectures, you're speaking of, handle instructions in pretty much the same way as modern x86. The point of RISC, (SPARC, PPC, MIPS, ARM) was that all this speed (the car factory) was easier to do, on a limited number of transistors, due to the instruction set being smaller, less flexible, and more restricted. So the whole point of RISC was to accomplish the architecture we now have in x86 anyway.
Today, x86 has enough transistors, and RISC has run out of steam due to a number of reasons. One of them is that RISC is not as efficient as x86 to get things into and out of the factory. Another is that the many registers cause problems (false dependencies) for OoO execution sheduling.

What you mean to say is that we can cram in enough transistors to make the limitations of x86 moot. As such, the "RISC" advantage has decreased significantly. As for the number of registers, x86-64 tends to perform better when programs are recompiled, in part due to the doubling of visible registers.
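As a rough illustration of why the extra visible registers help on a recompile (the function and numbers below are invented for this example, not taken from any benchmark): the loop keeps eight accumulators live at once. Compiled for 32-bit x86, with only 8 general-purpose registers (and esp/ebp largely spoken for), some of them get spilled to the stack; recompiled for x86-64, with 16 GPRs, the whole working set can stay in registers.

/* Illustrative only: many simultaneously live values.
 * 32-bit x86 build: too few GPRs, so some accumulators spill to the stack.
 * x86-64 rebuild:   16 GPRs, so the working set fits in registers. */
#include <stdio.h>

static long sum_eight_wide(const long *v, int n)
{
    long a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    long a4 = 0, a5 = 0, a6 = 0, a7 = 0;

    for (int i = 0; i + 8 <= n; i += 8) {
        a0 += v[i];     a1 += v[i + 1];
        a2 += v[i + 2]; a3 += v[i + 3];
        a4 += v[i + 4]; a5 += v[i + 5];
        a6 += v[i + 6]; a7 += v[i + 7];
    }
    return a0 + a1 + a2 + a3 + a4 + a5 + a6 + a7;
}

int main(void)
{
    long v[16];
    for (int i = 0; i < 16; i++)
        v[i] = i;
    printf("%ld\n", sum_eight_wide(v, 16));   /* prints 120 */
    return 0;
}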

Originally posted by: Vee
Power is sort of evolved PPC.

It's actually the other way around.

Originally posted by: Vee
x86 is a fairly powerful, flexible, simple and efficient *language*, originally made for human assembler programmers. The complexity is kept inside the CPU, while compiler technology is simpler and very mature. The advantage of this is that the CPU can use any means to increase speed at a brisk pace with every generation, independently, without waiting for decades of expensive compiler research to catch up with the latest iteration of the core.

Actually, x86, as an architecture, sucks. But since most sane people don't write in pure assembly, it doesn't really matter. In addition, anyone can implement their own x86 processor without existing software needing to be recompiled.
 

Vee

Senior member
Jun 18, 2004
Originally posted by: jhu
Just a few corrections:

... as for the number of registers, x86-64 tends to perform better when programs are recompiled, in part due to the doubling of visible registers.
Of course! Self-evident! But even so, the dependencies are still manageable for the CPU to handle independently, with hidden registers and register renaming.
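A toy C sketch of that renaming idea, if it helps; the register names, pool size, and instruction comments are purely illustrative and not how any particular CPU lays this out:

/* Toy register renamer: a few visible architectural registers are
 * mapped onto a larger pool of hidden physical registers.  Reusing an
 * architectural register for an unrelated result gets a fresh physical
 * register, so no false dependency is created, while true
 * (read-after-write) dependencies are preserved by the mapping. */
#include <stdio.h>

enum { EAX, EBX, ECX, EDX, NUM_ARCH };   /* visible registers   */
#define NUM_PHYS 40                      /* hidden rename pool  */

static int map[NUM_ARCH];                /* architectural -> physical */
static int next_phys;

/* Every new result is assigned a fresh physical register. */
static int rename_dest(int arch)
{
    map[arch] = next_phys++ % NUM_PHYS;
    return map[arch];
}

/* A reader simply follows the current mapping. */
static int rename_src(int arch)
{
    return map[arch];
}

int main(void)
{
    int p_old = rename_dest(EAX);  /* imul eax, ...  : slow producer   */
    printf("consumer of eax reads p%d (true dependency kept)\n",
           rename_src(EAX));       /* add ebx, eax   : must wait       */

    int p_new = rename_dest(EAX);  /* mov eax, [x]   : unrelated reuse */
    printf("old eax = p%d, new eax = p%d (false dependency removed)\n",
           p_old, p_new);
    return 0;
}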
Power is sort of evolved PPC.

It's actually the other way around.
...OK?... PPC was developed as a joint venture by Motorola and IBM, utilizing technologies from Motorola's 88000 and IBM's R6000? I assume the R6000, then, is a Power architecture? I don't remember it being referred to as such, but it was a long time ago, and I suppose I wasn't so interested. Anyway, I thought Power sprang out of the PPC project.

Actually, x86, as an architecture, sucks. But since most sane people don't write in pure assembly, it doesn't really matter.
Well, but your reason for feeling that way would primarily be the segmented modes? What the hell do I know, but I wasn't comparing it to other CISC. My point was that the compiler's task is less complex, and less intimately coupled with the CPU.
 

cquark

Golden Member
Apr 4, 2004
Originally posted by: Vee
Power is sort of evolved PPC.

It's actually the other way around.
...OK?... PPC was developed as a joint venture by Motorola and IBM, utilizing technologies from Motorola's 88000 and IBM's R6000? I assume the R6000, then, is a Power architecture? I don't remember it being referred to as such, but it was a long time ago, and I suppose I wasn't so interested. Anyway, I thought Power sprang out of the PPC project.

The POWER architecture was used in the IBM RS6000 workstations long before PPC was developed. PowerPC supports a subset of the POWER architecture, but has some architectural features from the 88000 as you point out above.

Actually, x86, as an architecture, sucks. But since most sane people don't write in pure assembly, it doesn't really matter.
Well, but your reason for feeling that way would primarily be the segmented modes? What the hell do I know, but I wasn't comparing it to other CISC. My point was that the compiler's task is less complex, and less intimately coupled with the CPU.

Involving the compiler to a higher degree than CISC chips do is a good idea, because a great deal of information about the program is lost after compilation; however, relying on the compiler to as high a degree as EPIC does can be a bad idea because dynamic runtime analysis tells you much about performance that static compile time analysis cannot.
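As a made-up C illustration of what runtime sees and compile time cannot (the array contents and sizes are invented for the example): the two calls below run the same machine code, but the branch inside behaves completely differently depending on the data, which only the hardware's runtime history can pick up on.

/* Illustrative: one static branch, two very different runtime behaviours. */
#include <stdio.h>
#include <stdlib.h>

static long count_above(const int *v, int n, int threshold)
{
    long hits = 0;
    for (int i = 0; i < n; i++)
        if (v[i] > threshold)       /* taken-ness is a property of the data */
            hits++;
    return hits;
}

int main(void)
{
    enum { N = 1 << 20 };
    int *ascending = malloc(N * sizeof *ascending);
    int *noisy     = malloc(N * sizeof *noisy);
    if (!ascending || !noisy)
        return 1;

    for (int i = 0; i < N; i++) {
        ascending[i] = i;           /* branch direction flips exactly once  */
        noisy[i]     = rand();      /* branch direction flips constantly    */
    }

    /* Same compiled code path: a static compile-time analysis sees one
     * branch, while a runtime branch predictor sees two very different
     * patterns. */
    printf("%ld %ld\n",
           count_above(ascending, N, N / 2),
           count_above(noisy, N, RAND_MAX / 2));

    free(ascending);
    free(noisy);
    return 0;
}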

Moore's Law has favored the CISC chips for most of the last decade, as it's made adding all the extra transistors for runtime analysis cheap, but we've become stuck over the last year. Turning to dual cores is an admission of this fact. The EDGE architecture I mentioned above offers an interesting compromise between the two extremes of CISC and EPIC.
 

tinyabs

Member
Mar 8, 2003
Since CPUs nowadays are fast enough, I don't see a reason to compile for native mode to get a 25% increase in speed. How much of a speed increase can you get anyway?

Before compiling for native mode, I would need to learn new technology, get new tools, new machines, a new OS and inexperienced people, and face the risk of canceling the project halfway. That's something an experienced manager won't do.

Unless the speed increase is severalfold and the world is turning to it, using an unfamiliar native mode is a waste of engineering time. The key point is that every architecture has disadvantages; you just reshuffle the pros and cons, but the speed increase won't be extraordinary.

There isn't anything fundamentally wrong with x86 instructions or any other instruction set, because each is designed to outperform in a specific area.
 

PsharkJF

Senior member
Jul 12, 2004
My father works with Solaris SPARC machines in a hospital database setting.
[Just interjecting. No point. Har.]

Anyway, with the extra transistors from a die shrink, can't we add hardware compatibility for two instruction sets? I mean, Itanium proved emulation is slow, but if we have enough transistor headroom, can't we try for hardware support for two sets?
 

sao123

Lifer
May 27, 2002
Even if it were feasible to do such a thing... (and it may be, but it would require lots of additional control circuitry on the CPU)
It would never become cost-friendly enough for a generic computer. Remember, x86 chips cost in the few hundreds of dollars; Itanium chips cost in the thousands.

Secondly, what benefits do you really expect from it?