What would we be capable of

EightySix Four

Diamond Member
Jul 17, 2004
5,122
52
91
What kind of general home computing processor performance could we see if we could ditch the x86 architecture and start over fresh, with no backwards compatibility concerns, nothing?
 

Fox5

Diamond Member
Jan 31, 2005
5,957
7
81
I don't think the x86 instruction sets are a limitation of current computing; only once all other limits have been removed am I sure switching to a different architecture would make a huge difference. (Current Athlons and Pentiums aren't old-school x86 designs internally anyhow.)
Of course, a processor like Cell shows that you can have a completely different focus and really kick ass in certain areas, but then suck in others.
 

Loki726

Senior member
Dec 27, 2003
228
0
0
I think Intel already tried this with the Itanium. I'm not sure how different IA-64 is from x86, but I don't think the results were spectacular.
 

Fox5

Diamond Member
Jan 31, 2005
5,957
7
81
"you can have a completely different focus and really kick ass in certain areas, but then suck in others. "

Basically Itanium.
Itanium can also run x86 code, but it sucks at it because it's emulated.

I don't think Itanium is considered a modern architecture though; at least it's not designed like x86 or PowerPC processors.
 

Calin

Diamond Member
Apr 9, 2001
3,112
0
0
Itanium's instruction set was developed as an explicitly parallel architecture (named EPIC, for Explicitly Parallel Instruction Computing). One could create an instruction bundle that is 8 "atomic" instructions wide and run it on a processor capable of executing a single "atomic" instruction at a time. If you later build a processor 8 times as wide, you get up to 8x the performance.
Itanium has other problems, but I think that with optimized software it can compare with any x86 processor. The near-failure of the Itanium line comes from the very fact that it wasn't able to consistently outrun the existing server-level processors (the Xeon lines).
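
To make the idea concrete, here's a minimal C sketch (my own illustration, not from any Itanium manual; the function names are hypothetical). An EPIC compiler can mark the four independent adds in the first loop as one explicit parallel group, so a wider core runs them all in the same cycle; the pointer-chasing loop is a dependency chain that no amount of issue width can help:

```c
#include <stddef.h>

/* Four independent accumulators: an EPIC compiler can schedule
   all four adds into one explicit parallel group (a wide bundle). */
long sum_independent(const long *a, size_t n) {
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)          /* leftover elements */
        s0 += a[i];
    return s0 + s1 + s2 + s3;
}

struct node { long value; struct node *next; };

/* Each load depends on the previous one: no machine width helps,
   which is exactly where this style of architecture falls flat. */
long sum_chain(const struct node *p) {
    long s = 0;
    while (p) {
        s += p->value;
        p = p->next;
    }
    return s;
}
```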
 

uOpt

Golden Member
Oct 19, 2004
1,628
0
0
I don't know a single person working on compilers for classical programming languages who claimed that he or she could cope with Itanium.

The "optimizations" needed are really highly developed dependency graphs, which are almost impossible to build for languages like C and C++, too complicated for the budget of a Common Lisp or Eiffel implementation, and the Java VM's bytecode maps very badly to it, too.

If people want to make real use of SIMD, they will have to program in languages where they declare the possible parallelism themselves, in the program. It would be like threads now, except at a much smaller level, per loop. Such languages exist but are very unpopular due to other constraints and automatisms that programmers don't like. Most programmers want to do whatever the heck they want and declare things manually. Language researchers and the most efficient and effective programmers in the field have always clashed over language design and will continue to do so. Making a new language with explicit parallelism that pleases hardcore coders would be a huge effort, along the lines of what was required to make C++ into something consistent.

You would need a new language for that. Lisp could cope by just adding native implementations of specialized control constructs, but in effect that's a new language, too.

People are now moving to multi-core instead of SIMD, which allows you to use any language with just a threads library. In addition they have SSE and 3DNow! to put selected hand-coded things into SIMD. A much more realistic approach, IMHO.
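
For the "selected hand-coded things" route, here's a minimal C sketch using SSE intrinsics (my illustration; add_arrays is a hypothetical hot spot). The point is that the programmer, not the compiler, declares the four-wide parallelism:

```c
#include <xmmintrin.h>  /* SSE intrinsics */
#include <stddef.h>

void add_arrays(float *dst, const float *a, const float *b, size_t n) {
    size_t i = 0;
    /* Explicit SIMD: four floats per instruction. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
    /* Scalar tail for the leftover elements. */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```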

Personally, I would have loved to extend a certain Common Lisp compiler to target Itanium, but AMD just rolled over Intel and I had to settle for AMD64.
 

Fox5

Diamond Member
Jan 31, 2005
5,957
7
81
Originally posted by: Calin
Itanium's instruction set was developed as an explicitly parallel architecture (named EPIC, for Explicitly Parallel Instruction Computing). One could create an instruction bundle that is 8 "atomic" instructions wide and run it on a processor capable of executing a single "atomic" instruction at a time. If you later build a processor 8 times as wide, you get up to 8x the performance.
Itanium has other problems, but I think that with optimized software it can compare with any x86 processor. The near-failure of the Itanium line comes from the very fact that it wasn't able to consistently outrun the existing server-level processors (the Xeon lines).


It's not an x86 processor though, and I don't think it's designed for good general-purpose computing performance, but rather so that software can be heavily optimized for it. (Which I think is true of most server CPUs.)

BTW, can't the Athlons operate on up to 9 different instructions at one time? Something like 3 integer ops, 3 floating-point ops, and 3 moves?
So something that made full use of the Athlon's architecture might take two clock cycles to run on a P4? (Which I think can do 5 or 6 instructions at once?)
 

MetalStorm

Member
Dec 22, 2004
148
0
0
Originally posted by: BigB10293
Nothing that super-impressive. You could always buy a mac to check it out.

Yeah, but all that'd show you is that you can have fewer programs and have them running slower, all at the same time! Bargain!

Originally posted by: crazySOB297
Originally posted by: BigB10293
Nothing that super-impressive. You could always buy a mac to check it out.

LOL, writing from a dual 2.0GHz G5

Hahahahahahah... Chump.
 

Calin

Diamond Member
Apr 9, 2001
3,112
0
0
Originally posted by: Fox5
Originally posted by: Calin
Itanium's instruction set was developed as an explicitly parallel architecture (named EPIC, for Explicitly Parallel Instruction Computing). One could create an instruction bundle that is 8 "atomic" instructions wide and run it on a processor capable of executing a single "atomic" instruction at a time. If you later build a processor 8 times as wide, you get up to 8x the performance.
Itanium has other problems, but I think that with optimized software it can compare with any x86 processor. The near-failure of the Itanium line comes from the very fact that it wasn't able to consistently outrun the existing server-level processors (the Xeon lines).


It's not an x86 processor though, and I don't think it's designed for good general-purpose computing performance, but rather so that software can be heavily optimized for it. (Which I think is true of most server CPUs.)

BTW, can't the Athlons operate on up to 9 different instructions at one time? Something like 3 integer ops, 3 floating-point ops, and 3 moves?
So something that made full use of the Athlon's architecture might take two clock cycles to run on a P4? (Which I think can do 5 or 6 instructions at once?)

The Athlon 64 can work on multiple instructions at the same time (just like the Pentium 4 can). But the number of instructions that can be finished per clock is about 3 on the Athlon 64 and 2 on the Pentium 4.
Itanium is not an x86 architecture, but the topic wasn't asking for backwards compatibility.

To MartinCracauer: developing for Itanium could be very difficult at all levels. However, I think there are things that could run very well on it, and also things that will run very slowly... No wonder Itanium is a success in the supercomputer arena (much more so than in the general server arena).
 

ColKurtz

Senior member
Dec 20, 2002
429
0
0
A bunch of fussy, overly aesthetically-obsessed, gourmet-coffee-chugging, tree-hugging, fancy-pants graphics pros say that it's already here. As others have pointed out, it's called a Power chip - aka a Mac. ;)

 

Fox5

Diamond Member
Jan 31, 2005
5,957
7
81
Originally posted by: Calin
Originally posted by: Fox5
Originally posted by: Calin
Itanium's instruction set was developed as an explicitly parallel architecture (named EPIC, for Explicitly Parallel Instruction Computing). One could create an instruction bundle that is 8 "atomic" instructions wide and run it on a processor capable of executing a single "atomic" instruction at a time. If you later build a processor 8 times as wide, you get up to 8x the performance.
Itanium has other problems, but I think that with optimized software it can compare with any x86 processor. The near-failure of the Itanium line comes from the very fact that it wasn't able to consistently outrun the existing server-level processors (the Xeon lines).


It's not an x86 processor though, and I don't think it's designed for good general-purpose computing performance, but rather so that software can be heavily optimized for it. (Which I think is true of most server CPUs.)

BTW, can't the Athlons operate on up to 9 different instructions at one time? Something like 3 integer ops, 3 floating-point ops, and 3 moves?
So something that made full use of the Athlon's architecture might take two clock cycles to run on a P4? (Which I think can do 5 or 6 instructions at once?)

The Athlon 64 can work on multiple instructions at the same time (just like the Pentium 4 can). But the number of instructions that can be finished per clock is about 3 on the Athlon 64 and 2 on the Pentium 4.
Itanium is not an x86 architecture, but the topic wasn't asking for backwards compatibility.

To MartinCracauer: developing for Itanium could be very difficult at all levels. However, I think there are things that could run very well on it, and also things that will run very slowly... No wonder Itanium is a success in the supercomputer arena (much more so than in the general server arena).

Hmm, I tried to look up the exact number and found...
The Athlon can decode up to 3 x86 instructions at a time.
The P4 can only decode 1 x86 instruction at a time, but it can often bypass the decoding process entirely (it only falls back to decoding when it misses its trace cache of already-decoded micro-ops). I didn't find a maximum for the P4 when it hits the trace cache.
However, is the number of instructions that can be decoded at a time the same as the number that can be operated on at a time? If that's the case, then it seems like IBM had the right idea with a more flexible architecture like the G4 (in that it offers the same kind of power the Athlon does, but with fewer transistors).

Hmm, actually I found another site that says the Athlon has 3 floating-point execution units, 3 integer units, and 3 pipelined address-calculation units. But then later on it says it can fetch, decode, and issue up to 3 x86 instructions per cycle, which it can either break down into up to 9 microcode operations for simultaneous execution or pipeline as up to 9 operations (I'm not sure which it means). Looking at another source, that seems to be correct: 3 x86 instructions, which are broken down so that up to 9 operations can be in flight after decode (but only 6 under most circumstances). That doesn't seem directly comparable to a RISC architecture then; I believe the PowerPC 970 core handles up to 15 instructions at one time, but I'm assuming those are native instructions, not decoded ones.

Then there's the Alpha 21264, which does 4 instructions per clock, and the P3, which handles 1 x86 instruction that is then broken down into up to 3 micro-ops.
I wish I knew what the P4 could do, though honestly, judging by how close the Pentium 3 and Pentium M are to the Athlons in performance (even at equal MHz and cache), I'd say the number of instructions a processor can handle at once doesn't matter much in most circumstances. (Specialized benchmark software says otherwise, but it also says Itanium rules.) Then again, I did find a website claiming that benchmarks quickly optimized for the Athlon ran 50% faster than on a P3 of the same speed, but I remember seeing the same for benchmarks optimized for the P3.

Hmm, I still can't find how many instructions the P4 can do... maybe it really is only 1 x86 instruction and 3 micro-ops, but it's more efficient than the Athlon? Oh, the P4 can do 2 integer operations per clock, compared to the Athlon's 3; floating point seems to be 1 versus 3, and the 3 address operations the Athlon can do don't seem to matter. So it's basically 3 micro-ops max versus 6, but the P4 comes closer to its maximum?

How much does x86 decoding hurt modern processors? Does it only cost a pipeline stage or so, a minimal performance hit that's outweighed by AMD's and Intel's production capabilities and the mass market they feed? Or are the instruction-set limitations of x86 a bigger bother?
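
One way to get a feel for how much issue width actually matters is a crude micro-benchmark like the C sketch below (my own, and only suggestive; the numbers vary a lot with compiler settings, since an optimizer may rewrite both loops). A serial dependency chain can only retire one add per cycle, while independent chains let a 3-wide core overlap them:

```c
#include <stdio.h>
#include <time.h>

#define N 300000000L

int main(void) {
    long a = 0, b = 0, c = 0;
    long i;
    clock_t t0, t1, t2;

    t0 = clock();
    for (i = 0; i < N; i++)
        a += i;                 /* one serial dependency chain */
    t1 = clock();
    for (i = 0; i < N; i += 3) {
        a += i;                 /* three independent chains that a */
        b += i + 1;             /* 3-wide core can issue side by   */
        c += i + 2;             /* side in the same clock cycle    */
    }
    t2 = clock();

    printf("serial chain:       %.2fs\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("independent chains: %.2fs\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
    printf("(checksum: %ld)\n", a + b + c);  /* keep the work live */
    return 0;
}
```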
 

EightySix Four

Diamond Member
Jul 17, 2004
5,122
52
91
Originally posted by: MetalStorm
Originally posted by: BigB10293
Nothing that super-impressive. You could always buy a mac to check it out.

Yeah, but all that'd show you is that you can have fewer programs and have them running slower, all at the same time! Bargain!

Originally posted by: crazySOB297
Originally posted by: BigB10293
Nothing that super-impressive. You could always buy a mac to check it out.

LOL, writing from a dual 2.0GHz G5

Hahahahahahah... Chump.

Umm, actually it works well for what it needs to do, and considering that I have Windows PCs and Macs all sitting around me as I type (from WinXP64), I actually prefer the Mac for general stuff but the PCs for gaming and servers.


No, I'm not a Mac addict or anything, but I do hate it when people knock 'em when they haven't been there, done that, and truly played with the Unix core of OS X, which is extremely versatile and powerful, and yet elegant. The G5s hold up well and everything does what it needs to; it's certainly not the fastest thing in the world, but I like it.
 

travers

Junior Member
Apr 1, 2005
4
0
0
x86? Well, the Dothan and Banias actually convert it into micro-ops internally. That is probably the most interesting flavor of x86. Don't forget i386, i586, i686, x86_64... Anyhow, since Intel and AMD are now more interested in fitting more cores on a chip, rather than improving the performance of a single core like they have in the past, x86 is here to stay for a while. PPC is a fun arch, which none other than Linus Torvalds is working on right now. (He got bored of x86 and he's a whore for his shiny G5.)
 

shortylickens

No Lifer
Jul 15, 2003
80,287
17,080
136
Originally posted by: travers
x86? Well, the Dothan and Banias actually convert it into micro-ops internally. That is probably the most interesting flavor of x86. Don't forget i386, i586, i686, x86_64... Anyhow, since Intel and AMD are now more interested in fitting more cores on a chip, rather than improving the performance of a single core like they have in the past, x86 is here to stay for a while. PPC is a fun arch, which none other than Linus Torvalds is working on right now. (He got bored of x86 and he's a whore for his shiny G5.)

I'm not a hardcore geek and don't keep up with such things. Are you saying that Linux for Apples is more than just a hobby now?
 
Nov 11, 2004
10,855
0
0
Originally posted by: shortylickens
Originally posted by: travers
x86? Well, the Dothan and Banias actually convert it into micro-ops internally. That is probably the most interesting flavor of x86. Don't forget i386, i586, i686, x86_64... Anyhow, since Intel and AMD are now more interested in fitting more cores on a chip, rather than improving the performance of a single core like they have in the past, x86 is here to stay for a while. PPC is a fun arch, which none other than Linus Torvalds is working on right now. (He got bored of x86 and he's a whore for his shiny G5.)

I'm not a hardcore geek and don't keep up with such things. Are you saying that Linux for Apples is more than just a hobby now?


Hehe. Unfortunately for us normal people (or hardcore gamers, depending on how you look at it), dual-core won't be a major improvement.
 

Fox5

Diamond Member
Jan 31, 2005
5,957
7
81

But current processors are no longer really CISC or RISC, especially since current x86 processors have RISC back-ends. I've heard the term FISC used to describe current architectures (F = Fat).

I believe the Motorola 68000 was a RISC processor, yet it was in many ways more complicated and unwieldy to produce than many CISC processors.

It's really the x86 instruction set versus more modern instruction sets.
 

travers

Junior Member
Apr 1, 2005
4
0
0
Yeah, dual-core is going to suck for games, at first. Then we will have games with parallelization support. Say you don't have one of those shiny PPUs, and you're playing, like, Half-Life 2, and it slows way down because it has to do some physics calculations. One core could handle the physics while the other does everything else, so your framerate stays up and nothing bogs down except those barrels you just blew up. Or, instead of multi-threaded games, there could be MPI support, so you could essentially run the game on two dies at once, using the speed of both dies, not just one.
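
A minimal sketch of that split using POSIX threads (my illustration; step_physics and render_frame are hypothetical stand-ins for a game's real work):

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-ins for the two workloads. */
static void step_physics(void) { /* integrate the exploding barrels */ }
static void render_frame(void)  { /* draw everything else */ }

static void *physics_loop(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000; i++)
        step_physics();         /* runs on the second core */
    return NULL;
}

int main(void) {
    pthread_t physics;
    if (pthread_create(&physics, NULL, physics_loop, NULL) != 0) {
        perror("pthread_create");
        return 1;
    }
    for (int i = 0; i < 1000; i++)
        render_frame();         /* main thread keeps the framerate up */
    pthread_join(physics, NULL);
    return 0;
}
```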
 

Vee

Senior member
Jun 18, 2004
689
0
0
Originally posted by: Calin
No wonder Itanium is a success in the supercomputers arena (much more so than in the general server arena)

"(much more so than in the general server arena)" is a needed qualifier.
And where would Itanium be if the business DEC-Compaq-HP-Intel ties hadn't conspired to kill off Alpha and PA-Risc? And if Sun hadn't chosen to (commit suicide?) go Niagara, despite recent successes with SPARCIV? Today there's not much choice (non clusters) if you compete with IBM.

On the same note: x86 is a success in the supercomputer arena (clusters).

Itanium is big, expensive and "slow". Despite featuring several times the number of transistors, it's often outperformed by x86, even when running native code. It performs well on some floating-point tasks.
However, the x86 architectures' FP performance could be increased greatly if resources were spent on it. So I'm not so sure Itanium has any real architectural win, even in this case.
Itanium is going to get cheaper, and it might become a winner in the end. But not because of an architectural win; rather because there might be no other choice outside IBM, due to business maneuvering. AMD does not have the financial muscle or technical resources to compete. If AMD continues to provide competition for the Xeon, though, it might be Intel's own Xeon products that end up killing off Itanium.
(A third-party project aiming to produce a chipset - Newisys Horus - that interconnects a larger number of Opterons into a parallel computer does not seem to have been killed off by a recent takeover, so there might be a wrench in the cogs for Intel yet.)

As to the topic: there's not much reason to expect any gains from dropping x86 or CISC. I'm not going to launch into long-winded, system-philosophical arguments about why placing too much emphasis on the compiler is a gross mistake. There are plenty of "competing" architectures that have seen an enormous development effort and still fall short. Itanium, Transmeta, PPC, ARM... ...wow? ...anyone?

****

Some notes to Fox5. Motorola's 68000, or 68k, family were CISC-type CPUs, just like x86. But they started off as 32-bit CPUs with a flat 32-bit address space, and they were superior to contemporary x86 CPUs. Motorola (somewhat unwisely?) killed off the family prematurely, before advances in chip die size had given the advantage back to CISC. They got all high and intoxicated on the RISC mythology: the 88000 and PPC, which have never quite delivered, and now Motorola is out of the CPU business. Ironically, 68k was even capable of delivering superior performance in small, lean cores, and it survived as the "ColdFire" family of embedded processors.

3 is in most senses the width of AMD's K7 and K8. I think it will also be the width of Intel's future cores; 3 seems to be pretty much the ideal width for a CPU that does its own instruction scheduling.
It's a question of what to do with the additional available transistors. Here, both additional cores and longer pipes with higher clocks (maybe even asynchronous designs) seem more attractive than greater width. An exception to this is floating point: more, and/or more general, FP execution units would allow higher vector-math performance and wider vector instructions.
Multiple cores will not be an immediate bonus. But in the long run they will be, very much so, don't doubt it: this is the right way.

I think the "x86 and CISC is bad" -myth primarily fails to consider what are the real bottlenecks for computing performance. Instruction set is not important. RISC, VLIW, EPIC are the solutions to the wrong problems. - And all come with their own luggage of penalties.

Were the dinosaurs "reptiles"? No, they weren't.
Was the great Death of medieval Europe caused by the bubonic plague and spread by fleas and rats? No, it wasn't.
Did the Chinese invent gunpowder? No, they didn't.
Did Magellan circumnavigate the world? No, he didn't.
Are our number symbols of Arabic origin? No, they aren't.
Does x86 carry significant penalties today? No, it doesn't.

People will however continue to believe what they believe. And things that are not true will continue to be stated, even by authorities, again and again.
 

Fox5

Diamond Member
Jan 31, 2005
5,957
7
81
Didn't Motorola's 68k series start as a RISC processor? I mean, there aren't really pure RISC and CISC processors; when do you decide a processor has become CISC or RISC?

And I'm sure that in theoretical circumstances the x86 instruction set could carry severe penalties, but there are many other limits in real life. Plus, x86 may once have constituted a large amount of baggage, but now it's a very small part of the transistor budget.