New Platform

Jerboy

Banned
Oct 27, 2001
5,190
0
0
Are we soon going to hit the end of the x86 platform? It has been the same since the very first 86. It went to 32bit from 16bit somewhere in its history. That's about it. We are just cranking up the clock speed more and more. Doesn't this compare well to keeping a motor at the same torque and increasing power by pumping more RPM? I think it's about time to see MAJOR improvement in personal computing, such as more bandwidth and RISC-based processing.
 

LordDoug

Junior Member
Nov 11, 2001
3
0
0
I hear AMD has a pretty good solution in the works for 64/32bit. As far as giving up all the software and hardware I have invested in just to go to 64bit tomorrow...that ain't gonna happen. Maybe down the line in a few years, as I upgrade almost continuously, but Intel's idea that they can just MOVE everyone to a 64bit system seems a bit ambitious. I would like to see an architecture change because, as you said, x86 is getting a tad old. But so are AGP, PCI, and IDE. Maybe we should just move to a multi-processor, multi-bus solution, even if it involves a move to 64/32bit at the same time. I could deal with mainstream software designed to use 4 processors at once. Ahhh...to dream.
 

Locutus4657

Senior member
Oct 9, 2001
209
0
0
Actually, x86 design has changed quite a bit over the years and now features many RISC-like implementations. So no, it's not really like that; the IPC of x86 has changed quite a bit since the original implementation.




<< Are we soon going to hit the end of the x86 platform? It has been the same since the very first 86. It went to 32bit from 16bit somewhere in its history. That's about it. We are just cranking up the clock speed more and more. Doesn't this compare well to keeping a motor at the same torque and increasing power by pumping more RPM? I think it's about time to see MAJOR improvement in personal computing, such as more bandwidth and RISC-based processing. >>

 

Sohcan

Platinum Member
Oct 10, 1999
2,127
0
0


<< It went to 32bit from 16bit somewhere in its history. That's about it. >>

That's far from it...to only consider the switch from the 16-bit to 32-bit instruction set completely ignores so many of the changes x86 has made. Consider just some of the generational improvements:

386: Virtual paging, 32-bit extensions, orthogonal registers
486: Pipelining, on-die floating-point unit, on-die L1 cache
Pentium: Superscalar architecture, SIMD computing (MMX, added with the Pentium MMX)
Pentium Pro (P6 family): Out-of-order execution, x86 -> micro-RISC op decoding, decoupled execution, on-die L2 cache (Coppermine Pentium III)
Athlon: on-die L2 cache (TBird), wider-superscalar
Pentium 4: decoupled instruction decoding, hyper-pipelining
Future: Integrated memory controller, chip-level multiprocessing, 64-bit extensions, double-sized general purpose register set (Hammer); simultaneous multithreading (future P4 chips)

There is much more to the performance improvement than the increase in clock speed...out-of-order execution alone is said to improve performance by 30%. x86's ISA may hurt the platform: the x86 -> micro-RISC op decoding adds complexity and pipeline stages, and the small register set forces more memory accesses (exposing memory latency) and limits the programming model the compiler sees. Even so, a switch to a pure RISC ISA would be far from revolutionary.

Keep in mind that x86 is by far the most successful ISA ever created. If you really think that the x86 ISA is holding back the platform, go to aceshardware.com and check out their SPEC database. High-end RISC definitely has the highest IPC (partially due to its wider superscalar implementations and large caches), but the Athlon and P4 are still near the top of the class for SPEC performance. The reason is that you can't look at only the ISA to determine the performance of a platform. Engineering and process technology can make up for ISA deficiencies. Intel's process technology is often a generation ahead of the high-end RISC vendors'; historically, this (along with higher production volume) has been one of the larger factors in MPU performance.
 

Sohcan

Platinum Member
Oct 10, 1999
2,127
0
0


<< Well clock-per-clock which faster? Motorola G4 733MHz or Pentium III 733MHz? >>

That question is too much of an oversimplification to answer directly. The Iron Law says that execution time = seconds/cycle (inverse of clock rate) * cycles/instruction * instructions/program. The last two terms are extremely software dependent; they vary depending on the program, instruction set, and the compiler used. For these reasons it's difficult to make any comparisons between CPUs of different instruction sets, since the same benchmark binary cannot be run directly on both platforms.
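To make the Iron Law concrete, here is a minimal Python sketch. The two CPUs and all of their numbers are hypothetical, picked only to show that a higher clock rate does not settle the comparison by itself; the function name is mine, not anything standard.

```python
def execution_time(instructions, cycles_per_instruction, clock_hz):
    """Iron Law: time = instructions/program * cycles/instruction * seconds/cycle."""
    seconds_per_cycle = 1.0 / clock_hz
    return instructions * cycles_per_instruction * seconds_per_cycle

# Hypothetical CPU A: 2GHz, but averages 1.5 cycles per instruction.
cpu_a = execution_time(instructions=1_000_000_000,
                       cycles_per_instruction=1.5,
                       clock_hz=2.0e9)

# Hypothetical CPU B: 1GHz, needs 20% more instructions for the same
# program (different ISA and compiler), but averages 0.8 cycles per instruction.
cpu_b = execution_time(instructions=1_200_000_000,
                       cycles_per_instruction=0.8,
                       clock_hz=1.0e9)

print(f"CPU A: {cpu_a:.2f} s")
print(f"CPU B: {cpu_b:.2f} s")
```

With these made-up inputs CPU A finishes first, but only by a modest margin despite having twice the clock rate. Change the CPI or instruction count a little and the order flips, which is why all three terms have to be measured on real software.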

For this reason, the industry uses SPEC, a standardized benchmark suite that is distributed as source code and compiled for each platform. While it's not the perfect test of performance, nothing really is. The only official numbers that seem to be available from Motorola place the G4 733's SPECint 95 at 32.1, and its SPECfp 95 at 23.9 (for some reason they're using the outdated SPEC 95 instead of 2000). SPEC's numbers for the P3 733 place it at 35.2 for SPECint 95 and 30.5 for SPECfp 95.
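Since both chips run at 733MHz, the ratio of the scores is also the clock-for-clock ratio on these particular tests. A quick Python sketch using only the four scores quoted above (the dictionary layout and function name are just for illustration):

```python
# SPEC 95 scores quoted above for the two 733MHz parts.
scores = {
    "G4 733": {"SPECint95": 32.1, "SPECfp95": 23.9},
    "P3 733": {"SPECint95": 35.2, "SPECfp95": 30.5},
}

def relative_score(cpu, baseline, test):
    """Ratio of one CPU's score to the baseline CPU's score on a given test."""
    return scores[cpu][test] / scores[baseline][test]

for test in ("SPECint95", "SPECfp95"):
    ratio = relative_score("P3 733", "G4 733", test)
    print(f"{test}: P3/G4 = {ratio:.2f}")
```

That works out to the P3 scoring roughly 10% higher on SPECint 95 and roughly 28% higher on SPECfp 95 at the same clock.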

The G4 is a great processor, but as you can see, you shouldn't believe Apple's marketing. x86 CPUs, notably the P3/P4 and Athlon, tend to have better floating-point performance, due partially to their wider superscalar floating-point implementations (2 FP units on the P3/P4 and 3 on the Athlon vs. 1 on the G4). While x87's stack-based architecture is ridiculous, the use of latency-free FXCH instructions on the P3 and Athlon can emulate a register-based instruction set. x87's two-operand format is also inferior to PowerPC's three-operand format, since it forces it to rely more on memory latency and bandwidth; on the other hand, the P3/P4/Athlon's generally faster caches, larger re-order buffers, and larger system memory bandwidth help work around the problem. The G4, for its part, tends to match or exceed them in integer performance. Also, its SIMD set (Altivec) is superior to 3DNow! and SSE1/2 (check out 3 1/2 SIMD Architectures for details).

Check out Ace's SPECmine database...in SPEC 2000, the 1.6GHz AthlonXP is beaten only by the 1.3GHz IBM Power4. The 2GHz P4 is beaten only by the aforementioned Athlon and Power4, as well as the 1GHz Alpha EV68. In SPECfp, the 2GHz P4 and 1.6GHz Athlon take 5th and 9th place, respectively. Granted, the P4/Athlon can't come close to touching high-end RISC in terms of system & CPU scalability, system bandwidth, and cache size, but don't sell x86 short...especially when you consider the huge price difference between x86 and high-end RISC.

edit: If it seems like I'm coming down on you, Jerboy, I didn't mean to do so. While the ISA is important, IMHO it's not as important as the microarchitecture, engineering talent, and manufacturing process technology. x86's clunky ISA certainly has an effect on the performance and design of its CPUs. Converting x86 instructions to fixed-length micro-RISC ops facilitates pipelining and superscalar execution, but it adds a few pipeline stages (which increases the branch misprediction penalty) and makes the CPUs hotter and more complex. x86's two-operand format (a = a + b instead of the three-operand a = b + c) makes it rely on memory more often. x86's small set of 8 general purpose registers has the same effect, and it limits the programming model the compiler sees (and prevents some cool compiler tricks). These last two detrimental effects can be eased with fast caches w/high hit rates...and register renaming can remove the extra write-after-read and write-after-write dependencies that the small register set creates.
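To show what register renaming buys you, here is a toy Python sketch. It is a simplification under loud assumptions: instructions are just (destination, sources) tuples, the "physical registers" p0, p1, ... are unlimited (real CPUs have a finite pool and a free list), and the four-instruction program is a made-up two-operand sequence that keeps reusing eax.

```python
from itertools import count

def rename(instructions):
    """Give every write a fresh physical register; reads use the latest mapping."""
    fresh = count()
    mapping = {}   # architectural register -> current physical register
    renamed = []
    for dest, srcs in instructions:
        phys_srcs = []
        for src in srcs:
            if src not in mapping:            # value was live before the snippet
                mapping[src] = f"p{next(fresh)}"
            phys_srcs.append(mapping[src])
        mapping[dest] = f"p{next(fresh)}"     # new name for every write
        renamed.append((mapping[dest], tuple(phys_srcs)))
    return renamed

def false_dependencies(instructions):
    """Count write-after-read and write-after-write pairs (name reuse only)."""
    war = waw = 0
    for i, (dest_i, srcs_i) in enumerate(instructions):
        for dest_j, _ in instructions[i + 1:]:
            if dest_j in srcs_i:
                war += 1
            if dest_j == dest_i:
                waw += 1
    return war, waw

# Hypothetical two-operand sequence: the destination doubles as a source,
# so eax is reused for two unrelated computations.
program = [
    ("eax", ("eax", "ebx")),   # add eax, ebx
    ("ecx", ("eax",)),         # mov ecx, eax
    ("eax", ("edx",)),         # mov eax, edx   (unrelated new value)
    ("eax", ("eax", "esi")),   # add eax, esi
]

before = false_dependencies(program)
after = false_dependencies(rename(program))
print("WAR/WAW pairs before renaming:", before)
print("WAR/WAW pairs after renaming: ", after)
```

The true data flow (the second instruction still reads what the first produced) survives renaming; only the dependencies caused by reusing the name eax go away, so the second pair of instructions no longer has to wait for the first pair.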

So while an elegant ISA makes the CPU architect's, engineer's, and compiler writer's jobs easier, the performance increase from going from x86 to a RISC ISA would be evolutionary, not revolutionary. It's said that AMD Hammer's 16-register set (vs. 8 with x86) in x86-64 mode will increase performance by 5-10%...certainly an incremental improvement, but nothing revolutionary.
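A rough way to picture why 16 registers help: count how many values are live at once and compare that with the register count. The Python sketch below assumes each value has a simple (start, end) live range with an exclusive end, and the twelve overlapping ranges are invented for illustration; real register allocators are far more involved.

```python
def max_live(intervals):
    """Peak number of simultaneously live values, given (start, end) live ranges."""
    events = []
    for start, end in intervals:
        events.append((start, 1))     # value becomes live
        events.append((end, -1))      # value dies (end is exclusive)
    live = peak = 0
    for _, delta in sorted(events):   # at equal times, deaths sort before births
        live += delta
        peak = max(peak, live)
    return peak

def min_values_in_memory(intervals, num_registers):
    """Lower bound on values that cannot sit in registers at the busiest point."""
    return max(0, max_live(intervals) - num_registers)

# Hypothetical inner loop with twelve values that are all live at once.
live_ranges = [(i, 20 + i) for i in range(12)]

for regs in (8, 16):
    print(f"{regs} registers: at least "
          f"{min_values_in_memory(live_ranges, regs)} values kept in memory")
```

In this made-up case the 8-register machine has to keep at least four values in memory while the 16-register machine keeps none, which is the kind of memory traffic the larger register set is meant to avoid.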
 

LordDoug

Junior Member
Nov 11, 2001
3
0
0
Well spoken, Sohcan.
Now if we can address the problem of the Lead Law: Those that want a computer watch TV and their money becomes too heavy...so they drop it on the nearest machine regardless. Then whine about what they got.
Solution: Buy what you need.
 

Jerboy

Banned
Oct 27, 2001
5,190
0
0






Are you looking at just the integer performance or FP performance as well? I read somewhere that comparing even a 2GHz P4 with a 650MHz Alpha RISC processor is an earth-to-moon comparison.
 

miniMUNCH

Diamond Member
Nov 16, 2000
4,159
0
0
Well, we always have IBM's Power4 chip as a standard, but that's a different paradigm and a very different price wrt the desktop market...a lot of people think that the industry will return to a central computing / thin client platform for business infrastructures...but we all still need something for the house and no one wants to pay out the whazoo for it.

There's nothing really wrong with today's chip platforms from a personal computing perspective...Athlon, P3/4, and G3/4/5 (coming soon) offer plenty of computing power for the individual. Unless you're gaming, how often does your CPU load remain above 20% for more than 5 seconds? Now, with computer video gaming "peeling" off into the embedded system market...why do most of us need fast computers?

Business platforms are completely different...I venture to state that any business (that's a little too vague and broad, I know) should not be running Windows anything on their servers...they're wasting money...overall, I think Microsoft, as currently seen today, is on its way out...Linux and "bigger iron", centralized platforms (IBM, HP, Sun large servers running Linux or Unix/zOS w/ Linux partitions and multiple virtual servers) will replace most MS installs in the business world. MS will have to fight, like Apple, for a segmented install base while Linux and Unix/zOS installs will garner the majority of the business world.

I think there's a place there for Itanium and Hammer and the future evolution of those platforms but at the lower end of implementation.
 

Sohcan

Platinum Member
Oct 10, 1999
2,127
0
0


<< Are you looking at just the integer performance or FP performance as well? >>

Yes, read this paragraph that I posted:


<< Check out Ace's SPECmine database...in SPEC 2000, the 1.6GHz AthlonXP is only beat by the 1.3GHz IBM Power4. The 2GHz P4 is only beat by the aforementioned Athlon and Power4, as well as the 1GHz Alpha EV68. In SPECfp, the 2GHz P4 and 1.6GHz Athlon take 5th and 9th place, respectively. Granted that the P4/Athlon can't come close to touching high-end RISC in terms of system & CPU scalability, system bandwidth, and cache size, but don't sell x86 short...especially when you consider the huge price difference between x86 and high-end RISC. >>


SPECint is the integer performance test, SPECfp is the floating-point test. Here is a summary of SPECfp results....there was a brief period a few months ago, before the 1.3GHz Power4, 1GHz Alpha EV68, 833MHz Alpha EV6 (using the new compiler), and 1.05GHz US-III scores were officially reported, in which the 2GHz P4 held the highest score in SPECfp. The new 1.6GHz AthlonXP also recently posted a very respectable score. Though SPEC is the best test of cross-platform integer and floating-point performance, the fact that the P4 briefly held the top spot doesn't mean it can replace an Alpha, Power4, or US-III in an enterprise or database system. These systems require high scalability, large system bandwidth, and large caches. High-end RISC has these qualities because of its target market, though, not because of its ISA....x86 could have high scalability, large system bandwidth, and large caches too (can you say AMD's Sledgehammer? :)), though it would lose its price point.

So high-end RISC may achieve better performance/MHz in SPEC, but you can't look only at the x86 vs. RISC ISA for an explanation, since there are many more variables. x86 apologists look at the Athlon/P3/P4's decoupled execution and x86 -> micro-RISC op conversion, and say that the ISA doesn't matter anymore. Mac apologists look only at the ISA and say that RISC will always be superior. Sensible people look at the whole picture: ISA, microarchitecture, clock rate, engineering, process technology, compiler development, scalability, bandwidth, cache size, etc.

To find the reason high-end RISC performs so well despite its deficit in clock speed compared to the Athlon/P4, you have to look at this whole picture. Besides the aforementioned cache size and system bandwidth, high-end RISC employs more robust microarchitectures, since its vendors can afford to spend more die area on logic given their target market. The Alpha EV6 is a four-way fetch superscalar out-of-order core, compared to the Athlon's and P4's three-way fetch superscalar out-of-order cores. Part of the reason the Alpha can take advantage of a wider superscalar core is its RISC ISA: a 4-way x86 CPU would likely gain little extra parallelism over the existing 3-way CPUs, since it has fewer general purpose registers and has to rely on memory more often.

The IBM Power4 has unheard-of performance, but it has two CPU cores on a die and 4 dies on a multi-chip module (MCM), for a total of 8-way SMP on a single MCM (though SPEC isn't multithreaded and doesn't use SMP). In addition, each die has about 1.4MB of on-die L2 cache (shared by its two cores) and 32MB of off-die L3 cache...the final MCM has over 16 square inches of silicon. It is so ridiculously expensive, I wouldn't be surprised if it costs over $10,000-$20,000 just for the MCM.