How AMD and its partners are putting x86 back on the right track

NaughtyusMaximus

Diamond Member
Oct 9, 1999
3,220
0
0
This is from http://www.matrixlist.com/pipermail/pc_support/2002-May/001416.html, linked to on /. by the author. I thought it was a good read; hope you guys enjoy it. :)

---------
When is Intel "IA-32" (aka Intel "x86") complex instruction set
computation (CISC) going to finally die? That question has been asked
ever since MIPS R2000 processors hit the market in the mid-80s. While
the debate of "kill x86" v. "x86 forever" rages on, the company that is
behind the latter might actually be the best one for bringing about the
former, as we will see.

Overview:
- IA-64: When Reality Breaks Theory
- Athlon: The Re-programmable Pentium
- AMD x86-64 and Intel Yamhill
- Digital FX Flashbacks
- Transmetting the Future


- IA-64: WHEN REALITY BREAKS THEORY

The 60s introduced complex instruction set computation (CISC), which was
quickly followed by the birth of the microprocessor at Intel, on which
all its subsequent products would be based. As CISC moved into
superscalar and pipelined design, it was obviously difficult to
optimize. So the 80s brought us reduced instruction set computation
(RISC), drastically reducing logic size and design times, and many CISC
vendors made their switch then and there. Unfortunately, RISC still
didn't solve the issue of only 50% of pipelines being utilized at any
time. So when Intel, having skipped the RISC generation, finally decided
to move away from its CISC backbone, it moved to address the
shortcomings of RISC with an approach for the 21st century known as
explicitly parallel instruction-set computation (EPIC).

EPIC is extremely innovative. It uses heavy compile-time optimizations
to assemble traditionally 32-bit RISC instruction words into a 128-bit
very long instruction word (VLIW) of three 41-bit RISC words and some
control bits. This eliminates a lot of overhead in the run-time design
of the processor, making RISC even more RISC. And in an effort to
completely eliminate the dreadful event of a processor stall caused by
branch misprediction, it introduced the concept of branch "predication"
where both branches are executed and the road not taken result is
discarded when the branch is resolved. Unfortunately, EPIC wasn't as
good in silicon as they thought it would be.
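
For illustration only, the bundle arithmetic above (three 41-bit slots
plus some control bits in one 128-bit word) can be sketched in a few
lines of Python. The 5-bit control field (the "template") sitting in the
low bits follows the published IA-64 bundle layout; the function names
pack_bundle and unpack_bundle are made up for this sketch, and the slot
values are arbitrary numbers, not real instruction encodings.

```python
SLOT_BITS = 41        # width of one instruction slot
TEMPLATE_BITS = 5     # control bits describing the slot types
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template, slots):
    """Pack a template and three slot values into one 128-bit integer."""
    if not 0 <= template <= TEMPLATE_MASK:
        raise ValueError("template does not fit in 5 bits")
    if len(slots) != 3:
        raise ValueError("a bundle holds exactly three slots")
    bundle = template
    for index, slot in enumerate(slots):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("slot does not fit in 41 bits")
        # Slot 0 starts right above the template, slot 1 above slot 0, etc.
        bundle |= slot << (TEMPLATE_BITS + index * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit integer back into (template, [slot0, slot1, slot2])."""
    template = bundle & TEMPLATE_MASK
    slots = [
        (bundle >> (TEMPLATE_BITS + index * SLOT_BITS)) & SLOT_MASK
        for index in range(3)
    ]
    return template, slots
```

The point is simply that 5 + 3 * 41 = 128: the compiler, not the
hardware, decides which three operations travel together.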

The first Intel IA-64 processor, Itanium, wasn't just a flop because it
did not run older IA-32 CISC code well. It failed to really keep its
pipelines 90% full like the EPIC approach promised -- despite heavy
compiler optimization development. And when it came to branch
predication, the savings in "stalls" were not worth the extra, useless
work the processor committed itself to doing by executing the branch
that would not be taken. While Intel is addressing the utilization
issue with the addition of traditional run-time optimization, and even
some traditional branch prediction in its 2nd generation IA-64
processor, "McKinley," even Intel itself is wondering if they have taken
the right approach to transitioning away from CISC IA-32.
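
A back-of-the-envelope model shows why executing both arms of a branch
can be a net loss. Everything here is an assumption made for the sake
of the sketch: the cost unit (instruction slots), the function names,
and every number plugged in. It is not a model of any real Itanium.

```python
def predicated_cost(n_then, n_else):
    """Predication: both arms are always issued, one result is discarded."""
    return n_then + n_else


def predicted_cost(n_then, n_else, p_then, miss_rate, miss_penalty):
    """Prediction: only the taken arm runs, plus an average stall cost."""
    useful_work = p_then * n_then + (1 - p_then) * n_else
    return useful_work + miss_rate * miss_penalty


# Hypothetical branch: two 4-instruction arms, taken half the time,
# with a 10-slot penalty whenever the predictor guesses wrong.
both_arms = predicated_cost(4, 4)
good_predictor = predicted_cost(4, 4, 0.5, 0.05, 10)
bad_predictor = predicted_cost(4, 4, 0.5, 0.50, 10)
```

With these made-up numbers, predication only comes out ahead when the
predictor is no better than a coin flip; with a decent predictor, the
"useless work" on the untaken arm costs more than the stalls it saves.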


- ATHLON: THE RE-PROGRAMMABLE PENTIUM

You've never heard an Intel engineer curse more than when they speak of
Math Matrix eXtensions (MMX) or Streaming SIMD Extensions (SSE). Intel
has not only bloated its CISC IA-32 instruction set with such
concoctions, but has ended up giving its engineering teams all kinds
of tangent designs to figure out how to slap onto their cores. Instead
of evolving their now aged Pentium core design with more general
arithmetic logic unit (ALU) and floating point unit (FPU) pipes and
registers, they slap on more "lossy, application-specific" integer-float
interpolating logic and registers for them. Worse yet is the fact that
they still haven't addressed their "less-than-ideal" out-of-order and
branch prediction units because the whole Pentium series was supposed to
be addressed by IA-64 EPIC/predication by now.

The result is a chip that excels at specific, visual applications where
accuracy is not necessary, but one that is not so fast at general
applications let alone engineering and scientific ones.

While the well-funded Moore and co. design teams were busy either adding
accessories to their Mustang or their prototype that only millionaires
could afford, his former Fairchild colleague Sanders was off spending
the few R&D dollars they had to build a Viper. They took the aged
muscle car approach that they knew worked, refined and modernized it
with more pipes, better branch prediction, lots of buffering into a
solid, efficient, 9-issue core in a few years instead of a decade. Not
the most efficient, easily double the size of an original RISC design,
but it was built to run code written for a 4-decade-old approach. The
result would be known to end-users as the Athlon: a core design that
would serve them a good 5 years before it needed to be overhauled.

AMD has always led Intel in ALU performance and memory loads, and their
branch prediction unit was based on lessons learned in the K6 (which was
overkill). But the Athlon's greatest strength was its 3-issue FPU which
causes Intel headaches to this day. Whenever Intel adds another 50+
opcodes for some fancy, schmancy multimedia niche, AMD just writes some
microcode to leverage its FPU (or ALU in some cases) to do it. So while
Intel has to slap on yet another execution unit and more registers, AMD
just figures out what FPU pipes to use and registers to dedicate to it.
The effort is far less, and more time can be spent optimizing the
accommodation in the existing design, instead of rushing to finish the
"slap on" design, do timing resolution of the new logic with the old,
etc...

Although IA-64 also uses microcode to execute the bloated CISC IA-32
instruction set on its EPIC design, it wasn't designed for it like
Athlon. Seeing the Intel IA-32 team add more and more junk to its
product without giving the IA-64 team a thought reminds me a lot of
another company, whose "Chicago" team did the same with their products
without consulting the other guys in their same company.


- AMD x86-64 AND INTEL YAMHILL

The Athlon also did one more thing for AMD: it gave them their own
hardware platform. No longer did AMD need to wait on Intel to move on
the OEM end; they moved the platform themselves. Sure, the first 6
months were dominated by few products, poor 3rd party support, and
even poor end-product reliability, but the platform boomed in no time,
and by the end of the first year, few OEMs were limiting themselves to
Intel. Now AMD is going to finish the job.

AMD x86-64 brings 64-bit addressing to IA-32 in a fully backward
compatible, similarly performing way. In fact, x86-64 is nothing
special, it's just an Athlon with 64-bit addressing, another pipeline
and more registers now with 64-bit lengths. Nothing major to address in
overall design, other than adding in the addressing/register extensions
and making sure it handles run-time resolution of switching between
legacy and 64-bit modes.

Since IA-64 "McKinley" won't arrive until x86-64 does as well, Intel
realized they had far too many of their eggs in one basket. Although
Intel has not confirmed it, their "Yamhill" project is one to build an
x86-64 compatible processor. This means that Intel had to license AMD
x86-64 which AMD has confirmed. This means engineering bliss for the
future of IA-32. Why?

AMD has a history of not bloating IA-32. Only once have they introduced
instruction set extensions (3DNow!), and those were done to address the
_shortcomings_ of a marketing-driven extension set from Intel (MMX).
Later refinements of those extensions were often just adoptions of Intel
introductions and, as discussed before, done in a way where microcode
was added using the existing ALU/FPU pipes. Now that AMD controls the
ISA as well as its own platform, IA-32 will finally "stabilize" under
AMD's x86-64 leadership. Even Intel marketing will take a "back seat"
for awhile as they cannot even hope to have an x86-64 competitor out
until late 2003 -- a good year behind AMD.


- DIGITAL FX FLASHBACKS

AMD doesn't have the R&D dollars of Intel. Even though they spend a
greater percentage on R&D than Intel (who spends a lot of that on
marketing-related R&D projects), they cannot make a dent in comparison.
So they rely on industry partners who contribute and proliferate
their combined concepts, innovations, ideas and products into a
community designed platform. Nowhere is this more apparent than in the
introduction of their ultra-flexible HyperTransport interconnect, which
is being used by basically everyone outside of Intel, even for Intel
platform systems in some cases.

At the forefront of this are employees of the former company known as
Digital, later owned by Compaq, now owned by HP. These employees built
the most anal of RISC designs, the Alpha microprocessor (uP), and the
most practical of microcontroller (uC) designs, the StrongARM. They
dominated the design of pretty much all of the enterprise-level system
and bus logic and other interconnects: EV6/7, PCI bridges, etc... And
they seeded much of the commodity Ethernet market with their popular
design, the Tulip. Although that collective engineering resource is
gone, their footprint on history continues even today at AMD and
partners like API Networks (fka Alpha Processor, Inc.). And one major
technology they introduced continues to be undervalued.

When Digital created the Alpha, they created an ultra-clean 64-bit
platform for _only_ 32/64-bit computing -- no 8 or 16-bit. This wasn't
by mistake, nor was it just to show how efficient RISC could be when
taken to a level of "analness" like the Alpha. It was a hardware
conduit for an innovative software concept and associated set of tools.
That tool set was FX!32, which silently won award after award for its
approach.

FX!32 was a "binary compiler" (if I may call it that) that not only
run-time emulated software written for another architecture or "byte
code," but did run-time _conversion_ of binary executables and libraries
from one architecture into Alpha. It then further did post-conversion
optimizations on the new Alpha binaries each time it was run -- to try
to further match the execution speed of the original -- and boy did it
come close! It was a brilliant piece of work -- one that Digital needed
not just to sell the NT/Alpha platform as it could run NT/x86 binaries
but, more importantly, to allow users to run VAX/VMS binaries on the
accompanying Alpha/VMS platform. Digital would even go as far as to
introduce FX!32 software for Linux/x86 -> Linux/Alpha and even some
limited UNIX/MIPS -> Alpha/UNIX.
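
The translate-once, reuse-afterwards idea described above can be
sketched as a toy in Python. This is not how FX!32 was actually built:
the two-opcode "guest" instruction set, the Translator class and the
block names are all invented for the sketch. It only shows the shape of
the trick, which is paying the conversion cost once and then running
the converted form on every later call.

```python
def translate_block(block):
    """Convert a list of (opcode, register, immediate) guest tuples
    into a single host-callable function."""
    steps = []
    for opcode, reg, imm in block:
        if opcode == "add":
            def step(regs, reg=reg, imm=imm):
                regs[reg] += imm
        elif opcode == "mul":
            def step(regs, reg=reg, imm=imm):
                regs[reg] *= imm
        else:
            raise ValueError("unknown guest opcode: " + opcode)
        steps.append(step)

    def native(regs):
        for run_step in steps:
            run_step(regs)
        return regs

    return native


class Translator:
    """Caches translated blocks so each guest block is converted once."""

    def __init__(self):
        self.cache = {}
        self.translations = 0

    def run(self, name, block, regs):
        native = self.cache.get(name)
        if native is None:
            # First encounter: translate and remember the result.
            native = translate_block(block)
            self.cache[name] = native
            self.translations += 1
        return native(regs)
```

Running the same named block twice through one Translator only bumps
the translations counter once; the second run goes straight to the
cached host function.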

Digital realized that software runs on the operating system platform,
not just an architecture. While it is common to emulate other software
platforms via library calls or even semi-virtualized hardware on the
same architecture or "byte code" (e.g., VMWare or WINE on x86), Digital
found it was far easier to emulate various other architectures (MIPS,
VAX, x86) on the same software platform (UNIX, VMS, Windows/Linux,
respectively) as theirs (Alpha). And it didn't stop there, because they
could _permanently_convert_ the binaries of those other architectures to
Alpha. That works because binaries are built for a software platform;
the architecture is just an instance of it.

The Digital Alpha technology was licensed to Samsung, AMD and Intel,
with Intel being the owner of the platform now. One has to wonder: if
Intel had known how IA-64 would perform, would they not have bought
Alpha a long time ago and used it as their nexgen, non-CISC platform?
Alpha has _always_ been the highest performing architecture. I mean,
while an 800MHz, 0.18um Itanium toasts even a 2.4GHz, 0.13um Pentium 4
at floating point, even 3-year old, 600MHz 0.35um Alpha 264s
_outperform_ that same Itanium by an even wider margin! You add in the
fact that FX!32 on Alpha _greatly_outperforms_ Itanium when it comes to
running x86 binaries, and one can only wonder if we wouldn't have 64-bit
Intel Alpha chips now, running at 4GHz at 0.13um, with fully supported
FX!32 software for running legacy Windows and Linux binaries on it. And
instead of talking about "fixing" IA-64 with "McKinley," we'd be talking
about the new Alpha 364 design that is the best of both worlds --
adopting Intel EPIC ideas like compile-time optimization to improve RISC
run-time utilization.

- TRANSMETTING THE FUTURE

So what's my point? The main reason we have NOT seen something like
FX!32 is because Intel keeps extending IA-32 and toying with IA-64.
Yeah, so, Intel finally owns Alpha now, and while McKinley and later
IA-64s will benefit, it's far too little, far too late. Now that AMD is
commanding IA-32 c/o their 64-bit x86-64 -- maybe, just maybe, the
AMD-API guys are thinking about going beyond legacy CISC IA-32. Maybe
they are thinking of building their own 128-bit VLIW design. Or doesn't
someone else already have one???

Yes, one company does. In fact, they looked at it a little
differently. Instead of writing some add-on systems software that lets
one architecture run the software written for the same platform as
another, this company put it in the firmware and that's all it does! It
doesn't even market its own, natively running software but _always_ runs
the foreign bytecode. The Transmeta Crusoe architecture is a 128-bit
VLIW RISC design that has virtually _no_ microcode at the core, but uses
a software/firmware-driven principle known as "code morphing" to take
another bytecode and break it down into its raw, native VLIW words at
run-time. "Code morphing" is yet another innovative approach based on
the simple fact that x86 bytecode rules the landscape and, like FX!32,
on the fact that it is easier to run the same software platform on a
different architecture than a different software platform on the same
architecture. Furthermore, why else do you think they hired the guy who
wrote the first operating system against the full Intel i386+MMU
specification, Linus Torvalds? Because he knew x86 bytecode in and
out! And guess who is also a licensee of the Transmeta IP?

Yeah, the same company who is now in control of IA-32, AMD. Makes you
wonder where this is all leading. Let me piece together my
predictions for you ...

- As the new leader in x86-64, AMD will "permanently stabilize" IA-32
ISA. x86 bytecode will now be a "standard" that doesn't change.

- A new, 2nd generation 128-bit VLIW using HyperTransport will be born
out of the AMD-API-Transmeta alliance. This chip, unlike Crusoe, will
have native versions of 64-bit Windows and Linux released for it.

- A merger of FX!32 and Code Morphing concepts will lead to an improved
"binary compiler" for both Windows and Linux. You will still have to
run Windows/VLIW2 to run Windows/x86[-64] and Linux/VLIW2 to run
Linux/x86[-64] binaries, respectively, but it will finally move people
away from IA-32/x86 by 2006-2007.

IMHO, if this happens, Intel will have its issues go exponential. Not
only will they have a tough time proving to people that IA-64 is viable
versus this new VLIW2, but their other strategy revolves around the,
"now dying," x86-64 ISA. Since IA-64 hasn't "caught on" yet and there
is a very good chance that even 2nd gen "McKinley" won't either (the
consumer version isn't due until late 2003), the only chance Intel has
is to go x86-64 "full bore" and keep people from moving off it. So
we're back to Intel actually being the "x86 forever" guys!

I could be wrong about AMD looking at VLIW. But something tells me all
those former Alpha engineers are salivating over the Transmeta technology
-- or at least when thinking about making improvements to what they have
already done. If this new "binary compiler" becomes available, x86 may
very well die regardless of Linux adoption. In fact, Linux desktop
adoption helps Intel with IA-64, so maybe continued Windows/x86 usage is
in Transmeta-AMD's favor? So maybe the support of Microsoft by AMD is
not so blind, eh?

It's just hard to tell. But it's harder to sit by and see good ideas
and innovations that could easily move us away from x86 inefficiency to
a new, RISC-like, VLIW bytecode platform not happen in the next 5
years. Because if it is going to happen, there is more chance of it
coming from the AMD-Transmeta partnership than from Intel and its IA-64
IMHO. This is ironic because it is AMD who is keeping x86 alive with
x86-64, yet their "seizing control" of it is our best chance of
stabilizing it and getting off it. Because like Microsoft with its
Win/NT-ignorant Win/DOS market, Intel cannot keep their IA-32
marketeers from ruining any chance IA-64 has.

-- Bryan
 

Ben50

Senior member
Apr 29, 2001
421
0
0
That was a good read with a lot of interesting ideas. As a frequent shareholder in AMD, I hope that AMD can take control of x86 with their new Hammer processors. I might make as much as I did when they came out with the Athlon and their stock price soared to $97 from around $16.
 

jcmkk

Golden Member
Jun 22, 2001
1,159
0
0
Interesting. Am I the only one that actually read this? I'm still trying to digest it, but I may have some comments later.
 

joohang

Lifer
Oct 22, 2000
12,340
1
0
Originally posted by: jcmkk
Interesting. Am I the only one that actually read this? I'm still trying to digest it, but I may have some comments later.

I copied it to my iPAQ so that I can read it on the bus later when I go out.
 

Booster

Diamond Member
May 4, 2002
4,380
0
0
Hmmm... AMD has never been setting standards, and I don't think they ever will. Remember the VLB bus by AMD? Probably not. We only know PCI now, and VLB was AMD's alternative to the present day PCI by Intel. Then again, the 3Dnow! technology. Did many, if any at all, applications use it? I don't think so. Even if they did, they didn't seem to benefit from them. AMD had to adopt Intel's SSE to make their CPUs look better in the eyes of potential customers. And would everybody be looking forward to Hammer if it didn't feature SSE2 support?

I'm really interested to see how Hammer actually performs and, no less important, if it will have thermal protection logic similar to that of the P4. After AMD made the Hammer look exactly like its P4 rival, I hope they did include some thermal protection in it.

My guess is that Hammer will be a very fast 32-bit CPU, but its 64-bit powers may never even be used, just like the 3DNow! technology. Well, even if that turns out to be so, it's going to be a nice upgrade option.
 

Lord Evermore

Diamond Member
Oct 10, 1999
9,558
0
76
I'd certainly want Hammer if it didn't support SSE2. Nothing I've ever used has required or even used SSE, SSE2, or 3DNow of any flavor, other than MPEG encoders like Flask.
 

gregor7777

Platinum Member
Nov 16, 2001
2,758
0
71
Originally posted by: Booster
. And would everybody be looking forward to Hammer if it didn't feature SSE2 support?


What a massive marketing juggernaut convinces people is right and that they should buy is often a far cry from what is actually the better option in the long run.
 

Sohcan

Platinum Member
Oct 10, 1999
2,127
0
0
I wish I had more time to comment on this, but I have to run....

There's some good points, but some obvious head-scratchers:

But the Athlon's greatest strength was its 3-issue FPU which causes Intel headaches to this day
A 3-way issue FPU does not directly lead to strong FP performance. The P3 had the same 2-way FP unit that the P4 has....the difference in FP performance is more directly related to FXCH instruction behavior, instruction latency, and data alignment.


Whenever Intel adds another 50+ opcodes for some fancy, schmancy multimedia niche, AMD just writes some
microcode to leverage its FPU (or ALU in some cases) to do it.
Um, this is just ridiculous and hypocritical. I can be very sure that the Athlon does not use microcode to execute SSE. Why criticize Intel for implementing SSE, then praise AMD for adopting it?


Since IA-64 "McKinley" won't arrive until x86-64 does as well
McKinley will be released this month, Hammer is still unsure.


Although Intel has not confirmed it, their "Yamhill" project is one to build an
x86-64 compatible processor.
Even the Yamhill rumors are unsure of whether it is an x86-64 MPU or Intel's own 64-bit x86 extensions...


This means that Intel had to license AMD x86-64 which AMD has confirmed
Really? I'd like to see some official word on this...


This means engineering bliss for the future of IA-32. Why?
LOL, whatever...x86-64 may be a (small) step in the right direction, but calling any extension of x86 an "engineering bliss" is pretty laughable. x86-64 does nothing to address x87 floating-point, as well as the decoding nightmare that x86 is with its numerous prefixes and addressing modes.


AMD has a history of not bloating IA-32. Only once have they introduced
instruction set extensions (3DNow!), and those were done to address the
_shortcomings_ of a marketing-driven extension set from Intel (MMX).
Um, this is a bit hypocritical to praise 3DNow yet criticise Intel's SIMD efforts...both are far from perfect.


Now that AMD controls the ISA as well as its own platform, IA-32 will finally "stabilize" under AMD's x86-64 leadership. Even Intel marketing will take a "back seat" for awhile as they cannot even hope to have an x86-64 competitor out
until late 2003 -- a good year behind AMD.
This is assuming a lot...specifically that Intel won't develop their own 64-bit x86 extensions. Intel still owns around 80% of the x86 market, and commands where it goes. I hope x86-64 does well, but he's doing a bit too much wishful thinking.
 

splice

Golden Member
Jun 6, 2001
1,275
0
0
Hmmm... AMD has never been setting standards, and I don't think they ever will. Remember the VLB bus by AMD? Probably not. We only know PCI now, and VLB was AMD's alternative to the present day PCI by Intel.

That is absolutely wrong! VLB (VESA Local-Bus) was developed by VESA (Video Electronics Standards Association), and implemented in the early 90's! I clearly remember having a VLB Video Card (made by Western Digital) on my INTEL 486DX2-66 Gateway System.
 

ElFenix

Elite Member
Super Moderator
Mar 20, 2000
102,396
8,559
126
nice read.


alpha still rules.

intel should abandon ia 64 and go alpha.

probably would if they didn't suffer from "not invented here" syndrome.
 

Booster

Diamond Member
May 4, 2002
4,380
0
0
That is absolutely wrong! VLB (VESA Local-Bus) was developed by VESA (Video Electronics Standards Association), and implemented in the early 90's! I clearly remember having a VLB Video Card (made by Western Digital) on my INTEL 486DX2-66 Gateway System.

Well, it was so long ago I could prolly forget :p. Don't take it all too serious, anyway. It was not AMD, but VESA. Just fine. Any difference? :D
 

MadRat

Lifer
Oct 14, 1999
11,975
294
126
<<LOL, whatever...x86-64 may be a (small) step in the right direction, but calling any extension of x86 an "engineering bliss" is pretty laughable. x86-64 does nothing to address x87 floating-point, as well as the decoding nightmare that x86 is with its numerous prefixes and addressing modes.>>

All this talk about x86, x87, SiMD... Sohcan, what was x88 again?
 

splice

Golden Member
Jun 6, 2001
1,275
0
0
Originally posted by: MadRat
<<LOL, whatever...x86-64 may be a (small) step in the right direction, but calling any extension of x86 an "engineering bliss" is pretty laughable. x86-64 does nothing to address x87 floating-point, as well as the decoding nightmare that x86 is with its numerous prefixes and addressing modes.>>

All this talk about x86, x87, SiMD... Sohcan, what was x88 again?

The 8088 was exactly like the 16-bit 8086 except it had an 8-bit external data bus. I think Intel did this to make it compatible with current (at the time) 8-bit data busses. ;)
 

imgod2u

Senior member
Sep 16, 2000
993
0
0
I particularly liked the idea of the firmware emulation ability. When Transmeta first came up with this I thought it could've gone very far. Too bad they weren't big enough to make it stick. Intel, on the other hand, certainly has the power to implement such a thing successfully. I wonder why they don't. It could certainly help out IA-64 a lot when it does come down to the consumer level. As for IA-64's performance, correct me if I'm wrong but doesn't Itanium at 800MHz perform quite similarly to competing UltraSPARC and POWER MPUs?
 

Soulkeeper

Diamond Member
Nov 23, 2001
6,732
155
106
now that's one of the best reads i have seen in the forums in months
my brain hurts mmmmmmm
 

Idoxash

Senior member
Apr 30, 2001
615
0
0
dudes this sounds very kewl indeed for AMD.... man why can't that be sooner thoe eh!
 
