AMD vs Intel at the high end in the future

SickBeast

Lifer
Jul 21, 2000
14,377
19
81
Originally posted by: Idontcare
Originally posted by: SickBeast
AMD wants to win a "core race"? 16 cores on the Bulldozer? What are people going to do with all those cores? I'm being serious.

Presumably the same thing they'd do with it if it were a single-core 30GHz chip...run some computationally intensive application that needs the speed of the day and thus is compiled by the producer to best extract the performance available at that time.

What application is that? I don't know, super pi maybe? :laugh:

I just don't see games utilizing 16 cores in 2 years time. Call me crazy.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: Nemesis 1
Support and capability are not the same thing. As I said, read the paper. You work for Intel.

Intel left no room for AMD in AVX. Read the paper. Just because AMD is AVX compatible doesn't mean much: compatible is not the same as capable. Read the paper, it's here.

AMD isn't calling their ISA extensions "AVX" yet are they?

That to me is the most obvious/telling fact that their ISA isn't the same as Intel's for these instruction set extensions. If AMD says their chip has SSE3 then it has to have SSE3...but if they say "we've got FMULDIV32A and so too does AVX, so you know, we kinda have AVX by extension" then they don't really.

Regardless it means nothing to you and me unless you or I happen to write compilers. The differences in ISA will be handled by compilers. We do care about the raw intrinsic capability of the ISA though as in the end the programs we run will either run faster or slower on one ISA or the other, and that matters to us.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
You're a smart guy, I know that. You work at Intel. But Informal is wrong.

This is from AMDs PDF.

The XOP and CVT16 instructions utilize a new three-byte XOP prefix preceding the opcode byte.
This prefix replaces the use of the 0F, 66, F2 and F3 prefix bytes and the REX prefix and encodes
additional information as well. The FMA4 instructions utilize the new AVX VEX prefix which
provides similar encoding capabilities.


Until I see FMA working on x86 like it does on Itanic, I am not buying it. Sun tried and failed miserably.
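
For readers wondering what a fused multiply-add buys numerically, here is a minimal sketch. It is not AMD's FMA4 or Intel's design; it only emulates the idea in Python, assuming finite double-precision inputs. The function names and the sample values are hypothetical, picked so the difference is visible. An unfused a*b+c rounds twice (after the multiply and after the add), while a fused operation rounds once at the end.

```python
from fractions import Fraction

def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Multiply, round to double, then add and round again (two roundings)."""
    product = a * b          # first rounding happens here
    return product + c       # second rounding happens here

def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: compute a*b+c exactly, round once.

    Fraction arithmetic is exact for finite floats, and float(Fraction)
    rounds the exact result to the nearest double.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

# Illustrative inputs: the exact product is 1 - 2**-54, which does not fit
# in a double and rounds up to 1.0 when the multiply is done on its own.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

print("unfused:", unfused_multiply_add(a, b, c))
print("fused:  ", fused_multiply_add(a, b, c))
```

With these inputs the unfused version loses the small term entirely, while the fused version keeps it. How fast a given chip executes such an instruction is a separate question from whether it supports it.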
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
You already know where I stand on compilers. If it comes down to AMD software vs. Intel software, this game is over. AMD needs to do something original with hardware rather than copying Intel. Who made AMD Intel's official monkey on its back? If a company can't stand on its own merits, it should fall. AMD stole Intel's tech, then is low enough to whine about unfair business practices. Excuse me, but thieves shouldn't point fingers.

I like the new ATI/AMD, but AMD did steal and that's a fact. How can a one-fab company hope to compete with a multi-fab company? Intel's costs are lower per chip, plus AMD pays Intel for every chip they produce. It's in the agreement. So for AMD to have a fair playing field, Intel can't sign exclusive contracts. Yet they can supply HP with every chip it needs. AMD has never had that capability. It's all a scam. Nothing works as it should; everything has gone from light gray to very dark. I mean everything.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Nemesis, what exactly did AMD steal from Intel? I'm not disagreeing with you stating it as fact, I just don't remember the specifics of how that turned out. I wasn't following the industry that closely at the time, but to my recollection AMD was exonerated of having stolen anything from Intel, and in fact Intel was required to license x86 to AMD out of the settlement.

My recollection is probably wrong, but what is it specifically that you are recalling as "AMD stole Intel's tech"?
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Originally posted by: Idontcare
Originally posted by: Nemesis 1
Support and capability are not the same thing. As I said, read the paper. You work for Intel.

Intel left no room for AMD in AVX. Read the paper. Just because AMD is AVX compatible doesn't mean much: compatible is not the same as capable. Read the paper, it's here.

AMD isn't calling their ISA extensions "AVX" yet are they?

That to me is the most obvious/telling fact that their ISA isn't the same as Intel's for these instruction set extensions. If AMD says their chip has SSE3 then it has to have SSE3...but if they say "we've got FMULDIV32A and so too does AVX, so you know, we kinda have AVX by extension" then they don't really.

Regardless it means nothing to you and me unless you or I happen to write compilers. The differences in ISA will be handled by compilers. We do care about the raw intrinsic capability of the ISA though as in the end the programs we run will either run faster or slower on one ISA or the other, and that matters to us.

Go over to Beyond3D and read Intel's AVX stuff. While you're there, see if you can find me.

 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Out-of-court settlement and this damned agreement between the 2. If any company needs to be investigated, it's IBM.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Originally posted by: IntelUser2000
Nemesis 1, English isn't your strength, I take it. "10-20% gains per clock on overclocking" doesn't make sense.

Sorry! I feel real good since my last visit to Mexico. So I have another PC running, trying to do 2 things at once. A step forward.

I meant: no comment on overclocking, and you know exactly why.

 

ilkhan

Golden Member
Jul 21, 2006
1,117
1
0
because posting twice in a row wasn't sufficient?
And no, your grammar is almost always atrocious. Reading what you type is painful, most often.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Ya I know.

Say, Idontcare, isn't it a fact, or is my recall incorrect, that Larrabee is listed in Intel's PDF as being IA-64?
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
Originally posted by: Idontcare
Originally posted by: Nemesis 1
Support and capability are not the same thing. As I said, read the paper. You work for Intel.

Intel left no room for AMD in AVX. Read the paper. Just because AMD is AVX compatible doesn't mean much: compatible is not the same as capable. Read the paper, it's here.

AMD isn't calling their ISA extensions "AVX" yet are they?

That to me is the most obvious/telling fact that their ISA isn't the same as Intel's for these instruction set extensions. If AMD says their chip has SSE3 then it has to have SSE3...but if they say "we've got FMULDIV32A and so too does AVX, so you know, we kinda have AVX by extension" then they don't really.

Regardless it means nothing to you and me unless you or I happen to write compilers. The differences in ISA will be handled by compilers. We do care about the raw intrinsic capability of the ISA though as in the end the programs we run will either run faster or slower on one ISA or the other, and that matters to us.

Remember in the Athlon 64 days, while it did support SSE/SSE2 and later SSE3 instructions, the improvement in performance was "less" than what Intel got because K8 processors ran SSE instructions in 2 cycles.

Phenom I changed that to 1 cycle, thus improving the gain from SSE optimizations. I think this is what Nemesis is getting at: while the first iteration of Bulldozer may support AVX, the implementation may not be the best, either because of time constraints or other problems. This does not mean that Bulldozer would not benefit from AVX, it just means Intel would have a bigger performance gain from software that supports it.

I don't agree with Nemesis about AMD not being able to have a "complete" AVX implementation in the future, as both companies have a full cross-licensing agreement for stuff like that.

The thing is, will AMD even survive till that time...

 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Originally posted by: Kuzi

Remember in the Athlon 64 days, while it did support SSE/SSE2 and later SSE3 instructions, the improvement in performance was "less" than what Intel got because K8 processors ran SSE instructions in 2 cycles.

Phenom I changed that to 1 cycle, thus improving the gain from SSE optimizations. I think this is what Nemesis is getting at: while the first iteration of Bulldozer may support AVX, the implementation may not be the best, either because of time constraints or other problems. This does not mean that Bulldozer would not benefit from AVX, it just means Intel would have a bigger performance gain from software that supports it.

You mean comparing to Pentium 4's SSE2?? Athlon 64 was only slower than Pentium 4 in SSE2 because, first of all, developers had more time to optimize for Pentium 4's SSE2, which isn't exactly similar to Athlon 64's SSE2 because of the vast architectural differences, even though the instruction set is the same. Plus, SSE2 liked Pentium 4's high clock speed; the architectural negatives weren't really a "negative" for SSE2.

We'll see if Intel's and AMD's AVX implementations are really different outside of processor architecture differences.

Nemesis 1: Larrabee is x86
 

evolucion8

Platinum Member
Jun 17, 2005
2,867
3
81
Originally posted by: Nemesis 1
Originally posted by: IntelUser2000
Nemesis 1, English isn't your strength, I take it. "10-20% gains per clock on overclocking" doesn't make sense.

Sorry! I feel real good since my last visit to Mexico. So I have another PC running, trying to do 2 things at once. A step forward.

I meant: no comment on overclocking, and you know exactly why.

That means that your brain's implementation of SMT is slower and takes more processing cycles than the Intel counterpart :laugh:
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
Originally posted by: IntelUser2000
You mean comparing to Pentium 4's SSE2?? Athlon 64 was only slower than Pentium 4 in SSE2 because, first of all, developers had more time to optimize for Pentium 4's SSE2, which isn't exactly similar to Athlon 64's SSE2 because of the vast architectural differences, even though the instruction set is the same. Plus, SSE2 liked Pentium 4's high clock speed; the architectural negatives weren't really a "negative" for SSE2.

I think even compared to Core 2 SSE2, performance in general was slower on Athlon 64. This is what I remember at least. K10 improved the SSE units and gained some performance from that.

But anyways will AVX be that important at release? If AMD had it or not at that time, will there be many programs optimized for AVX?

Looking back at older instruction sets AMD added MMX, SSE/2/3 etc after Intel but that wasn't a big deal really since software had to catch up anyways.

Originally posted by: evolucion8
That means that your brain's implementation of SMT is slower and takes more processing cycles than the Intel counterpart :laugh:

LOL :D
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
Originally posted by: IntelUser2000
Originally posted by: Kuzi

Remember in the Athlon 64 days, while it did support SSE/SSE2 and later SSE3 instructions, the improvement in performance was "less" than what Intel got because K8 processors ran SSE instructions in 2 cycles.

Phenom I changed that to 1 cycle, thus improving the gain from SSE optimizations. I think this is what Nemesis is getting at: while the first iteration of Bulldozer may support AVX, the implementation may not be the best, either because of time constraints or other problems. This does not mean that Bulldozer would not benefit from AVX, it just means Intel would have a bigger performance gain from software that supports it.

You mean comparing to Pentium 4's SSE2?? Athlon 64 was only slower than Pentium 4 in SSE2 because, first of all, developers had more time to optimize for Pentium 4's SSE2, which isn't exactly similar to Athlon 64's SSE2 because of the vast architectural differences, even though the instruction set is the same. Plus, SSE2 liked Pentium 4's high clock speed; the architectural negatives weren't really a "negative" for SSE2.

We'll see if Intel's and AMD's AVX implementations are really different outside of processor architecture differences.

Nemesis 1: Larrabee is x86

Ya, I went and read the PDF. It says 64-bit, not IA-64. God, I know IA-64 has come up tho. I just can't recall where. The wife has the stuff on her PC, but I can't recall the file name.

I am not so sure that AMD will ever be able to do Intel AVX the way Intel does. Intel relies on the compiler to make the VEX prefix on SSE2 functional. If you follow my rants, you will recall AMD was bitching about Intel compilers, saying "but we can't do that." I posted it here. That is something AMD and Intel do not share by agreement. THAT'S where the magic is. Reread the info. After it's pointed out to ya, it sticks out like a sore thumb.
 

Nemesis 1

Lifer
Dec 30, 2006
11,366
2
0
OK. I have run into a brick wall. Google is no help.

On Larrabee, native Larrabee is running on a software layer. Now, as I understand it, that means Intel is morphing. It's an x86 processor, so it's not morphing x86, right? So what is it morphing? Or did I miss this thing altogether out of ignorance?
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: Nemesis 1
Go over to Beyond3D and read Intel's AVX stuff. While you're there, see if you can find me.

Can't find a single thread there relating to AVX...have a link or a thread title so I can find it?
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Originally posted by: Kuzi

I think even compared to Core 2 SSE2, performance in general was slower on Athlon 64. This is what I remember at least. K10 improved the SSE units and gained some performance from that.

But anyways will AVX be that important at release? If AMD had it or not at that time, will there be many programs optimized for AVX?

Looking back at older instruction sets AMD added MMX, SSE/2/3 etc after Intel but that wasn't a big deal really since software had to catch up anyways.

Hmm. I don't think you realize the differences then. Pentium 4 introduced SSE2, which expanded the width of the register to 128 bits. However, the hardware still took 2 cycles to execute a 128-bit SSE2 instruction, and it did so by splitting it into two halves and executing 64 bits each. Core 2 introduced single-cycle SSE2 execution, which meant that a 128-bit instruction now took only a single cycle to execute.

Although Athlon 64 has supported SSE2 since its first implementation, full 128-bit execution didn't happen until Barcelona/K10.
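
To put rough numbers on the split described above, here is a back-of-the-envelope sketch. It is a simplified model, not a description of any real pipeline: it assumes a vector operation is cut into datapath-sized pieces that issue one per cycle, and it ignores latency, scheduling and memory. The function names are made up for illustration.

```python
def cycles_per_vector_op(vector_bits: int, datapath_bits: int) -> int:
    """Cycles to push one vector op through a datapath of the given width.

    Simplified model: the op is split into datapath-sized pieces and one
    piece issues per cycle (ceiling division).
    """
    if vector_bits <= 0 or datapath_bits <= 0:
        raise ValueError("widths must be positive")
    return -(-vector_bits // datapath_bits)

def total_issue_cycles(num_ops: int, vector_bits: int, datapath_bits: int) -> int:
    """Issue cycles for a stream of independent vector ops under the same model."""
    return num_ops * cycles_per_vector_op(vector_bits, datapath_bits)

# Same 128-bit instruction stream, two execution-unit widths.
ops = 1_000_000
print("64-bit units :", total_issue_cycles(ops, 128, 64), "cycles")
print("128-bit units:", total_issue_cycles(ops, 128, 128), "cycles")
```

Under this model the same binary needs twice the issue cycles on the narrower units, which is the gap being described: identical instruction set, different execution width.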
 

Kuzi

Senior member
Sep 16, 2007
572
0
0
Originally posted by: IntelUser2000
Hmm. I don't think you realize the differences then. Pentium 4 introduced SSE2, which expanded the width of the register to 128 bits. However, the hardware still took 2 cycles to execute a 128-bit SSE2 instruction, and it did so by splitting it into two halves and executing 64 bits each. Core 2 introduced single-cycle SSE2 execution, which meant that a 128-bit instruction now took only a single cycle to execute.

Although Athlon 64 has supported SSE2 since its first implementation, full 128-bit execution didn't happen until Barcelona/K10.

Thanks for the explanation, I kind of forgot all the details, it's been a while since P4 and Athlon 64 were released :)

My point was that while a certain CPU can support a certain instruction set, the execution time may not necessarily be as fast as other competing CPUs. I think this is what Nemesis was getting at when talking about the AVX implementation in Bulldozer, it might be supported but may not run as fast as on Sandy Bridge.

But I believe the IPC, clock speed, and SMT (if any) of Bulldozer will be much more important than AVX at release, this is what AMD should worry about really.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: Kuzi
Originally posted by: IntelUser2000
Hmm. I don't think you realize the differences then. Pentium 4 introduced SSE2, which expanded the width of the register to 128 bits. However, the hardware still took 2 cycles to execute a 128-bit SSE2 instruction, and it did so by splitting it into two halves and executing 64 bits each. Core 2 introduced single-cycle SSE2 execution, which meant that a 128-bit instruction now took only a single cycle to execute.

Although Athlon 64 has supported SSE2 since its first implementation, full 128-bit execution didn't happen until Barcelona/K10.

Thanks for the explanation, I kind of forgot all the details, it's been a while since P4 and Athlon 64 were released :)

My point was that while a certain CPU can support a certain instruction set, the execution time may not necessarily be as fast as other competing CPUs. I think this is what Nemesis was getting at when talking about the AVX implementation in Bulldozer, it might be supported but may not run as fast as on Sandy Bridge.

But I believe the IPC, clock speed, and SMT (if any) of Bulldozer will be much more important than AVX at release, this is what AMD should worry about really.

Hi Kuzi, yes, what you are getting at is called "instruction latency", basically the number of clock cycles needed to fully complete a given instruction.

Being capable of (compatible with) executing an instruction (this is the ISA) has very little to do with how quickly the instruction is executed...that part depends on the architecture.

By the way Everest has a really cool/handy tool for generating the information on instruction latency for the user's processor. Here's a post on it. And check this thread for posts by dmens on reading the output correctly (I had misinterpreted the output in my initial post on it in the thread, dmens clarified a few posts later).
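
A quick sketch of why latency matters even when two chips support the same instruction. The latency figures below are made-up placeholders, not Everest output or measurements of any real processor, and the model only covers the worst case where each instruction has to wait for the previous result.

```python
def dependent_chain_cycles(num_instructions: int, latency_cycles: int) -> int:
    """Cycles to finish a chain where every instruction needs the prior result.

    With a true dependency chain nothing can overlap, so the total is simply
    the instruction count times the per-instruction latency.
    """
    if num_instructions < 0 or latency_cycles < 1:
        raise ValueError("need a non-negative count and a latency of at least 1")
    return num_instructions * latency_cycles

# Two hypothetical architectures implementing the same instruction (same ISA),
# with different latencies for it. Values are placeholders for illustration.
hypothetical_latency = {"arch_a": 2, "arch_b": 4}

for name, latency in hypothetical_latency.items():
    cycles = dependent_chain_cycles(1000, latency)
    print(f"{name}: {cycles} cycles for a 1000-instruction dependent chain")
```

Independent instructions can overlap in a pipelined core, so real code usually lands somewhere between this worst case and the chip's peak throughput.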
 

amenx

Diamond Member
Dec 17, 2004
4,525
2,862
136
As long as Intel's financial position remains superior to AMD's, they will have the edge in producing faster, better chips. If AMD's finances improved enough to sustain the sort of R&D expenditures that Intel is capable of, it would be a different story. Despite the odds, AMD has come out with some surprisingly good chips lately; I only hope they can continue to do so.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Not fully sure at the moment, but as it stands right now...

Westmere: lower-latency caches (more bets on L1)
Sandy Bridge: 512KB L2 cache and bigger L3 cache per core

Westmere should show what Nehalem would have been without the 4-cycle L1 cache latency, and I think we will see another 6-7% performance increase from that alone. Cache-related enhancements will be there on Westmere, just not the conventional "let's increase the cache sizes".
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Originally posted by: IntelUser2000
Not fully sure at the moment, but as it stands right now...

Westmere: lower-latency caches (more bets on L1)
Sandy Bridge: 512KB L2 cache and bigger L3 cache per core

Westmere should show what Nehalem would have been without the 4-cycle L1 cache latency, and I think we will see another 6-7% performance increase from that alone. Cache-related enhancements will be there on Westmere, just not the conventional "let's increase the cache sizes".

It's too bad the cache sizes are to remain the same, as the L1/L2 cache sizes surely contribute a significant share of the "hyperthreading penalty": when two threads run on the same core, they effectively cut the cache in half as far as each thread is concerned.
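
The halving argument, as simple arithmetic. This is a rough sketch that assumes the threads split each cache evenly, which real workloads will not do exactly. The cache sizes are illustrative placeholders rather than figures quoted in this thread, and the function name is hypothetical.

```python
def cache_per_thread_kib(cache_kib: float, threads_on_core: int) -> float:
    """Effective cache per thread if the threads share one core's cache evenly."""
    if threads_on_core < 1:
        raise ValueError("at least one thread must be running on the core")
    return cache_kib / threads_on_core

# Placeholder per-core cache sizes in KiB, chosen only for illustration.
per_core_cache_kib = {"L1 data": 32, "L2": 256}

for level, size in per_core_cache_kib.items():
    alone = cache_per_thread_kib(size, 1)
    shared = cache_per_thread_kib(size, 2)
    print(f"{level}: {alone:.0f} KiB with one thread, {shared:.0f} KiB each with two")
```

Under that even-split assumption, a larger cache per core would directly raise what each thread sees when both hardware threads are busy.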