Prescott to have AMD's x86-64 extensions?

ST4RCUTTER

Platinum Member
Perhaps Yamhill will not show up after all? Could Intel actually manage to package x86-64 extensions into a "revised" P4? I think it will be interesting to see how Intel acknowledges to the industry how it will be using AMD technology in their new processor (if this becomes a reality). Would Intel incorporate Hypertransport as well? 3GIO is still too far away for products based on that chip interconnect method which is why I ask. The next year is really going to be interesting...


 


<< I think it will be interesting to see how Intel acknowledges to the industry how it will be using AMD technology in their new processor (if this becomes a reality). >>



If they do this, then I think Intel should adopt the name Paul DeMone coined for Yamhill - "Imitanium" 😀

greg
 


<< Would Intel incorporate Hypertransport as well? >>



They would be foolish not to, IMO. While the move to 64-bit will be worthwhile, I believe that moving from the old-fashioned bus architecture to HyperTransport will do more to increase performance than 64-bitness will, especially in the short to medium term.

Greg
 


<< If they do this, then I think Intel should adopt the name Paul DeMone coined for Yamhill - "Imitanium" 😀 >>


LMAO
 
I think 3GIO is closer than you think. I believe there's already a server system planned for sometime soon that incorporates some of 3GIO's concepts. 3GIO includes HyperTransport as one of its features, I think, though I could be wrong. Either way, those are way down the road, in 2003 I think.
 
3GIO and HyperTransport are opposing standards.


There is discussion that they can co-exist. HyperTransport is much more of a chip-level protocol, while 3GIO is designed more for detachable I/O devices (like NICs and bus expansions, to replace PCI perhaps?)

Then there's IEEE 1394b that will take over as the external bus when it gets thrown in there.


I think HyperTransport is much more scalable than 3GIO, from what I remember. But also a bit slower at the high end.

Don't know exactly what Intel will use- but I suspect they MAY even use both. 🙂

Eric
 
Even though they're opposing standards, they actually fit side by side, since HyperTransport has no replacement for PCI, and 3GIO does.

If Intel can add that to the P4, it's going to be weird: a new socket, again. There will be bottlenecks and serious performance problems from just adding x86-64 to the P4. It already suffers from not getting enough data to keep its pipeline full; the Hammer series from AMD is basically a K7, but redesigned so that it is able to get enough data to process. Imagine the die size on those P4s? They are big enough as it is. Intel would also lose out on the server market if they offered a chip with extensions like that.
 
<< Would Intel incorporate Hypertransport as well? >>



To do that, Intel would have to buy into the AMD consortium as well. It's not a lot, but Intel begging for AMD's technology would sure be funny. And then Intel would have to share a bit of their technology with everyone else, right?
 
Intel already shares a lot of technology with AMD. Their patent agreement for the x86 ISA itself has been extended for years. I think it would actually benefit AMD to let Intel use their technology, as it would help standardize it. If Intel went another route, that would be 80% of the desktop CPU market going that route, and whatever AMD uses would be fighting an uphill battle to gain acceptance. If Intel did adopt the 64-bit extensions to the x86 ISA and HyperTransport, it'd go a long way towards helping both technologies gain wide acceptance throughout the market.
 
heh ..

I've said it before and I'll say it again;

AMD and Intel aughta come here to find new employees 🙂
Either that or you're all full a schitt.
 
I noticed some interesting talk about ClawHammer over at Aceshardware. I'll just post the most interesting stuff here:

<<<

"There are a number of improvements due to x86-64.

1) The addition of 8 new GPRs allows more registers to be used for temporary data and intermediate results instead of memory, which is found in L1 and takes an additional 6 cycles to store/retrieve, with only 2 accesses started per cycle. Registers take only 1 cycle, and 9 can be accessed per cycle (6 for the 3 integer ALUs and 3 for the 3 AGUs). This will lead to about a 10-20% improvement when code is compiled to take advantage of it.

2) The above also causes a secondary effect, because the additional GPRs force more registers into the virtual (reorder) pool. This increases the reorder window, causing fewer stalls and more performance. This will add about 1-2% to IA-32 code.

3) The addition of 8 new SSE/SSE2 128-bit registers. This does the same for floating point as the 8 GPRs do for integer performance. It also increases the virtual pool for floating point, as in point 2. Performance increases here will be between 20 and 40%, with special cases over 100% (3x3 matrix multiplication).

4) 64-bit addressing (in the Hammer family implementation, 40-bit physical and 48-bit virtual). This far exceeds any IA-32 CPU in memory: 256 times more memory and 4096 times more virtual memory than a Xeon. Much larger simulation problems and other workstation and supercomputer tasks can be done. Also, large databases routinely exceed 100GB and some exceed 1TB, for which standard rules of thumb require between 10 and 100GB of main memory, plus program code and data requirements above that. Xeons and their IA-32 brethren can't use that much memory, so their database performance falls off in the larger uses. In addition, OSes like Linux use unused (idle) memory to cache disk (compared to memory, disk is very slow: roughly 40x in bandwidth and 100x in access time). With the larger memory footprints, it is conceivable that future systems may do it all in memory rather than reading/writing from disk. This is the reason solid state disks are popular with the large DB super servers. And your IA-32 programs do not need to be recompiled to take advantage of this. The OS, like Linux, will do this if it is compiled for x86-64 "Long" (64-bit) mode. So any application that uses disk will be much faster. Just think, those games with long load times due to the amount of reading from the CD (or later DVD) will load in a tenth or a hundredth of the time. Level switches under 1 second give a more seamless run-and-gun time for those FPS fanatics out there.

Some additional benefits accrue to changes in the implementation of the Hammer family of x86-64:

5) The x86 front end has three full decoders versus the 1 full and 2 simple decoders of the Athlon. This will allow more IPC even in IA-32 code. This may increase performance 5 to 10%.

6) A new stage will attempt to combine micro-ops into fewer micro-ops to increase the number of x86(-64) instructions that can be scheduled at the same time. This may increase performance 2 to 4%.

7) Additional TLB size will allow fewer memory references to be used in virtual memory mapping, adding 2 to 4% more performance.

8) Better address prediction will eliminate some stalls and improve performance another 2 to 4%.

9) Some branches may have both sides taken in speculative execution, which means there will be fewer stalls, improving performance between 1 and 5% depending on how often this is done.

10) On-die DRAM controller. This will shave cycles off of DRAM latency, improving effective bandwidth and shortening latency by tens of cycles. This will be a large boost in performance of 10 to 20% depending on application.

11) HT links between CPUs for "glueless" SMP. This makes the Hammers scale much closer to 1:1 as the number of CPUs increases. Most SMP boxes get a 50 to 80% increase from the second CPU, 25 to 50% of a CPU for the third, and 10 to 25% for the 4th. Hammer will probably add 95% for the second CPU, 90% for the third and 85% for the fourth. This adds up to 1.85 to 2.55 effective CPUs for a quad FSB-based SMP box versus 3.7 effective CPUs for Hammer. This is because each CPU has local memory and can get remote memory in about 140ns, versus 300ns or more, if possible at all, for chipset-based systems.

12) Multiple I/O HT links are possible, to have more devices attached. One single-CPU Clawhammer could have an 8x AGP slot, two 3-slot PCI-X busses, a 5-slot PCI bus, and all of those other SB-based peripherals attached. This is far larger than most x86 servers by other makers, including Intel - even more so for dual Clawhammer SMP, and more still for quad or octal Sledgehammer SMP boards.

Overall, even with current 1GB registered DDR DIMMs, Sledgehammer could have as much as 64GB of memory (there are some 2GB DIMMs showing up, so 128GB is just around the corner) with about 42GB/sec total memory bandwidth and 51.2GB/sec total HT I/O bandwidth given 8 Sledgehammer CPU dies. This is far beyond any current x86-based systems, or indeed anything but exotic supercomputer platforms.

All of this yields some large improvements in x86 (IA-32) performance, more with a 64-bit x86-64 OS, and even more with 64-bit compiled applications. The latter may get more than 50% improvement overall compared to an equally clocked Tbred and may be faster than a double-clocked P4 NW. For those huge dataset problems, P4 NW will be left in the dust, forcing Intel to add x86-64 to their CPUs, which is rumored to be in the works.

Pete"

>>>
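Out of curiosity, the arithmetic in Pete's post actually checks out. Here's a quick sketch that reproduces his numbers; note the "256x" and "4096x" figures only work if you take plain 32-bit (4GB) physical and 36-bit (64GB PAE) virtual limits as the baselines - that's my assumption about what he was comparing against, since the post doesn't say:

```python
# Sanity-check the figures in the quoted Aceshardware post.
# Baselines (32-bit physical, 36-bit virtual) are assumptions,
# not stated in the post itself.

PHYS_BITS = 40   # Hammer physical address bits (per point 4)
VIRT_BITS = 48   # Hammer virtual address bits (per point 4)

phys_bytes = 2 ** PHYS_BITS   # 1 TB physical address space
virt_bytes = 2 ** VIRT_BITS   # 256 TB virtual address space

# "256 times more memory" matches a plain 32-bit (4 GB) baseline:
assert phys_bytes // 2 ** 32 == 256
# "4096 times more virtual memory" matches a 36-bit baseline:
assert virt_bytes // 2 ** 36 == 4096

# Point 11: effective CPU counts for a quad SMP box.
fsb_low  = 1 + 0.50 + 0.25 + 0.10   # worst-case FSB scaling
fsb_high = 1 + 0.80 + 0.50 + 0.25   # best-case FSB scaling
hammer   = 1 + 0.95 + 0.90 + 0.85   # claimed Hammer scaling

print(f"FSB quad: {fsb_low:.2f} to {fsb_high:.2f} effective CPUs")
print(f"Hammer quad: {hammer:.2f} effective CPUs")
# -> FSB quad: 1.85 to 2.55 effective CPUs
# -> Hammer quad: 3.70 effective CPUs
```

The 3.7 vs 1.85-2.55 gap is the whole argument for glueless SMP in one line of addition.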
 
Wow, it really looks like Intel is going to have to use AMD's implementation of x86-64. I can't imagine it though; it will be a MAJOR embarrassment to Intel, and especially their CEO Craig Barrett.
 
Whoa!

Quite a load of data there, AGodspeed. I think it's quite clear that the former Alpha engineers who now reside at AMD have been quite busy. I think going the route AMD has (architecture over speed) was one they didn't really have a choice about. Intel has the capital to create new packaging (such as BBUL) that will be a necessity as they ramp to higher clock speeds. AMD just can't go this route...at least for now. Either way, the next year is going to be damn exciting.
 