Originally posted by: AlphaQ
How will it affect gaming?
Originally posted by: AlphaQ
Well, wouldn't 32 bit games running in a 64 bit system get a performance boost? Or shouldn't they at least...
Originally posted by: Acanthus
Originally posted by: AlphaQ
How will it affect gaming?
That's still a big unknown: drivers are not properly optimised, and there are no 64-bit games to test with yet.
Originally posted by: blurredvision
When is the 64-bit Windows OS scheduled to release anyhow? I haven't heard anything for a while now.
Originally posted by: foxkm
I doubt that 64-bit will make a big difference in gaming. Most of the floating-point calculations are done on the GPU these days (64-bit registers mainly speed up large floating-point calculations).
64-bit GP registers speed up floating-point calculations? Why would the size of the GP registers affect FP operations? FP registers have been 80 bits for a long time (386?).
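For what it's worth, the place where wider GP registers clearly help is integer math wider than 32 bits, not floating point, which has its own registers. Here is a rough Python sketch of why: with only 32-bit registers, a 64-bit add has to be done as two 32-bit adds with a carry between them, whereas a 64-bit register does it in one step. The function name is made up for illustration, and Python integers stand in for registers.

```python
MASK32 = 0xFFFFFFFF


def add64_with_32bit_registers(a: int, b: int) -> int:
    """Add two unsigned 64-bit values using only 32-bit-wide steps."""
    # Split each operand into the halves a 32-bit register could hold.
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # Step 1: add the low halves and remember the carry out of bit 31.
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32
    lo = lo_sum & MASK32

    # Step 2: add the high halves plus the carry (wraps at 64 bits).
    hi = (a_hi + b_hi + carry) & MASK32

    return (hi << 32) | lo
```

None of this touches the FP unit, which is why register width by itself shouldn't be expected to change floating-point throughput.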
Originally posted by: pm
While there are some benefits of 64-bit code, 64-bit data and pointers are also twice as big as their 32-bit counterparts, which has the effect of reducing the number of speculative prefetches you can do and effectively halves the usefulness of a given cache size. Some apps will speed up, but some will slow down too.
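To put rough numbers on the cache point, here is a back-of-the-envelope sketch in Python. The node layout (two pointers plus a small payload), the 512 KiB cache size, and the function names are all assumptions chosen for illustration, not measurements of any real CPU or program.

```python
def node_size(pointer_bytes: int, n_pointers: int = 2, payload_bytes: int = 4) -> int:
    """Size of a linked-structure node, rounded up to pointer alignment."""
    raw = n_pointers * pointer_bytes + payload_bytes
    # Round up to a multiple of the pointer size (a typical struct alignment rule).
    return -(-raw // pointer_bytes) * pointer_bytes


def nodes_in_cache(cache_bytes: int, pointer_bytes: int, **layout) -> int:
    """How many whole nodes fit in a cache of the given size."""
    return cache_bytes // node_size(pointer_bytes, **layout)


CACHE = 512 * 1024  # assumed cache size in bytes

# Pointer-heavy node: two pointers and a 4-byte value.
small_32 = nodes_in_cache(CACHE, 4)
small_64 = nodes_in_cache(CACHE, 8)

# Payload-heavy node: two pointers and 56 bytes of data.
big_32 = nodes_in_cache(CACHE, 4, payload_bytes=56)
big_64 = nodes_in_cache(CACHE, 8, payload_bytes=56)
```

Under these assumptions the pointer-heavy node doubles from 12 to 24 bytes, so half as many fit in the cache, while the payload-heavy node only grows from 64 to 72 bytes. That matches the point that the penalty depends on how many pointers are involved.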
Originally posted by: Matthias99
OTOH, at least on amd64, you have twice as many registers for your program to use (and even if the program is running in 32-bit mode, a 64-bit OS can use the other registers to avoid having to displace program data as often, at least theoretically).
And isn't the effective cache size *already* halved, or does the Athlon 64 have the capability to access the cache in 32-bit pieces in its native 32-bit-only mode?
Originally posted by: pm
I'm not denying that there are benefits. I'm just saying that it's not all good news and there are many reasons (I can list a few more too) why performance can go down - in some cases by a significant amount when there are a lot of pointers involved.
I honestly don't know. I don't know much about the microarchitectural implementation of the Athlon 64. But I would be very surprised if it couldn't - how else would it work? Pad the rest of it?
Originally posted by: Matthias99
Well, yeah. The simplest cache hardware implementation (I would think) would be to load and store the whole 64-bit register all the time, and then in 32-bit mode the CPU simply ignores the upper 32 bits.
I was thinking that having the capability to split the cache like this would be a huge PITA (and it probably is), but it could be done. You'd have to essentially have cache logic that could switch from 32-bit to 64-bit mode as well, and data/address lines that could be split in half and multiplexed to different registers simultaneously, wouldn't you? I would think that would slow it down a *lot*, but maybe there's more logic already between the registers and cache than I'm taking into account, and it wouldn't hurt it that much to add another layer of indirection.
Originally posted by: pm
In general, designing a CPU is a horrendous PITA.
I can only speak to caches whose design I have worked on. When I worked on the load and store buffers, I read out the entire cache line for a read and then steered the "critical chunk" (whatever size that chunk needed to be) to the output. So, using fake numbers (as much as anything because I can't remember what the real ones were), I'd read out 256 bits (not counting ECC) and then, knowing specifically which part of the line I wanted, I'd mux out the 64 bits I wanted and send them off. If we needed smaller quantities, 16 bits or 32 bits, I'd swizzle those out too, down to a resolution of 8 bits.
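As a software analogy for that read-then-steer step, here is a minimal Python sketch. It uses the same made-up 256-bit line width as above, models the line as one big integer in little-endian byte order, and the function name is hypothetical. Real hardware does this with muxes in parallel, not shifts and masks.

```python
LINE_BITS = 256  # fake line width, as in the description above (ECC ignored)


def steer_chunk(line: int, byte_offset: int, size_bits: int) -> int:
    """Pick a chunk of 8 to 64 bits out of a full cache line at byte resolution."""
    if size_bits % 8 != 0 or not 8 <= size_bits <= 64:
        raise ValueError("chunk size must be a multiple of 8 between 8 and 64 bits")
    if byte_offset < 0 or byte_offset * 8 + size_bits > LINE_BITS:
        raise ValueError("chunk does not fit inside the line")
    # The whole line is always read; the shift and mask play the role of the mux.
    return (line >> (byte_offset * 8)) & ((1 << size_bits) - 1)


# A line whose byte at offset i simply holds the value i.
line = int.from_bytes(bytes(range(32)), "little")

qword = steer_chunk(line, 8, 64)  # a 64-bit chunk starting at byte 8
word = steer_chunk(line, 2, 16)   # a 16-bit chunk starting at byte 2
byte = steer_chunk(line, 5, 8)    # a single byte at offset 5
```

The point of the sketch is that the access width (8, 16, 32 or 64 bits) only changes the final selection, not how much is read out of the array.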
Originally posted by: Matthias99
Ah, OK. I was thinking the natural thing would be to build in cache lines that were 64 bits wide, but if they're already using wider ones, they would need this kind of logic anyway. That makes sense.