Originally posted by: CTho9305
Originally posted by: Gamingphreek
Jesus Christ, that's his first post. That is incredibly advanced!! I'm speechless (I'm referring to devnull)... I'll have to sit for a long time and look at that to interpret it. Then I'll have to ask some people to help. Wow, AWESOME post!!
Just out of curiosity, what is the fastest, most expensive, "best" processor in the world?
First, "best" and "most expensive" are not necessarily the same. Case in point: at a given performance level, you'll generally pay less for an AMD CPU than an Intel CPU. Second, there is no one "best" processor. It really depends on the task.
If your task has a 32 MB working set, it would crawl on any Opteron or Itanium, since you'd be pounding on the RAM, but it would absolutely FLY on one of these HP PA-RISC CPUs. SPECint/SPECfp don't benefit as much from the huge cache, so a different processor gets a higher rating on those benchmarks.
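To see why the working set vs. cache size matters so much, here's a toy latency model (all the numbers are made up for illustration, not taken from any real part): once the working set outgrows the cache, most accesses pay main-memory latency instead of cache-hit latency, and average access time explodes.

```python
# Toy model of average memory access time (hypothetical latencies).
CACHE_SIZE_MB = 32      # e.g. a huge cache like the high-end PA-RISC parts
HIT_LATENCY_NS = 5      # assumed cache-hit latency
MISS_LATENCY_NS = 150   # assumed DRAM latency

def avg_access_ns(working_set_mb, cache_mb=CACHE_SIZE_MB):
    """Crude model: hit rate ~ fraction of the working set that fits in cache."""
    hit_rate = min(1.0, cache_mb / working_set_mb)
    return hit_rate * HIT_LATENCY_NS + (1.0 - hit_rate) * MISS_LATENCY_NS

print(avg_access_ns(32))    # working set fits entirely -> 5.0 ns average
print(avg_access_ns(256))   # mostly misses -> 131.875 ns average
```

A real CPU's behavior is far messier (prefetching, associativity, TLBs), but the basic shape holds: a 32 MB working set on a 1 MB cache is a totally different workload than the same code on a 32 MB cache.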
Also, who would use the Alpha and Crusoe chips? What does Alpha run on (microarchitecture-wise)?
Alpha is its own architecture (as in, not x86, not PPC). Alpha is also pretty much dead; Intel killed it to avoid competition with their IA-64 products. The EV8 would have been one heck of a CPU 🙁. Alphas were used in very high-end workstations and servers.
Crusoe is a mobile part - it goes in some laptops. I think Fujitsu has some Transmeta-based laptops. The performance isn't great, but the battery life should be.
Originally posted by: imgod2u
Originally posted by: CTho9305
What happens when, two years down the line, 2 ALUs aren't enough? Do you have to recompile code to take advantage of the new, more powerful 4-ALU, 4-FPU processor? Or does the new processor have to pull the same old superscalar tricks on the old code?
Strictly speaking, for VLIW, this would be true. However, this isn't true for Itanium/IA-64. IA-64 doesn't let you specify which ALU to use; rather, it lets you specify which instructions can be done in parallel. So, in a sense, you are still issuing individual instructions (e.g. add 5, 3, 2), but you're "bundling" them into groups that are defined to be executed in parallel. This way, the back-end doesn't have to check for dependencies before it executes them, using all the resources available.
There is a fixed width for the instruction bundles, so you can only specify so much parallelism with each one. If more parallelism is available than the bundle can express, you're not maximizing your throughput (and can't, without extra superscalar-esque hardware). If there isn't enough parallelism, you're wasting memory bandwidth on no-ops.
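The tradeoff above can be sketched in a toy compile-time scheduler. This is just an illustration of the EPIC idea, not real IA-64 encoding (actual Itanium bundles are 3 slots with template and stop bits; the names, widths, and "instructions" here are all made up): independent instructions get packed into fixed-width groups, and when there isn't enough parallelism, the leftover slots are filled with no-ops.

```python
# Toy EPIC-style scheduler: pack independent "instructions" into fixed-width
# bundles at compile time. Hypothetical format, for illustration only.
BUNDLE_WIDTH = 3

# (dest, sources) tuples standing in for instructions
program = [
    ("a", []),          # a = load ...
    ("b", []),          # b = load ...
    ("c", ["a", "b"]),  # c = a + b   (depends on a and b)
    ("d", ["c"]),       # d = c * 2   (depends on c)
]

def bundle(instrs, width=BUNDLE_WIDTH):
    bundles, ready = [], set()
    remaining = list(instrs)
    while remaining:
        # pick instructions whose inputs are already computed
        group = [i for i in remaining if all(s in ready for s in i[1])][:width]
        # pad with explicit no-ops when there isn't enough parallelism
        bundles.append([i[0] for i in group] + ["nop"] * (width - len(group)))
        ready.update(i[0] for i in group)
        remaining = [i for i in remaining if i not in group]
    return bundles

print(bundle(program))
# -> [['a', 'b', 'nop'], ['c', 'nop', 'nop'], ['d', 'nop', 'nop']]
```

Notice the wasted slots once the dependency chain kicks in: that's the "not enough parallelism" case eating bandwidth. Conversely, if the code had six independent loads, a 3-wide bundle couldn't express all of that parallelism at once, which is the other horn of the dilemma.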