
P4 will scream at RC5

ss59

Banned
From reading the whitepapers on SSE2, I now know that it supports 128-bit integer operations. It's like AltiVec on steroids. At 3GHz, no less!
 
Actually, everything I've read says the P4 sucks at RC5; worse than a P3 and far worse than an Athlon.

Russ, NCNE
 
It's too early to say either way; it needs a new core before any benchmarks for it can be considered valid.
 
Using the current client, RC5 performance for the P4 is dismal (see the review on this site). However, an optimized client could possibly change things.
 
Since the current batch of P4s is not SMP capable, and since 1.4 GHz is not enough when a branch misprediction has to go back 20 steps, the P4 is very ineffective for RC5. You're better off spending the same amount of money on PIIIs. You can assemble two dual 933 PIII systems for the cost of a single P4.
 
Russ - that's because there's no new client😉

I know that in some paper quite a while back, Intel stated that it would be very fast at something along the lines of "an internet decryption scheme" or something to that effect. Whatever it was, it heavily implied that it was RC5 they were talking about....or maybe they stated it outright. I don't recall for certain; all I know is that it's been talked about, but so far, it's a no-show.

It should, as ss59 said, be like the G4 in that sense (I don't know about the "on steroids" part, as I haven't read the whitepapers). You haven't been around quite long enough, but DES was another short term contest, and once they came out with an MMX client, LOOK OUT, it was REALLY a huge difference.

Read "3.5 SIMD architectures" on Ars Technica. It discusses this enough that you get the gist: if we believe ss59 (which I do), then it should rock.

ss59, care to share the source of the whitepapers? I don't know if I have it already or not, if I just haven't read it, or if it's something that I need to get 😉
 
Right now, the SSE2 instructions look very promising. Hopefully, the right instructions are there to bitslice the RC5-64 key. What I don't know, is if the P4 double-pumps the SSE2 integer instructions. If it does, and the bitslicing works, a special P4 core client could make the G4 look like it was trying to solve the key with an abacus. However, I tend to believe that the SSE2 instructions are not double-pumped. Hence, I'm figuring close to G4 keys per second speed. But, given that the P4 is 1.5GHz, that would still be an ENORMOUS keyrate...
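
To make the "128-bit register" idea concrete, here is a rough sketch of one RC5 mixing step run on four candidate keys at once, the way four 32-bit lanes of a 128-bit register might be used. This is only a simulation in plain Python, not actual SSE2 code and not how any D.net core is written; the names `rotl32`, `rc5_half_round`, and `rc5_half_round_x4` are made up for illustration. It uses the plain word-parallel layout (one key per lane) rather than true bitslicing. Note that the rotate amount depends on the data, so each lane rotates by a different count. SSE2's packed shifts apply one count to every lane, which is part of why a fast core may be harder to write than the raw register width suggests.

```python
MASK32 = 0xFFFFFFFF  # RC5-64 in the contest uses 32-bit words


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits (only the low 5 bits of n count)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32


def rc5_half_round(a, b, s):
    """One RC5 mixing step on a single key: A = ROTL(A ^ B, B) + S."""
    return (rotl32(a ^ b, b) + s) & MASK32


def rc5_half_round_x4(a4, b4, s4):
    """The same step on four independent keys, one per simulated 32-bit lane.

    Each lane gets its own rotate count (taken from its own B word), which is
    the data-dependent part that a single shared shift count cannot express.
    """
    return [rc5_half_round(a, b, s) for a, b, s in zip(a4, b4, s4)]


if __name__ == "__main__":
    a4 = [0x00000000, 0x80000000, 0x12345678, 0xFFFFFFFF]
    b4 = [0x00000001, 0x00000000, 0x00000004, 0x0000001F]
    s4 = [0x00000003, 0x00000001, 0x00000000, 0x00000001]
    print([hex(x) for x in rc5_half_round_x4(a4, b4, s4)])
```

If the lanes really can be kept busy like this, four keys move forward per instruction instead of one, which is where the hoped-for speedup would come from.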

Now, we just need someone to donate a P4 to one of the DNet guys for development.... Anybody? 😉

JHutch
 
I read something a while back about the P4 and how amazing it would be at decryption. Don't know if it's true or BS, but it was probably the same thing BK read. 🙂
 
Am I the only one that doesn't want to depend on optimization to have a nice computer? Sure, Photoshop will take 3 seconds instead of 5 with SSE2, but what if I don't use Photoshop? What if I use program x from Joe Smith? What then? The 2000 dollars I spent on the P4 is worthless because I could have spent 1000 on a T-Bird which would perform better.
 
Well, the program doesn't have to be designed around the CPU; it's the compilers that pick out where optimizations can be made and make programs scream. I think if Intel wants it to kick butt in RC5, they'd better make a core that takes advantage of the 20-stage pipeline and the 128-bit integer support, and not leave it up to D.net.
 
If they optimize a client for the P4 I see no reason why it wouldn't perform very well.

Doesn't mean I'm getting one though 😉
 


<< Am I the only one that doesn't want to depend on optimization to have a nice computer? >>

I agree with you, but remember that this IS the Distributed Computing forum. Performance on RC5, SETI, and the like takes priority for us.
 


<< Am I the only one that doesn't want to depend on optimization to have a nice computer? >>



Well, if it wasn't for optimization, games like Q3A/UT and even the RC5/SETI clients would not run at the speed they do. Whenever a new architecture is released, programmers have to upgrade their tools or coding methodologies to take advantage of it. Optimization is a necessary part of coding. Just as a couple of years ago programmers had to multithread their code, or rewrite it to take advantage of pipelining (or buy a new compiler), they now have to do it again if they care about P4 performance. It's possible that the next new Athlon core will require the same "adjustments" if AMD deviates too much from the present design.


 
Maniac9127

I don't want to. But it's a fact of the computer world. Your computer already relies on optimized code! I don't care if it's a Mac, an Athlon, a Celeron, P3, whatever, they are all optimized.

The fact of the matter is, the Athlon was designed, at least in part, due to the fact that everything was optimized for the P6 architecture. Take, for example, the fact that the Athlon has a "free" FXCH instruction (the one that swaps a value in the register stack with the top of the stack) - that was originally on the Pentium Pro, but AMD decided, well, gee, considering ALL GOOD COMPILERS are working to fit the architecture.

I agree that it should, in general, be the compiler's job to "optimize" things. But compilers, especially at this point, are only so good. I've heard it said by a developer that it is really tough to vectorize code, unless it is done by hand. That's the situation here. All the cores that are written for D.net are pretty much hand made, not just a reliance on compilers. If that were the case, then we'd all have cores that were basically P6 optimized, and nothing else, as those are the most prominent compilers out there.

I agree that the P4, from the performance standpoint, as of right now, is a flop. Unless AMD can usurp Intel's position, then the corporate world will make it so that they demand top performance on their Intel machines, and then people will start compiling things with P4 optimized code, instead of PPro optimized code.

I guess my point is that we are so far beyond the point of disentangling sophisticated compilers and new architectures that we might as well accept it. I for one think that hand optimization should become more commonplace. I've heard it said by many comp sci instructors who (hopefully) know what they are talking about: in real software development, where programs are large, about 80-90 percent of the time is spent outlining the code. Only the last 10-20% is spent writing it. If people were to spend TWICE as much time actually coding, and making sure that things were better optimized, things would work out better. I for one would like to see MORE optimizations for architectures; we'd get a LOT more out of our CPUs. That's what Intel's been doing since the 486/Pentium transition.

And if it weren't for "hand optimization", Athlons would be running over 10% slower in RC5. The guy that wrote it is quoted as saying that this is "only the first step"....he has more plans to get it to run even faster. Why not let people optimize?
 
Dang, I was beaten to the punch, and in a much more succinct way....ouch 😉

Anyhow, I totally agree that waiting for optimization is annoying as heck. That's why a lot of people buy "just behind" the curve....one, it's cheaper, two, it's nearly the same performance, if not better in some cases, and three, fewer compatibility problems 😉
 
What it comes down to though is what gets optimized. AMD optimized their hardware to work with current software. Intel created a new architecture and expects software developers to optimize for them. Intel says jump and the software industry is supposed to ask how high. What I really wish is that Intel wasn't the 800 lb gorilla that it is. Letting a single company dominate the way it does holds technology back in the long run.
 
Kind of like M$ is planning for the future, not the past? 😀

I still see AMD taking a bite out of Intel's market share.
 
It's my opinion, no one else's, and it may be wrong, but: 😛

Intel seems to have made a huge blunder here. This has been the worst year in memory for Intel mistakes, this is just another one. RAMBUS, the 820 chipset, the 1.13 GHz fiasco and the RAMBUS to sdram bridge ( ack can't remember the name 🙁 ) have all contributed to the idea that when Intel changed leadership things didn't go so well...

Intel is jumping the gun with the P4; software needs to be in place to take advantage of its features. The 1.4 GHz and 1.5 GHz P4s give absolutely no reason for buying them. If you have to buy a new motherboard, case, power supply and 2 RDRAM DIMMs, the cost factor alone is scary. And that doesn't even take into account that it's slower than the AMD flagships, which are half the price.

Intel should have just waited until they had the .13 die process in place and then pushed the P3 at higher clock speeds until the P4's infrastructure was in place. Maybe even if they started the P4 at 2 GHz so they could show a real reason to move over to the P4.

AMD won't have a .13 process until 2002. That would give Intel plenty of time to get their CPUs back out in front.

I agree with Fdiskboy. Intel is faltering and AMD is going to pick up market share.
 
Another point that I would like to make regarding the current P4s is that there won't be any way to plug a new CPU into that $250 mobo in less than a year. Not only is Intel releasing a CPU that has to have software recompiled in order to work well on it, they are releasing a temporary version. Anyone who buys a P4 system now will have to dish out the cash for a new CPU, mobo, and possibly even RAM (isn't Intel planning on moving to a dual channel DDR setup for their high-end boards next year?) if they want to upgrade in 6-8 months. That's after having paid $600 just for a mobo and RAM for their current system. If the mobo/RAM setup was cheaper I wouldn't have a problem with this. But you have to fork out so much money for those components that seeing a 6 month end to upgrade potential is downright scary.
 
Notice I said "Will". Of course there needs to be a new core. The MMX extensions in the P4 run through SSE2 at core speed, through a 128-bit pipeline. There are 8 registers used by both SSE2 integer and FPU. Branch prediction won't come into play here, as the core can fit into the L1 cache, and a misprediction will only cost 2 cycles. Also, remember the fancy prediction algorithms have large caches, which will have no problem running just RC5. I'd say a safe bet is at least 10 MKey/s at 1.5GHz.
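
For a sense of scale, the 10 MKey/s guess can be turned into a per-key cycle budget with some back-of-the-envelope arithmetic. This is just a sketch of the math, not a benchmark; the helper name `cycles_per_key` is made up, and the "four keys per register" figure assumes 32-bit RC5 words packed into a 128-bit register with every lane kept busy.

```python
def cycles_per_key(clock_hz, keys_per_sec):
    """Average clock cycles available per key at a given keyrate."""
    return clock_hz / keys_per_sec


clock_hz = 1.5e9      # 1.5GHz P4, from the post above
target_rate = 10e6    # the guessed 10 MKey/s, not a measured number

budget = cycles_per_key(clock_hz, target_rate)

# Assumption: four 32-bit words fit in one 128-bit register, so a core that
# tests four keys side by side gets four times the budget per batch.
lanes = 128 // 32
batch_budget = budget * lanes

print(f"{budget:.0f} cycles per key, {batch_budget:.0f} cycles per batch of {lanes}")
```

So the guess amounts to finishing each key in about 150 cycles on average, or a batch of four in about 600 if the packing assumption holds.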
 