1.7 Ghz Xeon

barbary · Jul 4, 2001

I've searched but can't find it.

There was a thread about running SETI on 1.7GHz Xeon chips. It included a link to a site that had reviewed a dual 1.7GHz Xeon system using clibench and SETI.

Does anyone know the site? I've just got a dual 1.7GHz system and I'm finding the performance running SETI to be quite poor.

Smoke · Jul 4, 2001

We would appreciate hearing about your system and WU production stats.

barbary · Jul 4, 2001

System

Dual 1.7GHz Xeon, 400MHz FSB
512MB RDRAM 800MHz
Win 2k Pro
15k 18GB HD
10k 73GB HD

It's processing a WU on average 4.2 hours per processor. (Ave of last 5 WU)

I may be getting ahead of myself. I know there can be a big difference in WU's so a large sample must be taken. It started off at 5hrs a WU (Which worried me) but I noticed that some of the more recent ones are at 3.5hrs WU.

I'll report back tomorrow when I have another 24 hrs' worth to take a sample of.

IJump · Jul 4, 2001

I thought there was something with SETI about dual processor machines. Don't you have to run two instances of the client in order to get the benefit of the dual processors?

Please excuse my ignorance, I haven't been running SETI very long and I only run it on one machine until RC5 finishes.

cakin · Jul 4, 2001

It may have just been that your first few WUs had low ARs. My P3-1000@1100 has a best time of 4:47 (12.026 AR) to a high of 5:57 (0.494 AR), avg CPU 5:23.

What type of board is it running on?

SETI can't take advantage of dual procs, so you need to run two instances.

barbary · Jul 4, 2001

Rest assured I'm running two copies of the 3.03 cli.

The machine is a Dell 530 so it's actually a Dell motherboard.

Sukhoi · Jul 4, 2001

Hey, are you on Team AnandTech? I don't recognize your name. 😱 If not, join us! 😀

Smoke · Jul 4, 2001

barbary, Let me second Sukhoi's invitation.

We would love to have you join us. 🙂

Assimilator1 · Jul 4, 2001

Barbary
I'm afraid I don't know the site you refer to 🙁, but maybe if you give us a clue we could guess it?

BTW I 3rd the motion. If you're not already in TA then why not join us? You'd be welcomed 🙂

Ijump
I think you're thinking of the shared memory bandwidth issue. However, I believe he is using a m/brd with the 850 chipset with RDRAM. Presumably that would be able to cope with 2 SETI instances much better. (?)

ColinP · Jul 4, 2001

Plus the fact that they are Xeons.
What size cache do they have ???

Col

BurntKooshie · Jul 4, 2001

256KB L2 cache. The next iteration of the P4 Xeon should have either larger L2, an on-die L3, or both.

Sukhoi · Jul 4, 2001

BTW, I do remember what article you're talking about, but I have no idea what site it was at.

barbary · Jul 5, 2001

I obviously jumped the gun and should have waited before cursing this machine.

It has settled on an ave 4 hours a WU (sample 22 high 5 low 3.25). It's managed 13 in the last 24 hours.
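As a rough cross-check of those numbers, here is a minimal sketch. It assumes two client instances (one per processor) running around the clock at a steady pace; the function name is made up for illustration.

```python
# Rough sanity check of the WU figures quoted above.
# Assumption: two client instances, one per processor, busy 24 hours a day.
HOURS_PER_DAY = 24


def wus_per_day(hours_per_wu: float, instances: int = 2) -> float:
    """Expected work units per day for `instances` clients at a steady pace."""
    return instances * HOURS_PER_DAY / hours_per_wu


# Average, slowest and fastest WU times from the 22-WU sample.
for label, hours in [("average", 4.0), ("slowest", 5.0), ("fastest", 3.25)]:
    print(f"{label:8s} {hours:5.2f} h/WU -> {wus_per_day(hours):5.1f} WU/day")
```

At the 4 hour average that works out to about 12 WUs a day, so 13 in the last 24 hours is in line with the sample.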

I was a little paranoid after finding that often increasing processor speeds had little or no effect if the FSB was left the same. My dual 800/100 produces not many more than my dual 400/100. Likewise my dual 1000/133 produces no more than my dual 800/133.

I am tempted to join. I waited ages before joining a team (#Amiga) but as soon as I joined people lost interest. Despite a few updates to the web page most people haven't returned a result in ages.

Of course if I joined team Anandtech you'd only get my future results added in not the 10500 I've already done.

BurntKooshie · Jul 5, 2001

barbary - if you are referring to the fact that the S@H page says that work units don't transfer...well, that's what it says. The fact is, it doesn't actually work that way. Work units will transfer to the team you are currently on, even though that contradicts what the S@H pages say. Trust me, that's how it works - we've lost, and gained, many work units due to members joining and leaving the team.

As for the reason behind the lackluster performance increases when the clock-rate increases, it basically works like this: think of memory bandwidth as a shared commodity (because it basically is in most architectures). The S@H client spills over the L1 and L2 cache (unlike some other distributed programs), and so makes use of a good deal of main memory, and does it often. The fact that one processor uses some main memory bandwidth means that there's less available for the other CPUs. In the case of the P3 (even worse, celeron based) SMP systems, memory bandwidth is very limited, so the fact that one CPU uses some of it up means the other has less available. Kinda like downloading two huge files from a fast server on a 56k modem.

<< I was a little paranoid after finding that often increasing processor speeds had little or no effect if the FSB was left the same. My dual 800/100 produces not many more than my dual 400/100. Like wise my dual 1000/133 produces no more than my dual 800/133. >>

That right there is experimental evidence of what I was talking about above 😉

I could go into more detail if you'd like, but that's the basic idea. The P4 architecture has a LOT more main memory bandwidth, so that is somewhat less of an issue.
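To put the shared-bandwidth idea into numbers, here is a toy sketch. Everything in it is an assumption for illustration only: the function name, the cycle count, the memory traffic per WU, and the simplification that compute time and memory time just add up with the bus split evenly between CPUs. It is not a model of the real client.

```python
def wu_hours(clock_mhz, bus_mb_s, cpus, cycles=4.0e12, traffic_mb=8.0e6):
    """Toy estimate of hours per WU: compute time plus time on a shared bus.

    `cycles` and `traffic_mb` are invented constants, not measured values.
    """
    compute_s = cycles / (clock_mhz * 1e6)    # scales with clock speed
    memory_s = traffic_mb / (bus_mb_s / cpus)  # each CPU gets a share of the bus
    return (compute_s + memory_s) / 3600


# Hypothetical dual machines on the same bus (800 MB/s is used as a stand-in).
slow = wu_hours(clock_mhz=400, bus_mb_s=800, cpus=2)
fast = wu_hours(clock_mhz=800, bus_mb_s=800, cpus=2)
print(f"dual 400: {slow:.2f} h/WU, dual 800: {fast:.2f} h/WU")
print(f"speedup from doubling the clock: {slow / fast:.2f}x")
```

With these made-up constants the memory term dominates, so doubling the clock gives far less than double the output, which is the same shape as the dual 400/100 versus dual 800/100 result.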

barbary · Jul 5, 2001

One version of SETI (I think it was 2.04) suddenly became a lot less memory-bandwidth hungry, but the more recent 3.03 has become more so.

Still, as you say, the P4's extra bandwidth is making a huge difference. I was just surprised when the first few WUs were no quicker. Now it's a more realistic speed. Obviously it just did this to fool me. Really, my first post was about finding the site that reviewed the dual 1.7GHz Xeons. I wanted to compare clibench numbers with it.

I have thought about it, and as I visit Anandtech every day and enjoy the wonderful support from these forums, I have joined Team Anandtech. It made sense; I don't know why I didn't do it before.

"You have been added to the team Team AnandTech "

BurntKooshie · Jul 5, 2001

I believe that version 3.0 was also very small-cache friendly, as WU times dropped dramatically with that client 🙂. Of course, about that time there were a lot more server outages... probably due to the fact that a lot of people were sending in a lot more results 😉

Welcome to the team! We all appreciate it 😀

barbary · Jul 5, 2001

Possibly it was 3.0 I was thinking of. I know all of a sudden my 800/100 started to turn in great results but has now dropped back towards the 400/100 with 3.03. I think it's better to have the client do more work per WU.

Dingas · Jul 5, 2001

It would be wonderful if you could join our team 🙂

JWMiddleton · Jul 5, 2001

Looks like Barbary did join and brought 10,600+ WUs with him! WoW! Welcome to the TeAm! 🙂😀

Smoke · Jul 5, 2001

barbary, we had no idea! 😀😀 Terrific stats.

You are welcomed to the TeAm if you have 0 WUs or 10,000 WUs just the same, but 10,000 is a WOW NUMBER. 😉

😀

ColinP · Jul 5, 2001

🙂🙂🙂🙂

🙂

Welcome !!!!!!

🙂

Col

Assimilator1 · Jul 5, 2001

Welcome to TA barbary 😀, and a whopping boost you gave to our stats with your 10k+ WUs 🙂. Seeing as you entered straight into the exclusive 10K club, you get the Beer & babes straight away 😀😉.

BTW, re less bandwidth, it was v3.0 you were thinking of. v2.04 GUI/v2.4 CLI needed lots of memory bandwidth but was more CPU friendly. E.g. an ex-ship of mine was a PII 233 @ 280 (2.5x112); it used to do WUs in 10.5 hrs (on CLI), but with v3.0 that nearly doubled. At approx a PII 400, v3.0 and v2.4 WU times were virtually the same. Faster CPUs got faster WU times with v3.0.

ElFenix · Jul 5, 2001

welcome to the TeAm barbary!

as for the dually p4, don't forget that 1 p4 plus dual channel rdram is matched perfectly. because AGTL is not point to point, not only is the memory bandwidth shared but the FSB is shared as well.

i'm trying to think of what hardware prefetch would do in this sort of situation. we're used to it with dual p3s slowing down 20% or so due to the sharing. but the p4 has prefetch. and seti probably takes blocks of data in recognizable patterns, just going deeper into the work unit. so you'd think the prefetch algorithm would be able to predict that, right? now... if seti is all that's running... i wonder if the prefetch can predict everything (except for the random OS event) and minimize the impact of sharing bandwidth? easy way to find out is to compare a sample of WUs done with the machine on a single processor only to those done while both processors are munching. then compare to p3 machines and get a difference in differences. it would probably work best when the p4 speeds are such that they use 1.2 or so GBps of bandwidth.
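a rough sketch of that difference-in-differences comparison, in case anyone wants to try it. the function names are made up, and the timings are placeholder numbers invented purely to show the calculation, not measurements from any real machine.

```python
from statistics import mean


def slowdown(single_hours, dual_hours):
    """Fractional increase in mean WU time when both CPUs are busy."""
    return mean(dual_hours) / mean(single_hours) - 1.0


def diff_in_diff(p4_single, p4_dual, p3_single, p3_dual):
    """P4 slowdown minus P3 slowdown; negative means the P4 suffers less."""
    return slowdown(p4_single, p4_dual) - slowdown(p3_single, p3_dual)


# Placeholder WU times in hours (hypothetical, for illustration only).
p3_single = [5.0, 5.2, 4.8]  # one client running on a dual P3
p3_dual = [6.0, 6.2, 5.8]    # two clients running on the same box
p4_single = [3.8, 4.0, 4.2]
p4_dual = [4.0, 4.2, 4.4]

print(f"P3 slowdown: {slowdown(p3_single, p3_dual):.1%}")
print(f"P4 slowdown: {slowdown(p4_single, p4_dual):.1%}")
print(f"difference:  {diff_in_diff(p4_single, p4_dual, p3_single, p3_dual):+.1%}")
```

a bigger sample would be needed in practice, since WU times swing a lot with angle range.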

BadThad · Jul 5, 2001

another....Welcome to the TeAm!

P.S. - Have any friends with 10k wu's? 😉 lol

Polo · Jul 5, 2001

Welcome to the TeAm Barbary ! 🙂

Nice fleet you have... (me jealous) 😉
