BOINC benchmarks


petrusbroder

Elite Member
Nov 28, 2004
13,343
1,138
126
@Assimilator1: I agree and that is why I choose not to use that version. I also do not know how those scores are achieved.
It does not matter in those projects which use a quorum of 3 or more; it does not matter in seti@home, in rosetta (not anymore at least ...) and in the other projects I do not care: as long as the results are OK, are reported OK and accepted, that's just fine.

I think it would be great if the posts in this thread made it very clear which BOINC version is being used for benchmarking. That may avoid confusion.
 

CupCak3

Golden Member
Nov 11, 2005
1,318
1
81
Originally posted by: Assimilator1


cmdrdredd
Saying that a teammate is putting up faked scores is not good team spirit; it would be better if you removed that word.

well said!

:thumbsup:
 

Bateluer

Lifer
Jun 23, 2001
27,730
8
0
. . . I don't crunch for the AT teams. I crunch for my employer's team. Can I accuse people of posting fake scores? :p

To get back on topic, I currently have 1 SFF system on the way, and 1 more pending, which I'll be adding to my current X2 and Dothan laptop. I don't actively crunch on my notebook during the summer months though; it's too hot for notebook use. I'll start crunching on it again once the weather cools down.
 

cmdrdredd

Lifer
Dec 12, 2001
27,052
357
126
Team spirit? Get real, it's just a benchmark and it looked faked, that's all. Why is everyone so touchy here?

After looking around, it seems that the optimized clients run faster because they are less exact in their calculations than the standard clients.
 

caferace

Golden Member
May 31, 2005
1,472
6
76
"Get Real" yourself. If you look around the DC forum here, you'l notice quite the team spirit thing going on. We spend a fair amount of time helping each other, combining our collective forces on projects and generally giving each other good natured grief.

You came in and took a crap on our floor. We'd prefer you clean it up, nicely. That's called team spirit, or solidarity, or whatever. But please don't poop on the floor again. :)

-jim
 

The Borg

Senior member
Apr 9, 2006
494
0
0
The optimised clients don't run faster because they are 'less exact'. Their results would not be accepted and that would be a waste of time. They produce better scores because they give a better indication of the speed of the processor when it is using things like SSE3 etc. The standard clients are more 'generalistic' so as to work on a broader range of CPUs (PII - P4, AMD, etc).

If you think it is cheating, then read some of the other posts to see how excited people get. The forums were screaming mad when crunch3r left and there were accusations of cheating. I do it for the score and the science.
 

Coquito

Diamond Member
Nov 30, 2003
8,559
1
0
Originally posted by: The Borg
The optimised clients don't run faster because they are 'less exact'. Their results would not be accepted and that would be a waste of time. They produce better scores because they give a better indication of the speed of the processor when it is using things like SSE3 etc. The standard clients are more 'generalistic' so as to work on a broader range of CPUs (PII - P4, AMD, etc).

If you think it is cheating, then read some of the other posts to see how excited people get. The forums were screaming mad when crunch3r left and there were accusations of cheating. I do it for the score and the science.

Pay no mind. The Dolphins didn't play well this week. :p

Hey, is that a bottle of syrah you're hiding? :wine::D
 

Fullmetal Chocobo

Moderator, Distributed Computing
Moderator
May 13, 2003
13,704
7
81
Laptop results (Toshiba Satellite w/ Intel Core Solo 1.83GHz)
1658 Whetstone
3468 Dhrystone

**BOINC 5.4.11
 

lizardth

Golden Member
Oct 5, 2005
1,242
0
76
Opty 148 OCed to 2.6GHz

9/11/2006 10:52:44 AM|| 2477 double precision MIPS (Whetstone) per CPU
9/11/2006 10:52:44 AM|| 4612 integer MIPS (Dhrystone) per CPU

BOINC version 5.2.13
 

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
Keep the scores rolling in :)

lizardth
Interesting, it seems the benchmark doesn't really factor in the cache size (comparing your Opty @ 2.6GHz with my Semp @ 2.5GHz), somewhat of an omission in a benchmarking program!
Originally posted by: cmdrdredd
Team spirit? Get real, it's just a benchmark and it looked faked, that's all. Why is everyone so touchy here?
Get real? Nice attitude! It's not the benchmark I'm bothered about, it's the fact that you accused someone of faking scores straight off (before looking at other possibilities, you know, benefit of the doubt?). It's rude, that's all, as is saying 'get real'; that's what we're getting 'touchy' about, as you put it.
I don't want a big argument starting out of this, so I hope you see where I'm coming from. Thanks.

Bateluer
No, we're just gonna beat ya with a wet rotten fish ;)
Though you could avoid that by joining us :D

The Borg
So is v5.5 showing much higher scores on the integer benchmark because the client is optimised for SSE3?
In which case shouldn't your P4 630s have a much higher score? I'm still :confused:

Btw, is that optimisation just for the benchmark or for the projects too?
 

Rattledagger

Elite Member
Feb 5, 2001
2,989
18
81
Well, the BOINC benchmark isn't really much to go by, but for what it's worth...

Opteron242:
floats: 1510M ops/sec
integer: 2662M ops/sec
Real Performance in Seasonal Attribution: 44.5 s/TS

MP2400+:
floats: 1862M ops/sec
integer: 3127M ops/sec
Real Performance in Seasonal Attribution: 60 s/TS.


So, based on the benchmark, the 2nd computer is 23.3% better at floating-point operations and 17.5% better at integer ops. But when you look at the real performance in Seasonal Attribution, the 2nd computer uses 34.8% longer time per timestep, meaning in reality the 2nd computer is 25.8% slower...
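
(For the arithmetic behind those percentages: 1862/1510 ≈ 1.233 and 3127/2662 ≈ 1.175 on the benchmarks, but 60/44.5 ≈ 1.348 on time per timestep, i.e. 44.5/60 ≈ 0.742, so about 25.8% slower.)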



As for "optimized" BOINC-clients like Crunch3r's v5.5.0, the purpose was to give "correct" claimed credit then used in combination with optimized SETI@home-v4.xx-applications. With roughly 3x-4x faster optimized seti-application than default, it means the "optimized" benchmark is also roughly 3x-4x faster than default BOINC-benchmark, to give roughly the same claimed credit for optimized seti@home... But, if used in projects without optimized science-application, an "optimized" BOINC-client gives roughly 3x-4x higher claimed credit, for no change in scientific work done...


But anyway, all the "big" BOINC projects now rely on server-based crediting as in CPDN and Einstein@home, count the actual flops done as in Seti_Enhanced, or use Rosetta@home's averaging method. This accounts for 94% of the daily BOINC crediting, leaving only 6% that can be significantly altered by an "optimized" BOINC client's 3x-4x higher credit claims...
 

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
Interesting stuff RD.

I guess the trouble with this benchmark is that it doesn't appear to factor in cache, & looking at your results it would seem it doesn't factor in memory bandwidth either.
I guess it's very much like SSS CPU tests?
But at least it does give some sort of multi-project benchmark, flawed as it may be ;)

Btw what clock speed is your Opty 242?
 

petrusbroder

Elite Member
Nov 28, 2004
13,343
1,138
126
Just to make it very clear: the BOINC client (which calculates the benchmark) does nothing to calculate the WU. Optimized BOINC clients do not affect the scientific results in any way, because the client does not crunch the WU. Optimised BOINC clients affect only the number of credits which each cruncher claims.
The BOINC client (BOINC manager) is essentially a communication and accounting program: it downloads the WUs; when a WU has been crunched by the math program (= the application), the BOINC client uploads the results, compiles the crunching times and the credits, and reports them (and the result) back to the project.

WUs are crunched by the application, which is provided by each project.
Applications can be optimized. If they are, they crunch faster, you crunch more WUs and probably get more credits.
For example: some time ago, Einstein@home had a generalised application. Some very skilled programmers (not in the service of the institute which runs Einstein@home) optimized the application, which turned out to be much faster. Now the programmer is working for Einstein@home and helps them with the optimisations...

Optimized clients (BOINC managers) use all the functions of the processor to generate a more correct (? :Q ?) benchmark. The benchmark is used to calculate the credits which each WU earns, via a formula in which the benchmarks and the crunching time are the important factors. If the benchmark is high and the application takes some time to crunch, you get more claimed credits sent to the project. If the project uses a quorum of e.g. 3, the claimed credits from three computers are used to calculate the average credits, which are then awarded. Alternatively, the highest claim is dumped and the average of the two lower ones is awarded (I am not sure about this though: is the highest or the lowest dumped ... ;) )
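
To make that "benchmark times crunching time" idea concrete, here is a rough sketch in C. The averaging of the two benchmark scores and the "100 credits per GFLOPS-day" scale factor are my assumptions for illustration only; the exact formula and constants BOINC uses may differ:

#include <stdio.h>

/* Illustrative only: the claim grows with both the benchmark scores and the
   CPU time spent on the WU. The scale constant is an assumption. */
double claimed_credit(double whetstone_mips, double dhrystone_mips,
                      double cpu_seconds)
{
    /* average the two benchmark scores, in billions of ops/sec */
    double avg_gops = (whetstone_mips + dhrystone_mips) / 2.0 / 1000.0;
    /* assumed scale: ~100 credits for one full day on a 1 GFLOPS / 1 GIPS host */
    return cpu_seconds / 86400.0 * avg_gops * 100.0;
}

int main(void)
{
    /* e.g. a host benchmarking 1663 Whetstone / 1986 Dhrystone, 3 hours on a WU */
    printf("claimed credit: %.1f\n", claimed_credit(1663.0, 1986.0, 3.0 * 3600.0));
    return 0;
}

With a 3x higher benchmark and the same crunching time, the claim sent to the project is simply 3x higher.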

There are other factors, such as other programs running: I know from experience that Firefox decreases the benchmarks by approx. 5-8%. Moving the mouse while the benchmarks are run decreases the values. The size of the L2 or L3 cache affects the benchmarks, as does (to a lesser extent) the amount of RAM. If the HDD is used by the OS during the benchmark run, that affects them too ...

So: the benchmarks here may be confusing for many reasons:

1. Different BOINC versions report different Whetstone and Dhrystone scores and thus will claim more or fewer credits.
2. The benchmarks are affected by the number of other programs running while the benchmarks are run.
3. Benchmarks are affected by other hardware, and by whether said hardware is active or not.

All the numbers are approximate, and if many members add their numbers to this thread we may get a feeling for how the different factors affect the benchmarks ... but for that to become valid stats we need many more numbers. So let them come!
 

Rattledagger

Elite Member
Feb 5, 2001
2,989
18
81
Originally posted by: Assimilator1
Interesting stuff RD.

I guess the trouble with this benchmark is that it doesn't appear to factor in cache, & looking at your results it would seem it doesn't factor in memory bandwidth either.
I guess it's very much like SSS CPU tests?
But at least it does give some sort of multi-project benchmark, flawed as it may be ;)

Btw what clock speed is your Opty 242?
The BOINC benchmark only measures raw CPU speed, and it's so small it AFAIK fits in even a small 64 KB cache, making it useless for testing any form of memory speed or the effects of large-cache CPUs.

As for cpu-speed, Opteron242 = 1.6 GHz, MP2400+ = 2 GHz.
 

Rattledagger

Elite Member
Feb 5, 2001
2,989
18
81
Originally posted by: petrusbroder
Optimized clients (BOINC managers) use all the functions of the processor to generate a more correct (? :Q ?) benchmark. The benchmark is used to calculate the credits which each WU earns, via a formula in which the benchmarks and the crunching time are the important factors. If the benchmark is high and the application takes some time to crunch, you get more claimed credits sent to the project. If the project uses a quorum of e.g. 3, the claimed credits from three computers are used to calculate the average credits, which are then awarded. Alternatively, the highest claim is dumped and the average of the two lower ones is awarded (I am not sure about this though: is the highest or the lowest dumped ... ;) )



1. Different BOINC versions report different Whetstone and Dhrystone scores and thus will claim more or fewer credits.
Well, starting at the bottom...

For BOINC clients v3.18-v3.21, the Windows compiler optimized away part of the benchmark, so in reality not all of the benchmark was run. This gave significantly higher benchmarks on Windows computers, and therefore also higher claimed credit.

v4.19 and earlier v4 clients partially fixed this problem, but the Windows compiler still optimized away part of the benchmark, and this still gave a higher benchmark.

With v4.20, the AFAIK last code change to the BOINC benchmark was made, and all later BOINC clients should give roughly the same benchmark score on the same computer/OS. But the Windows compiler being used is still better at optimizing than the Linux/Mac compilers being used, which gives a higher benchmark score on Windows.


As for all the things that influence the benchmark, the single most significant influence is on multi-CPU/HT computers, where the integer benchmark is literally "all over the place"; this can give rise to at least a 2x difference in credit claims between benchmark runs...

The flops benchmark, on the other hand, seems much more stable and is little influenced by other things.

Also, AFAIK neither memory nor cache size significantly influences the benchmark scores...


As for crediting in a quorum system, the easy one is SETI@home:
At the time a WU is validated, all results that also pass validation at that time decide the crediting according to these rules:
1; If only 2 results passed validation, the lowest claim is granted to all.
2; If 3 or more passed validation, remove the highest and lowest claims, and average the rest.
3; Any later-returned result that also passes validation gets the same credit as the others for the same WU; no re-calculation of granted credit is done.

Other projects AFAIK use the same rules, with the addition that projects without any redundancy directly grant the claimed credit... (that rule is sketched in code just below)
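
Here is that granting rule spelled out in C; this is just my own illustration of the rules above, not BOINC's actual server code:

#include <stdio.h>
#include <stdlib.h>

/* Compare function for qsort, ascending order of claims. */
static int cmp_claims(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Granting rule as described above: no redundancy -> claimed = granted;
   2 valid results -> lowest claim to both; 3 or more -> drop the highest
   and lowest claims and average the rest. */
double granted_credit(double *claims, int n)
{
    if (n == 1) return claims[0];
    qsort(claims, n, sizeof claims[0], cmp_claims);
    if (n == 2) return claims[0];
    double sum = 0.0;
    for (int i = 1; i < n - 1; i++) sum += claims[i];
    return sum / (n - 2);
}

int main(void)
{
    /* quorum of 3 where one host claims a wildly inflated 90 credits */
    double claims[] = { 22.5, 24.1, 90.0 };
    printf("granted: %.1f\n", granted_credit(claims, 3));   /* prints 24.1 */
    return 0;
}

Note how one wildly inflated claim in a quorum of 3 or more simply gets thrown away; it's mainly projects without redundancy where an inflated claim goes straight through.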


As for "optimized" BOINC core clients (the Manager doesn't run any benchmarks, and AFAIK isn't even re-compiled)... Well, if not quite mistaken, by using a different compiler, and by using other compiler-switches, just like the old v3.xx for windows removed part of the benchmark, the same is AFAIK true with the "optimized" BOINC-clients...

How can a compiler "optimize away" part of a benchmark?

Let's look at an easy example of a possible benchmark, where i and a are local variables not used in other parts of the code:
for (i = 1; i <= 100000; i++) {
    a = sqrt(4);
    b = a * a;
}

Optimized, a compiler can change it to this code:
b = 4;

Now, both code parts give the end result that b = 4, but while the original code does 200k calculations, the optimized code in reality does no calculations at all, and is therefore faster to execute.


Yes, the BOINC benchmark isn't this simple, but it's enough for a compiler to find one small spot that cuts down on how many calculations are done, and you've got a benchmark that runs faster and therefore gives a higher benchmark score...
 

petrusbroder

Elite Member
Nov 28, 2004
13,343
1,138
126
Thanks, Rattledagger, for correcting my misconceptions! :)
Your post is, as usual, very informative and to the point.

I still think that we need many more reports about the benchmarks different computers produce in real life.
 

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
Wow! I think I actually got most of that! ;)

As I mentioned, & as appears to be shown between my Sempron scores & lizardth's Opty scores, L2 cache & memory bandwidth aren't tested. And what RD says about that explains it.

So it would seem that v5.5 BOINC is giving scores which are not at all comparable to those from other recent BOINC managers (even if they are better related to the client).

Anyway, more benchmarks please :D
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,250
3,845
75
Intel Celeron M 1.5GHz (that's a Centrino, maybe without the special FSB, with 1MB cache, and without the speed-throttling capability):

Boinc 5.4.9:
Float: 1332 MFLOPS
Integer: 2745 MIPS

Boinc 5.5.0:
Float: 2072 MFLOPS
Integer: 8413-8840 MIPS

(It's interesting how the higher MIPS rating came when the benchmark was run while the computer was starting up. That run got 1852 MFLOPS.)

Athlon XP 2200+ (1.8GHz), Boinc 5.5.0:
2161 float
8970 int
 

Wiz

Diamond Member
Feb 5, 2000
6,459
16
81
Knowing that it really doesn't mean much... ;)

P4 3.8GHz, 2MB cache

I was running the opt client until the new scoring system was put in place for R@H.

BOINC (Opt) 5.5.0 - SSE2

6/20/2006 12:49:20 PM||Benchmark results:
6/20/2006 12:49:20 PM|| Number of CPUs: 2
6/20/2006 12:49:20 PM|| 3251 floating point MIPS (Whetstone) per CPU
6/20/2006 12:49:20 PM|| 6828 integer MIPS (Dhrystone) per CPU

Now running the standard BOINC client: 5.4.11

9/15/2006 8:41:59 AM||Benchmark results:
9/15/2006 8:41:59 AM|| Number of CPUs: 2
9/15/2006 8:41:59 AM|| 1663 floating point MIPS (Whetstone) per CPU
9/15/2006 8:41:59 AM|| 1986 integer MIPS (Dhrystone) per CPU

This score is way below my AMD 2400+ even though this box does more than double the amount of work in the same amount of time.
 

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
Hmm, that last score is really screwed up lol, it should be much higher ;)

Another thing I miss about SETI classic is how easy it was to benchmark...
 

Wiz

Diamond Member
Feb 5, 2000
6,459
16
81
Yeah, a P4 3.8GHz with 2MB cache ought to score higher than an XP2400+, but then again I guess the benchmark scores don't really mean anything at all, huh?
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,250
3,845
75
You might be interested in the Google search I did for Boinc benchmarks. One of the results (actually, two) is this thread! Another result is this thread on the TPR forums, though it's rather old.
 

Assimilator1

Elite Member
Nov 4, 1999
24,120
507
126
Yea I think I turned up another thread at TPR about benchmarking BOINC.

Wiz
It seems to be particularly out on your P4 though :confused:, at least there was some correlation with the other CPUs ;)


Any more P4 scores out there? (preferably on v5.4.x) Of course any scores are welcome :)
 

Bateluer

Lifer
Jun 23, 2001
27,730
8
0
Well, one of my new systems turned out to be a 2.4GHz Celeron instead of a 2.4GHz P4. :( The eBayer misrepresented this machine, but it should still serve my needs. I'm currently in negotiations with him to see if I can get some of the payment refunded though. I'll see how that plays out.

Here are the SETI BOINC benchmarks for client 5.4.11.

Celeron 2.4GHz (400MHz FSB, 512MB RAM)
1224 Float Whetstone
2316 Integer Dhrystone

I don't think BOINC's benchmarks are all that reliable though. I don't see this Celeron as an equal to many of the more powerful CPUs in integer performance.
 

teriba

Golden Member
Oct 7, 2001
1,130
0
0
CPU type AuthenticAMD
AMD Athlon(tm) 64 Processor 3500+
Number of CPUs 1
Operating System Microsoft Windows XP
Professional Edition, Service Pack 2, (05.01.2600.00)
Memory 1535.23 MB
Measured floating point speed 2093.46 million ops/sec
Measured integer speed 3882.36 million ops/sec