PCmark2004 oddities!!!

Duvie

Elite Member
Feb 5, 2001
16,215
0
71
Funny, here is my score for a 3509MHz OC on a 2.4C (292 FSB) w/ 3:2, CAS 2,3,2,7, GAT auto...

System Test Suite
Multithreaded Test 1 / File Compression 6.231 MB/s
Multithreaded Test 1 / File Encryption 58.587 MB/s
Multithreaded Test 2 / File Decompression 42.314 MB/s
Multithreaded Test 2 / Image Processing 16.433 MPixels/s
Multithreaded Test 3 / Virus Scanning 3274.828 MB/s
Multithreaded Test 3 / Grammar Check 2.688 KB/s
File Decryption 101.696 MB/s
Audio Conversion 3090.788 KB/s
Web Page Rendering 6.810 Pages/s
WMV Video Compression 63.778 FPS
DivX Video Compression 70.704 FPS
Physics Calculation and 3D 139.993 FPS
Graphics Memory - 64 Lines 944.526 FPS


Simple enough....My last 2 numbers can be disregarded since I run an 8500 AIW card with optimal video settings....


Now check out these guys.....


P4 2.8 @ 4088MHz w/ 3:2

P4 2.8 @ 3.5GHz, 250 FSB w/ 5:4


The funny thing is that in the non-multithreaded items I am ahead of the 3.5GHz guy and naturally behind the 4GHz guy in some things, but I kill both of them in the WMV and DivX encoding....

However, look at the multithreaded tests and they kill me...Then I notice they say disabling HT gives them a better score...First I say to myself, "oh, those dumb-ass Futuremark guys are too lame to implement it right"...I mean c'mon, I ran tests of HT on several apps and never ran into one where HT was a disadvantage....

So I disable HT in the BIOS, restart and run again, and my scores are naturally worse in the multithreaded apps, as they should be, but now I get numbers closer to theirs in the DivX and WMV encoding....

What gives here??? Same build and revision.....I have a Raptor freshly installed and defragged, so I have basically an equal system, if not faster than the 2nd guy...The 1st guy has a RAID setup so naturally he is faster...


To me this seems weird....Those items don't seem to be I/O dependent and should have been basically CPU driven, and HT should have been enabled to get the best scores...That was true in my case but not in theirs....

Was there a fix that corrected something???
 

Duvie

OK, I know what they did....they set the priority to realtime, which I have no issue with....However, when I ran the bench with HT on I got basically the same numbers as before, since I don't have a lot of crap running in the background like many others....


Now I disable HT in the BIOS and guess what.....About 1000 more points in the test and I have numbers like the other guys....

System Test Suite
Multithreaded Test 1 / File Compression 8.622 MB/s
Multithreaded Test 1 / File Encryption 91.514 MB/s
Multithreaded Test 2 / File Decompression 60.839 MB/s
Multithreaded Test 2 / Image Processing 29.154 MPixels/s
Multithreaded Test 3 / Virus Scanning 3703.234 MB/s
Multithreaded Test 3 / Grammar Check 5.384 KB/s
File Decryption 102.277 MB/s
Audio Conversion 3007.937 KB/s
Web Page Rendering 6.241 Pages/s
WMV Video Compression 54.077 FPS
DivX Video Compression 63.436 FPS
Physics Calculation and 3D 110.152 FPS
Graphics Memory - 64 Lines 941.965 FPS


Conclusions I draw from this....

The actual benchmark takes longer to run in the multithreaded sections, which makes me think it is actually running the tests one at a time, and that is how they get better scores in each phase...

Also, when you look at the other single-app tests, with HT on I get similar to slightly better numbers, and clearly HT is helping in single apps like WMV and DivX....
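For anyone who wants to line the two runs up, here is a minimal Python sketch that computes the per-test change going from the first run (HT on) to the second run (HT off, realtime priority). The numbers are copied from the two result lists above; the dictionary names and the `percent_change` helper are made up for illustration.

```python
# Scores copied from the two runs posted above (units are the same per test).
ht_on = {
    "Multithreaded Test 1 / File Compression": 6.231,
    "Multithreaded Test 1 / File Encryption": 58.587,
    "Multithreaded Test 2 / File Decompression": 42.314,
    "Multithreaded Test 2 / Image Processing": 16.433,
    "Multithreaded Test 3 / Virus Scanning": 3274.828,
    "Multithreaded Test 3 / Grammar Check": 2.688,
    "File Decryption": 101.696,
    "Audio Conversion": 3090.788,
    "Web Page Rendering": 6.810,
    "WMV Video Compression": 63.778,
    "DivX Video Compression": 70.704,
    "Physics Calculation and 3D": 139.993,
    "Graphics Memory - 64 Lines": 944.526,
}
ht_off = {
    "Multithreaded Test 1 / File Compression": 8.622,
    "Multithreaded Test 1 / File Encryption": 91.514,
    "Multithreaded Test 2 / File Decompression": 60.839,
    "Multithreaded Test 2 / Image Processing": 29.154,
    "Multithreaded Test 3 / Virus Scanning": 3703.234,
    "Multithreaded Test 3 / Grammar Check": 5.384,
    "File Decryption": 102.277,
    "Audio Conversion": 3007.937,
    "Web Page Rendering": 6.241,
    "WMV Video Compression": 54.077,
    "DivX Video Compression": 63.436,
    "Physics Calculation and 3D": 110.152,
    "Graphics Memory - 64 Lines": 941.965,
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Positive means the HT-off run scored higher than the HT-on run.
changes = {name: percent_change(ht_on[name], ht_off[name]) for name in ht_on}

for name, pct in changes.items():
    print(f"{name:<44} {pct:+7.1f}%")
```

The multithreaded tests come out well ahead with HT off, while WMV and DivX encoding drop, which is the odd pattern described above.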


Final conclusion is Futuremark has a real POS here!!! Pretty lame....I guess this is why I regularly avoid this synthetic crap and the morons who write these things....
 

THUGSROOK

Elite Member
Feb 3, 2001
11,847
0
0
Originally posted by: Duvie
...conclusion is Futuremark has a real POS here!!! Pretty lame....I guess this is why I regularly avoid this synthetic crap and the morons who write these things....
;)
 

Duvie

Thanks for the wordy post Thugs!!!!;)

I looked at a lot of Athlon XP and Barton scores and they seem to be right about where you would expect them to fall versus my score...A little low on the multithreaded apps, and slower than my 3.5GHz in single apps, but at least in the ballpark...

If you go off of the skewed non-HT realtime-priority benchmark, they get flat doubled up to tripled on the multithreaded apps and are still a bit behind or equal in the other single apps....

I have tested HT and there is no way it gives a 200-300% increase over a Barton chip when multitasking 2 apps...Not happening
 

RussianSensation

Elite Member
Sep 5, 2003
19,458
765
126
Originally posted by: Duvie
OK, I know what they did....they set the priority to realtime, which I have no issue with....However, when I ran the bench with HT on I got basically the same numbers as before, since I don't have a lot of crap running in the background like many others....

How do you set the priority to realtime?????????? Is this setting in the Pro version of the benchmark or where? Because with my 3000MHz I get 4400 points and these guys get 6800 - 7500 with only 3500 or 4000MHz

Theoretically, shouldn't they get only 16.67% +/- 2% and 33.33% +/- 2-4% more?
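The percentages above are just clock-speed ratios. A quick Python sketch of that arithmetic, using the clocks and scores quoted in this post; pairing 6800 with 3500MHz and 7500 with 4000MHz is an assumption, since the post does not say which score belongs to which system.

```python
def gain_pct(base, other):
    """Percentage by which `other` exceeds `base`."""
    return (other / base - 1.0) * 100.0


base_mhz, base_score = 3000, 4400
# (clock in MHz, PCMark04 score); the pairing is assumed, see above.
others = [(3500, 6800), (4000, 7500)]

rows = []
for mhz, score in others:
    rows.append((mhz, gain_pct(base_mhz, mhz), gain_pct(base_score, score)))

for mhz, clock_gain, score_gain in rows:
    print(f"{mhz} MHz: clock +{clock_gain:.2f}%, score +{score_gain:.2f}%")
```

If the score scaled with clock speed alone, the two columns would roughly match; here the score gain is several times the clock gain.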

hmmm.............or is their score so high due to them having faster hard drives? I thought the PCMark04 total bench test in the non-Pro version is more CPU and memory intensive, plus 2 video card tests. Duvie, do you know which tests in this stress the hard drive, if at all?

Thanx
 

Duvie

You set priority by going into Task Manager (hit Ctrl-Alt-Del once)...then into the Processes tab...there highlight the pcmark2004.exe file and right click...set priority to Realtime...
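As an alternative to clicking through Task Manager, the Windows `start` command can launch a program at a given priority class. The Python sketch below only builds and prints such a command line; it has not been tried against PCMark04 itself, the helper name is made up, and the executable name is simply the one mentioned above.

```python
def realtime_start_command(exe_path):
    """Build a cmd.exe command line that starts exe_path at realtime priority."""
    # The empty "" is the window title that `start` expects when the
    # program path itself is wrapped in quotes.
    return f'start "" /REALTIME /WAIT "{exe_path}"'


command = realtime_start_command("pcmark2004.exe")
print(command)
```

Paste the printed line into a command prompt opened in the benchmark's install folder. Realtime can leave the rest of the system unresponsive while the benchmark runs; `/HIGH` is a milder switch.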

It really doesn't make much difference unless someone has antivirus software or SETI running....I saw virtually no change other than a non-responsive system while it was running...It also doesn't make a difference because the software has a major glitch that allows majorly erroneous scores....



I am not sure which ones do, but if you look at the HDD-specific tests, none of these tests or anything similar are in there...However, many of them are run under the CPU tests....



I hear ya Jeff....I think the Barton score may be a little low, but I get a 4900 average with my lame vid card, and mainly that is the advantage of HT, I believe...
 

Technonut

Diamond Member
Mar 19, 2000
4,041
0
0
Originally posted by: Duvie
I hear ya Jeff....I think the Barton score may be a little low, but I get a 4900 average with my lame vid card, and mainly that is the advantage of HT, I believe...
I just received yet another 2.4C M0 from ZZF, and am currently testing @ 3.36GHz. My score came in @ 5306. I also find it interesting that even though you are @ 3.5GHz, my results are fairly close...

Multithreaded Test 1 / File Compression 6.038 MB/s
Multithreaded Test 1 / File Encryption 55.316 MB/s
Multithreaded Test 2 / File Decompression 40.293 MB/s
Multithreaded Test 2 / Image Processing 15.532 MPixels/s
Multithreaded Test 3 / Virus Scanning 2817.949 MB/s
Multithreaded Test 3 / Grammar Check 2.530 KB/s
File Decryption 89.435 MB/s
Audio Conversion 2965.559 KB/s
Web Page Rendering 6.477 Pages/s
WMV Video Compression 61.218 FPS
DivX Video Compression 68.898 FPS
Physics Calculation and 3D 197.619 FPS
Graphics Memory - 64 Lines 2747.613 FPS

After I am through testing this processor, I will run the test again with my "keeper" 2.4C M0 @ 3.42GHz. I am running my RAM @ 5:4 / tight timings with this processor, and also with the other @ 3.42GHz.

EDIT: Duvie: If you have the time, drop down to 3.36GHz / 5:4 / 2-5-3-2 , run the benchmark, and post the results.. I am just curious...
 

Jeff7181

Lifer
Aug 21, 2002
18,368
11
81
I set mine back to 2.2 GHz to give an idea of how an XP3200 compares...

PCMark Score: 3939

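Multithreaded Test 1 / File Compression: 3.116 MB/s
Multithreaded Test 1 / File Encryption: 34.247 MB/s
Multithreaded Test 2 / File Decompression: 23.799 MB/s
Multithreaded Test 2 / Image Processing: 11.844 MPixels/s
Multithreaded Test 3 / Virus Scanning: 1687.441 MB/s
Multithreaded Test 3 / Grammar Check: 2.863 KB/s
File Decryption: 67.537 MB/s
Audio Conversion: 2581.758 KB/s
Web Page Rendering: 4.959 Pages/s
WMV Video Compression: 46.858 FPS
DivX Video Compression: 53.896 FPS
Physics Calculation and 3D: 164.17 FPS
Graphics Memory - 64 Lines: 2344.045 FPS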

Is it just me, or does the test seem a tad biased? I'm not just sore because Intel processors do much better... but doesn't it seem odd that 90% of the tests are in areas that we all know P4s do well in? Multi-threaded apps, media encoding, and bandwidth-dependent apps.

*EDIT* Did Futuremark pick these tests because that's where they think the future of computers is? Multi-threading and media encoding?
 

RussianSensation

Thanx for the explanation Duvie. I've set the benchmark to real time priority and disabled Hyper-Threading. The result is my score went up from 4437 to 5370 with 2600@3055.

PCMark04 -- 5370 with 3055mhz p4

What conclusions can I make based on this?

1. It is fairly obvious that this benchmark is not able to represent real-world performance accurately, as HT generally brings a 5-20% benefit and in this case disabling it gives an improvement of 21%!!!!?

2. When set to realtime priority my system was running at 100% load and I was unable to even move my mouse, which makes this pointless, as no one really dedicates 100% of CPU power to one program (well, you could). Most people tend to run things in a multitasking environment, so that would mean having the internet open, playing music, videogames, running SETI, Folding and so on....therefore a score which reflects the CPU's dedication to one program is useless.

3. If this score can be manipulated in such a way that it falsifies reality, then its validity and intentions are undermined, and as you guys say, I am now starting to question the latest benchmarks by Futuremark. Having said that, it is becoming hard to rely on the old test suites, which favoured one platform over the other, and 3DMark01 as well as PCMark02 are not up to date enough to give a definitive look at current applications.

In the end I think we should all just go to benchmarking things that really matter. Our own gaming performance, encoding and so on.
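To put a number on point 1: if enabling HT is worth +5% to +20%, then turning it off should cost roughly 5% to 17%, not add 21%. A small Python sketch of that arithmetic, using the two scores from this post. Note that priority and HT were changed together here, so the sketch cannot say how much each one contributed; the variable names are made up for illustration.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after / before - 1.0) * 100.0


score_ht_on_normal = 4437      # HT on, normal priority
score_ht_off_realtime = 5370   # HT off, realtime priority
observed = pct_change(score_ht_on_normal, score_ht_off_realtime)

# If HT adds g percent, removing it scales a score by 1 / (1 + g/100).
ht_gain_range = (5.0, 20.0)
expected = [pct_change(1.0 + g / 100.0, 1.0) for g in ht_gain_range]

print(f"observed change with HT off: {observed:+.1f}%")
print(f"expected change with HT off: {expected[0]:+.1f}% to {expected[1]:+.1f}%")
```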
 

Duvie

Technonut....the reason your overall score is better but your individual scores are not is because you double me up on the vid card, and that counts for about 2 tests...Also, the RAM I have cannot run 5:4 right now but can run 2,6,3,2, CPC enabled and GAT auto with 3:2...So that is where you get closer than just the percentage difference of near 200MHz would suggest.....You are running higher-speed DDR with still excellent timings...I would guess at 3.42GHz you tie or beat me in most all apps if you can still run 5:4....
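A rough way to check this is to divide the two posted result lists test by test and compare against the clock ratio. The Python sketch below uses the 3509MHz HT-on run from the first post and Technonut's 3.36GHz run; the variable names are made up for illustration.

```python
# (3509MHz run, 3.36GHz run) for each test, copied from the posts above.
scores = {
    "File Compression": (6.231, 6.038),
    "File Encryption": (58.587, 55.316),
    "File Decompression": (42.314, 40.293),
    "Image Processing": (16.433, 15.532),
    "Virus Scanning": (3274.828, 2817.949),
    "Grammar Check": (2.688, 2.530),
    "File Decryption": (101.696, 89.435),
    "Audio Conversion": (3090.788, 2965.559),
    "Web Page Rendering": (6.810, 6.477),
    "WMV Video Compression": (63.778, 61.218),
    "DivX Video Compression": (70.704, 68.898),
    "Physics Calculation and 3D": (139.993, 197.619),
    "Graphics Memory - 64 Lines": (944.526, 2747.613),
}

clock_ratio = 3509 / 3360  # ratio of the two CPU clocks in MHz

# Ratio above 1.0 means the 3509MHz system scored higher on that test.
ratios = {name: fast / slow for name, (fast, slow) in scores.items()}
slower_clock_ahead = [name for name, r in ratios.items() if r < 1.0]

print(f"clock ratio: {clock_ratio:.3f}")
for name, r in ratios.items():
    print(f"{name:<28} {r:.3f}")
print("3.36GHz system ahead in:", slower_clock_ahead)
```

Only the two graphics tests favor the 3.36GHz system, which fits the video card explanation above.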



Jeff...I do hear ya....
 

Duvie

Originally posted by: RussianSensation
Thanx for the explanation Duvie. I've set the benchmark to real time priority and disabled Hyper-Threading. The result is my score went up from 4437 to 5370 with 2600@3055.

PCMark04 -- 5370 with 3055mhz p4

What conclusions can I make based on this?

1. It is fairly obvious that this benchmark is not able to represent real-world performance accurately, as HT generally brings a 5-20% benefit and in this case disabling it gives an improvement of 21%!!!!?

2. When set to realtime priority my system was running at 100% load and I was unable to even move my mouse, which makes this pointless, as no one really dedicates 100% of CPU power to one program (well, you could). Most people tend to run things in a multitasking environment, so that would mean having the internet open, playing music, videogames, running SETI, Folding and so on....therefore a score which reflects the CPU's dedication to one program is useless.

3. If this score can be manipulated in such a way that it falsifies reality, then its validity and intentions are undermined, and as you guys say, I am now starting to question the latest benchmarks by Futuremark. Having said that, it is becoming hard to rely on the old test suites, which favoured one platform over the other, and 3DMark01 as well as PCMark02 are not up to date enough to give a definitive look at current applications.

In the end I think we should all just go to benchmarking things that really matter. Our own gaming performance, encoding and so on.



exactly....

I would love to set up a suite of real-world tests that we can burn on a DVD-RW or something, tests we can run and keep standardized....

Here are a couple of thoughts....


Encoding to DivX, say a trailer of a movie....

Encoding an AVI file to MPEG using TMPGEnc (good HT test)

POV-Ray 3.5 (has a standardized benchmark)

UT2003 demo

SuperPI 32M

BeSweet or LAME encoding of, say, the soundtrack of the trailer above
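A suite like this mostly needs a small wrapper that runs each item a few times and records wall-clock time. Below is a minimal Python sketch of such a harness. Everything in it is hypothetical: the function names are made up, and the only test in the suite is a placeholder busy loop, since the actual encoder command lines are not given here.

```python
import statistics
import subprocess
import time


def time_test(fn, runs=3):
    """Run fn() `runs` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def command_test(argv):
    """Wrap an external command (list of arguments) as a callable test."""
    def run():
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return run


# Placeholder suite. Real entries would use command_test([...]) with the
# encoder or benchmark command line agreed on for the standardized disc.
suite = {
    "placeholder busy loop": lambda: sum(i * i for i in range(200_000)),
}

results = {name: time_test(fn) for name, fn in suite.items()}
for name, seconds in results.items():
    print(f"{name:<28} {seconds:.3f} s")
```

Using the median of a few runs keeps one stray background task from skewing a result, and fixed input files on the disc keep runs comparable between machines.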


Any other ideas???
 

Jeff7181

My conclusion is that all Futuremark programs have lost all respect as non-biased benchmarking programs. 3DMark with the nVidia driver fiasco and now PCMark with this crap. Just what is PCMark2004 supposed to be showing? How good the P4 is with multimedia applications and how useful its memory bandwidth is? Even PCMark2002 has some major flaws... first... upgrading my video card put my memory score in the 6500's... why would the video card memory be included in that test, isn't that what 3DMark is for? Increasing my FSB does wonders for the hard drive score... which is a load of crap since it does absolutely nothing for real world performance as far as the hard drive goes. I wish a large website/publication would slam Futuremark and tell it how it is... their applications are ONLY useful when making upgrades to your OWN computer to see if the change has made a positive difference. As far as comparing between computers... good luck... my "Duron @ 2250 MHz", which is running on a 200 MHz FSB with a multiplier of 11.5, scores lower than an Athlon @ 1400 MHz... geez, guess I should have kept my 1.4 GHz T-Bird after all.

*EDIT* Oh yeah... </rant> :D
 

MDE

Lifer
Jul 17, 2003
13,199
1
81
lol... Couldn't put it any better myself. Someone wake Anand up! :)
 

Duvie

I agree with you Jeff...I actually think the basic test suite should not include the multithreaded apps, or just 1....Let the multithreaded apps be tested in the more specific CPU tests of the Pro version...It is obvious it doesn't work anyway, from the scores mentioned above...


HT can show its prowess in single apps, as no doubt it is the reason I get a much higher DivX score, since the latest DivX codecs show a 5-10% increase with HT. Leave the multitasking out of this bench IMO.....
 

Jeff7181

Multi-threading has its place just like anything else... it's just not as big a part of everyday computing, or even power computing, as Futuremark suggests with their 6 multi-threaded tests right off the bat.

Hey, I wonder if we could convince people that the P4 is doing application detection just like nVidia does... 'cause even when the P4's scores should drop (HT off) they increase. Intel is the devil! Intel cheats! Intel's hardware is flawed because it doesn't use the core to its maximum efficiency so they had to force it to with Hyper-Threading! Let's have a "Floater Day" and expose Intel's weakness at floating point calculations compared to AMD's. Then let's have a "Developer Day" for Intel to showcase Hyper-Threading optimized applications and SSE2 optimized applications and vaguely allude to better floating point performance from the Prescott.

Man... I think I liked it better when I wasn't aware of these retarded conspiracies in the computer industry.
 

Jeff7181

If Futuremark wants to do it right they should tailor their tests to fit the core designs better... use very small chunks of data for the CPU test so that only the L1 cache is used... then use larger chunks that fill the L2 cache but don't spill over into RAM to test the cache performance. Then use HUGE chunks of data that get progressively larger so people can see how much of a performance penalty they're taking for having too little RAM and forcing it to use the swap file. Then for the hard drive, use a data set that would be the same on every computer... for example, compress, decompress, encode, decode a cab file off the Windows XP CD that's the same on the Home as on the Pro, and the same on the original version as SP1. They should include tests that are ONLY integer calculations, and some that are ONLY floating point, then a mix of the two.
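The "progressively larger chunks" idea can be sketched in a few lines of Python. This is only an illustration of the sweep structure: the buffer sizes are arbitrary placeholders rather than real L1/L2 sizes for any CPU, the function name is made up, and interpreter overhead will blur cache effects that a C or assembly version would show much more clearly.

```python
import time
from array import array

# Working-set sizes in bytes; placeholders, not tuned to any cache size.
SIZES = [4 * 1024, 64 * 1024, 1024 * 1024, 8 * 1024 * 1024]


def integer_sum_throughput(size_bytes, total_bytes=8 * 1024 * 1024):
    """Sum a buffer of 64-bit integers repeatedly; return MB/s processed."""
    data = array("q", range(size_bytes // 8))
    # Repeat small buffers so every size touches about the same amount of data.
    passes = max(1, total_bytes // size_bytes)
    checksum = 0
    start = time.perf_counter()
    for _ in range(passes):
        checksum += sum(data)
    elapsed = time.perf_counter() - start
    return (len(data) * data.itemsize * passes) / elapsed / 1e6


results = {size: integer_sum_throughput(size) for size in SIZES}
for size, mb_s in results.items():
    print(f"{size // 1024:>6} KB working set: {mb_s:10.1f} MB/s")
```

This covers only the integer-only case from the wish list; a floating-point variant could swap the `"q"` typecode for `"d"`.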

Any programmers out there wanna get together and create a REAL benchmark? Futuremark obviously doesn't know what a benchmark should be anymore.
 

Duvie

Originally posted by: Jeff7181
Multi-threading has its place just like anything else... it's just not as big a part of everyday computing, or even power computing, as Futuremark suggests with their 6 multi-threaded tests right off the bat.

Hey, I wonder if we could convince people that the P4 is doing application detection just like nVidia does... 'cause even when the P4's scores should drop (HT off) they increase. Intel is the devil! Intel cheats! Intel's hardware is flawed because it doesn't use the core to its maximum efficiency so they had to force it to with Hyper-Threading! Let's have a "Floater Day" and expose Intel's weakness at floating point calculations compared to AMD's. Then let's have a "Developer Day" for Intel to showcase Hyper-Threading optimized applications and SSE2 optimized applications and vaguely allude to better floating point performance from the Prescott.

Man... I think I liked it better when I wasn't aware of these retarded conspiracies in the computer industry.

WoooooH settle down Beavis!!! ;)

I don't think Intel would do that, considering it makes HT look bad...Why would you want to have better scores with it disabled?..If that is application detection then they need to fire a whole bunch of ppl....I think the tests are in the ballpark in my testing at normal priority with HT on....The percentage increases are right there with some of the stuff I have seen in my multitasking testing....So I think it is real world, and I don't need Futuremark to tell me that....


I would love to see ppl write a better suite of tests that is less biased but also thorough, so someone gets a good understanding of how a platform, whether it be AMD or Intel, can benefit them in their intended use....
 

Duvie

Originally posted by: Jeff7181
Duvie... Floater Day was a joke =)

I know, I know!!! I was talking about application detection like the Nvidia/Futuremark cheats....
 

THUGSROOK

"system" benchmarks have been a serious problem all year, and theyre not getting any better.
gaming benchmarks have improved greatly, but that doesnt include 3dmock03. (not a typo)

its unfortunate, but we are left with creating our own "system" type of benchmarks with the programs we use most. (if possible)
 

DAPUNISHER

Super Moderator CPU Forum Mod and Elite Member
Super Moderator
Aug 22, 2001
31,737
31,674
146
That's what I'm talking about. Fortunately even many reviewers are eschewing 3D&PCmocks, SiSuck Sandra, and game benchies that can be manipulated to give the appearance of better performance. More needs to be done but the reputable ones are making progress. Duvie's hard work with HT is much appreciated as it shows detailed data on a feature barely mentioned in mainstream reviews. Anywho, we had this discussion almost a month ago @ nForcersHQ and my conclusion was the same: the new PCmock is even more useless than the previous versions.

The fact that people still use benchmarks to establish performance is lamentable too. I was trying to help a member here with what he thought were performance issues with his system. He went from a 1700+ to a P4 2.8C and was under the impression the 1700+ was faster for gaming because the 3DMock2k1 score was higher with the 1700+ system than with the P4. Then he concluded there was a bottleneck somewhere because of the lower score and began extensively questioning others as to what to do to find it, all based on the 3DMock performance. Never did he consider that all the tweaks used to increase 3DMock scores are worthless, since many of them lower IQ, and with a 9800 Pro and a P4 2.8C, image quality is the last thing he should be messing with, especially @ 1024x768 in games out right now ;)