Some of you might have read this thread in the CPU forum:
http://forums.anandtech.com/showthread.php?t=2433693
I also solicited testers on OCN's Intel forum and got one guy with a 4770K to help me by testing the software on his machine:
http://www.overclock.net/t/1563664/please-help-me-make-this-software-more-intel-friendly/0_100
Long story short: the software I wrote works just fine and dandy on my A10-7700K, showing a notable improvement over the program it was intended to emulate - Dr. Cutress' 3DPM (Stage 1 only). It runs rather poorly on Enigmoid's i7-3630QM compared to 3DPM, which I still cannot explain.
On top of that, it is slightly slower than 3DPM on an i7-4770K with HT on, but faster than 3DPM with HT off. Dr. Cutress' 3DPM Stage 1 gains ~67% on the 4770K from HT, while my 3DPMRedux only gains ~20% from HT. And that's on Haswell . . . I haven't seen how HT affects things on Ivy Bridge.
The source code for 3DPM (Stage 1 or otherwise) is not available in its entirety, but the threads above have a link to the source for the latest build of 3DPMRedux. Or heck, I'll just link it here:
https://www.dropbox.com/s/enz9kz2u8up8v2x/3DPMReduxSource6222015.zip?dl=0
So can anyone here think of why HT is having such a muted effect on 3DPMRedux? If I had to guess why it's helping 3DPM Stage 1 so much, it's that there are probably some pipeline stalls leaving open a lot of execution resources for the extra logical processors to use on their assigned threads. It is also possible that the Java version is experiencing fewer pipeline stalls, but is (overall) running more slowly thanks to Java-inflicted overhead. It's close, darn close, but it isn't quite "there" yet.
There is also the possibility that the way I've set up my thread pool is slowing things down for HT, but I'm not really sure why that might be.
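If anyone wants to poke at the thread pool angle without digging through the whole source, here is a minimal sketch of how the scaling could be measured. To be clear, this is not the 3DPMRedux code: the class name `HtScalingSketch`, the `work` kernel, and the task counts are all made up for illustration. It runs the same batch of compute-bound tasks on a fixed pool sized at half the logical processors and then at all of them, and reports the throughput gain. One caveat: with HT left on, the OS may still schedule the "half" run onto sibling logical processors, so this only approximates a real HT-off run (toggling HT in the BIOS is the cleaner test).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class HtScalingSketch {
    // Hypothetical stand-in for a particle-movement kernel:
    // pure floating-point work with no shared state between tasks.
    static double work(long seed, int steps) {
        double x = 0.0, y = 0.0, z = 0.0;
        long s = seed;
        for (int i = 0; i < steps; i++) {
            s = s * 6364136223846793005L + 1442695040888963407L; // simple LCG step
            double theta = (s >>> 11) * (2.0 * Math.PI / (1L << 53)); // angle in [0, 2*pi)
            s = s * 6364136223846793005L + 1442695040888963407L;
            double u = (s >>> 11) * (2.0 / (1L << 53)) - 1.0;         // height in [-1, 1)
            double r = Math.sqrt(1.0 - u * u);
            x += r * Math.cos(theta);
            y += r * Math.sin(theta);
            z += u;
        }
        return x + y + z;
    }

    // Runs totalTasks identical-size tasks on a fixed pool and returns elapsed nanoseconds.
    static long timeRun(int threads, int totalTasks, int stepsPerTask) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Double>> tasks = new ArrayList<>();
            for (int t = 0; t < totalTasks; t++) {
                final long seed = t + 1;
                tasks.add(() -> work(seed, stepsPerTask));
            }
            long start = System.nanoTime();
            double sink = 0.0;
            for (Future<Double> f : pool.invokeAll(tasks)) {
                sink += f.get(); // consume results so the JIT cannot discard the work
            }
            long elapsed = System.nanoTime() - start;
            if (Double.isNaN(sink)) {
                throw new IllegalStateException("unexpected NaN");
            }
            return elapsed;
        } finally {
            pool.shutdown();
        }
    }

    // Throughput gain in percent when the same work takes newNanos instead of baseNanos.
    static double gainPercent(long baseNanos, long newNanos) {
        return (baseNanos / (double) newNanos - 1.0) * 100.0;
    }

    public static void main(String[] args) throws Exception {
        int logical = Runtime.getRuntime().availableProcessors();
        int half = Math.max(1, logical / 2);
        timeRun(logical, 64, 200_000); // warm-up pass so the JIT compiles the kernel first
        long tHalf = timeRun(half, 64, 2_000_000);
        long tFull = timeRun(logical, 64, 2_000_000);
        System.out.printf("%d threads: %.1f ms, %d threads: %.1f ms, gain: %.1f%%%n",
                half, tHalf / 1e6, logical, tFull / 1e6, gainPercent(tHalf, tFull));
    }
}
```

If a kernel like this shows a healthy gain from the extra logical processors while the real benchmark does not, that would point at the pool setup or task granularity rather than at Java overhead in general. If both scale poorly, the stall explanation looks more likely.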