These days, a typical design will have the L2 shared between 2 or more CPU cores. Each core has its own L1 caches and a bus out to the shared L2. Physically, everything is smushed together on the die, but conceptually, since the L2 works with 2 or more cores, it isn't counted as totally part of...
I'm pretty sure that bug has been fixed for years. And I certainly wasn't seeing the problem in vectorized C code, at least in the example in this thread.
I didn't need to worry about any of these to get the C code to within 2x of the ASM code.
And are you seriously using half-precision FP...
Unless your assembler can automatically do inlining, loop unrolling, function specialization, whole-program optimization, and all of the other tricks modern compilers do, this is not generally going to be the case. Sure, I guess you can do this by hand, but are you going to manually re-do e.g...
The penalties vary depending on the chip you're running on. The compiler might not handle it well, but neither does the ASM - there's only a single code path rather than one ASM function tuned for each microarchitecture. Not sure how this is a point in favor of the ASM code.
-Ofast is pretty...
I just repeated that same test using a compiler released in this decade. The compiled code is now about 2x slower than pure ASM rather than 30x. Still a win for the ASM code I guess, if you exclude development time, the fact it doesn't actually do the same thing as the C code, automatically...
Sure, the same one I asked in the post you responded to - where can I find reviews of the performance of NV cards running at their guaranteed base clocks. All I can find anywhere are reviews of them running at max boost clocks, just like all the reviews of AMD cards.
In other words, guaranteed...
That's great. Where can I find reviews of Kepler GPUs running at their guaranteed advertised base clock rate? Everything I've seen has them running at their max variable boost clock speed, just like AMD cards.
+/-3% or so, based on the review sites I've seen.
How does this compare against similar nV cards? We don't know, because nV only paid to have AMD cards tested by TechReport - and presumably the other review sites as well.
Nope, for Titan you'd only run it in faster DP mode if the gain in DP performance outweighs the slowdown from clock throttling:
"The penalty for enabling full speed FP64 mode is that NVIDIA has to reduce clockspeeds to keep everything within spec. For our sample card this manifests itself as...
Compare this approach to the one other reviewers say Nvidia used to try to sway reviewers:
http://forums.anandtech.com/showpost.php?p=35714780&postcount=54
You'd have to admit that's quite a coincidence.
Nah, lots of it is coming directly from NV.
" When discussing its new product with us, Nvidia took some time to explain how the 780 Ti differs from the competition. Some of what they offered in this context was FUD about the variable performance of the 290X cards in the market."
"There's...
Where can I buy a 290 which defaults to a 34% fan limit, as tested here by Anandtech - http://www.anandtech.com/show/7481/the-amd-radeon-r9-290-review/15?
If there's now an "only test at defaults" policy it hasn't been around for that long.
This is a test where the 770 averaged 33 FPS, which is roughly 30 ms per frame on average. You'd expect that with the cutoff right around the average frame time, a number of frames would be over the average by some amount no matter how good the frame pacing was. Without knowing how many frames were...
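To make that statistics point concrete: with any jitter at all, a cutoff placed right around the average frame time will always catch a pile of frames, no matter how tight the pacing is. A quick toy simulation (all numbers hypothetical, not taken from the review):

```python
import random

random.seed(0)
avg_ms = 30.3  # hypothetical average frame time for a ~33 FPS run
# Simulate very well-paced frames: tiny gaussian jitter around the average.
frames = [random.gauss(avg_ms, 0.5) for _ in range(1000)]
# Count frames that land over a cutoff placed right at the average.
over = sum(1 for t in frames if t > avg_ms)
print(f"{over} of {len(frames)} frames over the {avg_ms} ms cutoff")
```

Even with near-perfect pacing, roughly half the frames land past the cutoff, so a raw "N frames over the threshold" count tells you nothing without knowing how far over they were.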
560ti, or 560ti 448 core? Those are two very different chips even though the names look similar. I think I remember [some?] 560ti 448s being on 570 or 580 PCBs, which might explain why the cooler would be the same.
Or it could just be a vendor cheaping out, never know.
Yep, CPI goes up if you increase core frequency without increasing memory frequency. This is how real systems work - memory access latency is independent of core clock speed. It would be great if we could magically get DRAM latency to improve at the same pace as CPU speedups, but that's never been...
5 seconds is 5 seconds. But 5 seconds isn't always the same number of cycles. 5 secs at 4GHz vs. 1GHz is a 4x cycle count difference (plus or minus synchronization artifacts). Since you are dividing by C in IPC, that larger C means a smaller IPC for a given workload. It's all in the original...
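To put rough numbers on it (all hypothetical, just to show the shape of the math): the memory stall time is fixed in seconds, so at a higher core clock it eats more cycles, and IPC = instructions / cycles drops even though wall-clock time improves.

```python
def ipc(instructions, cycles):
    """IPC = retired instructions divided by elapsed core cycles."""
    return instructions / cycles

instructions = 1_000_000_000     # fixed instruction count for the workload
compute_cycles = 500_000_000     # cycles spent actually executing (made up)
mem_stall_seconds = 2.0          # DRAM stall time; doesn't shrink with core clock

for freq_ghz in (1, 4):
    # The same wall-clock stall time costs 4x the cycles at 4x the clock.
    stall_cycles = mem_stall_seconds * freq_ghz * 1e9
    total_cycles = compute_cycles + stall_cycles
    print(f"{freq_ghz} GHz: IPC = {ipc(instructions, total_cycles):.3f}")
```

Same work, same memory, but the 4 GHz run divides by far more cycles, so its IPC comes out worse - exactly the C-in-the-denominator effect above.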
Probably not. Just like we don't have a published road-map of nVidia's driver team's skunkworks projects. If AMD's hiding stuff by not telling us everything their driver team is working on in detail, what's nVidia hiding?
Yeah, complain about people making the thread about you and then cherry pick a single line out of context to manufacture a post making it about my implied biases. That's consistent.
Unfortunately, not all of us get free hardware from these companies to test with. I'm just going by what various...
Yep. This happens in all fields. That's why the negative results from blind testing are interesting to me. Sure, something's going on with the measurement but as far as anyone can show, what's being measured is tough to perceive without someone telling you beforehand what you're supposed to...
Another option for under $200 : http://www.newegg.com/Product/Produc...82E16814127615. About GTX 570 levels of performance and isn't one of the new CUDA-crippled 6xx chips.
Look at http://techreport.com/review/24022/does-the-radeon-hd-7950-stumble-in-windows-8/9 and you can see that nothing much has changed from the 12.7 beta to the 12.11 beta drivers. There were some minor slowdowns, but without knowing what sort of run-to-run variation they get with their testing it's hard...
Nope, I'm not saying this, you are. It's just more idle speculation on your part.
Still waiting for your data to show that latency got worse with 12.8. These attempts at distractions are cute but transparent.
Which still says nothing about latency getting worse with 12.8 versus older...
You're speculating that 12.8 introduced higher latency compared to earlier drivers, which implies you have data from a driver before that to compare to. If you're admitting you don't have that data, it's cool, just come clean and let us know that it's pure speculation on your part.
And no...
The other thing that would be nice is a bit of blind or double-blind testing using the videos. Sure, it's really easy to tell that AMD's card is a stuttering mess when you know which video is AMD's. Science shows that it's a whole lot harder to pick up on these "obvious" observations when you...
Again, assuming it is even a problem with the driver and not in the way FRAPS happens to measure certain types of buffering schemes. For example, see how changing from single to double to triple buffering turns another card into a stutter-fest ...