If you read the article, this is hardware acceleration that has no relationship to the rest of the graphics processing whatsoever. It's a little added hard-wiring that only serves one purpose, the same as a light switch. It also uses the system RAM for virtual memory, so the manufacturers can leave out the extra memory and not raise costs significantly.
I've only been talking about the rest of graphics processing, though. My whole argument against being able to utilize higher resolutions boils down to this equation:
(3840 * 2160) / (1920 * 1080) = 4
A desktop Radeon HD 6670 is, in today's games, limited almost entirely by SP FP processing power, not by memory bandwidth. To keep the same performance and detail levels at 4K, it will need 4x the processing power. AMD GPUs already read only the part of a texture they need, so PRT won't help in any case where there is enough VRAM for all current textures. PRT, and its equivalents, should be able to make 1GB of VRAM act like 4+GB, and take care of most of the bottleneck of reading from main system RAM. It could make a scene that isn't renderable on current video cards renderable, by breaking the VRAM limitation, and take care of console fuzziness, but it won't help the chip itself out much once everything already fits in VRAM (like on a PC, w/ 1GB+ VRAM, in most games today). But a scene currently renderable at 1080p at a rate of 40FPS would only be able to run at around 10FPS in 4K.
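To put numbers on it, here's a trivial back-of-the-envelope sketch (my own, assuming frame time scales roughly linearly with shaded pixel count, i.e., a shader-bound GPU):

```cpp
#include <cstdio>

int main() {
    // Pixel counts per frame at 1080p and 4K (2160p).
    const double px_1080p = 1920.0 * 1080.0;   // 2,073,600 pixels
    const double px_4k    = 3840.0 * 2160.0;   // 8,294,400 pixels
    const double ratio    = px_4k / px_1080p;  // = 4.0

    // Assumption: the GPU is shader-bound, so frame time scales with pixels.
    const double fps_1080p = 40.0;
    const double fps_4k    = fps_1080p / ratio;  // ~10 FPS

    std::printf("pixel ratio: %.1fx, est. 4K frame rate: %.1f FPS\n", ratio, fps_4k);
    return 0;
}
```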
Again with the generic expressions of pessimism. You really should write poetry or do a comedy show.
<Zoidberg>Again with the crazy optimism.</Zoidberg>
It doesn't matter if GPU physics is only a few times faster. Again, it's the divide-and-conquer strategy. Divide up the tasks and do more things simultaneously. In an ideal world we'd all have 60 THz single-core processors to replace all this byzantine computer stuff, but this ain't an ideal world.
Ah, but it does matter that GPU physics is only a few times faster, as it implies that we are close enough to ideal. Being only that much faster means the GPU is not very efficient at doing it. We know that (a) Intel's x86 vector extensions have generally sucked compared to others out there (often even compared to AMD's x86 ones, and certainly to MIPS and PPC extensions and good NEON implementations), (b) Intel has already shown great work on improving vector performance with SB (there's bite to go with the bark), and (c) the AVX2 specs released thus far have everything needed to exceed most RISC-based extensions/coprocessors, if combined with known improvements that will occur through Haswell. I don't doubt that a 4-8x improvement over SSE2-accelerated code today is a more than reasonable expectation, and I fully expect AVX2 to be a lasting lowest-common-denominator spec, like SSE2 has become.
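For a rough picture of where that kind of per-core gain comes from, here's a minimal sketch (my own illustration, not from the article) of the same velocity-integration loop written for SSE2 and for AVX2-era hardware; the doubled register width plus fused multiply-add account for a large chunk of it, before counting any other Haswell improvements:

```cpp
#include <immintrin.h>
#include <cstddef>

// Semi-implicit Euler velocity update: v[i] += a[i] * dt, for n floats.
// n is assumed to be a multiple of 8 to keep the example short.

// SSE2 path: 4 single-precision lanes per instruction, separate mul + add.
void integrate_sse2(float* v, const float* a, float dt, std::size_t n) {
    const __m128 vdt = _mm_set1_ps(dt);
    for (std::size_t i = 0; i < n; i += 4) {
        __m128 vv = _mm_loadu_ps(v + i);
        __m128 va = _mm_loadu_ps(a + i);
        vv = _mm_add_ps(vv, _mm_mul_ps(va, vdt));
        _mm_storeu_ps(v + i, vv);
    }
}

// AVX2-era path (Haswell-class): 8 lanes and fused multiply-add.
// Build with -mavx2 -mfma; FMA is formally a separate extension that
// ships alongside AVX2 on Haswell.
void integrate_avx2(float* v, const float* a, float dt, std::size_t n) {
    const __m256 vdt = _mm256_set1_ps(dt);
    for (std::size_t i = 0; i < n; i += 8) {
        __m256 vv = _mm256_loadu_ps(v + i);
        __m256 va = _mm256_loadu_ps(a + i);
        vv = _mm256_fmadd_ps(va, vdt, vv);  // v += a * dt in one instruction
        _mm256_storeu_ps(v + i, vv);
    }
}
```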
It's not that we need some kind of strategy for the GPUs to do it. It's that GPUs being the thing to do physics with turned out to be mostly marketing by NVidia and pre-AMD ATI. Now physics will be able to go back where it belongs (and has been on the consoles): the CPU, enabling developers to arbitrarily optimize tightly coupled code, instead of having to try to implement highly decoupled, latency-insensitive methods, which are more of a PITA than tuning a few loops here and there.
Prior Intel vector extensions have been heavily grounded in minimal increases to their CPU complexity, to get more out of existing FPUs. This time, they're Doing It Right(tm), to the extent that they reasonably can.
Now, again, some things, like pathfinding, should be made to scale out just fine, and should not be too latency-sensitive, provided the pathfinding implementation is conceived with a coprocessor in mind (bolting a GPGPU implementation onto an existing engine is likely not reasonable). For that sort of thing, coming up with a single good set of code that works well enough on any hardware supporting that language (multiple optimizations are much more acceptable than multiple methods of computation) is much more of an issue, and will likely get worked out one way or another, over time.
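For what it's worth, here's a minimal sketch of what "conceived with a coprocessor in mind" usually amounts to (all names and types here are hypothetical, not from any particular engine): requests are batched and answered a frame or two later, so the solver can live on whatever hardware is available without the game logic caring about latency.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical types, for illustration only.
struct Vec2 { float x, y; };
struct PathResult { std::vector<Vec2> waypoints; };

// A latency-insensitive request: the caller hands over everything the
// solver needs plus a callback, then goes on with the current frame.
struct PathRequest {
    Vec2 start, goal;
    std::function<void(const PathResult&)> on_done;
};

class PathService {
public:
    // Called from game logic; never blocks on the solver.
    void submit(PathRequest req) { pending_.push(std::move(req)); }

    // Called once per frame (or per batch) by whatever owns the worker
    // threads or coprocessor; results arrive a frame or two late, and the
    // AI code is written to tolerate that.
    void drain(std::size_t budget) {
        while (budget-- && !pending_.empty()) {
            PathRequest req = std::move(pending_.front());
            pending_.pop();
            PathResult res = solve(req.start, req.goal);  // off the critical path
            req.on_done(res);
        }
    }

private:
    PathResult solve(Vec2 a, Vec2 b) {
        // Placeholder: a real implementation would run A* or similar here.
        return PathResult{{a, b}};
    }
    std::queue<PathRequest> pending_;
};
```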
There's that pessimism rearing its ugly head again. Certainly nothing I can say will soothe the savage beast. You need to go straight to the source and argue with HP. Good luck with that.
No arguing with HP, just waiting. HP has made more than a few bombs, and has introduced technology too soon before. I'm glad they are finally back to being an R&D company, and I don't for a second consider pushing technology they can produce to be a bad thing, nor do I doubt that memristors will be good, if they are finally ready for prime time. If it works well and is cheap enough, it will get at least some use, and progress from there. But, more often than not, that sort of thing is not an instant hit.
Nope. You need to read up on transactional memory. The hardware in this case decides all by itself which cores do what.
I have, and it doesn't. If the OS doesn't want a thread to run, it won't happen, or the thread at least runs the risk of being killed. The software being executed makes the important decisions, through its logic. There is no hardware engine giving a go-ahead to a thread that the OS hasn't allowed to run right now. On small-scale work, 10-100x slowdowns are not uncommon for STM, which is one of several problems hampering adoption. The hardware gets to decide a commit or abort on its own, but the HW is not making any high-level scheduling decisions. It executes code that makes those decisions, and that will be no different with HTM than without it, save for not having to make performance sacrifices to use transactional memory.

Dividing the work up is done by the compiler and/or programmer, and scheduling it is done by the compiler, and/or a runtime substrate (e.g., a JVM), and the OS. The program logic and OS still call for a thread to exist, execute, sleep, or die, just as without HTM. That hasn't changed. What HTM allows for is the CPU hardware detecting a potential conflict (a commit guarantees there was no conflict, but an abort does not guarantee there was one) and aborting eagerly, without using potentially more CPU time than your intended program logic.
HTM effectively creates speculative multithreading, but the threading part is done by much the same software means it has been for many years already, and that is not a bad thing. Software management overhead isn't that big, and it offers flexibility that set-in-stone hardware simply cannot. Software detecting or preventing conflicts (with Intel's HTM, software will still have to adjudicate the correctness of false conflicts in some cases), performing many function calls for actual work that measures in the tens of instructions, and making copies in heap space all the time, can simply ruin performance; HTM replaces most of that small-scale work.
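To illustrate that division of labor, here's a minimal sketch using Intel's published RTM intrinsics (assuming a TSX-capable CPU and a compiler exposing _xbegin/_xend/_xabort, built with -mrtm; the retry and fallback policy is entirely software, and the hardware only detects conflicts and aborts):

```cpp
#include <immintrin.h>   // _xbegin / _xend / _xabort
#include <atomic>

// The usual RTM pattern: try a hardware transaction, fall back to a plain
// lock if it aborts. All policy (retry, fall back) lives in software.

std::atomic<bool> fallback_locked{false};
long shared_counter = 0;

void locked_increment() {
    // Plain spinlock path for when the transaction doesn't go through.
    while (fallback_locked.exchange(true, std::memory_order_acquire)) { /* spin */ }
    shared_counter += 1;
    fallback_locked.store(false, std::memory_order_release);
}

void increment() {
    unsigned status = _xbegin();
    if (status == _XBEGIN_STARTED) {
        // Subscribe to the fallback lock: if someone holds it, abort so we
        // can't race with the non-transactional path.
        if (fallback_locked.load(std::memory_order_relaxed)) _xabort(0xff);
        shared_counter += 1;   // tracked by the cache-coherence hardware
        _xend();               // commit: guaranteed there was no conflict
        return;
    }
    // Abort: maybe a real conflict, maybe capacity or an interrupt; the
    // hardware doesn't say which side "should" win -- software decides.
    locked_increment();
}
```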
It chooses. That's the whole point: the hardware itself chooses if the programmer doesn't. Compilers and whatnot can help, but this way it goes straight to the source of the problem. They tried desperately to find some way to have the first multicore processors thread themselves, and theoretically it is possible, but nobody has a clue how to do it. The math is just too difficult. With transactional memory that's not the case, and they do know how it can be done, because supercomputers have been developing the technology in software for years.
No, humans have been developing the technology for years. Large clustered systems have mostly been used as testbeds and simulators, because on any single system you can only test up to so many threads, and can't sufficiently test high-level NUMA. There's a point where a mathematical proof that it should work is not good enough, and we will need transactional memory to efficiently develop future many-threaded applications (and, with HLE, extend current ones). Compilers and programmers implement STM, which then uses the HTM for the low-level grunt work, removing massive software overhead and leaving the software layer to manage the high-level work, which is the way it ought to be. All the CPU does is execute what it's told, when it's told, and, if an existing transaction could cause a conflict, fail in a defined way. There is no deciding to use the GPU or CPU, and there is no HW threading. The math for doing it purely in hardware, for all workloads, is still too complex, but it is not too complex to handle well in software, for a given workload.
Compile-time and run-time knowledge that is too difficult to narrow down to HW specs is still needed to get near-optimal performance and prevent pathologies (in addition, as with higher-level OS scheduling, updates to fix emergent problems are much easier than in HW). That compile-time and run-time information is enough to keep the HW from needing to do it on its own, just as other software scheduling has survived innumerable attempts over the years to replace it. Software scheduling is very good, to the point that it's not worth even trying to do it in HW for our CPUs. Let the CPU handle scheduling only at its knowledge level: pages, cache lines, branches, and program counters. HTM is implemented as a last-mile portion of an STM implementation, since pure software STM incurs too much overhead. Doing much more would likely hose up OS thread scheduling, as it would start getting in the way and screwing with resource usage that used to be a known quantity.
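As an aside, GCC's -fgnu-tm language extension is one way to see that layering in practice: the compiler and the libitm runtime own the transaction machinery and policy, and, depending on version and hardware, can use HTM as the fast path, while the program logic stays the same. A trivial sketch:

```cpp
// Build with: g++ -fgnu-tm example.cpp
// The compiler and libitm implement the transaction machinery; the program
// just states which block must be atomic and never touches the low-level
// commit/abort details.

#include <cstdio>

static long account_a = 100;
static long account_b = 0;

void transfer(long amount) {
    __transaction_atomic {
        account_a -= amount;
        account_b += amount;
    }
}

int main() {
    transfer(25);
    std::printf("a=%ld b=%ld\n", account_a, account_b);
    return 0;
}
```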
The first thing that needs to happen (IMO) in order to start increasing resolutions is the move to vector graphics. Otherwise it just won't happen. Vector graphics is the more feasible first step, so we're likely to see that first.
DX11 tessellation should allow us to approach that without getting rid of our tried-and-true polygons (assuming by vector you mean being able to describe things that aren't straight lines). How well it can in actuality, we'll just have to see (that is, in a game with models that use tessellation as a means of expressing curvature, where we currently still get visible straight sections). That, I am quite hopeful for, though I don't expect to see too much until a DX11 GPU is actually in an Xbox, since high-end PC video cards will be able to brute-force it for the foreseeable future, and the publishers with big budgets seem to like that kind of approach better, on average. Games that don't go for the faux-realistic look might be able to use such a feature to get substantial added detail, and good performance, out of otherwise-average GPUs. For an extreme example, imagine a Rayman game with what appear to be true curved surfaces, and practically no UV maps. I'd love to see that way more than the next soldier in a jungle, desert, or post-apocalyptic Midwest-looking place (unless it's the next Fallout, of course).
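To make the curved-surface idea concrete, here's a toy CPU-side sketch (plain C++, not HLSL, and not how any particular engine or the DX11 pipeline actually expresses it): the asset stores control points for a curve, and the number of straight segments actually emitted is chosen at runtime to suit the GPU, so the model itself never bakes in visible straight sections.

```cpp
#include <vector>

struct Vec2 { float x, y; };

// Evaluate a cubic Bezier curve at parameter t in [0, 1].
Vec2 cubic_bezier(const Vec2 p[4], float t) {
    float u  = 1.0f - t;
    float b0 = u * u * u, b1 = 3 * u * u * t, b2 = 3 * u * t * t, b3 = t * t * t;
    return { b0 * p[0].x + b1 * p[1].x + b2 * p[2].x + b3 * p[3].x,
             b0 * p[0].y + b1 * p[1].y + b2 * p[2].y + b3 * p[3].y };
}

// The same curved edge can be emitted at 4 segments on a weak GPU or 64 on
// a strong one; the stored asset is the four control points, not the strips.
std::vector<Vec2> tessellate_edge(const Vec2 p[4], int segments) {
    std::vector<Vec2> out;
    for (int i = 0; i <= segments; ++i)
        out.push_back(cubic_bezier(p, static_cast<float>(i) / segments));
    return out;
}
```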