TPU: many combinations of GPU/resolution/RTX/DLSS not allowed

darkswordsman17 · Feb 20, 2019

Despoiler said:
Back in the day Tensor and RT cores would have just been referred to as math co-processors. Now, now we have all this marketing.

I disagree, this stuff is implemented directly in the traditional GPU pipelines. I think this is just an expansion of the compute units. Now maybe you're considering that stuff to be the same as the old school math co-processors, but because its implemented directly in the processing pipeline, I don't think that's quite the same case (but I'm not well versed in the old math co-processors, but I thought once they integrated that stuff into the processors themselves they weren't really "co-processors" any more).

But I guess you're just meaning, expanding the math capabilities, which would be true is like what those were doing.

darkswordsman17 · Feb 20, 2019

NTMBK said:
Tensor cores are great for deep learning, but do those customers really need the baggage of a gaming GPU with expensive GDDR6, fully featured compute cores, texture units and rasterizer hardware? And does the one application of tensor cores to games (DLSS) really justify those tensor cores for gaming customers? The needs of DL and the needs of video games are going to continue to diverge, not converge. Google's in house TPU doesn't look much like a GPU.

The issue is that I'm not sure how much you can separate the two, as while they're being used for different things there's a fair amount of crossover (the reduced precision stuff can be used for graphics). Plus in essence its allowing them to share other resources (cache, memory, etc), plus it means a unified chip production. Although I feel like that last bit might be a hindrance right now due to the cost of chip design/engineering/production.

Now, I personally think it would be better to have separate chips, and then put them on the same card (or in consoles), and think we might see something like that when chiplet designs really start to take over, where we can have more sharing of resources like memory, but can then see more focused bits that aren't tied to each other (so for instance, clock speed differences like between GPU and CPU, I'm sure that matters in other processing bits - like we see with Intel's AVX-512 clocking issues).

But, perhaps the industry needed them integrated right now to make sure that it would actually be used (think how PhysX struggled to take off for instance, and how even dedicated sound processing chips kinda died off or now are integrated - post X-Fi even Creative Lab's cards are simple RISC processor blocks since they're powerful enough to run their processing software).

darkswordsman17 · Feb 20, 2019

William Gaatjes said:
I would not be surprised if we get in the future a dedicated ai chip for use as neural net in predicting user behavior for more advanced and real ai behavior in the 3d scenes like the artifical behavior from computer controlled enemies or for example when playing soccer, computer controlled adversaries.
Or use the net to predict the users behavior and use the fast recognition capabilities in reverse to simulate worldly behavior to get a true more random behavior.
What the net predicts will happen, do not use that prediction but instead go for another variant to create random effects.

A pc or game console with a GPU, a CPU , An AIPU (Ariticial Intelligence processing unit.
Make the computer predict user behavior for normal use, for more advanced and real like communication and conversations.
Great for games. Later great for robotics. Have a central pc in the house doing the crunching, have a power efficient robot in the house be the arms and legs of the pc.
Since the always powered pc does the heavy lifting, the accu powered robot can have a massive amount of sensors and have a wireless connection to the pc and still be extreme power efficient.

I think AI will actually really only make sense when we get something that really pushes us beyond current compute capabilities, something like quantum computers. Right now, it feels like we're trying to dumb things down, simplify them to be able to be run on our current processors, and I think we're seeing that's problematic for a variety of reasons (for instance, reduced precision mixed with intelligence seems like a recipe for disaster).

AI feels like crypto, its interesting and there's some uses for it, but I think its been far overhyped in how useful it is right now, and I don't think its potential will be realized by trying to make them work using our current systems, in fact, I think it might make it especially problematic. And that's before we get to how biases (believe there's been some analysis showing that AI has some racial bias for instance; and crypto, well the people serving as gatekeepers have caused a lot of problems, like the Mt Gox fiasco) and other stuff can wreak havoc.

DrMrLordX · Feb 20, 2019

coercitiv said:
One simple but powerful example here is the use of adversarial attacks to fool ML models into complete misinterpretation of images.

This doesn't mean the models cannot be adjusted to increase their resilience, but it does show the data actually used to interpret images is nowhere near what ordinary people would expect. This "blind spot" of the AI may be an eye opener for some.

Guacamole! At least it didn't say, "meal at a cheap <insertethnicityhere> restaurant".

I see where you're coming from, it's just that I don't think much behind the ML movement is hype-driven. Maybe some pointy-haired boss somewhere allocated spending on ML without really knowing what it is and what are its limitations. That guy will probably lose interest for the same reasons he acquired interest in the first place. People who are seriously involved in the field will press forward.

Stuka87 said:
You would be better off setting resolution scale down to 75 or 80%. Image will be sharper and FPS will be the same or better as with DLSS on.

I was kinda thinking the same thing. It's still good to have another tool in the shed, so to speak, but still.

darkswordsman17 · Feb 20, 2019

DrMrLordX said:
Guacamole! At least it didn't say, "meal at a cheap <insertethnicityhere> restaurant".

I see where you're coming from, it's just that I don't think much behind the ML movement is hype-driven. Maybe some pointy-haired boss somewhere allocated spending on ML without really knowing what it is and what are its limitations. That guy will probably lose interest for the same reasons he acquired interest in the first place. People who are seriously involved in the field will press forward.

I was kinda thinking the same thing. It's still good to have another tool in the shed, so to speak, but still.

I've seen quite a few people hype AI. And it seems like they're the same ones that were hyping cryptocurrencies (and blockchain in general). And claiming how it'll solve everything from economic inequality, to world hunger, and health care (I've seen people make those same claims about both things). And that's not going away (friggin JPMorgan is making a cryptocurrency...).

Seems like a lot of companies are just using AI as marketing (LG and Huwei for instance) while they apply some limited aspect of it, but mostly using it as a selling point "we've got AI/tensor stuff!").

Certainly not everyone is though. Like say Google, who is a heavyweight, but they focus on the point (for instance, Google touting using it to improve images taken on your phone, they're not making all these other crazy promises about it, they're just going "yeah, we can use it to improve things, and we'll see from there"). Its kinda like how Google is being pretty conservative on promising too much with self driving (when they're probably by far ahead of everyone else in that market), while others are acting like they could do it now and it'd be the best thing ever. So there's a mix, some like Google (and probably Microsoft and Apple are being pretty reserved on it, they'll show it off and talk about it having potential, but you rarely see them make especially grandiose claims). But then you'll see startups and others (like Uber) that act like they're doing all this amazing stuff and how its almost overnight going to revolutionize everything.

I guess its like anything. You've got some that hype it to try and make some money on it, then of course the marketers looking for buzzwords, while the people doing the real meaningful work toil away and roll their eyes at the rest.

IntelUser2000 · Feb 20, 2019

Unrealistic expectation of ML and AI comes from a combination of fantasy movies(Sci-Fi) and 40 or so years of rapid progress due to Moore's Law. The latter fed into the former so the hype got more extreme as time passed.

The decline in peak will probably start as more people realize Moore's Law like advancements are painfully slowing down. The thing is though, like fashion trends, its a trend, and trends die and come back.

William Gaatjes · Feb 20, 2019

darkswordsman17 said:
I think AI will actually really only make sense when we get something that really pushes us beyond current compute capabilities, something like quantum computers. Right now, it feels like we're trying to dumb things down, simplify them to be able to be run on our current processors, and I think we're seeing that's problematic for a variety of reasons (for instance, reduced precision mixed with intelligence seems like a recipe for disaster).

AI feels like crypto, its interesting and there's some uses for it, but I think its been far overhyped in how useful it is right now, and I don't think its potential will be realized by trying to make them work using our current systems, in fact, I think it might make it especially problematic. And that's before we get to how biases (believe there's been some analysis showing that AI has some racial bias for instance; and crypto, well the people serving as gatekeepers have caused a lot of problems, like the Mt Gox fiasco) and other stuff can wreak havoc.

At the moment, it is not usable on it's own.
Neural nets need to be trained. Training costs time.

Our brains are setup as parts that learn and parts that simulate and parts that can make decisions based on these simulations.
Seems very simple but it is extremely complex. It is as if the weights can be adjusted on the fly.

There is one thing though, if you can adjust the weights on the fly by a normal cpu that set its up or by some other means (see below)...

Ars technica has a fantastic article on neural nets and the current state of affairs.

https://arstechnica.com/science/2018/12/how-computers-got-shockingly-good-at-recognizing-images/

And about state of the art neural net.

https://arstechnica.com/science/2018/06/training-a-neural-network-in-phase-change-memory-beats-gpus/

Thala · Feb 20, 2019

maddie said:
The critical question is if RTX cores are indispensable? The answer is no, from an industry source. After all, in the end, it's all math.

We are going on Nvidia's choice to make certain assumptions, which appear not be true, namely, the only way is through specialized RT computing units.

Not sure at which point you came to the conclusion, that raytracing is only possible with RT compute units.
My statment is based on the fact, that if you want to build a cost-efficient GPU like in consoles, which support raytracing, that you better go with specialized RT cores.

maddie · Feb 20, 2019

Thala said:
Not sure at which point you came to the conclusion, that raytracing is only possible with RT compute units.
My statment is based on the fact, that if you want to build a cost-efficient GPU like in consoles, which support raytracing, that you better go with specialized RT cores.

???? Where I came to the conclusion RT cores are essential?

NTMBK · Feb 21, 2019

darkswordsman17 said:
The issue is that I'm not sure how much you can separate the two, as while they're being used for different things there's a fair amount of crossover (the reduced precision stuff can be used for graphics). Plus in essence its allowing them to share other resources (cache, memory, etc), plus it means a unified chip production. Although I feel like that last bit might be a hindrance right now due to the cost of chip design/engineering/production.

Now, I personally think it would be better to have separate chips, and then put them on the same card (or in consoles), and think we might see something like that when chiplet designs really start to take over, where we can have more sharing of resources like memory, but can then see more focused bits that aren't tied to each other (so for instance, clock speed differences like between GPU and CPU, I'm sure that matters in other processing bits - like we see with Intel's AVX-512 clocking issues).

But, perhaps the industry needed them integrated right now to make sure that it would actually be used (think how PhysX struggled to take off for instance, and how even dedicated sound processing chips kinda died off or now are integrated - post X-Fi even Creative Lab's cards are simple RISC processor blocks since they're powerful enough to run their processing software).

My point is that the two are already separating. If you look at dedicated deep learning hardware designs (like the TPU, or IBM's research: https://www.realworldtech.com/vlsi2018-ibm-machine-learning/ ) it looks almost nothing like a GPU. Heck, Nvidia is having to add more and more non-graphics hardware just to keep up. At first a regular GPU was the weapon of choice just because it was a cheap way to get lots of FLOPs. Then Nvidia added FP16x2 and INT8x4 SIMD, which can be useful in certain graphics cases but are pretty niche. Now they added tensor cores, which are practically useless for traditional graphics workloads. At what point does having the GPU hardware become a hindrance, not a help? I think these hybrids are a weird intermediate step, and eventually we will see more dedicated Deep Learning hardware from Nvidia, and see the tensor cores dropped from graphics cards.

DrMrLordX · Feb 21, 2019

[QUOTE="NTMBK, post: 39744968, member: 301096"If you look at dedicated deep learning hardware designs (like the TPU, or IBM's research: https://www.realworldtech.com/vlsi2018-ibm-machine-learning/ ) it looks almost nothing like a GPU. [/QUOTE]

Also Intel's Loihi. It remains to be seen if any of these dedicated solutions can outpace dGPU-based machine learning.

Thala · Feb 21, 2019

maddie said:
???? Where I came to the conclusion RT cores are essential?

Your statement "The critical question is if RTX cores are indispensable? The answer is no, from an industry source. After all, in the end, it's all math." made it sound, that you did need the statement of an industry source, to convince yourself that you can do raytracing with other compute resources. Sorry if I did get you wrong.

Its not even a critical question, as you are missing the issue at hand. Its like asking if GPUs are indispensible because you can run the whole 3D pipeline on a general purpose CPU - its all math after all.

My point was, that for efficient implementations you would want to use specialized units - and that such efficient implementations are particularly required for consoles. You just do not have sufficient transistor budget to have raytracing running on general purpose shader hardware and expect acceptable performance.

maddie · Feb 21, 2019

Thala said:
Your statement "The critical question is if RTX cores are indispensable? The answer is no, from an industry source. After all, in the end, it's all math." made it sound, that you did need the statement of an industry source, to convince yourself that you can do raytracing with other compute resources. Sorry if I did get you wrong.

Its not even a critical question, as you are missing the issue at hand. Its like asking if GPUs are indispensible because you can run the whole 3D pipeline on a general purpose CPU - its all math after all.

My point was, that for efficient implementations you would want to use specialized units - and that such efficient implementations are particularly required for consoles. You just do not have sufficient transistor budget to have raytracing running on general purpose shader hardware and expect acceptable performance.

I asked the question of those, including you, who claim that we need specialized HW for RT.

I see a scenario where with future nodes, the compute cores are numerous enough that with the addition of new instructions, RT will be useful without specialized units. The problem with specialized units is that they can't do any thing else. We're seeing the limitations of this with the huge die sizes on the RTX series.

Does anyone know if a similar sized die with RT improved general purpose compute cores would not usefully work in RT?

For sure, traditional games would have seen a larger performance increase than we now see. Even if the RT effects would have been less than they are now, I think sales would have great. Big jumps in performance with RT as icing on the top.

Anyhow, just my view.

coercitiv · Feb 21, 2019

maddie said:
Does anyone know if a similar sized die with RT improved general purpose compute cores would not usefully work in RT?

For sure, traditional games would have seen a larger performance increase than we now see. Even if the RT effects would have been less than they are now, I think sales would have great. Big jumps in performance with RT as icing on the top.

We have tests on Volta that seem to confirm just that. (see here and here)

Coupled with a hefty increase in number, improved general purpose units should have no problem performing the first wave of RT features such as reflections or global illumination. It would probably take far more intensive RT workloads to make the difference insurmountable. Meanwhile, increased number of compute cores would translate into direct performance jumps for traditional games, as you already point out.

A quote from the second linked article:

I reached out to Nvidia to confirm the reason that Volta is so much faster than Pascal. My suspicions were confirmed that the cache differences between the two arches (which Volta and Turing share) is what makes it so much faster. Volta has about 4x lower latency on the cache with more bandwidth and twice the low level cache.

BFG10K · Feb 21, 2019

coercitiv said:
A quote from the second linked article:

I reached out to Nvidia to confirm the reason that Volta is so much faster than Pascal. My suspicions were confirmed that the cache differences between the two arches (which Volta and Turing share) is what makes it so much faster. Volta has about 4x lower latency on the cache with more bandwidth and twice the low level cache.

So nVidia confirms Volta's ray tracing performance has nothing to do with Tensor cores after all, unlike some who claimed otherwise.

LTC8K6 · Feb 21, 2019

Lets keep in mind that the RT cores aren't the big hitter with die space consumption here, its the integer cores.

Told ya'!

http://www.pc-better.com/titan-v-dxr-analysis/#comment-4272526743

coercitiv · Feb 21, 2019

LTC8K6 said:
Told ya'!

You did? Where?

LTC8K6 · Feb 21, 2019

coercitiv said:
You did? Where?

Somewhere in this thread.

https://forums.anandtech.com/threads/h-bf5-raytracing-vram-easily-exceeds-6gb.2559712/

Rifter · Feb 21, 2019

Mopetar said:
All of this is wishful thinking. The next generation of consoles are supposedly based on Navi and won’t have ray tracing and wouldn’t be nearly powerful enough if they did include it which would be a total waste. They’ll be out late this year at the earliest. That means another ~5 years before the next cycle.

I agree, consols getting RT is at least 5 years out if not longer.

coercitiv · Feb 21, 2019

LTC8K6 said:
Somewhere in this thread.

https://forums.anandtech.com/threads/h-bf5-raytracing-vram-easily-exceeds-6gb.2559712/

Here's a list of posts you made in that thread. Which is it?

LTC8K6 said:
Well, there only 72 RT cores for the 2080ti, but there are 576 Tensor cores.

The main job of all those Tensor cores is DLSS.

LTC8K6 said:
I looked at lots of pics, but was unable to get any sense of scale between an RT and Tensor core.

Overall, it just looks like Tensor cores cover a lot of real estate.

https://www.gamersnexus.net/guides/3364-nvidia-turing-architecture-technical-deep-dive

LTC8K6 said:
So which takes up more area? 72 RT or 576 T?

LTC8K6 said:
That doesn't seem accurate, given other pictures that are out there.

https://www.servethehome.com/wp-con...Huang-with-NVIDIA-Turing-GPU-Architecture.jpg

So far it seems like RT and Tensor both take up a lot of die space, meaning RT is certainly not the only reason for the large die.

LTC8K6 said:
I'm going with the NV white paper chip diagrams for now.

https://www.gamersnexus.net/images/media/2018/gpus/2080-ti/arch/turing-block-diagram.jpg

https://www.gamersnexus.net/images/media/2018/gpus/2080-ti/arch/sm-architecture-block-diagram.jpg

LTC8K6 · Feb 21, 2019

coercitiv said:
Here's a list of posts you made in that thread. Which is it?

Again I have to ask why we have smileys...

Which smiley would have worked to inform you that I was not really serious?

Just curious.

Thala · Feb 21, 2019

maddie said:
I asked the question of those, including you, who claim that we need specialized HW for RT.

I can not remember that i have ever made such a claim. Can you please share a quote?

I see a scenario where with future nodes, the compute cores are numerous enough that with the addition of new instructions, RT will be useful without specialized units. The problem with specialized units is that they can't do any thing else. We're seeing the limitations of this with the huge die sizes on the RTX series.

Future nodes? You do not seem to realize that Moores Law is slowing down and that heterogenous computing becomes more important in the future..

In addition for gaming you always need to solve global illumination and other things where currently (without raytracing) only very crude approximation methods are available. So its not like the RT units staying idle in gaming workloads.

coercitiv · Feb 21, 2019

LTC8K6 said:
Again I have to ask why we have smileys...

You asked this before? Where?

LTC8K6 · Feb 21, 2019

coercitiv said:
You asked this before? Where?

Since you probably already did the search, I'll guess that it was actually at another board and not this one.

I was discussing the real estate on Turing in that other thread though, and I think I was correct about the RT vs Tensor core real estate.

coercitiv · Feb 21, 2019

LTC8K6 said:
I was discussing the real estate on Turing in that other thread though, and I think I was correct about the RT vs Tensor core real estate.

I know, I remember the discussion as it was rather recent: we had a good approximation on Tensor cores vs. CUDA core area ratio but we lacked any clear info on how big RT cores are relative to the rest.

That's why I began my seemingly tone deaf questions: the hypothesis of RT cores being rather small does fit the other info we have, but until we have something clear at hand... it's all just a bit more than hunch, which I'm inclined to believe if it makes any difference. Where our opinions diverge is with the role of the tensor cores since you're inclined to write them off as a separate entity, while I consider them complementary to RT cores, since noise reduction is paramount here and de-noising on CUDA cores requires additional performance loss.

TPU: many combinations of GPU/resolution/RTX/DLSS not allowed

Lifer

Lifer

Lifer

Lifer

Lifer

Elite Member

Lifer

Golden Member

Diamond Member

Lifer

Lifer

Golden Member

Diamond Member

Diamond Member

Lifer

Lifer

Diamond Member

Lifer

Lifer

Diamond Member

Lifer

Golden Member

Diamond Member

Lifer

Diamond Member