[BitsAndChips] 390X ready for launch - AMD ironing out drivers - Computex launch


Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
I think that's per module!

No, you don't know what you are talking about.
HBM1 supports 1GB per stack, or 2Gb (bit, not byte) per DRAM die. You can stack a maximum of 4 of these DRAM dies together to make one stack.
[Image: sk_hynix_hbm_dram_2.jpg]


We have already seen the leak on GFXBench. It's 4096-bit. And how can that be with 8GB of HBM1? Two controllers, each accessing their own 4 x 1GB stacks. One controller per GPU die. Two dies working as one.
Not 8 stacks giving an 8192-bit bus, or 4 stacks of 2GB each, because that isn't supported with HBM1, which we know the 390X will have.
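For what it's worth, the arithmetic behind those spec numbers is easy to check. A quick back-of-the-envelope in Python, using the HBM1 figures cited above (2Gb per die, max 4 dies per stack, and the standard 1024-bit interface per stack):

```python
# Back-of-the-envelope check of the HBM1 numbers cited above.
GBIT_PER_DIE = 2           # 2Gb per DRAM die (HBM1)
DIES_PER_STACK = 4         # max 4-Hi stacking with HBM1
BUS_BITS_PER_STACK = 1024  # each HBM stack has a 1024-bit interface

stacks = 4                                        # one controller's worth of stacks
gb_per_stack = GBIT_PER_DIE * DIES_PER_STACK / 8  # bits -> bytes: 1 GB per stack
total_capacity = stacks * gb_per_stack            # 4 GB per controller
total_bus = stacks * BUS_BITS_PER_STACK           # 4096-bit, matching the GFXBench leak

print(f"{total_capacity:.0f} GB on a {total_bus}-bit bus")
# -> 4 GB on a 4096-bit bus; 8 GB would therefore imply two such sets of stacks
```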
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
No, you don't know what you are talking about.
HBM1 supports 1GB per stack, or 2Gb (bit, not byte) per DRAM die. You can stack a maximum of 4 of these DRAM dies together to make one stack.
[Image: sk_hynix_hbm_dram_2.jpg]


We have already seen the leak on GFXBench. It's 4096-bit. And how can that be with 8GB of HBM1? Two controllers, each accessing their own 4 x 1GB stacks. One controller per GPU die. Two dies working as one.
Not 8 stacks giving an 8192-bit bus, or 4 stacks of 2GB each, because that isn't supported with HBM1, which we know the 390X will have.

So you are saying it has to be dual GPU for 8GB?
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
Yes.
How else would they use the two controllers but still only have 4096-bit?
If you ask me, it's getting more and more convincing that we may see two Tongas under one die.

Unless the die picture on the previous page is just a concept and two controllers aren't really needed for HBM.
 

Shehriazad

Senior member
Nov 3, 2014
555
2
46
So you are saying it has to be dual GPU for 8GB?

Yeah... just how you need a 2nd CPU to run a dual-channel DDR3 setup... right? *cough* XDDD


All these assumptions are just running rampant.

There is nothing that stops AMD or anyone else from using HBM1 and "dual-linking" (triple/quad-linking) it for 8 (12, 16) GB of HBM... performance might not scale 100% like that, but it's still gonna destroy GDDR5 by far.

I'm not even that tech savvy... but we are still talking about memory here... you can always get "moar". So what if there are 2 x 4GB memory blocks? Just connect those two and it should work fine... Of course it's more than "just connecting" them... but that's what engineers are there for... they're being paid for that kind of stuff.


I am 1000% sure that SK Hynix, Nvidia and AMD would not waste their time and money on a technology that is impossible to scale beyond 4GB when it's quite obvious that the future is 4K gaming... and thus 4GB is already cutting it close if you want to push it forward. Sure, HBM2 will have "moar"... but if HBM1 were so useless, it would never have been made into a product in the first place and would never have left their R&D department until it could effectively be used in phones or some bs like that.
 

3DVagabond

Lifer
Aug 10, 2009
11,951
204
106
TechReport mentioned dual-linking in an article last year as a way to get 8GB with HBM1. Not sure about the technical details of it; even they weren't. But it's not just something that has cropped up out of thin air recently.
 

msi2

Junior Member
Oct 23, 2012
22
0
66
No, you don't know what you are talking about.
HBM1 supports 1GB per stack, or 2Gb (bit, not byte) per DRAM die. You can stack a maximum of 4 of these DRAM dies together to make one stack.
[Image: sk_hynix_hbm_dram_2.jpg]


We have already seen the leak on GFXBench. It's 4096-bit. And how can that be with 8GB of HBM1? Two controllers, each accessing their own 4 x 1GB stacks. One controller per GPU die. Two dies working as one.
Not 8 stacks giving an 8192-bit bus, or 4 stacks of 2GB each, because that isn't supported with HBM1, which we know the 390X will have.


Did you even look at the slide I posted?
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
Did you even look at the slide I posted?

That doesn't fit Hynix's own specifications for HBM1 and HBM2. If they made some advancements in the technology, it does, in which case they could make 8GB on a 4096-bit bus. I'm not sure how they could put 2Gb DRAM dies next to each other just like that, though.

Conflicting information here.
 

msi2

Junior Member
Oct 23, 2012
22
0
66
One thing is sure: I don't believe for one second in the dual-GPU hypothesis (in this case, a dual-Tonga configuration).
 

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
Hmmm, would GPUs connected via an interposer need to use AFR/SFR etc. techniques like ones connected via PCIe (meaning possible scaling issues), or is it more like all the cores being on one die?
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,699
2,623
136
Hmmm, would GPUs connected via an interposer need to use AFR/SFR etc. techniques like ones connected via PCIe (meaning possible scaling issues), or is it more like all the cores being on one die?

In principle you could make a much wider and faster interface through the interposer than through any traditional means, which might let you drive two separate chips as one. The bottleneck in doing this is that you have to be able to route all ROP and memory accesses from all sources on an any <-> any network. That's a lot of inter-chip bandwidth.
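To put a rough number on that bottleneck, here is a sketch of the estimate in Python. The 1 Gbps-per-pin figure is HBM1's published speed (500 MHz DDR); the 50% cross-traffic share is purely an assumption for illustration:

```python
# Rough estimate of the inter-chip bandwidth an any<->any layout implies.
BUS_BITS = 4096       # total HBM1 bus width from the leak
PIN_SPEED_GBPS = 1.0  # HBM1: 500 MHz DDR = 1 Gbps per pin

mem_bw = BUS_BITS * PIN_SPEED_GBPS / 8  # raw memory bandwidth in GB/s: 512
cross_share = 0.5                       # assumed: half of all accesses land on the other die
link_bw = mem_bw * cross_share

print(f"memory: {mem_bw:.0f} GB/s, die-to-die link needs ~{link_bw:.0f} GB/s")
```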

Long enough in the future, interposers might drive GPUs into only having a single chip optimized for yields manufactured, with the higher-end models being just more of them.

I really don't expect them to do this on the first model. The only "multi-chip magic" I expect is that if AMD makes the bottom-most logic die of the stack, the one that works as the memory controller, they might move some logic from the main die there, as I doubt you need the entire die just for the memory controller. Cache or even ROPs, maybe?

The "2x Tonga" thing seems asisine to me.
 

shady28

Platinum Member
Apr 11, 2004
2,520
397
126
Almost mid April now and no release.

I still think their new cards will have a lot of Tonga in it, up to R9 380X.

And I suspect it will look something like this :

http://videocardz.com/52834/full-amd-tonga-gpu-might-feature-384-bit-memory-interface



2048 SPs
128 TMU
32-48 ROPs
384bit bus

"The Tonga XT has 256 more shader count than the Tonga PRO GPU, which means that AMD is using the age old tactic of holding the good stuff back till it is actually needed. The full fat Tonga XT packs a total of 32 compute unit and six memory controllers. Since each is 64 bits wide, it adds up to a 384 bit interface."

Read more: http://wccftech.com/tonga-xt-hsa-support-384-bit-bus/
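For what it's worth, those numbers are internally consistent with GCN's layout. A quick cross-check in Python (the 64 SPs per CU and 64-bit controller width are standard GCN figures, not something from the article itself):

```python
# Cross-checking the quoted Tonga XT figures against GCN's layout.
SPS_PER_CU = 64       # GCN: 64 stream processors per compute unit
CTRL_WIDTH_BITS = 64  # each GCN memory controller is 64 bits wide

cus = 32
controllers = 6
print(cus * SPS_PER_CU)               # 2048 shaders, as quoted
print(controllers * CTRL_WIDTH_BITS)  # 384-bit bus, as quoted
```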
 

looncraz

Senior member
Sep 12, 2011
722
1,651
136
2. Again, the two Tonga cores would not be connected through Crossfire. AMD should fire all their engineers if they couldn't connect two cores on the same die with an internal connection that goes directly from one core to the other. Whether it's through TSVs or through the L2 cache by routing crossbars I don't know, but if they can do it with CPU cores and an IGP, they can do it with a dual GPU as well. This should have zero performance hit.

3. HBM1 has a limitation of 4GB. HBM2 won't be ready until 2016. Still, AMD's slide says "up to 8GB" for the 390 WCE. Dual controllers like in the dual-core picture, anyone...?


2. It isn't as easy as you'd think to split a core between multiple dies. You either need to create a heterogeneous interposer interface which can connect a set of interconnects from the separate dies to an external memory controller, or you need to build a different die for each side, or the memory controller must be slowed down so that it can accept non-synchronized inputs (using latching).

It could be done, sure, but I don't see any advantage at all. The memory controller would still need to interface with the memory, or you would need to duplicate memory controllers, in which case you need a common external L3 for each memory controller to use to transfer data from the HBM chips it controls in the (common) event that an SP from the other die needs some chunk of memory. That means there will be considerable overhead and added latency.

The easiest solution to provide 8GB, using one large die, is... well... simple. Which brings me to your point 3.


3. HBM1 is limited to 4GB using a standard direct bus, true, but that doesn't stop the memory controller from being able to address more than 4GB. It just needs an added address line, and the HBM modules organized such that you can selectively address one of them using the same bus. That's only a single added trace on the interposer per memory bus if the memory modules are aware.

[GPU] ...[BUS]... [HBM:0] + [HBM:1]

When the chip-select address line on the bus is low, [HBM:0] responds; when the line is high, [HBM:1] responds.

However, if the memory is not able to support this, the situation will require some decoupling on the bus for the action lines (Get, Set, Reset, Write, whatever they're called on HBM). The memory should ignore any signals on the lines unless one of its action lines is pulled high or low, depending on its design.

[GPU] [ACTION DECOUPLER]@[ACTION BUS:0]->[HBM:0]
[GPU] [ACTION DECOUPLER]@[ACTION BUS:1]->[HBM:1]
[GPU] [RAM BUS]:[HBM:0&1]

The RAM BUS would include the clock signal (likely needs to be driven a bit more, or amplified), power, ground, references, and the data bus. The memory controller could only address one module at a time, which means that memory locality suddenly becomes much more important. That requires some fairly careful logic to prevent any performance degradation. But all that logic will be in the memory controller, with some driver adjustments needed to take care of any oversights during the hardware design (it happens) or because the designers weren't 100% sure of the performance effects in certain cases and left it for the software team to handle.
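As a toy model of that scheme, the Python sketch below (hypothetical names, software standing in for hardware) shows how a single extra select bit lets one controller front two 1GB stacks behind a shared bus. It models only the addressing logic, not the timing, decoupling, or locality handling described above:

```python
# Toy model of the chip-select idea: two HBM stacks share one data bus,
# and one extra address line picks which stack responds. Illustration only.

STACK_SIZE = 1 * 1024**3  # 1GB per HBM1 stack

class HBMStack:
    """Stand-in for one 1GB stack; stores written values by offset."""
    def __init__(self):
        self.cells = {}

    def read(self, offset):
        return self.cells.get(offset, 0)

    def write(self, offset, value):
        self.cells[offset] = value

class DualStackController:
    """Memory controller fronting two stacks behind one shared bus."""
    def __init__(self):
        self.stacks = [HBMStack(), HBMStack()]

    def _decode(self, addr):
        select = addr // STACK_SIZE  # the "extra address line": 0 or 1
        offset = addr % STACK_SIZE   # address driven on the shared bus
        return self.stacks[select], offset

    def read(self, addr):
        stack, offset = self._decode(addr)
        return stack.read(offset)

    def write(self, addr, value):
        stack, offset = self._decode(addr)
        stack.write(offset, value)

ctrl = DualStackController()
ctrl.write(0x0000_0000, 0xAB)        # select line low -> lands in stack 0
ctrl.write(STACK_SIZE + 0x10, 0xCD)  # select line high -> lands in stack 1
assert ctrl.read(0x0000_0000) == 0xAB
assert ctrl.read(STACK_SIZE + 0x10) == 0xCD
```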

In any event, pretty much every solution makes more sense than going with a bifurcated die (and duplicate memory controllers).

In fact, going with duplicate memory controllers on the same die makes more sense. You'd just split the address space right down the middle.
 

Alatar

Member
Aug 3, 2013
167
1
81
To be fair, he was the first one to confirm 8GB (a few days before the slide stack leaked), even though his reasoning and explanations about the situation were complete garbage.
 

Cloudfire777

Golden Member
Mar 24, 2013
1,787
95
91
http://www.fudzilla.com/news/graphics/37566-two-amd-fiji-cards-coming-in-june
Fiji XT: faster than the GTX 980, slower than the Titan X.
Fiji VR: some dual card... :eek:

As much as I'd like to believe this, two huge dies with big TDPs on one card? 2 x 2816 shaders on one card (295X2) is one thing, but 2 x 4096 shaders seems impossible. 2 x cut-down Fiji with 3500 shaders may be theoretically possible, but to me it seems like a stretch. Not just TDP-wise, but also with the power delivery available and fitting two enormous dies on one card.
It's like Nvidia putting 2 x Titan X on one card. They used 2 x GK104 for a reason. I think 2 x GM204 is the line here as well.
 

destrekor

Lifer
Nov 18, 2005
28,799
359
126
As much as I'd like to believe this, two huge dies with big TDPs on one card? 2 x 2816 shaders on one card (295X2) is one thing, but 2 x 4096 shaders seems impossible. 2 x cut-down Fiji with 3500 shaders may be theoretically possible, but to me it seems like a stretch. Not just TDP-wise, but also with the power delivery available and fitting two enormous dies on one card.
It's like Nvidia putting 2 x Titan X on one card. They used 2 x GK104 for a reason. I think 2 x GM204 is the line here as well.

Well to be fair, you probably would have said that very same thing before the 295X2 was announced. Hawaii XT in a dual-GPU configuration? A single Hawaii XT card is so hot and has such massive power draw, no way it's possible!
Oh wait.

Just because the number of shaders is increasing doesn't change the situation. Transistor counts and shader/stream processor counts are constantly increasing, yet power draw and heat output remain relatively the same as they improve their techniques.

Now, with the way Nvidia's GM200 has shaped up, I thoroughly doubt they will be releasing a proper dual-GM200 any time soon, but long-term they may develop such a solution. I would expect them to potentially use cut-down GM200s for a dual-GPU card; that's one way to use up the lesser performers.

Dual Fiji XT around launch? Nope. And it seems ludicrous to name a dual-GPU configuration (or a dual-die one, which, again, ridiculous) Fiji XT, when convention has that name simply denoting the larger die.
 

Subyman

Moderator, VC&G Forum
Mar 18, 2005
7,876
32
86
If AMD does make dual-die work like Intel did with CPUs, then we could see huge performance boosts akin to multi-core CPUs around the Q6600 era. Even if AMD gets a year or so ahead of Nvidia on this type of technology, it could be what they need to get back in the market. It would open a new path for development, exclusive to AMD for some time, instead of chasing Nvidia the traditional way.

I'd be surprised to see it happen, but would be excited to see how it shakes up the market.
 

destrekor

Lifer
Nov 18, 2005
28,799
359
126
If AMD does make dual-die work like Intel did with CPUs, then we could see huge performance boosts akin to multi-core CPUs around the Q6600 era. Even if AMD gets a year or so ahead of Nvidia on this type of technology, it could be what they need to get back in the market. It would open a new path for development, exclusive to AMD for some time, instead of chasing Nvidia the traditional way.

I'd be surprised to see it happen, but would be excited to see how it shakes up the market.

I thought about that kind of performance boost and advancement, but I don't think it translates here like it did with CPUs. GPUs are already essentially a multi-threaded architecture, broken into many tiny cores, whereas multi-die CPUs were really the advent of huge leaps in multi-core, multi-threaded x86 performance. I don't think a multi-die GPU translates to the same performance advancements. I could be wrong, however.
 

Elfear

Diamond Member
May 30, 2004
7,169
829
126
I thought about that kind of performance boost and advancement, but I don't think it translates here like it did with CPUs. GPUs are already essentially a multi-threaded architecture, broken into many tiny cores, whereas multi-die CPUs were really the advent of huge leaps in multi-core, multi-threaded x86 performance. I don't think a multi-die GPU translates to the same performance advancements. I could be wrong, however.

I think what Subyman was alluding to was that if AMD is able to make a dual-die GPU interconnect function the same as a single die (i.e. without all the compromises of an XDMA/SLI interconnect), it would be huge. AMD would then be able to scale GPUs out much more easily, and we would see performance gains like we used to see with each new generation.

I'm 99.9% sure AMD doesn't have that capability yet but I can dream.