XDMA image transfer calculations on the 290X


BrightCandle

Diamond Member
Mar 15, 2007
I don't really get the purpose of the math here. Is it a correlation to frame pacing or smoothness?

Have you never done the maths just to see what the result is? Well now you have seen someone else do it!

Whenever a new technology is released I try to understand it and its implications. Changing how the images are transferred between cards could impact latency, could impact the cards' normal operation, and could potentially be a problem on lower-end or older computers. Before my post, did we know that 8x/8x on 2.0 might have a slight performance deficit compared to the previous cards? No we didn't. Equally, did we know that the latency of the image is dramatically reduced by going over PCI-E? Nope.

I want to validate to myself (and you) that there is enough bandwidth that it's not going to be a problem, and to work out in which scenarios it might be. Reviewers all over the net are predominantly testing Crossfire on X79 systems with PCI-E 3.0 16x/16x right now, and that isn't very representative of the systems people have. Most gamers seem to have gone for Sandy Bridge/Ivy Bridge/Haswell rather than the enterprise platform, even with Crossfire/SLI. The downside of that is they have fewer lanes for the second card, so one question I was hoping to answer is which types of system might have a problem there and should be upgraded if their owners are considering these cards and their different image transfer mechanism. This is the extreme case, and if it doesn't break there it won't break anywhere else.
 

Granseth

Senior member
May 6, 2009
Indeed it is. All the bandwidth numbers are for lanes, and these lanes are made up of two unidirectional links. For our purposes that halves the data we can send to the GPU, making my PCI-E bandwidth figures wrong. Darn it. OK, I'll update that later.

No problem. It's always best to have an open discussion or we won't learn anything new.
 

chimaxi83

Diamond Member
May 18, 2003
I'm pretty sure XDMA CF is just fine. From HardOCP, pioneers of the "smoothness" metric:

[HardOCP R9 290X CrossFire frame-time chart]
 

beginner99

Diamond Member
Jun 2, 2009
I'm pretty sure XDMA CF is just fine. From HardOCP, pioneers of the "smoothness" metric:

[HardOCP R9 290X CrossFire frame-time chart]

I think the question is whether it is also fine on PCI-E 2.0, or to be more specific on Sandy Bridge systems or older, or any AMD systems. The latter would actually be funny: 290X CF not working with an AMD CPU.
 

BrightCandle

Diamond Member
Mar 15, 2007
It also changes the contention possibilities. It's more complex than I initially thought!

In terms of bandwidth usage on the PCI-E bus, graphics cards are mostly receivers of data. They get textures, meshes etc. from the CPU and RAM. However, GPUs don't really send anything big over PCI-E that I know of (could be wrong of course!). It's all proprietary so we can't know for sure, but it seems unlikely. So given that, the only contention possibility is on the receiving half of each card's link.

The main card in the 16x slot now has half the input bandwidth I calculated, and it still has contention with the images coming in, but in practice my 6-lane PCI-E estimate for game data is also twice what it should be, so it comes down to 1500 MB/s. So now PCI-E 3.0 16x has about 8000 MB/s, with 1500 MB/s used for normal operation and 985 MB/s used for the images. That is actually a significant percentage jump from before, but it goes to show that this change shouldn't impact 16x slots even on 2.0.

But for the supporting cards there is no contention with the card's usual operation, as they are sending the image data. Because it's the sending half of the link, there is by and large nothing else bandwidth-heavy on that port, so even a 4x PCI-E 2.0 slot has sufficient bandwidth to send the images, despite only 250 MB/s per lane in the sending direction. It can only just make it, which results in a latency of 16 ms, but it works without further degrading fps.

So it's interesting: the latency is impacted heavily by the data change (2x), but the fps impact of XDMA shouldn't be a problem at 16x on 3.0 or 2.0. However, at 8x on 2.0 there is 250 MB/s of receiving bandwidth per lane = 2000 MB/s total, and we need about 1500 MB/s for normal operation (world assets) plus 985 MB/s for the images. That is a problem for the main card.

The master card needs at least 16x 2.0 or 8x 3.0, but the supporting cards can get away with 4x 2.0, although that impacts latency and likely frame pacing quite dramatically. The FCAT results will change quite a bit depending on the PCI-E bandwidth given to the secondary card.
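For anyone who wants to poke at the budget math, here's a minimal Python sketch of it. All the inputs are this thread's own (pre-correction) estimates rather than measured values: ~985 MB/s of image traffic for 4K@60 in two-way CF, ~1500 MB/s of asset traffic to the master card, and the halved per-lane figures assumed above (250 MB/s for 2.0, 500 MB/s for 3.0, per direction).

Code:
IMAGE_STREAM = 985    # MB/s, image transfers for 4K@60 (thread's estimate)
NORMAL_OPS   = 1500   # MB/s, assumed asset/texture traffic to the master card

def link_bw(lanes, mb_per_lane_per_dir):
    """One-direction bandwidth of a PCI-E link, in MB/s."""
    return lanes * mb_per_lane_per_dir

# Master card: receives assets AND the slave's frames in the same direction.
for label, lanes, per_lane in [("PCI-E 3.0 16x", 16, 500),
                               ("PCI-E 2.0 16x", 16, 250),
                               ("PCI-E 2.0 8x",   8, 250)]:
    have = link_bw(lanes, per_lane)
    need = NORMAL_OPS + IMAGE_STREAM
    print(f"master {label}: {have} MB/s available, {need} needed -> "
          f"{'OK' if have >= need else 'PROBLEM'}")

# Slave card: it only *sends* frames, and the send direction is otherwise
# nearly idle. At 4x PCI-E 2.0 it can just keep up, at ~16 ms per frame.
send_bw  = link_bw(4, 250)      # 1000 MB/s
frame_mb = IMAGE_STREAM / 60    # ~16.4 MB per frame at 60 fps
print(f"slave 4x 2.0: one frame takes ~{frame_mb / send_bw * 1000:.0f} ms")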
 

Granseth

Senior member
May 6, 2009
At PCI-E 2.0 it has 500 MB/s per lane in each direction, not 250/250, and 985 MB/s per lane per direction for 3.0.
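For reference, here's a tiny snippet with the usual per-lane, per-direction figures; the 2.0 and 3.0 numbers are the ones above, and the 1.x entry is my addition.

Code:
# Approximate usable PCI-E bandwidth per lane, per direction, in MB/s.
PCIE_MB_PER_LANE = {
    "1.x": 250,   # 2.5 GT/s with 8b/10b encoding
    "2.0": 500,   # 5.0 GT/s with 8b/10b encoding
    "3.0": 985,   # 8.0 GT/s with 128b/130b encoding
}

def link_bw(gen, lanes):
    """One-direction bandwidth of a PCI-E link, in MB/s."""
    return PCIE_MB_PER_LANE[gen] * lanes

print(link_bw("2.0", 16))  # 8000 MB/s each way for a 16x 2.0 slot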
 

Stuka87

Diamond Member
Dec 10, 2010
This explains why CF scaled so much better at 4K resolutions in comparison to the 780/Titan SLI configs.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
I think the question is whether it is also fine on PCI-E 2.0, or to be more specific on Sandy Bridge systems or older, or any AMD systems. The latter would actually be funny: 290X CF not working with an AMD CPU.

Why are you saying it would not work with an AMD CPU? Yes, they use PCI-E 2.0, but per the OP's post it is still not an issue.
 

chimaxi83

Diamond Member
May 18, 2003
I think the question is whether it is also fine on PCI-E 2.0, or to be more specific on Sandy Bridge systems or older, or any AMD systems. The latter would actually be funny: 290X CF not working with an AMD CPU.

It would be fine on an AMD CPU, why wouldn't it be? Even at PCI-E 2.0 16x/4x, 4K images would still use less than half the bandwidth.
 

Stuka87

Diamond Member
Dec 10, 2010
6,240
2,559
136
So SB PCI-E 2.0 8x/8x is enough for 4K Eyefinity?

There are no cards out there that are going to drive three 4K displays. Even if you had quad CF with 290Xs, I doubt you could drive three 4K displays at playable (i.e. 30+) frame rates.
 

Granseth

Senior member
May 6, 2009
Well, I guess if you are playing very non-demanding games then sure. And in this case the displays are running at 30Hz.

But if you want to play more modern games, such as BF4 or the like, you are going to need a lot more horsepower.

Where do you get 30Hz from?
 

ViRGE

Elite Member, Moderator Emeritus
Oct 9, 1999
Where do you get 30Hz from?
How else would you get three 4K monitors hooked up to two 290X cards? The 290X did away with the ability to drive two DisplayPorts per GPU, so the pair only has two such ports total. They'd need three DisplayPorts to do 3x 4K@60Hz.

Ergo they must be running all 3 monitors over HDMI/DVI, which means 3x4K@30Hz.
 

BrightCandle

Diamond Member
Mar 15, 2007
At 30 fps it's only a 50% increase in bandwidth, from 985 to 1477.5 MB/s. That is significant in that it causes issues with 4x PCI-E 2.0, but 8x is fine for the supporting card. The primary card with 3x 4K Eyefinity however needs 1477.5 + 3000 MB/s, and in this case 8x PCI-E 2.0 only gives 4000 MB/s. Strangely enough, if you have the choice you probably want 16x and 4x for Eyefinity; the supporting card then has sufficient bandwidth, as does the primary.
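A quick sketch of that arithmetic. The ~985 MB/s baseline and ~3000 MB/s asset figure are this thread's estimates, and the per-lane rate is the corrected 500 MB/s per lane for 2.0:

Code:
IMAGE_4K60 = 985                           # MB/s, single 4K@60 (thread's estimate)
image_3x4k30 = IMAGE_4K60 * (3 * 30) / 60  # 90 frames/s of traffic = 1477.5 MB/s
ASSETS = 3000                              # MB/s, assumed asset traffic to primary

for label, bw in [("8x 2.0", 8 * 500), ("16x 2.0", 16 * 500)]:
    need = image_3x4k30 + ASSETS
    print(f"primary {label}: {bw} MB/s vs {need} needed -> "
          f"{'OK' if bw >= need else 'PROBLEM'}")
# 8x comes up short (4000 < 4477.5) while 16x is comfortable, which is why
# a 16x/4x split can beat 8x/8x here: the primary card is the bottleneck.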
 

Granseth

Senior member
May 6, 2009
How else would you get three 4K monitors hooked up to two 290X cards? The 290X did away with the ability to drive two DisplayPorts per GPU, so the pair only has two such ports total. They'd need three DisplayPorts to do 3x 4K@60Hz.

Ergo they must be running all 3 monitors over HDMI/DVI, which means 3x4K@30Hz.

According to AMD's slides the DP supports multi-stream (i.e. 3 screens).
 

Phynaz

Lifer
Mar 13, 2006
At 30 fps it's only a 50% increase in bandwidth, from 985 to 1477.5 MB/s. That is significant in that it causes issues with 4x PCI-E 2.0, but 8x is fine for the supporting card. The primary card with 3x 4K Eyefinity however needs 1477.5 + 3000 MB/s, and in this case 8x PCI-E 2.0 only gives 4000 MB/s. Strangely enough, if you have the choice you probably want 16x and 4x for Eyefinity; the supporting card then has sufficient bandwidth, as does the primary.

Your assumption is that the secondary and tertiary cards are both sending their display to the primary card at the same time. Do we know that to be true?
 

Stuka87

Diamond Member
Dec 10, 2010
Your assumption is that the secondary and tertiary cards are both sending their display to the primary card at the same time. Do we know that to be true?

Very good point. It's quite possible that they would send them out of phase with each other, since each card renders the frame after the previous card's.
 

Granseth

Senior member
May 6, 2009
Yes, but not at 4K. Not enough bandwidth on DP 1.2. It supports a single 4K display @60Hz.

I tried getting more information on the subject and got confused, but it seems like it can run 4K@60Hz through DP and HDMI:
http://www.hardocp.com/article/2013/10/23/amd_radeon_r9_290x_video_card_review/4#.UnAiLRCmZqI

But there was no good guide to what multi-stream DP can handle.

But I found AMD has done this before with custom drivers (http://www.tested.com/tech/gaming/456899-triple-monitor-4k-gaming-15-billion-pixels-second/), so don't discount the screens running @60Hz.
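A back-of-envelope check, using my own numbers rather than anything from the thread: DP 1.2 is 4 lanes at 5.4 Gbit/s with 8b/10b encoding, so roughly 17.3 Gbit/s of payload, while a 4K@60 stream needs around 12 Gbit/s of pixels. One stream fits per link; two don't, so MST alone can't carry two 4K@60 screens.

Code:
link_gbps = 4 * 5.4 * 8 / 10    # DP 1.2: ~17.28 Gbit/s of payload per link

def stream_gbps(w, h, hz, bpp=24):
    # Crude pixel-rate estimate that ignores blanking, so slightly optimistic.
    return w * h * hz * bpp / 1e9

one_4k60 = stream_gbps(3840, 2160, 60)  # ~11.9 Gbit/s
print(one_4k60 <= link_gbps)            # True:  one 4K@60 fits on a DP 1.2 link
print(2 * one_4k60 <= link_gbps)        # False: MST can't carry two 4K@60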
 

Granseth

Senior member
May 6, 2009
Very good point. It's quite possible that they would send them out of phase with each other, since each card renders the frame after the previous card's.
As it doesn't have a bridge connector for signalling and synchronization, it could introduce unwanted latency in 3- and 4-way CF.