Threadripper BUILDERS thread

tamz_msc · Aug 2, 2017

TR including Asetek brackets for the ability to mount existing AIO's is fine, but some of these coolers have a circular contact plate, and with the massive size of the heatspreader, how well would it actually cool?

BradC · Aug 3, 2017

ub4ty said:
My question here is : What in the world is this use case that everyone's talking about that's causing seg faults? What is being compiled and how many times? Is the use case a kernel compilation done over and over again w/o stop? Why the creation of a ramdisk? Is memory usage ballooning over time?

This is getting OT for this thread. If you're cool using an unreliable processor (you ran the test, it failed, your machine is affected) then have at it. You'll probably see random processes die at random times so infrequently that you'll put it down to something else. How many people got random and annoying crashes when overclocking before someone thought of running software that made things crash faster to highlight instability? The looped compile is a stability test. It doesn't mean the system is stable unless you are compiling large software. Compiling software just makes it happen quicker to make a nice easy to reproduce test. The ram disk is just to stop it hammering your SSD while running the tests.

I caught it originally when compiling a new kernel for the machine and it bombed. I've had a windows VM die after 6 weeks and I've had CUPS die twice in 2 weeks. Stuff shouldn't just die, so I did some searching and found a whole lot of other people complaining about a similar thing. Apparently AMD says it affects a tiny proportion of Linux users. It probably does, but at the moment that's you and I. You may not care, but that doesn't mean it doesn't affect your system.

This'll be my last post on it in this thread. I'm happy to elaborate further if you want to create a thread on it (or better still, just read the AMD forum thread I linked originally).

Markfw · Aug 3, 2017

BradC said:
This is getting OT for this thread. If you're cool using an unreliable processor (you ran the test, it failed, your machine is affected) then have at it. You'll probably see random processes die at random times so infrequently that you'll put it down to something else. How many people got random and annoying crashes when overclocking before someone thought of running software that made things crash faster to highlight instability? The looped compile is a stability test. It doesn't mean the system is stable unless you are compiling large software. Compiling software just makes it happen quicker to make a nice easy to reproduce test. The ram disk is just to stop it hammering your SSD while running the tests.

I caught it originally when compiling a new kernel for the machine and it bombed. I've had a windows VM die after 6 weeks and I've had CUPS die twice in 2 weeks. Stuff shouldn't just die, so I did some searching and found a whole lot of other people complaining about a similar thing. Apparently AMD says it affects a tiny proportion of Linux users. It probably does, but at the moment that's you and I. You may not care, but that doesn't mean it doesn't affect your system.

This'll be my last post on it in this thread. I'm happy to elaborate further if you want to create a thread on it (or better still, just read the AMD forum thread I linked originally).

This thread is for people that are BUILDING TR systems, not whiners about it. And I have 2 Ryzen systems that have never failed me, and have run 24/7@100% load doing DC.

So take your crap elsewhere, and stay out of my thread.

ajc9988 · Aug 3, 2017

tamz_msc said:
TR including Asetek brackets for the ability to mount existing AIO's is fine, but some of these coolers have a circular contact plate, and with the massive size of the heatspreader, how well would it actually cool?

I forgot which video or article I read talking about it, but the active dies are caddy corner from each other. This helps to spread the heat to prevent hotspots. This means the cooler will cover most (if not all) of the dies while cooling a large portion of the rest of the ihs. Remember, heat seeks cold. So I'd expect it to be fairly effective. Keep an eye on core temp variance when setting clocks and voltages and you should be more than fine.

ub4ty · Aug 3, 2017

BradC said:
This is getting OT for this thread. If you're cool using an unreliable processor (you ran the test, it failed, your machine is affected) then have at it. You'll probably see random processes die at random times so infrequently that you'll put it down to something else. How many people got random and annoying crashes when overclocking before someone thought of running software that made things crash faster to highlight instability? The looped compile is a stability test. It doesn't mean the system is stable unless you are compiling large software. Compiling software just makes it happen quicker to make a nice easy to reproduce test. The ram disk is just to stop it hammering your SSD while running the tests.

I caught it originally when compiling a new kernel for the machine and it bombed. I've had a windows VM die after 6 weeks and I've had CUPS die twice in 2 weeks. Stuff shouldn't just die, so I did some searching and found a whole lot of other people complaining about a similar thing. Apparently AMD says it affects a tiny proportion of Linux users. It probably does, but at the moment that's you and I. You may not care, but that doesn't mean it doesn't affect your system.

This'll be my last post on it in this thread. I'm happy to elaborate further if you want to create a thread on it (or better still, just read the AMD forum thread I linked originally).

My last response.
The processor is stable as proofed by its ability to handle sustained executional benchmarks.
It is proofed by the sheer number of people running the processor w/o issue. Beyond that is software, corner cases, and stateful data issues .. aka caching subsystems.

The issue seems to be segmented to a specific compilation use case which makes it a corner case with assigned internal priority. I went through the thread found here : https://community.amd.com/thread/215773?start=525&tstart=0 (90% of the people seem to be quite uninformed on how processors function). They hit a corner case issue and are screaming bloody murder. The main execution pipeline seems fine. Where the problem seems to be centered is on OP Caching and other supportive structures...
http://www.anandtech.com/show/10578...lers-micro-op-cache-memory-hierarchy-revealed

Bugs in such optimization subsystems get worked out as they for all processors.
https://github.com/xoreaxeaxeax/sandsifter/blob/master/references/domas_breaking_the_x86_isa_wp.pdf
This is likely why the problem goes away when you disable OP Caching...

I really should get paid to debug issues like this. Who knows, maybe there is something in store for this bug 😉

and no AMD isn't going to disclose anything of note about what the issue is if they've found it and are working on it. This section of the processor doesn't get detailed much. As for other random issues you have (6 WEEKS UPTIME), it's software ... Congrats on 6 weeks uptime.

ub4ty · Aug 3, 2017

ajc9988 said:
I forgot which video or article I read talking about it, but the active dies are caddy corner from each other. This helps to spread the heat to prevent hotspots. This means the cooler will cover most (if not all) of the dies while cooling a large portion of the rest of the ihs. Remember, heat seeks cold. So I'd expect it to be fairly effective. Keep an eye on core temp variance when setting clocks and voltages and you should be more than fine.

True enough. Look at even a stock Ryzen wraith spire... The Contact plate is circular.

ub4ty · Aug 3, 2017

Question :
Do AIO water coolers come w/ a damage warranty w.r.t warrantee/guarantee against component damage in case these things leak?

*Taking question over to :
https://forums.anandtech.com/forums/cases-cooling.13/

Will drop inquiry details here once I get them. Trying to resolve going AIO or air cooling for threadripper. The nightmare scenario of a leak on such an expensive system + expensive GPUs and subsequent potential data loss doesn't seem worth it 🙁

Cringe : https://forums.anandtech.com/thread...-trashed-could-i-have-prevented-this.2512319/

So, i might end up w/ a noctua cpu cooler while throwing their ugly arse poo colored fan in the trash. Thing is if they are even rated to handle 180W. I went to the page and all of them indicate 100W ratings

coercitiv · Aug 3, 2017

ub4ty said:
So, i might end up w/ a noctua cpu cooler while throwing their ugly arse poo colored fan in the trash. Thing is if they are even rated to handle 180W. I went to the page and all of them indicate 100W ratings

Take a look at their TDP guide: the new socket TDP values are not listed, but you can check AM3 ratings since that socket featured 220W CPUs.

The U14S is good for 220W.
The U12S goes beyond 140W but does not reach 220W.
The U9S is only 125W, but the new TR4 model comes with one more heatpipe and an additional fan for push-pull config.

I expect all of them to handle 180W, while U14 will go beyond stock easily.

tamz_msc · Aug 3, 2017

ajc9988 said:
I forgot which video or article I read talking about it, but the active dies are caddy corner from each other. This helps to spread the heat to prevent hotspots. This means the cooler will cover most (if not all) of the dies while cooling a large portion of the rest of the ihs. Remember, heat seeks cold. So I'd expect it to be fairly effective. Keep an eye on core temp variance when setting clocks and voltages and you should be more than fine.

ub4ty said:
True enough. Look at even a stock Ryzen wraith spire... The Contact plate is circular.

I still have my doubts, but we'll see. Circular contact plates isn't a problem for AM4 because the heatspreader is square and of a much smaller size, plus the die is bang in the middle, so everything works out in terms of symmetry. For reference, the TR4 edition of the Noctua air coolers have a contact plate of dimension 70x56mm, which is what you'd really want if these schematics are to be believed.

tamz_msc · Aug 3, 2017

ub4ty said:
My last response.
The processor is stable as proofed by its ability to handle sustained executional benchmarks.
It is proofed by the sheer number of people running the processor w/o issue. Beyond that is software, corner cases, and stateful data issues .. aka caching subsystems.

The issue seems to be segmented to a specific compilation use case which makes it a corner case with assigned internal priority. I went through the thread found here : https://community.amd.com/thread/215773?start=525&tstart=0 (90% of the people seem to be quite uninformed on how processors function). They hit a corner case issue and are screaming bloody murder. The main execution pipeline seems fine. Where the problem seems to be centered is on OP Caching and other supportive structures...
http://www.anandtech.com/show/10578...lers-micro-op-cache-memory-hierarchy-revealed

Bugs in such optimization subsystems get worked out as they for all processors.
https://github.com/xoreaxeaxeax/sandsifter/blob/master/references/domas_breaking_the_x86_isa_wp.pdf
This is likely why the problem goes away when you disable OP Caching...

I really should get paid to debug issues like this. Who knows, maybe there is something in store for this bug 😉

and no AMD isn't going to disclose anything of note about what the issue is if they've found it and are working on it. This section of the processor doesn't get detailed much. As for other random issues you have (6 WEEKS UPTIME), it's software ... Congrats on 6 weeks uptime.

That PDF looks interesting - will have a look at it. Thanks.

Sequences · Aug 3, 2017

Has there been any news of workstation motherboards for TR (not just "gamer" models)?

tamz_msc · Aug 3, 2017

Sequences said:
Has there been any news of workstation motherboards for TR (not just "gamer" models)?

If by that you mean which company's board aesthetic is the least offensive, then AsRock has you covered - they're the only ones providing dual-Intel GbE as well, at a reasonable price. Otherwise, if you meant something akin to the WS lineup from Asus, I'm afraid it'll have to wait.

StinkyPinky · Aug 3, 2017

Surely I'm not the only that is sick of bling bling motherboards that look like they're from the Borg Collective?

ajc9988 · Aug 3, 2017

ub4ty said:
Question :
Do AIO water coolers come w/ a damage warranty w.r.t warrantee/guarantee against component damage in case these things leak?

*Taking question over to :
https://forums.anandtech.com/forums/cases-cooling.13/

Will drop inquiry details here once I get them. Trying to resolve going AIO or air cooling for threadripper. The nightmare scenario of a leak on such an expensive system + expensive GPUs and subsequent potential data loss doesn't seem worth it 🙁

Cringe : https://forums.anandtech.com/thread...-trashed-could-i-have-prevented-this.2512319/

So, i might end up w/ a noctua cpu cooler while throwing their ugly arse poo colored fan in the trash. Thing is if they are even rated to handle 180W. I went to the page and all of them indicate 100W ratings

https://www.amd.com/en/thermal-solutions-threadripper
Updated list of compatible coolers from the horse...

Pandora's Box · Aug 3, 2017

Will be cooled by my EVGA CLC 280mm.

Haven't had an AMD Processor since the AMD 64 3200+ back in 2005.

I added a 2nd 1TB 960 Pro NVMe drive to this purchase. Going completely NVMe with this build. No SATA cables and cutting back on power cables too. Will be transferring over my 512 GB 950 Pro NVMe drive from my current build.

ajc9988 · Aug 3, 2017

Pandora's Box said:
Will be cooled by my EVGA CLC 280mm.

Haven't had an AMD Processor since the AMD 64 3200+ back in 2005.

I added a 2nd 1TB 960 Pro NVMe drive to this purchase. Going completely NVMe with this build. No SATA cables and cutting back on power cables too. Will be transferring over my 512 GB 950 Pro NVMe drive from my current build.

So, you will pay $400 for 3200MT/s ram, when 4133 costs $440-450 for 4x8GB. Considering the Zenith and MSI boards now supposedly run 3600, but the QVL lists none, it means it runs it, but not at the rated speed (XMP profile). So, if you are willing to manually set the ram OC, you get ram that will last you longer and run faster at a small premium. Or maybe kill thinking about this wrong...

Pandora's Box · Aug 3, 2017

ajc9988 said:
So, you will pay $400 for 3200MT/s ram, when 4133 costs $440-450 for 4x8GB. Considering the Zenith and MSI boards now supposedly run 3600, but the QVL lists none, it means it runs it, but not at the rated speed (XMP profile). So, if you are willing to manually set the ram OC, you get ram that will last you longer and run faster at a small premium. Or maybe kill thinking about this wrong...

It's confirmed 3600 works? Also I doubt theres 3600 ram with CAS 14 timings.

ajc9988 · Aug 3, 2017

Pandora's Box said:
It's confirmed 3600 works? Also I doubt theres 3600 ram with CAS 14 timings.

Yes and no. The website still says 3200 on the Zenith and MSI boards, but the bulk of news coverage since QVLs at the different companies were published went from as marked to Zenith and MSI supporting 3600.

I want to be clear, according to QVL, only two companies have ram listed at 3600: Gigabyte and Asrock. If you look at Asus, MSI, and Giga, they all only contain ram that works at its rated speeds (XMP). If you look at Asrock's list, they marked down every ram tested and the speed achieved. Now, that means not everything is simple plug and play on ram, just that you have a larger list of tested ram.

Asrock does have one type of ram running at 3600, its rated speeds. Giga as well. So my thought is, after seeing how Asrock did, they don't want to publish a QVL that isn't plug and play (unlike Asrock), but they still wanted to get the word out that if you tune like with Asrock, you can reach the same speeds as Asrock. This would explain why so many journalists reported higher ram speeds in the past day than what has been found on the companies' websites.

Edit: my two kits of 4133 in my M8E do 3733@CL14. So challenge accepted!

Pandora's Box · Aug 3, 2017

I'll stick with this set of 3200mhz cas 14 ram that's on the qvl list for the msi board.

ub4ty · Aug 3, 2017

coercitiv said:
Take a look at their TDP guide: the new socket TDP values are not listed, but you can check AM3 ratings since that socket featured 220W CPUs.

The U14S is good for 220W.
The U12S goes beyond 140W but does not reach 220W.
The U9S is only 125W, but the new TR4 model comes with one more heatpipe and an additional fan for push-pull config.

I expect all of them to handle 180W, while U14 will go beyond stock easily.

Thanks, I just caught that. They seemingly need to update this with the threadripper product's specs. Indeed U14S is good to 220W. U12S isn't spec'd beyond 140W besides mention and U9S is hard stopped at 125W which is way below 180W. Thus, they need to update the spec w/ what these things can actually achive

ub4ty · Aug 3, 2017

tamz_msc said:
I still have my doubts, but we'll see. Circular contact plates isn't a problem for AM4 because the heatspreader is square and of a much smaller size, plus the die is bang in the middle, so everything works out in terms of symmetry. For reference, the TR4 edition of the Noctua air coolers have a contact plate of dimension 70x56mm, which is what you'd really want if these schematics are to be believed.

Just did some back hand analysis found here :
http://imgur.com/a/kLqt4
There's no way any of those AIO Water coolers with unrevised contact plates actually cover the dies completely in a diagonal configuration. The issue isn't the width.. It's the height of the die and the fact that its at a diagonal which is why the heat spreader is rectangular.

There's no way the current AIO Water coolers have a diameter in-line with or that is the length of that diagonal from Active die to die. Something smells fishy here and I don't like it.

Noctua air coolers have a contact plate of dimension 70x56mm... Due to the rectangular configuration of the active dies and the subsequent rectangular heatspreadder.

Comparing a visual picture of the dies underneath the heatspreader to the actual schematic...
In order for any of these AIO water cooler's to cover the portions of the heat spreader in contact with the dies, they must be (at the absolute minimum) ~52 mm in diameter inside the bolt screws :

This is highly unlikely .. I will be tracking down measurements of these things now

lixlax · Aug 3, 2017

Gamers Nexus on this topic:

But has anyone actually given a thought as to why these CPU's are so heavy? Yes, they are big, but not that big (~130 grams). My bet is that the IHS is much thicker than the ones on most the other CPU's and therefore spreads the heat across itself much more effectively. So its not even essential to have 100% of the die area to be covered by thermal paste/cooler base. Quess we'll find out soon. What do you guys think?

MarkPost · Aug 3, 2017

I prefer the heatsink to cover the entire heatspreader surface, I think results will be better. Yeah, those existing "compatible" coolers will be enough, but still

ub4ty · Aug 3, 2017

lixlax said:
Gamers Nexus on this topic:

But has anyone actually given a thought as to why these CPU's are so heavy? Yes, they are big, but not that big (~130 grams). My bet is that the IHS is much thicker than the ones on most the other CPU's and therefore spreads the heat across itself much more effectively. So its not even essential to have 100% of the die area to be covered by thermal paste/cooler base. Quess we'll find out soon. What do you guys think?

Yep. Confirms exactly what I said. However, I went a step further and specified the bare minimum diameter needed for actual coverage of the dies by the contact surface. The diameter has to be 51.9mm. This is backed up by :

Nothing on the market is this diameter and is thus not suited for this product. Compatible is a fancy marketing term for : fits on the dam thing. There is no way there is any form of efficiency given how little the diameter these things are. What are you going to do? have some cores with inferno temps while others are cool? wtf.. This is b.s

They were designed for a single die not 2 dies diagonally separated with space in the middle. None of these AIO water coolers are efficient. I am canceling my consideration of them and getting a Noctua at the end of the month and only considering a water cooler solution in which the manufacturers have enough decency to make a custom/compatible/properly engineered contact plate.

Noctua :

The same resizing needs to occur for these water cooler solutions.
EK is making custom mating blocks for their water cooling solutions.
The rest of this stuff is junk IMO.

Hey AMD,
This is why you include coolers which such products... You know, the ones you used for internal testing and Engineering samples and on EPYC .. So you don't have to rely on this clown show of after-market cost cutting at product launch and even worse let your marketing team tell people they're 'compatible' and in some cases lie and tell people they cover the active dies completely. I care because I love your products so far and want to see them in their best light. You're letting after-market make a mess of your flagship processor -_-)

MarkPost · Aug 3, 2017

If I were to building a TR system (I wont, its too much for me 😛 ) I would go all day to a Noctua U14S. It seems to be the equivalent to a NH-U14S which performs nicely

Threadripper BUILDERS thread

Diamond Member

Junior Member

Moderator Emeritus, Elite Member

Senior member

Senior member

Senior member

Senior member

Diamond Member

Diamond Member

Diamond Member

Member

Diamond Member

Diamond Member

Senior member

Senior member

Senior member

Senior member

Senior member

Senior member

Senior member

Senior member

Senior member

Senior member

Senior member

Senior member