I don't need another system...I don't need another system...I don't need another system...

Discussion in 'Distributed Computing' started by TennesseeTony, Dec 18, 2016.

  1. crashtech

    crashtech Diamond Member

    Joined:
    Jan 4, 2013
    Messages:
    6,696
    Likes Received:
    254
    #101 crashtech, Feb 13, 2017
    Last edited: Feb 13, 2017
  2. MongGrel

    MongGrel Lifer

    Joined:
    Dec 3, 2013
    Messages:
    38,755
    Likes Received:
    3,044
    I still think you Ninja'd that SR-2 out from under me elsewhere in the past. You've been using it for a better purpose anyway, I did not really need it.

    Cheers ;)

    :beercheers:
     
    #102 MongGrel, Feb 13, 2017
    Last edited: Feb 13, 2017
  3. Markfw

    Markfw CPU Moderator VC&G Moderator Elite Member
    Super Moderator

    Joined:
    May 16, 2002
    Messages:
    14,427
    Likes Received:
    631
    Nope, I paid $430 from newegg on 9/2013 for that SR-2
     
  4. crashtech

    crashtech Diamond Member

    Joined:
    Jan 4, 2013
    Messages:
    6,696
    Likes Received:
    254
    The legend of the SR-2 lives on.
     
  5. Markfw

    Markfw CPU Moderator VC&G Moderator Elite Member
    Super Moderator

    Joined:
    May 16, 2002
    Messages:
    14,427
    Likes Received:
    631
    Yes, the most "cool" motherboard I have had in many years, and now I pass it on to you crashtech...(tears stream as it went in the mail today to him)
     
  6. Kiska

    Kiska Senior member

    Joined:
    Apr 4, 2012
    Messages:
    391
    Likes Received:
    48
    On the second one, I can't put an E5-2670 v1 in it; it has a C612 chipset for v3s and above.
    For the first one, I have no idea if it will support a GPU, and I would like to install the GPUs that I got from @Markfw, or @TennesseeTony's R9 280X, into the system.

    And that Asus motherboard is a special order; none of the retailers here have it in stock.
     
  7. crashtech

    crashtech Diamond Member

    Joined:
    Jan 4, 2013
    Messages:
    6,696
    Likes Received:
    254
    I'm sorry for accidentally including a 2011-3 board, my mistake. But four of the slots in the X9DRi-LN4F+ are x16 PCIe 3.0 and will support modern GPUs. Not that you should get that board in particular; I only wanted to illustrate that there are possible alternatives to the high-end workstation board you initially selected.
     
  8. Kiska

    Kiska Senior member

    Joined:
    Apr 4, 2012
    Messages:
    391
    Likes Received:
    48
    I know there are alternatives; however, I live in Australia, and shipping on that motherboard is about $220 including customs clearance fees.

    I can try to source a lower-end version of that motherboard, the Asus Z9PA-D8, but I am not sure I can: none of the retailers here stock it, and Newegg doesn't ship that motherboard to Australia.

    EDIT:
    I have found a cheaper board, but I have no idea if NewEgg will finance it, since I can't buy it upfront: https://www.newegg.com/global/au/Product/Product.aspx?Item=N82E16813131886&ignorebbr=1

    Is this a good motherboard?
     
    #108 Kiska, Feb 14, 2017
    Last edited: Feb 14, 2017
  9. Orange Kid

    Orange Kid Elite Member

    Joined:
    Oct 9, 1999
    Messages:
    3,120
    Likes Received:
    154
  10. StefanR5R

    StefanR5R Senior member

    Joined:
    Dec 10, 2016
    Messages:
    498
    Likes Received:
    179
    @Kiska, I guess you already thought about the pros and cons of 1 dual-socket system vs. 2 single-socket systems.

    Note that many applications are somewhat slowed down in dual-socket systems because they are not NUMA-aware. Looking at DC applications, e.g. WCG OpenZika and MappingCancerMarkers work nicely in my 2P boxes, whereas PrimeGrid SoB-LLR runs considerably slower than on a comparable 1P box. (Not sure though whether this is because of NUMA, or Linux on 2P vs. Windows on 1P, or something else. Numbers to follow in a dedicated PrimeGrid post.) Generally, many classic desktop applications are not NUMA-aware. And Apple, as another example, seems to have given up on NUMA in their Mac Pro line entirely; I recall having read that they never had proper OS support for it (but I may misremember).
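    For anyone who wants to see how a 2P Linux box splits its cores across NUMA nodes, here is a minimal sketch. It assumes the standard Linux sysfs layout under /sys/devices/system/node, where each nodeN directory has a cpulist file such as "0-13,28-41"; the function names parse_cpulist and numa_nodes are made up for this example.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a sysfs cpulist string such as '0-3,8-11' into a list of CPU ids."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty cpulist (e.g. a memory-only node) yields no CPUs
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return {node_id: [cpu ids]} read from sysfs, or {} if the directory is absent."""
    nodes = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return nodes
    for node_dir in sorted(root.glob("node[0-9]*")):
        cpulist = node_dir / "cpulist"
        if cpulist.is_file():
            node_id = int(node_dir.name[len("node"):])
            nodes[node_id] = parse_cpulist(cpulist.read_text())
    return nodes


if __name__ == "__main__":
    for node_id, cpus in numa_nodes().items():
        print(f"node {node_id}: {len(cpus)} logical CPUs -> {cpus}")
```

    A single-socket box normally shows one node, a 2P box two. Whether pinning a task to the CPUs of one node actually helps a given DC application is something to measure, not assume.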

    @Orange Kid, Austria ./. Australia...
     
  11. Orange Kid

    Orange Kid Elite Member

    Joined:
    Oct 9, 1999
    Messages:
    3,120
    Likes Received:
    154
    My bad. Only off by one click:eek:
     
  12. StefanR5R

    StefanR5R Senior member

    Joined:
    Dec 10, 2016
    Messages:
    498
    Likes Received:
    179
    Hmm, looking closer at the yet-to-be-posted numbers, maybe there is actually no discernible negative effect from 2P/NUMA in PrimeGrid SoB-LLR after all (nor from Linux). There is quite a bit of memory latency difference between the 1P and 2P boxes I am looking at; the 1P has considerably faster RAM. --- When I configured my 2P Xeon boxes, I looked for low-latency ECC RAM but found none. But for my purposes, ECC was more important than memory latency.
     
  13. Markfw

    Markfw CPU Moderator VC&G Moderator Elite Member
    Super Moderator

    Joined:
    May 16, 2002
    Messages:
    14,427
    Likes Received:
    631
    UPDATE: Ok, when this started, I talked myself into one E5-2683 system....(Thanks to Tony).

    BUT NOW I have FOUR E5-2683 systems. All 4 are crunching WCG, and my averages next week should reflect that. I have figured out that each CPU takes about 100 watts at 100% load on all 28 threads. So for a 400 watt investment I am probably going to be doing 40,000 PPD for WCG with 56 cores and 112 threads. Not bad.
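    The arithmetic behind those totals, as a small sketch. The inputs are the figures from this post (4 systems, 28 threads and roughly 100 W per CPU, a guessed 40,000 PPD); the 14 cores per CPU follows from 28 threads with two threads per core. The wattage covers the CPUs only, not the rest of each system.

```python
SYSTEMS = 4
CORES_PER_CPU = 14        # 28 threads / 2 threads per core
THREADS_PER_CORE = 2
WATTS_PER_CPU = 100       # rough figure from the post, CPU only
ESTIMATED_PPD = 40_000    # the post's own guess, not a measurement


def farm_totals(systems=SYSTEMS):
    """Return total cores, threads, CPU watts and estimated PPD per watt."""
    cores = systems * CORES_PER_CPU
    threads = cores * THREADS_PER_CORE
    watts = systems * WATTS_PER_CPU
    return {
        "cores": cores,
        "threads": threads,
        "watts": watts,
        "ppd_per_watt": ESTIMATED_PPD / watts,
    }


if __name__ == "__main__":
    for name, value in farm_totals().items():
        print(f"{name}: {value}")
```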

    The folding farm is holding@6.1-6.2 million ppd, but my electrical investment, and the heat output has gone down. Now if only I can sell my 980's and motherboards.... (I have a thread in FS/FT, but no takers)

    EDIT: All of this cost me about $2,800, so my upgrades will be next to nothing until December's race.
     
    crashtech and TennesseeTony like this.
  14. Ionstream

    Ionstream Member

    Joined:
    Nov 19, 2016
    Messages:
    48
    Likes Received:
    16
    I don't know about the 980s, man. Apart from the fact that shipping ain't free, $225 does seem a little steep for GPUs that have been crunching non-stop. There's also the fact that a new 6GB 1060 costs roughly the same, offering similar performance while drawing less power.
     
  15. Markfw

    Markfw CPU Moderator VC&G Moderator Elite Member
    Super Moderator

    Joined:
    May 16, 2002
    Messages:
    14,427
    Likes Received:
    631
    So, now that I have 4 of the E5-2683's, I have a question. I have seen multiple screenshots where this chip is running at 3 GHz. How did they do that? Mine all run turbo at 2.5 GHz.
     
  16. TennesseeTony

    TennesseeTony Elite Member

    Joined:
    Aug 2, 2003
    Messages:
    1,912
    Likes Received:
    361
    The Turbo speeds for that model are 5/5/5/5/5/5/5/5/5/6/7/8/10/10. (Yeah, but what the heck does that mean?) Glad you asked, here's a link that will give the details, but the short version is that you work backwards, from the 10 on the far right: with one physical core running, the chip adds 10 bins of the 100MHz bus clock (BCLK; these chips have no Front Side Bus as such) to its base clock of 2.0GHz. That is 10 x 100MHz = 1GHz extra, giving you 3.0GHz.

    With two physical cores, again going from right to left, the next boost bin is also a 10. But run 3 physical cores, and the next boost is an 8, so only 800MHz gets added for a 2.8GHz top clock.

    I find that hard to follow personally, so I'll reverse it: 10/10/8/7/6/5/5/5/5/5/5/5/5/5.
    1 core: 10 x 100MHz + 2000MHz base clock = 3.0GHz
    2 cores: 10 x 100MHz (3.0GHz)
    3 cores: 8 x 100MHz (2.8GHz)
    4 cores: 7 x 100MHz (2.7GHz)
    5 cores: 6 x 100MHz (2.6GHz)
    6 through 14 cores: you "only" get a boost of 5 x 100MHz (2.5GHz)
    But AVX tasks may prevent full boost, I've heard. I think this is more or less tied to the TDP of 120W for that model. And as yours run full blast at much less wattage than that, I doubt you'd ever see a slowdown while running AVX tasks.

    If you have seen screenshots of ALL cores running at 3.0, that involves voodoo and other risky forms of persuasion. Something about updating the microcode on the CPU itself, modified BIOSes, all sorts of fun stuff. All well beyond my comprehension.
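    The bin arithmetic above can be sketched in a few lines. This assumes the bin list, the 2.0GHz base clock and the 100MHz bin size quoted in this post; the function name max_turbo_mhz is made up for the example, and AVX limits are ignored.

```python
BASE_MHZ = 2000   # base clock of the chip, as quoted in the post
BCLK_MHZ = 100    # size of one turbo bin

# Turbo bins by number of active cores (index 0 = 1 active core),
# i.e. the reversed list from the post: 10/10/8/7/6/5/5/5/5/5/5/5/5/5.
TURBO_BINS = [10, 10, 8, 7, 6, 5, 5, 5, 5, 5, 5, 5, 5, 5]


def max_turbo_mhz(active_cores):
    """Return the top non-AVX clock in MHz for a given number of active cores."""
    if not 1 <= active_cores <= len(TURBO_BINS):
        raise ValueError(f"active_cores must be between 1 and {len(TURBO_BINS)}")
    return BASE_MHZ + TURBO_BINS[active_cores - 1] * BCLK_MHZ


if __name__ == "__main__":
    for cores in (1, 2, 3, 4, 5, 6, 14):
        print(f"{cores:2d} active cores: {max_turbo_mhz(cores)} MHz")
```

    With all 14 cores loaded, as in a crunching box, this gives 2500 MHz, which matches the 2.5 turbo reported in the question.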
     
    #116 TennesseeTony, Feb 20, 2017
    Last edited: Feb 20, 2017
  17. StefanR5R

    StefanR5R Senior member

    Joined:
    Dec 10, 2016
    Messages:
    498
    Likes Received:
    179
    For all practical purposes, you can ignore Intel's Max Turbo = Single-core Turbo specification, even though this clock is featured heavily in spec sheets and online shops. Most OSs have great difficulty ever reaching this clock, simply because task migration between cores and background tasks prevent the other cores from idling long enough for the CPU to decide to go full single-core turbo on the most loaded core.

    In my experience, all-core turbo or something slightly higher is the most common clock to be seen on a light-to-medium loaded Xeon, and on non-K desktop CPUs as well. Single core turbo might blip up now and then, but hardly for long.

    (Looks like this is going to change with AMD Ryzen, and maybe even AMD Naples...?)

    Yep, in case of Haswell-EP, this microcode manipulation exploits a specific hardware erratum, as I have learned from the "CPUs and Overclocking/ What controls Turbo Core in Xeons?" thread.

    Haswell-EP and Broadwell-EP have an additional set of base frequency and single-/multi-/all-core turbo frequencies for AVX workloads. This set of extra base/turbo frequencies is fixed by Intel for each processor model. According to Johan De Gelas' Haswell-EP review, the processor continuously (or periodically?) checks whether any core is running AVX code. If yes, the AVX set of frequencies is applied for at least the next 1 ms, instead of the normal set.

    So from what I understand, this downclocking under AVX loads is predetermined, not due to dynamic measurement of actual power consumption and/or temperature. (Thermal throttling will be applied independently of this if cooling is insufficient.)

    Furthermore, I have read that Haswell-EP clocks down all cores simultaneously if at least one core is subjected to AVX load. But I have no Haswell-EP of my own to verify if this info is correct. Broadwell-EP applies AVX frequencies only to cores which actually do run AVX code, while the remaining cores continue to run at non-AVX frequencies.

    As far as I have seen, current World Community Grid applications do not utilize AVX --- or at least not those AVX instructions and data which would trigger this downclocking. (Could be that 128 bit wide vectors do not cause downclocking, in contrast to 256 bit wide vectors. But I am not aware of definitive documentation on this matter.)

    Several if not all PrimeGrid applications utilize AVX on CPUs which have it, thus force the CPU to AVX turbo or AVX base.
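    To make the difference between the two behaviours concrete, here is a toy model. It only encodes what is described above (all cores drop together on Haswell-EP as far as I have read, per-core on Broadwell-EP); the clock values in the example are placeholders, not Intel specifications, and the function name is made up.

```python
from typing import List


def effective_clocks(avx_active: List[bool], normal_mhz: int, avx_mhz: int,
                     per_core: bool) -> List[int]:
    """Return the clock each core would run at under this simplified model.

    avx_active[i] is True if core i currently runs AVX code.
    per_core=True  models the Broadwell-EP behaviour described above.
    per_core=False models the (unverified) Haswell-EP behaviour.
    """
    if per_core:
        return [avx_mhz if busy else normal_mhz for busy in avx_active]
    clock = avx_mhz if any(avx_active) else normal_mhz
    return [clock] * len(avx_active)


if __name__ == "__main__":
    load = [True, False, False, False]   # only core 0 runs AVX code
    # 2500 and 2200 MHz are placeholder values for illustration only.
    print("all-core drop:", effective_clocks(load, 2500, 2200, per_core=False))
    print("per-core drop:", effective_clocks(load, 2500, 2200, per_core=True))
```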
     
    TennesseeTony likes this.
  18. Kiska

    Kiska Senior member

    Joined:
    Apr 4, 2012
    Messages:
    391
    Likes Received:
    48
    This is for laptops: on my i5-6200U, Intel applies an AVX downclock on all cores if it detects AVX code, no matter if it's 128, 256 or 512 bits wide. It will downclock to AVX boost, which on this processor is 25x; without AVX, 28x applies when 1 core is loaded and 27x when both are.

    On my i5-4210U unit, when AVX code runs, whether 128 or 256 bits wide, it will downclock to the BASE frequency of the processor on all cores. There is no AVX base or turbo; it will downclock to 1.7GHz.

    So Skylake downclocks only the core on which AVX is detected while the rest stay in their other mode, whereas Haswell forces downclocking on all cores, down to the processor's base clock speed.
     
  19. PCTC2

    PCTC2 Diamond Member

    Joined:
    Feb 18, 2007
    Messages:
    3,879
    Likes Received:
    8
    Damnit guys. Since the race, I bought an E5-1620v4, realized it wasn't enough for my work, and upgraded to an E5-2620v4. I should've just gotten a used E5-2683v3 for the same price!

    Right after the race I ended up buying a new rig. It now has an E5-2620v4, 32GB DDR4, and dual GTX 1070s. First new rig in years, and now you guys are making me want to upgrade it again for the second time in a month!
     
  20. Markfw

    Markfw CPU Moderator VC&G Moderator Elite Member
    Super Moderator

    Joined:
    May 16, 2002
    Messages:
    14,427
    Likes Received:
    631
    You have plenty of CPU power for F@H; it uses GPU power mostly (PPD-wise) and you have 2 1070's, so you are good!

    What are you going to use the CPU and GPU for ?
     
  21. PCTC2

    PCTC2 Diamond Member

    Joined:
    Feb 18, 2007
    Messages:
    3,879
    Likes Received:
    8
    Mainly just for work for now. Some machine learning/GPU compute kind of things. Personal research. Also, lots of video encoding/decoding. And the GTX 1070s are for playing games at 3K@100Hz, because I haven't seriously gamed in years and I'm getting back into it. I have 32GB of DDR4 and the E5-1620v4 lying around, so I'm contemplating buying another X99 board and another GTX 1070 or similar for a 24/7 Linux workstation. I would swap the E5-1620v4 back into the desktop for pure single-threaded performance and put the E5-2620v4 into the new box for video encoding, virtualization, and research.

    No F@H yet, but maybe if there's another race sometime this year before I leave the country, I'll join in. ;)
     
  22. PCTC2

    PCTC2 Diamond Member

    Joined:
    Feb 18, 2007
    Messages:
    3,879
    Likes Received:
    8
  23. Markfw

    Markfw CPU Moderator VC&G Moderator Elite Member
    Super Moderator

    Joined:
    May 16, 2002
    Messages:
    14,427
    Likes Received:
    631
    Well, that's pretty good, but if you only have 2 video cards and want a 2011-3 mobo, then Newegg has an X99 ASRock open box for $146 (I have one of these)

    https://www.newegg.com/Product/Product.aspx?Item=N82E16813157543R

    And the regular one was on sale for $120 (or was it $130) a while back (like one week ago). Could wait for that again. I am pretty well spent on the 4 E5-2683 systems I just brought up.
     
  24. StefanR5R

    StefanR5R Senior member

    Joined:
    Dec 10, 2016
    Messages:
    498
    Likes Received:
    179
    I received a question about the motherboards of my two dual-socket boxes, and I thought I would put the answer into this thread:

    These were my very first and so far only 2P builds, meaning that I have no experience beyond these. (I had occasional limited access to 2P systems before, but never had to deal with their hardware or system-level software aspects.)

    I chose Supermicro because I already had good experience with an X10SAE for my Haswell Xeon E3 as my primary home desktop. Second, I chose X10DAX because I needed good CPU clocks from the Xeon E5s. X10DAX is among the "Hyper-Speed" line of Supermicro boards. This line features (1.) some sort of turbo enhancement, (2.) BCLK overclocking, VRMs which are capable of supporting the power draw of 1. and 2., and (3.) latency-optimized firmware.

    Of these features, I was keen on (1.) and on the VRMs to go with it. I have not found any proper documentation about what this "Hyper-Turbo" feature is really doing, but I hoped that it would drive the Xeons at all-core turbo clock all the time when under full load. This turned out to be true. I have never seen it go below all-core turbo, except on idle cores. (Fewer-core turbo clocks are occasionally applied provided there is only a partial load, but this is not important to me on these boxes.)

    However, it could be that other, cheaper boards can permanently hold all-core turbo too. But I have no idea how widespread this feature is. Maybe it is even standard? I am not just a 2P virgin, but a Xeon E5 virgin as well.

    As for features (2.) and (3.): I have not experimented with BCLK overclocking yet, and don't really plan to. Low-latency firmware is good to have, but I don't use these systems as a desktop or for high-frequency stock trading or something like that.

    There is no BMC/VGA on the board, i.e. it needs a graphics card. But any DCer worth its salt would put some of those PCIe slots to good use for GPU crunching goodness anyway. (I only have a low-wattage, passively cooled, hand-me-down GPU in there which is only supposed to display the BIOS, and rarely a desktop.) Also, there is no storage controller besides the one in the PCH, and just run-of-the-mill dual 1G networking. The upside of the lack of extra controllers onboard is that POST times are fairly low, comparable with the lowest POST times among X99 boards.

    Thermal sensors etc. are rather sparse, at least with Linux. I get to see only one temperature probe besides CPU temperatures, and I have no idea where this probe is located.

    The board has several PWM-enabled fan headers, which I like. Fan control options are mostly absent; actually, I don't remember whether there are any options at all, and can't check right now. As far as I can tell, all PWM headers are driven at the same level, dependent on the hotter of the two CPUs. But I am very satisfied with the BIOS's fan curve, in combination with the heat sinks, CPU fans, and case fans that I chose.

    Ah, I forgot: Another major factor why I chose X10DAX was that two Noctua NH-D15S fit on it, blowing upwards, straight out of the fully-meshed top side of the computer case. This avoids exhaust from one cooler getting into the intake of the other. In combination with the plentiful radiator surface of the NH-D15S, and a passive GPU, it makes for a super-quiet system. These two boxes could IMO be used even in a bedroom, both running at full CPU load all day and night. (I have no place to put loud hardware.) The case is a Corsair Carbide 400Q; tray modded for SSI EEB layout; front fascia permanently removed for easier cleaning of the front fan filter. (Function over form; case chosen for its compact size. Making it SSI EEB compatible can be a PITA depending on one's craftsmanship, which I for one mostly lack.)

    RAM support:

    I went for registered ECC RAM. These boxes are supposed to perform week-long computations if needed, and I have little use for bit flip errors during such a computation... The X10DAX manual and QVL list only mention support for RDIMM (registered), and LRDIMM (load reduced) modules. Supported RAM speeds depend on the CPU generation (Haswell E5-26xx v3, versus Broadwell E5-26xx v4), and in case of RDIMM on the number of populated slots per channel.

    I do not know whether unregistered DIMMs could work, have not tried, and currently do not have spare modules to try.

    Final note: When I built these two 2P boxes, I had no idea that I would get into distributed computing one day. As they were built for multi-core CPU computing, they happen to work quite well for DC applications too, I believe with good performance per Watt, but obviously not on a performance per dollar level anywhere near a more typical DC enthusiast system.
     
    #124 StefanR5R, Apr 17, 2017
    Last edited: Apr 17, 2017
    crashtech, Smoke and TennesseeTony like this.
  25. StefanR5R

    StefanR5R Senior member

    Joined:
    Dec 10, 2016
    Messages:
    498
    Likes Received:
    179
    Over in the Cases & Cooling forum, I mentioned that I had a 2P PC and fitted it with phat dual-tower air coolers. I was asked to provide pictures, which I did today. Crossposting them here too:
    [17 images]
    BTW, I keep telling myself that I do in fact not need yet another system of this sort.