Neither 10GB with 6 threads. It was a quick check (not really quick, 1750 sec), but it seems to be stable: it didn't immediately crash at 1.18V and almost passed one run, which suggests it should be fine. I'm going to leave it running overnight at max memory usage, and I'd be surprised to see it crash. I always report the temperature of the hottest core. I have a meter, so I can check how much the power usage drops from 1.25V to 1.2V. The temps I reported earlier were at 4GB usage; 10GB usage is much more strenuous, 6C more and 20 GFLOPS!!! I think air is enough if you aim for 4-4.4GHz; anything more needs water IMHO. LinX at max memory usage runs a good 10-15C higher than any load a useful app could generate, so even 95C in LinX is fine IMHO.
I'd say that's a plausible view of it. But this question has recently been visited in other threads about Haswell K processors in general. There is a complaint that the AVX2 portion of the stress-test software pushes these temperatures higher than what was seen before the AVX2 instruction set existed. [I assume from how this feature is variously described that it is a new instruction-set component for the Haswell-gen processors.]
Understandably, the discussion moved toward the issue of "how much stress-testing was necessary," or whether certain tests were any less useful than more strenuous tests. The author of OCCT asserts that his own "OCCT CPU" test component runs cooler but will reveal more errors and sooner than the LinPack test included as an OCCT option.
IDontCare opined that there has to be some "standard." This implies that using limp stress-tests has limited value.
If you have a successful test run that pushes the temperatures to 95C, still within the throttling limit for the processor, you could argue exactly the same thing: it passed the stress test, but the processor will never reach those temperatures under any normal usage.
I might be inclined to agree with that, but I speculate that frequent and extensive stress-testing under those conditions might actually shorten processor life-span.
That's why I agree with IDontCare on the value of robust testing of relatively short duration. For instance, OCCT's author suggests 3.5 hours for his test; others advise 25 to 30 iterations of IBT or LinX. These test durations are in stark contrast to established Prime95 usage, for which a user may prefer a duration of 18 hours or more.
Anyway -- we agree about the cooling. You can actually "get somewhere" with overclocking under NH-D14/D15 or Corsair H110i cooling, but you might prefer some custom-water solution with more radiator and cooling capacity.
Lower voltage means less thermal wattage, and conversely -- lower temperatures are likely to trim something off the voltage requirement for an overclock target.
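For a rough back-of-the-envelope number on the voltage point: assuming dynamic power scales roughly with V^2 * f (the usual CMOS approximation, which ignores leakage, and leakage itself climbs with temperature), the 1.25V to 1.2V drop you mentioned would work out something like this. The function name is just mine for illustration, and your meter reading at the wall will differ since it includes the rest of the system.

```python
def dynamic_power_ratio(v_new, v_old, f_new=1.0, f_old=1.0):
    """Estimate new/old dynamic power, assuming P is proportional to V^2 * f.

    This is only an approximation: it ignores leakage current and
    everything else in the system that draws power.
    """
    return (v_new / v_old) ** 2 * (f_new / f_old)


# Same clock speed, core voltage lowered from 1.25V to 1.20V
ratio = dynamic_power_ratio(1.20, 1.25)
print(f"Estimated dynamic power: {ratio * 100:.1f}% of the original")
print(f"Estimated saving: {(1 - ratio) * 100:.1f}%")
```

So on paper that is a bit under 8% less CPU dynamic power at the same clock. Treat it as an estimate only; the meter is the real answer.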