Memory overclocking can take a long, long time to dial something in well. I do it on my main system as a hobby along with all sorts of other overclocking and performance tweaking, but when I threw together the other box with that CL15 3600 stuff, I was like "eh, XMP is good enough for me".
Well, you have the option of raising VDIMM as needed to increase the speed, or to drop the latency timings, or both. It may also require bumping up VCCIO (one of the IMC-related voltages).
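For a rough sense of what raising the speed versus dropping the timings buys you, here's a quick sketch of the usual CAS-latency-in-nanoseconds arithmetic. It assumes DDR memory (two transfers per clock), and the function name and every comparison point other than the CL15 3600 kit are made up for illustration, not measured results.

```python
def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """Absolute CAS latency in nanoseconds.

    DDR moves data twice per clock, so one clock cycle lasts
    2000 / data_rate (in MT/s) nanoseconds.
    """
    return cas_cycles * 2000.0 / data_rate_mts


# Only the 3600 CL15 kit comes from the post; the rest are hypothetical targets.
for rate, cl in [(3600, 15), (3600, 14), (3800, 15), (3200, 16)]:
    print(f"{rate} CL{cl}: {cas_latency_ns(rate, cl):.2f} ns")
```

This only covers CAS latency. It says nothing about secondary timings, command rate, or whether a given setting will actually be stable.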
I'm not sure I want to fiddle with the timings, and it might entail changing the secondary timings too; I can't say. I've done it before, but the trial-and-error aspect is a pain in the ass.
It's the sort of thing you'd want to nail down early in building the system, after overclocking the CPU.
The easiest thing I've found is to run the command rate at 1T. But that would be a lot more troublesome on a 4x8GB system than on a 2x16GB one, and it might not even be possible with anything but a kit of two RAM modules. Otherwise, you can generally count on G.SKILL kits to make it possible.