I did a small test to see what kind of improvement the delta color compression (DCC) implemented in Carrizo brings in practice. The results are quite depressing, but pretty much exactly what was expected. While DCC slightly improves the situation, for some reason it does not seem to be as effective on Carrizo as it is on discrete cards (e.g. Tonga).
I fixed the GPU frequency at 400MHz so that it could reach its full performance, unrestricted by bandwidth, at the higher memory frequencies. Then I ran the test at different memory speeds. The TDP was set to 65W in order to keep the frequencies stable.
Despite DCC, the iGPU still needs around 62MB/s of bandwidth per GFLOPS in order to reach most (96.8%) of its full performance. That figure is similar to Kaveri (no DCC available); however, the delta (performance loss due to bandwidth limitation) is smaller on Carrizo thanks to DCC.
Based on these numbers, the FX-8800P (800MHz, 512SP GPU) would require 51.2GB/s (819.2 GFLOPS * 62.5MB/s) of memory bandwidth in order to fully utilize its GPU. This doesn't look too good for the AM4 Bristol Ridge parts, which have significantly higher GPU frequencies and still have only 38.4GB/s (2400MHz DDR4) of bandwidth available. Ideally the fastest Bristol Ridge AM4 APU would need to be paired with RAM operating at ~4400MHz.
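For anyone who wants to plug in their own numbers, here is a rough Python sketch of the arithmetic above. The function names are made up for illustration. The 2 FLOPs per shader per clock and the dual-channel, 64-bit memory layout are assumptions, chosen because they reproduce the 819.2 GFLOPS and 38.4GB/s figures. The 1100MHz Bristol Ridge GPU clock in the last line is also an assumed value, not something I measured.

```python
# Rule of thumb from the test: ~62.5 MB/s of memory bandwidth per GFLOPS.
MB_PER_S_PER_GFLOPS = 62.5


def peak_gflops(shaders, clock_mhz):
    """Theoretical FP32 throughput, assuming 2 FLOPs per shader per clock."""
    return shaders * 2 * clock_mhz / 1000.0


def required_bandwidth_gbs(shaders, clock_mhz):
    """Bandwidth (GB/s) the GPU would need according to the rule of thumb."""
    return peak_gflops(shaders, clock_mhz) * MB_PER_S_PER_GFLOPS / 1000.0


def memory_bandwidth_gbs(mem_speed, channels=2, bus_bytes=8):
    """Peak bandwidth (GB/s), assuming dual-channel memory with 64-bit channels."""
    return mem_speed * bus_bytes * channels / 1000.0


def required_memory_speed(shaders, clock_mhz, channels=2, bus_bytes=8):
    """Memory speed (the 'MHz' rating) needed to feed the GPU."""
    needed_mb_s = required_bandwidth_gbs(shaders, clock_mhz) * 1000.0
    return needed_mb_s / (bus_bytes * channels)


# FX-8800P: 512 SP at 800MHz
print(required_bandwidth_gbs(512, 800))
# Bandwidth available with 2400MHz DDR4
print(memory_bandwidth_gbs(2400))
# Hypothetical 512 SP part at an assumed 1100MHz GPU clock
print(required_memory_speed(512, 1100))
```

With these assumptions the FX-8800P case comes out at the same 51.2GB/s as above, and the assumed 1100MHz part lands at the ~4400MHz memory speed mentioned.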
Unless Raven Ridge solves the bandwidth issue, AMD APUs are better off dead :'(