Another thing to keep in mind is that L2 associativity dropped from 8-way in previous Intel designs to 4-way on Skylake client. That likely hurt cache performance noticeably, all for the sake of conserving power in the ULV chips. SKX will have a better, not just bigger, L2$, which should help the true IPC potential of the Skylake core shine through.
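To put rough numbers on what the associativity drop means, here is a minimal sketch. It assumes a 256 KB L2 with 64-byte lines, which the post above doesn't state, and the helper name cache_geometry is made up for illustration. It only does the geometry arithmetic and says nothing about actual hit rates.

```python
def cache_geometry(size_bytes, ways, line_bytes=64):
    """Return (sets, index_bits, offset_bits) for a set-associative cache.

    Assumes size, ways and line size are powers of two.
    """
    lines = size_bytes // line_bytes           # total cache lines
    sets = lines // ways                       # lines are grouped into sets of `ways`
    index_bits = sets.bit_length() - 1         # log2(sets) for a power of two
    offset_bits = line_bytes.bit_length() - 1  # log2(line size)
    return sets, index_bits, offset_bits


# Assumed capacity: 256 KB L2 (not given in the thread).
L2_SIZE = 256 * 1024

for ways in (8, 4):
    sets, index_bits, offset_bits = cache_geometry(L2_SIZE, ways)
    print(f"{ways}-way: {sets} sets, {index_bits} index bits, "
          f"{offset_bits} offset bits, {ways} tag compares per lookup")
```

At the same capacity, going from 8-way to 4-way doubles the number of sets but halves how many lines that map to the same set can coexist, so conflict misses can go up. The flip side is fewer tag comparisons per lookup, which is where a power saving would plausibly come from.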
AnandTech heard from Intel that the change was made to save power, and that it brings some benefits for server parts that Intel wouldn't disclose yet. There was also something about saving die space.
The server part is interesting. Are they saying it'll be a 1MB 4-way L2 now? Also, the space saving can't be big, because L2 caches are now pretty tightly integrated with the core. I guess it's possible they did that to make room for core enhancements?
The power argument can only make sense in terms of lower leakage and average power. What are we talking about, though, in terms of average power? 100mW? 200mW? I highly doubt it's even that. As far back as Banias, Intel chips could turn off portions of their caches to save power.
Another possibility: part of the reason Broadwell didn't clock well was 14nm issues. There were reports that Intel may have re-architected Skylake to get clocks up to acceptable levels. Cutting the associativity of the L2 cache could have been one way to do that. Server chips clock nowhere near as high as PC chips, so they can get away with some things.
If I am being cynical, I'd even say they use the power/die-area excuse because they are still too embarrassed to admit they completely messed up the 14nm process, especially after all that fanfare about having a 3+ year lead and using it to dominate everything. The lead is absolutely nonexistent, considering that despite Intel's smaller pitch, Ryzen's L3 caches are denser than Intel's.