If Fermi at 40nm was already such a PITA to manufacture, do you really think an even smaller 28nm node is going to be easier? And that's considering TSMC is skipping 32nm altogether, so it's an even larger jump than before.
1) It's a big die, which makes it difficult to manufacture.
2) The 40nm process itself had problems, which was one of the major issues with Fermi.
3) The design from NV didn't try to alleviate those problems. There was an article on AT that talks about what ATI had to do with their architecture (for the HD5870/etc) in order to make sure yields were good.
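To put rough numbers on points 1 and 2, here is a minimal sketch using the simple Poisson yield model (yield = exp(-area * defect density)). The die areas and defect densities are made-up illustrative values, not real Fermi, HD5870, or TSMC figures, and real fabs use more elaborate models.

```python
import math

def poisson_yield(die_area_mm2, defects_per_cm2):
    """Expected fraction of defect-free dies under the simple Poisson model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 = 1 cm^2
    return math.exp(-area_cm2 * defects_per_cm2)

# Hypothetical inputs: a "small" and a "big" die on a "good" and a "bad" process.
for label, d0 in (("good process", 0.2), ("bad process", 0.5)):
    small = poisson_yield(300, d0)
    big = poisson_yield(500, d0)
    print(f"{label}: 300 mm^2 die {small:.0%}, 500 mm^2 die {big:.0%}")
```

Under this model the big die always yields worse than the small one, and the gap gets much wider when the process is bad, which is why die size and process quality compound each other.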
On a GOOD 28nm process, many of those issues will go away. On a bad one, they might plan ahead to mitigate the problems more than they did for 40nm.
Die size might also change depending on what approach they take. For instance, if they decide to only increase functional units by 50% and ramp up clock speed instead, the die would be smaller.
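As a back-of-the-envelope check on the die size point, the sketch below assumes ideal scaling, where area per functional unit shrinks with the square of the node ratio. That is an assumption: node names are partly marketing labels, and real designs rarely scale this cleanly. The function name is just for illustration.

```python
def relative_die_area(old_nm, new_nm, unit_growth):
    """Die area relative to the old chip, assuming ideal area scaling.

    unit_growth is the multiplier on functional units (1.5 = 50% more).
    """
    ideal_area_scale = (new_nm / old_nm) ** 2  # area per unit shrinks quadratically
    return unit_growth * ideal_area_scale

# 50% more units vs. doubling the units, both going from 40nm to 28nm.
for growth in (1.5, 2.0):
    area = relative_die_area(40, 28, growth)
    print(f"{growth:.1f}x units -> {area:.2f}x the old die area")
```

Under those assumptions, 50% more units comes out at roughly three quarters of the old die area, while doubling the units lands at about the same area as before.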
Also, TSMC skipped '45nm' and went to 40nm, so it was 55nm -> 40nm, and now it's 40nm -> 28nm, skipping 32nm. That's pretty much the same size of step as the last one (about a 27% linear shrink then versus 30% now).
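The comparison of the two steps is simple arithmetic on the node names. The sketch below takes the nanometre labels at face value, which is an assumption since they don't map exactly onto physical feature sizes.

```python
def linear_shrink(old_nm, new_nm):
    """Fractional reduction in nominal feature size between two nodes."""
    return 1 - new_nm / old_nm

# The previous step and the upcoming one, by nominal node name.
for old_nm, new_nm in ((55, 40), (40, 28)):
    shrink = linear_shrink(old_nm, new_nm)
    area_scale = (new_nm / old_nm) ** 2  # ideal area scaling
    print(f"{old_nm}nm -> {new_nm}nm: {shrink:.1%} linear shrink, "
          f"{area_scale:.2f}x ideal area")
```

Going by the labels alone, the two steps are within a few percentage points of each other.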