Why would Atom even need that? The only software I know of that makes even a little bit of sense to run on an Atom is Handbrake. But otherwise? A waste of die space and energy.
Even an AVX2 implementation that splits each 256-bit operation across 2 uops would be of some value. It could still alleviate some decode bottlenecks, and a gather that sustains 1 load/cycle can be better than the alternative of separate loads and inserts. Implementing it that way shouldn't add too much overhead.
The other benefit is that if you're writing intrinsics or ASM, you can write an AVX2 target as your best-optimized case and maybe not bother with a separate SSE2 target anymore. As it stands, most software out now doesn't use auto-dispatch the way ICC does, so it won't target something like AVX2 unless that's done explicitly, and since an AVX2-only build won't run on CPUs that lack it, a lot of people won't bother. So the CPUs that do support it miss out. Lack of support on Core-based Celerons and Pentiums is the biggest problem, but Atoms aren't kept out of the PC space by any means either.
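To make the "done explicitly" part concrete, here's a rough sketch of what manual dispatch tends to look like with GCC or Clang on x86. The names add_u32_scalar, add_u32_avx2 and pick_add_u32 are made up for illustration, and it assumes the compiler supports the target("avx2") function attribute and __builtin_cpu_supports; other compilers or architectures just fall through to the plain loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the dispatch path is only enabled for GCC/Clang on x86. */
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#else
#define HAVE_X86_DISPATCH 0
#endif

/* Baseline path: runs on any CPU the binary was built for. */
static void add_u32_scalar(const uint32_t *a, const uint32_t *b,
                           uint32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

#if HAVE_X86_DISPATCH
/* AVX2 path: compiled for AVX2 even if the rest of the file is not. */
__attribute__((target("avx2")))
static void add_u32_avx2(const uint32_t *a, const uint32_t *b,
                         uint32_t *out, size_t n)
{
    size_t i = 0;
    /* Eight 32-bit lanes per 256-bit vector. */
    for (; i + 8 <= n; i += 8) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        _mm256_storeu_si256((__m256i *)(out + i), _mm256_add_epi32(va, vb));
    }
    /* Leftover elements that don't fill a whole vector. */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
#endif

typedef void (*add_u32_fn)(const uint32_t *, const uint32_t *,
                           uint32_t *, size_t);

/* Pick the best implementation the running CPU actually supports. */
static add_u32_fn pick_add_u32(void)
{
#if HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return add_u32_avx2;
#endif
    return add_u32_scalar;
}
```

You'd call pick_add_u32() once at startup and keep the returned pointer around. The point is that every such kernel needs a fallback written and tested alongside it, which is exactly the extra work people skip when they can't count on the instruction set being there.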