Having seen the original posts in Chinese, "architecture performance" most likely refers to IPC. Considering how good the X1's IPC is even with an anemic cache (SPEC loves cache), it certainly wouldn't be surprising for the X2 to match, if not beat, Golden Cove (GC) in this regard.
Considering that the X2 has nearly 10 fewer pipeline stages on a uop cache miss, and about 5 fewer on a uop cache hit, yes, it's possible. At the pipeline depth of the X2, Golden Cove would be an additional 15-30% faster per clock!
Per clock, they are losing an insane amount of performance trying to reach those absurd frequencies.
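To put rough numbers on why pipeline depth costs per-clock performance, here's a back-of-the-envelope sketch. It assumes the simplest possible model: every branch mispredict costs a flat refill penalty roughly equal to the pipeline depth, added on top of a base CPI. The function names and every input value (base CPI, mispredicts per 1000 instructions, the 20 and 10 cycle penalties) are made-up placeholders for illustration, not measurements of the X2 or Golden Cove.

```python
def cpi(base_cpi, mpki, penalty_cycles):
    """Cycles per instruction: base CPI plus a flat mispredict penalty.

    mpki is branch mispredicts per 1000 instructions (assumed input).
    penalty_cycles is the refill cost per mispredict, taken here as
    roughly the pipeline depth (a simplifying assumption).
    """
    return base_cpi + (mpki / 1000.0) * penalty_cycles


def ipc_gain_from_shorter_pipe(base_cpi, mpki, deep_penalty, shallow_penalty):
    """Fractional per-clock speedup from going deep -> shallow pipeline."""
    deep = cpi(base_cpi, mpki, deep_penalty)
    shallow = cpi(base_cpi, mpki, shallow_penalty)
    # IPC is 1/CPI, so the IPC ratio is deep CPI over shallow CPI.
    return deep / shallow - 1.0


# Placeholder inputs only: 4-wide-ish base CPI, 5 mispredicts per 1000
# instructions, and a hypothetical 20-cycle vs 10-cycle refill penalty.
gain = ipc_gain_from_shorter_pipe(0.25, 5.0, 20, 10)
print(f"per-clock gain from the shorter pipe: {gain:.1%}")
```

The point is only the shape of the trade-off: the more mispredict-heavy the workload, the more each extra stage hurts per clock, and that loss has to be bought back with frequency.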
The Anandtech article about the X2 says ARM thought removing one stage was worth it for the performance. Also, the uop cache on the ARM parts is there to reduce the effective number of pipeline stages. Golden Cove goes and adds an extra stage.
When the uop cache was introduced with Sandy Bridge, the pipeline went from 16 stages in Nehalem (14 in Core 2) to 14-18. So part of the motivation for the uop cache was to increase clock frequencies. It's not as bad as the Pentium 4, which had 20 stages after a Trace Cache hit (close to 30 on a miss).
Gracemont should be the same as Tremont, which is 13 stages.
On a desktop, where your aim is performance at any cost, I guess it's worth it. But elsewhere it really is not.