Question Incredible Apple M4 benchmarks...


naukkis

Senior member
Jun 5, 2002
783
639
136
Nobody in industry describes "wider" as anything but "more op throughput and execution resources" - there are clearly many ways to scale performance without increasing those, as enumerated above

More op throughput - yes. But there are different hardware approaches to get there, and they can't be reduced to a few parameters. CPUs are built around their register files: how much data they can shuffle around per clock is the main factor in a CPU core's "wideness". The rest is just optimization, meaning how much other stuff is added to extract everything from that data-movement ability, and less is better.
 

SarahKerrigan

Senior member
Oct 12, 2014
646
1,648
136
More op throughput - yes. But there are different hardware approaches to get there, and they can't be reduced to a few parameters. CPUs are built around their register files: how much data they can shuffle around per clock is the main factor in a CPU core's "wideness". The rest is just optimization, meaning how much other stuff is added to extract everything from that data-movement ability, and less is better.

Once again, not a usage of these terms that I have ever heard in the semi industry.
 

SpudLobby

Senior member
May 18, 2022
963
660
106
Sure thing.

[Attached images: IMG_2693.jpeg, IMG_1941.jpeg]


This measurement for the phones is full platform (motherboard) power too, not just the package, and the fundamental curves for those A715s at 2GHz or 2.8GHz blow Intel out. Be it E core or LP E core, at the same performance they’re using 2-3.5x the power, and that’s understating things. A715s in a laptop would draw more power, but they’d still be better than Intel’s E cores in Meteor Lake.

Skymont might narrow some gaps in Lunar Lake with a new architecture, but I don’t expect anything impressive for a part on N3B.
 

SpudLobby

Senior member
May 18, 2022
963
660
106
Nobody in industry describes "wider" as anything but "more op throughput and execution resources" - there are clearly many ways to scale performance without increasing those, as enumerated above

Not going to respond to the Apple hagiography because I don't think it's very relevant
lmao, “Apple hagiography” 😂 agreed.
 

FlameTail

Diamond Member
Dec 15, 2021
3,297
1,898
106
Wider as in a CPU's ability to extract IPC from code. The technical details of how that's achieved are irrelevant; stating that Cortex-X4 is "wider" than M3 is a massive oversimplification. A minor performance increase can be found by widening any stage of the CPU pipeline, but since increasing execution area will also increase power logarithmically, a good CPU designer should instead minimize every stage of their design to find optimal performance/power and extract max clocks. Apple clearly does that better than any of its rivals today. That's actually a very curious case: they don't need to be the fastest, they really don't chase maximum performance from their silicon, and still they have the fastest CPU design. They sure make their rivals look incompetent.
M4's clock increase came with an exponential increase in power consumption. It was the same story with M3, where power also went up.
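
For a rough sense of why clock increases get expensive, here is a minimal sketch using the textbook dynamic power approximation P ≈ C·V²·f. The ratios below are hypothetical, not measured M3 or M4 figures, and the function name is made up for illustration. Under the assumption that voltage has to rise in proportion to frequency, power grows roughly with the cube of the clock, which is steep but polynomial rather than strictly exponential.

```python
def relative_dynamic_power(freq_ratio: float, volt_ratio: float) -> float:
    """Relative dynamic power from P ~ C * V^2 * f, with capacitance C held constant.

    Both arguments are ratios against a baseline (1.0 means unchanged).
    This ignores leakage and other static power, so it is only a first-order estimate.
    """
    return (volt_ratio ** 2) * freq_ratio


# Hypothetical example: a 10% clock bump that also needs 10% more voltage.
ratio = relative_dynamic_power(freq_ratio=1.10, volt_ratio=1.10)
print(f"Relative dynamic power: {ratio:.2f}x")
```

With those assumed inputs, a 10% clock gain costs about a third more dynamic power, so the last few hundred MHz are the most expensive ones.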
 

SiliconFly

Golden Member
Mar 10, 2023
1,256
654
96
Did you not hear what I said?

Apple has the most efficient E-core in the world

The question is whether Intel LPE can match Apple-E, not the other way around.
Of course they won't match now, because Intel is currently on an older node (Intel 4) that isn't as efficient as the competition's. Intel's E-core PPW will see massive gains once they move to a far better node (20A) shortly.
 

SiliconFly

Golden Member
Mar 10, 2023
1,256
654
96
Intel LPE cores are garbage. Intel can’t even match Arm’s A715/A720 (which clocked down is 2x less efficient than Apple’s but that’s not saying much), and it’s not close.
Not at all. Once they move to a better, more power-efficient node, they'll not only easily catch up but may even surpass them.
 

The Hardcard

Member
Oct 19, 2021
166
236
86
In short: It's the very definition of resting on one's laurels.

It's exploiting existing capabilities but doesn't prepare for future bigger improvements. Making the most of what's already there makes it seem like the product keeps up well in the competitive environment, but bigger jumps are essentially delayed up to the point where work on redesigns resumes (and those actually turn out to be successful, which is never a given). Intel did it with all the Skylake derivatives and lost its IPC lead over that. Apple seems to be on the way to doing the same right now unless there is a major redesign coming soon for one of the following gens.
I don’t think they’re worried about a redesign right now. If Andrei and Geekerwan between them are accurately representing what the core structures are, there have been constant and sometimes back-and-forth changes with each release from M1 to M4. It would seem to me that they are still experimenting with the best design balance for this first architecture.

There have been constant changes in decoder width and dispatch and scheduling queues. The reorder architecture has gone up, then down, then up again. To me, it screams that they’re not finished with the first architecture yet. At least not as of M3. Maybe the fourth iteration is the charm.
 

SpudLobby

Senior member
May 18, 2022
963
660
106
Not at all. Once they move to a better, more power-efficient node, they'll not only easily catch up but may even surpass them.
LOL

I will bet you right now that Lunar Lake will use more power than an M3 (which is similarly sized and also on N3B) for ST at the same performance.

We are going to get the comparison of a lifetime in that vein for one thing.

As for the A715/A720s, process nodes can’t explain this gap. Even when Intel gets a better fabric and upgrades Skymont, I doubt they’ll be as efficient. We’ve seen this story before.

We’ll see, though. It will be an upgrade.
 