Win2012R2
> No, that one specifically sucks.

What's better? It's the only agreed game in town.
It's aa64 but Bad.
> What's better? It's the only agreed game in town.

Actually doing a cleanup ISA break, just like aa64 did.
> Intel says that iBOT currently requires extensive validation, a little over a quarter. They claim to be working on optimization for content creation software.

Wouldn't something like iBOT be very interesting to put into HW-accelerated IP in the CPU? iBOT seems to be some kind of high-level workflow arbiter.
> Wouldn't something like iBOT be very interesting to be put into HW accelerated IP in the CPU? IBOT seems to be some kind of high-level workflow arbiter.

You're 3 inches away from reinventing Transmeta.
> PGO is fantastic for any software that runs heavy stuff: it helps the compiler decide how to inline better, and that can make a huge difference vs a function call.

You missed my point. You collect the profile on a specific machine. The inlining decisions made for that specific machine might not carry over to another CPU, due to cache sizes etc. So the benefit can be limited, if not outright nullified. That's not a problem on a console, where for each generation the CPU stays exactly the same. Of course it would be nice if the OS could perform something like BOLT for you on first launch, but well, people are too impatient for that 😉
> Actually doing a cleanup ISA break, just like aa64 did.

Not happening in x86 - ever.
> You missed my point. You collect the profile on specific machine. The inlining decisions made for that specific machine might not carry over to another CPU.

PGO is driven by profile data (PG bit), not CPU features like cache size or ISA (that's a different optimisation type). There is also Dynamic PGO, used in .NET and I believe Java, to do such optimisations depending on the workload, so no static one-off is necessary.
> PGO is driven by profile data (PG bit), not CPU features like cache size or ISA

Yes, but the CPU features are an implicit dependency. You gather the profile on CPU X; based on that, the toolchain will alter code layout, branch hints, whatever it can to extract the most performance. Then you run on CPU Y, and some of those optimizations might not hold (for example, a smaller uOP cache might end up penalizing too-big functions, etc.).
> Not happening in x86 - ever.

Well, no, APX was a golden opportunity.
> Yes, but the CPU features are implicit dependency

That has got nothing to do with PGO, which is Profile Guided Optimisation; it is based solely on data captured during the profiling session.
> The facts are they have the best scores and the best PPW(A) in industry standard cross-platform tests.

Thank you for this. Re-coding an application and showing it runs better on Apple actually supports my "less baggage/newer ISA" advantage supposition.
> Well, no, APX was a golden opportunity.

What exactly could they have done, and what benefit would it have brought? Changing stuff in a small way won't give any meaningful benefits, and serious breaking changes would mean you compete directly against ARM.
> Re-coding an application and showing it runs better on Apple actually supports my "less baggage/newer ISA" advantage supposition.

brotha, spec2017 subtests are *ancient*.
> What exactly they could have done and what benefit it would have brought?

Remember amd64? Gotta do that again. They just extended opcode space instead. yuck.
> and serious breaking changes would mean you compete directly against ARM.

amd64 was already a "serious breaking change".
> brotha, spec2017 subtests are *ancient*.

hmm.... 2017-1979=38
> amd64 was already a "serious breaking change".

What exactly did it break in terms of backwards compatibility?
> hmm....2017-1979=38

Wut.
> What exactly did it break in terms of backwards compatibility?

A lot?
> A lot?

You can still run old 16/32 bit stuff on it, no?
amd64 addressing modes alone are uhhh.
> You can still run old 16/32 bit stuff on it, no?

Nothing stopped ARMv8 cores from running 32b stuff either.
> They just had a clean 64b ISA instead of what we have with APX.

Well, it's a bit too late for that; it's been 20+ years since amd64 got out, and ARM got 64 bits nearly 10 years later. Obviously it was easier to follow up with a better arch; nothing can be done about it.
> Well it's a bit too late for that, been 20+ years since amd64 got out, and ARM got 64-bits nearly 10 years later, obviously it was easier to follow up with better arch, nothing can be done about it.

You could've given APX a separate exec mode with a cleaned up opcode space.
> You could've given APX a separate exec mode with a cleaned up opcode space.

What exactly do you want to change there - a change to fixed rather than variable length opcodes?
> What exactly you want to change there - change to fixed rather than variable length opcodes?

Maybe.
> That support takes very few transistors; keeping it for the sake of 100% backwards compatibility is what made x86 successful. Once you start cutting "old stuff" it's a slippery slope and will result in fragmentation. Basically, neither Intel nor AMD will do such madness.

You're seriously underestimating the cost of design and, even more, that of validation, and the impact on how you can let an ISA progress with such dead 50-year-old weight.
But I do see Apple CPUs not being so fast in Windows, right? I mean, really, this kind of perfectly illustrates my point.
> The only way an Apple Silicon Mac can run Windows is under a VM. There are performance penalties for that, depending on what you're doing.

I'm done with this topic until Apple releases an x86 CPU. Meaning... I'm done.
I'm not sure what numbers you're referencing with "I do see Apple CPU's not being so fast in Windows", but the proper comparison would be a Mac running Windows in a VM measured against a PC running Windows in a VM.