
Discussion: ARM Cortex/Neoverse IP + SoCs (no custom cores)

Hmmmm, I'd like to dive deeper into this particular rabbit hole, but I don't want to derail this thread further so I'm starting new threads for CPU and GPU µArch research.
 
It's RV specific, but I'd be surprised if the principles couldn't apply to ARM too.

Sure, after ARM adds a vector ISA to their instruction set. A vector ISA is pretty much designed to run long daisy-chained instruction loops without any need to rearrange execution, so in-order execution of the vector side is a pretty obvious thing to do.
 
I am now convinced: ARM must follow Huawei's path and deliver a small out of order core.

The ARM A7XX cores are excellent, but using them as small cores wouldn't be a good idea. Even Apple pulled that.
Maybe a nerfed A7XX, sold as an A6XX core, could be ideal for it.
 
Sure, after ARM adds a vector ISA to their instruction set. A vector ISA is pretty much designed to run long daisy-chained instruction loops without any need to rearrange execution, so in-order execution of the vector side is a pretty obvious thing to do.
You mean SVE which Arm introduced years ago? And which the article linked to explicitly mentions?
 

SVE ain't a vector ISA but scalable SIMD. It is designed to be friendly to OoO implementations. RV instead has a full vector ISA, for which a hardware OoO implementation was not long ago considered pretty much impossible. It seems that OoO implementations are indeed doable, but executing the vector path OoO is pretty questionable, as the whole design is built to extract enough parallelism from code to make wide in-order cores work efficiently.
 
Hmm that's not too bad? At least glancing at the GB6 numbers. Still not M1 territory but let's be honest, who expected otherwise? It's beating the RK3588 pretty convincingly.
The RK3588 was practically asking to be beaten; now it is Raspberry Pi that is in the "can't catch up" situation, since there are more options available, and cheaper ones too.
And that's with Rockchip already having the RK3688 in the works.
 
One hopes availability on the Orion O6 (and RK3688) is better than the RK3588 which was delayed for so long.
 
RK3588 was announced long before it actually went to fabs I think.

The specs they originally announced were different to what they later made.
 
Maybe if by "not long ago" you mean the 1980s. NEC has been doing out-of-order on vector computers for a while - since at least SX-9 and I believe SX-8 as well.

I mean out-of-order hardware. The SX-9 has a vector overtake instruction for vector operations, so the compiler can mark non-dependent instructions to overtake long-latency shuffle instructions. That's still pure in-order vector hardware. The non-vector side of the CPU could of course be OoO, just as the invention under discussion suggests.

OoO hardware is there to execute instructions at the rate memory reads are served. A vector ISA does that at the ISA level by putting data in long daisy-chained vectors that are executed sequentially, so the hardware should be able to fully utilize its available memory bandwidth without OoO.
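
For anyone who hasn't seen the pattern, here is a rough sketch of what "long vectors executed sequentially" looks like from the software side: a strip-mined loop where the hardware tells the code how many elements it will take per step. This is plain Python standing in for vector hardware; `VLMAX` and `set_vl` are made-up names (loosely modelled on RISC-V's `vsetvli`), not any real API, and the strip size of 64 is an arbitrary assumption.

```python
VLMAX = 64  # hypothetical hardware maximum vector length, in elements (assumption)


def set_vl(remaining: int) -> int:
    """Hypothetical stand-in for a 'set vector length' instruction.

    Returns how many elements the next vector operation will process.
    """
    return min(remaining, VLMAX)


def saxpy(a, x, y):
    """Compute y[i] += a * x[i] in strips of at most VLMAX elements.

    Returns the number of strips executed.
    """
    i = 0
    n = len(x)
    strips = 0
    while i < n:
        vl = set_vl(n - i)
        # On vector hardware, this slice update would correspond to one
        # vector load, one multiply-add and one store over vl elements.
        y[i:i + vl] = [yi + a * xi for xi, yi in zip(x[i:i + vl], y[i:i + vl])]
        i += vl
        strips += 1
    return strips


if __name__ == "__main__":
    x = [float(k) for k in range(150)]
    y = [1.0] * 150
    print(saxpy(2.0, x, y), y[0], y[-1])
```

The same loop works unchanged for any `VLMAX`, and each strip is a long, independent run of element operations, which is the property the in-order argument above relies on.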
 

Nope. SX-9 can do straight-up out-of-order issue of vector ops. SX-ACE extends it further.
 
Here is a link to the Hot Chips slides for the SX-Aurora Vector Engine processor: https://old.hotchips.org/hc30/2conf/2.14_NEC_vector_NEC_SXAurora_TSUBASA_HotChips30_finalb.pdf. AnandTech did a live blog on it here: https://www.anandtech.com/show/13259/hot-chips-2018-nec-vector-processor-live-blog. The blog post mentions OoO, while the slides use the term OoO scheduling. You can also find the SX-ACE slides here: https://old.hotchips.org/wp-content...e-epub/HC26.11.110-SX-ACE-MOMOSE-NEC-v004.pdf

From a high-level point of view they seem similar, though the SX-ACE Hot Chips slides don't mention OoO explicitly as far as I can tell. Still, Aurora looks like an evolution of ACE, so they evidently thought adding OoO scheduling was important.
 
That is interesting, what were the OG Specs of the RK 3588?
IIRC the GPU spec changed from one more contemporary to the A76 to the G610.

I might be misremembering things though.

Rockchip have an annoying tendency to be ambiguous about the specs of future SoCs sometimes. For example, the RK3688 mentions a v9.3-A CPU core, but according to the latest rumours the X930 and Ax30 are v9.4-A ISA instead 😒
 