Discussion RISC V Latest Developments Discussion [No Politics]



Golden Member
Mar 3, 2017
Some background on my experience with RISC V...
Five years ago, we were developing a CI/CD pipeline for an arm64 SoC in the cloud, and we added tests that executed the binaries there as well.
We initially used real HW instances based on an ARM server chip of that era. Unfortunately, the vendor quickly dropped us and exited the market, leaving us rather frustrated.
We shifted the work to QEMU, which turned out to be functionally as good as the actual chips, but the emulation was buggy and slow. In the end we settled on qemu-user-static Docker images, which worked quite well for us. We ran the arm64 Ubuntu cloud images of the time before moving on to the Docker multi-arch QEMU images.

Lately, we have been approached by many vendors with upcoming RISC-V chips, and out of curiosity I revisited the topic above.
To my pleasant surprise, running RISC-V under QEMU is smooth as butter. Emulation is fast, and images from Debian, Ubuntu and Fedora are available out of the box.
I ran Ubuntu cloud images problem-free. Granted, it was headless, but with the likes of Imagination Tech offering up their IP for integration, I guess that is only a matter of time.

What is even more interesting is that Yocto/OpenEmbedded already has a meta layer for RISC-V, and apparently T-Head already has the kernel packages and manifest for Android 10 working on RISC-V.
Very impressive for a CPU architecture in such a short span of time. What's more, I see active LLVM, GCC and kernel development happening.

I saw this slide at one of the latest conferences, and I can't help but think that it looks like they are eating somebody's lunch, starting from MCUs and moving up to application processors.

And based on many developments around the world, this trend seems to be accelerating greatly.
Many high-profile national and multinational projects (e.g. the EU's EPI) built on RISC-V are popping up left and right.
Intel is now a premier member of the consortium, alongside the likes of Google, Alibaba, Huawei, etc.
NVDA, and soon AMD, seem to be using RISC-V in their GPUs. Xilinx, Infineon, Siemens, Microchip, ST, AD, Renesas, etc. have products in the pipeline or already launched.
It will be a matter of time before all these companies start replacing their proprietary architectures with something from RISC-V. Tool support, compilers, debuggers, OSes, etc. are taken care of by the community.
Interesting as well is that there are lots of performant RISC-V implementations on GitHub: XuanTie C910 from T-Head/Alibaba, SweRV from WD, and many more.
The embedded industry has already replaced a ton of traditional MCUs with RISC-V ones. AI-tailored CPUs from Jim Keller's Tenstorrent also seem to be in the spotlight.

Most importantly, a bunch of specs were ratified at the end of last year, mainly accelerated by developments around the world. Interesting times.


Golden Member
Jun 13, 2013
RISC-V doesn't have fixed-length instructions. Instruction length can vary from 16 bits to 192 bits and beyond. Where RISC-V differs from x86 is in how the different lengths are encoded: instructions are aligned to 16 bits, and the encoding is sane, so instruction boundaries can be found easily.
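To illustrate why the boundaries are easy to find, here is a minimal sketch of the length-encoding scheme as described in the RISC-V unprivileged spec: the length can be read from the low bits of the first 16-bit parcel. The function name is mine, and the encodings longer than 32 bits were never used by standard extensions, so treat that part as illustrative only.

```python
from typing import Optional


def insn_length_bits(first_parcel: int) -> Optional[int]:
    """Return the instruction length in bits, decoded from the first
    16-bit parcel, or None for the reserved (>= 192-bit) encoding."""
    if first_parcel & 0b11 != 0b11:
        return 16                      # compressed: bits [1:0] != 11
    if first_parcel & 0b11100 != 0b11100:
        return 32                      # standard: bits [4:2] != 111
    if first_parcel & 0b111111 == 0b011111:
        return 48                      # bits [5:0] == 011111
    if first_parcel & 0b1111111 == 0b0111111:
        return 64                      # bits [6:0] == 0111111
    nnn = (first_parcel >> 12) & 0b111  # bits [14:12]
    if nnn != 0b111:
        return 80 + 16 * nnn           # 80 to 176 bits
    return None                        # reserved for >= 192 bits


# Low halfword of "addi x0, x0, 0" (0x00000013) decodes as a 32-bit instruction.
print(insn_length_bits(0x0013))
```

A decoder only has to look at a handful of bits per 16-bit parcel, whereas an x86 decoder has to work through prefixes, opcode bytes and ModRM/SIB to learn the length.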

But the base RV32I set is still 32-bit, and that's what a general-purpose CPU would be executing most of the time. Having 36 bytes (exactly 6x48 bits, and more than 6x32 bits) of instruction fetch to align with some SIMD instruction width would only make sense if they were building something like the Fujitsu A64FX. But they are not; it's a general-purpose CPU focused on integer performance, not FPU/SIMD throughput.
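For reference, a quick back-of-the-envelope check of how many instructions of each size fit in one 36-byte fetch. This is a sketch that assumes perfectly packed, aligned instructions of a single size, which real code won't be.

```python
FETCH_BYTES = 36             # fetch width discussed in this thread
FETCH_BITS = FETCH_BYTES * 8


def insns_per_fetch(width_bits: int) -> int:
    """Whole instructions of the given width that fit in one fetch."""
    return FETCH_BITS // width_bits


for width in (16, 32, 48):
    print(f"{width}-bit instructions per fetch: {insns_per_fetch(width)}")
```

So one fetch holds 18 compressed instructions, 9 standard 32-bit instructions, or 6 of the 48-bit ones.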


Golden Member
Mar 4, 2011
My best guess is that it's for short branches where if you have some simple switch cases, etc. pulling in up to 17 other instructions means you might have already fetched the next one you need before you realize you need it. If the decoder is well designed for something like that it could theoretically discard fetched instructions that it knows it won't execute and keep the backend better fed.

There could be other reasons as well. Even assuming mostly sequential code, fetching more at once means fewer cache accesses. That could theoretically improve performance or even reduce power use.

No, that's the fetch for decode in one cycle; the second cycle gets its own 36 bytes.

It's for instruction fusion. RISC-V addressing modes are so simple that a lot of what fits into a single load instruction on x86 or ARM first needs an ALU op and then a load. The CPU can do very aggressive instruction fusion in such cases, which reduces the number of instructions that need to be tracked past that point, but requires a lot of fetch bandwidth.
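To make the idea concrete, here is a toy sketch of one commonly cited fusion idiom: an add that computes an address, followed by a load through that address into the same register. The `Insn` tuple, the `ld_indexed` op name and the exact fusion condition are my own illustration, not how any particular core does it.

```python
from typing import List, NamedTuple, Union


class Insn(NamedTuple):
    op: str
    rd: str
    rs1: str
    src2: Union[str, int]   # second source register, or an immediate


def fuse_indexed_loads(insns: List[Insn]) -> List[Insn]:
    """Fuse 'add rd, rs1, rs2' + 'ld rd, 0(rd)' into one internal op.

    The pair is only fused when the load overwrites the add's result,
    so the intermediate address never has to be architecturally visible.
    """
    out: List[Insn] = []
    i = 0
    while i < len(insns):
        cur = insns[i]
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        if (nxt is not None
                and cur.op == "add" and nxt.op == "ld"
                and nxt.rs1 == cur.rd and nxt.rd == cur.rd
                and nxt.src2 == 0):
            out.append(Insn("ld_indexed", cur.rd, cur.rs1, cur.src2))
            i += 2              # both instructions consumed
        else:
            out.append(cur)
            i += 1
    return out


program = [
    Insn("add", "t0", "a0", "a1"),   # t0 = a0 + a1
    Insn("ld", "t0", "t0", 0),       # t0 = mem[t0]
    Insn("addi", "a2", "a2", 1),
]
fused = fuse_indexed_loads(program)
print(len(program), "->", len(fused))
```

The decoder has to see both halves of a pair in the same cycle to fuse them, which is where the extra fetch bandwidth comes in.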


Diamond Member
Jan 31, 2011
That makes a lot of sense. RISC-V loads/stores only let you offset about ±2 KiB from a base address (a 12-bit signed immediate), so a lot of memory accesses would take 2 or 3 instructions instead of just one.
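A rough sketch of where the extra instructions come from: the load/store immediate is a 12-bit signed field (-2048 to 2047), so a bigger offset gets split into an upper 20-bit part (for `lui`) and a lower 12-bit part, much like the assembler's `%hi`/`%lo`. The helper names and the `t0` scratch register are assumptions for illustration, and this ignores RV64 corner cases near the 32-bit boundary.

```python
from typing import List, Tuple


def split_offset(offset: int) -> Tuple[int, int]:
    """Split offset into (hi20, lo12) with (hi20 << 12) + lo12 == offset
    and lo12 within the signed 12-bit range."""
    hi20 = (offset + 0x800) >> 12      # round so that lo12 stays signed
    lo12 = offset - (hi20 << 12)
    return hi20, lo12


def load_sequence(rd: str, base: str, offset: int) -> List[str]:
    """Assembly needed to load a word from base + offset."""
    if -2048 <= offset <= 2047:
        return [f"lw {rd}, {offset}({base})"]
    hi20, lo12 = split_offset(offset)
    return [
        f"lui t0, {hi20 & 0xFFFFF:#x}",   # t0 = hi20 << 12
        f"add t0, t0, {base}",            # t0 = base + (hi20 << 12)
        f"lw {rd}, {lo12}(t0)",           # remaining low 12 bits
    ]


for off in (8, 100000):
    print(off, load_sequence("a0", "a1", off))
```

A small offset needs one instruction and a large one needs three here; when the address is absolute rather than relative to a base register, the `add` drops out and you get two.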