Zhaoxin's ZX-F/KX-7000/KH-40000

Page 17 - AnandTech community

prosty_mirek

Junior Member
Nov 1, 2020
7
1
41
First signs of KH-40000. Same model (0x5b) as YongFeng.

Uses Zen's software optimisations for the fixed AVX2 implementation. Also the first sign of family 8, if I read this correctly.

AVX2 was previously only partially supported.

No KH-40000 mass production yet.
 

amd6502

Senior member
Apr 21, 2017
971
360
136
So this might be the big next step (5M+ views!): integrating analog computing cells within digital computing (either CPU or GPU). Is this already being done by any of the major CPU/GPU designers (AMD/Raja's Intel division/Apple/Nvidia/ZX-Via/IBM)?
 

NTMBK

Lifer
Nov 14, 2011
10,023
4,433
136

Good in-depth analysis of the CNS core. An impressive improvement over their previous designs, especially for such a small and poorly resourced team, but they weren't going to be competitive.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,571
1,118
136
However, it is unlikely that Zhaoxin is using the CNS-core.

Rumor mill in another language gives the KX-7000 cores/YongFeng over KX-6000:
+ AVX2 Support
+ FMA support
+ Fast P-state control (HWP?/CPPCx?/Alternative?)
+ Overclocking-state control (if overclocked, this MSR register/feature would be equal to 1)
+ Zhaoxin's New Padlock variant instructions.
From CNS to KX-7000 cores/Yongfeng:
- No EVEX instructions

There are also extended rumors that Zhaoxin's owner has partnered up with SMIC's owner to do 8nm (N+1) mass production or 7nm (N+2) volume ramp right now.
Mid-2020 = 16nm TSMC for YongFeng
Early-2021 = 8nm/7nm SMIC for YongFeng
Zhaoxin should be the leading MPU at SMIC that isn't labeled under "Smart Phones".

"N" - 14SFE/14SF+/14SF++ :: 12nm variants are available as well, via 12SF(14SF+ process) and 12SF+(14SF++ process)
"N+1" - 10SFE (HPC-track) and 8SFE (Mobile-track) -> 8/10SF+
"N+2" - 7 (no suffix) = switched from EUV-track to DUV-track with design rule set similar to GlobalFoundries 7LP, by sources close to SMIC. With 7+ being EUV with the re-instated/re-started EUV ASML agreement for 10 EUV machines.
 

Doug S

Golden Member
Feb 8, 2020
1,499
2,185
106
"N+2" - 7 (no suffix) = switched from EUV-track to DUV-track with design rule set similar to GlobalFoundries 7LP, by sources close to SMIC. With 7+ being EUV with the re-instated/re-started EUV ASML agreement for 10 EUV machines.
Since when has there been any change in SMIC's ability to acquire EUV machines, let alone 10 of them? Do you have a link for this?
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,571
1,118
136
Since when has there been any change in SMIC's ability to acquire EUV machines, let alone 10 of them? Do you have a link for this?
No link, yet. Keep it as a rumor till proven otherwise.

Given the profit SMIC has made and the investment from the big chip acts, SMIC has enough to splurge at ASML. With SMIC's dependence on ASML and increased sales to United States companies, the ban is expected to go away, or the license to be acquired, "soon"-ish. Hence, the agreement for EUV machines that was lost will be re-instated/re-started. Going into super depth, SMIC might have a secondary headquarters plus a US fab in Illinois of all places.

7 is DUV-only at this moment, and Zhaoxin is the leading MPU on that node.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,571
1,118
136
@NostaSeronx , you mentioned also "Smart Phones", that means that Huawei wants to make mobile processors again?
Huawei, via HiSilicon, is already doing the Kirin 710A. However, only the 710A has been released, and we are approaching its second anniversary.

Expanding on the HiSilicon rumors, two chips are succeeding the 710A: one at the same market level and one at the market level above.
1. New SoC that is 710A's successor = Mid-end
2. New SoC above 710A's successor = High-end

Mid-end will probably be a refresh to 14SF++/12SF+ and the high-end SoC will probably be SMIC 7.

The rumors on architecture are all over the place:
ARMv8 has either been dropped or not. If it has been dropped, the next SoCs are ARMv9 or RISC-V; if not, they are still ARMv8.

On the RISC-V side, there is apparently something about them licensing StarFive cores instead of developing their own 64-bit OoO RISC-V processor.

StarFive's RISC-V A75 competitor is faster than the A73 in the 710A: 4.8 DMIPS/MHz (A73) vs. 6.6 (12nm)/5.6 (7nm) DMIPS/MHz (Dubhe)
12nm TSMC "Dubhe" = 2 GHz <--> 7nm TSMC "Dubhe" = 3.5 GHz
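For anyone who wants the DMIPS numbers above side by side, here is a rough sketch of the arithmetic. The DMIPS/MHz and clock figures are the ones quoted in this post; the names are made up for illustration, and since no A73 clock is given here, only the per-MHz ratio against the A73 is computed.

```python
# Figures quoted in the post above (DMIPS/MHz and rumored clocks).
A73_DMIPS_PER_MHZ = 4.8
DUBHE = {
    "12nm TSMC": {"dmips_per_mhz": 6.6, "clock_mhz": 2000},
    "7nm TSMC": {"dmips_per_mhz": 5.6, "clock_mhz": 3500},
}


def dubhe_summary():
    """Return {node: (per-MHz ratio vs A73, peak DMIPS per core)}."""
    summary = {}
    for node, spec in DUBHE.items():
        # Per-clock advantage over the A73 figure.
        ratio = spec["dmips_per_mhz"] / A73_DMIPS_PER_MHZ
        # Peak throughput per core at the rumored clock.
        peak = spec["dmips_per_mhz"] * spec["clock_mhz"]
        summary[node] = (round(ratio, 3), round(peak))
    return summary


if __name__ == "__main__":
    for node, (ratio, peak) in dubhe_summary().items():
        print(f"{node}: {ratio}x A73 per MHz, ~{peak} DMIPS per core")
```

So, if these rumored figures hold, the 7nm version gives up some per-MHz throughput but ends up ahead per core because of the much higher clock.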

Going through LinkedIn profiles, it might be because they want to focus on finally implementing that mobile GPU:
"We are looking for GPU Architects. The candidate will help define and model Huawei's first generation mobile GPU." <- From 2016, by the way. However, there are still people working as GPU architects: GPU/CPU/AI Director, where the first bit is all about the in-house GPU. Selecting Dubhe in order to focus on the GPU seems a viable explanation for where the RISC-V processor went.

On IP:
14SF+ has paper-specification support for DDR5-4800 (2021 IP) and PCIe 5.0 (2021 IP), which are likely to be ported across to N+1 (10SF/8SF) and N+2 (7). 2022 IP hasn't been announced by home IP. Non-SMIC, aka external, IP is PCIe 4.0 and LPDDR5 (DDR5/LPDDR4/DDR4), silicon-proven down to SMIC 7 though.

This allows Zhaoxin to hit all the points:
- 7nm process
- New core
- DDR5
- PCIe 4.0(at min) and PCIe 5.0(at max)

Which follows Zhaoxin's general theme:
ZX-A = TSMC 40nm
ZX-B = HLMC 40nm

ZX-C = TSMC 28nm
ZX-D = HLMC 28nm

ZX-E = TSMC 16nm
(2020 - 14nm HLMC/SMIC SoC -- Skipped)
(2021 - 7nm TSMC SoC -- Skipped)
ZX-F = SMIC 7nm

Looking through Zhaoxin's slides that weren't posted here but at 3d-center:

1.25x IPC increase + 1.38x Frequency increase = ZX-F Single-threaded and 2x cores = ZX-F Multithreaded.

16-core @ ~4.1 GHz = ZX-F
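As a quick sanity check of the slide numbers above, here is a small sketch that multiplies them out. The gains and the ~4.1 GHz figure come from this post; the multi-threaded number assumes perfect scaling with core count, which is my simplification and not something the slides are quoted as claiming.

```python
IPC_GAIN = 1.25       # "1.25x IPC increase"
FREQ_GAIN = 1.38      # "1.38x Frequency increase"
CORE_GAIN = 2.0       # "2x cores"
ZXF_CLOCK_GHZ = 4.1   # "~4.1 GHz"


def zx_f_uplift():
    """Return (single-thread gain, multi-thread gain, implied previous clock in GHz)."""
    single = IPC_GAIN * FREQ_GAIN
    # Assumption: throughput scales linearly with core count.
    multi = single * CORE_GAIN
    # Clock of the previous generation implied by the frequency gain.
    implied_prev_clock = ZXF_CLOCK_GHZ / FREQ_GAIN
    return round(single, 3), round(multi, 2), round(implied_prev_clock, 2)


if __name__ == "__main__":
    single, multi, prev_clock = zx_f_uplift()
    print(f"single-thread: {single}x, multi-thread: {multi}x")
    print(f"implied previous-generation clock: ~{prev_clock} GHz")
```

That works out to roughly 1.7x single-threaded and about 3.5x multi-threaded in the ideal case, with the frequency gain implying a previous-generation clock of about 3 GHz.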

After reading through various EYE-BLEEDING articles and hunting down everything:
Q4 2019 = 14SF
Q4 2020* = 10SF
Q4 2021* = 7SF(internal name)/7(outside name)
*New Q2 2022 facility will volume ramp these nodes.

14nm HPC -> 10nm HPC == Big fixed mm2
12nm Value-orientated -> 8nm Value-orientated == Small fixed mm2
14nm Mobile/SoC -> 7nm Mobile/SoC == Shrunk mm2 <-- Zhaoxin
 

Kosusko

Member
Nov 10, 2019
159
117
86
The new Chinese CPU matches AMD with half the Cores and less consumption
source: https://cuba.detailzero.com/technology/44634/The-new-Chinese-CPU-matches-AMD-with-half-the-Cores-and-less-consumption.html

This is the first open benchmark of the 16-core ZHAOXIN Kaisheng KH-40000 Series x86 processor at 2.7GHz. The Kaisheng KH-40000 Series x86 processors (code name YONGFENG) can go up to 32 cores and 64 cores (on a dual-socket platform)!

"Houston, we have a problem"

P.S. Don't forget, YONGFENG is very "similar" to the Centaur CNS microarchitecture.

 

Thunder 57

Platinum Member
Aug 19, 2007
2,079
2,626
136
The new Chinese CPU matches AMD with half the Cores and less consumption
source: https://cuba.detailzero.com/technology/44634/The-new-Chinese-CPU-matches-AMD-with-half-the-Cores-and-less-consumption.html

This is the first open benchmark of the 16-core ZHAOXIN Kaisheng KH-40000 Series x86 processor at 2.7GHz. The Kaisheng KH-40000 Series x86 processors (code name YONGFENG) can go up to 32 cores and 64 cores (on a dual-socket platform)!

"Houston, we have a problem"

P.S. Don't forget, YONGFENG is very "similar" to the Centaur CNS microarchitecture.

That reads like a propaganda article with plenty of suppositions with no real facts or data. Matched EPYC 7601? Even if true, congratulations, you are just five years behind.
 

Markfw

CPU Moderator, VC&G Moderator, Elite Member
Super Moderator
May 16, 2002
24,102
13,155
136
That reads like a propaganda article with plenty of suppositions with no real facts or data. Matched EPYC 7601? Even if true, congratulations, you are just five years behind.
Yes, and the 7601 runs @2.6 best case, maybe lower under load. And it uses 2666 memory at the highest speed, maybe not even officially that. Yes, it's a joke. Welcome to 7-8 years ago. (Since I am sure this was not a production model, 7601 ES was available well before 2017.)
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,571
1,118
136
KH-40000 is based off LuJiaZui.
KX-7000/KH-50000 is based off YongFeng.

LuJiaZui-style architectures have a target of ~30 SpecInt.
YongFeng, a brand new architecture, is targeting ~50 SpecInt.

specintzhao.jpg

For it to be YongFeng it would need to have a 4 GHz clock + ~50 SpecInt for this specific segment. There shouldn't be a downclocked version as YongFeng has boost clocks. YongFeng shares nothing with CNS; the learning period for Zhaoxin's Core Design team is over. This is the first ground-up architecture they will be making.

KH40000/KX6000G = Same die
KX6000G = 1x Quad-core LuJiaZui
KH40000 = 4x Quad-core LuJiaZui
zhaoxin16nm.png

KX6000G is the insert replacing the majority of products which are using the quad-core dies.
kxrefresh.png

KX-7000 will be inserted above the 8-core segment at 4 GHz with 16 cores, and KH-50000, in the same config as the 16-core part (dual-die/dual-package; max KH40000 is 16 cores), will have 64 cores at 4 GHz.

LuJiaZui Refresh this year => YongFeng next year.

LuJiaZui refresh was originally slated for 2021 at TSMC, hence the 16nm TSMC. The delay is apparently because they switched from the Island fab(TSMC) to the Mainland fab(HLMC/SMIC). SMG is the largest shareholder of SMIC's FinFET Fab and Zhaoxin, which might be the biggest reason.

2019 Annual Report: "SMIC has built a robust foundation for FinFET and R&D execution, and as a result, the more advanced generations FinFET technology development are progressing much faster than previous FinFET nodes. SMIC established multiple specialty 14/12nm platforms, N+1 has made steady R&D progress and now in customer engagement and product qualification stage."
N+1 = 8/10 nodes here => https://www.innosilicon.com/html/selector/?foundry=SMIC

2020 Annual Report:
YongFeng - KX7000/KH50000
"The second generation of FinFET technology adopts SAQP to form a fin structure for the first time to meet the needs of a smaller size structure. Compared with the previous generation technology, the density of transistors per unit area is greatly improved. At present, SMIC’s second-generation FinFET technology has completed low-voltage process development, which can provide 0.33V/0.35V low-voltage usage requirements, and has entered risk production."
Second-generation FinFETs = 7 nodes here => https://www.innosilicon.com/html/selector/?foundry=SMIC

Whereas the KH40000/KX6000G refresh is based on:
"FinFET has gradually improved to enter mature mass production, and the product yield has reached industry standards. The development of multiple derivative platforms has been completed as planned, and the goal of diversifying mass production products has been achieved. The advanced version of the first-generation FinFET technology further optimizes device performance, improves integration, and achieves the goal of chip performance improvement"
smicsf14++.png
Likely to be SMIC 14++ (SFE -> SF+ -> SF++)

2021 Annual Report: SMIC goes dark... doesn't talk about 14nm/10nm/7nm again.
However, even though SMIC went dark... customers didn't, and 7nm SMIC is at mass production status in Q3 2022.
 

DrMrLordX

Lifer
Apr 27, 2000
20,498
9,582
136
It's not really competitive with anything on the market in the West. And by all indications it would be prohibitively expensive to get one outside of China. Inside China it may be a halfway-decent product for those who can't get what they want thanks to tariffs and embargoes, if it's cheap enough.
 

prosty_mirek

Junior Member
Nov 1, 2020
7
1
41
Process, clock and IPC of KH-40000 are roughly the same as CNS.
From patches we know that KH-40000 and YongFeng are the same model and have the Centaur ID.
Looks like Zhaoxin cut Ncore (AI) and packed in more L1 cache per core and 2x the cores per socket.
 

Kosusko

Member
Nov 10, 2019
159
117
86
Yeap.

Zhaoxin has refined and improved the Centaur CNS microarchitecture: up to 16 cores, up to 32 cores in two dies and, of course, up to 64 cores in a dual-socket implementation (2x 32C).

NCORE was a Glenn Henry project that was "glued" to the CHA SoC with Centaur CNS cores.
Allegedly Zhaoxin does not own NCORE intellectual property.
 

prosty_mirek

Junior Member
Nov 1, 2020
7
1
41
Yeap.

Zhaoxin has refined and improved the Centaur CNS microarchitecture: up to 16 cores, up to 32 cores in two dies and, of course, up to 64 cores in a dual-socket implementation (2x 32C).
My bad. I meant exactly that.

NCORE was a Glenn Henry project that was "glued" to the CHA SoC with Centaur CNS cores.
Allegedly Zhaoxin does not own NCORE intellectual property.
Zhaoxin owns patents that Henry invented.
https://patents.google.com/?q=Neural&assignee=Zhaoxin&language=ENGLISH&num=100
Rumor has it that Ncore was very difficult to program.
 

dark zero

Platinum Member
Jun 2, 2015
2,623
118
106
@NostaSeronx , one question: if SMIC goes 7nm, is it likely that Huawei might release at least a successor of the Kirin 710A and Kirin 820 on that node?
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,571
1,118
136
@NostaSeronx , one question: if SMIC goes 7nm, is it likely that Huawei might release at least a successor of the Kirin 710A and Kirin 820 on that node?
The first products will be for the Personal Computing market.
kunpengpc.png
Kirin laptops will be replaced by Kunpeng laptops.

PC-market -> Server-market -> Mobile-market -> Embedded-market

These products will be fabbed at the 7nm 25k wpm SMIC fab:
PC 1st (2022+) -> PC 2nd (2024)
These products will be fabbed at joint-owned SMIC/Huawei Fab (Which is also the Big Collab IC Fab as well):
Kunpeng Server 1st (2023+) -> Kunpeng Server 2nd (2025)
Kirin 1st (2024+) -> Kirin 2nd (2026)
etc.

SMIC's 7nm Arch-spread:
7nm x86 => Government/Enterprise (Legacy)
7nm ARM => General Public
7nm RISC-V => HPC/Cloud
7nm LoongArch => Government/Education/Secure Applications

However, Huawei has been re-organized with Kunpeng PC being the major focus.

Kunpeng PC (1st launch)
8? Custom ARM cores + Integrated Custom GPU -> [>10?]

Kunpeng WS (2nd launch)
32? Custom ARM Cores + 1/2? Discrete Custom GPU -> [>40?]

Kunpeng SV (3rd launch)
64? Custom ARM Cores + 3?/4? Discrete Custom GPUs -> [>80]

Huawei's Custom ARM vs Zhaoxin x86

I don't know if they will improve the TaiShan cores once they shift over to SMIC 7nm, btw.
They might skip over Kunpeng 920's core and go straight to 930's core. Thus, SVE and SMT2 become available.
 

dark zero

Platinum Member
Jun 2, 2015
2,623
118
106
So...
The Kirin line is dead and the Kunpeng line will be the successor?
If that's so...
What are they gonna pull in order to stay at least relevant in the mid range?
Using ARM A78 cores in the low and mid range?
 

Kosusko

Member
Nov 10, 2019
159
117
86
The LGA KH-40000 processor is between the BGA CPUs in the banner.

In the middle is the KH-40000
On the left is the KX-6000G
To the right is the KX-6000 / KH-30000
etc.



source: https://news.mydrivers.com/1/841/841373.htm

It is already a large family of x86 processors. The catch is that many are already obsolete.
 

gruffi

Member
Nov 28, 2014
28
91
91
Do I see correctly? They compare single-core spec2006 and claim they match AMD with half the cores and power consumption / TDP? That's pure clickbait. According to the spec2006 scores such a Zhaoxin core offers less than half the performance of a current AMD or Intel core (w/ SMT). So no, this is nowhere near any competition to Intel in professional markets. Let alone AMD.
 

Kosusko

Member
Nov 10, 2019
159
117
86
New KH40000 results:

2 Processors, 32 Cores
ZHAOXIN KaiSheng KH-40000/16
source: https://browser.geekbench.com/v5/cpu/15706425

CentaurHauls Family 6 Model 71 Stepping 2 vs CentaurHauls Family 7 Model 11 Stepping 3
CNS vs KH-40000
The L1 instruction cache is 64KB per core, which is different from Centaur CNS.
The L3 cache is 8MB per eight-core cluster, which is different from Centaur CNS with 16MB of L3 cache per eight cores.
source: https://browser.geekbench.com/v5/cpu/compare/12878360?baseline=15706425
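The family/model/stepping triples above come out of the CPUID signature, and this thread also mentions model 0x5b for KH-40000 while Geekbench shows Model 11. One possible reading (my assumption, not something the Geekbench pages state) is that these are the same signature decoded with two different rules. To my understanding, the Linux kernel adds the extended model bits for any family >= 6, while Intel's documented rule applies them only for base family 6 or 15. A minimal sketch follows; `encode_signature` is a hypothetical helper used only to build test values.

```python
def decode_signature(eax, extend_model_for_all_ge_6=True):
    """Decode a CPUID leaf-1 EAX signature into (family, model, stepping)."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is only added when the base family is 15.
    family = base_family + ext_family if base_family == 0xF else base_family

    if extend_model_for_all_ge_6:
        # Linux-style rule (as I understand it): any family >= 6.
        use_ext = family >= 6
    else:
        # Intel-style rule: only base family 6 or 15.
        use_ext = base_family in (0x6, 0xF)

    model = base_model + (ext_model << 4) if use_ext else base_model
    return family, model, stepping


def encode_signature(family, model, stepping):
    """Hypothetical helper: build a signature for a family below 15."""
    return ((model >> 4) << 16) | (family << 8) | ((model & 0xF) << 4) | stepping


if __name__ == "__main__":
    cns = encode_signature(6, 71, 2)      # CNS entry from the comparison
    kh = encode_signature(7, 0x5B, 3)     # KH-40000, assuming model 0x5b
    print(hex(cns), decode_signature(cns))
    print(hex(kh), decode_signature(kh), decode_signature(kh, False))
```

Under that assumption, the same KH-40000 signature decodes to family 7 model 91 (0x5b) with the first rule and family 7 model 11 with the second, which would explain both numbers showing up.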
 
