Are these two the ones at the bottom of that list?
Elbrus-8S - Wikipedia
en.wikipedia.org
Description | Cores | Frequency (MHz) | Prod year | TDP (W) | Process (nm) | GFlops (double) | ISA generation | Model name | Alternative name |
---|---|---|---|---|---|---|---|---|---|
Cpu Elbrus | 1 | 300 | 2005 | 6 | 130 | 2.4 | (E2Kv1) | - | e3m |
SoC Elbrus | 1 | 300 | 2007 | 6 | 130 | 2.4 | (E2Kv2) | - | e3s |
SoC Elbrus-S | 1 | 500 | 2010 | 13 | 90 | 4 | elbrus-v2 | elbrus-s | Экскурсовод, Эльбрус-3S, Elbrus-2C1 |
SoC Elbrus-2C+DSP | 2 (+4 DSP cores) | 500 | 2011 | 25 | 90 | 8 | elbrus-v2 | elbrus-2c+ | Кубик, Elbrus-2C2, Elbrus-S2 |
SoC Elbrus-4C | 4 | 800 | 2014 | 60 | 65 | 25.6 | elbrus-v3 | elbrus-4c | e2s, Эльбрус-2S |
SoC Elbrus-8C | 8 | 1300 | 2016 | 80 | 28 | 124.8 | elbrus-v4 | elbrus-8c | e8c, Процессор-1, Эльбрус-4C+ |
SoC Elbrus-1C+ | 1 | 1000 | 2015 | 10 | 40 | 12 | elbrus-v4 | elbrus-1c+ | e1c+, e1cp, e1c, Процессор-2 |
SoC Elbrus-8CB | 8 | 1500 | 2018 | 90 | 28 | 288 | elbrus-v5 | elbrus-8c2 | e8c2, p9, Процессор-9 |
SoC Elbrus-16C | 16 | 2000 | 2021 | 110-130 | 16 | 768 | elbrus-v6 | elbrus-16c | |
SoC Elbrus-2C3 | 2 | 2000 | 2021 | 10-15 | 16 | 96 | elbrus-v6 | elbrus-2c3 | |
SoC Elbrus-12C | 12 | 2000 | 2022 | 85-90 | 16 | 576 | elbrus-v6 | elbrus-12c | |
SoC Elbrus-32C | 32 | 2500? | 2025/2026? | ? | 7 | 1920 | elbrus-v7 | elbrus-32c? | Процессор-21(СРВ) |
####################################################
getDetails and MHz
Assembler CPUID and RDTSC
CPU , Features Code 00000000, Model Code 00000000
Measured - Minimum 0 MHz, Maximum 0 MHz
Linux Functions
get_nprocs() - CPUs 8, Configured CPUs 8
get_phys_pages() and size - RAM Size 30.94 GB, Page Size 4096 Bytes
uname() - Linux, sumireko, 4.19.0-0.3-e8c2
#1 SMP Sat Jan 18 09:49:15 GMT 2020, e2k
##############################################
64 Bit MP SSE MFLOPS Benchmark 1, 8 Threads, Sat Mar 21 14:13:55 2020
Test 4 Byte Ops/ Repeat Seconds MFLOPS First All
Words Word Passes Results Same
Data in & out 102400 2 20000 0.050556 81019 0.620974 Yes
Data in & out 1024000 2 2000 0.045250 90519 0.942935 Yes
Data in & out 10240000 2 200 0.446652 9170 0.994032 Yes
Data in & out 102400 8 20000 0.066665 245766 0.749971 Yes
Data in & out 1024000 8 2000 0.065821 248916 0.965360 Yes
Data in & out 10240000 8 200 0.465831 35172 0.996409 Yes
Data in & out 102400 32 20000 0.177363 369501 0.498060 Yes
Data in & out 1024000 32 2000 0.174700 375134 0.910573 Yes
Data in & out 10240000 32 200 0.464857 140981 0.990447 Yes
End of test Sat Mar 21 14:13:57 2020
Press Enter
The wiki says the 8SV has 576 vs. the 8S (250?), and the only difference was 1500 vs. 1300 MHz and DDR4-2400 vs. DDR3-1600?
How to calculate FLOPS (v1 .. v3):
- Single Precision: 4 FP ALUs * 4 Single operation * Cores * Frequency
- Double Precision: 4 FP ALUs * 2 Double operation * Cores * Frequency
How to calculate FLOPS (v4):
- Single Precision: 6 FP ALUs * 4 Single operation * Cores * Frequency
- Double Precision: 6 FP ALUs * 2 Double operation * Cores * Frequency
How to calculate FLOPS (v5+ [128 bit SIMD]):
- Single Precision: 6 FP ALUs * 4 Single operation * 2 SIMD * Cores * Frequency
- Double Precision: 6 FP ALUs * 2 Double operation * 2 SIMD * Cores * Frequency
Example for Elbrus-16C: 6 ALUs * 2 DP * 2 SIMD * 16 Cores * 2e9 Hz = 7.68e11 --> 768 GFlops
#include <stdio.h>
#include <stdlib.h>  /* calloc */

/* Integer kernel: res[i] = a*x[i] + b*y[i]/2 + c*z[i]. The caller owns the returned buffer. */
int * calculate_mx4(int * x, int * y, int * z, int a, int b, int c, const int size) {
    int * res = (int *)calloc(size, sizeof(int));
    /* Trip-count hint for the compiler. */
    #pragma loop count(4)
    for(int i = 0; i < size; i++) {
        res[i] = a * x[i] + b * y[i] / 2 + z[i] * c;
    }
    return res;
}

/* Double-precision kernel; the divisor 1.99 keeps the division from being folded into a shift or multiply. */
double * calculate_fx4(double * x, double * y, double * z, double a, double b, double c, const int size) {
    double * res = (double *)calloc(size, sizeof(double));
    for(int i = 0; i < size; i++) {
        res[i] = a * x[i] + b * y[i] / 1.99 + z[i] * c;
    }
    return res;
}

int main() {
    int x[] = { 1, -1, 1, -1 };
    int y[] = { 1, 2, 3, 4 };
    int z[] = { 8, 4, 2, 1 };
    int * res = calculate_mx4(x, y, z, 7, 8, 6, 4);
    printf("%d %d %d %d\n", res[0], res[1], res[2], res[3]);

    double fx[] = { 1.0, -1.0, 1.0, -1.0 };
    double fy[] = { 1.0, 2.0, 3.0, 4.0 };
    double fz[] = { 8.0, 4.0, 2.0, 1.0 };
    /* size is an int, so pass 4 rather than 4.0 */
    double * fres = calculate_fx4(fx, fy, fz, 7.0, 8.0, 6.0, 4);
    printf("%f %f %f %f\n", fres[0], fres[1], fres[2], fres[3]);

    /* The buffers are not freed; the process exits right away. */
    return 0;
}
calculate_mx4(int*, int*, int*, int, int, int, int):
{
setwd wsz = 0xd, nfx = 0x1, dbl = 0x0
setbn rsz = 0x3, rbs = 0x9, rcur = 0x0
disp %ctpr2, calloc
addd,4,sm 0x0, %dr0, %dr17
}
{
cmplsb,0 0x0, %r6, %pred2
sxt,2 0x2, %r6, %db[0]
addd,5 0x4, 0x0, %db[1]
}
{
getsp,0 _f32s,_lts0 0xfffffff0, %dr8
}
{
nop 1
cmplsb,0,sm 0x2, %r6, %pred1 ? %pred2
cmplsb,1 0x1, %r6, %pred0 ? %pred2
addd,2,sm 0x8, 0x0, %dr14 ? %pred2
adds,3,sm 0x2, 0x0, %r15 ? %pred2
addd,4,sm 0x4, 0x0, %dr9 ? %pred2
addd,5 0x0, 0x0, %dr13 ? %pred2
}
{
call %ctpr2, wbs = 0x9
}
{
return %ctpr3
ldw,0,sm %dr1, 0x0, %g16
ldw,2,sm %dr2, 0x0, %g17
ldw,3 %dr0, 0x0, %g18 ? %pred2
addd,4 0x0, %db[0], %dr16
}
{
disp %ctpr1, .L345
addd,4 0x0, %dr16, %dr0 ? ~%pred2
}
{
ct %ctpr3 ? ~%pred2
}
{
nop 1
return %ctpr3
}
{
nop 5
muls,0,sm %r4, %g16, %g16
muls,1,sm %g17, %r5, %g17
muls,3,sm %r3, %g18, %g18
}
{
getfs,0,sm %g16, _f16s,_lts0lo 0x5f, %g19
adds,1 0x0, %g17, %r11 ? %pred2
adds,2 0x0, %g18, %r12 ? %pred2
}
{
adds,0,sm %g16, %g19, %g16
}
{
sars,0,sm %g16, 0x1, %r10 ? %pred2
}
.L345:
{
adds,0 %r12, %r10, %g16
addd,3 0x0, %dr16, %dr0 ? ~%pred0
pass %pred0, @p0
andp @p0, @p0, @p4
pass @p4, %pred2
}
{
adds,0 %g16, %r11, %g16
}
{
stw,2 %dr16, %dr13, %g16
ldw,5,sm %dr2, %dr9, %r11
}
{
ct %ctpr3 ? ~%pred2
ldw,0,sm %dr1, %dr9, %r8
ldw,2,sm %dr17, %dr9, %r10
}
{
adds,0,sm %r15, 0x1, %g16
addd,1,sm 0x0, %dr9, %dr13
addd,2,sm 0x0, %dr14, %dr9
addd,3,sm 0x4, %dr14, %dr14
pass %pred1, @p0
andp @p0, @p0, @p4
pass @p4, %pred0
}
{
cmplsb,0,sm %g16, %r6, %pred2
adds,1,sm 0x0, %g16, %r15
}
{
pass %pred2, @p0
andp @p0, @p0, @p4
pass @p4, %pred1
}
{
muls,3 %r11, %r5, %g16
}
{
nop 4
muls,0 %r4, %r8, %g17
muls,1 %r3, %r10, %g18
}
{
adds,0,sm 0x0, %g16, %r11
}
{
getfs,0 %g17, _f16s,_lts0lo 0x5f, %g16
adds,1,sm 0x0, %g18, %r12
}
{
adds,0 %g17, %g16, %g16
}
{
sars,0 %g16, 0x1, %g16
}
{
ct %ctpr1
adds,0,sm 0x0, %g16, %r10
}
calculate_fx4(double*, double*, double*, double, double, double, int):
{
setwd wsz = 0x10, nfx = 0x1, dbl = 0x1
setbn rsz = 0x3, rbs = 0xc, rcur = 0x0
disp %ctpr2, calloc
getsp,0 _f32s,_lts1 0xfffffff0, %dr8
addd,1,sm 0x0, %dr5, %dr5
addd,2,sm 0x0, %dr3, %dr3
addd,3,sm 0x0, %dr4, %dr4
addd,4,sm 0x0, %dr2, %dr2
addd,5,sm 0x0, %dr1, %dr1
}
{
nop 1
cmplsb,0,sm 0x0, %r6, %pred0
sxt,1 0x2, %r6, %db[0]
addd,2 0x8, 0x0, %db[1]
addd,3,sm 0x0, %dr0, %dr10
}
{
merges,0,sm 0x1, %r6, %r11, %pred0
}
{
subs,0,sm %r11, 0x1, %r8
}
{
call %ctpr2, wbs = 0xc
cmplsb,0,sm %r8, _f16s,_lts0lo 0x60, %pred1
}
{
return %ctpr3
ldd,0,sm %dr1, 0x0, %dg16
ldd,2,sm %dr1, 0x8, %dg17
ldd,3,sm %dr1, _f16s,_lts0lo 0x10, %dg18
addd,4 0x0, %db[0], %dr6
ldd,5,sm %dr1, _f16s,_lts0hi 0x18, %dg19
}
{
disp %ctpr1, .L933
ldd,0,sm %dr1, _f16s,_lts0lo 0x28, %dr15
merges,1,sm %r8, _f16s,_lts0hi 0x60, %g20, ~%pred1
ldd,2,sm %dr1, _f16s,_lts1lo 0x20, %dg21
cmpledb,3,sm %dr6, %dr1, %pred1
addd,4 0x0, %dr6, %dr0 ? ~%pred0
ldd,5,sm %dr1, _f16s,_lts1hi 0x30, %dr14
}
{
ct %ctpr3 ? ~%pred0
cmpledb,0,sm %dr6, %dr0, %pred2
sxt,1,sm 0x2, %g20, %dg20
ldd,2,sm %dr1, _f16s,_lts0lo 0x38, %dr13
subd,3,sm %dr6, 0x8, %dg23
subd,4,sm %dr6, 0x8, %dg22
ldd,5,sm %dr1, _f16s,_lts0hi 0x40, %dr12
}
{
shld,0,sm %dg20, 0x3, %dg20
cmpledb,1,sm %dr6, %dr2, %pred3
}
{
addd,0,sm 0x8, %dg20, %dg20
}
{
addd,0,sm %dg20, %dr1, %dg24
addd,1,sm %dg20, %dr0, %dg25
addd,2,sm %dg20, %dr2, %dg20
fmuld,3,sm %dr4, %dg16, %dr20
fmuld,4,sm %dr4, %dg17, %dr19
fmuld,5,sm %dr4, %dg18, %dr18
}
{
cmpledb,0,sm %dg24, %dg22, %pred4
cmpledb,1,sm %dg25, %dg23, %pred5
fmuld,2,sm %dr4, %dg19, %dr17
addd,3,sm 0x0, _f64,_lts0 0x3fffd70a3d70a3d7, %dr0
fmuld,4,sm %dr4, %dg21, %dr16
}
{
cmpledb,0,sm %dg20, %dg22, %pred4
pass %pred4, @p0
pass %pred1, @p1
landp ~@p0, ~@p1, @p4
pass @p4, %pred1
pass %pred5, @p2
pass %pred2, @p3
landp ~@p2, ~@p3, @p5
pass @p5, %pred2
}
{
pass %pred0, @p0
pass %pred2, @p1
landp @p0, ~@p1, @p4
pass @p4, %pred0
pass %pred1, @p2
landp @p4, ~@p2, @p5
pass @p5, %pred1
}
{
nop 2
pass %pred1, @p0
pass %pred4, @p1
landp @p0, @p1, @p4
pass @p4, %pred0
landp @p0, ~@p1, @p5
pass @p5, %pred1
pass %pred3, @p2
landp @p5, @p2, @p6
pass @p6, %pred2
}
{
ct %ctpr1 ? %pred0
}
{
ct %ctpr1 ? %pred2
}
{
setwd wsz = 0x35, nfx = 0x1, dbl = 0x1
setbn rsz = 0x28, rbs = 0xc, rcur = 0x0
disp %ctpr1, .L571
addd,0 0x0, 0x0, %dg16
addd,3 0x0, _f64,_lts1 0x3fffd70a3d70a3d7, %dr0
}
{
addd,0 0x0, _f64,_lts0 0x20ff2000000000, %dg17
aaurwd,2 %dr6, %aad0
addd,3,sm 0x0, 0x0, %db[32]
}
{
insfd,0 %dg17, _f32s,_lts1 0x8800, %dr11, %dg17
aaurwd,2 %dg16, %aasti1
addd,3,sm %db[32], _f16s,_lts0lo 0x10, %dg19
addd,4,sm 0x8, %db[32], %dg18
addd,5,sm %db[32], _f16s,_lts0hi 0x18, %dg20
}
{
ldd,0,sm %dr10, %db[32], %db[60], mas=0x4
addd,1,sm %db[32], _f16s,_lts0lo 0x30, %dg22
addd,2,sm %db[32], _f16s,_lts1hi 0x38, %dg23
ldd,3,sm %dr1, %db[32], %db[79], mas=0x4
addd,4,sm %db[32], _f16s,_lts0hi 0x20, %dg16
addd,5,sm %db[32], _f16s,_lts1lo 0x28, %dg21
}
{
ldd,0,sm %dr1, %dg19, %db[75], mas=0x4
addd,1,sm %db[32], _f16s,_lts0lo 0x50, %dg26
addd,2,sm %db[32], _f16s,_lts1hi 0x58, %dg27
ldd,3,sm %dr1, %dg18, %db[77], mas=0x4
addd,4,sm %db[32], _f16s,_lts0hi 0x40, %dg24
addd,5,sm %db[32], _f16s,_lts1lo 0x48, %dg25
}
{
ldd,0,sm %dr1, %dg16, %db[71], mas=0x4
addd,1,sm %db[32], _f16s,_lts0lo 0x60, %dg28
addd,2,sm %db[32], _f16s,_lts0hi 0x70, %dg30
ldd,3,sm %dr1, %dg20, %db[73], mas=0x4
addd,4,sm %db[32], _f16s,_lts1lo 0x68, %dg29
addd,5,sm %db[32], _f16s,_lts1hi 0x78, %db[2]
}
{
ldd,0,sm %dr10, %dg18, %db[58], mas=0x4
addd,1,sm 0x0, %dg18, %db[30]
addd,2,sm 0x0, %dg19, %db[28]
ldd,3,sm %dr1, %dg21, %db[69], mas=0x4
addd,4,sm 0x0, %dg20, %db[26]
addd,5,sm 0x0, %dg16, %db[24]
}
{
ldd,0,sm %dr10, %dg19, %db[56], mas=0x4
addd,1,sm 0x0, %dg21, %db[22]
addd,2,sm 0x0, %dg22, %db[20]
ldd,3,sm %dr1, %dg22, %db[67], mas=0x4
addd,4,sm 0x0, %dg23, %db[18]
addd,5,sm 0x0, %dg24, %db[16]
}
{
ldd,0,sm %dr1, %dg24, %db[63], mas=0x4
addd,1,sm 0x0, %dg25, %db[14]
addd,2,sm 0x0, %dg26, %db[12]
ldd,3,sm %dr1, %dg23, %db[65], mas=0x4
fmuld,4,sm %dr4, %db[79], %dg31
addd,5,sm 0x0, %dg27, %db[10]
}
{
ldd,0,sm %dr1, %dg25, %db[61], mas=0x4
addd,1,sm 0x0, %dg28, %db[8]
addd,2,sm 0x0, %dg29, %db[6]
fmuld,3,sm %dr4, %db[75], %dg24
fmuld,4,sm %dr4, %db[77], %dg23
addd,5,sm 0x0, %dg30, %db[4]
}
{
ldd,0,sm %dr2, %db[32], %db[72], mas=0x4
fmuld,3,sm %dr4, %db[71], %dr8
fmuld,4,sm %dr4, %db[73], %dg25
}
{
ldd,0,sm %dr2, %dg18, %db[70], mas=0x4
fmuld,4,sm %dr4, %db[69], %dr11
}
{
ldd,0,sm %dr2, %dg19, %db[68], mas=0x4
ldd,3,sm %dr10, %dg20, %db[54], mas=0x4
fmuld,4,sm %dr4, %db[67], %dg19
fdivd,5,sm %dg31, %dr0, %dg18
}
{
ldd,0,sm %dr10, %dg16, %db[52], mas=0x4
ldd,3,sm %dr10, %dg21, %db[50], mas=0x4
fmuld,4,sm %dr4, %db[65], %dg21
fmuld,5,sm %dr4, %db[63], %dg31
}
{
ldd,0,sm %dr1, %dg27, %db[57], mas=0x4
ldd,3,sm %dr1, %dg26, %db[59], mas=0x4
fmuld,4,sm %dr4, %db[61], %dg26
fdivd,5,sm %dg23, %dr0, %dg23
}
{
rwd,0 %dg17, %lsr
ldd,3,sm %dr2, %dg20, %db[66], mas=0x4
}
{
ldd,0,sm %dr2, %dg16, %db[64], mas=0x4
ldd,3,sm %dr10, %dg22, %db[48], mas=0x4
fdivd,5,sm %dg24, %dr0, %dg17
}
{
ldd,0,sm %dr1, %dg28, %db[55], mas=0x4
ldd,3,sm %dr1, %dg29, %db[53], mas=0x4
}
{
ldd,0,sm %dr1, %dg30, %db[51], mas=0x4
fdivd,5,sm %dg25, %dr0, %dg16
}
{
fmuld,0,sm %dr4, %db[59], %db[5]
fmuld,1,sm %dr4, %db[57], %db[3]
}
{
nop 1
fdivd,5,sm %dr8, %dr0, %dg20
}
{
nop 1
fdivd,5,sm %dr11, %dr0, %dg22
}
{
nop 1
fdivd,5,sm %dg19, %dr0, %db[35]
}
{
nop 1
fmul_addd,3,sm %dr3, %db[60], %dg18, %db[44]
fdivd,5,sm %dg21, %dr0, %db[33]
}
{
nop 1
fmul_addd,3,sm %dr3, %db[58], %dg23, %db[42]
fdivd,5,sm %dg31, %dr0, %db[31]
}
{
nop 1
fmul_addd,3,sm %dr3, %db[56], %dg17, %db[40]
fdivd,5,sm %dg26, %dr0, %db[29]
}
{
nop 1
fmul_addd,3,sm %dr3, %db[54], %dg16, %db[38]
}
{
nop 1
fmul_addd,3,sm %dr3, %db[52], %dg20, %db[36]
fmul_addd,4,sm %db[72], %dr5, %db[44], %db[80]
}
{
nop 1
fmul_addd,3,sm %dr3, %db[50], %dg22, %db[34]
fmul_addd,4,sm %db[70], %dr5, %db[42], %db[78]
}
{
fmul_addd,3,sm %db[68], %dr5, %db[40], %db[76]
}
.L571:
{
loop_mode
rbranch .L1495
ldd,0,sm %dr10, %db[18], %db[46], mas=0x4 ? %pcnt7
fmuld,1,sm %dr4, %db[55], %db[1]
ldd,2 %dr2, %db[32], %db[72], mas=0x3 ? %pcnt0
fdivd,5,sm %db[5], %dr0, %db[27]
}
.L1511:
{
loop_mode
rbranch .L1498
ldd,0,sm %dr2, %db[22], %db[62], mas=0x4 ? %pcnt5
addd,1,sm 0x8, %db[2], %db[0]
ldd,2 %dr1, %db[32], %db[79], mas=0x3 ? %pcnt0
ldd,3,sm %dr1, %db[2], %db[49], mas=0x4
fmul_addd,4,sm %db[66], %dr5, %db[38], %db[74]
ldd,5 %dr10, %db[32], %db[60], mas=0x3 ? %pcnt0
}
.L1508:
{
loop_mode
alc alcf=1, alct=1
abn abnf=1, abnt=1
ct %ctpr1 ? %NOT_LOOP_END
fmul_addd,4,sm %dr3, %db[48], %db[35], %db[32]
staad,5 %db[80], %aad0[ %aasti1 ]
incr,5 %aaincr0
}
{
setwd wsz = 0x10, nfx = 0x1, dbl = 0x1
setbn rsz = 0x3, rbs = 0xc, rcur = 0x0
disp %ctpr1, .L421
adds,0 0x0, 0x0, %g16
addd,1 0x0, 0x0, %dg17
}
{
return %ctpr3
mmurw,2 %dg17, %dam_inv
}
{
nop 3
aaurw,2 %g16, %aabf0
}
{
ct %ctpr1
}
.L933:
{
setwd wsz = 0x28, nfx = 0x1, dbl = 0x1
setbn rsz = 0x1b, rbs = 0xc, rcur = 0x0
ldisp %ctpr2, .L1133
addd,0 0x0, _f64,_lts1 0x20492000000000, %dg17
fmuld,1,sm %dr4, %dr14, %dg20
fmuld,2,sm %dr4, %dr15, %dg21
fmuld,3,sm %dr4, %dr12, %dg18
fmuld,4,sm %dr4, %dr13, %dg19
fdivd,5,sm %dr16, %dr0, %dg16
}
{
disp %ctpr1, .L608
insfd,0 %dg17, _f32s,_lts0 0x8800, %dr11, %dg17
addd,1,sm 0x0, %dr10, %dg22
addd,2,sm 0x0, %dr1, %dg24
addd,3,sm 0x0, %dr2, %dg26
addd,4,sm 0x0, 0x0, %dg23
addd,5,sm 0x0, 0x0, %dg25
}
{
return %ctpr3
rwd,0 %dg17, %lsr
aaurwd,2 %dr6, %aad3
fdivd,5,sm %dr17, %dr0, %dg28
}
{
ldd,0,sm %dr10, _f16s,_lts0hi 0x20, %dg29
addd,1,sm 0x0, 0x0, %dg27
addd,2,sm %dg26, _f16s,_lts0lo 0xa8, %dg26
addd,3,sm %dg22, _lit16_ref,_lts0lo 0xa8, %dg22
addd,4 0x0, 0x0, %dg17
addd,5,sm %dg24, _lit16_ref,_lts0lo 0xa8, %dg24
}
{
ldd,0,sm %dr10, _f16s,_lts0hi 0x10, %dr8
ldd,2,sm %dr10, 0x8, %dr11
ldd,3,sm %dr10, _f16s,_lts0lo 0x18, %dg31
fdivd,5,sm %dr18, %dr0, %dg30
}
{
aaurwq,2 %qg26, %aad0
}
{
ldd,0,sm %dr10, 0x0, %dg17
aaurwd,2 %dg17, %aasti1
fdivd,5,sm %dr19, %dr0, %dg26
}
{
aaurwq,2 %qg22, %aad2
}
{
ldd,0,sm %dr1, _f16s,_lts0lo 0x58, %dg27
ldd,2,sm %dr1, _f16s,_lts1lo 0x50, %dr12
ldd,3,sm %dr1, _f16s,_lts0hi 0x60, %dg23
fdivd,5,sm %dr20, %dr0, %dg22
}
{
aaurwq,2 %qg24, %aad1
}
{
bap
ldd,0,sm %dr1, _f16s,_lts0lo 0x78, %dg25
ldd,2,sm %dr1, _f16s,_lts1lo 0x70, %dr13
ldd,3,sm %dr1, _f16s,_lts0hi 0x48, %dg24
fdivd,5,sm %dg18, %dr0, %dg18
}
{
ldd,0,sm %dr2, _f16s,_lts0lo 0x20, %dr15
ldd,2,sm %dr2, _f16s,_lts1lo 0x18, %dr16
ldd,3,sm %dr1, _f16s,_lts0hi 0x68, %dr14
}
{
ldd,0,sm %dr2, 0x8, %dr18
ldd,2,sm %dr2, 0x0, %dr19
ldd,3,sm %dr2, _f16s,_lts0lo 0x10, %dr17
fdivd,5,sm %dg19, %dr0, %dg19
}
{
ldd,0,sm %dr10, _f16s,_lts0lo 0x38, %dr21
fmuld,1,sm %dr4, %dr12, %dr12
ldd,2,sm %dr10, _f16s,_lts1lo 0x30, %dr22
ldd,3,sm %dr10, _f16s,_lts0hi 0x40, %dr20
fmuld,4,sm %dr4, %dg23, %dg23
fmuld,5,sm %dr4, %dg27, %dg27
}
{
ldd,0,sm %dr1, _f16s,_lts0lo 0x80, %dr23
ldd,2,sm %dr1, _f16s,_lts1lo 0xa0, %db[50]
ldd,3,sm %dr10, _f16s,_lts0hi 0x28, %dg29
fmul_addd,4,sm %dr3, %dg29, %dg16, %dg16
fdivd,5,sm %dg20, %dr0, %dg20
}
{
ldd,0,sm %dr10, _f16s,_lts0lo 0xa0, %db[3]
fmuld,1,sm %dr4, %dr13, %dr13
ldd,2,sm %dr2, _lit16_ref,_lts0lo 0xa0, %db[2]
ldd,3,sm %dr1, _f16s,_lts0hi 0x98, %db[52]
fmuld,4,sm %dr4, %dg24, %dg24
fmuld,5,sm %dr4, %dg25, %dg25
}
{
ldd,0,sm %dr10, _f16s,_lts0lo 0x98, %db[5]
fmuld,1,sm %dr4, %dr14, %dg31
ldd,2,sm %dr2, _lit16_ref,_lts0lo 0x98, %db[4]
ldd,3,sm %dr1, _f16s,_lts0hi 0x90, %db[54]
fmul_addd,4,sm %dr3, %dg31, %dg28, %dg28
fdivd,5,sm %dg21, %dr0, %dg21
}
{
ldd,0,sm %dr10, _f16s,_lts0lo 0x90, %db[7]
ldd,2,sm %dr2, _lit16_ref,_lts0lo 0x90, %db[6]
ldd,3,sm %dr10, _f16s,_lts0hi 0x88, %db[9]
}
{
ldd,0,sm %dr2, _f16s,_lts0lo 0x88, %db[8]
ldd,2,sm %dr10, _f16s,_lts0hi 0x80, %db[11]
ldd,3,sm %dr2, _lit16_ref,_lts0hi 0x80, %db[10]
fmul_addd,4,sm %dr3, %dr8, %dg30, %dg30
ldd,5,sm %dr10, _f16s,_lts1lo 0x78, %db[13]
}
{
ldd,0,sm %dr2, _f16s,_lts0lo 0x78, %db[12]
fmuld,1,sm %dr4, %dr23, %db[51]
ldd,2,sm %dr10, _f16s,_lts0hi 0x70, %db[15]
ldd,3,sm %dr2, _lit16_ref,_lts0hi 0x70, %db[14]
fdivd,5,sm %dg23, %dr0, %db[42]
}
{
ldd,0,sm %dr10, _f16s,_lts0lo 0x68, %db[17]
ldd,2,sm %dr2, _lit16_ref,_lts0lo 0x68, %db[16]
ldd,3,sm %dr10, _f16s,_lts0hi 0x60, %db[19]
fmul_addd,4,sm %dr3, %dr11, %dg26, %dg23
ldd,5,sm %dr2, _lit16_ref,_lts0hi 0x60, %db[18]
}
{
ldd,0,sm %dr10, _f16s,_lts0lo 0x58, %db[21]
ldd,2,sm %dr2, _lit16_ref,_lts0lo 0x58, %db[20]
ldd,3,sm %dr10, _f16s,_lts0hi 0x50, %db[23]
fdivd,5,sm %dg27, %dr0, %db[44]
}
{
ldd,0,sm %dr2, _f16s,_lts0lo 0x50, %db[22]
ldd,2,sm %dr10, _f16s,_lts0hi 0x48, %db[25]
ldd,3,sm %dr2, _lit16_ref,_lts0hi 0x48, %db[24]
fmul_addd,4,sm %dr3, %dg17, %dg22, %dg17
ldd,5,sm %dr2, _f16s,_lts1lo 0x40, %db[26]
}
{
ldd,0,sm %dr2, _f16s,_lts0lo 0x38, %db[28]
ldd,2,sm %dr2, _f16s,_lts0hi 0x30, %db[30]
ldd,3,sm %dr2, _f16s,_lts1lo 0x28, %db[32]
fmul_addd,4,sm %dr15, %dr5, %dg16, %db[27]
fdivd,5,sm %dr12, %dr0, %db[46]
}
{
ldd,0,sm %dr1, _f16s,_lts0lo 0x88, %dg16
fmul_addd,3,sm %dr3, %dr20, %dg18, %db[39]
fmul_addd,4,sm %dr16, %dr5, %dg28, %db[29]
}
{
fdivd,5,sm %dg24, %dr0, %db[48]
}
{
fmul_addd,3,sm %dr17, %dr5, %dg30, %db[31]
fmul_addd,4,sm %dr3, %dr21, %dg19, %db[41]
}
{
fdivd,5,sm %dg25, %dr0, %db[36]
}
{
fmul_addd,3,sm %dr18, %dr5, %dg23, %db[33]
fmul_addd,4,sm %dr3, %dr22, %dg20, %db[43]
}
{
fmuld,0,sm %dr4, %dg16, %db[49]
fdivd,5,sm %dr13, %dr0, %db[38]
}
{
fmul_addd,3,sm %dr19, %dr5, %dg17, %db[35]
fmul_addd,4,sm %dr3, %dg29, %dg21, %db[45]
}
{
nop 7
fdivd,5,sm %dg31, %dr0, %db[40]
}
.L608:
{
loop_mode
fmul_addd,3,sm %dr3, %db[25], %db[48], %db[37]
fmuld,4,sm %dr4, %db[54], %db[47]
fdivd,5,sm %db[51], %dr0, %db[34]
movad,0 area=0, ind=0, am=1, be=0, %db[0]
movad,1 area=1, ind=0, am=1, be=0, %db[1]
}
{
loop_mode
alc alcf=1, alct=1
abn abnf=1, abnt=1
ct %ctpr1 ? %NOT_LOOP_END
staad,2 %db[35], %aad3[ %aasti1 ]
incr,2 %aaincr0
fmul_addd,3,sm %db[32], %dr5, %db[45], %db[25]
movad,3 area=0, ind=0, am=1, be=0, %db[48]
}
{
setwd wsz = 0x10, nfx = 0x1, dbl = 0x1
setbn rsz = 0x3, rbs = 0xc, rcur = 0x0
adds,0 0x0, 0x0, %g16
}
{
disp %ctpr2, disp=0x0
aaurw,2 %g16, %aabf0
}
.L421:
{
ct %ctpr3
addd,3 0x0, %dr6, %dr0
}
.L1133:
{
fapb ct=0, dcd=0, fmt=4, mrng=8, d=0, incr=0, ind=0, asz=4, abs=0, disp=0
fapb dpl=0, dcd=0, fmt=4, mrng=8, d=1, incr=0, ind=0, asz=5, abs=0, disp=0
}
{
fapb ct=1, dcd=0, fmt=4, mrng=8, d=2, incr=0, ind=0, asz=4, abs=16, disp=0
}
.L1495:
{
nop 3
}
{
nop 7
fmul_addd,0,sm %db[72], %dr5, %db[44], %db[80]
}
{
ibranch .L1511
}
.L1498:
{
nop 3
}
{
nop 3
fmuld,3,sm %dr4, %db[79], %db[25]
}
{
nop 7
fdivd,5,sm %db[25], %dr0, %db[47]
}
{
nop 5
}
{
nop 7
fmul_addd,3,sm %dr3, %db[60], %db[47], %db[44]
}
{
nop 7
fmul_addd,3,sm %db[72], %dr5, %db[44], %db[80]
}
{
nop 1
}
{
ibranch .L1508
}
main:
{
setwd wsz = 0x16, nfx = 0x0, dbl = 0x0
setbn rsz = 0x3, rbs = 0x12, rcur = 0x0
disp %ctpr1, calloc
getsp,0 _f32s,_lts1 0xffffff40, %dr2
addd,1 0x0, _f64,_lts2 0x400000003, %dr3
scrd,3 0x1, 0x2, %dr4
}
{
qppackdl,0 %dr3, _f64,_lts2 0x200000001, %xr3
addd,1,sm 0x4, 0x0, %db[1]
addd,2 0x0, _f64,_lts0 0xffffffff00000001, %dr5
addd,3,sm 0x4, 0x0, %db[0]
addd,4 0x0, %dr4, %dr6
}
{
addd,0 %dr2, _f64,_lts0 0xc0, %dr1
addd,1 0x0, _f64,_lts2 0x100000002, %dr7
addd,2 0x0, %dr5, %dr9
}
{
qppackdl,0 %dr9, %dr5, %xr5
qppackdl,1 %dr7, _f64,_lts1 0x400000008, %xr3
stqp,2 %dr1, _f16s,_lts0lo 0xffe0, %xr3
adds,3 0x0, _f16s,_lts0hi 0x705c, %r7
}
{
ldw,0 %dr1, _f16s,_lts0lo 0xffec, %r9
addd,1 0x0, [ _f64,_lts2 .LC.1 ], %dr13
ldw,2 %dr1, _f16s,_lts0hi 0xffe4, %r11
ldw,3 %dr1, _f16s,_lts1lo 0xffe8, %r10
ldw,5 %dr1, _f16s,_lts1hi 0xffe0, %r12
}
{
call %ctpr1, wbs = 0x12
stqp,2 %dr1, _f16s,_lts0lo 0xffd0, %xr3
addd,4 0x0, _f64,_lts1 0x4010000000000000, %dr14
ldw,5 %dr1, _f16s,_lts0hi 0xffdc, %r3
}
{
disp %ctpr1, printf
ldw,0 %dr1, _f16s,_lts0lo 0xffd8, %r5
stqp,2 %dr1, _f16s,_lts0hi 0xfff0, %xr5
addd,3 0x0, _f64,_lts2 0x3ff0000000000000, %dr16
ldw,5 %dr1, _f16s,_lts1lo 0xffd4, %r15
}
{
ldw,0 %dr1, _f16s,_lts0lo 0xfff8, %r19
ldw,2 %dr1, _f16s,_lts0hi 0xffd0, %r17
ldw,3 %dr1, _f16s,_lts1lo 0xfff4, %r20
qppackdl,4 %dr14, _f64,_lts2 0x4008000000000000, %xr21
ldw,5 %dr1, _f16s,_lts1hi 0xfffc, %r18
}
{
addd,1 0x0, _f64,_lts1 0x4020000000000000, %dr23
ldw,2 %dr1, _f16s,_lts0lo 0xfff0, %r22
qppackdl,4 %dr6, %dr16, %xr6
}
{
getfs,0 %r9, %r7, %r24
getfs,1 %r10, %r7, %r25
getfs,3 %r11, %r7, %r26
getfs,4 %r12, %r7, %r7
shls,5 %r9, 0x3, %r9
}
{
shls,0 %r10, 0x3, %r10
ands,1 %r25, 0x1, %r25
shls,2 %r11, 0x3, %r11
ands,3 %r26, 0x1, %r26
shls,4 %r12, 0x3, %r12
ands,5 %r7, 0x1, %r7
}
{
shls,0 %r3, 0x1, %r27
shls,1 %r3, 0x2, %r3
ands,2 %r24, 0x1, %r24
shls,3 %r5, 0x1, %r28
shls,4 %r5, 0x2, %r5
shls,5 %r15, 0x1, %r29
}
{
shls,0 %r15, 0x2, %r15
shls,1 %r20, 0x3, %r30
shls,2 %r18, 0x3, %r31
shls,3 %r19, 0x3, %r32
shls,4 %r17, 0x1, %r33
shls,5 %r17, 0x2, %r17
}
{
shls,0 %r22, 0x3, %r34
adds,1 %r10, %r25, %r10
adds,2 %r12, %r7, %r7
adds,3 %r11, %r26, %r11
subs,4 %r27, %r18, %r12
adds,5 %r9, %r24, %r9
}
{
adds,0 %r3, %r31, %r3
subs,1 %r28, %r19, %r18
adds,2 %r5, %r32, %r5
subs,3 %r29, %r20, %r19
adds,4 %r15, %r30, %r15
subs,5 %r33, %r22, %r20
}
{
adds,0 %r17, %r34, %r17
sars,1 %r10, 0x1, %r10
adds,2 %r18, %r5, %r5
sars,3 %r7, 0x1, %r7
sars,4 %r11, 0x1, %r11
adds,5 %r19, %r15, %r15
}
{
adds,0 %r12, %r3, %r3
adds,1 %r20, %r17, %r12
sars,2 %r9, 0x1, %r9
qpswitchd,3 %xr21, %xr15
qpswitchd,4 %xr6, %xr17
adds,5 %r15, %r11, %r11
}
{
adds,0 %r5, %r10, %r5
adds,1 %r3, %r9, %r3
adds,2 %r12, %r7, %r7
sxt,3 0x2, %r11, %db[2]
addd,4 0x0, _f64,_lts0 0x3fffd70a3d70a3d7, %dr9
addd,5 0x0, _f64,_lts2 0xbff0000000000000, %dr10
}
{
sxt,0 0x2, %r5, %db[3]
sxt,1 0x2, %r3, %db[4]
sxt,2 0x2, %r7, %dr12
qppackdl,3 %dr10, %dr16, %xr7
addd,4 0x0, _f64,_lts0 0x401c000000000000, %dr10
stw,5 %db[0], 0x0, %r7
}
{
addd,0 0x0, _f64,_lts2 0x4018000000000000, %dr18
addd,2,sm 0x0, %dr12, %db[1]
qpswitchd,3 %xr7, %xr11
qppackdl,4 %dr14, _f64,_lts0 0x4020000000000000, %xr14
stw,5 %db[0], 0x4, %r11
}
{
addd,0 0x0, [ _f64,_lts0 .LC.2 ], %dr16
qppackdl,3 %dr16, %dr4, %xr4
qpswitchd,4 %xr14, %xr5
stw,5 %db[0], 0x8, %r5
}
{
addd,2,sm 0x0, [ _f64,_lts0 .LC.1 ], %db[0]
qpswitchd,3 %xr4, %xr3
stw,5 %db[0], 0xc, %r3
}
{
std,2 0x18, %dr2, %db[3]
std,5 %dr2, _f16s,_lts0lo 0x20, %db[4]
}
{
std,2 %dr2, 0x8, %dr12
std,5 0x10, %dr2, %db[2]
}
{
std,2 %dr2, 0x0, %dr13
}
{
call %ctpr1, wbs = 0x12
}
{
nop 4
disp %ctpr1, calloc
addd,0,sm 0x4, 0x0, %db[0]
addd,1 0x8, 0x0, %db[1]
}
{
call %ctpr1, wbs = 0x12
}
{
disp %ctpr1, printf
fmuld,0,sm %dr6, %dr23, %dr6
fmuld,1 %dr11, %dr10, %dr11
fmuld,2 %dr7, %dr10, %dr7
fmuld,3,sm %dr15, %dr23, %dr12
fmuld,4,sm %dr21, %dr23, %dr13
fmuld,5,sm %dr17, %dr23, %dr15
}
{
nop 2
fmuld,0 %dr3, %dr18, %dr3
fmuld,1 %dr4, %dr18, %dr4
fmuld,2 %dr5, %dr18, %dr5
fmuld,3 %dr14, %dr18, %dr10
}
{
nop 1
fdivd,5,sm %dr12, %dr9, %dr12
}
{
nop 1
fdivd,5,sm %dr13, %dr9, %dr13
}
{
nop 1
fdivd,5,sm %dr15, %dr9, %dr14
}
{
nop 7
fdivd,5,sm %dr6, %dr9, %dr6
}
{
nop 1
faddd,3,sm %dr11, %dr12, %dr9
}
{
nop 1
faddd,3,sm %dr7, %dr13, %dr12
}
{
nop 1
faddd,3,sm %dr11, %dr14, %dr11
faddd,4,sm %dr9, %dr3, %dr3
}
{
nop 1
faddd,3,sm %dr7, %dr6, %dr6
faddd,4,sm %dr12, %dr4, %dr4
}
{
nop 1
faddd,3,sm %dr11, %dr5, %dr5
}
{
nop 1
addd,0,sm 0x0, %dr3, %db[4]
faddd,3,sm %dr6, %dr10, %dr6
}
{
nop 1
addd,0,sm 0x0, %dr4, %db[3]
}
{
addd,0,sm 0x0, %dr5, %db[2]
std,5 %db[0], 0x0, %dr6
}
{
std,5 %db[0], 0x8, %dr5
}
{
addd,0,sm 0x0, %dr6, %db[1]
std,5 %db[0], _f16s,_lts0lo 0x10, %dr4
}
{
addd,0,sm 0x0, [ _f64,_lts1 .LC.2 ], %db[0]
std,5 %db[0], _f16s,_lts0lo 0x18, %dr3
}
{
std,2 %dr2, _f16s,_lts0lo 0x20, %dr3
std,5 0x18, %dr2, %dr4
}
{
std,2 0x10, %dr2, %dr5
std,5 %dr2, 0x8, %dr6
}
{
std,2 %dr2, 0x0, %dr16
}
{
call %ctpr1, wbs = 0x12
}
{
nop 5
return %ctpr3
addd,3 0x0, 0x0, %dr0
}
{
ct %ctpr3
}
.LC.1:
.ascii "%d %d %d %d\n\000"
.LC.2:
.ascii "%f %f %f %f\n\000"
elbrus_optimizing_compiler_v1.24.10_Mar__8_2020 = 0x0
Can I ask what the plug is about? Getting the word on Elbrus out there, support from the IC?
Is any of this hardware readily available outside of Russia?
Look for IcepeakITX ELBRUS-8CB.
Interesting. So they crowdfunded the board but the CPU is sourced from Russia, basically?
Oof, too rich for my blood.
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 100000000 (elements), Offset = 0 (elements)
Memory per array = 762.9 MiB (= 0.7 GiB).
Total memory required = 2288.8 MiB (= 2.2 GiB).
Each kernel will be executed 10 times.
The *best* time for each kernel (excluding the first iteration)
will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 81921 microseconds.
(= 81921 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Best Rate MB/s Avg time Min time Max time
Copy: 16674.2 0.102085 0.095957 0.142147
Scale: 16511.2 0.101129 0.096904 0.129083
Add: 19486.0 0.126514 0.123165 0.140751
Triad: 19358.4 0.124993 0.123977 0.125983
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
$ cc stream.c -O4 -DSTREAM_ARRAY_SIZE=100000000 -DSTREAM_TYPE=double -fopenmp
entityfx@yukari:~/STREAM$ ./a.out
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 100000000 (elements), Offset = 0 (elements)
Memory per array = 762.9 MiB (= 0.7 GiB).
Total memory required = 2288.8 MiB (= 2.2 GiB).
Each kernel will be executed 10 times.
The *best* time for each kernel (excluding the first iteration)
will be used to compute the reported bandwidth.
-------------------------------------------------------------
Number of Threads requested = 32
Number of Threads counted = 32
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 31778 microseconds.
(= 31778 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Best Rate MB/s Avg time Min time Max time
Copy: 33722.5 0.117056 0.047446 0.350954
Scale: 32959.0 0.133574 0.048545 0.429537
Add: 37047.9 0.193759 0.064781 0.668979
Triad: 36455.3 0.165948 0.065834 0.411430
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 100000000 (elements), Offset = 0 (elements)
Memory per array = 762.9 MiB (= 0.7 GiB).
Total memory required = 2288.8 MiB (= 2.2 GiB).
Each kernel will be executed 10 times.
The *best* time for each kernel (excluding the first iteration)
will be used to compute the reported bandwidth.
-------------------------------------------------------------
Number of Threads requested = 8
Number of Threads counted = 8
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 45269 microseconds.
(= 45269 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Best Rate MB/s Avg time Min time Max time
Copy: 23097.3 0.069884 0.069272 0.070459
Scale: 23137.4 0.069689 0.069152 0.070604
Add: 25578.7 0.094895 0.093828 0.096911
Triad: 25643.2 0.094898 0.093592 0.096150
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------