
Info MCST Elbrus cpu's benchmarks & e2k assembly code

EntityFX

Junior Member
Mar 2, 2021
16
37
46
Are these two the ones in the bottom of that list?
Description | Cores | Frequency (MHz) | Prod year | TDP (W) | Process (nm) | GFlops (double) | ISA generation | Model name | Alternative name
Cpu Elbrus | 1 | 300 | 2005 | 6 | 130 | 2.4 | (E2Kv1) | - | e3m
SoC Elbrus | 1 | 300 | 2007 | 6 | 130 | 2.4 | (E2Kv2) | - | e3s
SoC Elbrus-S | 1 | 500 | 2010 | 13 | 90 | 4 | elbrus-v2 | elbrus-s | Экскурсовод, Эльбрус-3S, Elbrus-2C1
SoC Elbrus-2C+DSP | 2 (+4 DSP cores) | 500 | 2011 | 25 | 90 | 8 | elbrus-v2 | elbrus-2c+ | Кубик, Elbrus-2C2, Elbrus-S2
SoC Elbrus-4C | 4 | 800 | 2014 | 60 | 65 | 25.6 | elbrus-v3 | elbrus-4c | e2s, Эльбрус-2S
SoC Elbrus-8C | 8 | 1300 | 2016 | 80 | 28 | 124.8 | elbrus-v4 | elbrus-8c | e8c, Процессор-1, Эльбрус-4C+
SoC Elbrus-1C+ | 1 | 1000 | 2015 | 10 | 40 | 12 | elbrus-v4 | elbrus-1c+ | e1c+, e1cp, e1c, Процессор-2
SoC Elbrus-8CB | 8 | 1500 | 2018 | 90 | 28 | 288 | elbrus-v5 | elbrus-8c2 | e8c2, p9, Процессор-9
SoC Elbrus-16C | 16 | 2000 | 2021 | 110-130 | 16 | 768 | elbrus-v6 | elbrus-16c | -
SoC Elbrus-2C3 | 2 | 2000 | 2021 | 10-15 | 16 | 96 | elbrus-v6 | elbrus-2c3 | -
SoC Elbrus-12C | 12 | 2000 | 2022 | 85-90 | 16 | 576 | elbrus-v6 | elbrus-12c | -
SoC Elbrus-32C | 32 | 2500? | 2025/2026? | ? | 7 | 1920 | elbrus-v7 | elbrus-32c? | Процессор-21(СРВ)

How to calculate FLOPS (v1 .. v3):
  • Single Precision: 4 FP ALUs * 4 single operations * Cores * Frequency
  • Double Precision: 4 FP ALUs * 2 double operations * Cores * Frequency
How to calculate FLOPS (v4):
  • Single Precision: 6 FP ALUs * 4 single operations * Cores * Frequency
  • Double Precision: 6 FP ALUs * 2 double operations * Cores * Frequency
How to calculate FLOPS (v5+ [128 bit SIMD]):
  • Single Precision: 6 FP ALUs * 4 single operations * 2 SIMD * Cores * Frequency
  • Double Precision: 6 FP ALUs * 2 double operations * 2 SIMD * Cores * Frequency
Example for Elbrus-16C: 6 ALUs * 2 DP * 2 SIMD * 16 Cores * 2e9 Hz = 7.68e11 --> 768 GFlops
 

EntityFX

Junior Member
Mar 2, 2021
16
37
46
Here are results from Roy Longbottom's MP MFLOPS benchmark. I disabled the CPUID code to get it to compile, because that code targets a different architecture.
The benchmark can be found here: http://www.roylongbottom.org.uk/GigaFLOPS Benchmarks.htm
My results are here: https://github.com/EntityFX/anybench/tree/master/results


####################################################
getDetails and MHz




Assembler CPUID and RDTSC
CPU , Features Code 00000000, Model Code 00000000

Measured - Minimum 0 MHz, Maximum 0 MHz
Linux Functions
get_nprocs() - CPUs 8, Configured CPUs 8
get_phys_pages() and size - RAM Size 30.94 GB, Page Size 4096 Bytes
uname() - Linux, sumireko, 4.19.0-0.3-e8c2
#1 SMP Sat Jan 18 09:49:15 GMT 2020, e2k


##############################################

64 Bit MP SSE MFLOPS Benchmark 1, 8 Threads, Sat Mar 21 14:13:55 2020

Test             4 Byte  Ops/  Repeat   Seconds  MFLOPS     First   All
                  Words  Word  Passes                     Results  Same

Data in & out    102400     2   20000  0.050556   81019  0.620974   Yes
Data in & out   1024000     2    2000  0.045250   90519  0.942935   Yes
Data in & out  10240000     2     200  0.446652    9170  0.994032   Yes

Data in & out    102400     8   20000  0.066665  245766  0.749971   Yes
Data in & out   1024000     8    2000  0.065821  248916  0.965360   Yes
Data in & out  10240000     8     200  0.465831   35172  0.996409   Yes

Data in & out    102400    32   20000  0.177363  369501  0.498060   Yes
Data in & out   1024000    32    2000  0.174700  375134  0.910573   Yes
Data in & out  10240000    32     200  0.464857  140981  0.990447   Yes

End of test Sat Mar 21 14:13:57 2020

Press Enter
 

RTX

Member
Nov 5, 2020
30
11
41
How to calculate FLOPS (v1 .. v3):
  • Single Precision: 4 FP ALUs * 4 single operations * Cores * Frequency
  • Double Precision: 4 FP ALUs * 2 double operations * Cores * Frequency
How to calculate FLOPS (v4):
  • Single Precision: 6 FP ALUs * 4 single operations * Cores * Frequency
  • Double Precision: 6 FP ALUs * 2 double operations * Cores * Frequency
How to calculate FLOPS (v5+ [128 bit SIMD]):
  • Single Precision: 6 FP ALUs * 4 single operations * 2 SIMD * Cores * Frequency
  • Double Precision: 6 FP ALUs * 2 double operations * 2 SIMD * Cores * Frequency
Example for Elbrus-16C: 6 ALUs * 2 DP * 2 SIMD * 16 Cores * 2e9 Hz = 7.68e11 --> 768 GFlops
The wiki says the 8SV has 576 vs the 8S (250?), and the only differences were 1500 vs 1300 MHz and DDR4-2400 vs DDR3-1600?
 

EntityFX

Junior Member
Mar 2, 2021
16
37
46
The wiki says the 8SV has 576 vs the 8S (250?), and the only differences were 1500 vs 1300 MHz and DDR4-2400 vs DDR3-1600?
8C: 1200-1300 MHz, DDR3-1600 ECC;
8CB (also known as SV): 1500 MHz, DDR4-2400 ECC;
16C: 2000 MHz, DDR4-3200 ECC.

"C" means cores, "B" means "vector extensions".

Only ECC memory is supported, because 4 of the 8 extra bits are needed to store tags; the remaining 4 bits are used for ECC.
 

EntityFX

Junior Member
Mar 2, 2021
16
37
46
So, the internal architecture looks like this:

6 ALUs, a Predicate Unit, and a register file of 256 registers (80 bit [v1 .. v4] or 128 bit [v5+], plus tags)



The diagram is not complete yet (the SRU block in the Control Unit is missing).

Yellow: address data
Brown: data
Green: predicate data
 

EntityFX

Junior Member
Mar 2, 2021
16
37
46
C source and assembly output examples

C:
#include <stdio.h>
#include <stdlib.h>  /* calloc, free */

/* Integer kernel: res[i] = a * x[i] + b * y[i] / 2 + z[i] * c.
   Returns a calloc'ed array; the caller frees it. */
int * calculate_mx4(int * x, int * y, int * z, int a, int b, int c, const int size) {
    int * res = (int *)calloc(size, sizeof(int));
    /* Compiler-specific hint: the loop is expected to run about 4 times. */
#pragma loop count(4)
    for(int i = 0; i < size; i++) {
        res[i] = a * x[i] + b * y[i] / 2 + z[i] * c;
    }
    return res;
}

/* Double-precision kernel: res[i] = a * x[i] + b * y[i] / 1.99 + z[i] * c.
   Returns a calloc'ed array; the caller frees it. */
double * calculate_fx4(double * x, double * y, double * z, double a, double b, double c, const int size) {
    double * res = (double *)calloc(size, sizeof(double));

    for(int i = 0; i < size; i++) {
        res[i] = a * x[i] + b * y[i] / 1.99 + z[i] * c;
    }
    return res;
}


int main() {
    int x[] = { 1, -1, 1, -1 };
    int y[] = { 1, 2, 3, 4 };
    int z[] = { 8, 4, 2, 1 };
    int * res = calculate_mx4(x, y, z, 7, 8, 6, 4);
    printf("%d %d %d %d\n", res[0], res[1], res[2], res[3]);
    free(res);

    double fx[] = { 1.0, -1.0, 1.0, -1.0 };
    double fy[] = { 1.0, 2.0, 3.0, 4.0 };
    double fz[] = { 8.0, 4.0, 2.0, 1.0 };
    /* size is an int, so pass 4 rather than 4.0 */
    double * fres = calculate_fx4(fx, fy, fz, 7.0, 8.0, 6.0, 4);
    printf("%f %f %f %f\n", fres[0], fres[1], fres[2], fres[3]);
    free(fres);

    return 0;
}
Code:
calculate_mx4(int*, int*, int*, int, int, int, int):
        {
          setwd wsz = 0xd, nfx = 0x1, dbl = 0x0
          setbn rsz = 0x3, rbs = 0x9, rcur = 0x0
          disp  %ctpr2, calloc
          addd,4,sm     0x0, %dr0, %dr17
        }
        {
          cmplsb,0      0x0, %r6, %pred2
          sxt,2 0x2, %r6, %db[0]
          addd,5        0x4, 0x0, %db[1]
        }
        {
          getsp,0       _f32s,_lts0 0xfffffff0, %dr8
        }
        {
          nop 1
          cmplsb,0,sm   0x2, %r6, %pred1 ? %pred2
          cmplsb,1      0x1, %r6, %pred0 ? %pred2
          addd,2,sm     0x8, 0x0, %dr14 ? %pred2
          adds,3,sm     0x2, 0x0, %r15 ? %pred2
          addd,4,sm     0x4, 0x0, %dr9 ? %pred2
          addd,5        0x0, 0x0, %dr13 ? %pred2
        }
        {
          call  %ctpr2, wbs = 0x9
        }
        {
          return        %ctpr3
          ldw,0,sm      %dr1, 0x0, %g16
          ldw,2,sm      %dr2, 0x0, %g17
          ldw,3 %dr0, 0x0, %g18 ? %pred2
          addd,4        0x0, %db[0], %dr16
        }
        {
          disp  %ctpr1, .L345
          addd,4        0x0, %dr16, %dr0 ? ~%pred2
        }
        {
          ct    %ctpr3 ? ~%pred2
        }
        {
          nop 1
          return        %ctpr3
        }
        {
          nop 5
          muls,0,sm     %r4, %g16, %g16
          muls,1,sm     %g17, %r5, %g17
          muls,3,sm     %r3, %g18, %g18
        }
        {
          getfs,0,sm    %g16, _f16s,_lts0lo 0x5f, %g19
          adds,1        0x0, %g17, %r11 ? %pred2
          adds,2        0x0, %g18, %r12 ? %pred2
        }
        {
          adds,0,sm     %g16, %g19, %g16
        }
        {
          sars,0,sm     %g16, 0x1, %r10 ? %pred2
        }
.L345:
        {
          adds,0        %r12, %r10, %g16
          addd,3        0x0, %dr16, %dr0 ? ~%pred0
          pass  %pred0, @p0
          andp  @p0, @p0, @p4
          pass  @p4, %pred2
        }
        {
          adds,0        %g16, %r11, %g16
        }
        {
          stw,2 %dr16, %dr13, %g16
          ldw,5,sm      %dr2, %dr9, %r11
        }
        {
          ct    %ctpr3 ? ~%pred2
          ldw,0,sm      %dr1, %dr9, %r8
          ldw,2,sm      %dr17, %dr9, %r10
        }
        {
          adds,0,sm     %r15, 0x1, %g16
          addd,1,sm     0x0, %dr9, %dr13
          addd,2,sm     0x0, %dr14, %dr9
          addd,3,sm     0x4, %dr14, %dr14
          pass  %pred1, @p0
          andp  @p0, @p0, @p4
          pass  @p4, %pred0
        }
        {
          cmplsb,0,sm   %g16, %r6, %pred2
          adds,1,sm     0x0, %g16, %r15
        }
        {
          pass  %pred2, @p0
          andp  @p0, @p0, @p4
          pass  @p4, %pred1
        }
        {
          muls,3        %r11, %r5, %g16
        }
        {
          nop 4
          muls,0        %r4, %r8, %g17
          muls,1        %r3, %r10, %g18
        }
        {
          adds,0,sm     0x0, %g16, %r11
        }
        {
          getfs,0       %g17, _f16s,_lts0lo 0x5f, %g16
          adds,1,sm     0x0, %g18, %r12
        }
        {
          adds,0        %g17, %g16, %g16
        }
        {
          sars,0        %g16, 0x1, %g16
        }
        {
          ct    %ctpr1
          adds,0,sm     0x0, %g16, %r10
        }
calculate_fx4(double*, double*, double*, double, double, double, int):
        {
          setwd wsz = 0x10, nfx = 0x1, dbl = 0x1
          setbn rsz = 0x3, rbs = 0xc, rcur = 0x0
          disp  %ctpr2, calloc
          getsp,0       _f32s,_lts1 0xfffffff0, %dr8
          addd,1,sm     0x0, %dr5, %dr5
          addd,2,sm     0x0, %dr3, %dr3
          addd,3,sm     0x0, %dr4, %dr4
          addd,4,sm     0x0, %dr2, %dr2
          addd,5,sm     0x0, %dr1, %dr1
        }
        {
          nop 1
          cmplsb,0,sm   0x0, %r6, %pred0
          sxt,1 0x2, %r6, %db[0]
          addd,2        0x8, 0x0, %db[1]
          addd,3,sm     0x0, %dr0, %dr10
        }
        {
          merges,0,sm   0x1, %r6, %r11, %pred0
        }
        {
          subs,0,sm     %r11, 0x1, %r8
        }
        {
          call  %ctpr2, wbs = 0xc
          cmplsb,0,sm   %r8, _f16s,_lts0lo 0x60, %pred1
        }
        {
          return        %ctpr3
          ldd,0,sm      %dr1, 0x0, %dg16
          ldd,2,sm      %dr1, 0x8, %dg17
          ldd,3,sm      %dr1, _f16s,_lts0lo 0x10, %dg18
          addd,4        0x0, %db[0], %dr6
          ldd,5,sm      %dr1, _f16s,_lts0hi 0x18, %dg19
        }
        {
          disp  %ctpr1, .L933
          ldd,0,sm      %dr1, _f16s,_lts0lo 0x28, %dr15
          merges,1,sm   %r8, _f16s,_lts0hi 0x60, %g20, ~%pred1
          ldd,2,sm      %dr1, _f16s,_lts1lo 0x20, %dg21
          cmpledb,3,sm  %dr6, %dr1, %pred1
          addd,4        0x0, %dr6, %dr0 ? ~%pred0
          ldd,5,sm      %dr1, _f16s,_lts1hi 0x30, %dr14
        }
        {
          ct    %ctpr3 ? ~%pred0
          cmpledb,0,sm  %dr6, %dr0, %pred2
          sxt,1,sm      0x2, %g20, %dg20
          ldd,2,sm      %dr1, _f16s,_lts0lo 0x38, %dr13
          subd,3,sm     %dr6, 0x8, %dg23
          subd,4,sm     %dr6, 0x8, %dg22
          ldd,5,sm      %dr1, _f16s,_lts0hi 0x40, %dr12
        }
        {
          shld,0,sm     %dg20, 0x3, %dg20
          cmpledb,1,sm  %dr6, %dr2, %pred3
        }
        {
          addd,0,sm     0x8, %dg20, %dg20
        }
        {
          addd,0,sm     %dg20, %dr1, %dg24
          addd,1,sm     %dg20, %dr0, %dg25
          addd,2,sm     %dg20, %dr2, %dg20
          fmuld,3,sm    %dr4, %dg16, %dr20
          fmuld,4,sm    %dr4, %dg17, %dr19
          fmuld,5,sm    %dr4, %dg18, %dr18
        }
        {
          cmpledb,0,sm  %dg24, %dg22, %pred4
          cmpledb,1,sm  %dg25, %dg23, %pred5
          fmuld,2,sm    %dr4, %dg19, %dr17
          addd,3,sm     0x0, _f64,_lts0 0x3fffd70a3d70a3d7, %dr0
          fmuld,4,sm    %dr4, %dg21, %dr16
        }
        {
          cmpledb,0,sm  %dg20, %dg22, %pred4
          pass  %pred4, @p0
          pass  %pred1, @p1
          landp ~@p0, ~@p1, @p4
          pass  @p4, %pred1
          pass  %pred5, @p2
          pass  %pred2, @p3
          landp ~@p2, ~@p3, @p5
          pass  @p5, %pred2
        }
        {
          pass  %pred0, @p0
          pass  %pred2, @p1
          landp @p0, ~@p1, @p4
          pass  @p4, %pred0
          pass  %pred1, @p2
          landp @p4, ~@p2, @p5
          pass  @p5, %pred1
        }
        {
          nop 2
          pass  %pred1, @p0
          pass  %pred4, @p1
          landp @p0, @p1, @p4
          pass  @p4, %pred0
          landp @p0, ~@p1, @p5
          pass  @p5, %pred1
          pass  %pred3, @p2
          landp @p5, @p2, @p6
          pass  @p6, %pred2
        }
        {
          ct    %ctpr1 ? %pred0
        }
        {
          ct    %ctpr1 ? %pred2
        }
        {
          setwd wsz = 0x35, nfx = 0x1, dbl = 0x1
          setbn rsz = 0x28, rbs = 0xc, rcur = 0x0
          disp  %ctpr1, .L571
          addd,0        0x0, 0x0, %dg16
          addd,3        0x0, _f64,_lts1 0x3fffd70a3d70a3d7, %dr0
        }
        {
          addd,0        0x0, _f64,_lts0 0x20ff2000000000, %dg17
          aaurwd,2      %dr6, %aad0
          addd,3,sm     0x0, 0x0, %db[32]
        }
        {
          insfd,0       %dg17, _f32s,_lts1 0x8800, %dr11, %dg17
          aaurwd,2      %dg16, %aasti1
          addd,3,sm     %db[32], _f16s,_lts0lo 0x10, %dg19
          addd,4,sm     0x8, %db[32], %dg18
          addd,5,sm     %db[32], _f16s,_lts0hi 0x18, %dg20
        }
        {
          ldd,0,sm      %dr10, %db[32], %db[60], mas=0x4
          addd,1,sm     %db[32], _f16s,_lts0lo 0x30, %dg22
          addd,2,sm     %db[32], _f16s,_lts1hi 0x38, %dg23
          ldd,3,sm      %dr1, %db[32], %db[79], mas=0x4
          addd,4,sm     %db[32], _f16s,_lts0hi 0x20, %dg16
          addd,5,sm     %db[32], _f16s,_lts1lo 0x28, %dg21
        }
        {
          ldd,0,sm      %dr1, %dg19, %db[75], mas=0x4
          addd,1,sm     %db[32], _f16s,_lts0lo 0x50, %dg26
          addd,2,sm     %db[32], _f16s,_lts1hi 0x58, %dg27
          ldd,3,sm      %dr1, %dg18, %db[77], mas=0x4
          addd,4,sm     %db[32], _f16s,_lts0hi 0x40, %dg24
          addd,5,sm     %db[32], _f16s,_lts1lo 0x48, %dg25
        }
        {
          ldd,0,sm      %dr1, %dg16, %db[71], mas=0x4
          addd,1,sm     %db[32], _f16s,_lts0lo 0x60, %dg28
          addd,2,sm     %db[32], _f16s,_lts0hi 0x70, %dg30
          ldd,3,sm      %dr1, %dg20, %db[73], mas=0x4
          addd,4,sm     %db[32], _f16s,_lts1lo 0x68, %dg29
          addd,5,sm     %db[32], _f16s,_lts1hi 0x78, %db[2]
        }
        {
          ldd,0,sm      %dr10, %dg18, %db[58], mas=0x4
          addd,1,sm     0x0, %dg18, %db[30]
          addd,2,sm     0x0, %dg19, %db[28]
          ldd,3,sm      %dr1, %dg21, %db[69], mas=0x4
          addd,4,sm     0x0, %dg20, %db[26]
          addd,5,sm     0x0, %dg16, %db[24]
        }
        {
          ldd,0,sm      %dr10, %dg19, %db[56], mas=0x4
          addd,1,sm     0x0, %dg21, %db[22]
          addd,2,sm     0x0, %dg22, %db[20]
          ldd,3,sm      %dr1, %dg22, %db[67], mas=0x4
          addd,4,sm     0x0, %dg23, %db[18]
          addd,5,sm     0x0, %dg24, %db[16]
        }
        {
          ldd,0,sm      %dr1, %dg24, %db[63], mas=0x4
          addd,1,sm     0x0, %dg25, %db[14]
          addd,2,sm     0x0, %dg26, %db[12]
          ldd,3,sm      %dr1, %dg23, %db[65], mas=0x4
          fmuld,4,sm    %dr4, %db[79], %dg31
          addd,5,sm     0x0, %dg27, %db[10]
        }
        {
          ldd,0,sm      %dr1, %dg25, %db[61], mas=0x4
          addd,1,sm     0x0, %dg28, %db[8]
          addd,2,sm     0x0, %dg29, %db[6]
          fmuld,3,sm    %dr4, %db[75], %dg24
          fmuld,4,sm    %dr4, %db[77], %dg23
          addd,5,sm     0x0, %dg30, %db[4]
        }
        {
          ldd,0,sm      %dr2, %db[32], %db[72], mas=0x4
          fmuld,3,sm    %dr4, %db[71], %dr8
          fmuld,4,sm    %dr4, %db[73], %dg25
        }
        {
          ldd,0,sm      %dr2, %dg18, %db[70], mas=0x4
          fmuld,4,sm    %dr4, %db[69], %dr11
        }
        {
          ldd,0,sm      %dr2, %dg19, %db[68], mas=0x4
          ldd,3,sm      %dr10, %dg20, %db[54], mas=0x4
          fmuld,4,sm    %dr4, %db[67], %dg19
          fdivd,5,sm    %dg31, %dr0, %dg18
        }
        {
          ldd,0,sm      %dr10, %dg16, %db[52], mas=0x4
          ldd,3,sm      %dr10, %dg21, %db[50], mas=0x4
          fmuld,4,sm    %dr4, %db[65], %dg21
          fmuld,5,sm    %dr4, %db[63], %dg31
        }
        {
          ldd,0,sm      %dr1, %dg27, %db[57], mas=0x4
          ldd,3,sm      %dr1, %dg26, %db[59], mas=0x4
          fmuld,4,sm    %dr4, %db[61], %dg26
          fdivd,5,sm    %dg23, %dr0, %dg23
        }
        {
          rwd,0 %dg17, %lsr
          ldd,3,sm      %dr2, %dg20, %db[66], mas=0x4
        }
        {
          ldd,0,sm      %dr2, %dg16, %db[64], mas=0x4
          ldd,3,sm      %dr10, %dg22, %db[48], mas=0x4
          fdivd,5,sm    %dg24, %dr0, %dg17
        }
        {
          ldd,0,sm      %dr1, %dg28, %db[55], mas=0x4
          ldd,3,sm      %dr1, %dg29, %db[53], mas=0x4
        }
        {
          ldd,0,sm      %dr1, %dg30, %db[51], mas=0x4
          fdivd,5,sm    %dg25, %dr0, %dg16
        }
        {
          fmuld,0,sm    %dr4, %db[59], %db[5]
          fmuld,1,sm    %dr4, %db[57], %db[3]
        }
        {
          nop 1
          fdivd,5,sm    %dr8, %dr0, %dg20
        }
        {
          nop 1
          fdivd,5,sm    %dr11, %dr0, %dg22
        }
        {
          nop 1
          fdivd,5,sm    %dg19, %dr0, %db[35]
        }
        {
          nop 1
          fmul_addd,3,sm        %dr3, %db[60], %dg18, %db[44]
          fdivd,5,sm    %dg21, %dr0, %db[33]
        }
        {
          nop 1
          fmul_addd,3,sm        %dr3, %db[58], %dg23, %db[42]
          fdivd,5,sm    %dg31, %dr0, %db[31]
        }
        {
          nop 1
          fmul_addd,3,sm        %dr3, %db[56], %dg17, %db[40]
          fdivd,5,sm    %dg26, %dr0, %db[29]
        }
        {
          nop 1
          fmul_addd,3,sm        %dr3, %db[54], %dg16, %db[38]
        }
        {
          nop 1
          fmul_addd,3,sm        %dr3, %db[52], %dg20, %db[36]
          fmul_addd,4,sm        %db[72], %dr5, %db[44], %db[80]
        }
        {
          nop 1
          fmul_addd,3,sm        %dr3, %db[50], %dg22, %db[34]
          fmul_addd,4,sm        %db[70], %dr5, %db[42], %db[78]
        }
        {
          fmul_addd,3,sm        %db[68], %dr5, %db[40], %db[76]
        }
.L571:
        {
          loop_mode
          rbranch       .L1495
          ldd,0,sm      %dr10, %db[18], %db[46], mas=0x4 ? %pcnt7
          fmuld,1,sm    %dr4, %db[55], %db[1]
          ldd,2 %dr2, %db[32], %db[72], mas=0x3 ? %pcnt0
          fdivd,5,sm    %db[5], %dr0, %db[27]
        }
.L1511:
        {
          loop_mode
          rbranch       .L1498
          ldd,0,sm      %dr2, %db[22], %db[62], mas=0x4 ? %pcnt5
          addd,1,sm     0x8, %db[2], %db[0]
          ldd,2 %dr1, %db[32], %db[79], mas=0x3 ? %pcnt0
          ldd,3,sm      %dr1, %db[2], %db[49], mas=0x4
          fmul_addd,4,sm        %db[66], %dr5, %db[38], %db[74]
          ldd,5 %dr10, %db[32], %db[60], mas=0x3 ? %pcnt0
        }
.L1508:
        {
          loop_mode
          alc   alcf=1, alct=1
          abn   abnf=1, abnt=1
          ct    %ctpr1 ? %NOT_LOOP_END
          fmul_addd,4,sm        %dr3, %db[48], %db[35], %db[32]
          staad,5       %db[80], %aad0[ %aasti1 ]
          incr,5        %aaincr0
        }
        {
          setwd wsz = 0x10, nfx = 0x1, dbl = 0x1
          setbn rsz = 0x3, rbs = 0xc, rcur = 0x0
          disp  %ctpr1, .L421
          adds,0        0x0, 0x0, %g16
          addd,1        0x0, 0x0, %dg17
        }
        {
          return        %ctpr3
          mmurw,2       %dg17, %dam_inv
        }
        {
          nop 3
          aaurw,2       %g16, %aabf0
        }
        {
          ct    %ctpr1
        }
.L933:
        {
          setwd wsz = 0x28, nfx = 0x1, dbl = 0x1
          setbn rsz = 0x1b, rbs = 0xc, rcur = 0x0
          ldisp %ctpr2, .L1133
          addd,0        0x0, _f64,_lts1 0x20492000000000, %dg17
          fmuld,1,sm    %dr4, %dr14, %dg20
          fmuld,2,sm    %dr4, %dr15, %dg21
          fmuld,3,sm    %dr4, %dr12, %dg18
          fmuld,4,sm    %dr4, %dr13, %dg19
          fdivd,5,sm    %dr16, %dr0, %dg16
        }
        {
          disp  %ctpr1, .L608
          insfd,0       %dg17, _f32s,_lts0 0x8800, %dr11, %dg17
          addd,1,sm     0x0, %dr10, %dg22
          addd,2,sm     0x0, %dr1, %dg24
          addd,3,sm     0x0, %dr2, %dg26
          addd,4,sm     0x0, 0x0, %dg23
          addd,5,sm     0x0, 0x0, %dg25
        }
        {
          return        %ctpr3
          rwd,0 %dg17, %lsr
          aaurwd,2      %dr6, %aad3
          fdivd,5,sm    %dr17, %dr0, %dg28
        }
        {
          ldd,0,sm      %dr10, _f16s,_lts0hi 0x20, %dg29
          addd,1,sm     0x0, 0x0, %dg27
          addd,2,sm     %dg26, _f16s,_lts0lo 0xa8, %dg26
          addd,3,sm     %dg22, _lit16_ref,_lts0lo 0xa8, %dg22
          addd,4        0x0, 0x0, %dg17
          addd,5,sm     %dg24, _lit16_ref,_lts0lo 0xa8, %dg24
        }
        {
          ldd,0,sm      %dr10, _f16s,_lts0hi 0x10, %dr8
          ldd,2,sm      %dr10, 0x8, %dr11
          ldd,3,sm      %dr10, _f16s,_lts0lo 0x18, %dg31
          fdivd,5,sm    %dr18, %dr0, %dg30
        }
        {
          aaurwq,2      %qg26, %aad0
        }
        {
          ldd,0,sm      %dr10, 0x0, %dg17
          aaurwd,2      %dg17, %aasti1
          fdivd,5,sm    %dr19, %dr0, %dg26
        }
        {
          aaurwq,2      %qg22, %aad2
        }
        {
          ldd,0,sm      %dr1, _f16s,_lts0lo 0x58, %dg27
          ldd,2,sm      %dr1, _f16s,_lts1lo 0x50, %dr12
          ldd,3,sm      %dr1, _f16s,_lts0hi 0x60, %dg23
          fdivd,5,sm    %dr20, %dr0, %dg22
        }
        {
          aaurwq,2      %qg24, %aad1
        }
        {
          bap
          ldd,0,sm      %dr1, _f16s,_lts0lo 0x78, %dg25
          ldd,2,sm      %dr1, _f16s,_lts1lo 0x70, %dr13
          ldd,3,sm      %dr1, _f16s,_lts0hi 0x48, %dg24
          fdivd,5,sm    %dg18, %dr0, %dg18
        }
        {
          ldd,0,sm      %dr2, _f16s,_lts0lo 0x20, %dr15
          ldd,2,sm      %dr2, _f16s,_lts1lo 0x18, %dr16
          ldd,3,sm      %dr1, _f16s,_lts0hi 0x68, %dr14
        }
        {
          ldd,0,sm      %dr2, 0x8, %dr18
          ldd,2,sm      %dr2, 0x0, %dr19
          ldd,3,sm      %dr2, _f16s,_lts0lo 0x10, %dr17
          fdivd,5,sm    %dg19, %dr0, %dg19
        }
        {
          ldd,0,sm      %dr10, _f16s,_lts0lo 0x38, %dr21
          fmuld,1,sm    %dr4, %dr12, %dr12
          ldd,2,sm      %dr10, _f16s,_lts1lo 0x30, %dr22
          ldd,3,sm      %dr10, _f16s,_lts0hi 0x40, %dr20
          fmuld,4,sm    %dr4, %dg23, %dg23
          fmuld,5,sm    %dr4, %dg27, %dg27
        }
        {
          ldd,0,sm      %dr1, _f16s,_lts0lo 0x80, %dr23
          ldd,2,sm      %dr1, _f16s,_lts1lo 0xa0, %db[50]
          ldd,3,sm      %dr10, _f16s,_lts0hi 0x28, %dg29
          fmul_addd,4,sm        %dr3, %dg29, %dg16, %dg16
          fdivd,5,sm    %dg20, %dr0, %dg20
        }
        {
          ldd,0,sm      %dr10, _f16s,_lts0lo 0xa0, %db[3]
          fmuld,1,sm    %dr4, %dr13, %dr13
          ldd,2,sm      %dr2, _lit16_ref,_lts0lo 0xa0, %db[2]
          ldd,3,sm      %dr1, _f16s,_lts0hi 0x98, %db[52]
          fmuld,4,sm    %dr4, %dg24, %dg24
          fmuld,5,sm    %dr4, %dg25, %dg25
        }
        {
          ldd,0,sm      %dr10, _f16s,_lts0lo 0x98, %db[5]
          fmuld,1,sm    %dr4, %dr14, %dg31
          ldd,2,sm      %dr2, _lit16_ref,_lts0lo 0x98, %db[4]
          ldd,3,sm      %dr1, _f16s,_lts0hi 0x90, %db[54]
          fmul_addd,4,sm        %dr3, %dg31, %dg28, %dg28
          fdivd,5,sm    %dg21, %dr0, %dg21
        }
        {
          ldd,0,sm      %dr10, _f16s,_lts0lo 0x90, %db[7]
          ldd,2,sm      %dr2, _lit16_ref,_lts0lo 0x90, %db[6]
          ldd,3,sm      %dr10, _f16s,_lts0hi 0x88, %db[9]
        }
        {
          ldd,0,sm      %dr2, _f16s,_lts0lo 0x88, %db[8]
          ldd,2,sm      %dr10, _f16s,_lts0hi 0x80, %db[11]
          ldd,3,sm      %dr2, _lit16_ref,_lts0hi 0x80, %db[10]
          fmul_addd,4,sm        %dr3, %dr8, %dg30, %dg30
          ldd,5,sm      %dr10, _f16s,_lts1lo 0x78, %db[13]
        }
        {
          ldd,0,sm      %dr2, _f16s,_lts0lo 0x78, %db[12]
          fmuld,1,sm    %dr4, %dr23, %db[51]
          ldd,2,sm      %dr10, _f16s,_lts0hi 0x70, %db[15]
          ldd,3,sm      %dr2, _lit16_ref,_lts0hi 0x70, %db[14]
          fdivd,5,sm    %dg23, %dr0, %db[42]
        }
        {
          ldd,0,sm      %dr10, _f16s,_lts0lo 0x68, %db[17]
          ldd,2,sm      %dr2, _lit16_ref,_lts0lo 0x68, %db[16]
          ldd,3,sm      %dr10, _f16s,_lts0hi 0x60, %db[19]
          fmul_addd,4,sm        %dr3, %dr11, %dg26, %dg23
          ldd,5,sm      %dr2, _lit16_ref,_lts0hi 0x60, %db[18]
        }
        {
          ldd,0,sm      %dr10, _f16s,_lts0lo 0x58, %db[21]
          ldd,2,sm      %dr2, _lit16_ref,_lts0lo 0x58, %db[20]
          ldd,3,sm      %dr10, _f16s,_lts0hi 0x50, %db[23]
          fdivd,5,sm    %dg27, %dr0, %db[44]
        }
        {
          ldd,0,sm      %dr2, _f16s,_lts0lo 0x50, %db[22]
          ldd,2,sm      %dr10, _f16s,_lts0hi 0x48, %db[25]
          ldd,3,sm      %dr2, _lit16_ref,_lts0hi 0x48, %db[24]
          fmul_addd,4,sm        %dr3, %dg17, %dg22, %dg17
          ldd,5,sm      %dr2, _f16s,_lts1lo 0x40, %db[26]
        }
        {
          ldd,0,sm      %dr2, _f16s,_lts0lo 0x38, %db[28]
          ldd,2,sm      %dr2, _f16s,_lts0hi 0x30, %db[30]
          ldd,3,sm      %dr2, _f16s,_lts1lo 0x28, %db[32]
          fmul_addd,4,sm        %dr15, %dr5, %dg16, %db[27]
          fdivd,5,sm    %dr12, %dr0, %db[46]
        }
        {
          ldd,0,sm      %dr1, _f16s,_lts0lo 0x88, %dg16
          fmul_addd,3,sm        %dr3, %dr20, %dg18, %db[39]
          fmul_addd,4,sm        %dr16, %dr5, %dg28, %db[29]
        }
        {
          fdivd,5,sm    %dg24, %dr0, %db[48]
        }
        {
          fmul_addd,3,sm        %dr17, %dr5, %dg30, %db[31]
          fmul_addd,4,sm        %dr3, %dr21, %dg19, %db[41]
        }
        {
          fdivd,5,sm    %dg25, %dr0, %db[36]
        }
        {
          fmul_addd,3,sm        %dr18, %dr5, %dg23, %db[33]
          fmul_addd,4,sm        %dr3, %dr22, %dg20, %db[43]
        }
        {
          fmuld,0,sm    %dr4, %dg16, %db[49]
          fdivd,5,sm    %dr13, %dr0, %db[38]
        }
        {
          fmul_addd,3,sm        %dr19, %dr5, %dg17, %db[35]
          fmul_addd,4,sm        %dr3, %dg29, %dg21, %db[45]
        }
        {
          nop 7
          fdivd,5,sm    %dg31, %dr0, %db[40]
        }
.L608:
        {
          loop_mode
          fmul_addd,3,sm        %dr3, %db[25], %db[48], %db[37]
          fmuld,4,sm    %dr4, %db[54], %db[47]
          fdivd,5,sm    %db[51], %dr0, %db[34]
          movad,0       area=0, ind=0, am=1, be=0, %db[0]
          movad,1       area=1, ind=0, am=1, be=0, %db[1]
        }
        {
          loop_mode
          alc   alcf=1, alct=1
          abn   abnf=1, abnt=1
          ct    %ctpr1 ? %NOT_LOOP_END
          staad,2       %db[35], %aad3[ %aasti1 ]
          incr,2        %aaincr0
          fmul_addd,3,sm        %db[32], %dr5, %db[45], %db[25]
          movad,3       area=0, ind=0, am=1, be=0, %db[48]
        }
        {
          setwd wsz = 0x10, nfx = 0x1, dbl = 0x1
          setbn rsz = 0x3, rbs = 0xc, rcur = 0x0
          adds,0        0x0, 0x0, %g16
        }
        {
          disp  %ctpr2, disp=0x0
          aaurw,2       %g16, %aabf0
        }
.L421:
        {
          ct    %ctpr3
          addd,3        0x0, %dr6, %dr0
        }

.L1133:
        {
          fapb  ct=0, dcd=0, fmt=4, mrng=8, d=0, incr=0, ind=0, asz=4, abs=0, disp=0
          fapb  dpl=0, dcd=0, fmt=4, mrng=8, d=1, incr=0, ind=0, asz=5, abs=0, disp=0
        }
        {
          fapb  ct=1, dcd=0, fmt=4, mrng=8, d=2, incr=0, ind=0, asz=4, abs=16, disp=0
        }

.L1495:
        {
          nop 3
        }
        {
          nop 7
          fmul_addd,0,sm        %db[72], %dr5, %db[44], %db[80]
        }
        {
          ibranch       .L1511
        }
.L1498:
        {
          nop 3
        }
        {
          nop 3
          fmuld,3,sm    %dr4, %db[79], %db[25]
        }
        {
          nop 7
          fdivd,5,sm    %db[25], %dr0, %db[47]
        }
        {
          nop 5
        }
        {
          nop 7
          fmul_addd,3,sm        %dr3, %db[60], %db[47], %db[44]
        }
        {
          nop 7
          fmul_addd,3,sm        %db[72], %dr5, %db[44], %db[80]
        }
        {
          nop 1
        }
        {
          ibranch       .L1508
        }
main:
        {
          setwd wsz = 0x16, nfx = 0x0, dbl = 0x0
          setbn rsz = 0x3, rbs = 0x12, rcur = 0x0
          disp  %ctpr1, calloc
          getsp,0       _f32s,_lts1 0xffffff40, %dr2
          addd,1        0x0, _f64,_lts2 0x400000003, %dr3
          scrd,3        0x1, 0x2, %dr4
        }
        {
          qppackdl,0    %dr3, _f64,_lts2 0x200000001, %xr3
          addd,1,sm     0x4, 0x0, %db[1]
          addd,2        0x0, _f64,_lts0 0xffffffff00000001, %dr5
          addd,3,sm     0x4, 0x0, %db[0]
          addd,4        0x0, %dr4, %dr6
        }
        {
          addd,0        %dr2, _f64,_lts0 0xc0, %dr1
          addd,1        0x0, _f64,_lts2 0x100000002, %dr7
          addd,2        0x0, %dr5, %dr9
        }
        {
          qppackdl,0    %dr9, %dr5, %xr5
          qppackdl,1    %dr7, _f64,_lts1 0x400000008, %xr3
          stqp,2        %dr1, _f16s,_lts0lo 0xffe0, %xr3
          adds,3        0x0, _f16s,_lts0hi 0x705c, %r7
        }
        {
          ldw,0 %dr1, _f16s,_lts0lo 0xffec, %r9
          addd,1        0x0, [ _f64,_lts2 .LC.1 ], %dr13
          ldw,2 %dr1, _f16s,_lts0hi 0xffe4, %r11
          ldw,3 %dr1, _f16s,_lts1lo 0xffe8, %r10
          ldw,5 %dr1, _f16s,_lts1hi 0xffe0, %r12
        }
        {
          call  %ctpr1, wbs = 0x12
          stqp,2        %dr1, _f16s,_lts0lo 0xffd0, %xr3
          addd,4        0x0, _f64,_lts1 0x4010000000000000, %dr14
          ldw,5 %dr1, _f16s,_lts0hi 0xffdc, %r3
        }
        {
          disp  %ctpr1, printf
          ldw,0 %dr1, _f16s,_lts0lo 0xffd8, %r5
          stqp,2        %dr1, _f16s,_lts0hi 0xfff0, %xr5
          addd,3        0x0, _f64,_lts2 0x3ff0000000000000, %dr16
          ldw,5 %dr1, _f16s,_lts1lo 0xffd4, %r15
        }
        {
          ldw,0 %dr1, _f16s,_lts0lo 0xfff8, %r19
          ldw,2 %dr1, _f16s,_lts0hi 0xffd0, %r17
          ldw,3 %dr1, _f16s,_lts1lo 0xfff4, %r20
          qppackdl,4    %dr14, _f64,_lts2 0x4008000000000000, %xr21
          ldw,5 %dr1, _f16s,_lts1hi 0xfffc, %r18
        }
        {
          addd,1        0x0, _f64,_lts1 0x4020000000000000, %dr23
          ldw,2 %dr1, _f16s,_lts0lo 0xfff0, %r22
          qppackdl,4    %dr6, %dr16, %xr6
        }
        {
          getfs,0       %r9, %r7, %r24
          getfs,1       %r10, %r7, %r25
          getfs,3       %r11, %r7, %r26
          getfs,4       %r12, %r7, %r7
          shls,5        %r9, 0x3, %r9
        }
        {
          shls,0        %r10, 0x3, %r10
          ands,1        %r25, 0x1, %r25
          shls,2        %r11, 0x3, %r11
          ands,3        %r26, 0x1, %r26
          shls,4        %r12, 0x3, %r12
          ands,5        %r7, 0x1, %r7
        }
        {
          shls,0        %r3, 0x1, %r27
          shls,1        %r3, 0x2, %r3
          ands,2        %r24, 0x1, %r24
          shls,3        %r5, 0x1, %r28
          shls,4        %r5, 0x2, %r5
          shls,5        %r15, 0x1, %r29
        }
        {
          shls,0        %r15, 0x2, %r15
          shls,1        %r20, 0x3, %r30
          shls,2        %r18, 0x3, %r31
          shls,3        %r19, 0x3, %r32
          shls,4        %r17, 0x1, %r33
          shls,5        %r17, 0x2, %r17
        }
        {
          shls,0        %r22, 0x3, %r34
          adds,1        %r10, %r25, %r10
          adds,2        %r12, %r7, %r7
          adds,3        %r11, %r26, %r11
          subs,4        %r27, %r18, %r12
          adds,5        %r9, %r24, %r9
        }
        {
          adds,0        %r3, %r31, %r3
          subs,1        %r28, %r19, %r18
          adds,2        %r5, %r32, %r5
          subs,3        %r29, %r20, %r19
          adds,4        %r15, %r30, %r15
          subs,5        %r33, %r22, %r20
        }
        {
          adds,0        %r17, %r34, %r17
          sars,1        %r10, 0x1, %r10
          adds,2        %r18, %r5, %r5
          sars,3        %r7, 0x1, %r7
          sars,4        %r11, 0x1, %r11
          adds,5        %r19, %r15, %r15
        }
        {
          adds,0        %r12, %r3, %r3
          adds,1        %r20, %r17, %r12
          sars,2        %r9, 0x1, %r9
          qpswitchd,3   %xr21, %xr15
          qpswitchd,4   %xr6, %xr17
          adds,5        %r15, %r11, %r11
        }
        {
          adds,0        %r5, %r10, %r5
          adds,1        %r3, %r9, %r3
          adds,2        %r12, %r7, %r7
          sxt,3 0x2, %r11, %db[2]
          addd,4        0x0, _f64,_lts0 0x3fffd70a3d70a3d7, %dr9
          addd,5        0x0, _f64,_lts2 0xbff0000000000000, %dr10
        }
        {
          sxt,0 0x2, %r5, %db[3]
          sxt,1 0x2, %r3, %db[4]
          sxt,2 0x2, %r7, %dr12
          qppackdl,3    %dr10, %dr16, %xr7
          addd,4        0x0, _f64,_lts0 0x401c000000000000, %dr10
          stw,5 %db[0], 0x0, %r7
        }
        {
          addd,0        0x0, _f64,_lts2 0x4018000000000000, %dr18
          addd,2,sm     0x0, %dr12, %db[1]
          qpswitchd,3   %xr7, %xr11
          qppackdl,4    %dr14, _f64,_lts0 0x4020000000000000, %xr14
          stw,5 %db[0], 0x4, %r11
        }
        {
          addd,0        0x0, [ _f64,_lts0 .LC.2 ], %dr16
          qppackdl,3    %dr16, %dr4, %xr4
          qpswitchd,4   %xr14, %xr5
          stw,5 %db[0], 0x8, %r5
        }
        {
          addd,2,sm     0x0, [ _f64,_lts0 .LC.1 ], %db[0]
          qpswitchd,3   %xr4, %xr3
          stw,5 %db[0], 0xc, %r3
        }
        {
          std,2 0x18, %dr2, %db[3]
          std,5 %dr2, _f16s,_lts0lo 0x20, %db[4]
        }
        {
          std,2 %dr2, 0x8, %dr12
          std,5 0x10, %dr2, %db[2]
        }
        {
          std,2 %dr2, 0x0, %dr13
        }
        {
          call  %ctpr1, wbs = 0x12
        }
        {
          nop 4
          disp  %ctpr1, calloc
          addd,0,sm     0x4, 0x0, %db[0]
          addd,1        0x8, 0x0, %db[1]
        }
        {
          call  %ctpr1, wbs = 0x12
        }
        {
          disp  %ctpr1, printf
          fmuld,0,sm    %dr6, %dr23, %dr6
          fmuld,1       %dr11, %dr10, %dr11
          fmuld,2       %dr7, %dr10, %dr7
          fmuld,3,sm    %dr15, %dr23, %dr12
          fmuld,4,sm    %dr21, %dr23, %dr13
          fmuld,5,sm    %dr17, %dr23, %dr15
        }
        {
          nop 2
          fmuld,0       %dr3, %dr18, %dr3
          fmuld,1       %dr4, %dr18, %dr4
          fmuld,2       %dr5, %dr18, %dr5
          fmuld,3       %dr14, %dr18, %dr10
        }
        {
          nop 1
          fdivd,5,sm    %dr12, %dr9, %dr12
        }
        {
          nop 1
          fdivd,5,sm    %dr13, %dr9, %dr13
        }
        {
          nop 1
          fdivd,5,sm    %dr15, %dr9, %dr14
        }
        {
          nop 7
          fdivd,5,sm    %dr6, %dr9, %dr6
        }
        {
          nop 1
          faddd,3,sm    %dr11, %dr12, %dr9
        }
        {
          nop 1
          faddd,3,sm    %dr7, %dr13, %dr12
        }
        {
          nop 1
          faddd,3,sm    %dr11, %dr14, %dr11
          faddd,4,sm    %dr9, %dr3, %dr3
        }
        {
          nop 1
          faddd,3,sm    %dr7, %dr6, %dr6
          faddd,4,sm    %dr12, %dr4, %dr4
        }
        {
          nop 1
          faddd,3,sm    %dr11, %dr5, %dr5
        }
        {
          nop 1
          addd,0,sm     0x0, %dr3, %db[4]
          faddd,3,sm    %dr6, %dr10, %dr6
        }
        {
          nop 1
          addd,0,sm     0x0, %dr4, %db[3]
        }
        {
          addd,0,sm     0x0, %dr5, %db[2]
          std,5 %db[0], 0x0, %dr6
        }
        {
          std,5 %db[0], 0x8, %dr5
        }
        {
          addd,0,sm     0x0, %dr6, %db[1]
          std,5 %db[0], _f16s,_lts0lo 0x10, %dr4
        }
        {
          addd,0,sm     0x0, [ _f64,_lts1 .LC.2 ], %db[0]
          std,5 %db[0], _f16s,_lts0lo 0x18, %dr3
        }
        {
          std,2 %dr2, _f16s,_lts0lo 0x20, %dr3
          std,5 0x18, %dr2, %dr4
        }
        {
          std,2 0x10, %dr2, %dr5
          std,5 %dr2, 0x8, %dr6
        }
        {
          std,2 %dr2, 0x0, %dr16
        }
        {
          call  %ctpr1, wbs = 0x12
        }
        {
          nop 5
          return        %ctpr3
          addd,3        0x0, 0x0, %dr0
        }
        {
          ct    %ctpr3
        }
.LC.1:
        .ascii  "%d %d %d %d\n\000"
.LC.2:
        .ascii  "%f %f %f %f\n\000"

        elbrus_optimizing_compiler_v1.24.10_Mar__8_2020 = 0x0
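
A note for anyone reading the listing: the `_f64,_ltsN 0x...` operands are raw 64-bit literals. Since they are moved into registers that later feed `fmuld`/`fdivd`, they are most likely ordinary IEEE-754 doubles. Below is a minimal Python sketch (not part of the original post; the helper name `bits_to_double` is made up for illustration) that decodes the literals copied from the listing, assuming they really are binary64 values:

```python
import struct


def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit integer as an IEEE-754 binary64 value."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


# 64-bit literals copied verbatim from the assembly listing above.
LITERALS = [
    0x3FF0000000000000,
    0x4008000000000000,
    0x4010000000000000,
    0x4018000000000000,
    0x401C000000000000,
    0x4020000000000000,
    0xBFF0000000000000,
    0x3FFFD70A3D70A3D7,
]

# Map each literal to its decoded floating-point value.
DECODED = {bits: bits_to_double(bits) for bits in LITERALS}

if __name__ == "__main__":
    for bits, value in DECODED.items():
        print(f"{bits:#018x} -> {value!r}")
```

Under that assumption the literals decode to small constants (1.0, 3.0, 4.0, 6.0, 7.0, 8.0, -1.0 and roughly 1.99), which makes the `fmuld`/`fdivd` bundles easier to follow.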
 

EntityFX

Junior Member
Mar 2, 2021
16
37
46
Can I ask what the plug is about? Getting the word on Elbrus out there, support from the IC? :).
Hi. Yes, we're an enthusiast team: we research the e2k architecture and its assembly language, and we try to port existing open-source software to it (mostly C/C++ projects, though Python, JS, PHP, Java, and C# work fine too). Maybe someone will be interested.
Anyone can start here: http://ftp.altlinux.org/pub/people/mike/elbrus/docs/elbrus_prog/html/index.html (the book is in Russian, but you can use any online translator from Russian to English).
 

DrMrLordX

Lifer
Apr 27, 2000
17,034
6,005
136
I do not think I would be ordering 10k of any CPU. Personally I am still looking for a cheap A76 (or better yet A77 or A78!) SBC to play with, and being able to snap up a cheap Elbrus SBC would be kinda fun as well, even though I know the hardware itself is aimed squarely at the Russian market (as is supported software). $2k would be out of the question for a little hobby SBC though. And a group buy wouldn't be happening. Not at that scale.
 

EntityFX

Junior Member
Mar 2, 2021
16
37
46
STREAM memory benchmark.

Elbrus 8C 1200 MHz (4 CPU server board, total 32 cores, DDR3-1600), 1 thread:
Code:
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 100000000 (elements), Offset = 0 (elements)
Memory per array = 762.9 MiB (= 0.7 GiB).
Total memory required = 2288.8 MiB (= 2.2 GiB).
Each kernel will be executed 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 81921 microseconds.
   (= 81921 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           16674.2     0.102085     0.095957     0.142147
Scale:          16511.2     0.101129     0.096904     0.129083
Add:            19486.0     0.126514     0.123165     0.140751
Triad:          19358.4     0.124993     0.123977     0.125983
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
Elbrus 8C 1200 MHz (4 CPU server board, total 32 cores, DDR3-1600), 32 threads:

Code:
$ cc stream.c -O4 -DSTREAM_ARRAY_SIZE=100000000 -DSTREAM_TYPE=double -fopenmp
entityfx@yukari:~/STREAM$ ./a.out
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 100000000 (elements), Offset = 0 (elements)
Memory per array = 762.9 MiB (= 0.7 GiB).
Total memory required = 2288.8 MiB (= 2.2 GiB).
Each kernel will be executed 10 times.
The *best* time for each kernel (excluding the first iteration)
will be used to compute the reported bandwidth.
-------------------------------------------------------------
Number of Threads requested = 32
Number of Threads counted = 32
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 31778 microseconds.
   (= 31778 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           33722.5     0.117056     0.047446     0.350954
Scale:          32959.0     0.133574     0.048545     0.429537
Add:            37047.9     0.193759     0.064781     0.668979
Triad:          36455.3     0.165948     0.065834     0.411430
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays

Elbrus 8CB 1550 MHz, 8 cores (1 CPU, DDR4-2400), 8 threads:
Code:
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 100000000 (elements), Offset = 0 (elements)
Memory per array = 762.9 MiB (= 0.7 GiB).
Total memory required = 2288.8 MiB (= 2.2 GiB).
Each kernel will be executed 10 times.
The *best* time for each kernel (excluding the first iteration)
will be used to compute the reported bandwidth.
-------------------------------------------------------------
Number of Threads requested = 8
Number of Threads counted = 8
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 45269 microseconds.
   (= 45269 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           23097.3     0.069884     0.069272     0.070459
Scale:          23137.4     0.069689     0.069152     0.070604
Add:            25578.7     0.094895     0.093828     0.096911
Triad:          25643.2     0.094898     0.093592     0.096150
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
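
To sanity-check the tables, here is how the "Best Rate" column follows from the "Min time" column. STREAM counts Copy and Scale as touching two arrays per iteration and Add and Triad as touching three, then divides the bytes moved by the best time. The Python sketch below is only an illustration (the names `stream_rate_mb_s` and `RUNS` are made up); the array size, element size, and min times are copied from the outputs above:

```python
ARRAY_SIZE = 100_000_000  # elements, from "Array size" in the output
ELEM_BYTES = 8            # "This system uses 8 bytes per array element"

# Arrays read plus written per iteration, as STREAM counts them.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def stream_rate_mb_s(kernel: str, min_time_s: float) -> float:
    """Best rate in MB/s (1 MB = 1e6 bytes) from the kernel's min time."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * ELEM_BYTES * ARRAY_SIZE
    return 1.0e-6 * bytes_moved / min_time_s


# "Min time" columns copied from the three runs above.
RUNS = {
    "Elbrus 8C x4, 1 thread": {
        "Copy": 0.095957, "Scale": 0.096904, "Add": 0.123165, "Triad": 0.123977,
    },
    "Elbrus 8C x4, 32 threads": {
        "Copy": 0.047446, "Scale": 0.048545, "Add": 0.064781, "Triad": 0.065834,
    },
    "Elbrus 8CB, 8 threads": {
        "Copy": 0.069272, "Scale": 0.069152, "Add": 0.093828, "Triad": 0.093592,
    },
}

if __name__ == "__main__":
    for run, times in RUNS.items():
        print(run)
        for kernel, t in times.items():
            print(f"  {kernel:<6} {stream_rate_mb_s(kernel, t):10.1f} MB/s")
```

The recomputed rates agree with the reported ones to within rounding of the six-digit min times. Going by the reported numbers, the 32-thread run on the 4-CPU board delivers roughly twice the Copy bandwidth of the 1-thread run (33722.5 vs 16674.2 MB/s).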
 