Povray on ARM

Discussion in 'CPUs and Overclocking' started by jhu, Jun 12, 2012.

  1. jhu

    jhu Lifer

    Joined:
    Oct 10, 1999
    Messages:
    11,878
    Likes Received:
    0
    Finally figured out how to compile 3.7 on ARM!

    Povray 3.7
    Code:
    FX 8350 (4m/8t, 4.1 GHz turbo):        1985.83 pps ; 121.09 pps/module/GHz
    Core i5 3317U (Ivy Bridge)
      (2c/4t, 1.7 GHz, 2.4 GHz turbo):      573.68 pps ; 119.52 pps/core/GHz
    Core i7 2600 (Sandy Bridge)
      (4c/8t, 3.4 GHz, 3.5 GHz turbo):     1618.48 pps ; 115.61 pps/core/GHz
    Core i5 4570 (Haswell)
      (4c/4t, 3.2 GHz, 3.4 GHz turbo):     1540.24 pps ; 113.25 pps/core/GHz
    Core i5 660 (Nehalem)
      (2c/4t, 3.33 GHz, 3.46 GHz turbo):    694.34 pps ; 100.34 pps/core/GHz
    Core i5 3317U (Ivy Bridge)
      (2c/2t, 1.7 GHz, 2.4 GHz turbo):      474.19 pps ;  98.79 pps/core/GHz
    Core i5 2400S (Sandy Bridge)
      (4c/4t, 2.5 GHz, 2.6 GHz turbo):      991.64 pps ;  95.35 pps/core/GHz
    Core i7 2600 (Sandy Bridge)
      (4c/4t, 3.4 GHz, 3.5 GHz turbo):     1318.28 pps ;  94.16 pps/core/GHz
    Core i5 660 (Nehalem)
      (2c/2t, 3.33 GHz, 3.46 GHz turbo):    561.51 pps ;  81.14 pps/core/GHz
    Core 2 Duo E8400 (2c/2t 3.0 GHz):       462.46 pps ;  77.08 pps/core/GHz
    Phenom II x6, 1090T 
      (6c/6t, 3.2 GHz):                    1388.11 pps ;  72.30 pps/core/GHz
    FX 8350 (4m/1t, 4.2 GHz turbo):         292.03 pps ;  69.53 pps/module/GHz 
    AMD E-450 (2c/2t, 1.6 GHz):             159.86 pps ;  49.95 pps/core/GHz
    Exynos 5250 (1 thread, 1.7 GHz):         83.95 pps ;  49.38 pps/GHz
    Exynos 5250 (2c/2t, 1.7 GHz):           160.62 pps ;  47.24 pps/core/GHz
    Snapdragon 801, MSM8974AB
      (1t, 0.96 GHz):                        35.34 pps ;  36.81 pps/GHz
    PowerPC 970MP 
      (G5, 4c/4t, 2.5 GHz):                 357.72 pps ;  35.77 pps/core/GHz
    PowerPC 7400 (G4, 0.47 GHz):             16.20 pps ;  34.71 pps/core/GHz
    Pentium 4 HT (1c/2t, 3.2 GHz):          105.92 pps ;  33.10 pps/core/GHz
    Pentium 4m (1c/1t, 1.5 GHz):             48.99 pps ;  32.66 pps/core/GHz
    S4 Pro APQ8064 (1t, 1.026 GHz):          31.58 pps ;  30.78 pps/GHz
    Pentium 4 HT (1c/1t, 3.2 GHz):           86.04 pps ;  26.89 pps/core/GHz
    Atom N270 (1c/2t, 1.6 GHz):              42.05 pps ;  26.28 pps/core/GHz
    OMAP4430 (2c/2t, 1.0 GHz):               50.73 pps ;  25.37 pps/core/GHz
    OMAP4470 (2c/2t, 1.5 GHz):               72.88 pps ;  24.29 pps/core/GHz
    Exynos 4210 (2c/2t, 1.2 GHz):            48.15 pps ;  20.06 pps/core/GHz
    S4 Pro APQ8064 (4c/4t, 1.5 GHz):        110.76 pps ;  ????? pps/core/GHz
    Snapdragon 801, MSM8974AB 
      (4c/4t, 2.5 GHz):                     177.66 pps ;  ????? pps/core/GHz
    
    The systems:

    FX8350 - FreeBSD 10, gcc 4.8 / gcc 4.9
    Core i5 4570 - Ubuntu 14.04, icc 14.0.3
    Core i5 3317U - Ubuntu 14.04, icc 14.0.3
    Core i5 2400S - Ubuntu 14.04, icc 14.0.3
    Phenom IIx6 1090T - Ubuntu 14.04, gcc 4.8
    AMD E-450 - Ubuntu 12.04, gcc 4.6
    Pentium 4 HT - Debian 7, gcc 4.7
    Pentium 4m - Debian 7, gcc 4.7
    Atom N270 - Ubuntu 12.04, icc 12

    PowerPC 7400 - Debian 7, gcc 4.6

    Exynos 5250 (Chromebook) - Ubuntu 14.04, gcc 4.8
    Snapdragon 801 MSM8974AB (HTC One M8) - Android 4.4, gcc 4.8
    S4 Pro APQ8064 (Nexus 4) - Android 4.4, gcc 4.6
    OMAP 4430/4470 (Nook Tablet, Nook HD+) - Android 4.3, gcc 4.6
     
    #1 jhu, Jun 12, 2012
    Last edited: Sep 8, 2014
  2. Soulkeeper

    Soulkeeper Diamond Member

    Joined:
    Nov 23, 2001
    Messages:
    6,261
    Likes Received:
    0
    interesting results, thanks
     
  3. soccerballtux

    Joined:
    Dec 30, 2004
    Messages:
    12,560
    Likes Received:
    0
    If you wanted to calculate these as a normalized clock-for-clock score for the lazy and brain-dead among us, that would be appreciated. This is a great thread.
     
    #3 soccerballtux, Jun 13, 2012
    Last edited: Jun 14, 2012
  4. TuxDave

    TuxDave Lifer

    Joined:
    Oct 8, 2002
    Messages:
    10,576
    Likes Received:
    0
  5. jhu

    Next I'll benchmark my Nook Color
     
  6. jhu

    Updated with pps and pps/GHz. Looks like Exynos is faster than first generation Atom per GHz. I wonder how the new Medfield performs?
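    The normalization behind the pps/GHz column can be sketched as below. This is a minimal sketch, not the script the table was built with: the function name `pps_per_core_ghz` and the `RESULTS` dictionary are made up for illustration, and the inputs are copied from the table in post #1. For chips with turbo, the table appears to divide by the turbo clock (for example 1540.24 / (4 x 3.4) = 113.25 for the Core i5 4570).

```python
def pps_per_core_ghz(pps: float, cores: int, ghz: float) -> float:
    """Normalize a POV-Ray score (pixels per second) by core count and clock."""
    if cores <= 0 or ghz <= 0:
        raise ValueError("cores and ghz must be positive")
    return pps / (cores * ghz)


# (pps, cores, clock in GHz), copied from the table in post #1.
RESULTS = {
    "Core 2 Duo E8400": (462.46, 2, 3.0),
    "Exynos 5250": (160.62, 2, 1.7),
    "Atom N270": (42.05, 1, 1.6),
}

for name, (pps, cores, ghz) in RESULTS.items():
    score = pps_per_core_ghz(pps, cores, ghz)
    print(f"{name:18s} {score:6.2f} pps/core/GHz")
```

    Dividing by cores as well as clock keeps multi-core and single-thread runs on the same scale, which is why the Exynos 5250 lands ahead of the Atom N270 here.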
     
  7. soccerballtux

    This is a cool thread.
    It'll be interesting to see how the hard-float ABI and NEON affect things.
    Also, the A15 is out now, which is ~25% faster clock for clock.
    Pretty soon we won't need Intel...
     
  8. Arkadrel

    Arkadrel Diamond Member

    Joined:
    Oct 19, 2010
    Messages:
    3,683
    Likes Received:
    0

    ^ neat :)
    Also, holy cow, that Athlon II kicks arse compared to the others.
    Now do one with an Ivy Bridge :D
     
  9. jhu

    They're actually all single thread. What's more interesting is pps/GHz, and you'll see even the lowly Celeron beats K10h. Ouch!
     
  10. Ferzerp

    Ferzerp Diamond Member

    Joined:
    Oct 12, 1999
    Messages:
    6,107
    Likes Received:
    1
    Not really. That's an incredibly meaningless metric on its own.
     
  11. jhu

    I wouldn't count on that. These guys did a comparison of armel vs. armhf on the Raspberry Pi. The numbers are better for armhf, but the ranges vary quite a bit.

    Hmm, looks like Debian's armhf beta installer is downloadable. I don't know how I missed that. I'll report back later!
     
  12. Arkadrel

    An "ARM Cortex A9" @ 4 GHz would give a Celeron 220 a run for its money :)

    I wonder what its power usage would be like?
    How easy would it be to overclock one to around 4 GHz?

    Does anyone feel like pulling apart a mobile phone,
    and slapping a heatsink onto the chip and trying?


    edit:

    http://www.engadget.com/2012/05/03/tsmc-ramps-28nm-arm-cortex-a9-chip-to-3-1ghz/

    Hmmm, looks like they "could" make Cortex-A9s around 3.1 GHz if they wanted to.
    It wouldn't be a "speedy" chip (still slower than a Celeron 220, going by the pps/GHz above),
    but it might be enough to run Windows 8 and still have a semi-enjoyable experience.

    (Weird that it's max rated at 2,000 MHz, but they can get them running stable at 3.1 GHz in labs.)

    I wonder when we'll see the first ARM chips meant for PC users.
     
    #12 Arkadrel, Aug 12, 2012
    Last edited: Aug 12, 2012
  13. jhu

    It actually isn't. The Celeron 220 is Core 2 based. And that's about the time when Intel started to overtake AMD in performance. The pps/GHz is a decent proxy for IPC (Core 2 is nearly 2.5x faster per clock vs. Netburst, ouch!) and the number is, not surprisingly, fairly consistent across the range of a processor family so it's easy to see what pps numbers a processor would get at a certain frequency.
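    The "what pps would it get at a certain frequency" estimate can be sketched as follows, assuming pps scales linearly with clock and core count (which ignores memory bottlenecks and turbo behaviour). The helper `projected_pps` is a made-up name; the per-GHz figures are taken from the table in post #1, with the Core 2 Duo E8400 standing in for Core 2 and the Pentium 4m standing in for Netburst.

```python
def projected_pps(pps_per_core_ghz: float, cores: int, ghz: float) -> float:
    """Project a POV-Ray score, assuming linear scaling with clock and cores."""
    return pps_per_core_ghz * cores * ghz


# pps/core/GHz figures from the table in post #1.
CORTEX_A9 = 20.06   # Exynos 4210
CORE2 = 77.08       # Core 2 Duo E8400
NETBURST = 32.66    # Pentium 4m

# Hypothetical single-core Cortex-A9 clocked at 4 GHz.
a9_at_4ghz = projected_pps(CORTEX_A9, cores=1, ghz=4.0)
print(f"Cortex-A9 @ 4.0 GHz, 1 core: {a9_at_4ghz:.2f} pps")

# Per-clock ratio between the two Intel families.
ratio = CORE2 / NETBURST
print(f"Core 2 vs. Netburst per clock: {ratio:.2f}x")
```

    With these particular table entries the Core 2 to Netburst ratio works out to roughly 2.4x. Other Pentium 4 rows in the table give slightly different ratios.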
     
    #13 jhu, Aug 12, 2012
    Last edited: Aug 12, 2012
  14. Ferzerp

    It has absolutely no practical meaning by itself. It is only when combined with clockspeed and power usage that it has any meaning at all.
     
  15. jhu

    Uhm, data, in general, has no practical meaning without context, but you also forgot price, which is more important (e.g. Oracle touting its TPS with TCO numbers on different hardware, etc.). With pps/GHz, it's easy to figure out how one particular chip will perform compared with another particular chip, then compare prices to see if the performance gain (if any) is worth the money to upgrade.
     
  16. Blitzvogel

    Blitzvogel Golden Member

    Joined:
    Oct 17, 2010
    Messages:
    1,919
    Likes Received:
    0
    I assume POVRay 3.7 tests the FPU more than anything in these chips?

    The Exynos, N270, P4m, and PPC750 all have 64-bit FPUs, with the K10h and Celeron 220 having 128-bit, yes? From even such a small amount of data, you can interpolate other processors into it, simply based on their FPU widths. I assume any AVX-equipped i-series chip or AMD BD module would be at about twice the pps/GHz.
     
    #16 Blitzvogel, Aug 12, 2012
    Last edited: Aug 12, 2012
  17. jhu

    Povray 3.7 is multithreaded; 3.6 is single threaded only. The program uses the SIMD registers as scalar entities. Looking through the code, there's not much that can be vectorized in critical paths, and AVX2 probably will not be of much benefit for this program. Watching the program compile, the only things getting autovectorized are in the file manipulation routines.
     
    #17 jhu, Aug 12, 2012
    Last edited: Aug 12, 2012
  18. Blitzvogel

    Sorry, I get a bit confused as to how SIMD, vector units, FPUs, etc. each work in their own manner. :oops:

    But does the SIMD width directly affect how much data is getting shoved through?
     
  19. Haserath

    Haserath Senior member

    Joined:
    Sep 12, 2010
    Messages:
    789
    Likes Received:
    0
    Improvements made by ARM's army of designers mean that the Cortex-A15's performance is significantly improved: official figures put the chip's integer performance at around 1.5 times that of the Cortex-A9, while floating-point performance is doubled.
    http://www.bit-tech.net/news/hardware/2012/08/10/samsung-exynos-5/1

    You can say that again. I was thinking they were going to start out with 2 GHz dual cores though, but 1.7 GHz would still put this a league ahead of an A9.

    It's amazing that these chips will be using 12.8GB/s memory bandwidth when the desktop APUs are merely getting 30GB/s or less.
     
  20. Arachnotronic

    Arachnotronic Diamond Member

    Joined:
    Mar 10, 2006
    Messages:
    8,510
    Likes Received:
    45
    Whatever you say, man.
     
    #20 Arachnotronic, Aug 12, 2012
    Last edited: Aug 12, 2012
  21. Arkadrel

    At 1.5x int and 2x fp, how many pps would that give it in this type of bench? 50ish? Still lower than a Celeron 220.

    I agree about the memory bandwidth, however. I'm guessing they're gonna put it to good use:

    1) "promises 1080p video playback at 60 frames per second with full support for wireless displays and 3D stereoscopic videos."

    2) "the image processor portion can capture pictures and video from an eight megapixel sensor at 30 frames per second and features hardware post-processing units including 3D noise reduction, image stabilisation and optical distortion compensation"

    Also impressive:
    Comes with USB3 and SATA3.
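    The "1.5x int and 2x fp" guess above can be checked against the numbers in post #1 with a rough sketch like this. It assumes the claimed speedup applies directly to the Cortex-A9's pps/core/GHz figure and that the score scales linearly with clock. `estimate_pps` is a made-up helper, and the Exynos 4210 figure comes from a two-thread run while the Exynos 5250 figure is a single-thread run, so the comparison is only approximate.

```python
def estimate_pps(base_pps_per_ghz: float, speedup: float, ghz: float) -> float:
    """Estimate a single-core score from a baseline pps/GHz and a claimed speedup."""
    return base_pps_per_ghz * speedup * ghz


# Figures from the table in post #1.
A9_PPS_PER_GHZ = 20.06       # Exynos 4210, per core
A15_MEASURED_PPS = 83.95     # Exynos 5250, 1 thread
A15_GHZ = 1.7

# Predicted Cortex-A15 single-core score under each claimed speedup.
for speedup in (1.5, 2.0):
    predicted = estimate_pps(A9_PPS_PER_GHZ, speedup, A15_GHZ)
    print(f"{speedup:.1f}x claim -> {predicted:.1f} pps at {A15_GHZ} GHz")

# Per-clock speedup implied by the measured Exynos 5250 result.
measured_speedup = (A15_MEASURED_PPS / A15_GHZ) / A9_PPS_PER_GHZ
print(f"measured per-clock speedup: {measured_speedup:.2f}x")
```

    On these table entries the 1.5x claim lands near the "50ish" guess, while the measured Exynos 5250 result implies a larger per-clock gain than either official figure.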
     
  22. nenforcer

    nenforcer Golden Member

    Joined:
    Aug 26, 2008
    Messages:
    1,751
    Likes Received:
    0
    This is a great thread. I will be very interested in following these numbers, starting with the release of Windows RT this fall and obviously Debian 7.

    Also interested in Qualcomm Snapdragon Krait and nVidia Tegra 4, since those are supposed to be the flagship ARM Cortex A15 processors.
     
  23. jhu

    Same here, but I don't have access to all that hardware.
     
  24. ShintaiDK

    ShintaiDK Lifer

    Joined:
    Apr 22, 2012
    Messages:
    20,081
    Likes Received:
    13
    Windows RT is a stillborn child tho. Almost all OEMs rejected it in favour of x86 tablets.
     
  25. jhu

    Exynos hard-float numbers are up. Had to install Debian wheezy armhf on my phone. Loads of fun. Looks like performance is about on par with a Pentium 4 and better than first-generation Atom.
     
    #25 jhu, Aug 20, 2012
    Last edited: Aug 20, 2012