K1 compared to a typical Desktop Kepler GPU

greatnoob

Senior member
Jan 6, 2014
968
395
136
Hey guys, I've been a long time reader here and I finally decided to create a forum account to get this one question out of my mind.

After Nvidia's CES conference I was kind of shocked when they said "192 CUDA cores" and then followed with "Kepler". I know that one of the differences between Fermi and Kepler is the manufacturing process (40nm vs 28nm), which translates into lower power consumption and less heat output, allowing better performance at the same TDP as the older architecture.

Now my question is: given that Nvidia claims these 192 CUDA cores are THE actual desktop Kepler part (albeit with some modifications for the mobile space, obviously), is it fair to say that the K1 can pull off ~50% of what the 740M can do, with significantly less power draw and virtually no heat (for mobile devices at least, meaning no fans)?

And if so, how well does the 192-CUDA-core Kepler K1 compare to my current laptop's GPU, the Fermi-based 540M with 96 CUDA cores?

Thanks! :thumbsup:
 

BallaTheFeared

Diamond Member
Nov 15, 2010
8,115
0
71
One Kepler core is something like 0.5x the performance of one Fermi core at 25% of the power consumption.

That's for dGPUs; the K1 will cut out the interconnect power, and hopefully lower leakage/clocks/voltage get them the rest of the way.
 

BrightCandle

Diamond Member
Mar 15, 2007
4,762
0
76
The 680 is 1536 cores, a 780 is 2304, and the 780 Ti is 2880. Even if we look pretty low end, like the GeForce GTX 650, it has 384 cores and a lot more memory bandwidth. The 740M part likewise has 384 SPs.

So I don't really agree with the claim that it's desktop-class performance. By and large, no one recommends anything that low-end for actual gaming on a PC; it's simply too slow.
 

SPBHM

Diamond Member
Sep 12, 2012
5,066
418
126
Probably a little faster than a GT 620 (48 Fermi cores, 64-bit memory bus) but slower than a GT 630 (96 Fermi cores, 128-bit memory bus).

But it's hard to be sure; I would expect it to have a low memory bandwidth and GPU clock.
 

EliteRetard

Diamond Member
Mar 6, 2006
6,490
1,021
136
The K1 is going to use 64-bit LPDDR3, so even compared to a 540M it's gimped.
I'd say in the best case the K1 would be looking at ~65% of a DDR3 540M.
 

SPBHM

Diamond Member
Sep 12, 2012
5,066
418
126
Nvidia's claim of 60fps in GFXBench offscreen would place it at the same level as an HD 4400 on an i5-4200U; it does not look that impressive to me. I don't think it will be able to get to GT 630 levels. The GT 620 is the safe bet for me.

*The GT 630 has also been updated to Kepler:
http://www.geforce.com/hardware/desktop-gpus/geforce-gt-630/specifications


Memory Bandwidth (GB/sec): 14.4

lol

The Fermi GT 630 is probably a better card.


Nvidia presented a graph showing it at almost 3x the Apple A7's GPU; on 3DMark Ice Storm I think the desktop HD 4000 can achieve almost 3x the score of the A7.

The A7 is pretty fast as far as ARM SoCs go, so I think it's impressive, even if it's not impressive compared to PC IGPs.
 

EliteRetard

Diamond Member
Mar 6, 2006
6,490
1,021
136
K1 GPU:

192 cores (2 FLOPs per core per clock)
950MHz (claimed)
(192 x 2 x 0.95 = 365 GFLOPS)
8 TMUs
4 ROPs
17 GB/s (claimed); would need LPDDR3-2133 at 64-bit

GT540M:
96 cores (4 FLOPs per core per base clock, counting the 2x shader clock)
672MHz
(96 x 4 x 0.672 = 258 GFLOPS)
16 TMUs
4 ROPs
28.8 GB/s (DDR3-1800 at 128-bit)
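The peak-throughput and bandwidth figures above can be reproduced with a quick sketch (a back-of-the-envelope estimate using only the numbers quoted in this post; the function names are mine):

```python
# Peak GFLOPS = cores x FLOPs per core per clock x clock (GHz).
# For the Fermi part, "4 FLOPs per core" folds in its 2x shader hot clock.
def gflops(cores, flops_per_clock, clock_ghz):
    return cores * flops_per_clock * clock_ghz

# Peak bandwidth (GB/s) = transfer rate (MT/s) x bus width (bytes) / 1000.
def bandwidth_gbs(transfers_mts, bus_bits):
    return transfers_mts * (bus_bits / 8) / 1000

k1_gflops = gflops(192, 2, 0.950)         # ~365 GFLOPS
gt540m_gflops = gflops(96, 4, 0.672)      # ~258 GFLOPS

k1_bw = bandwidth_gbs(2133, 64)           # ~17 GB/s (LPDDR3-2133, 64-bit)
gt540m_bw = bandwidth_gbs(1800, 128)      # 28.8 GB/s (DDR3-1800, 128-bit)
```

By these peak numbers the K1 is ahead on raw FLOPS but has barely 60% of the 540M's memory bandwidth, which is the point being made above.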
 

Shivansps

Diamond Member
Sep 11, 2013
3,916
1,570
136
SPBHM said:
Memory Bandwidth (GB/sec): 14.4

lol

The Fermi GT 630 is probably a better card.

Nvidia presented a graph showing it at almost 3x the Apple A7's GPU; on 3DMark Ice Storm I think the desktop HD 4000 can achieve almost 3x the score of the A7.

The A7 is pretty fast as far as ARM SoCs go, so I think it's impressive, even if it's not impressive compared to PC IGPs.

I would consider it impressive if I did not know about Cherry Trail and Broadwell-Y. While Cherry Trail is unlikely to outperform it, it's likely to end up close at a lower TDP and power; but Broadwell-Y? It's going to outperform it for sure if the best Nvidia can do is match an HD 4400 in a 4200U, while also having a lower TDP and x86, meaning I can actually use that IGP power on Windows. On Android/RT the K1 IGP is wasted until that sort of performance becomes mainstream on mobile, and by that time the K1 will be long EOL.
 
Last edited:

R0H1T

Platinum Member
Jan 12, 2013
2,582
163
106
Shivansps said:
I would consider it impressive if I did not know about Cherry Trail and Broadwell-Y. While Cherry Trail is unlikely to outperform it, it's likely to end up close at a lower TDP and power; but Broadwell-Y? It's going to outperform it for sure if the best Nvidia can do is match an HD 4400 in a 4200U... at a lower TDP too... and it's x86, meaning I can actually use that IGP power on Windows.

Why is that? Cherry Trail & Broadwell-Y are going to be released somewhere in the Q4 2014 - Q1 2015 timeframe, which gives Nvidia enough time to work on a Maxwell-based Tegra 6. And remember that starting with the next gen (Maxwell), Nvidia is also going to design low-power GPUs first and scale them to fit the desktop/mobile markets, the same way Intel is doing. So for what it's worth, the K1 is impressive indeed, considering it is their first attempt at bringing console-level graphics to low-power devices.
 
Last edited:

BrightCandle

Diamond Member
Mar 15, 2007
4,762
0
76
It's better than current mobile GPUs, and it also has real support for modern APIs, which will make it easier to make good-looking games. Combine it with high-speed Denver dual cores and some reasonable other parts on an SoC, and you have yourself a decent tablet chip that will be very competitive. I have been unhappy with my tablet's performance since I got it, and my phone's. The reality is mobile needs a lot more performance before it genuinely becomes smooth.

But it's not desktop-equivalent, however Nvidia tries to market it. Basically it's previous-generation console-like tech, which puts its performance around 2005. We saw a massive (10x) increase in desktop graphics performance over that period, so it's an order of magnitude off from today's state of the art. Good for mobile, but it's not what Nvidia claims it is.
 

Arkadrel

Diamond Member
Oct 19, 2010
3,681
2
0
Since GFLOPS compared across different architectures are meaningless, I guess the best way to measure its performance would be to compare it with the closest Kepler GPU on the market.


The closest comparison would be an Nvidia GeForce GT 630 (Kepler).

It has 192 cores @ 875MHz in a 192:16:16 (cores:TMUs:ROPs) config.

That is close to the K1's 192 cores, but the K1 has only 8 TMUs and 4 ROPs.

I guess you could look up 3DMark results for a GT 630 and imagine a bit less than that as the K1's result, if it could run the same thing.


greatnoob said:
Now my question is: given that Nvidia claims these 192 CUDA cores are THE actual desktop Kepler part (albeit with some modifications for the mobile space, obviously), is it fair to say that the K1 can pull off ~50% of what the 740M can do, with significantly less power draw and virtually no heat (for mobile devices at least, meaning no fans)?
The 740M = 384 cores (more TMUs, more ROPs).

So I doubt the K1 is in the same league as a 740 Mobile.

It's understandable that a ~5W chip isn't able to compete with something that's around ~30W, though.
 

sontin

Diamond Member
Sep 12, 2011
3,273
149
106
BrightCandle said:
But it's not desktop-equivalent, however Nvidia tries to market it. Basically it's previous-generation console-like tech, which puts its performance around 2005. We saw a massive (10x) increase in desktop graphics performance over that period, so it's an order of magnitude off from today's state of the art. Good for mobile, but it's not what Nvidia claims it is.

Kepler.M uses around 2W in games. What do you really expect from the GPU configuration?!
 

NTMBK

Lifer
Nov 14, 2011
10,411
5,677
136
Arkadrel said:
Since GFLOPS compared across different architectures are meaningless, I guess the best way to measure its performance would be to compare it with the closest Kepler GPU on the market.

The closest comparison would be an Nvidia GeForce GT 630 (Kepler).

It has 192 cores @ 875MHz in a 192:16:16 (cores:TMUs:ROPs) config.

That is close to the K1's 192 cores, but the K1 has only 8 TMUs and 4 ROPs.

I guess you could look up 3DMark results for a GT 630 and imagine a bit less than that as the K1's result, if it could run the same thing.

The 740M = 384 cores (more TMUs, more ROPs).

So I doubt the K1 is in the same league as a 740 Mobile.

It's understandable that a ~5W chip isn't able to compete with something that's around ~30W, though.

According to Tom's Hardware, those ROPs are also less capable than their desktop equivalents:

"And whereas each ROP outputs eight pixels per clock in, say, GK104, Tegra K1 drops to four."

So those 4 ROPs are actually the equivalent of 2 ROPs in standard Kepler.
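That halved per-ROP rate shows up directly in peak pixel fillrate; a rough sketch, assuming the ~950MHz clock quoted earlier in the thread:

```python
# Peak fillrate (Gpixel/s) = ROPs x pixels per ROP per clock x clock (GHz).
def fillrate_gpix(rops, pixels_per_clock, clock_ghz):
    return rops * pixels_per_clock * clock_ghz

k1_fill = fillrate_gpix(4, 4, 0.950)       # 15.2 Gpixel/s for Tegra K1
kepler_fill = fillrate_gpix(4, 8, 0.950)   # 30.4 Gpixel/s for 4 desktop-Kepler ROPs
```

So at equal clocks and ROP counts, the K1's back end pushes half the pixels of a standard Kepler design.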
 

greatnoob

Senior member
Jan 6, 2014
968
395
136
Arkadrel said:
The closest comparison would be an Nvidia GeForce GT 630 (Kepler).

It has 192 cores @ 875MHz in a 192:16:16 (cores:TMUs:ROPs) config.

The 630M is the Kepler-based counterpart of the 540M according to NotebookCheck, yet at a higher clock (672 vs 875MHz), which should work out to roughly the same performance across the two architectures. Is it safe to say the ROP units are less effective going from Fermi to Kepler, just as 96 Fermi CUDA cores scale to 192 Kepler cores (refer to the first post)?

I find it really difficult to believe a 28nm Kepler-based mobile SoC can bring console-grade performance at such a low TDP (I heard 5W?!). Even if it's possible, how will that power even be leveraged? The A15 is about as powerful as the low-end 2GHz Core 2 Duos, but it would be a bottleneck for mobile gaming if they really are promising this much GPU compute power.
 

Shivansps

Diamond Member
Sep 11, 2013
3,916
1,570
136
R0H1T said:
Why is that? Cherry Trail & Broadwell-Y are going to be released somewhere in the Q4 2014 - Q1 2015 timeframe, which gives Nvidia enough time to work on a Maxwell-based Tegra 6. And remember that starting with the next gen (Maxwell), Nvidia is also going to design low-power GPUs first and scale them to fit the desktop/mobile markets, the same way Intel is doing. So for what it's worth, the K1 is impressive indeed, considering it is their first attempt at bringing console-level graphics to low-power devices.

Simple: there are not going to be any ARM-OS-based games that use that sort of GPU power, because I don't see anyone wanting to make a game that will only work on the K1, and by the time you could use that power on Android the K1 will already be outperformed by other ARM chips. So why even bother with something that is going to be wasted?

It does not really matter whether Broadwell-Y or Cherry Trail comes out in Q4 '14 or Q1 '15; x86 is what makes it worth it.
 
Last edited:

DarkKnightDude

Senior member
Mar 10, 2011
981
44
91
My problem with it is that battery life is going to get killed really quickly. Has there been any discussion of battery life?
 

BrightCandle

Diamond Member
Mar 15, 2007
4,762
0
76
greatnoob said:
I find it really difficult to believe a 28nm Kepler-based mobile SoC can bring console-grade performance at such a low TDP (I heard 5W?!). Even if it's possible, how will that power even be leveraged? The A15 is about as powerful as the low-end 2GHz Core 2 Duos, but it would be a bottleneck for mobile gaming if they really are promising this much GPU compute power.

The Xbox 360 started on a 90nm process. Implemented today on 28nm, it would be about a tenth of the size and hence roughly a tenth of the power. Combined with significant improvements in power-management technology over that period, it stands to reason that 2005's processing power can today either fit into 5W for mobile or deliver about 10x the performance at roughly the same power. It didn't all work out linearly in practice: the next gen of consoles isn't 10x, but then they are also a bit lower power, and in that period we have seen a stagnation of CPU performance that has pushed more transistors into the GPU.

But it's completely normal in this industry for this to be the basic relationship; this is what happens when a world-class company takes its product and produces a modern chip on a modern process at the right moment. Almost everything Intel/AMD/Nvidia has done in mobile up to this point has been completely half-arsed: old processes, reused old tech. Once it's on the same architecture and process curve, it ends up in about the right place.

I just see this level of performance as an inevitable consequence of Moore's law.
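The shrink arithmetic behind that "a tenth of the size" estimate: die area scales roughly with the square of the feature size (an idealized estimate that ignores layout rules and design changes):

```python
# Idealized area scaling: area ~ (feature size)^2, so a 90nm design
# ported to 28nm shrinks by a factor of (90/28)^2, i.e. roughly 10x.
shrink_factor = (90 / 28) ** 2   # ~10.3x smaller area for the same design
```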
 

ams23

Senior member
Feb 18, 2013
907
0
0
Shivansps said:
Simple: there are not going to be any ARM-OS-based games that use that sort of GPU power, because I don't see anyone wanting to make a game that will only work on the K1, and by the time you could use that power on Android the K1 will already be outperformed by other ARM chips. So why even bother with something that is going to be wasted?

All the major ultra-mobile GPU vendors (including Qualcomm, ARM, ImgTech, etc.) will have ultra-mobile GPUs that support the DX11 and OpenGL 4.x graphics APIs in 2014. Android is also the fastest-growing and most pervasive mobile OS. It is just a matter of time before high-quality games come to Android.

Shivansps said:
if the best Nvidia can do is match an HD 4400

Tegra K1 will be available within the next 3-5 months (not one year from now). The GPU only uses ~2W (!) at most on average (the TDP for the whole SoC is 5W). Tegra M1, with a Maxwell GPU and due next year, will have at least double the GPU performance in comparison.

At the end of the day, when it comes to ultra mobile graphics performance and feature set, Tegra K1 will be as good as it gets for some months to come.
 
Last edited:

NTMBK

Lifer
Nov 14, 2011
10,411
5,677
136
ams23 said:
All the major ultra-mobile GPU vendors (including Qualcomm, ARM, ImgTech, etc.) will have ultra-mobile GPUs that support the DX11 and OpenGL 4.x graphics APIs in 2014. Android is also the fastest-growing and most pervasive mobile OS. It is just a matter of time before high-quality games come to Android.

I'm not convinced.

People were saying the exact same things 18 months ago, when the Ouya started raising money: that if you just combined Android, high-performance hardware, and a controller, the games would come rolling in. People were saying the exact same thing 12 months ago, when the Shield was announced. The "AAA" games market just isn't there on Android, no matter how much hardware Nvidia throws at the problem. It's just the wrong ecosystem.
 

ams23

Senior member
Feb 18, 2013
907
0
0
The graphics performance and API support of ultra-mobile GPUs will make a major leap in 2014. Game developers will take advantage of that, even if it doesn't happen overnight. Tim Sweeney and the UE4 team are on board, which is a start. With the proper API support and sufficient performance in the ultra-mobile space, porting games from other platforms to Android should not be difficult to do, and will be monetizable too.
 
Last edited:

blackened23

Diamond Member
Jul 26, 2011
8,548
2
0
I'm still confused as to why a mobile SoC is being compared to desktop CPUs and GPUs. They're an entirely different market with different performance levels. Yes, it is based on Kepler, but it's also designed to operate in a 2-5W power window, which big Kepler obviously isn't. ;)
 
Last edited:

NTMBK

Lifer
Nov 14, 2011
10,411
5,677
136
ams23 said:
The graphics performance and API support of ultra-mobile GPUs will make a major leap in 2014. Game developers will take advantage of that, even if it doesn't happen overnight. Tim Sweeney and the UE4 team are on board, which is a start. With the proper API support and sufficient performance, porting games from other platforms to Android should not be difficult.

Graphics performance makes a big leap every year. And Tim Sweeney was on board last time too, with UE3 (at least on iOS and Windows RT).

EDIT: Here's a fun blast from the past- Tim Sweeney demoing Unreal Engine 3 on the Tegra 2. http://www.geek.com/mobile/tegra-2-...d-console-3d-graphics-in-your-pocket-1046701/