News "Aurora’s Troubles Move Frontier into Pole Exascale Position" - HPCwire


moinmoin

Diamond Member
Jun 1, 2017
5,248
8,463
136
With Sapphire Rapids delayed, Intel's Aurora exascale supercomputer misses yet another date. This means Frontier is now en route to becoming the first exascale supercomputer.

 

eek2121

Diamond Member
Aug 2, 2005
3,491
5,183
136
Intel pre-announced a 300M charge against Q4 earnings, which most people interpret as a penalty payment for missing yet another Aurora deadline.

So, DoE has some pocket change to go on a shopping spree...

I want to play devil's advocate here and say that the 300M charge may actually be an up front payment to TSMC, or it could be any number of things. I'm not claiming one thing or another (I haven't even heard about the charge, because I don't own Intel stock and I never will), but assumptions are a dangerous thing.
 

Joe NYC

Diamond Member
Jun 26, 2021
4,239
5,859
136
I want to play devil's advocate here and say that the 300M charge may actually be an up front payment to TSMC, or it could be any number of things. I'm not claiming one thing or another (I haven't even heard about the charge, because I don't own Intel stock and I never will), but assumptions are a dangerous thing.

There is actually a difference in accounting terms. A charge or expense is treated differently from a prepayment: the first reduces earnings, the second does not.

And Intel referred to this as an upcoming charge, which does reduce earnings, and that's why Intel pre-announced it.
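
To make the distinction concrete, here is a minimal sketch of how the two entries would move the books. Only the 300M figure comes from this thread; the starting balances, the `Books` class and both helper functions are hypothetical, and the sketch assumes the charge is paid in cash right away, which a real charge need not be.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Books:
    """Toy ledger, all figures in millions of dollars (hypothetical)."""
    cash: float
    prepaid_assets: float
    earnings: float


def record_charge(books: Books, amount: float) -> Books:
    # A charge is an expense: it reduces earnings in the period it is recognized.
    # Simplifying assumption: it is also settled in cash immediately.
    return replace(books, cash=books.cash - amount,
                   earnings=books.earnings - amount)


def record_prepayment(books: Books, amount: float) -> Books:
    # A prepayment swaps cash for a prepaid asset on the balance sheet.
    # Earnings stay untouched until the prepaid goods are actually used.
    return replace(books, cash=books.cash - amount,
                   prepaid_assets=books.prepaid_assets + amount)


# Hypothetical starting position; only the 300 comes from the thread.
start = Books(cash=1000.0, prepaid_assets=0.0, earnings=500.0)
after_charge = record_charge(start, 300.0)
after_prepayment = record_prepayment(start, 300.0)

print("charge:    ", after_charge)
print("prepayment:", after_prepayment)
```

Both entries leave the same cash balance in this toy model, but only the charge lowers earnings, which is the reason a company would warn about it ahead of the quarterly report.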
 

jpiniero

Lifer
Oct 1, 2010
17,205
7,580
136
I remember reading something where Intel was told to deliver the system by the end of 2021 "or else". I took that to mean that the project was getting cancelled entirely but a very large charge works too.
 

moinmoin

Diamond Member
Jun 1, 2017
5,248
8,463
136
I want to play devil's advocate here and say that the 300M charge may actually be an up front payment to TSMC, or it could be any number of things. I'm not claiming one thing or another (I haven't even heard about the charge, because I don't own Intel stock and I never will), but assumptions are a dangerous thing.
We talked about this charge in the Intel Q2 results thread.

While nobody official is going to call that a penalty, it pretty obviously is, with what remains of Aurora being at cost at best for Intel. Considering that Aurora started out based around Intel's Xeon Phi product family, which has since been abandoned, this project sure has had a multifaceted "life".

But I don't think Argonne National Laboratory is abandoning Aurora. As @Vattila rightly pointed out, the mostly Nvidia-based system Polaris, being installed this year in place of Aurora, is being used in a way that should be fully reusable on Aurora once it eventually comes into existence, fully ignoring all of Nvidia's specific tools and ecosystem.
 

jpiniero

Lifer
Oct 1, 2010
17,205
7,580
136
But I don't think Argonne National Laboratory is abandoning Aurora. As @Vattila rightly pointed out, the mostly Nvidia-based system Polaris, being installed this year in place of Aurora, is being used in a way that should be fully reusable on Aurora once it eventually comes into existence, fully ignoring all of Nvidia's specific tools and ecosystem.

I bet there's a second deadline where they do give up (and they just go ahead and expand this new system instead).
 

DrMrLordX

Lifer
Apr 27, 2000
23,227
13,309
136
Compute chiplet is on TSMC N5 and the link tile is on TSMC N7, nothing on Intel 4, at least according to their architecture day presentation.

Hmm, interesting. Thus far the only other sources I've read about Intel getting wafers from TSMC are N6 and (eventually) N3. Also if Intel can't include 7nm/Intel 4 in any part of Ponte Vecchio . . . yeouch! That was supposed to be the pipecleaner for that node.
 

moinmoin

Diamond Member
Jun 1, 2017
5,248
8,463
136
I bet there's a second deadline where they do give up (and they just go ahead and expand this new system instead).
There should be additional deadlines for Intel with further penalties. But as long as Intel eventually delivers, Argonne is getting an exascale supercomputer at cost or less. That is hard to beat with any other offer at this point, unless Intel just stops bothering while exascale supercomputers become commonplace. That is where additional deadlines and penalties should come in (maybe the final one should be Intel having to shoulder the cost of buying an exascale supercomputer from the competition, for maximum embarrassment).
 

dmens

Platinum Member
Mar 18, 2005
2,275
965
136
There should be additional deadlines for Intel with further penalties. But as long as Intel eventually delivers, Argonne is getting an exascale supercomputer at cost or less. That is hard to beat with any other offer at this point, unless Intel just stops bothering while exascale supercomputers become commonplace. That is where additional deadlines and penalties should come in (maybe the final one should be Intel having to shoulder the cost of buying an exascale supercomputer from the competition, for maximum embarrassment).

The Argonne scientists all use CUDA anyway; Aurora and Intel were forced on them by DoE management as some “national science initiative”. The maximum embarrassment scenario is the preferred option for the actual researchers; only the bureaucrats and politicians oppose it.
 

JasonLD

Senior member
Aug 22, 2017
488
447
136
Hmm, interesting. Thus far the only other sources I've read about Intel getting wafers from TSMC are N6 and (eventually) N3. Also if Intel can't include 7nm/Intel 4 in any part of Ponte Vecchio . . . yeouch! That was supposed to be the pipecleaner for that node.

Well, ever since the year delay of their 7nm/Intel 4 was announced, that probably wasn't happening anymore due to timing.


This is the new pipecleaner for Intel 4.
 

moinmoin

Diamond Member
Jun 1, 2017
5,248
8,463
136
The Argonne scientists all use CUDA anyway; Aurora and Intel were forced on them by DoE management as some “national science initiative”. The maximum embarrassment scenario is the preferred option for the actual researchers; only the bureaucrats and politicians oppose it.
Actual scientists preferring proprietary closed source tool chains, really? Eww...
 

DrMrLordX

Lifer
Apr 27, 2000
23,227
13,309
136
Actual scientists preferring proprietary closed source tool chains, really? Eww...

In the end, it's the best tool for the job. NV has done a really good job there. Maybe all these ultra-complicated SYCL tools will eventually eclipse CUDA (while allowing CUDA diehards to convert their projects to SYCL).

Edit: oops, sorry for the double comment.
 

moinmoin

Diamond Member
Jun 1, 2017
5,248
8,463
136
In the end, it's the best tool for the job. NV has done a really good job there.
On a personal level I don't disagree (I've stated before that its thorough software ecosystem is Nvidia's main competitive advantage). I just expect prudent science not to jump into a hard dependency at the first step. But maybe I'm still too used to independent basic research, whereas nowadays corporate-sponsored research (with all its unscientific warts like this one) is dominating.
 

DeathReborn

Platinum Member
Oct 11, 2005
2,786
789
136
On a personal level I don't disagree (I've stated before that its thorough software ecosystem is Nvidia's main competitive advantage). I just expect prudent science not to jump into a hard dependency at the first step. But maybe I'm still too used to independent basic research, whereas nowadays corporate-sponsored research (with all its unscientific warts like this one) is dominating.

The local universities/colleges I do work for all use CUDA purely because they demand something that works with minimal trouble. With CUDA they get a known quantity, and the extra cost of going Nvidia is outweighed by the cost of the staff support they would otherwise require. The platform varies, but the accelerators are almost exclusively Nvidia.
 

moinmoin

Diamond Member
Jun 1, 2017
5,248
8,463
136
The local universities/colleges I do work for all use CUDA purely because they demand something that works with minimal trouble. With CUDA they get a known quantity, and the extra cost of going Nvidia is outweighed by the cost of the staff support they would otherwise require. The platform varies, but the accelerators are almost exclusively Nvidia.
Are those departments that do basic research in the area of computing? If not, they are semi-excused considering the quasi-monopoly Nvidia built there.

But that's exactly why DoE, DARPA etc. stimulate solutions building on Intel and AMD, not Nvidia: to build a feasible choice outside any pre-existing monopoly. This is the context in which I find statements like the one from @dmens quoted above, that "only the bureaucrats and politicians oppose" the monopoly, completely baffling. In my experience it was especially the politicians who boneheadedly ran into every opportunity to allow corporations to create or foster monopolies, and scientists who pushed back to allow for fewer dependencies in more basic scientific research. But then again, my experience is not from the US but from Germany, and not from today but from two decades ago.
 

dmens

Platinum Member
Mar 18, 2005
2,275
965
136
Are those departments that do basic research in the area of computing? If not, they are semi-excused considering the quasi-monopoly Nvidia built there.

But that's exactly why DoE, DARPA etc. stimulate solutions building on Intel and AMD, not Nvidia: to build a feasible choice outside any pre-existing monopoly. This is the context in which I find statements like the one from @dmens quoted above, that "only the bureaucrats and politicians oppose" the monopoly, completely baffling. In my experience it was especially the politicians who boneheadedly ran into every opportunity to allow corporations to create or foster monopolies, and scientists who pushed back to allow for fewer dependencies in more basic scientific research. But then again, my experience is not from the US but from Germany, and not from today but from two decades ago.

1. The vast majority of people do not care at all about using a proprietary software stack, they use what works best.
2. The Aurora acquisition is entirely political, akin to the military-industrial complex purchases where the money must be spread out to various parasitic corporations in order to keep the pig troughs evenly filled so the politicians can claim they did something for their constituents.
 

jpiniero

Lifer
Oct 1, 2010
17,205
7,580
136
2. The Aurora acquisition is entirely political, akin to the military-industrial complex purchases where the money must be spread out to various parasitic corporations in order to keep the pig troughs evenly filled so the politicians can claim they did something for their constituents.

There are also national security interests, given that Intel is the only one fabbing the chips in the US. That's why Intel keeps getting second and third chances.
 

moinmoin

Diamond Member
Jun 1, 2017
5,248
8,463
136
1. The vast majority of people do not care at all about using a proprietary software stack, they use what works best.
You mean: What they know best. Until they don't have the choice anymore. And they seriously call themselves scientists, not office workers?

2. The Aurora acquisition is entirely political, akin to the military-industrial complex purchases where the money must be spread out to various parasitic corporations in order to keep the pig troughs evenly filled so the politicians can claim they did something for their constituents.
Of course it's entirely political, as this wouldn't happen otherwise. But investment into creating choice where there currently effectively isn't one is for the eventual good of everybody. It is an approach that would be part of a possible Hippocratic Oath for scientists or whatever.
 

Joe NYC

Diamond Member
Jun 26, 2021
4,239
5,859
136
2. The Aurora acquisition is entirely political, akin to the military-industrial complex purchases where the money must be spread out to various parasitic corporations in order to keep the pig troughs evenly filled so the politicians can claim they did something for their constituents.

There is an overall goal of fostering supercomputing innovation, and these supercomputer purchases fund that.

I think I disagree with comparisons to, say, the defense industry, with all the bad incentives for generals and DoD staff to seek a payoff for their corrupt work while on the government payroll, in the hope of getting corporate board seats or consulting fees from the defense companies.

I don't know of parallels of, say, HPE, Intel, AMD or Nvidia trying to corrupt government officials for supercomputing contracts.

Also, government procurement, especially in supercomputing, is a relatively small portion of their revenue, unlike in the defense industries, where it is close to all of the revenue. Even defense exports are in some ways contingent on government approvals.
 

dmens

Platinum Member
Mar 18, 2005
2,275
965
136
There is an overall goal of fostering supercomputing innovation, and these supercomputer purchases fund that.

I think I disagree with comparisons to, say, the defense industry, with all the bad incentives for generals and DoD staff to seek a payoff for their corrupt work while on the government payroll, in the hope of getting corporate board seats or consulting fees from the defense companies.

I don't know of parallels of, say, HPE, Intel, AMD or Nvidia trying to corrupt government officials for supercomputing contracts.

Also, government procurement, especially in supercomputing, is a relatively small portion of their revenue, unlike in the defense industries, where it is close to all of the revenue. Even defense exports are in some ways contingent on government approvals.

You can draw a direct parallel with mil-ind even without the revolving door between DoD staffers and corporate boards. It is the spreading of money from one corporation to another as policy. Just like how fighter programs would rotate between Lockheed, McDonnell Douglas, etc., regardless of the merits of the plane, DoE will spread money between Intel, NVIDIA and AMD, no matter how obviously superior one solution is compared to the other.

NVIDIA created its GPU compute ecosystem entirely through vision and hard work, any government contract they won was deserved. I fail to see why taxpayer money needs to be handed over to second/third place maybe-finishers in this highly competitive business.
 

Joe NYC

Diamond Member
Jun 26, 2021
4,239
5,859
136
You can draw a direct parallel with mil-ind even without the revolving door between DoD staffers and corporate boards. It is the spreading of money from one corporation to another as policy. Just like how fighter programs would rotate between Lockheed, McDonnell Douglas, etc., regardless of the merits of the plane, DoE will spread money between Intel, NVIDIA and AMD, no matter how obviously superior one solution is compared to the other.

NVIDIA created its GPU compute ecosystem entirely through vision and hard work, any government contract they won was deserved. I fail to see why taxpayer money needs to be handed over to second/third place maybe-finishers in this highly competitive business.

Intel did not even have a GPU product, so perhaps, Intel's entry into the GPU market was partially funded by the government Aurora contract.

As for AMD's current deliverable, the Frontier supercomputer, I don't think the combination of EPYC Zen 3 + CDNA 2 + unified memory is second to anyone.

The US government actually bet correctly in this case.

BTW, Intel was also supposed to have Unified Memory for Aurora. It is a little bit unclear if that is still in the cards. There was some recent news saying that after all, Sapphire Rapids and Ponte Vecchio will not be able to have Unified Memory Access between them.
 

DeathReborn

Platinum Member
Oct 11, 2005
2,786
789
136
Are those departments that do basic research in the area of computing? If not, they are semi-excused considering the quasi-monopoly Nvidia built there.

But that's exactly why DoE, DARPA etc. stimulate solutions building on Intel and AMD, not Nvidia: to build a feasible choice outside any pre-existing monopoly. This is the context in which I find statements like the one from @dmens quoted above, that "only the bureaucrats and politicians oppose" the monopoly, completely baffling. In my experience it was especially the politicians who boneheadedly ran into every opportunity to allow corporations to create or foster monopolies, and scientists who pushed back to allow for fewer dependencies in more basic scientific research. But then again, my experience is not from the US but from Germany, and not from today but from two decades ago.

They do everything from Physics, Chemistry, IC Design/Simulation & Petro Chemical (my brother just finished his Degree there using the mainframe we installed). That University does a lot of research for BAE Systems, QinetiQ, IBM etc.
 

Vattila

Senior member
Oct 22, 2004
822
1,467
136
Apart from the politics of how supercomputer contracts are awarded, there is a practical point to make on this: if you participate in this segment and you want to run your code across different heterogeneous architectures, CUDA is not an option, and alas, Nvidia does not seem to support any good cross-platform alternative.

This is why I think CUDA is doomed in the long run, with Khronos SYCL already emerging as the replacement, eventually to be subsumed by the ISO C++ standard.


It is striking that the Polaris supercomputer, based on AMD CPU + Nvidia GPU, will use SYCL as the programming model, in preparation for the Aurora system, based on Intel CPU+GPU. The Polaris system should be well suited for performance analysis and conversion of existing CUDA code targeted for Aurora (as well as Frontier and El Capitan, the other upcoming exascale supercomputers, both based on AMD CPU+GPU). Contracted by the national labs, Codeplay is hard at work providing support for all platforms in Intel's DPC++ compiler, which implements the SYCL 2020 standard, and is part of Intel's oneAPI open framework. I applaud Intel for doing the right thing here.
 

DrMrLordX

Lifer
Apr 27, 2000
23,227
13,309
136
Intel did not even have a GPU product, so perhaps, Intel's entry into the GPU market was partially funded by the government Aurora contract.

They did have iGPUs, and they're extending that tech to larger accelerators. Though they took whatever gub'ment money they could get for the project. No doubt there.

As for AMD's current deliverable, the Frontier supercomputer, I don't think the combination of EPYC Zen 3 + CDNA 2 + unified memory is second to anyone.

Debatable. But if you're a developer, are you really going to want to work with the AMD software stack?

BTW, Intel was also supposed to have Unified Memory for Aurora. It is a little bit unclear if that is still in the cards.

oneAPI is still there. Waiting.

There was some recent news saying that after all, Sapphire Rapids and Ponte Vecchio will not be able to have Unified Memory Access between them.

Got any links? I don't remember seeing that.

Apart from the politics of how supercomputer contracts are awarded, there is a practical point to make on this: if you participate in this segment and you want to run your code across different heterogeneous architectures, CUDA is not an option, and alas, Nvidia does not seem to support any good cross-platform alternative.

This is why I think CUDA is doomed in the long run, with Khronos SYCL already emerging as the replacement, eventually to be subsumed by the ISO C++ standard.

CUDA is still a force, and NV intends to keep it that way. They don't want their hardware to compete on a level playing field with a common software stack. Especially not when their competitors control the CPUs and motherboards. However . . .

It is striking that the Polaris supercomputer, based on AMD CPU + Nvidia GPU, will use SYCL as the programming model, in preparation for the Aurora system, based on Intel CPU+GPU. The Polaris system should be well suited for performance analysis and conversion of existing CUDA code targeted for Aurora (as well as Frontier and El Capitan, the other upcoming exascale supercomputers, both based on AMD CPU+GPU).

. . . if the conversion tools that allow CUDA projects to be translated effectively and efficiently to the SYCL programming model actually work well, then NV's goose may be cooked anyway.
 