Cloudflare replaces Intel with AMD

moinmoin

Diamond Member
Jun 1, 2017
4,933
7,619
136

Gen X: Intel Not Inside
Compared with our prior server (Gen 9), it processes as much as 36% more requests while costing substantially less. Additionally, it enables a ~50% decrease in L3 cache miss rate and up to 50% decrease in NGINX p99 latency, powered by a CPU rated at 25% lower TDP (thermal design power) per core.
We selected the AMD EPYC 7642 processor in a single-socket configuration for Gen X. This CPU has 48-cores (96 threads), a base clock speed of 2.4 GHz, and an L3 cache of 256 MB. While the rated power (225W) may seem high, it is lower than the combined TDP in our Gen 9 servers and we preferred the performance of this CPU over lower power variants. Despite AMD offering a higher core count option with 64-cores, the performance gains for our software stack and usage weren’t compelling enough.
The performance we’ve seen from the AMD EPYC 7642 processor has encouraged us to accelerate replacement of multiple generations of Intel-based servers.

Gen 9 is 2x 24-core custom off-roadmap Intel CPUs at 1.9 GHz and 150 W each, which replaced Gen 8 with its 2x 12-core Intel Xeon Silver 4116 at 2.1 GHz and 85 W each.

So these are pretty straight replacements: dual-socket Intel 48-core (2x 24, 300 W) updated to single-socket 48-core EPYC (225 W).
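
For what it's worth, the "25% lower TDP per core" figure from the blog quote falls straight out of these numbers. A quick sketch in Python (the function name is my own, and it assumes the rated TDPs above are per socket):

```python
# TDP-per-core comparison using only the figures quoted in this thread.
# tdp_per_core is an illustrative helper, not anything from Cloudflare.

def tdp_per_core(sockets: int, cores_per_socket: int, tdp_per_socket_w: float) -> float:
    """Return rated TDP in watts per physical core for a whole server."""
    total_cores = sockets * cores_per_socket
    total_tdp_w = sockets * tdp_per_socket_w
    return total_tdp_w / total_cores

# Gen 9: dual-socket, 24 cores and 150 W per socket (300 W combined)
gen9 = tdp_per_core(sockets=2, cores_per_socket=24, tdp_per_socket_w=150)  # 6.25
# Gen X: single-socket EPYC 7642, 48 cores, 225 W
genx = tdp_per_core(sockets=1, cores_per_socket=48, tdp_per_socket_w=225)  # 4.6875

reduction = 1 - genx / gen9  # 0.25, i.e. 25% lower rated TDP per core

print(f"Gen 9: {gen9:.2f} W/core")
print(f"Gen X: {genx:.2f} W/core")
print(f"Reduction: {reduction:.0%}")
```

This only compares rated TDP, not measured power draw at the wall.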
 

maddie

Diamond Member
Jul 18, 2010
4,722
4,625
136






Gen 9 is 2x 24-core custom off-roadmap Intel CPUs at 1.9 GHz and 150 W each, which replaced Gen 8 with its 2x 12-core Intel Xeon Silver 4116 at 2.1 GHz and 85 W each.

So these are pretty straight replacements: dual-socket Intel 48-core (2x 24, 300 W) updated to single-socket 48-core EPYC (225 W).
Just free thinking here.

With the coronavirus almost certain to cause a world recession, will AMD be disproportionately positively affected, assuming product can be delivered? One aspect of recessions is companies drastically cutting costs. Business as usual and complacency will vanish.
 

moinmoin

Diamond Member
Jun 1, 2017
4,933
7,619
136
I personally don't think Covid-19 will cause an actual recession, just a "pause" in economic activity. The demand won't recede, it's just delayed. The actual challenge will be that companies and employees make it unscathed through the forced lockdown period so the economy can pick up as before after the break.

I consider AMD itself to be nimble enough that it should be able to weather the storm, but it remains to be seen if that's true for all of their business partners as well.
 

moinmoin

Diamond Member
Jun 1, 2017
4,933
7,619
136
Cloudflare, the only company that changes "optimal" CPU architecture more often than @VirtualLarry changes socks.

For fun and games read this.
They do write:
Readers of our blog might remember our excitement around ARM processors. We even ported the entirety of our software stack to run on ARM, just as it does with x86, and have been maintaining that ever since even though it calls for slightly more work for our software engineering teams. We did this leading up to the launch of Qualcomm’s Centriq server CPU, which eventually got shuttered. While none of the off-the-shelf ARM CPUs available this moment are interesting to us, we remain optimistic about high core count offerings launching in 2020 and beyond, and look forward to a day when our servers are a mix of x86 (Intel and AMD) and ARM.
It seems they do a good job leaving all possible options open.
 

ondma

Platinum Member
Mar 18, 2018
2,718
1,278
136
Just free thinking here.

With the coronavirus almost certain to cause a world recession, will AMD be disproportionately positively affected, assuming product can be delivered? One aspect of recessions is companies drastically cutting costs. Business as usual and complacency will vanish.
Actually, Intel has much more market clout and financial reserves to withstand a recession than AMD does, so my take is just the opposite.
 

ultimatebob

Lifer
Jul 1, 2001
25,135
2,445
126
I'm curious how much AMD stock the posters in this topic are holding right now. Combining that Cloudflare news story with a completely unrelated Coronavirus post REALLY makes it feel like a stock shill.

And, NO, I'm not holding any AMD or Intel stock personally.
 

Makaveli

Diamond Member
Feb 8, 2002
4,715
1,049
136
I'm curious how much AMD stock the posters in this topic are holding right now. Combining that Cloudflare news story with a completely unrelated Coronavirus post REALLY makes it feel like a stock shill.

And, NO, I'm not holding any AMD or Intel stock personally.

I don't see the correlation between the two.
 

maddie

Diamond Member
Jul 18, 2010
4,722
4,625
136
I'm curious how much AMD stock the posters in this topic are holding right now. Combining that Cloudflare news story with a completely unrelated Coronavirus post REALLY makes it feel like a stock shill.

And, NO, I'm not holding any AMD or Intel stock personally.
I made the "completely unrelated Coronavirus post". Good luck to anyone who thinks the economic landscape next year will look anything like it does today. Companies across the board are reducing expectations for this year. Remember this post in 6 months. If I'm wrong then have a good laugh.
 

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
25,478
14,434
136
Another blog post from them on the same subject (more upcoming):

It's obvious from their performance graphs. More processing power, less power consumption, and fewer sockets. There is virtually no metric on which they are not doing better. Impressive win for AMD and them.
 

prtskg

Senior member
Oct 26, 2015
261
94
101
I made the "completely unrelated Coronavirus post". Good luck to anyone who thinks the economic landscape next year will look anything like it does today. Companies across the board are reducing expectations for this year. Remember this post in 6 months. If I'm wrong then have a good laugh.
Economic activities are being affected, there's no doubt about it. Hope this virus gets under control soon.
 

beginner99

Diamond Member
Jun 2, 2009
5,208
1,580
136
The actual challenge will be that companies and employees make it unscathed through the forced lockdown period
Economic activities are being affected, there's no doubt about it. Hope this virus gets under control soon.

It's already out of control. A virus that spreads from people without symptoms is almost impossible to control. The only way would have been an immediate global quarantine, stopping all flights and any other long-distance travel. Plus, China should still be in lockdown until there are no sick people anymore, but that was not economically viable.

The economy will win: there is no chance the needed complete travel shutdown / quarantine will be imposed for at least 2-3 months, globally synced. It's just too expensive and too complex. This means the virus will spread globally.

However, since we now know more about the virus than just 2-3 weeks ago, the impact on the workforce will not be any worse than the normal flu. It only really kills old and sick people at a higher rate than the flu. Probably why the ban was lifted in China. All in all this is pure media panic, and we should stop trying to control it and just accept this as a "fact of life": people get sick and people die.

Any control or blockage effort at this point is wasted. The virus has almost certainly already spread far further than is known. Why, you may ask? All newly infected (like in Italy) are old people. It's clear why: only they have symptoms strong enough to actually see a doctor and get tested. For every old person diagnosed there are probably >10 younger people with milder or no symptoms. Which means the virus will spread in Italy and ultimately Europe.
 

BigDaveX

Senior member
Jun 12, 2014
440
216
116
It's already out of control. A virus that spreads from people without symptoms is almost impossible to control. The only way would have been an immediate global quarantine, stopping all flights and any other long-distance travel. Plus, China should still be in lockdown until there are no sick people anymore, but that was not economically viable.

The economy will win: there is no chance the needed complete travel shutdown / quarantine will be imposed for at least 2-3 months, globally synced. It's just too expensive and too complex. This means the virus will spread globally.

However, since we now know more about the virus than just 2-3 weeks ago, the impact on the workforce will not be any worse than the normal flu. It only really kills old and sick people at a higher rate than the flu. Probably why the ban was lifted in China. All in all this is pure media panic, and we should stop trying to control it and just accept this as a "fact of life": people get sick and people die.

Any control or blockage effort at this point is wasted. The virus has almost certainly already spread far further than is known. Why, you may ask? All newly infected (like in Italy) are old people. It's clear why: only they have symptoms strong enough to actually see a doctor and get tested. For every old person diagnosed there are probably >10 younger people with milder or no symptoms. Which means the virus will spread in Italy and ultimately Europe.
Yeah, SARS actually caused more severe symptoms than this virus - and to further put things into perspective, the SARS scare was mocked at the time on South Park by having someone infected with the virus tearfully telling his son that he had only a 98% chance of survival.

Getting back to the topic at hand, definitely a big win for AMD. Returning to anything like the market share figures that Opteron used to have in the mid-2000s is likely going to be a long, slow process, but this is the kind of design win that helped Opteron achieve those figures back in the day.
 

coercitiv

Diamond Member
Jan 24, 2014
6,151
11,674
136
However, since we now know more about the virus than just 2-3 weeks ago, the impact on the workforce will not be any worse than the normal flu. It only really kills old and sick people at a higher rate than the flu. Probably why the ban was lifted in China. All in all this is pure media panic, and we should stop trying to control it and just accept this as a "fact of life": people get sick and people die.

Any control or blockage effort at this point is wasted. The virus has almost certainly already spread far further than is known. Why, you may ask? All newly infected (like in Italy) are old people. It's clear why: only they have symptoms strong enough to actually see a doctor and get tested. For every old person diagnosed there are probably >10 younger people with milder or no symptoms. Which means the virus will spread in Italy and ultimately Europe.
There's so much wrong in this post that I don't know where to begin: true and false facts are combined to reach a conclusion that has little to do with reality in contaminated areas.

It's exactly because this virus can spread exponentially faster than the flu that we should not abandon any efforts to contain it. We do not enforce quarantine to stop the virus (which was already considered impossible a while ago), we do it to keep the number of new cases as close to manageable levels as possible. The real crisis begins once hospitals are overwhelmed. For context, in China the critical cases were around double the death count (5% vs 2.3%). If left unchecked this virus has the potential to wreak havoc through highly populated urban areas. Even if we go by what we know about China (and there's probably lots we don't know), they are still enforcing extraordinary measures to contain the virus. Travel is still carefully monitored, people are being continuously tracked.

Personally I would advise everybody to stop giving verdicts on the severity of this disease until we see how we're able to deal with it here in Europe. For the very first time you will get real data for large urban areas, not censored information from China or skewed numbers from a cruise ship. It will also be the first time we observe it as it spreads through a high-density population with prior knowledge of its existence. Wuhan was oblivious in the most critical stage of the outbreak.
 

moinmoin

Diamond Member
Jun 1, 2017
4,933
7,619
136
Yet another blog post from them on the same subject, this time with focus on the huge impact Epyc's huge L3 cache has:

As well as one on performance tuning:
 

maddie

Diamond Member
Jul 18, 2010
4,722
4,625
136
Yet another blog post from them on the same subject, this time with focus on the huge impact Epyc's huge L3 cache has:

As well as one on performance tuning:
Cloudflare must have been one of those evaluating Rome for quite a while before this rollout. The reduction in main memory reads for their workload is very significant (39% to 14%). Is this why we might be seeing even more L3 being made available for some Milan models? I know it's not a linear relationship, but they might get their levels to below 10% in the next gen.
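
Putting rough numbers on that, taking the 39% and 14% figures at face value (a back-of-the-envelope sketch; the variable names are mine and the 10% target is just the guess from this post, nothing measured):

```python
# Relative change in the share of main memory reads, using the two
# percentages quoted in this post. All names here are illustrative.

before = 0.39   # share of main memory reads before (as quoted)
after = 0.14    # share after the switch (as quoted)
target = 0.10   # hypothetical next-gen level mentioned in this post

absolute_drop = before - after                 # in percentage points / 100
relative_drop = (before - after) / before      # fraction of the old level removed
further_needed = (after - target) / after      # extra relative cut to reach target

print(f"Absolute drop: {absolute_drop * 100:.0f} percentage points")
print(f"Relative drop: {relative_drop:.1%}")
print(f"Further relative cut needed for <10%: {further_needed:.1%}")
```

So the share drops by roughly two thirds, and getting under 10% would need a bit under a third more off the current level.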
 

moinmoin

Diamond Member
Jun 1, 2017
4,933
7,619
136
Cloudflare must have been one of those evaluating Rome for quite a while before this rollout. The reduction in main memory reads for their workload is very significant (39% to 14%). Is this why we might be seeing even more L3 being made available for some Milan models? I know it's not a linear relationship, but they might get their levels to below 10% in the next gen.
I honestly can't imagine L3 cache increasing much in Zen 3 on N7+, if at all. I'd expect that to happen again with N5 due to the more significant density increase that node affords (same with a further increase in core count). But since Zen 3 will improve the cache handling ("unified L3" etc.) it may still show an improvement.

Since Cloudflare relies on 48-core Epycs mainly for the better core-to-L3-cache ratio, it would be interesting to know whether the newly announced 32-core Epyc 7532, which still has the full 256 MB L3$, would show further improvements (it has 50% more L3$ per core after all).
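
A quick sanity check of that 50% figure, assuming both parts carry the full 256 MB of L3 as stated in this thread (the helper name is hypothetical):

```python
# L3 cache per core for the two Epyc parts discussed in this thread.
# l3_per_core is an illustrative helper, not a vendor tool or API.

L3_MB = 256  # total L3 per socket, as quoted for both parts

def l3_per_core(cores: int, l3_mb: float = L3_MB) -> float:
    """Return L3 cache in MB available per physical core."""
    return l3_mb / cores

epyc_7642 = l3_per_core(48)  # about 5.33 MB per core
epyc_7532 = l3_per_core(32)  # 8 MB per core

gain = epyc_7532 / epyc_7642 - 1  # about 0.5, i.e. 50% more L3 per core

print(f"Epyc 7642 (48 cores): {epyc_7642:.2f} MB/core")
print(f"Epyc 7532 (32 cores): {epyc_7532:.2f} MB/core")
print(f"Gain: {gain:.0%}")
```

Whether the extra cache per core outweighs losing a third of the cores is a separate question that depends on the workload.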
 

moinmoin

Diamond Member
Jun 1, 2017
4,933
7,619
136
Yet another blog post from them on the same subject. This time Cloudflare finds out Epyc supports encrypting RAM and benchmarks it, with surprisingly little performance impact (mean 3.7%, expected >5%):
 