Info: 64 MB V-Cache on Zen 3 5XXX Averages +15% in Games


Kedas

Senior member
Dec 6, 2018
Well, we know now how they will bridge the long wait for Zen 4 on AM5 in Q4 2022.
Production start for V-Cache is at the end of this year, too early for Zen 4, so this is certainly coming to AM4.
The +15%, Lisa said, is "like an entire architectural generation".
 

Hitman928

Diamond Member
Apr 15, 2012
The current generation of interconnect has been a challenge to route through the substrate with 8 chiplets. I don't know if it can be extended any further in its current form, since Genoa will not only require an extra 4 to 8 chiplets, the bandwidth should ideally double again.

AMD will have to switch to completely new technology to go forward.

I wasn't talking about going forward, I was saying they easily could have grown the CCDs to "throw more silicon at it", but they didn't. Obviously there is always a balance to these things; you don't just make the biggest, best design you can no matter the cost.


Cutress has a video about the cost of a 7nm wafer, and in the middle there is a table showing the cost of different technologies. 12/14nm was within 20% of the cost of 7nm.

I'll have to go find that table, but based upon my own experience having worked on said processes, this is wrong: 7 nm wafers are more expensive than 12 nm + 20%.

Going from a logic die to an N6/N7 SRAM die, the cost should be drastically lower than an N7 logic die, due to fewer metal layers and the likely use of EUV resulting in fewer processing steps.

7nm doesn't use EUV. They may require fewer layers in the cache die, but the top layers are typically the least complex parts of the process, so I have my doubts that this would be a real cost savings.


The question is: can one bad bond kill the entire chip or just 1 layer?
Most likely, it is only a single layer, so the yield is not going to be a big consideration.

Depends on how AMD has designed the base layer and the stacks, but if one layer is bonded incorrectly, it's not just that layer but everything above it that is now broken. So if the bad bond happens at the first stack, that's it: no stacked cache for that chip. Without knowing the stack yields and costs, how can you say with any confidence what AMD should do?
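To put rough numbers on that failure mode, here is a minimal Python sketch, assuming each bond step succeeds independently with some per-bond yield. The 99% figure is purely an illustrative assumption, not a known number for SoIC.

def stack_yield(bond_yield: float, layers: int) -> float:
    """Probability that all bonds in an n-layer stack succeed,
    given that one bad bond scraps the whole stack."""
    return bond_yield ** layers

for layers in (1, 2, 4, 8):
    print(f"{layers} layer(s) at 99% per bond: {stack_yield(0.99, layers):.1%}")

Even at an optimistic 99% per bond, an 8-hi stack drops to about 92%, which is why stacking yield matters more as the stack grows.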

BTW, it is funny that people keep mentioning 1 stack. 1 stack is a proof of concept, but AMD and TSMC are clearly not doing this for one stack.

Of course not, but for the first product in the consumer space? Seems pretty reasonable to me.


That was my point. Intel had a poorly yielding process, and their monolithic-only architecture made things worse for larger dies. So throwing more silicon at it moves things in the wrong direction, exponentially.

But if the point is to have the top SKU no matter the cost, what does it matter? AMD adding stacks of cache will increase cost through more 7 nm silicon, stacking, and higher die processing costs, as well as lowered yields, and yet you think they should do it just to retain the top gaming SKU. You're not being consistent here.

AMD starts with a very high yielding process and expands with even higher yielding dies.

I honestly don't know what you are saying here.


At this point, a competent manager would ask you: Is it a material difference?

Most people in the know consider it to be a breakthrough technology; people NOT in the know are talking about the yield difference between 1 and 2 stacks.

The reason SRAM is known for extremely high yields is the ease of implementation of redundancies.

We're not talking about yields of the SRAM dies themselves, but yields of the stacking process on top of all the other yields, as well as the costs of the extra die processing, stacking, and assembly. All your arguments seem to be based around the idea that yields are extremely high and these costs are extremely low, so why not? But you have no idea what the costs or yields are.


I think there is another limiting factor - manpower. There were layoffs, and AMD has been staffing back up.

It would be opportune to have, say, a shrink of Zen 3 to N5, but AMD, with limited staff, has to be more methodical about which baskets to put their eggs in.

BTW, AMD is doing risk production of SoIC with TSMC. So this one looked like a good bet to AMD leadership.

SoIC won't be in "risk production" when Zen 3D launches; that's the whole point of them waiting until the end of the year to start production. The manpower comment is just a non sequitur.



A/B testing. Through which NVidia found out it can charge $1,500 for a market leading consumer card.

So Nvidia launched a failure at $3000 just so they could find out they could launch one at $1500? I don't even know what to say to this absurd suggestion. . .



Intel used to charge $1000. And the 5950X is not that far from $1000.

I don't know where the limit is, but Zen 3D seems to be the ideal product to test the waters, since it can go as far as 8 stacks high.

So your suggestion is AMD should launch a Ryzen product with an 8 hi stack at what, $2000, $3000? And if it fails, well, then they'll know they can't do that?


There may be caveats, but after all the caveats, Alder Lake will be the best performing desktop CPU when you take a cross-section of benchmarks. And Intel will take the claim of best performing CPU from AMD.
(In the absence of Zen 3D.)

You said destroys; there's a huge difference between destroys and best performing across a cross-section of benchmarks. Pick one or the other.


About AMD being able to add performance in $6 increments until it beats ADL?

The $6 is an extremely conservative estimate for the cost of a 36mm2 SRAM die. I think Cutress floated this one as well. It is more likely half that, and $6 including assembly.

~$6 is probably a pretty good estimate for the cache die itself, but it won't be that cost when taking processing cost, stacking cost, stacking yield, and assembly into account. Not a chance.

Edit: since you keep using Ian as a reference, here's him saying that stacking technology is typically "expensive" (4:50 mark if the timestamp doesn't work). That doesn't vibe really well with your $6-per-layer theory.



It would be quite unimportant if you had a disagreement with a random poster on the Internet. But you are not seeing eye to eye with AMD's CEO about how to run their business:


What she said:
Lisa Su said:
I have extremely high standards. I really love to win and winning is having great people, empowering them, making sure that we're very clear and know what our priorities are and very direct in how we speak to each other. We're best as a company when we have that heritage of being a maverick and bucking the status quo.

What she didn't say:
We will have the highest performing parts in every segment at all costs, profits be darned.
 

moinmoin

Diamond Member
Jun 1, 2017
If the US is trying to attract others to build fabs in the US, like TSMC, it can't very well exclude Intel from the same subsidies. After all, Intel has fabs in other countries and if they get subsidies from them they would simply build fabs overseas instead of in the US. Multinationals like Intel don't care about the strategic needs of the US, they care only about what is most profitable.

I agree that subsidies targeted at Intel or that would only be available to Intel may not be wise. But if you want TSMC and Samsung to invest in US based fabs, it is pretty hard to write laws that would exclude Intel from those same subsidies.
Of course you shouldn't discriminate against Intel. But I sure hope the subsidies are combined with rules that both inhibit stock buybacks/dividends during the subsidising as well as set clear schedules with contractual penalties when those are missed.

Productized means produced?
Productized means turning something into a product that can be produced and sold. In R&D there is a lot of stuff that's unfortunately never productized.

Curious, since there is only one year or so left before Zen 4 is to be released; that would make for a 6-month production window.
There have been mentions of Zen 4 going to datacenters first, with the consumer version coming later. If one thinks of the DDR5 market size having to increase over time first before it's really suitable for the consumer mass market, letting the competition as well as datacenters have a go at it first may be a pretty good strategic move.
 

Vattila

Senior member
Oct 22, 2004
The reason SRAM is known for extremely high yields is the ease of implementation of redundancies.

On redundancy and yield — through-silicon-via (TSV) interconnects may also apply redundancy to improve yield:

TSV Redundancy: Architecture and Design Issues in 3D IC

"3D technology provides many benefits including high density, high band-with, low-power, and small form-factor. Through Silicon Via (TSV), which provides communication links for dies in vertical direction, is a critical design issue in 3D integration. Just like other components, the fabrication and bonding of TSVs can fail. A failed TSV may cause a number of known-good-dies that are stacked together to be discarded. This can severely increase the cost and decrease the yield as the number of dies to be stacked increases. A redundant TSV architecture with reasonable cost for ASICs is proposed in this paper. Design issues including recovery rate and timing problem are addressed. Based on probabilistic models, some interesting findings are reported. First, the probability that three or more TSVs are failed in a tier is less than 0.002%. Assumption of that there are at most two failed TSVs in a tier is sufficient to cover 99.998% of all possible faulty free and faulty cases. Next, with one redundant TSV allocated to one TSV block, limiting the number of TSVs in each TSV block to be no greater than 50 and 25 leads to 90% and 95% recovery rates when 2 failed TSVs are assumed. Finally, analysis on overall yield shows that the proposed design can successfully recover most of the failed chips and increase the yield of TSV bonding to 99.99%. This can effectively reduce the cost of manufacturing 3D ICs."


Architecture of Ring-based Redundant TSV for Clustered Faults

"Three-dimensional Integrated Circuits (3D-ICs) that employ the Through-Silicon Vias (TSVs) vertically stacking multiple dies provide many benefits, such as high density, high bandwidth, low-power. However, the fabrication and bonding of TSVs may fail because of many factors, such as the winding level of the thinned wafers, the surface roughness and cleanness of silicon dies, and bonding technology. To improve the yield of 3D-ICs, many redundant TSV architectures were proposed to repair 3D-ICs with faulty TSVs. These methods reroute signals of faulty TSVs to other regular or redundant TSVs. In practice, the faulty TSVs may cluster because of imperfect bonding technology. To resolve the problem of clustered TSV faults, router-based redundant TSV architecture was the first paper proposed to pay attention to this clustering problem. Their method enables faulty TSVs to be repaired by redundant TSVs that are farther apart. However, for some rarely occurring defective patterns, their method consumes too much area. In this paper, we propose a ring-based redundant TSV architecture to utilize the area more efficiently as well as to maintain high yield. Simulation results show that for a given number of TSVs (8 × 8) and TSV failure rate (1%), our design achieves 54% area reduction of MUXes per signal, while the yield of our ring-based redundant TSV architectures can still maintain 98.47% to 99.00% as compared with router-based design. Furthermore, the minimum shifting length of our ring-based redundant TSV architecture is at most 1 which guarantees the minimum timing overhead of each signal."

 

Doug S

Diamond Member
Feb 8, 2020
Of course you shouldn't discriminate against Intel. But I sure hope the subsidies are combined with rules that both inhibit stock buybacks/dividends during the subsidising as well as set clear schedules with contractual penalties when those are missed.

So Intel just piles up the cash for now and does buybacks once they are able. They and many other companies held cash overseas for years for financial engineering reasons; now they'd be holding it in the US for the same reason, but the end result would be the same. Other than making you feel better, I guess, what would you be hoping to accomplish with that?
 

Mopetar

Diamond Member
Jan 31, 2011
But if AMD wants to become the market leader for x86 processors they need to cement the perception that they are the market leader. Letting the competition up for air when you don't need to weakens the narrative.

If AMD wants to become the market leader for x86 processors they would need to massively expand the number of wafers they buy from TSMC. Otherwise it doesn't matter how good their processors are because Intel has far more volume, which is readily apparent if you look at the revenue for both companies.

If you think that simply having the best processor is going to topple Intel, you really need to reevaluate the market. AMD could have the best processor for the next 3 generations and it wouldn't even come close to toppling Intel. Even when AMD had Bulldozer, which was hopelessly behind, they didn't die (as close as they may have come), and even if AMD has a better product I don't see Intel becoming anywhere near as hopelessly behind.

Believing that you need to spend resources pointlessly one-upping the competition at every turn is how you exhaust yourself and do far more damage than the competition ever could.
 

Joe NYC

Diamond Member
Jun 26, 2021
I wasn't talking about going forward, I was saying they easily could have grown the CCDs to "throw more silicon at it", but they didn't. Obviously there is always a balance to these things; you don't just make the biggest, best design you can no matter the cost.

But that's a different scenario: throwing more silicon at the problem when you are already winning by a lot. You don't have to throw more silicon.

The scenario under discussion is adding very inexpensive silicon to clinch a win that is within grasp. That is when you add silicon.

I'll have to go find that table, but based upon my own experience having worked on said processes, this is wrong: 7 nm wafers are more expensive than 12 nm + 20%.


I went back to capture the frame with the table and it looks like 69% of N7.

7nm doesn't use EUV. They may require fewer layers in the cache die, but the top layers are typically the least complex parts of the process, so I have my doubts that this would be a real cost savings.

1. SRAM will most likely be on N6, which uses EUV
2. Noted that you have "doubts" about SRAM being cheaper to produce than the high-performance, probably most complex, logic design of Zen 3 ;)

Depends on how AMD has designed the base layer and the stacks, but if one layer is bonded incorrectly, it's not just that layer but everything above it that is now broken. So if the bad bond happens at the first stack, that's it: no stacked cache for that chip. Without knowing the stack yields and costs, how can you say with any confidence what AMD should do?

No, if you have an unlimited number of TSVs, then layers don't share TSVs. If one column is broken, it affects only one layer. That's how I would design it.

And if the L3 has enough redundancy, the single section of die affected by the broken column may be replaced by a spare one.

The manpower comment is just a non sequitur.

I don't think so. Any new SoC, or a node shrink, needs manpower and a whole bunch of overhead cost on top of that.

On one hand you are arguing against AMD releasing an extra Zen 3D SKU with a different number of stacked dies, and in the next sentence you say AMD should be doing tape-outs on new process nodes, which is orders of magnitude more complex and costly.

So Nvidia launched a failure at $3000 just so they could find out they could launch one at $1500? I don't even know what to say to this absurd suggestion. . .

Nvidia found the top at $1,500 and is making a killing. Nothing absurd about it.

So your suggestion is AMD should launch a Ryzen product with an 8 hi stack at what, $2000, $3000? And if it fails, well, then they'll know they can't do that?

Just taking a stab here. Say the cost of 8 stacks is $50 and AMD marks it up 4x: 4 x $50 = $200 on top of the base cost of a 1-CCD, 8-core die. So a 6800X with 0 stacks is, say, $399; a 6880X with 8 stacks would be $599; 4 stacks would be $499.

The 2-CCD, 16-core part would be in the $1,000 price range.
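Writing that arithmetic out explicitly (every figure below is the same guesswork as above, hypothetical SKUs included, not AMD pricing):

COST_PER_STACK = 50 / 8   # assumed: $50 for 8 stacks, so $6.25 each
MARKUP = 4                # assumed 4x markup on the added cost
BASE_PRICE = 399          # hypothetical 1-CCD, 8-core SKU with 0 stacks

for stacks in (0, 4, 8):
    price = BASE_PRICE + MARKUP * COST_PER_STACK * stacks
    print(f"{stacks} stacks: ${price:.0f}")   # $399, $499, $599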

You said destroys, there's a huge difference between destroys and best performing across a cross section of benchmarks. Pick one or the other.

Being able to reasonably claim the performance crown.

That is a good definition.

~$6 is probably a pretty good estimate for the cache die itself, but it won't be that cost when taking processing cost, stacking cost, stacking yield, and assembly into account. Not a chance.

Cutress also (I think in the same video I referenced above) estimated the cost of the substrate and assembly of the entire Ryzen package to be $20.

With that in mind, I would estimate the assembly cost to be in the $2 range, and the die itself probably in the $4-6 range.

What she didn't say:
We will have the highest performing parts in every segment at all costs, profits be darned.

The proposition that highest performing Zen 3D would be unprofitable exists only in your mind. In the real world, the highest performing Zen 3D will be very profitable.
 

HurleyBird

Platinum Member
Apr 22, 2003
If you think that simply having the best processor is going to topple Intel, you really need to reevaluate the market.

Of course not. Mindshare increases demand and lets you charge more for your products. You leverage that to increase supply and build better products. If you can stay on top long and consistently enough, this becomes a significant positive feedback loop. If not, then you lose a lot of momentum whenever there's a break. Intel has the benefit of being a far larger company. They could lose 2 out of every 3 generations moving forward and maintain market leadership pretty much indefinitely. But eventually, if they consistently lose, they will fall. In an alternate universe where AMD one-upped Conroe and maintained technological leadership all the way from A64 to today, they would now be the larger company.
 

moinmoin

Diamond Member
Jun 1, 2017
So Intel just piles up the cash for now and does buybacks once they are able. They and many other companies held onto cash for years overseas for financial engineering reasons, now they'd be holding it in the US for the same reason but the end result would be the same. Other than making you feel better I guess, what good would be hoping to accomplish with that?
Sounds like a charming business to do business with, the one you describe there.

I have no feelings either way, as I'm not a taxpayer affected by this. However, I do think this is a serious issue that should ideally be tackled. Maybe you are able to write down a suggestion for how to solve it, instead of assembling what I dislike about the industry? ;)
 

Hitman928

Diamond Member
Apr 15, 2012
But that's a different scenario: throwing more silicon at the problem when you are already winning by a lot. You don't have to throw more silicon.

The scenario under discussion is adding very inexpensive silicon to clinch a win that is within grasp. That is when you add silicon.




I went back to capture the frame with the table and it looks like 69% of N7.

So we went from 20% more expensive to 43.5% more expensive. This is also TSMC 14/16 nm, not GF 12/14 nm, which I think everyone knows is in far less demand than TSMC's 14/16 nm, so the gap will widen. And I can tell you the gap is wider, from personal experience, though I can't get into details. What's clear is that your initial 20% assumption was way off.

1. SRAM will most likely be on N6, which uses EUV

AMD already said they are using 7nm for the cache.

2. Noted that you have "doubts" about SRAM being cheaper to produce than the high-performance, probably most complex, logic design of Zen 3 ;)

I said I have doubts that using fewer metal layers accounts for a significant cost reduction. Also, how complex or simple the logic design is doesn't add to wafer cost; a wafer is a wafer, whether the design on it is simple or complex. Complexity matters for yield, but that's not what I was talking about.

No, if you have an unlimited number of TSVs, then layers don't share TSVs. If one column is broken, it affects only one layer. That's how I would design it.

And if the L3 has enough redundancy, the single section of die affected by the broken column may be replaced by a spare one.

TSVs are rather large structures compared to the digital logic. You don't just get to use however many you want and spread them out. I'll wait to comment further until I see what AMD presents at Hot Chips regarding their stacking method.


I don't think so. Any new SoC, or a node shrink, needs manpower and a whole bunch of overhead cost on top of that.

On one hand you are arguing against AMD releasing an extra Zen 3D SKU with a different number of stacked dies, and in the next sentence you say AMD should be doing tape-outs on new process nodes, which is orders of magnitude more complex and costly.

No, I was using your logic, which you continue to push, that costs are unimportant as long as you maintain leadership. If you agree that this isn't true, then you need to analyze at what point it becomes too costly to keep trying, which I've asked for repeatedly, but no one arguing that point seems to want to answer.


Nvidia found the top at $1,500 and is making a killing. Nothing absurd about it.

1) Nvidia had $1200 Titans before that which sold well; I don't think they had to have a failure at $3000 to figure out that $1500 might be worth it.
2) Again, you're ignoring the unique market demand that determined the 3090's success. Going by eBay prices, Nvidia could have offered the 3090 at $3000 and it still would have sold well, even though their $3000 card failed miserably before, because the market is different now: miners are coming in and paying ridiculous amounts for almost any card they can get. This is only a temporary market, though, so it's not a good case study for what a company should do in a normal market.


Just taking a stab here. Say the cost of 8 stacks is $50 and AMD marks it up 4x: 4 x $50 = $200 on top of the base cost of a 1-CCD, 8-core die. So a 6800X with 0 stacks is, say, $399; a 6880X with 8 stacks would be $599; 4 stacks would be $499.

The 2-CCD, 16-core part would be in the $1,000 price range.

8 stacks will not be $50; that would barely cover the cost of the SRAM cache alone, and it doesn't include processing, stacking, the extra thermal substrate, or package and assembly.



Being able to reasonably claim the performance crown.

That is a good definition.

That's fine. I think you are on an island in your definition of destroys, but as long as you make your definition clear.

Cutress also (I think in the same video I referenced above) estimated the cost of the substrate and assembly of the entire Ryzen package to be $20.

With that in mind, I would estimate the assembly cost to be in the $2 range, and the die itself probably in the $4-6 range.

Package and assembly of a Ryzen CPU does not have any costs for wafer thinning or die stacking, or any assembly adjustments that may be required for these processors. All the added costs for an 8-hi cache stack Ryzen will come to more than $50, and not by an insignificant amount.


The proposition that highest performing Zen 3D would be unprofitable exists only in your mind. In the real world, the highest performing Zen 3D will be very profitable.

I don't want to keep going back and forth so I will just sum it up the way I see it.

I don't know the full cost of adding 1 stack of cache, but I have an idea due to personal experience. Without full knowledge of all the costs and the yield AMD expects from stacking, and taking hints from what AMD has revealed, I can only say that I don't have much confidence in AMD going 4+ hi stacks on Ryzen 3D, due to both the cost of doing so and diminishing returns in performance. I do know that AMD knows these costs, and if they think it is worth it from a cost and opportunity cost perspective, then we'll see those SKUs and I'll be happy to be wrong.

What you seem to be saying is that, despite not really knowing any of the costs associated with it, you will assume it's fairly cheap, and that without knowing how much performance improvement they can get by doing so, AMD should go 4+ stacks high in hopes of topping ADL in gaming performance, so that they can hold on to the mindshare of gamers, which has an unspecified amount of value to AMD's bottom line. Maybe I'm being a little unfair in representing your arguments, but I haven't seen much in terms of real analysis in all your posts, and what has been there uses assumptions that are either just wrong or unknowable at this point, making any confidence in the analysis unreasonable.
 

Joe NYC

Diamond Member
Jun 26, 2021
On redundancy and yield — through-silicon-via (TSV) interconnects may also apply redundancy to improve yield:

TSV Redundancy: Architecture and Design Issues in 3D IC

Architecture of Ring-based Redundant TSV for Clustered Faults

Nice links, but I don't think this would even be necessary.

Suppose the SRAM die has 70 x 1 MB blocks.

Suppose one block of TSVs points to one of the 1 MB blocks, and that block of TSVs is used by only one layer (the TSVs may go through all layers, but only one layer uses them).

So if 1% of the 70 TSV blocks fail, you lose 1.
Say you lose another 1% = 1 block to defects. So that's 2 down, leaving 68.

But all you needed was 64, so the first layer gives you 4 spares before looking into the next layer.

It does not mean that each layer has to produce 64 MB. If the 1st layer produced 68 blocks, the next layer needs to produce just 60.

So it may be easier just to use redundant SRAM blocks, without even trying to achieve redundancy at the next level through redundant TSVs.

At the risk of several posters having heart attacks reading the next sentence, I am going to say it:
V-Cache will have near 100% yield.
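That claim can be sanity-checked with the block arithmetic above. Here is a small Python sketch that treats each 1 MB block as an independent pass/fail with an assumed 98% success rate (matching the 2% loss sketched above) and lets spares pool across layers; the numbers are illustrative, not TSMC data.

import math

def p_enough_blocks(layers: int, blocks_per_layer: int = 70,
                    needed_per_layer: int = 64,
                    p_block_good: float = 0.98) -> float:
    """P(the stack yields at least 64 usable MB per layer),
    with good blocks pooled across all layers."""
    n = layers * blocks_per_layer       # total 1 MB blocks in the stack
    need = layers * needed_per_layer    # blocks required for full capacity
    return sum(math.comb(n, k) * p_block_good**k * (1 - p_block_good)**(n - k)
               for k in range(need, n + 1))

for layers in (1, 2, 4, 8):
    print(f"{layers} layer(s): {p_enough_blocks(layers):.4%}")

With those assumptions every configuration lands at effectively 100%, which is the shape of the argument; whether the real per-block and bonding yields are anywhere near 98% is exactly what the thread is disputing.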
 

Joe NYC

Diamond Member
Jun 26, 2021
So we went from 20% more expensive to 43.5% more expensive

I was going by memory, which didn't retain more than 1 significant digit, which was 4 and 5.

AMD already said they are using 7nm for the cache.

Yes, that's what was said during the demo. We will see if it changes or stays that way.

I said I have doubts that using fewer metal layers accounts for a significant cost reduction.

So we are at "significant"

TSVs are rather large structures compared to the digital logic. You don't just get to use however many you want and spread them out. I'll wait to comment further until I see what AMD presents at Hot Chips regarding their stacking method.

Someone counted 24,000 TSVs on the Zen 3 die.

No, I was using your logic that you continue to push that costs are unimportant as long as you maintain leadership. If you agree that this isn't true then you need to analyze at what point does it become too costly to try to do so which I've asked for repeatedly but no one arguing that point seems to want to answer.

No, I quite clearly specified linear cost increases in $6 increments.
The first of those, according to the heading of this thread, adds 15% in gaming performance.
This cost is trivial up to ~4 layers, and nothing compared to the value of maintaining the performance lead.

Your proposition that adding stacks of V-Cache will lead to a loss is quite absurd.

8 stacks will not be $50, that would barely cover the cost of the SRAM cache

I gave you my estimate. What is your estimate of 1 stack and 8 stacks?

What you seem to be saying is that despite not really knowing any of the costs associated with it, you will assume it's fairly cheap and that without knowing how much performance improvement they can get by doing so, AMD should go 4+ stacks hi in hopes of topping ADL in gaming performance so that they can make sure they hold on to the mind share of gamers because it has an unspecified amount of value to AMD's bottom line.

I gave you my cost estimate. For a performance estimate, I would go with every L3 doubling adding a similar performance increment in similar applications. Rounding down to 10%:
1 stack: +10% = 110%
2 stacks: +10% = 121%
4 stacks: +10% = 133%
8 stacks: +10% = 146%

One of these should be enough to topple the highest end ADL gaming SKUs that Intel is releasing this year. I am going to guess that 4 stacks should be about right.
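For reference, the compounding above works out as +10% for the first stack and +10% for each further doubling of the stack count; a short Python loop makes the arithmetic explicit (the +10% per doubling is the assumption stated above, not a measured scaling):

import math

for stacks in (1, 2, 4, 8):
    gain = 1.10 ** (1 + math.log2(stacks))   # +10% per L3 doubling, compounded
    print(f"{stacks} stack(s): {gain:.0%} of baseline")   # 110%, 121%, 133%, 146%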

Intel is releasing a top gaming SKU ahead of all the boring ones to gain mindshare of gamers.

I think Intel, like me, also disagrees with your assessment that gaining mindshare is not significant, not worthwhile.

Maybe I'm being a little unfair in representing your arguments, but I haven't seen much in terms of real analysis in all your posts and what has been there uses assumptions that are either just wrong or unknowable at this point as to make any confidence in the analysis to be unreasonable.

I posted my assumptions, and I posted more support for them than anyone else on this thread, and I posted where my assumptions lead me.

It's as good a baseline of a conversation as you can get.
 

Hitman928

Diamond Member
Apr 15, 2012
So we are at "significant"

Yes, as I said in the original statement, I did not think it would lead to any real cost savings. "Real" in this context means significant or appreciable.


No, I quite clearly specified linear cost increases in $6 increments.

And as I said, it's not $6 per layer; around that price will just cover the cache die production itself, and it does not include any of the other costs associated with the stacking, which, as Ian said, is typically expensive.

You can check the die cost yourself: https://caly-technologies.com/die-yield-calculator/. Note that I know for a fact, and Ian alluded to this in the video, that the 7 nm wafer cost is higher than in the table you posted.
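As a rough illustration of what such a calculator does, here is the simple Poisson yield model in Python. The wafer price and defect density below are placeholder assumptions for the sake of the example, not TSMC figures.

import math

WAFER_COST = 9000.0      # assumed 7 nm wafer price in USD
WAFER_DIAMETER = 300.0   # mm
DIE_AREA = 36.0          # mm^2, the V-Cache die size discussed above
DEFECT_DENSITY = 0.0009  # defects per mm^2 (0.09 per cm^2, assumed)

# Gross dies per wafer, with a common edge-loss correction term.
gross = (math.pi * (WAFER_DIAMETER / 2) ** 2 / DIE_AREA
         - math.pi * WAFER_DIAMETER / math.sqrt(2 * DIE_AREA))
yield_rate = math.exp(-DEFECT_DENSITY * DIE_AREA)   # Poisson yield model
cost_per_good_die = WAFER_COST / (gross * yield_rate)

print(f"~{gross:.0f} gross dies, {yield_rate:.1%} yield, "
      f"${cost_per_good_die:.2f} per good die")

With these placeholder inputs a 36 mm2 die comes out around $5, in the same ballpark as the figures being argued over; the disagreement in the thread is about everything that happens after the die is made.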


Your proposition that adding stacks of V-Cache will lead to a loss is quite absurd.

I never said that; I said that I don't think going 4+ stacks on Ryzen will be worth the cost/opportunity cost.


I gave you my estimate. What is your estimate of 1 stack and 8 stacks?

I never gave a hard estimate, as there are too many unknowns; I just expect 4+ stacks to be too expensive to make it worthwhile for AMD to put into the consumer market. An 8-hi stack will definitely be higher than $50, though, because, again, that would just cover the cache die production itself.

I gave you my cost estimate. For a performance estimate, I would go with every L3 doubling adding a similar performance increment in similar applications. Rounding down to 10%:
1 stack: +10% = 110%
2 stacks: +10% = 121%
4 stacks: +10% = 133%
8 stacks: +10% = 146%

One of these should be enough to topple the highest end ADL gaming SKUs that Intel is releasing this year. I am going to guess that 4 stacks should be about right.

Intel is releasing a top gaming SKU ahead of all the boring ones to gain mindshare of gamers.

I think Intel, like me, also disagrees with your assessment that gaining mindshare is not significant, not worthwhile.



I posted my assumptions, and I posted more support for them than anyone else on this thread, and I posted where my assumptions lead me.

It's as good a baseline of a conversation as you can get.

Your assumptions were based on faulty evidence, which I've pointed out multiple times; you just keep ignoring it. Beyond that, I'm done debating this, and this will be my last response. You are free to feel as confident as you want in your analysis, and it would be really cool if you were right, as AMD would be providing massive performance uplifts for very little money. However, I'm certainly not going to be surprised or disappointed if it doesn't come to pass.
 

Doug S

Diamond Member
Feb 8, 2020
Sounds like a charming business to do business with, the one you describe there.

I have no feelings either way, as I'm not a taxpayer affected by this. However, I do think this is a serious issue that should ideally be tackled. Maybe you are able to write down a suggestion for how to solve it, instead of assembling what I dislike about the industry? ;)


What "serious issue" are you referring to? Buybacks? The fact subsidies exist at all?

What I'm trying to tell you through my objections is that there is NO WAY to stop beneficiaries of subsidies from buying back stock or issuing dividends, and even if you could, that's not a worthwhile goal. Stockholders are the owners of a business; why should profits not be delivered to them?

If Intel could not afford to build their next generation fab without subsidies, and then they turned around and used the money for buybacks instead of building a fab, yes, that would be a "serious issue". That's not the case; they are easily able to afford to build new fabs. If it was just Intel we cared about, we wouldn't need subsidies, and they'd still build some fabs here because they already have so many trained workers and so much existing infrastructure in Oregon and Arizona.

If we want to attract TSMC and Samsung to have a bigger presence here, the only way to tempt them is with fat wads of cash. Otherwise the best we'll get are smaller fabs like what Samsung currently operates in Texas, or fabs that are two generations behind when they begin operations, like TSMC's currently planned fab in Arizona (though they've recently indicated they will be stepping up their investment in the US, presumably depending on whether the subsidies materialize).
 

coercitiv

Diamond Member
Jan 24, 2014
What "serious issue" are you referring to? Buybacks? The fact subsidies exist at all?
The fact that Intel, after avoiding paying taxes for years, is seeking taxpayer funds to revitalize their business. They had a massive advantage over the competition and squandered it all on maximizing profits. Yes, it's the stockholders' right to demand swift and maximal delivery of profits, but doing so while endangering the company means heavy "subsidies" from taxpayers should come at an equally steep price: equity in the company.

Boeing used their cash reserves for years to buy back stock. When brown matter hit the fan, they sought taxpayer aid, but refused to exchange equity and borrowed piles of cash instead. They won't be seeing them profits for a while though.
 

DrMrLordX

Lifer
Apr 27, 2000
What I'm trying to tell you through my objections is that there is NO WAY to stop beneficiaries of subsidies from buying back stock or issuing dividends, and even if you could, that's not a worthwhile goal. Stockholders are the owners of a business; why should profits not be delivered to them?

Companies that agree to receive Federal funds for specific programs can be made to jump through all kinds of hoops to get said funds. Stock buybacks are not always in the best interest of the shareholders, and they do not always deliver profits to the shareholders.
 

Abwx

Lifer
Apr 2, 2011
Productized means turning something into a product that can be produced and sold. In R&D there is a lot of stuff that's unfortunately never productized.


There have been mentions of Zen 4 going to datacenters first, with the consumer version coming later. If one thinks of the DDR5 market size having to increase over time first before it's really suitable for the consumer mass market, letting the competition as well as datacenters have a go at it first may be a pretty good strategic move.

So that means rendered producible, which they cannot do with Zen 4 since they have no definitive silicon.

If they want to go ahead, they have to use what is at hand for validation purposes, but it's not stipulated that Vermeer will be used for a V-Cache commercial product.

We'll see, but it seems to me that Zen 4 should perform better without V-Cache than a so-equipped Zen 3, hence that would be a lot of effort for a less-than-one-year product cycle.
 

jpiniero

Lifer
Oct 1, 2010
We'll see, but it seems to me that Zen 4 should perform better without V-Cache than a so-equipped Zen 3, hence that would be a lot of effort for a less-than-one-year product cycle.

They need something new for now. Zen 4 consumer probably isn't coming until the end of next year.
 

Mopetar

Diamond Member
Jan 31, 2011
Why would anyone buy a Zen 3D mid-cycle when Zen 4 is that much closer? Particularly if details start to emerge suggesting that the next CPUs will use a new platform and that Zen 3D is a dead end platform-wise.

I really don't think they need to rush out a new product just to compete with Intel when they're already moving everything they have and still have a lot of market segments that haven't seen Zen 3 products released yet.

Given the GPU market conditions don't look like they'll clear up for quite a while I think a lot of people would be fine holding out on an upgrade for a while longer.
 

LightningZ71

Platinum Member
Mar 10, 2017
If Zen3D is AM4 compatible, there are plenty of reasons to upgrade to it. In fact, there are hundred$ of reasons involved in getting new RAM and a new motherboard, and ripping up your machine in the process... It is highly likely that the larger L3 cache of Zen3D will largely obviate the bandwidth advantages of DDR5 on the AM5 platform in its early days, as the early modules will not have a significant bandwidth advantage over higher clocked DDR4 modules. The larger cache will have better average latency than those DDR5 modules, and, for many tasks, the vast majority of their memory calls will be fulfilled by the L3 instead of the DDR bus.

The sad truth that computing faces going forward is that higher clocks and higher IPC lead to greater demands on the memory bus. Raw computing throughput for leading edge processors has increased a lot over the last six years or so, yet DRAM hasn't made anywhere near as massive an increase. On the desktop, we went from quad core, eight thread i7 processors to sixteen core, thirty-two thread processors that run at higher clocks and complete more instructions per clock. Dual channel DDR4 has gone from 2133 to only about 4000 (for widely available, not crazy expensive modules). That's over four times the instruction and data demand at the top end for slightly less than a doubling of RAM bandwidth. There's no getting around the fact that faster processors are going to need larger and larger caches to see real performance increases commensurate with the performance increases of the cores.
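The back-of-envelope version of that comparison, using the figures above (dual-channel DDR4-2133 then versus roughly DDR4-4000 now, 4 cores then versus 16 now):

def ddr_bw_gbs(mt_per_s: int, channels: int = 2) -> float:
    """Peak DDR bandwidth: transfers/s x 8 bytes per 64-bit channel."""
    return mt_per_s * 8 * channels / 1000   # GB/s

old_bw = ddr_bw_gbs(2133)   # dual-channel DDR4-2133: ~34 GB/s
new_bw = ddr_bw_gbs(4000)   # dual-channel DDR4-4000: ~64 GB/s
print(f"Bandwidth: {old_bw:.1f} -> {new_bw:.1f} GB/s ({new_bw / old_bw:.2f}x)")
print(f"Core count: 4 -> 16 ({16 / 4:.0f}x), before clock and IPC gains")

So per-core bandwidth has fallen even before counting clock and IPC gains, which is the gap a large L3 is meant to paper over.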
 

JoeRambo

Golden Member
Jun 13, 2013
I think it is completely safe to expect that all of AMD's future top-of-the-line desktop/workstation CPUs will continue to ship with stacked L3. I have zero doubt that this 3D stacking will spread to other companies as well, and to AMD's GPU department too. It's like a free lunch that also enables one to raise ASPs and compete in the market. We had clock wars and core count wars, and now the cache wars will properly begin.
The future is bright overall. Imagine Ampere, but instead of ridiculous 120W-guzzling GDDR6X, it has 256MB of stacked L3. I think this will give an awesome one-time performance boost to everyone from Nvidia to the custom ARM crowd.

My gut feeling about the Z3D vs Alder Lake situation: for typical desktop use and gaming, ADL will be very handicapped by DDR5. AnandTech and other enthusiast-serving sites will be happy to test it with JEDEC DDR5-4800 CL40, which will perform at ~DDR4-2400 CL20 levels with ~15-16ns of latency.
That is why we have only a Cinebench leak: a 30% IPC gain there, but probably disastrous performance everywhere else. Imagine AnandTech's 11700K prerelease testing, but squared, due to the low speeds of the new memory subsystem and immature IMC code.

So at stock speeds AMD will easily win with 96MB of L3; for proper enthusiasts, a lot depends on the prices and speeds of proper DDR5 memory (and maybe DDR4 mobos?).
 

Doug S

Diamond Member
Feb 8, 2020
The fact that Intel, after avoiding paying taxes for years, is seeking taxpayer funds to revitalize their business. They had a massive advantage over the competition and squandered it all on maximizing profits. Yes, it's the stockholders' right to demand swift and maximal delivery of profits, but doing so while endangering the company means heavy "subsidies" from taxpayers should come at an equally steep price: equity in the company.

Boeing used their cash reserves for years to buy back stock. When brown matter hit the fan, they sought taxpayer aid, but refused to exchange equity and borrowed piles of cash instead. They won't be seeing them profits for a while though.

They weren't "seeking" funds; the government made the first move there, in a panicked reaction to the chip shortage and legislators' realization that fewer and fewer chips are made here. Of course Intel is going to want in on the cash grab if foreign companies are getting in line.

If Intel had gone begging hat in hand to the government and said "woe is us, we can't compete unless you give us money" that would be different, but that's not what happened.

Boeing is another thing entirely. They were a great engineering-led firm destroyed when the incompetent beancounter management from McDonnell Douglas took over.