I'm sure many of us here have vague memories of chips that had real-world, unfixable bugs and problems. Problems that would not go away with a better motherboard, BIOS updates or OS reinstalls, only with rewritten software. Problems that others might even blame on a bad overclock, until the OP reverted to stock settings and the problems persisted.
Pentiums with incorrect FDIV lookup tables and buslocks from F00F... opcodes, Cyrixes that had weird issues with anything assuming Pentium-level instructions, K6s that had stability issues with too much RAM, Athlons that would go up in smoke for lack of a heatspreader or a fast thermal diode.
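For anyone who never saw the FDIV bug first-hand, here is a rough sketch of the classic spot check, written in Python purely for illustration. The operand pair 4195835 / 3145727 is the commonly cited one; the function name and the 1.0 threshold are my own choices, not part of any official test.

```python
# Classic FDIV spot check: divide a pair of operands widely reported to
# hit the faulty lookup-table entries on affected Pentiums, then undo
# the division and see how much is left over.
X = 4195835.0
Y = 3145727.0


def fdiv_residual(x: float = X, y: float = Y) -> float:
    """Return x - (x / y) * y, which should be zero or rounding-sized."""
    return x - (x / y) * y


if __name__ == "__main__":
    r = fdiv_residual()
    # On a correct FPU the residual is zero or a tiny rounding error.
    # A flawed Pentium was widely reported to give 256 here.
    # The 1.0 threshold is an arbitrary illustrative cutoff.
    print("residual:", r, "-> OK" if abs(r) < 1.0 else "-> suspicious")
```

This only probes a single operand pair, so passing it says nothing about other kinds of errata.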
But most of those examples are in the distant past, likely buried in landfill. PCs, their components and their marketing have changed over the years - expansion slot, form factor and firmware standards have become easier to work with and harder to mess up. You can't just plug in the motherboard power backwards and see it go up in smoke anymore unless you're pushing down with the force of a steamroller. Can't insert your CPU backwards and fry it these days like you could your Socket 3 486!
Mid-range motherboards today come in glossy, made-to-fit boxes and have sleek plastic shrouds, a jet-black silkscreen and fancy lettering. They are often fashionable, 'refined' products, are they not? Far cries from the crude green and brown boards of the 80s and 90s or the more Fisher-Price boards around the new millennium (even modern ones sans UEFI, IMO). Surely the janky past of glitchiness is behind us, with our modern equipment offering a smooth, trouble-free computing experience?
Now (well, really around a year ago), I hear that a large number of early Ryzens have a manufacturing defect (ringing, out-of-spec inductance somewhere, a grounding issue, insufficient capacitance in power layers/ripple smoothing, design features packed too close together, a mask defect blowing open part of a line?) that causes segfaults when compiling GCC, and not every time either. Even after microcode and UEFI updates. RMA required. Huge numbers of chips affected. Virtual 8086 mode semi-broken (apparently fixed with AGESA 1006?), causing issues with some VMs or running 16-bit software on 32-bit Windows. Reports of unusual crashes in Maya? Handbrake? Broken C6 sleep mode causing lockups when idle in Linux and BSD?
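Since the segfaults reportedly don't show up on every run, one way to look for this kind of intermittent fault is to repeat the same workload and tally the exit codes. Below is a minimal sketch of that idea. The helper names are hypothetical, and the command in the demo is a harmless placeholder, not the actual GCC build people used.

```python
import subprocess
import sys
from collections import Counter


def run_repeatedly(cmd, runs=20):
    """Run cmd `runs` times and tally the exit codes."""
    tally = Counter()
    for _ in range(runs):
        proc = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # On POSIX, a negative return code means the process was killed
        # by a signal, e.g. -11 for SIGSEGV.
        tally[proc.returncode] += 1
    return tally


def looks_intermittent(tally):
    """True if the same command both succeeded and failed."""
    return tally[0] > 0 and sum(tally.values()) > tally[0]


if __name__ == "__main__":
    # Placeholder workload (assumption): swap in a real, heavy build
    # command to stress the machine.
    cmd = [sys.executable, "-c", "pass"]
    tally = run_repeatedly(cmd, runs=5)
    print(dict(tally), "intermittent" if looks_intermittent(tally) else "consistent")
```

A mix of successes and failures from an identical command points away from a plain software bug, though it can't distinguish a CPU defect from bad RAM or an unstable overclock.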
And if these obvious symptoms are cropping up, how many silent issues might be occurring in the background that don't trigger a full-blown segfault on Ryzen? (Not to say that errata on Intel are never an issue)
To me, this feels a bit (a bit) like the ominous days of Super Socket 7 and its standard melange. What kinds of little issues could come up at the worst time to bite you in the back? As if termites infested some of the towers and towers of software written for PCs, but you don't know exactly which ones until you walk in and try to run them. Like a piece of the digital landscape has broken off, but you're blind and can only hope you don't fall down the hole it created.
(The things that really come to mind on the recent Intel side are issues with atomicity on Skylake HT (test program was tight assembly loops?), Meltdown/Spectre of course and x87 having questionable precision? Also curious about chipset/USB stability on the last 10 years' worth of the AMD lines? Have heard of some issues regarding USB latency with SB7/9xx on the AMD side... Serato appears to be an example of software affected.)
Perhaps it's just paranoia on my part, but I think I have something of a point: these issues should have been ironed out. These are not all just hypothetical issues listed on a datasheet. Sometimes I wonder if deterministic faults on 2 different PCs with similar architecture could cause real issues for distributed computing (DC) projects like BOINC that depend on cross-system validation? Computers that don't compute the way we expect them to...
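To make the cross-validation worry concrete, here is a toy sketch of quorum-style checking. It is not BOINC's actual validator; the function names, the tolerance and the host labels are all made up for illustration. It shows that two hosts sharing the same deterministic fault can agree with each other and validate a wrong result.

```python
import math


def results_agree(a, b, rel_tol=1e-9):
    """Compare two result vectors element by element within a tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )


def validate(results_by_host, quorum=2):
    """Return a result that at least `quorum` hosts agree on, else None."""
    results = list(results_by_host.values())
    for candidate in results:
        # Each candidate also matches itself, so it counts toward quorum.
        matches = sum(results_agree(candidate, other) for other in results)
        if matches >= quorum:
            return candidate
    return None


if __name__ == "__main__":
    correct = [1.0, 2.0]
    faulty = [1.0, 2.5]  # hypothetical output of a deterministic CPU fault

    # One good host and one bad host: no quorum, so the error is caught.
    print(validate({"host_a": correct, "host_b": faulty}))
    # Two hosts with the same fault: the wrong answer reaches quorum.
    print(validate({"host_a": faulty, "host_b": faulty}))
```

Redundancy only catches faults that differ between the hosts doing the checking, which is why identical errata on similar machines would be the awkward case.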
Zen uarch family erratum sheet
Anecdotes from people here with Ryzen systems would be interesting to hear.