Hi, I hope I'm posting in the right section of the forum.
I was discussing this with a friend, and we both started to doubt whether a NAS with RAID is really the best solution for data storage. I hope someone with direct experience can help us understand things better.
I'll try to explain the argument clearly; sorry for any misunderstanding, English is not my first language.
Let's say I have a 4-bay NAS (just as an example) and I put 4 x 10TB drives in it. In the best case I'd use NAS drives (the IronWolf series from Seagate or the Red series from WD), so let's say we are in the best conditions a consumer can get. The system is configured as RAID 5, so it can tolerate the failure of 1 drive.
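Just to put numbers on this example, here is a small Python sketch of the capacity arithmetic. It assumes the usual RAID 5 rule that one drive's worth of space is used for parity; the function name `raid5_usable_tb` is mine, purely for illustration.

```python
def raid5_usable_tb(num_drives, drive_size_tb):
    """Usable capacity of a RAID 5 array, in TB.

    Assumption: equal-sized drives, with one drive's worth of space
    taken by parity (the standard RAID 5 layout).
    """
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (num_drives - 1) * drive_size_tb


# The example from this post: 4 x 10TB drives.
print(raid5_usable_tb(4, 10))  # 30
```

So in my example the 40TB of raw space gives about 30TB of usable storage.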
Let's say the drives in the NAS have a 3-year warranty, so in theory they should work for at least 3 years. Let's say that after three years one drive fails: installing a new drive causes no data loss, because the NAS rebuilds all the missing data from the other drives.
And this is where the argument started. After 3 years, all the drives are pretty worn (even if still working), and rebuilding such a large amount of data (10TB is no joke) puts a lot of stress on the old drives. My friend says this will cause one or more of the other drives to fail at the end of the rebuild or, worse, during it, causing the loss of all data.
Even if this reasoning seems to make sense in theory, I told my friend I'm not sure it is always true. I wonder what happens in large data centers (Google or Facebook, just as examples) that use even larger arrays. If the above is true, i.e. repairing an array wears out all the other disks, then every time a disk fails all the disks must be replaced as soon as possible to avoid data loss. In that case Google must spend millions just replacing HDDs in its data centers... That seems pretty improbable to me, but of course I'm not an expert in these things.
My argument is: if a rebuild is so stressful for the system, why is RAID still so widely used around the world? Maybe a drive failure after a rebuild is not that common (even if it can happen).
After much discussion we couldn't find a real answer to this question.
So I thought I'd ask here, hoping that someone with direct experience can answer. If the above reasoning is true, the larger the HDDs, the less reliable a NAS becomes. On the other hand, a NAS with small HDDs is no longer cost effective, so again I wonder whether a NAS is really worth it.
Maybe there is a sort of sweet spot that balances things better? Just to give some examples:
1) Maybe using only 4TB drives is the best compromise to reduce the chance of killing HDDs during an array rebuild?
2) Or maybe the best option is to use at least 8 HDDs instead of just 4 (so each drive is stressed less, because the data needed to rebuild the failed drive is spread more evenly across all HDDs)?
3) Maybe the chance of killing drives is directly proportional to the amount of data to rebuild, i.e. not filling the NAS beyond 50% reduces the chance of damage? But if so, it is not cost effective at all: what is the point of having 30TB of storage if I can use only 15TB? Again, a NAS doesn't seem worth it.
4) Maybe the best option depends on the HDD size: for example, a 4-bay NAS is best for 4TB drives, but for 10TB drives it is better to get an 8-bay NAS, to spread the data over more disks and avoid too much stress during a rebuild?
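To make the comparison between these options concrete, here is a rough Python sketch of how much data would have to be read during a rebuild. It rests on an assumption of mine that I'd like someone to confirm: that a classic RAID 5 rebuild reads every surviving drive from start to end, regardless of how full the volume is (some NAS systems or filesystems may behave differently). The function name `rebuild_read_tb` is made up for this example.

```python
def rebuild_read_tb(num_drives, drive_size_tb):
    """Data read while rebuilding one failed drive in RAID 5, in TB.

    Assumption (not verified for any specific NAS): each surviving
    drive is read in full, independent of how much data is stored.
    Returns (TB read from each surviving drive, TB read in total).
    """
    surviving = num_drives - 1
    per_drive = drive_size_tb
    total = surviving * drive_size_tb
    return per_drive, total


# The configurations mentioned in the list above.
for n, size in [(4, 10), (4, 4), (8, 10)]:
    per_drive, total = rebuild_read_tb(n, size)
    print(f"{n} x {size}TB: {per_drive} TB read per surviving drive, {total} TB in total")
```

If that assumption holds, going from 4 to 8 bays would not reduce the reading done by each drive (only smaller drives would), and keeping the NAS half empty would not help either. But this is exactly the kind of thing I'm unsure about.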
Actually at this point I'm just confused.
Does anyone have direct experience of rebuilding an array after replacing a large disk (say at least 8TB)? Is rebuilding an array with large HDDs really such a destructive procedure?
The only alternative I can think of is using single HDDs without RAID and storing them in a drawer, but this is not a 100% guarantee that the data will be safe either, because even an unused HDD can become useless after some years in a drawer (maybe the motor won't spin anymore, or some other problem arises). I have read about people complaining that an HDD used only a few times stopped working after sitting completely still in a drawer for a long time.
I hope I explained myself clearly enough.
Thanks to anyone who reads all this and gives me some answers.