Question SSD Monitoring

shan2020

Junior Member
Dec 18, 2019
8
0
6
Looking for a recommended tool for monitoring SSD endurance. I have a PowerEdge whose SSD reports remaining_endurance as 1%, whereas smartctl shows the SMART attribute Media_Wearout_Indicator as 100%, so the two outputs are not in line.

Other than smartmontools, is there anything else you would recommend?
 

UsandThem

Elite Member
May 4, 2000
16,068
7,383
146
I either use Samsung Magician (for my Samsung drives), or I run Crystal Disk Info for the other ones to see the health of the SSDs.

Depending on the brand of the SSD, many of the larger brands offer their own monitoring utilities (WD, Crucial, Seagate, SanDisk, Intel, etc.).
 

shan2020

Thanks for your reply.

My current situation is that most servers are Dell PowerEdge, with some SSDs provided by Dell and others from Intel, Seagate, etc.

The following SMART attributes are displayed, whereas Dell's omreport returns "remaining rated endurance as 1%".
From the smartctl attributes below, other than "Media_Wearout_Indicator", is there anything else I can take into account that would show the SSD's endurance has gone down?

Bash:
SMART Attributes Data Structure revision number: 16401
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000e   130   130   039    Old_age   Always       -       800099760
  5 Reallocated_Sector_Ct   0x0033   100   100   001    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       29626
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       41
 13 Read_Soft_Error_Rate    0x001e   130   100   000    Old_age   Always       -       5095067056
179 Used_Rsvd_Blk_Cnt_Tot   0x0033   100   100   010    Pre-fail  Always       -       0
180 Unused_Rsvd_Blk_Cnt_Tot 0x0032   100   100   000    Old_age   Always       -       7204
181 Program_Fail_Cnt_Total  0x003a   100   100   000    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x003a   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       32
195 Hardware_ECC_Recovered  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       0
228 Power-off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       65535
233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       70406865
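
For anyone who wants to check a table like the one above programmatically instead of by eye, here is a minimal Python sketch. It assumes the text is the standard `smartctl -A` attribute table; the helper names `parse_attributes` and `at_or_below_threshold` are made up for illustration, and treating a THRESH of 0 as "never trips" is an assumption based on common SMART convention.

```python
import re

# Two rows copied from the smartctl table above; paste the full table in practice.
SAMPLE = """\
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       29626
233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       70406865
"""

# Columns: ID, name, flag, value, worst, thresh, type, updated, when_failed, raw
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(.+?)\s*$"
)


def parse_attributes(text):
    """Return {attribute_name: fields} for every line that looks like an attribute row."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header lines and blank lines do not match and are skipped
        attrs[m.group(2)] = {
            "id": int(m.group(1)),
            "value": int(m.group(4)),   # normalized value
            "worst": int(m.group(5)),
            "thresh": int(m.group(6)),
            "type": m.group(7),
            "raw": m.group(10),         # kept as text; raw encoding is vendor specific
        }
    return attrs


def at_or_below_threshold(attrs):
    """Names of attributes whose normalized VALUE has dropped to THRESH or lower.

    Assumption: a THRESH of 0 means the attribute never trips, so it is ignored.
    """
    return [
        name for name, a in attrs.items()
        if a["thresh"] > 0 and a["value"] <= a["thresh"]
    ]


attrs = parse_attributes(SAMPLE)
print(attrs["Media_Wearout_Indicator"]["value"], at_or_below_threshold(attrs))
```

The sketch only compares the normalized VALUE column against THRESH. It does not try to interpret RAW_VALUE, since the meaning of raw numbers differs between vendors.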
 

UsandThem

When I've had drives start going bad, they always begin showing an increase in the Reallocated_Sector_Ct value, but yours looks good in that area. The drive you show above has high readings in the Media_Wearout and Raw_Read_Error_Rate, among other SMART warnings. It appears the drive has close to 30,000 hours on it.

At this point, if you are still under warranty from Dell for it, you can contact them about getting a replacement. However, if it's out of warranty, you can replace it before it totally fails or wait until it finally dies (just make sure you have any important data backed up).
 

UsandThem

"How is this 30,000 hours calculated from the smartctl attributes output? Your explanation would be helpful."
The RAW values are what I am looking at, and I believe Power_On_Hours shows 29,626 hours, although you can verify it in a utility like Crystal Disk Info.
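
To put that raw number in perspective, here is a small Python sketch that converts it to days and years. It assumes the Power_On_Hours raw value really is a plain hour count, which is common but not guaranteed for every vendor; `power_on_summary` is a made-up helper name.

```python
HOURS_PER_DAY = 24
HOURS_PER_YEAR = 24 * 365.25  # average year length, an approximation


def power_on_summary(raw_hours):
    """Convert a Power_On_Hours raw value into days and years of powered-on time."""
    return {
        "hours": raw_hours,
        "days": raw_hours / HOURS_PER_DAY,
        "years": raw_hours / HOURS_PER_YEAR,
    }


# 29626 is the RAW_VALUE of attribute 9 (Power_On_Hours) in the posted table.
summary = power_on_summary(29626)
print(f"{summary['days']:.0f} days, about {summary['years']:.1f} years")
```

For 29,626 hours this works out to roughly 1,234 days, or about 3.4 years of powered-on time.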

That said, I also had an SSD come in an HP prebuilt system (new PC) back in 2015, and it was reported as defective by some SSD utilities. It was a new unit, and the various SSD utilities were reporting incorrect data from the SSD.

Is this a new PC? Have you tried running a hardware scan in Dell SupportAssist to see what it reports back?
 

shan2020

These are 3 to 4 year old servers. Let me compare the smartctl output across a few servers.
We can work through iDRAC or omreport commands, but I'm not sure about other tools.

The main goal is to analyze SSDs manufactured by Seagate, Intel, etc., which cannot be analyzed through Dell's omreport tool.
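
One vendor-neutral way to approach this is to have smartctl emit JSON and pull out the same few attributes on every server. The Python below is only a sketch under several assumptions: smartmontools 7.0 or newer (for the `-j` JSON flag), SATA drives that expose an ATA attribute table, and sufficient privileges to query the disk. The `WATCHED` list, the function names, and the example device arguments are hypothetical, and attribute names can differ from one drive model to another.

```python
import json
import subprocess

# Attribute names taken from the table posted earlier in this thread;
# other drive models may label these differently (assumption).
WATCHED = {
    "Reallocated_Sector_Ct",
    "Power_On_Hours",
    "Used_Rsvd_Blk_Cnt_Tot",
    "Media_Wearout_Indicator",
}


def read_smart_json(device, device_type=None):
    """Run `smartctl -a -j` on one device and return the parsed JSON report."""
    cmd = ["smartctl", "-a", "-j"]
    if device_type:
        # e.g. "megaraid,0" for a disk behind a RAID controller (an assumption
        # about the setup; adjust for the actual hardware)
        cmd += ["-d", device_type]
    cmd.append(device)
    # smartctl's exit status is a bitmask, so a non-zero code is not treated as fatal.
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return json.loads(result.stdout)


def summarize(report):
    """Pick the watched attributes out of a parsed smartctl JSON report."""
    rows = report.get("ata_smart_attributes", {}).get("table", [])
    summary = {"model": report.get("model_name")}
    for row in rows:
        if row.get("name") in WATCHED:
            summary[row["name"]] = {
                "value": row.get("value"),             # normalized value
                "thresh": row.get("thresh"),
                "raw": row.get("raw", {}).get("value"),
            }
    return summary
```

A call such as `summarize(read_smart_json("/dev/sda"))` (the device path is just an example) would return one small dictionary per drive, which can then be collected from each server and compared against what omreport shows for the Dell-supplied disks.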