interpreting iostat output

Jeff7181

Lifer
Aug 21, 2002
18,368
11
81
So, I have this server (based on Red Hat) that we're seeing low write throughput on. It has a RAID6 array with 24 Western Digital RE3 drives. The output of iostat -x seems to indicate that the array is saturated, with %util almost consistently at 100%.

However, await seems to average over 100 ms while svctm is usually around 1 ms.

Does this mean that the device isn't really saturated since the svctm isn't increasing?
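For reference, here is a rough sketch of how these iostat -x columns relate to each other. The function names and the 1000 requests/s figure are assumptions for illustration only, not values from this server. await covers queue time plus service time, so a large gap between await and svctm means requests spend most of their time waiting in the queue. Note also that the iostat man page warns that svctm should not be trusted, since iostat derives it from %util and the request rate rather than measuring it.

```python
# Sketch of the arithmetic behind iostat -x columns (illustrative only).
# iops = r/s + w/s as reported by iostat; all times are in milliseconds.

def queue_wait_ms(await_ms, svctm_ms):
    """Time a request spends queued before being serviced."""
    return await_ms - svctm_ms

def util_percent(iops, svctm_ms):
    """Approximate %util: share of each second the device was busy."""
    return min(100.0, iops * svctm_ms / 1000.0 * 100.0)

def avg_queue_size(iops, await_ms):
    """Average number of outstanding requests (Little's law)."""
    return iops * await_ms / 1000.0

if __name__ == "__main__":
    # await and svctm are from the post; the request rate is hypothetical.
    await_ms, svctm_ms, iops = 100, 1.0, 1000
    print("queue wait (ms):", queue_wait_ms(await_ms, svctm_ms))
    print("%util:", util_percent(iops, svctm_ms))
    print("avg queue size:", avg_queue_size(iops, await_ms))
```

On a multi-disk RAID volume, %util near 100% only says that at least one request was outstanding almost all the time. Because the array can service requests in parallel, that alone does not prove it has reached its throughput limit.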
 

Jeff7181

Oh... I forgot to mention I did RTFM and it gives a nice description, but not really enough info to know how to interpret the data when the numbers you see don't match the simple examples. It also doesn't say whether this being a RAID volume makes any difference.
 

Khyron320

Senior member
Aug 26, 2002
306
0
0
www.khyrolabs.com
Does the RAID controller keep logs? Try the messages log; it does sound like a drive issue. If not, fail the drive out and replace it. You should have cold spares lying around for such a large array.
 

Scarpozzi

Lifer
Jun 13, 2000
26,391
1,780
126
I agree with Khyron320. I'd start looking at possible hardware failures and troubleshoot accordingly. If you have the ability to read the SMART data from your drives, you might be able to locate the drive causing problems that way. I know Dell OpenManage now tells you when a drive has been flagged for a likely future failure.

I typically don't recommend RAID5 for such large arrays because you only have to lose 2 disks to lose the array or cause really bad corruption. RAID6 is slightly better (still gives you fast reads), but RAID10 will give you better write performance if you're starting to see degraded performance, and it gives you much better odds of not losing the array even if you lose multiple disks. It's rare that I deploy an array that's not RAID10 anymore. RAID5 has burned me too many times in the past.
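To put rough numbers on the write-performance point, here is a minimal sketch using the textbook small-random-write penalties (2 back-end I/Os per write for RAID10, 4 for RAID5, 6 for RAID6). The 75 IOPS per drive figure is an assumed ballpark for a 7200 rpm disk, not a measured value for the RE3, and the model ignores controller cache and full-stripe writes.

```python
# Back-end I/Os needed per small random host write (textbook values).
WRITE_PENALTY = {"RAID10": 2, "RAID5": 4, "RAID6": 6}

def random_write_iops(level, drives, iops_per_drive):
    """Estimate host-visible random write IOPS for an array."""
    return drives * iops_per_drive / WRITE_PENALTY[level]

if __name__ == "__main__":
    drives = 24            # from the thread
    iops_per_drive = 75    # assumption, not a measured value
    for level in ("RAID5", "RAID6", "RAID10"):
        estimate = random_write_iops(level, drives, iops_per_drive)
        print(level, "~", estimate, "random write IOPS")
```

Under these assumptions the same 24 drives deliver roughly three times the random write rate in RAID10 as in RAID6, which is consistent with low write throughput on a large RAID6 volume even when every drive is healthy.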
 

Jeff7181

I totally agree but this is an "appliance" that we bullied the manufacturer into giving us the root login for when we threatened to throw their product in the dumpster.

Are you saying the iostat output I'm describing sounds like a single bad drive in the array?