Corrupt data on RAID array

Smokey0066

Senior member
Oct 9, 1999
488
0
0
I've got a 1.5TB RAID 5 array on an Areca 1120 controller running under Windows Server 2003.

This past weekend has been quite frustrating, with random fits of corrupt data. I reboot and run chkdsk, which seems to fix it, and I regain access to my files for a brief time until it happens again.

I've been reading some posts and keep seeing references to faulty hardware. I assume one of my drives has bad sectors, or sectors that are about to go bad. How do I find out which drive in the RAID array is failing? I'm pretty new to RAID. I do keep backup DVDs of my important data, but I'd really like to get this file corruption resolved so I don't have to reboot and run chkdsk every time I work with files on the system.

I need to spend some time on the box to see if the controller has any features for checking individual disks, but if anyone has other tips/tricks/advice they don't mind passing along, that'd be great.
 

Smokey0066

No tips huh?

Well, I just got home and yet again I'm having corrupted files, so it's running chkdsk at the moment. Once it's back up and running I will run SeaTools and see if that can scan the drives.

Also, is there a SMART utility that will work with the RAID controller? Or is that only something I can check when I'm in the controller's menu?

I'm getting frustrated and want to break something. My desktop just died when I got home.. ACK so much to fix..
 

Fullmetal Chocobo

Moderator, Distributed Computing
May 13, 2003
13,704
7
81
How are you finding out that you have corrupted files? Are you getting errors reported from the RAID controller? Or are they being reported from the OS? If reported from the OS, the RAID controller could be working fine--such as if you have a virus or something.

The controller has methods of verifying the status of the array--I'd have to look at my manual to remember the exact name of it though. Also, if you are using the latest version of drivers for the RAID controller, it will report the SMART data to you when you use the Areca software.
 

Fullmetal Chocobo

The function is called "Check Volume Set". Running it forces the controller to do a bit-for-bit parity check across the entire array, verifying that the data and parity on the disks are consistent.
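For anyone wondering what a parity check actually compares, here is a minimal sketch in Python. It is not Areca's firmware logic, only the general RAID 5 idea: the parity block of each stripe should equal the XOR of that stripe's data blocks. The function names and the in-memory "stripes" are made up for illustration; a real controller reads the blocks from the member disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def check_stripe(data_blocks, parity_block):
    """A stripe is consistent when the XOR of its data blocks equals its parity."""
    return xor_blocks(data_blocks) == parity_block


def check_volume(stripes):
    """Return the indices of stripes whose stored parity does not match."""
    return [
        index
        for index, (data_blocks, parity_block) in enumerate(stripes)
        if not check_stripe(data_blocks, parity_block)
    ]


# Hypothetical example: three data blocks per stripe, two stripes.
data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
good_parity = xor_blocks(data)
stripes = [(data, good_parity), (data, b"\x00\x00")]  # second stripe is damaged
inconsistent = check_volume(stripes)
print("inconsistent stripes:", inconsistent)
```

Note that a mismatch only says the stripe is inconsistent. Parity alone cannot tell which disk holds the bad block, which is why the SMART data and the controller's event log are still worth checking.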
 

Smokey0066

This whole RAID thing is fairly new to me.

The error I get is a little bubble that pops up in the lower right-hand corner when I'm on the machine. It says "groove monitor sync failed" or something like that, and that there are corrupt files on the disk array. It advises running chkdsk, so I've been rebooting to run chkdsk whenever I see this. When I'm just browsing the folder, the files can't be opened.

So which log should I be looking at to see the event you guys are talking about?

I checked the event viewer and the following is what I see:

The file system structure on the disk is corrupt and unusable. Please run the chkdsk utility on the volume Storage.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.


Fullmetal: Do I need to enable that Check Volume Set? What Areca software are you referring to for reviewing SMART data? I've got the Areca HTTP software installed, but I don't really see anything in it or know how to use it. Again, I'm really a rookie and am learning as I mess up here.

Last night I was unable to run SeaTools. I booted from the CD and it just showed a grey screen with nothing on it. I'm guessing it couldn't find the drives through the RAID controller. Chkdsk did finish, and the array has been fine so far. I think I only see problems when I'm writing or moving data around.

Thanks for all the tips.
 

Fullmetal Chocobo

Do you have the Areca RAID controller software installed in the OS (not just the drivers, but the software) so that you can view and do maintenance on the arrays? The program is called ARCHTTP. You can initiate the check volume set from that program. Also, are you using the card in a PCI-X or PCI slot?
 

Smokey0066

I am using the PCI-X card (1120), not the PCIe one (1220). I have the ARCHTTP GUI installed, but the only functionality I see in it is the START and CLOSE buttons. Nowhere within the app do I see where I can run tests and such. The top menu has "file", "tool" and "service". That's it. Maybe I've got the wrong thing installed?
 

Smokey0066

Okay, I figured out how to access the RAID controller. To think, six months without knowing this. I should've read the manual on day 1 before I plugged the card into the slot.

I started the volume consistency check, but overall nothing stands out. So does this bring me back to a software issue within Windows Server 2003?
 

Fullmetal Chocobo

"groove monitor sync failed or something". What program is giving you this error? I haven't heard of this program (unless you are talking about Microsoft Groove). Can you give more details about that, so that we can figure out exactly where the error is coming from? Did you check out the SMART data on the hard drives in ARCHTTP? What were the values for the hds--were they all good?
 

Smokey0066

I think it's from MS Office Groove. I'm going to let the consistency check finish, then I can try to cause the error again. It seems to happen when I run JAlbum to generate my photo album; at least, that's when a lot of files are being created and written to the array.

SMART data: all the HD temps were 34C, fan is at ~3k rpm.

Thanks for giving me a hand with this. I feel a lot better knowing someone out there has worked with this Areca hardware.
 

Fullmetal Chocobo

If MS Groove is the only thing reporting errors, I would first suspect a problem with that program. If the errors were reported by the ARCHTTP program, then I would be worried. You said that you were unable to access the folders/files when an error occurred. What files/folders are you trying to access specifically? Are they associated with the Groove program?
 

Smokey0066

Yup, I get a little bubble at the bottom of my screen, and I think it's from MS Groove, not from ARCHTTP.

I was working on my photo album with JAlbum, and the error occurred while I was generating the album, which entails resizing images and then creating HTML docs.

I can try uninstalling MS Office since I don't use it anyway. The files that are being corrupted are *.jpg and *.html files, and I don't know how they would be associated with Groove since I do not use that program.
 

Fullmetal Chocobo

What version of JAlbum are you using? That might be the problem there--it may not be creating / resizing the files properly. Try uninstalling and then reinstalling that program as well as Office, and see if there are any changes. Also, YGPM.
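One way to take JAlbum and Groove out of the picture is to write a batch of files with known contents to the array and read them back, comparing checksums. Below is a rough Python sketch of that idea, not a proper diagnostic tool. The function names, file names and sizes are all made up for illustration, and the demo writes to a temporary directory; you would point it at a scratch folder on the array instead.

```python
import hashlib
import os
import tempfile


def write_test_files(directory, count=20, size=1024 * 1024):
    """Write `count` files of random bytes; return {path: expected SHA-256}."""
    expected = {}
    for i in range(count):
        data = os.urandom(size)
        path = os.path.join(directory, f"verify_{i:04d}.bin")
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data out to the device
        expected[path] = hashlib.sha256(data).hexdigest()
    return expected


def verify_files(expected):
    """Re-read each file and return the paths whose hash no longer matches."""
    mismatched = []
    for path, digest in expected.items():
        with open(path, "rb") as f:
            actual = hashlib.sha256(f.read()).hexdigest()
        if actual != digest:
            mismatched.append(path)
    return mismatched


if __name__ == "__main__":
    # Assumption: a temporary directory stands in for a folder on the array.
    with tempfile.TemporaryDirectory() as target:
        expected = write_test_files(target, count=5, size=64 * 1024)
        print("mismatched files:", verify_files(expected))
```

Keep in mind that an immediate read-back may be served from the OS or controller cache rather than the disks, so a clean result right after writing is weaker evidence than one taken after a reboot or after writing more data than the cache can hold. If plain files written this way also come back corrupted, the applications are probably not the cause.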
 

VirtualLarry

No Lifer
Aug 25, 2001
56,226
9,990
126
Have you done a Memtest86+ run on the host system's memory? I wonder if that could be causing the corruption.
 

Smokey0066

So the consistency check came back OK.

I have not done anything on the PC because it was running the check. I will uninstall MS Office tonight and run Memtest86+.

Thanks for everyone's help. Will keep you guys updated.