How do I force stop a md array?

Red Squirrel · Jul 14, 2014

I need to destroy and reconstruct a mdadm array as when I built it I assumed it would add a drive as hot spare given I was trying to use an odd number of drives for raid 10 but it somehow used it. Now I can't regrow it because it did some weird stuff to the array instead of a standard raid 10.

Long story short I just want to start over. There is no data on it.

Linux has a terrible habit of locking disk resources for no reason. I was not even able to umount the file system, had to use -l. This is an empty file system with nothing but a few folders. As much as it's saying that it's busy, no it's not, there was nothing using it. Now it wont let me stop the md device even though it's not mounted.

I really don't want to have to reboot the whole server.

Worse case scenario I suppose I can go pull all the drives, but that sometimes causes stale drive letters, so I want to avoid that.

Anything else I can do? I just want to stop/destroy this raid device so I can start over and recreate it.

ch33zw1z · Jul 15, 2014

http://www.ducea.com/2009/03/08/mdadm-cheat-sheet/

If you can't stop and delete, try failing the drives in the array.

ch33zw1z · Jul 17, 2014

You get anywhere with this, Red?

Red Squirrel · Jul 18, 2014

Was able to fail 2 drives but it wont let me fail any more than that. Basically it wont let me kill the array.

I even tried to zero out the array itself and each drive, in hopes it would basically just crash and let go of all the drives. No go. I also tried to just make another array so I can "steal" the drives given they no longer have a superblock but it still wont let me as it says they're busy.

I even tried to kill -9 the raid process for that array, no go.

I don't really want to physically remove the drives as I have a feeling it will end up leaving stale drive letters and it may also not release md2, which I'll want to use for the new array.

theevilsharpie · Jul 18, 2014

Linux has a terrible habit of locking disk resources for no reason

Linux locks disk resources if something is using the disk. This can be problematic on NFS mounts where the remote host no longer exists, but if Linux is complaining that something is still using a local disk, it's probably correct.

You can use the 'lsof' command to find the process that is locking your array. You'll need to kill that process before Linux will let you destroy the array.

Red Squirrel · Jul 18, 2014

They should still at least offer some way to force it though. Often stuff "uses" the disk for no reason such as what is happening now. Trying to unmount an NFS share is pretty much impossible due to this locking stuff.

Just tried lsof now, that looks like a useful command for these situations. Though, it does not seem very accurate. I tried it on a path that should have tons of stuff being accessed (Lot of VMs there) and it comes back empty. Oddly, it's showing my minecraft server files being accessed, and that Minecraft server is turned off. Out of all the things to show... kinda odd.

Though, while experimenting with grep I did find something interesting:

Code:

[root@isengard ~]# lsof | grep md2
jbd2/md2-  9583      root  cwd       DIR              253,0     4096          2 /
jbd2/md2-  9583      root  rtd       DIR              253,0     4096          2 /
jbd2/md2-  9583      root  txt   unknown                                        /proc/9583/exe
md2_raid1 31032      root  cwd       DIR              253,0     4096          2 /
md2_raid1 31032      root  rtd       DIR              253,0     4096          2 /
md2_raid1 31032      root  txt   unknown                                        /proc/31032/exe

What can I do with this? Kill -9 not working on those processes.

theevilsharpie · Jul 18, 2014

Code:

/proc/9583/exe

This is a symlink to the command that started the process. Knowing what the command is should give you a better idea of what the process is doing and why it's stuck.

You can also run `strace -p 9583` to find out exactly what that process is doing.

Red Squirrel · Jul 18, 2014

This is what I get when I try that:

Code:

[root@isengard ~]# strace -p 9583
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted

Same with the other ID.

mv2devnull · Jul 18, 2014

Red Squirrel said:
Worse case scenario I suppose I can go pull all the drives, but that sometimes causes stale drive letters, so I want to avoid that.

There are no "drive letters" on Linux.

Do you have raid within partition, or partitions within raid?

Somewhere on CentOS Forum/repo is a script "getinfo.sh" that can list many details about system. It is very useful, because most of the time the user fails miserably to explain the facts about his system that relate to the problem.

You said that you had trouble umounting. Remove the volumes from fstab/automount and reboot.

Red Squirrel · Jul 18, 2014

mv2devnull said:
There are no "drive letters" on Linux.

Do you have raid within partition, or partitions within raid?

Somewhere on CentOS Forum/repo is a script "getinfo.sh" that can list many details about system. It is very useful, because most of the time the user fails miserably to explain the facts about his system that relate to the problem.

You said that you had trouble umounting. Remove the volumes from fstab/automount and reboot.

Well I meant sda, sdb etc... I don't want to be left which a bunch of stale ones that are "stuck". I don't have any other raids or partitions, in fact I unmounted it. I had to use umount -l though because it refused to let me unmount despite the fact that it was an empty drive with only a few folders.

Do you know where I can get that getinfo.sh script? I tried yum install getinfo and getinfo.sh but it might be called something else. Really something like this should be built into Linux. It would be nice if there was a way to get all the system info in one shot. Heck, half the time I can't even find what distro I'm running if I don't remember what I installed.

There are no volumes in fstab, and I can't reboot this box, there are VMs running off NFS shares.

mv2devnull · Jul 18, 2014

https://www.centos.org/forums/viewtopic.php?f=12&t=870

Red Squirrel · Jul 18, 2014

Wow that's a really neat script, I don't know why that's not part of all distros. Here is the output:

Not sure how it will help with this issue though...

Code:

[root@isengard ~]# cat /tmp/basedata.Y1pmco 
Information for general problems.
[code]
== BEGIN uname -rmi ==
2.6.32-358.el6.x86_64 x86_64 x86_64
== END   uname -rmi ==

== BEGIN rpm -qa \*-release\* ==
rpmforge-release-0.5.3-1.el6.rf.x86_64
centos-release-6-4.el6.centos.10.x86_64
== END   rpm -qa \*-release\* ==

== BEGIN cat /etc/redhat-release ==
CentOS release 6.4 (Final)
== END   cat /etc/redhat-release ==

== BEGIN getenforce ==
Disabled
== END   getenforce ==

== BEGIN free -m ==
             total       used       free     shared    buffers     cached
Mem:          7843       7608        235          0       1130       6053
-/+ buffers/cache:        425       7418
Swap:         7983          1       7982
== END   free -m ==

== BEGIN rpm -qa yum\* rpm-\* python | sort ==
python-2.6.6-36.el6.x86_64
rpm-build-4.8.0-32.el6.x86_64
rpm-libs-4.8.0-32.el6.x86_64
rpm-python-4.8.0-32.el6.x86_64
yum-3.2.29-40.el6.centos.noarch
yum-metadata-parser-1.1.2-16.el6.x86_64
yum-plugin-fastestmirror-1.1.30-14.el6.noarch
yum-plugin-security-1.1.30-14.el6.noarch
yum-utils-1.1.30-14.el6.noarch
== END   rpm -qa yum\* rpm-\* python | sort ==

== BEGIN ls /etc/yum.repos.d ==
CentOS-Base.repo
CentOS-Debuginfo.repo
CentOS-Media.repo
CentOS-Vault.repo
mirrors-rpmforge
mirrors-rpmforge-extras
mirrors-rpmforge-testing
rpmforge.repo
== END   ls /etc/yum.repos.d ==

== BEGIN cat /etc/yum.conf ==
[main]
cachedir=/var/cache/yum/$basearch/$releasever
keepcache=0
debuglevel=2
logfile=/var/log/yum.log
exactarch=1
obsoletes=1
gpgcheck=1
plugins=1
installonly_limit=5
bugtracker_url=http://bugs.centos.org/set_project.php?project_id=16&ref=http://bugs.centos.org/bug_report_page.php?category=yum
distroverpkg=centos-release

#  This is the default, if you make this bigger yum won't see if the metadata
# is newer on the remote and so you'll "gain" the bandwidth of not having to
# download the new metadata and "pay" for it by yum not having correct
# information.
#  It is esp. important, to have correct metadata, for distributions like
# Fedora which don't keep old packages around. If you don't like this checking
# interupting your command line usage, it's much better to have something
# manually check the metadata once an hour (yum-updatesd will do this).
# metadata_expire=90m

# PUT YOUR REPOS HERE OR IN separate files named file.repo
# in /etc/yum.repos.d
== END   cat /etc/yum.conf ==

== BEGIN yum repolist all ==
Loaded plugins: fastestmirror, security
Loading mirror speeds from cached hostfile
 * base: mirror.science.uottawa.ca
 * extras: mirror.science.uottawa.ca
 * rpmforge: mirror.team-cymru.org
 * updates: mirror.netaddicted.ca
repo id                  repo name                                status
C6.0-base                CentOS-6.0 - Base                        disabled
C6.0-centosplus          CentOS-6.0 - CentOSPlus                  disabled
C6.0-contrib             CentOS-6.0 - Contrib                     disabled
C6.0-extras              CentOS-6.0 - Extras                      disabled
C6.0-updates             CentOS-6.0 - Updates                     disabled
C6.1-base                CentOS-6.1 - Base                        disabled
C6.1-centosplus          CentOS-6.1 - CentOSPlus                  disabled
C6.1-contrib             CentOS-6.1 - Contrib                     disabled
C6.1-extras              CentOS-6.1 - Extras                      disabled
C6.1-updates             CentOS-6.1 - Updates                     disabled
C6.2-base                CentOS-6.2 - Base                        disabled
C6.2-centosplus          CentOS-6.2 - CentOSPlus                  disabled
C6.2-contrib             CentOS-6.2 - Contrib                     disabled
C6.2-extras              CentOS-6.2 - Extras                      disabled
C6.2-updates             CentOS-6.2 - Updates                     disabled
C6.3-base                CentOS-6.3 - Base                        disabled
C6.3-centosplus          CentOS-6.3 - CentOSPlus                  disabled
C6.3-contrib             CentOS-6.3 - Contrib                     disabled
C6.3-extras              CentOS-6.3 - Extras                      disabled
C6.3-updates             CentOS-6.3 - Updates                     disabled
base                     CentOS-6 - Base                          enabled: 6,367
c6-media                 CentOS-6 - Media                         disabled
centosplus               CentOS-6 - Plus                          disabled
contrib                  CentOS-6 - Contrib                       disabled
debug                    CentOS-6 - Debuginfo                     disabled
extras                   CentOS-6 - Extras                        enabled:    14
rpmforge                 RHEL 6 - RPMforge.net - dag              enabled: 4,718
rpmforge-extras          RHEL 6 - RPMforge.net - extras           disabled
rpmforge-testing         RHEL 6 - RPMforge.net - testing          disabled
updates                  CentOS-6 - Updates                       enabled: 1,153
repolist: 12,252
== END   yum repolist all ==

== BEGIN egrep 'include|exclude' /etc/yum.repos.d/*.repo ==
== END   egrep 'include|exclude' /etc/yum.repos.d/*.repo ==

== BEGIN sed -n -e "/^\[/h; /priority *=/{ G; s/\n/ /; s/ity=/ity = /; p }" /etc/yum.repos.d/*.repo | sort -k3n ==
== END   sed -n -e "/^\[/h; /priority *=/{ G; s/\n/ /; s/ity=/ity = /; p }" /etc/yum.repos.d/*.repo | sort -k3n ==

== BEGIN cat /etc/fstab ==

#
# /etc/fstab
# Created by anaconda on Thu Jun 13 01:39:09 2013
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/vg_isengard-lv_root /                       ext4    defaults        1 1
UUID=3297a90e-ba3a-43a8-be48-e9f2a667a122 /boot                   ext4    defaults        1 2
/dev/mapper/vg_isengard-lv_home /home                   ext4    defaults        1 2
/dev/mapper/vg_isengard-lv_swap swap                    swap    defaults        0 0
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0


UUID=9f665ca0-4f01-47bf-a0f9-4b64f855bac3 /volumes/raid1    ext4       noauto   0   0
UUID=660ff2b7-269f-4706-b7e8-e0fc8eb8147f /volumes/raid2    ext3       noauto   0   0
#UUID=dd5e73d4-111f-49a3-b7c1-f0bc37d28d94 /volumes/raid3    ext4       noauto   0   0

== END   cat /etc/fstab ==

== BEGIN df -h ==
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_isengard-lv_root
                       50G  4.4G   43G  10% /
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/sdn1             485M   38M  422M   9% /boot
/dev/mapper/vg_isengard-lv_home
                       53G  180M   50G   1% /home
/dev/md0              5.4T  2.4T  2.8T  46% /volumes/raid1
/dev/md1              6.3T  4.1T  1.9T  69% /volumes/raid2
== END   df -h ==

== BEGIN fdisk -lu ==

Disk /dev/sdn: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders, total 234441648 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00095386

   Device Boot      Start         End      Blocks   Id  System
/dev/sdn1   *        2048     1026047      512000   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sdn2         1026048   234440703   116707328   8e  Linux LVM

Disk /dev/sdm: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/sdg: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/sdl: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/sdf: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/sde: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/sdd: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/sdh: 3000.6 GB, 3000592982016 bytes
255 heads, 63 sectors/track, 364801 cylinders, total 5860533168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000


Disk /dev/sdk: 3000.6 GB, 3000592982016 bytes
255 heads, 63 sectors/track, 364801 cylinders, total 5860533168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000


Disk /dev/sdj: 3000.6 GB, 3000592982016 bytes
255 heads, 63 sectors/track, 364801 cylinders, total 5860533168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000


Disk /dev/sdi: 3000.6 GB, 3000592982016 bytes
255 heads, 63 sectors/track, 364801 cylinders, total 5860533168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000


Disk /dev/mapper/vg_isengard-lv_root: 53.7 GB, 53687091200 bytes
255 heads, 63 sectors/track, 6527 cylinders, total 104857600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/mapper/vg_isengard-lv_swap: 8371 MB, 8371830784 bytes
255 heads, 63 sectors/track, 1017 cylinders, total 16351232 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/md1: 7001.4 GB, 7001415221248 bytes
2 heads, 4 sectors/track, 1709329888 cylinders, total 13674639104 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 458752 bytes
Disk identifier: 0x00000000


Disk /dev/md0: 6000.9 GB, 6000916561920 bytes
2 heads, 4 sectors/track, 1465067520 cylinders, total 11720540160 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 524288 bytes / 1048576 bytes
Disk identifier: 0x00000000


Disk /dev/mapper/vg_isengard-lv_home: 57.4 GB, 57445187584 bytes
255 heads, 63 sectors/track, 6983 cylinders, total 112197632 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/sdo: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000


Disk /dev/sdp: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000


Disk /dev/sdq: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000


Disk /dev/sdr: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000


Disk /dev/sds: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000


Disk /dev/sdt: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000


Disk /dev/sdu: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000


Disk /dev/md2: 7000.9 GB, 7000924094464 bytes
2 heads, 4 sectors/track, 1709209984 cylinders, total 13673679872 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 524288 bytes / 3670016 bytes
Disk identifier: 0x00000000


Disk /dev/sdv: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000

== END   fdisk -lu ==

== BEGIN blkid ==
/dev/mapper/vg_isengard-lv_root: UUID="0eabf412-b8cb-4207-af62-395c57237ba3" TYPE="ext4" 
/dev/sda: UUID="11f961e7-0e37-ba39-2c8a-155276dd72ee" UUID_SUB="6df28169-8be8-643e-8759-a2d665f2e12c" LABEL="isengard.loc:0" TYPE="linux_raid_member" 
/dev/mapper/vg_isengard-lv_swap: UUID="4ccffacd-960e-4a58-817d-b06bdee9c244" TYPE="swap" 
/dev/mapper/vg_isengard-lv_home: UUID="01950f07-feef-4bf3-a096-76049f562d60" TYPE="ext4" 
/dev/md0: UUID="9f665ca0-4f01-47bf-a0f9-4b64f855bac3" TYPE="ext4" 
/dev/sdc: UUID="11f961e7-0e37-ba39-2c8a-155276dd72ee" UUID_SUB="9c844f77-e526-48af-822f-f1b7fce823c8" LABEL="isengard.loc:0" TYPE="linux_raid_member" 
/dev/sdd: UUID="11f961e7-0e37-ba39-2c8a-155276dd72ee" UUID_SUB="01846b52-24bf-4a84-57cb-b148de2fa095" LABEL="isengard.loc:0" TYPE="linux_raid_member" 
/dev/sdb: UUID="11f961e7-0e37-ba39-2c8a-155276dd72ee" UUID_SUB="ee7c8d79-862c-0e13-078e-004986cd9f29" LABEL="isengard.loc:0" TYPE="linux_raid_member" 
/dev/md1: UUID="660ff2b7-269f-4706-b7e8-e0fc8eb8147f" SEC_TYPE="ext2" TYPE="ext3" 
/dev/sdf: UUID="11f961e7-0e37-ba39-2c8a-155276dd72ee" TYPE="linux_raid_member" 
/dev/sdh: UUID="2e257e19-33da-b86c-2e11-2e06b386598e" TYPE="linux_raid_member" UUID_SUB="6df28169-8be8-643e-8759-a2d665f2e12c" LABEL="isengard.loc:0" 
/dev/sdi: UUID="2e257e19-33da-b86c-2e11-2e06b386598e" TYPE="linux_raid_member" UUID_SUB="ee7c8d79-862c-0e13-078e-004986cd9f29" LABEL="isengard.loc:0" 
/dev/sdj: UUID="2e257e19-33da-b86c-2e11-2e06b386598e" TYPE="linux_raid_member" UUID_SUB="9c844f77-e526-48af-822f-f1b7fce823c8" LABEL="isengard.loc:0" 
/dev/sdk: UUID="2e257e19-33da-b86c-2e11-2e06b386598e" TYPE="linux_raid_member" UUID_SUB="01846b52-24bf-4a84-57cb-b148de2fa095" LABEL="isengard.loc:0" 
/dev/sdm: UUID="11f961e7-0e37-ba39-2c8a-155276dd72ee" TYPE="linux_raid_member" 
/dev/sdn1: UUID="3297a90e-ba3a-43a8-be48-e9f2a667a122" TYPE="ext4" 
/dev/sdn2: UUID="rI07QE-15qa-6V7G-pm4c-mLC6-HMDh-RvbO9Y" TYPE="LVM2_member" 
/dev/sdg: UUID="11f961e7-0e37-ba39-2c8a-155276dd72ee" TYPE="linux_raid_member" 
/dev/sde: UUID="11f961e7-0e37-ba39-2c8a-155276dd72ee" TYPE="linux_raid_member" 
/dev/sdo: UUID="12ea5e09-f9c6-2e88-0b46-fb9654199143" UUID_SUB="860633a2-37a7-d9b5-945d-5ba1001db74a" LABEL="isengard.loc:2" TYPE="linux_raid_member" 
/dev/sdp: UUID="12ea5e09-f9c6-2e88-0b46-fb9654199143" UUID_SUB="eb1b7b66-c9c5-4f44-e288-9cdda6bb4dac" LABEL="isengard.loc:2" TYPE="linux_raid_member" 
/dev/sdq: UUID="12ea5e09-f9c6-2e88-0b46-fb9654199143" UUID_SUB="77a21bf0-af2b-8aea-09b4-85439f0b3254" LABEL="isengard.loc:2" TYPE="linux_raid_member" 
/dev/sdr: UUID="12ea5e09-f9c6-2e88-0b46-fb9654199143" UUID_SUB="24f4376a-2ddc-9651-9295-d327b7e1cfdc" LABEL="isengard.loc:2" TYPE="linux_raid_member" 
/dev/sds: UUID="12ea5e09-f9c6-2e88-0b46-fb9654199143" UUID_SUB="1d46a3f7-4a33-0556-bab9-51e66827e040" LABEL="isengard.loc:2" TYPE="linux_raid_member" 
/dev/sdt: UUID="8cd30df3-4acb-2c76-3459-7cdca00eb438" UUID_SUB="32d0c90d-fb0d-d4dd-e21c-2b85d951bc61" LABEL="isengard.loc:200" TYPE="linux_raid_member" 
/dev/sdu: UUID="8cd30df3-4acb-2c76-3459-7cdca00eb438" UUID_SUB="3a60f8ab-2e12-c251-e5fa-0a5270ea1292" LABEL="isengard.loc:200" TYPE="linux_raid_member" 
/dev/sdv: UUID="8cd30df3-4acb-2c76-3459-7cdca00eb438" UUID_SUB="f48f64e1-8c4b-feb9-bbb0-ea71e190f546" LABEL="isengard.loc:200" TYPE="linux_raid_member" 
== END   blkid ==

== BEGIN cat /proc/mdstat ==
Personalities : [raid6] [raid5] [raid4] [raid10] [raid0] 
md2 : active raid10 sds[4] sdr[3] sdq[2] sdp[1] sdo[0]
      6836839936 blocks super 1.2 512K chunks 2 near-copies [7/5] [UUUUU__]
      
md0 : active raid10 sdk[2] sdh[0] sdj[1] sdi[3]
      5860270080 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      
md1 : active raid5 sdm[1] sde[7] sdd[2] sdf[6] sdc[0] sdb[3] sda[5] sdg[4]
      6837319552 blocks level 5, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]
      
unused devices: <none>
== END   cat /proc/mdstat ==

== BEGIN pvs ==
  PV         VG          Fmt  Attr PSize   PFree
  /dev/sdn2  vg_isengard lvm2 a--  111.30g    0 
== END   pvs ==

== BEGIN vgs ==
  VG          #PV #LV #SN Attr   VSize   VFree
  vg_isengard   1   3   0 wz--n- 111.30g    0 
== END   vgs ==

== BEGIN lvs ==
  LV      VG          Attr      LSize  Pool Origin Data%  Move Log Cpy%Sync Convert
  lv_home vg_isengard -wi-ao--- 53.50g                                             
  lv_root vg_isengard -wi-ao--- 50.00g                                             
  lv_swap vg_isengard -wi-ao---  7.80g                                             
== END   lvs ==

== BEGIN rpm -qa kernel\* | sort ==
kernel-2.6.32-358.el6.x86_64
kernel-devel-2.6.32-358.el6.x86_64
kernel-firmware-2.6.32-358.el6.noarch
kernel-headers-2.6.32-358.el6.x86_64
== END   rpm -qa kernel\* | sort ==

== BEGIN lspci -nn ==
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v2/Ivy Bridge DRAM Controller [8086:0158] (rev 09)
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port [8086:0151] (rev 09)
00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port [8086:0155] (rev 09)
00:06.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port [8086:015d] (rev 09)
00:19.0 Ethernet controller [0200]: Intel Corporation 82579LM Gigabit Network Connection [8086:1502] (rev 05)
00:1a.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 [8086:1c2d] (rev 05)
00:1c.0 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 [8086:1c10] (rev b5)
00:1c.4 PCI bridge [0604]: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 5 [8086:1c18] (rev b5)
00:1d.0 USB controller [0c03]: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 [8086:1c26] (rev 05)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge [8086:244e] (rev a5)
00:1f.0 ISA bridge [0601]: Intel Corporation C204 Chipset Family LPC Controller [8086:1c54] (rev 05)
00:1f.2 SATA controller [0106]: Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller [8086:1c02] (rev 05)
00:1f.3 SMBus [0c05]: Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller [8086:1c22] (rev 05)
01:00.0 Fibre Channel [0c04]: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA [1077:2432] (rev 03)
01:00.1 Fibre Channel [0c04]: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA [1077:2432] (rev 03)
02:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
03:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
04:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] [1000:0072] (rev 03)
05:00.0 Ethernet controller [0200]: Intel Corporation 82574L Gigabit Network Connection [8086:10d3]
06:03.0 VGA compatible controller [0300]: Matrox Electronics Systems Ltd. MGA G200eW WPCM450 [102b:0532] (rev 0a)
== END   lspci -nn ==

== BEGIN lsusb ==
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 002 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
== END   lsusb ==

== BEGIN rpm -qa kmod\* kmdl\* ==
== END   rpm -qa kmod\* kmdl\* ==

== BEGIN ifconfig -a ==
eth0      Link encap:Ethernet  HWaddr 00:25:90:AF:55:BF  
          inet addr:10.1.1.50  Bcast:10.1.255.255  Mask:255.255.0.0
          inet6 addr: fe80::225:90ff:feaf:55bf/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:6752330391 errors:1222 dropped:17 overruns:0 frame:611
          TX packets:28304551122 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1970221112161 (1.7 TiB)  TX bytes:41922457836357 (38.1 TiB)
          Interrupt:20 Memory:dfd00000-dfd20000 

eth1      Link encap:Ethernet  HWaddr 00:25:90:AF:55:BE  
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:16 Memory:dfb00000-dfb20000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:1186 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1186 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:86170 (84.1 KiB)  TX bytes:86170 (84.1 KiB)

== END   ifconfig -a ==

== BEGIN brctl show ==
bridge name	bridge id		STP enabled	interfaces
== END   brctl show ==

== BEGIN route -n ==
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.1.0.0        0.0.0.0         255.255.0.0     U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     1002   0        0 eth0
0.0.0.0         10.1.1.1        0.0.0.0         UG    0      0        0 eth0
== END   route -n ==

== BEGIN sysctl -a | grep .rp_filter ==
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.all.arp_filter = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.arp_filter = 0
net.ipv4.conf.lo.rp_filter = 1
net.ipv4.conf.lo.arp_filter = 0
net.ipv4.conf.eth0.rp_filter = 1
net.ipv4.conf.eth0.arp_filter = 0
net.ipv4.conf.eth1.rp_filter = 1
net.ipv4.conf.eth1.arp_filter = 0
== END   sysctl -a | grep .rp_filter ==

== BEGIN ip rule show ==
0:	from all lookup local 
32766:	from all lookup main 
32767:	from all lookup default 
== END   ip rule show ==

== BEGIN ip route show ==
10.1.0.0/16 dev eth0  proto kernel  scope link  src 10.1.1.50 
169.254.0.0/16 dev eth0  scope link  metric 1002 
default via 10.1.1.1 dev eth0 
== END   ip route show ==

== BEGIN cat /etc/resolv.conf ==
# Generated by NetworkManager
#search loc

nameserver 10.1.1.10

# No nameservers found; try putting DNS servers into your
# ifcfg files in /etc/sysconfig/network-scripts like so:
#
# DNS1=xxx.xxx.xxx.xxx
# DNS2=xxx.xxx.xxx.xxx
# DOMAIN=lab.foo.com bar.foo.com
== END   cat /etc/resolv.conf ==

== BEGIN egrep 'net|hosts' /etc/nsswitch.conf ==
#hosts:     db files nisplus nis dns
hosts:      files dns
#networks:   nisplus [NOTFOUND=return] files
#netmasks:   nisplus [NOTFOUND=return] files     
netmasks:   files
networks:   files
netgroup:   nisplus
== END   egrep 'net|hosts' /etc/nsswitch.conf ==

== BEGIN chkconfig --list | grep -Ei 'network|wpa' ==
network        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
wpa_supplicant 	0:off	1:off	2:off	3:off	4:off	5:off	6:off
== END   chkconfig --list | grep -Ei 'network|wpa' ==

mv2devnull · Jul 18, 2014

You seem to have three arrays. Who are we talking about?

What was the error with

Code:

mdadm --stop /dev/mdX

Red Squirrel · Jul 18, 2014

md2 is the bad one that I want to remove. md0 and md1 are production and working fine.

This is the error I get when I try to stop the array.

Code:

[root@isengard ~]# mdadm --stop /dev/md2
mdadm: Cannot get exclusive access to /dev/md2:Perhaps a running process, mounted filesystem or active volume group?
[root@isengard ~]#

It's not mounted. --force does not work either.

Edit: I'll be gone for like a week starting tomorrow, so I'll revisit that then, but open to more ideas of stuff to try.

manly · Jul 25, 2014

well the easy workaround is to reboot into single user mode or rescue Linux.

Red Squirrel · Jul 26, 2014

Can't reboot this machine. It's a EXTREMELY last resort. It would mean I need to take down my whole network. My VMs, file shares, everything depends on this file server. It's probably closer to a SAN than a server, in a way. It's rare a server reboots 100% cleanly, there's usually some kind of issue that has to be tended, and it would take a long time if I need to reboot everything.

I think I will just go ahead and physically pull the drives out and hope for the best... Just got back from vacation so I'll try that once all my stuff is settled. I still need to transfer/sort my pics then back them up before I do anything to do with my file server.

Red Squirrel · Jul 28, 2014

Well I went ahead and pulled the 8 drives out. Good news is, it released all the drive "letters" so I'm not stuck with a bunch of stale entries. Bad news is, I STILL can't stop the array, despite the fact that it has NO DISKS.

Code:

mdadm --detail /dev/md2
/dev/md2:
        Version : 1.2
  Creation Time : Sat Jun 28 14:47:10 2014
     Raid Level : raid10
     Array Size : 6836839936 (6520.12 GiB 7000.92 GB)
  Used Dev Size : 1953382912 (1862.89 GiB 2000.26 GB)
   Raid Devices : 7
  Total Devices : 5
    Persistence : Superblock is persistent

    Update Time : Mon Jul 28 18:12:17 2014
          State : clean, FAILED 
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

    Number   Major   Minor   RaidDevice State
       0       8      224        0      active sync
       1       8      240        1      active sync
       2      65        0        2      active sync
       3      65       16        3      active sync
       4      65       32        4      active sync
       5       0        0        5      removed
       6       0        0        6      removed

Is there anything else I can try? Worse case scenario I will just create md3 and ignore md2 but I'm hoping there's a way (without rebooting) to clear md2.

I guess worse case scenario I can create it as md3 but put md2 in the config file, and I mount by uuid in fstab, so if ever I do have to reboot it should hopefully fix itself.

Red Squirrel · Jul 29, 2014

Got it! It turns out those two drives were still registered with a raid 0 test array I had created (was the only 2 drives I was able to force out of the old raid 10), I thought I stopped that array but apparently not. All good now. New 7.2TB raid 10 is live!

That old raid 10 is still stuck though but as long as it does not come to haunt me later I should be ok...

timinski · Dec 28, 2015

@Red_Squirrel: I have the same/similar issue and have lost much time troubleshooting it. How do you find out the drives and/or partitions were still registered with the old RAID 0 test array? My situation is with a RAID1 setup....

Red Squirrel · Dec 28, 2015

I never did end up getting rid of it, it's still there, I just put up with it. It scares the crap out of me every time I see it then realize it's not an active array. :biggrin:

Code:

[root@isengard ~]# mdadm --detail /dev/md2
/dev/md2:
        Version : 1.2
  Creation Time : Sat Jun 28 14:47:10 2014
     Raid Level : raid10
     Array Size : 6836839936 (6520.12 GiB 7000.92 GB)
  Used Dev Size : 1953382912 (1862.89 GiB 2000.26 GB)
   Raid Devices : 7
  Total Devices : 5
    Persistence : Superblock is persistent

    Update Time : Sun Dec  6 14:19:31 2015
          State : clean, FAILED 
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 512K

    Number   Major   Minor   RaidDevice State
       0       8      224        0      active sync
       1       8      240        1      active sync
       2      65        0        2      active sync
       3      65       16        3      active sync
       4      65       32        4      active sync
      10       0        0       10      removed
      12       0        0       12      removed
[root@isengard ~]#

I still need to get the courage to schedule a reboot for this server, to troubleshoot another issue that I think a kernel update might fix. Kinda scared to do that though, so I will probably wait till a point where I can build a secondary storage server and setup some kind of replication/failover setup. My guess is that a reboot will probably clear up that array.

Did a boo boo a while back involving system files, so a reboot may potentially be a catastrophe for this server so been holding off on it. Need to look into some kind of redundant (at the server level) storage setup at some point.

timinski · Dec 28, 2015

I hear your pain regarding production server. My question was more to the nuance of how exactly you find out the drivers/partitions were still registered. I can't spot that on my target system.

Between my original post and your response, I did however note that the old md128 device I'm trying to kill used superblock 1.2 and my current re-installation setup is 0.90

timinski · Dec 28, 2015

timinski said:
I hear your pain regarding production server. My question was more to the nuance of how exactly you find out the drivers/partitions were still registered. I can't spot that on my target system.

Between my original post and your response, I did however note that the old md128 device I'm trying to kill used superblock 1.2 and my current re-installation setup is 0.90

i.e., perhaps this latter characteristic is causing my issue....

Red Squirrel · Dec 28, 2015

timinski said:
I hear your pain regarding production server. My question was more to the nuance of how exactly you find out the drivers/partitions were still registered. I can't spot that on my target system.

Between my original post and your response, I did however note that the old md128 device I'm trying to kill used superblock 1.2 and my current re-installation setup is 0.90

Hmm not sure what you mean. I know because I still see /dev/md2 registered, when it should not be, because all the drives were pulled out. If I try to remove that array it fails. Originally it would not actually let me force out the drives, so I had to physically pull them out of their respective bays, so that I can then use them in the new array.

I don't have the md128 device thankfully though. I've seen that before and it's always annoying. I have md0, md1 and md3 which are valid, and the defunct md2 that I can't do anything with. md3 was suppose to be md2.

master_shake_ · Dec 29, 2015

I just realized how much I love megaraid storage manager.

That all looks like gibberish to me.

Red Squirrel · Dec 29, 2015

master_shake_ said:
I just realized how much I love megaraid storage manager.

That all looks like gibberish to me.

The nice thing with md raid is that it's hardware independent. Only thing I'd really want to use hardware raid for is OS drive, since it's kinda hard to use software raid for it. (it somehow can be done, using two boot partitions or something but too complicated for my tastes). With md raid I can survive a controller failure, total system failure etc... just put the drives in a new system and reassemble the array.

ZFS looks pretty interesting too, I may try it some time.

The nice thing with command line tools too is it's easy to monitor. I have several monitors that look at the status of the array by parsing out the status line and what not. So if I get a disk failure I get a notice and email. Though the raid system sends it's own email too, but it's always nice to tie it into the main monitoring system as well.

How do I force stop a md array?

No Lifer

Lifer

Lifer

No Lifer

Platinum Member

No Lifer

Platinum Member

No Lifer

Golden Member

No Lifer

Golden Member

No Lifer

Golden Member

No Lifer

Lifer

No Lifer

No Lifer

No Lifer

Junior Member

No Lifer

Junior Member

Junior Member

No Lifer

Diamond Member

No Lifer