Thursday, August 16, 2012

How to remove an FC LUN from a running RedHat 6 server.

This quick howto shows how to remove a fibre channel LUN under multipathd(8) control from a running RedHat Enterprise Linux 6 machine. Be careful when performing online storage modifications. Make sure you have a valid backup. And of course I can't be held responsible for any problems if you follow these steps ;)

In our example, we have an LVM volume mounted on /export/oracle which is under multipathd(8) control. We will remove this volume from the server without taking the machine down.

So first, make sure the mount point is not used anymore. Check your applications and users and remove all references to this device.

If the volume is mounted, check whether it's still in use and, if not, unmount it. The fuser(1) and lsof(1) commands can tell you if the device is in use. Don't forget that if this file system is shared via NFS, you will need to stop the NFS daemons before you can umount(1) it.
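
On RHEL 6, assuming the stock init scripts, stopping the NFS daemons would look like this :

sudo service nfs stop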

df -h /export/oracle

Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/ora-bckp  2.0T  1.4T  509G  74% /export/oracle


sudo fuser -m /export/oracle
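
If fuser(1) comes back empty, lsof(1) can double-check for open files on the mount point :

sudo lsof /export/oracle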


Now unmount the file system.

sudo umount /export/oracle

From the df(1) command above, we saw that the /export/oracle file system is in fact an LVM logical volume called « bckp » from the volume group « ora ». Let's take a look at the LVM configuration for both of these objects, starting with the logical volume.


sudo lvs ora/bckp
  LV   VG   Attr      LSize Pool Origin Data%  Move Log Cpy%Sync Convert
  bckp ora  -wi-a---- 2.00t                                            

Then the volume group.

sudo vgs ora
  VG   #PV #LV #SN Attr   VSize VFree
  ora    4   1   0 wz--n- 2.00t    0 

And finally, the physical devices.


sudo pvs | egrep 'PV|ora'

  PV                   VG   Fmt  Attr PSize   PFree
  /dev/mapper/backup01 ora  lvm2 a--  512.00g    0 
  /dev/mapper/backup02 ora  lvm2 a--  512.00g    0 
  /dev/mapper/backup03 ora  lvm2 a--  512.00g    0 
  /dev/mapper/backup04 ora  lvm2 a--  512.00g    0 


Take note of these four LVM physical devices; we will use this info later. But first, we must remove the logical volume and then the volume group from LVM.
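
If you'd rather not be prompted about removing an active logical volume, you can first deactivate the volume group (optional) :

sudo vgchange -an ora

Now remove the logical volume.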


sudo lvremove ora/bckp
Do you really want to remove active logical volume bckp? [y/n]: y
  Logical volume "bckp" successfully removed

Then we remove the volume group.



sudo vgremove ora
  Volume group "ora" successfully removed

We can now work on the LVM physical devices.



sudo pvremove /dev/mapper/backup01
  Labels on physical volume "/dev/mapper/backup01" successfully wiped
sudo pvremove /dev/mapper/backup02 /dev/mapper/backup03 /dev/mapper/backup04
  Labels on physical volume "/dev/mapper/backup02" successfully wiped
  Labels on physical volume "/dev/mapper/backup03" successfully wiped
  Labels on physical volume "/dev/mapper/backup04" successfully wiped
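
A quick pvs(8) check should confirm the labels are gone; this should print nothing :

sudo pvs | grep backup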


Good, now let's check the multipath status for these four LVM physical devices.

sudo multipath -ll
[...output truncated...]

backup04 (3600508b4000c1ec00001400000b30000) dm-2 HP,HSV300
size=512G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 2:0:0:4 sdd  8:48   active ready running
| `- 3:0:3:4 sdt  65:48  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 2:0:3:4 sdj  8:144  active ready running
  `- 3:0:0:4 sdn  8:208  active ready running
backup03 (3600508b4000c1ec00001400000a60000) dm-3 HP,HSV300
size=512G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 2:0:3:3 sdi  8:128  active ready running
| `- 3:0:0:3 sdm  8:192  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 2:0:0:3 sdc  8:32   active ready running
  `- 3:0:3:3 sds  65:32  active ready running
backup02 (3600508b4000c1ec00001400000980000) dm-1 HP,HSV300
size=512G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 2:0:0:2 sdb  8:16   active ready running
| `- 3:0:3:2 sdr  65:16  active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 2:0:3:2 sdh  8:112  active ready running
  `- 3:0:0:2 sdl  8:176  active ready running
backup01 (3600508b4000c1ec00001400000840000) dm-0 HP,HSV300
size=512G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| |- 2:0:0:1 sda  8:0    active ready running
| `- 3:0:3:1 sdq  65:0   active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  |- 2:0:3:1 sdg  8:96   active ready running
  `- 3:0:0:1 sdk  8:160  active ready running


Record the SCSI IDs (the H:C:T:L numbers such as 2:0:0:4 in the output above) as we need them later. A quick way to do so is like this :

for i in backup01 backup02 backup03 backup04; do
      sudo multipath -ll $i | grep ':' | sed -e "s/.- //g" -e "s/^| //g" -e "s/  //g" | cut -d' ' -f1 | tee -a /tmp/ids
done
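
The /tmp/ids file should now contain one SCSI ID per line, four per LUN :

cat /tmp/ids
2:0:0:1
3:0:3:1
2:0:3:1
3:0:0:1
[...and so on for backup02 to backup04...]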

Now remove the LUNs from multipathd(8) control. Note that multipath -f flushes one device map at a time, so we loop over our four maps.

for i in backup01 backup02 backup03 backup04; do sudo multipath -f $i; done

Once that's done, make sure they're not listed in the following output.

sudo multipath -ll | grep backup

Update /etc/multipath.conf to remove the LUNs. In this example, I removed this block from the file's multipaths section. YMMV of course, because your wwids will obviously not be the same.

sudo vim /etc/multipath.conf

<remove>

        multipath {
                wwid "3600508b4000c1ec00001400000840000"
                alias backup01
        }
        multipath {
                wwid "3600508b4000c1ec00001400000980000"
                alias backup02
        }
        multipath {
                wwid "3600508b4000c1ec00001400000a60000"
                alias backup03
        }
        multipath {
                wwid "3600508b4000c1ec00001400000b30000"
                alias backup04
        }
</remove>

Tell multipathd(8) that the configuration has changed.

sudo /etc/init.d/multipathd reload
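
To double-check the running state, multipathd(8)'s interactive console can list the current maps; the backup aliases should be gone :

sudo multipathd -k'show maps'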

Clear the devices from the SCSI subsystem. This is where we need the recorded output from above : the Host:Channel:Target:LUN numbers, which look like 2:0:1:3 in the `multipath -ll` output. Since we previously saved our SCSI IDs in the /tmp/ids file, we can simply do this :

sudo su - root
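# we need a real root shell here: sudo alone would not apply to the > redirection below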
cat /tmp/ids | while read id; do
   echo "1" > /sys/class/scsi_device/${id}/device/delete
done

This will generate log entries similar to this one in /var/log/messages :

Aug 16 13:19:52 oxygen multipathd: sdw: remove path (uevent)

Now that we have safely removed the LUNs from the server, we can remove those LUNs from the storage array. Once you do this, the server from which we just removed the LUNs will complain in its /var/log/messages :

Aug 16 13:48:59 oxygen kernel: sd 5:0:0:1: [sdc] Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatically remap LUN assignments.

These are warning messages only and can be safely ignored. To be thorough, we should also issue a LIP from each of the HBA ports on the server. If you don't know how many HBA ports you have, just look in the /sys/class/fc_host directory; there will be one sub-directory per HBA port. In this example, the machine has two single-port HBAs, so we have two sub-directories.

ls /sys/class/fc_host/
host2  host3

To issue a LIP reset, simply do this.

sudo su - root
ls /sys/class/fc_host/ | while read dir; do
    echo $dir
    echo 1 > /sys/class/fc_host/${dir}/issue_lip
done
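
One last sanity check doesn't hurt; this should print nothing :

sudo multipath -ll | grep backup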

And that's it!

Should you want to read more about online storage management under RedHat 6, read the Red Hat Enterprise Linux 6 Storage Administration Guide, « Deploying and configuring single-node storage in Red Hat Enterprise Linux 6 ».

HTH,

DA+
