QLA Linux and RDAC - Sun 6140 Failover

Accessing a Sun StorageTek 6140 array with multipathing (failover) over Fibre Channel from a Linux host is not easy. (You'll dream about having scsi_vhci MPxIO out of the box in Linux...)

First: The Sun StorageTek 6140 does not support symmetric multipathing; only one path and controller is active for a given volume. Trying to access the volume on another path results in SCSI errors. So the failover code has to talk to the array if it wishes to change the active path.

Second: Linux (with current Q-Logic drivers) offers two multipathing strategies, and neither of them is usable with these Sun arrays:
  1. Q-Logic qla2xxx failover: Arrays must support Target Group Mode, path activation via CDB or by LUN reset. This seems not to be the case for the StorageTek 6140.
  2. dm-multipath, a part of the Linux dm suite. It only supports symmetric multipathing, so it is not usable either.

Sun itself offers a "Linux-RDAC for Linux 09" for download, but it won't compile under SLES 10, SLES 11 or Redhat 5. It complains about a missing scsi/scsi_request.h.
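
Whether a given kernel tree still ships that header can be checked before attempting a build. The following is only a sketch: the function name is made up, and the default path assumes that /lib/modules/<version>/source points at the kernel source tree, which may differ on your distribution.

```shell
# Hypothetical helper: report whether a kernel tree still ships
# scsi/scsi_request.h, the header the old Sun-provided RDAC source asks for.
# $1 = kernel source tree (assumed default: /lib/modules/$(uname -r)/source)
has_scsi_request_h() {
    tree="${1:-/lib/modules/$(uname -r)/source}"
    [ -f "$tree/include/scsi/scsi_request.h" ]
}
```

Used as `has_scsi_request_h || echo "old RDAC source will not build here"`, it gives a quick hint whether the newer LSI package is needed.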

I found a current version on the LSI website:

http://www.lsi.com/rdac/ds4000.html#current

The file needed for SLES 10, SLES 11 or Redhat 5 is:

http://www.lsi.com/rdac/rdac-LINUX-09.03.0C05.0214-source.tar.gz

Don't mind that it is labeled "DS4000". This RDAC driver works well with Sun arrays.

Prerequisites:
  1. You MUST have the kernel source and the gcc compiler installed. The installation programs from SUSE and Redhat will install dependencies (header files, ...) automatically.
  2. The Q-Logic driver has to be loaded:
    modprobe qla2xxx
  3. Disable the Q-Logic failover code by setting ql2xfailover=0 as a qla2xxx module option in the appropriate modprobe.conf.local file.
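
The third prerequisite can be sketched as a small idempotent helper. It assumes the usual modprobe `options <module> <parameter>=<value>` line format and a qla2xxx driver build that actually knows the ql2xfailover parameter; the function name is hypothetical.

```shell
# Hypothetical helper: make sure the qla2xxx failover option is present once.
# $1 = modprobe config file (on the SLES system shown here: /etc/modprobe.conf.local)
ensure_ql2xfailover_off() {
    conf="$1"
    line="options qla2xxx ql2xfailover=0"
    touch "$conf" || return 1
    # Append only if the exact line is not already there, so reruns are harmless.
    grep -qxF "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
}
```

As root this would be called as `ensure_ql2xfailover_off /etc/modprobe.conf.local`. If the option is already set elsewhere with a different value, the RDAC installer warns about duplicate module options, as seen in the build output.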

Become root on your machine, unpack the source code, change to the source directory and type make:

primitivo:/usr/src/sun-rdac # wget http://www.lsi.com/rdac/rdac-LINUX-09.03.0C05.0214-source.tar.gz
--10:48:20--  http://www.lsi.com/rdac/rdac-LINUX-09.03.0C05.0214-source.tar.gz
           => `rdac-LINUX-09.03.0C05.0214-source.tar.gz'
Resolving www.lsi.com... 192.19.195.53
Connecting to www.lsi.com|192.19.195.53|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 376,020 (367K) [application/x-gzip]

100%[========================================================================================>] 376,020      231.11K/s

10:48:22 (230.46 KB/s) - `rdac-LINUX-09.03.0C05.0214-source.tar.gz' saved [376020/376020]

primitivo:/usr/src/sun-rdac # tar zxf rdac-LINUX-09.03.0C05.0214-source.tar.gz
primitivo:/usr/src/sun-rdac # cd linuxrdac-09.03.0C05.0214
primitivo:/usr/src/sun-rdac/linuxrdac-09.03.0C05.0214 # make

The output should look like this:

make[1]: Entering directory `/usr/src/linux-2.6.16.60-0.42.4-obj/x86_64/smp'
make -C ../../../linux-2.6.16.60-0.42.4 O=../linux-2.6.16.60-0.42.4-obj/x86_64/smp modules
  CC [M]  /usr/src/sun-rdac/linuxrdac-09.03.0C05.0214/MPP_hba.o
  CC [M]  /usr/src/sun-rdac/linuxrdac-09.03.0C05.0214/mppLnx26p_upper.o


[... Many lines omitted ...]


Checking Host Adapter Configuration...
Detected 1 LSI Host Adapter Port(s) on the system
Detected 2 QLogic Host Adapter Port(s) on the system
Host Adapters from different supported vendors co-exists on your system.
Please wait while we modify the system configuration files.
Your kernel version is 2.6.16.60-0.42.4-smp
Preparing to install MPP driver against this kernel version...
Generating module dependencies...
Warning:Duplicate module options detected.
Option in /etc/modprobe.conf.local ( ql2xfailover=0 ) takes precedence
Creating new MPP initrd image...
Root device:    /dev/disk/by-id/scsi-3600508e0000000009ec1e521e0b4940e-part5 (/dev/sda5) (mounted on / as ext3)
Module list:     scsi_mod sd_mod sg mppUpper amd74xx mptsas processor thermal fan jbd ext3 edd qla2xxx_conf qla2xxx mppVhba (xennet xenblk)

[...]

Kernel image:   /boot/vmlinuz-2.6.16.60-0.42.4-smp
Initrd image:   /boot/mpp-2.6.16.60-0.42.4-smp.img
        You must now edit your boot loader configuration file, /boot/grub/menu.lst, to
        add a new boot menu, which uses mpp-2.6.16.60-0.42.4-smp.img as the initrd image.
        Now Reboot the system for MPP to take effect.
        The new boot menu entry should look something like this (note that it may
        vary with different system configuration):

        ...

                title SUSE Linux (2.6.16.60-0.42.4-smp) with MPP support
                kernel (hd1,3)/boot/vmlinuz root=/dev/hdb4 vga=0x31a selinux=0 splash=silent console=tty0 resume=/dev/hda2 elevator=cfq showopts
                initrd (hd0,8)/boot/mpp-2.6.16.60-0.42.4-smp.img

You're ready to go!


This created a new initrd image that includes the mpp drivers (RDAC multipathing). Just reference this initrd file in a GRUB entry in /boot/grub/menu.lst; an example is given at the end of the output above.
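
Before rebooting, it is worth confirming that the GRUB configuration really references the new image. A minimal sketch, assuming the GRUB 1 menu.lst format shown in the installer output; the function name is invented for illustration.

```shell
# Hypothetical helper: succeed if a GRUB config has an initrd line that
# points at the MPP image for the given kernel version.
# $1 = kernel version (e.g. the output of `uname -r`), $2 = path to menu.lst
menu_has_mpp_initrd() {
    # First keep only initrd lines, then look for the literal image name.
    grep -E '^[[:space:]]*initrd[[:space:]]' "$2" | grep -qF "mpp-$1.img"
}
```

For example: `menu_has_mpp_initrd "$(uname -r)" /boot/grub/menu.lst || echo "no MPP initrd entry yet"`.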

If you don't want to boot from SAN, you can also bring up the volumes manually.

First, UNLOAD THE Q-LOGIC DRIVER: modprobe -r qla2xxx
If this fails (and only in this case; it's a module dependency bug in SLES 10), run:

  1. rmmod -f scsi_transport_fc
  2. rmmod -f firmware_class
  3. rmmod -f qla2xxx
Then do the following:

  1. Load the new driver "mppUpper":

    modprobe mppUpper

  2. Load the Q-Logic hardware driver:

    modprobe qla2xxx

    This should result in a  dmesg  output like this:

    qla2xxx 0000:83:00.0: LOOP UP detected (2 Gbps).
      Vendor: SUN       Model: CSM200_R          Rev: 0710
      Type:   Direct-Access                      ANSI SCSI revision: 05
    736 [RAIDarray.mpp]Host 52 Target 0 Lun 0 Is a physical device but is an Unconfigured Device.
      Vendor: SUN       Model: CSM200_R          Rev: 0710
      Type:   Direct-Access                      ANSI SCSI revision: 05
      Vendor: SUN       Model: CSM200_R          Rev: 0710
      Type:   Direct-Access                      ANSI SCSI revision: 05

  3. Load the new mpp virtual host adapter:

    modprobe mppVhba

  4. The new multipathed drives should appear:

    scsi54 : mpp virtual bus adapter :version:09.03.0C05.0214,timestamp:Fri Jun 26 18:02:53 CDT 2009
     54:0:0:0: scsi scan: consider passing scsi_mod.dev_flags=SUN:VirtualDisk:0x240 or 0x1000240
      Vendor: SUN       Model: VirtualDisk       Rev: 0710
      Type:   Direct-Access                      ANSI SCSI revision: 05
    scsi(54:0:0:10): Enabled tagged queuing, queue depth 30.
    SCSI device sdb: 1048576000 512-byte hdwr sectors (536871 MB)
    sdb: Write Protect is off
    sdb: Mode Sense: 77 00 10 08
    SCSI device sdb: drive cache: write back w/ FUA
    SCSI device sdb: 1048576000 512-byte hdwr sectors (536871 MB)
    sdb: Write Protect is off
    sdb: Mode Sense: 77 00 10 08
    SCSI device sdb: drive cache: write back w/ FUA
     sdb: sdb1
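
The load order above can be collected in one function. This is a sketch only: the RUN switch is a made-up convention so the sequence can be previewed safely, and the forced rmmod fallback for the SLES 10 dependency bug is deliberately left out.

```shell
# Sketch of the manual load order described above.
# RUN is a hypothetical switch: by default the commands are only printed;
# call with RUN="" (as root) to really execute them.
load_mpp_stack() {
    run="${RUN-echo}"
    $run modprobe -r qla2xxx || return 1   # unload the HBA driver first
    $run modprobe mppUpper   || return 1   # new RDAC driver "mppUpper"
    $run modprobe qla2xxx    || return 1   # Q-Logic hardware driver
    $run modprobe mppVhba    || return 1   # mpp virtual host adapter
}
```

Calling `load_mpp_stack` prints the four modprobe commands in order; `RUN="" load_mpp_stack` runs them and stops at the first failure.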


To control RDAC operation just use "mppUtil":

primitivo:/dev/disk/by-id # mppUtil -a
Hostname    = primitivo
Domainname  = N/A
Time        = GMT 10/14/2009 13:17:49

---------------------------------------------------------------
Info of Array Module's seen by this Host.
---------------------------------------------------------------
ID              WWN                      Type     Name
---------------------------------------------------------------
 0      600a0b800048a6920000000048699e5b FC     st6100_2
 1      600a0b8000487f960000000048467441 FC     st6100_1
---------------------------------------------------------------


To see detailed information of an array, type:

primitivo:/dev/disk/by-id # mppUtil -a st6100_1
Hostname    = primitivo
Domainname  = N/A
Time        = GMT 10/14/2009 13:18:02

MPP Information:
----------------
      ModuleName: st6100_1                                 SingleController: N
 VirtualTargetID: 0x001                                       ScanTriggered: N
     ObjectCount: 0x000                                          AVTEnabled: N
             WWN: 600a0b8000487f960000000048467441               RestoreCfg: N
    ModuleHandle: none                                        Page2CSubPage: Y
 FirmwareVersion: 7.10.25.xx
   ScanTaskState: 0x00000000
        LBPolicy: LeastQueueDepth


Controller 'A' Status:
-----------------------
ControllerHandle: none                                    ControllerPresent: Y
    UTMLunExists: Y (127)                                            Failed: N
   NumberOfPaths: 1                                          FailoverInProg: N
                                                                ServiceMode: N

    Path #1
    ---------
 DirectoryVertex: present                                           Present: Y
       PathState: OPTIMAL
          PathId: 77350000 (hostId: 53, channelId: 0, targetId: 0)


Controller 'B' Status:
-----------------------
ControllerHandle: none                                    ControllerPresent: Y
    UTMLunExists: Y (127)                                            Failed: N
   NumberOfPaths: 1                                          FailoverInProg: N
                                                                ServiceMode: N

    Path #1
    ---------
 DirectoryVertex: present                                           Present: Y
       PathState: OPTIMAL
          PathId: 77340001 (hostId: 52, channelId: 0, targetId: 1)


[... rest omitted ...]
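
For monitoring, the per-controller path state can be pulled out of this report with a few lines of awk. The sketch below relies only on the "Controller 'X' Status:" and "PathState:" lines as printed above; the function name is hypothetical, and other firmware or driver versions may format the report differently.

```shell
# Hypothetical helper: reduce `mppUtil -a <array>` output (read from stdin)
# to one "controller state" pair per path.
path_states() {
    awk '
        # The second field is the quoted controller letter; keep the letter only.
        /^Controller .* Status:/ { ctrl = substr($2, 2, 1) }
        # The last field of a PathState line is the state itself.
        /PathState:/             { print ctrl, $NF }
    '
}
```

With the report shown above, `mppUtil -a st6100_1 | path_states` prints `A OPTIMAL` and `B OPTIMAL`; appending `| grep -v OPTIMAL` leaves only paths that need attention.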

To be sure you mount the right volumes in /etc/fstab or in your cluster/heartbeat configuration, use the "disk by id" functionality. The multipathed disks are listed as
/dev/disk/by-id/scsi-....

 Example:

lrwxrwxrwx 1 root root   9 2009-10-14 14:32 scsi-3600a0b800048a692000007ed4a7ca15a -> ../../sdb
lrwxrwxrwx 1 root root  10 2009-10-14 14:32 scsi-3600a0b800048a692000007ed4a7ca15a-part1 -> ../../sdb1
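
To find the stable name for a device you currently know only as sdX, the links can be resolved in a loop. A minimal sketch, assuming `readlink -f` is available (GNU coreutils); the function name is made up.

```shell
# Hypothetical helper: print every scsi-* by-id name that resolves to a device.
# $1 = device node (e.g. /dev/sdb1), $2 = link directory (default: /dev/disk/by-id)
by_id_for() {
    target=$(readlink -f "$1") || return 1
    for link in "${2:-/dev/disk/by-id}"/scsi-*; do
        [ -L "$link" ] || continue          # skip an unmatched glob or plain files
        [ "$(readlink -f "$link")" = "$target" ] && printf '%s\n' "$link"
    done
    return 0
}
```

On the system above, `by_id_for /dev/sdb1` would print /dev/disk/by-id/scsi-3600a0b800048a692000007ed4a7ca15a-part1, and that path is what goes into the first column of the /etc/fstab entry instead of /dev/sdb1.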



To bring up new disks, just use "mppBusRescan":

primitivo:/dev/disk/by-id # mppBusRescan
scan qla2 HBA host /sys/class/scsi_host/host53...
        no new device found
scan qla2 HBA host /sys/class/scsi_host/host52...
        no new device found
scan mptsas HBA host /sys/class/scsi_host/host0...
        no new device found
run /usr/sbin/mppUtil -s busscan...
scan mpp virtual host /sys/class/scsi_host/host54...
        no new virtual device found
primitivo:/dev/disk/by-id #



Further configuration:
You may alter mpp settings in /etc/mpp.conf, but normally you don't need to.


6 Comments

Pascal, thanks a lot for this guide.

I work with Redhat 5, kernel 2.6.18-164.9.1.el5, and the rdac-LINUX-09.03.0C05.0251 module. An IBM DS3200 with 2 controllers is connected to my server via SAS.

Both controllers can be seen with mppUtil.

When I try to simulate a failover by turning off the controller that holds the active SAS path to my server, Redhat remounts its ext3 filesystem read-only. I already set the errors=continue flag in fstab. So either the path to the second controller does not come up, or the filesystem is not remounted rw again.


Any ideas? :)


Regards,
ironhead

How do you mount your root fs? Do you use "by-id" or "by-path"? The latter will NOT work.

Pascal, thanks a lot for your response.

The root fs is mounted by path. How do I mount / by id?!

[root@localhost ~]# cat /etc/fstab
/dev/VolGroup00/LogVol00 / ext3 errors=continue 1 1
LABEL=/boot /boot ext3 defaults 1 2
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
[root@localhost ~]# lvdisplay
--- Logical volume ---
LV Name /dev/VolGroup00/LogVol00
VG Name VolGroup00
LV UUID HptIKt-t7db-rQME-vDIH-AYfs-ElA6-WcVBgY
LV Write Access read/write
LV Status available
# open 1
LV Size 16,22 GB
Current LE 519
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:0

--- Logical volume ---
LV Name /dev/VolGroup00/LogVol01
VG Name VolGroup00
LV UUID GPY6Dt-qPUu-uN1p-tvG1-fCUm-POMj-CaTtSi
LV Write Access read/write
LV Status available
# open 0
LV Size 13,66 GB
Current LE 437
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:1


Regards

Ah OK, I forgot that Redhat mounts by volume manager volume and that /boot is a separate partition where the kernel resides (GRUB 1 cannot work directly with LVM). That's OK.

Normally lvm should "find" and assemble your root volume regardless of controller and sdX-Instances.

What's the exact error causing the non-remounting of your root to rw? Can you see the error message before it scrolls out of the window?

What are the kernel options passed at boot?

Just wondering if anyone has gotten this to work with Ubuntu? Thanks.

Nice tutorial, I used this to set up the RDAC driver for a Sun StorageTek 2530 (DS-3000). Up to now everything works well!

Kind regards from the computer center of the University of Freiburg!

About

This blog is owned by:

Pascal Gienger