May 2008 Archives

SCSI transport failed: reason 'tran_err': giving up

If you have SAN RAID storage and run "zpool create" to create a new ZFS pool, you may encounter messages like these:

May 26 08:40:22 atlanta scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g0080e52126cdf002 (sd69):
May 26 08:40:22 atlanta  SCSI transport failed: reason 'tran_err': giving up
May 26 08:40:23 atlanta scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g0080e52126cdf002 (sd69):
May 26 08:40:23 atlanta  SCSI transport failed: reason 'tran_err': giving up
May 26 08:40:24 atlanta scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g0080e52126cdf002 (sd69):
May 26 08:40:24 atlanta  SCSI transport failed: reason 'tran_err': giving up
May 26 08:40:24 atlanta scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g0080e52126cdf002 (sd69):
May 26 08:40:24 atlanta  SCSI transport failed: reason 'tran_err': giving up


This does not always mean that your SAN disk has a problem. It can simply be that the disks you use already carry a ZFS header that records the length of the "disk". On some RAID devices, disks are not initialized again when you reconfigure disks or disk sets. If the new disk group is smaller than it was before, Solaris will try to read the last block of that "disk" as recorded in the stale header, and that read fails.

Correct it by overwriting the header with zeroes:

dd if=/dev/zero count=10 of=/dev/dsk/...diskname....

Then reboot.
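
To see how much that command actually overwrites, here is a small sketch that runs the same dd invocation against a scratch file instead of a disk. The scratch file and the variable names are made up for illustration. The only fact it relies on is dd's default block size of 512 bytes, so count=10 clears the first 5120 bytes and nothing else.

```shell
# Stand-in for the real device: a 1 MB scratch file filled with the byte "x".
# (Hypothetical test target; never point this at a disk that holds data.)
scratch=$(mktemp)
dd if=/dev/zero bs=512 count=2048 2>/dev/null | tr '\0' 'x' > "$scratch"

# Same command shape as in the post: 10 blocks of dd's default 512 bytes.
# conv=notrunc only matters for a regular file; it keeps the rest of the file.
dd if=/dev/zero of="$scratch" count=10 conv=notrunc 2>/dev/null

# Count the zero bytes in the first 10 blocks and the untouched "x" bytes overall.
zeroed=$(dd if="$scratch" bs=512 count=10 2>/dev/null | tr -d 'x' | wc -c | tr -d ' ')
kept=$(tr -d '\0' < "$scratch" | wc -c | tr -d ' ')
echo "zeroed: $zeroed bytes, untouched: $kept bytes"

rm -f "$scratch"
```

Only the very beginning of the device is cleared this way. Anything stored further in, or at the end of the disk, stays as it was.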

Plain Fibre Channel devices are named
cXtWWPNdLUN

Example:

/dev/dsk/c5t21000080E511F169d0

The WWPN in this case is 21:00:00:80:E5:11:F1:69. It appears as the "target" in the standard Solaris SCSI notation.
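
As a rough sketch of that naming rule, the following helper pulls the target part out of a cXtWWPNdLUN name and prints it in the usual colon notation. The function name wwpn_from_ctd is made up, and it assumes a 16-digit hex target. It cannot tell a WWPN from any other 16-digit identifier, so it only makes sense for non-multipathed names.

```shell
# wwpn_from_ctd is a hypothetical helper, not a Solaris tool.
wwpn_from_ctd() {
    # Drop the directory, keep the 16 hex digits between "t" and "d<LUN>",
    # then insert a colon after every byte (two hex digits).
    basename "$1" |
        sed -n 's/^c[0-9]*t\([0-9A-Fa-f]\{16\}\)d[0-9]*$/\1/p' |
        sed 's/../&:/g; s/:$//'
}

wwpn_from_ctd /dev/dsk/c5t21000080E511F169d0   # prints 21:00:00:80:E5:11:F1:69
```

Names that do not follow the pattern (for example c0t0d0) produce no output.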

The problem: if you have multiple paths (to multiple WWPNs) to your devices, each volume appears several times, once for each path:

       1. c5t21000080E511F169d2 <LSI-ProFibre 4000R-5902-262.26GB>
          /pci@7b,0/pci10de,5d@e/pci1077,142@0/fp@0,0/disk@w21000080e511f169,2
       2. c5t21000080E511F169d3 <LSI-ProFibre 4000R-5902-262.26GB>
          /pci@7b,0/pci10de,5d@e/pci1077,142@0/fp@0,0/disk@w21000080e511f169,3
       3. c5t22000080E511F169d2 <LSI-ProFibre 4000R-5902-262.26GB>
          /pci@7b,0/pci10de,5d@e/pci1077,142@0/fp@0,0/disk@w22000080e511f169,2
       4. c5t22000080E511F169d3 <LSI-ProFibre 4000R-5902-262.26GB>
          /pci@7b,0/pci10de,5d@e/pci1077,142@0/fp@0,0/disk@w22000080e511f169,3
       5. c5t23000080E511F169d2 <LSI-ProFibre 4000R-5902-262.26GB>
          /pci@7b,0/pci10de,5d@e/pci1077,142@0/fp@0,0/disk@w23000080e511f169,2
       6. c5t23000080E511F169d3 <LSI-ProFibre 4000R-5902-262.26GB>
          /pci@7b,0/pci10de,5d@e/pci1077,142@0/fp@0,0/disk@w23000080e511f169,3
       7. c5t24000080E511F169d2 <LSI-ProFibre 4000R-5902-262.26GB>
          /pci@7b,0/pci10de,5d@e/pci1077,142@0/fp@0,0/disk@w24000080e511f169,2
       8. c5t24000080E511F169d3 <LSI-ProFibre 4000R-5902-262.26GB>
          /pci@7b,0/pci10de,5d@e/pci1077,142@0/fp@0,0/disk@w24000080e511f169,3
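
A quick way to confirm that these eight entries are really two LUNs seen over four paths each is to count the names per LUN number. A rough sketch, assuming all paths lead to the same array (the function name is made up, and the input is just the device names from the listing above):

```shell
# Device names copied from the listing above.
names='c5t21000080E511F169d2
c5t21000080E511F169d3
c5t22000080E511F169d2
c5t22000080E511F169d3
c5t23000080E511F169d2
c5t23000080E511F169d3
c5t24000080E511F169d2
c5t24000080E511F169d3'

# count_paths_per_lun is a hypothetical helper: it reads one cXtWWPNdLUN name
# per line and prints how often each LUN number occurs.
count_paths_per_lun() {
    sed -n 's/^c[0-9]*t[0-9A-Fa-f]\{16\}d\([0-9]*\)$/\1/p' |
        sort -n | uniq -c |
        awk '{ printf "LUN %s: %s paths\n", $2, $1 }'
}

summary=$(printf '%s\n' "$names" | count_paths_per_lun)
echo "$summary"
```

With more than one array attached, identical LUN numbers on different arrays would be lumped together, so this is only a sanity check.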


How to correct this?

Tell scsi_vhci (in scsi_vhci.conf) to accept this "LSI ProFibre" device as a multipath device:

device-type-scsi-options-list =
"LSI     ProFibre 4000R", "symmetric-option";
symmetric-option = 0x1000000;


Set a suitable region size to use logical-block round-robin:

device-type-mpxio-options-list =
"device-type=LSI     ProFibre 4000R", "load-balance-options=logical-block-options";
logical-block-options="load-balance=logical-block", "region-size=20";



Result:

       1. c6t0080E52126CDF002d0 <LSI-ProFibre 4000R-5902-262.26GB>
          /scsi_vhci/disk@g0080e52126cdf002
       2. c6t0080E52126CDF003d0 <LSI-ProFibre 4000R-5902-262.26GB>
          /scsi_vhci/disk@g0080e52126cdf003



Wonderful.

New vhci_stat released

I fixed a bug in vhci_stat. The new version (1.2) is available for download.

--> vhci_stat page.

If devices from multiple vendors are to be accessed through the Solaris MPxIO architecture, you must use this syntax in scsi_vhci.conf:

device-type-scsi-options-list =
"ADVUNI  OXYGENRAID", "symmetric-option",
"LSI     ProFibre 4000R", "symmetric-option";
symmetric-option = 0x1000000;
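
The runs of spaces in those strings are not cosmetic: the SCSI INQUIRY vendor ID field is eight characters wide, so the vendor part is padded with spaces to eight characters before the product ID follows. A small sketch to build such a string (the function name vhci_device_id is made up):

```shell
# vhci_device_id is a hypothetical helper, not part of Solaris.
vhci_device_id() {
    # $1 = vendor ID, $2 = product ID, both as reported by the device.
    # %-8s left-justifies the vendor and pads it with spaces to 8 characters.
    printf '%-8s%s\n' "$1" "$2"
}

vhci_device_id LSI 'ProFibre 4000R'
vhci_device_id ADVUNI OXYGENRAID
```

The two lines printed match the quoted strings in the configuration above.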


To set up logical block round-robin for both devices, use this (region-size=20 means 2^20 bytes: 1 MB):

device-type-mpxio-options-list =
"device-type=ADVUNI  OXYGENRAID", "load-balance-options=logical-block-options",
"device-type=LSI     ProFibre 4000R", "load-balance-options=logical-block-options";
logical-block-options="load-balance=logical-block", "region-size=20";

Note that the device lines are comma-separated.
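
To make the region-size arithmetic concrete, here is a sketch of the power-of-two calculation together with a deliberately simplified model of logical-block round-robin. The model (consecutive regions handed to the paths in turn) and the function name path_for_offset are assumptions for illustration, not the driver's actual code, and the unit of the offset is taken from the description above.

```shell
region_size=20                    # exponent from the configuration above
region=$((1 << region_size))      # 2^20 = 1048576
echo "one region spans $region units ($((region / 1024 / 1024)) Mi)"

# Simplified model (assumption): region number modulo the number of paths.
path_for_offset() {
    # $1 = offset into the volume, $2 = number of available paths
    echo $(( ($1 >> region_size) % $2 ))
}

path_for_offset 0 4          # first region
path_for_offset 1048575 4    # last offset still inside the first region
path_for_offset 1048576 4    # first offset of the second region
```

In this model, all I/O inside one region stays on the same path, and only crossing a region boundary moves to the next one.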
SATA RAIDs are an alternative for environments requiring Fibre Channel SAN connectivity and large, cheap storage. At least, that is what the promoters say.

I used the past weekend to compare multiple configurations of an Infortrend Oxygenraid system. It is also available as Advanced Unibyte Oxygenraid. Other OEM brand names may exist.

The hardware configuration tested includes 16 SATA disks (Seagate Barracuda 7200.10, 750 GB per disk). Twelve of them were used for these tests. Long RAID reconfiguration delays slowed the whole exercise down.

Four filebench tests were run against each configuration.

Read it in the articles section...

Comments appreciated!
Why?

Because it will simply restart the resilvering process from the beginning... just in case you don't want to wait several more hours for the mirror to complete...

ApOsTrOpHe?!

Installing patches on Solaris machines can be funny:

    root  9635  9628   0 09:15:04 pts/3       0:00 sed s,'"'"',ApOsTrOpHe,g

I just found this after running ps -edaf :)

(Yes, I KNOW what this sed command does, but it is funny to see.)
Yesterday I saved an old Sun S-Bus Ethernet/SCSI combo card from the trash. It is quite special because the engineers seem to have forgotten a capacitor and a logic circuit on the board, so they added extra wiring and stuck the parts onto the finished card. Funny to see. I wonder how many of these cards were corrected this way before a new revision of the board was developed.

And yes, it is "made in U.S.A.", as it proudly "says" on its back.

The SCSI controller is an NCR 53CF96-1; the Ethernet controller is the AMD chip under the green wires. The big LSI chip seems to be a microcontroller doing some BIOS work for the S-Bus card.

Revision seems to be "210-2013-03 REV.50". And yes, these cards were expensive...






About

This blog is owned by:

Pascal Gienger
Jägerstrasse 77
8406 Winterthur
Switzerland


Google+: Profile
YouTube Channel: pascalgienger