   pSeries Tech Tips
 
   SCSI/ SSA RAID Information
 

Using RAID

Redundant Array of Independent Disks (RAID) is a term used to describe the technique of improving data availability through the use of arrays of disks and various data-striping methodologies. Disk arrays are groups of disk drives that work together to achieve higher data-transfer and I/O rates than those provided by single large drives. An array is a set of multiple disk drives plus a specialized controller (an array controller) that keeps track of how data is distributed across the drives. Data for a particular file is written in segments to the different drives in the array rather than being written to a single drive.

Arrays can also provide data redundancy so that no data is lost if a single drive (physical disk) in the array should fail. Depending on the RAID level, data is either mirrored or striped.

Subarrays are contained within an array subsystem. Depending on how you configure it, an array subsystem can contain one or more subarrays, also referred to as logical units (LUNs). Each LUN has its own characteristics (for example, RAID level, logical block size, and logical unit size). From the operating system, each subarray is seen as a single hdisk with its own unique name.

RAID algorithms can be implemented as part of the operating system's file system software, or as part of a disk device driver (common for RAID 0 and RAID 1). These algorithms can be performed by a locally embedded processor on a hardware RAID adapter. Hardware RAID adapters generally provide better performance than software RAID because embedded processors offload the main system processor by performing the complex algorithms, sometimes employing specialized circuitry for data transfer and manipulation.


RAID Levels and Their Performance Implications

Each of the RAID levels supported by disk arrays uses a different method of writing data and hence provides different benefits.

RAID 0 - For Performance

RAID 0 is also known as data striping. It is well suited to program libraries requiring rapid loading of large tables and, more generally, to applications requiring fast access to read-only data or fast writing. RAID 0 is designed only to increase performance; there is no redundancy, so any disk failure requires reloading from backups. Select RAID level 0 for applications that would benefit from its increased performance. Never use this level for critical applications that require high availability.

RAID 1 - For Availability/Good Read Response Time

RAID 1 is also known as disk mirroring. It is most suited to applications that require high data availability, good read response times, and where cost is a secondary issue. The response time for writes can be somewhat slower than for a single disk, depending on the write policy; the writes can either be executed in parallel for speed or serially for safety. Select RAID Level 1 for applications with a high percentage of read operations and where the cost is not the major concern.

RAID 2 - Rarely Used

RAID 2 is rarely used. It implements the same process as RAID 3, but can utilize multiple disk drives for parity, while RAID 3 can use only one.

RAID 3 - For CAD/CAM, Sequential Access to Large Files

RAID 3 and RAID 2 are parallel process array mechanisms, where all drives in the array operate in unison. Similar to data striping, information to be written to disk is split into chunks (a fixed amount of data), and each chunk is written out to the same physical position on separate disks (in parallel). More advanced versions of RAID 2 and 3 synchronize the disk spindles so that the reads and writes can truly occur simultaneously (minimizing rotational latency buildups between disks). This architecture requires parity information to be written for each stripe of data; the difference between RAID 2 and RAID 3 is that RAID 2 can utilize multiple disk drives for parity, while RAID 3 can use only one. The LVM does not support RAID 3; therefore, a RAID 3 array must be used as a raw device from the host system.

Performance is very good for large amounts of data but poor for small requests because every drive is always involved, and there can be no overlapped or independent operation. It is well-suited for large data objects such as CAD/CAM or image files, or applications requiring sequential access to large data files. Select RAID 3 for applications that process large blocks of data. RAID 3 provides redundancy without the high overhead incurred by mirroring in RAID 1.

RAID 4 - Less Used (Parity Volume Bottleneck)

RAID 4 addresses some of the disadvantages of RAID 3 by using larger chunks of data and striping the data across all of the drives except the one reserved for parity. Write requests require a read/modify/update cycle that creates a bottleneck at the single parity drive. Therefore, RAID 4 is not used as often as RAID 5, which implements the same process, but without the parity volume bottleneck.

RAID 5 - High Availability and Fewer Writes Than Reads

RAID 5, as has been mentioned, is very similar to RAID 4. The difference is that the parity information is distributed across the same disks used for the data, thereby eliminating the bottleneck. Parity data is never stored on the same drive as the chunks that it protects. This means that concurrent read and write operations can now be performed, and there are performance increases due to the availability of an extra disk (the disk previously used for parity). There are other enhancements possible to further increase data transfer rates, such as caching simultaneous reads from the disks and transferring that information while reading the next blocks. This can generate data transfer rates at up to the adapter speed.

RAID 5 is best used in environments requiring high availability and fewer writes than reads. Select RAID level 5 for applications that manipulate small amounts of data, such as transaction processing applications.

RAID 6 - Seldom Used

RAID 6 is similar to RAID 5, but with additional parity information written that permits data recovery if two disk drives fail. Extra parity disk drives are required, and write performance is slower than a similar implementation of RAID 5.

RAID 7 - A Third-Party Definition

The RAID 7 architecture gives data and parity the same privileges. The level 7 implementation allows each individual drive to access data as fast as possible. This is achieved by three features:

  • Independent control and data paths for each I/O device/interface.
  • Each device/interface is connected to a high-speed data bus that has a central cache capable of supporting multiple host I/O paths.
  • A real time, process-oriented operating system is embedded into the disk drive array architecture. The embedded operating system "frees" the drives by allowing each drive head to move independently of the other disk drives. Also, the RAID 7 embedded operating system is enabled to handle a heterogeneous mix of disk drive types and sizes.

RAID 10 - RAID-0+1

RAID-0+1, also known in the industry as RAID 10, implements block interleave data striping and mirroring. RAID 10 is not formally recognized by the RAID Advisory Board (RAB), but it is an industry-standard term. In RAID 10, data is striped across multiple disk drives, and then those drives are mirrored to another set of drives.

The performance of RAID 10 is approximately the same as RAID 0 for sequential I/Os. RAID 10 provides an enhanced feature for disk mirroring that stripes data and copies the data across all the drives of the array. The first stripe is the data stripe; the second stripe is the mirror (copy) of the first data stripe, but it is shifted over one drive. Because the data is mirrored, the capacity of the logical drive is 50 percent of the physical capacity of the hard disk drives in the array.
-----------------------------

With striping, the data is spread evenly across the disks. However, when you add
more disks to the volume group, you may not be able to extend a striped
filesystem unless the new disks satisfy the stripe width.


Summary of RAID Levels

The advantages and disadvantages of the different RAID levels are summarized in the following table:

RAID Level   Availability   Capacity      Performance   Cost
0            none           100 percent   high          medium
1            mirroring      50 percent    medium/high   high
2/3          parity         80 percent    medium        medium
4/5/6/7      parity         80 percent    medium        medium
10           mirroring      50 percent    high          high

RAID Performance Summary

The most common RAID implementations are 0, 1, 3, and 5. Levels 2, 4, and 6 have performance problems and offer no functional advantage over the others. In most cases, RAID 5 is used instead of RAID 3 because it avoids the bottleneck of using only one disk for parity.

RAID 0 and RAID 1 can be implemented in software alone. RAID 3, 5, and 7 require hardware support in addition to software (special RAID adapters or RAID array controllers).


GENERAL APPROXIMATE REBUILD TIMES

For RAID 5, the build and rebuild times increase with the number of member disks and with the size of each member disk. If the array consists of 6+P 9.1 GB disks, the following times are needed to initially build the array and to rebuild it after a failure, assuming no concurrent I/O operations to the array:
Initial build time: 32 min.
Rebuild time: 49 min.

For RAID 1 and RAID 0+1, the rebuild time increases with the disk size but is not
affected by the number of member disks in the array. If the array consists of
9.1 GB disks with a 16 KB strip size, the following time is needed to rebuild
after a failure, assuming no concurrent I/O operations to the array:
Rebuild time: 50 min.


ADDING ADDITIONAL DISKS TO RAID

Unlike AIX filesystems managed by the Logical Volume Manager, RAID arrays cannot be increased dynamically: additional disks cannot be added to an existing RAID array. The only way to increase the size of an existing RAID array is to delete it and create a new, larger array.

To delete and re-create a RAID array, AIX must not be using the array. Furthermore, because the array is deleted and re-created, all the information on it is destroyed. A full backup of the affected data is therefore required, and all filesystems, logical volumes, and volume groups on the array MUST BE REMOVED before deleting the array.

The suggested procedure is as follows :-

  • Back up the filesystems on the RAID array to tape or other media.
  • Unmount the filesystems on the RAID array.
  • Remove the filesystems and logical volumes on the RAID array.
  • Vary off the volume group that is on the RAID array.
  • Export the volume group that is on the RAID array.
  • Delete the RAID array.
  • Create a new RAID array, including any new additional disks.
  • Re-create the volume group on the hdisk associated with the new RAID array.
  • Re-create the logical volumes and filesystems on the new RAID array.
  • Mount the filesystems on the RAID array.
  • Restore the backup from tape or other media.
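
As a rough illustration of the steps above, the AIX side might look like the following (a sketch only: the volume group raidvg, filesystem /data01, tape device /dev/rmt0, and hdisk1 are example names, and the array delete/create itself is done with the RAID manager for your adapter, such as smit pdam or smit ssaraid):

tar -cvf /dev/rmt0 /data01          # back up the data (use whatever backup method you trust)
umount /data01                      # unmount the filesystem
rmfs /data01                        # remove the filesystem and its logical volume
varyoffvg raidvg                    # vary off the volume group
exportvg raidvg                     # export the volume group
#  ...delete the old array and create the new, larger array with the RAID manager...
mkvg -s 32 -y raidvg hdisk1         # re-create the volume group on the new array hdisk (32 MB PPs)
crfs -v jfs -g raidvg -m /data01 -a size=50000000   # re-create the filesystem (size in 512-byte blocks)
mount /data01
tar -xvf /dev/rmt0                  # restore the backup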

DISK SIZING SAMPLE
Here is a sample of a sizing calculation.

25 GB of space is needed. iostat indicates a peak of 1000 tps with an
average I/O size of 6 KB; the I/O is random, with a read/write ratio of 80/20,
on unprotected disks (assume we can balance the I/O via good data layout).
In the calculations below, 60 is the I/O per second one drive can deliver
(from the drive statistics), and 0.8 and 0.2 are the 80/20 read/write ratio.

Unprotected data
1000 / 60 = 17 disks needed
17 disks of at least 1.5 GB each: 17 x 1.5 = 25.5 GB

Mirroring (each write costs 2 physical I/Os)
{(0.8 x 1000) + 2 x (0.2 x 1000)} / 60 = 20 disks needed

RAID 5 (the first term is the reads; the 4x is the write penalty, since each RAID 5 write costs 4 physical I/Os)
{(0.8 x 1000) + 4 x (0.2 x 1000)} / 60 = 27 disks needed
27 disks of at least 1 GB each: (27 - 1) x 1 = 26 GB usable
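
The same arithmetic can be scripted. A minimal ksh sketch using the sample numbers above (all of the values are assumptions taken from this example, not measurements):

TPS=1000          # peak transactions per second from iostat
READS=80          # read percentage
WRITES=20         # write percentage
DISK_TPS=60       # I/Os per second one drive can sustain
MIRROR_IOS=$(( TPS * READS / 100 + 2 * TPS * WRITES / 100 ))   # each mirrored write = 2 physical I/Os
RAID5_IOS=$(( TPS * READS / 100 + 4 * TPS * WRITES / 100 ))    # each RAID 5 write = 4 physical I/Os
echo "Mirroring: $(( (MIRROR_IOS + DISK_TPS - 1) / DISK_TPS )) disks"   # prints 20
echo "RAID 5:    $(( (RAID5_IOS + DISK_TPS - 1) / DISK_TPS )) disks"    # prints 27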

The 6230 adapter, when operating RAID 5 configurations, supports arrays of (2+P) to (15+P) disks, and up to six (15+P) arrays. So there is a minimum of three disks per array; the disks should be the same size and speed so that one slow disk does not slow down the whole array.

To maximize storage capacity using RAID-5, it is wise to use drives of the same physical capacity, for example, all 4.5 GB drives. If a mixture of drives is used in an array, then only the portion of the drives equal to the smallest drive in the array is used and the rest is unused. For example, if an array is configured using three 4.5 GB drives and one 2 GB drive, then the array will only use 2 GB on each drive.

In RAID level 5, both data and data parity information are striped across a set of disks. Disks are able to satisfy requests independently which provides high read performance in a request rate intensive environment. Since parity information is used, a RAID 5 stripe can withstand a single disk failure without losing data or access to data.

Unfortunately, the write performance of RAID 5 is poor. Each write requires four independent disk accesses to be completed. First old data and parity are read off separate disks. Next the new parity is calculated. Finally, the new data and parity are written to separate disks. Many array vendors use write caching to compensate for the poor write performance of RAID 5.

See your salesperson for more information. This is only a sample to give you a general idea of how to do the calculations.


MISC DISK STUFF

DISK_ERR1 (CD, disk, or R/W optical operation failure)
PERM
Failure of physical volume media

DISK_ERR2 (CD, disk, or R/W optical operation failure)
PERM
Failure in disk assembly hardware (for example, power loss)

DISK_ERR3 (CD, disk, or R/W optical operation failure)
PERM
Error was detected by the SCSI adapter
(The error could have been initiated by something else on the SCSI bus which
affected the disk drives)

DISK_ERR4 (CD, disk, or R/W optical recovered error)
TEMP
Error caused by a bad block or a recorded error event

DISK_ERR5 (Undetermined error)
PERM
SCSI device driver failure of unknown type

An error class value of H and an error type value of PERM indicate that the system encountered a hardware problem and could not recover from it.

Disk errors 1, 2, and 4 return sense data that can be analyzed by the diagnostic programs to provide extra information about the nature of the error and its severity.

DISK_ERR4 is by far the most common error generated, and it is the least severe. It indicates that a bad block has been detected during a read or write request to the disk.


How do I determine the PP size for any disk?
Think to myself: what is the smallest PP size that will cover the whole disk if
there are 1016 of them?

For example, if I have a disk with 18200 MB, will 16 MB PPs work? 16 * 1016
= 16256. Nope, too small. Will 32 MB work? 32 * 1016 = 32512. Yes! So I
use 32 MB PP sizes. If I expand it later on, I will have to expand it 32 MB at a time.
With an 18 GB drive, who cares?

Just realize that the PP is the smallest unit of allocation. Don't count on a
perfect division of PPs into the size of the disk: some area at the start of
the disk is used for control information (the VGDA and VGSA), so the choice
finally boils down to what is a reasonable division of the disk size and what
unit I want to allocate.

Valid PP sizes (MB): 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024
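
A minimal ksh sketch of the rule of thumb above (the disk size is just an example):

DISK_MB=18200     # disk size in MB
PP=1              # candidate PP size in MB, from the list above
while [ $(( PP * 1016 )) -lt $DISK_MB ]
do
  PP=$(( PP * 2 ))
done
echo "Use a PP size of $PP MB"   # prints 32 for an 18200 MB disk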



The PCI SCSI RAID Array adapter supports RAID levels 0, 1, and 5.


RAID level 0: each array is allowed 1 to 16 disks
RAID level 1: each array is allowed 2 to 16 disks
RAID level 5: each array is allowed 3 to 16 disks
Capacity:
RAID level 0: capacity = number_of_drives x drive_capacity
RAID level 1: capacity = (number_of_drives x drive_capacity) / 2
RAID level 5: capacity = (number_of_drives - 1) x drive_capacity
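
For example (the drive count and size are chosen purely for illustration), an array of six 9.1 GB drives gives: RAID 0 = 6 x 9.1 = 54.6 GB, RAID 1 = (6 x 9.1) / 2 = 27.3 GB, and RAID 5 = (6 - 1) x 9.1 = 45.5 GB.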

Note: The PCI 4-channel Ultra3 SCSI card (FC 2498) cannot do JBOD.
The FC 2498 Ultra3 SCSI controller card requires devices.pci.14102e00.rte at level 4.3.3.25
or higher; otherwise the adapter will show up as Defined and
will not become Available, so you cannot create the RAID arrays.
It is also not a bootable device.


CREATING A PCI SCSI RAID
Using Command Line
scraidmgr -A -c yes -l (adapter) -r (raid level) -e (disk ids) -q (queue depth) -g yes
example:
scraidmgr -A -c yes -l scraid0 -r 5 -e "08 0B 0D" -q 8 -f no -g yes
hdisk1 Available
Consistency Check in progress -
1 percent complete
5 percent complete
10 percent complete
...
100 percent complete
--------------------------------
Using SMIT

Enter the SMIT menu "PCI SCSI Disk Array Manager" using the smit short cut :-
smit pdam
Select "Create a PCI SCSI Disk Array" and press Enter
Select the PCI SCSI RAID Adapter required and press Enter
Select the RAID Array type required and press Enter
Use ESC+7 to select all the drives required to create the RAID Array
and press Enter.

Initialize Parity
If the parity / mirror (RAID 5 / RAID 1) is not initialized, then any consistency check run on the array will encounter inconsistencies. It is recommended that the parity / mirror be initialized at array creation time. RAID 0 is a non-redundant array and therefore cannot be initialized.
Press Enter to continue.
When the screen shows OK and 100 percent, the array creation is complete.


PCI SCSI RAID Array Instructions 
Failing a Drive

The physical SCSI drive can be in one of seven states :-

Online, Spare, Failed, Reconstruct, Warning, Hot Spare or Non Existent.

Failed State
The drive was failed by the adapter or the user and must be replaced. A drive goes
into a failed state when one of the following conditions occurs :-
-Drive does not respond to selection
-Drive failed to spin up
-Drive failed an inquiry or read capacity command
-Drive failed to read or write
-Drive failed to respond to a SCSI command
-Inquiry data, capacity, serial number, or SCSI ID does not match the configuration
information stored in the adapter NVRAM
-User failed the drive

It is necessary to fail a drive before you are able to remove it from
the RAID Array and replace it.

-------------------------------------------

Using Command Line:

scraidmgr -F -d 08 -l (array) -y (disk id)
example:-
scraidmgr -F -d 08 -l hdisk1 -y 0B
scraidmgr -C -l hdisk1
hdisk1 Available Raid 5 04-01-00-0,0 8607 MB Status DEGRADED
hdisk1 08 Channel 0 ID 8 ONLINE - 4304Meg
hdisk1 0B Channel 0 ID B FAILED DRIVE - 4340Meg
hdisk1 0D Channel 0 ID D ONLINE - 4304Meg
------------------------------------------

Using SMIT

Enter the SMIT menu "PCI SCSI Disk Array Manager Menu"
1)smitty pdam
2)Select Fail a Drive in a PCI SCSI Disk ARRAY and press Enter
3)Select Raid Array Required and press enter
hdisk1 Available Raid 5 04-01-00-0,0 8607 MB Status OPTIMAL
hdisk1 08 Channel 0 ID 8 ONLINE - 4304Meg
hdisk1 0B Channel 0 ID B ONLINE - 4340Meg
hdisk1 0D Channel 0 ID D ONLINE - 4304Meg

4)Select the drive that you wish to fail and press Enter
08 Channel 0 ID 8 Online - 4304 MEG

0B channel 0 ID B Online - 4304 MEG
0D channel 0 ID D Online - 4304 MEG

5) Confirm data and press enter to continue (disk and channel ID)
6) ARE YOU SURE? Press enter to continue
7) You should see OK 
8)The array should look like this:

hdisk1 Available Raid 5 04-01-00-0,0 8607 MB Status DEGRADED
hdisk1 08 Channel 0 ID 8 ONLINE - 4304Meg
hdisk1 0B Channel 0 ID B FAILED DRIVE - 4340Meg
hdisk1 0D Channel 0 ID D ONLINE - 4304Meg

You should now be able to remove the failed drive and put the new one in.


 PCI SCSI RAID Array Instructions 
Reconstructing a RAID Array

 Reconstruction is a process used to restore a degraded RAID1 or RAID5 disk array to its original state after a single drive has been replaced. During reconstruction, the adapter recalculates the data on the drive that was replaced, using data and parity from the other drives in the array. The controller then writes this data to the replaced drive.

Although RAID1 does not have parity, the adapter can reconstruct data on a RAID disk by copying data from the mirrored stripe. If a hot spare has been defined, the adapter automatically initiates the reconstruction process after a drive status changes to FAILED. If a hot spare of the appropriate capacity had not previously been defined, the reconstruction must be initiated by the user after replacing the FAILED drive. Once reconstruction is started, the adapter completes the following actions :-

* Copies special array configuration information file to the new drive.
* Recalculates the data and/or parity from the data and parity on the other disks in the array.
* Writes the recalculated data and parity to the new drive.

The disk array will remain accessible while reconstruction is in progress. Drive reconstruction may have a varying impact on system I/O performance. The rate at which the reconstruction occurs depends on the value of the reconstruction rate parameter.

Using Command Line

scraidmgr -F -d 8B -l (array) -y (failed disk id) -z (new disk id)
example:-
scraidmgr -F -d 8B -l 'hdisk1' -y '0A' -z '0B'

Reconstruction in progress
1 percent complete
3 percent complete
6 percent complete

100 percent complete

scraidmgr -C -l hdisk1
hdisk1 Available Raid 5 04-01-00-0,0 8607 MB Status OPTIMAL
hdisk1 08 Channel 0 ID 8 ONLINE - 4304Meg
hdisk1 0B Channel 0 ID B ONLINE - 4340Meg
hdisk1 0D Channel 0 ID D ONLINE - 4304Meg
---------------------------------
Using SMIT

1)smitty pdam
In this example one drive in a RAID 5 Array of three drives has failed. Another drive will be added and the Array reconstructed. The original Status of the array is this:
hdisk1 Available Raid 5 04-01-00-0,0 8607 MB Status DEGRADED
hdisk1 08 Channel 0 ID 8 ONLINE - 4304Meg
hdisk1 0B Channel 0 ID B FAILED DRIVE - 4340Meg
hdisk1 0D Channel 0 ID D ONLINE - 4304Meg

2) Select reconstruct a PCI SCSI Disk Array and press enter

3)Select the raid array required and press enter 

4) Select the failed drive required and press enter

5) Press F4 to display a list of drives that can be used to reconstruct the array. (This applies if you have a spare drive; if you replaced the drive in the same slot, you are selecting the one that was failed.)

6) Select the drive required and press Enter.
    It will ask: Are you sure?

7) Press Enter. It will say reconstruction in progress:
1 percent complete
3 percent complete
etc
100 percent complete

It will say OK when done. Check the status; all drives should be present and the status should be OPTIMAL.

SCSI ARRAY STATES

OPTIMAL
The Disk Array is operating normally

DEGRADED
A single drive has failed in a RAID 1 or 5 Disk Array, and the unit is now in degraded
mode. Replace the failed drive as soon as possible. Check the drive status to determine
which drive failed. The status will remain degraded until the new drive is reconstructed.

DEAD
The Disk Array is not functioning. Any data which may have been on the array is lost.
Check individual drive status. 

RECONSTRUCTING
The adapter is reconstructing drive data on the new drive.


Command line options for the PCI SCSI RAID
command scraidmgr
-------------------------------------------------------------------

scraidmgr -l hdisk# [CDJOP]
scraidmgr -l adptr# [BEHLQSTVWY]
scraidmgr -l hdisk# -F [-y channel_id -d drive_status]
scraidmgr -l hdisk# -M [ -q queue depth]
scraidmgr -l hdisk# -Z [ -c auto_repair (yes/no)]
scraidmgr -l adptr# -A [-r raid_level -e drives
                 -q queue depth -f read_ahead -g init_parity]
scraidmgr -l adptr# -F [-y channel_id -d drive_status]
scraidmgr -l adptr# -I [ -c delete_all_hdisks (yes/no)]
scraidmgr -l adptr# -H [-r raid_level -e drives]
scraidmgr -l adptr# -N [ -c display_POCL_only (yes/no)]
scraidmgr -l adptr# -P [ -y drives]

-A Add RAID hdisk
-r raid_level ( 0, 1 or 5)
-e drives
-q queue depth
-f read ahead
-g initialize parity
-B Status for all Drives in the Subsystem - Informational
-C Status and Drives for AVAILABLE hdisk - Formatted
-D Delete hdisk
-E Status for all Drives in the Subsystem - Formatted
-F Modify Drive Status
-y channel_id (e.g. 12 is channel 1, ID 2)
-d Drive Status to modify to
8B=Replace and Reconstruct the Drive
08=Fail the Drive
00=Delete the Drive
89=Revive the Drive
81=Mark the Drive Spare
85=Mark the Drive Hot Spare
-z channel_id Used ONLY with Drive Status 8B
-H Acquire available capacity
-r raid_level ( 0, 1 or 5)
-e drives
-I USE WITH CAUTION:performs reset on adapter
-c yes/no also delete all hdisks and assoc.data
-J Status and Drives for DEFINED hdisk - Formatted
-L USE WITH CAUTION:performs config sync to drives
-M Modify hdisk
-q queue depth (1-64)
-N Display/Sync to the Power On Change List
-c yes/no Display POCL only
-O Status,drives etc. for an hdisk - Informational
-P Acquire Vital Product Data
-y channel_id (e.g. 12 is channel 1, ID 2)
-Q Disk Groups - Formatted
-S Spare Drives - Formatted
-T Disk Groups - Informational
-V Hot Spare Drives - Formatted
-W Failed Drives - Formatted
-Y Empty Drive Slots - Formatted
-Z Perform parity check/repair on an hdisk
-c auto-repair (yes or no)

SOFTWARE RAID 0+1 
RAID 0+1 AIX 4.3.3 

Beginning with AIX Version 4.3.3, it is possible to create logical
volumes that are both mirrored and striped, providing RAID 0+1
capability. The performance improvement of the striping technique can be combined with the availability provided by mirroring.

If no mirror is present, a failure on one disk causes all the striped logical
volume contents to be unavailable. When introducing the second (or third)
copy of the logical volume, the failure of the disk does not affect access to
data as long as there is one available copy of data on another disk. 

Since this is a new feature of AIX Version 4.3.3, a volume group in which a
mirrored and striped logical volume is created cannot be imported by any
other machines running previous versions of AIX. When you create such a
logical volume, a warning message appears and you are asked to confirm
your choice.

From an administrator's point of view, very few changes have been made to
the LVM command interface in order to make mirroring and striping available on
the same logical volume. Logical volumes are created as in previous releases,
but copies may now be specified, added, and removed when the logical volume is
striped.

A new concept of allocation policy has been added to LVM. Since striped
logical volumes are very sensitive to the loss of a disk, it is important to force
them to have mirrored copies on different physical volumes and not to have
partitions allocated for one copy on the same physical volume with the
partitions from another copy. 

This new allocation policy is called super strict. It is mandatory for striped and
mirrored logical volumes, but it can also be chosen for other logical volume
types. Every LVM command that has the -s flag to define the allocation policy
now may have one of the following values: 

-s y   Strict allocation policy: copies of a logical partition cannot share the same physical volume.

-s n   No allocation policy: copies of a logical partition can share the same physical volume.

-s s   Super strict allocation policy: the partitions allocated for one mirror cannot share a physical volume with the partitions from another mirror.

You can either create a new striped and mirrored logical volume with a single
command or first create a striped logical volume and then create the copies. 

NOTE: If you create a striped logical volume with SMIT or Web-based System Manager, the
administration tool ensures that the super strict policy is applied. When you
use mklvcopy to create the mirrored copies, the logical volume already has
the super strict policy from when it was created, and the policy is preserved
during the copy.

mklv -y 'raid10' -c '2' '-S4K' datavg 5 hdisk1 hdisk2 hdisk3 hdisk4 
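
As mentioned above, the same result can also be reached in two steps: create the striped logical volume first, then add the copy with mklvcopy. A sketch follows; the logical volume name raid10b, the volume group, the partition count, and the hdisk names are only examples:

mklv -y 'raid10b' -S 4K -s s datavg 4 hdisk1 hdisk2   # striped LV with the super strict policy
mklvcopy 'raid10b' 2 hdisk3 hdisk4                    # add the mirror copy on different disks

Because the logical volume was created with the super strict policy, mklvcopy preserves it and places the second copy on physical volumes not used by the first.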

In order to replace a failed disk that contains a striped and mirrored logical
volume, you should use the replacepv command. If your failed disk is, for
example, hdisk5, and you have a spare disk hdisk20, you must first be sure
that hdisk20 does not belong to any volume group, then you can issue the
following command:
# replacepv hdisk5 hdisk20 

NOTE: 
One user reported a problem after defining RAID 10 and then putting it back
as undefined: the two disks had the same PVID, which had to be
cleared. The error messages were 0516-304 mkvg: unable to find device id
004213317ebd2b46 in the Device Configuration Database, and
0516-324 mkvg: Unable to find PV Identifier for physical volume ... inconsistent for physical volumes.
The solution was to do a
chdev -l hdiskX -a pv=clear 
and
chdev -l hdiskX -a pv=yes 

There is also hardware RAID 0+1, which can provide additional facilities.


SSA RAID ADAPTERS

SSA ARRAY STATES

Good State
The array is online and it can be read and written. All the array components are present. 
All parity data (except that affected by recently completed write operations) is 
synchronized. No data or parity rebuilding is outstanding. The array is fully 
protected against the loss of one component.

Exposed State
One component is missing from the array. When the array is read, data can be
reconstructed for the missing component. Unless a hot spare is available to
replace the missing component, the first write operation causes the array to
enter the "Degraded" state. In the "Exposed" state, the missing component can
be reintroduced or replaced. Then, after any necessary rebuilding, the array
is returned to the "Good" state.

Degraded State
One component is missing and a write operation has been received for the array. Read and write
operations to the array are supported. However, if power is lost before all the parity data 
has been written, it might not be possible to recreate all the data for the missing component. 
The missing component is permanently excluded from the array.

Note :- While in Degraded state, an array is not protected. If another disk drive in 
the array fails, or the power fails during a write operation, data might be lost.

Rebuilding State
The array is online and it can be read and written. The full complement of array components
is present, but data and parity are being rebuilt on one of the components.

Offline State
An array enters Offline state when two or more member disk drives become missing. Read and write operations are not allowed.

---------------------------------------------------

SSA PHYSICAL DISK ASSIGNMENT

SSA physical disks can have different uses assigned to them. For example, they can be
AIX system disks, members of an SSA RAID array, or candidates to replace failing
disks in a RAID array.

AIX System Disks
Disk drives that are connected to an SSA RAID adapter do not need to be members of an array. The SSA RAID adapter handles such disk drives in the same way as a non-RAID SSA adapter does. It transfers data directly between the disk drives and the system, and uses no RAID functions. When first installed, all disk drives are, by default, defined as AIX disks.

Array Candidate Disk
The disk drive has been assigned for use with a RAID array. Disks can be Array Candidate disks without being used by a RAID array.

Hot Spare Disk
In the event of a member disk failing, the SSA RAID manager can use any available hot spare disk to replace the failing disk dynamically. The failing disk is rejected from the array, and the hot spare is brought in in its place.


SOME Hardware Prerequisites for SSA Boot Support (as new adapters come out, they may not be on this list yet)

One of the following is required if you want to boot from an SSA disk drive:
• 6214 SSA four-port Adapter (4-D)
• 6216 SSA Enhanced four-Port Adapter (4-G)
• 6217 SSA RAID four-port Adapter (4-I)
• 6219 Micro-Channel SSA Multi-Initiator/RAID EL Adapter (4-M)
• 6221 SSA Enhanced four-port Adapter (4-G)

Any RS/6000 or SP node based on the Common Hardware Reference Platform (CHRP) can be booted from an Advanced SerialRAID adapter (FC 6225) or Advanced SerialRAID Plus adapter (FC 6230).

• The system cannot be booted from a disk attached to an SSA RAID adapter if the disk is part of a RAID logical unit number (LUN). Booting from the 6215 or 6218 adapters is not supported.
• An SSA loop that has a disk with the boot logical volume on it can only be attached through a single adapter on a single machine.
• Only certain models of the RS/6000 can boot from SSA disks.
If you are mirroring ROOTVG with SSA disks, we recommend that each copy of the boot logical volume be on a separate SSA loop.


SSA/RAID Setup 

Verify that you have the RAID adapters installed:
lsdev -Cc adapter

(Some of the filesets below are for Micro Channel adapters and some for PCI. Make sure you
have all the ones you need for your adapter.)

1) Do an lslpp -l | grep devices.mca.8f97 (repeat for each of the following filesets):
devices.mca.8f97.com
devices.mca.8f97.diag 
devices.ssa.disk.rte
devices.ssa.IBM_raid.rte
devices.ssa.tm.rte
devices.pci.14104500.rte 
devices.pci.14104500.diag

Make sure these are there at base levels first, then reboot the machine. DO A
BACKUP: smitty mksysb (save it off to the side just in case).
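
A quick way to check that the filesets above are installed (a sketch; adjust the list to the adapters you actually have):

for f in devices.mca.8f97.com devices.mca.8f97.diag devices.ssa.disk.rte \
         devices.ssa.IBM_raid.rte devices.ssa.tm.rte \
         devices.pci.14104500.rte devices.pci.14104500.diag
do
  lslpp -l $f >/dev/null 2>&1 || echo "$f is NOT installed"
done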

2) Then load the patches; put them in a separate directory like
/tmp/patches. See the document on how to get patches.

3) Install them and then reboot machine 

4) Then we may need to do a rmdev -d -l for each of the SSA logical hdisks, if they exist
(do an lsdev -C | more and look). A consolidated sketch of steps 4 through 7 follows this procedure.

5) rmdev -d -l for the SSA physical disks (pdisks) and hdisks

6) rmdev -d -l ssa0 (or whatever it is called)

7) rmdev -d -l ssar (if it exists)

8) reboot machine 

9) Verify the microcode: lscfg -vl pdiskN and lscfg -vl hdiskN (for each disk)

10) Call hardware support and get them to update the microcode, or get it off the
web.

11) smit ssaraid (for non-SSA PCI SCSI RAID, the menu is smitty pdam)

smitty ssaraid 

List All Defined SSA RAID Arrays
List All Supported SSA RAID Arrays
List All SSA RAID Arrays Connected to a RAID Manager
List Status Of All Defined SSA RAID Arrays
List/Identify SSA Physical Disks
List/Delete Old RAID Arrays Recorded in SSA RAID Manager
Add an SSA RAID Array
Delete an SSA RAID Array
Change/Show Attributes of an SSA RAID Array
Change Member Disks in an SSA RAID Array
Change/Show Use of an SSA Physical Disk
Change Use of Multiple SSA Physical Disks 

12) Select Change Use of Multiple SSA Physical Disks

13) Select disks 

14) On the line for the new use of the SSA disks, change to Array Candidate Disk

15) To make one disk a hot spare you must go back into smitty ssaraid

16) Change/show physical disk drive. 

17) Select the drive to change and under new use change to hot spare.

18) Go back into smitty ssaraid 

19) Select Add an SSA RAID Array and select your adapter.

20) Look for the field that says enable use of hot spares. Remember: if you
did not do this in step 16, do not say true; set it to false. You will get an
error every hour if it is not set up correctly.

21) After you run this command, DO NOT GO FURTHER. You must wait
for the parity to build or you will really mess things up; the wait time is
anywhere from 20 minutes to several hours.

22) Go into smitty ssaraid 

23) Select #4, List Status Of All Defined SSA RAID Arrays

24) You will see two columns, one labeled unbuilt and one labeled unsynced; both
numbers will be zero when the build is done. DO NOT try to make a volume
group or anything else before they are both at zero (trust me, you
will end up doing it over and over again). Do a mksysb now that
it is set up.
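
The device removal in steps 4 through 7 can be scripted roughly as follows (a sketch only; the grep pattern and device names are assumptions, so check the lsdev -C output on your own system before removing anything):

# remove the SSA logical hdisks, then the pdisks
for d in $(lsdev -Cc disk | grep "SSA Logical Disk" | awk '{print $1}')
do
  rmdev -d -l $d
done
for p in $(lsdev -Cc pdisk | awk '{print $1}')
do
  rmdev -d -l $p
done
rmdev -d -l ssa0   # your SSA adapter name may differ
rmdev -d -l ssar   # only if it exists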

If you get an error on shutdown that says /usr/sbin/shutdown[1018]:
/usr/sbin/ssadisk_shutdown: not found, you need to do the following:
ln -s /usr/lpp/devices.ssa.disk/bin/ssadisk_shutdown /usr/sbin/ssadisk_shutdown
---------------------------------- 
Notes on Restore for SSA
1) All member drives and hot spares must be in same loop.

2) It takes a write to the array to start a rebuild. So if the system is only doing
reads, the rebuild will wait until a write occurs.

3) If you have a hot spare, the array will automatically rebuild onto it, provided
that a write occurs and the hot spare is in the same loop.

4) When you have had a disk replaced and are now putting it in, and the hot spare
has already rebuilt, you can just run cfgmgr and make the new drive a hot
spare (Change/Show Use of an SSA Physical Disk and, under new use, select Hot Spare).
You should see it go from degraded to rebuilding.

5) If you do not have a hot spare, or if it was in the wrong loop, you have to do it manually. The drive will probably be rejected from the array. You will then need to do a rmdev -d -l pdisk? (put your number in)
and then cfgmgr. Then change member disks in the SSA array: Add a Disk to an SSA
RAID Array, and it should automatically start to rebuild for you. You
should see the status change from degraded to rebuilding.


SSA Command Line Information

Identify an SSA Physical Disk

SSARAID -I -l ssa0 -n pdisk0
name        pdisk0
id          000629CA4A3900D
class       disk
use         member
blocksize   512
size        4.5GB
state       good
fastwrite   off

It is impossible to use the system location codes to physically locate SSA disk drives. Example :-
lscfg | grep disk
+ hdisk0 04-A0-00-0,0 Other SCSI Disk Drive
+ pdisk0 04-08-P      4GB SSA F Physical Disk Drive
+ pdisk1 04-08-P      4GB SSA F Physical Disk Drive
+ pdisk2 04-08-P      4GB SSA F Physical Disk Drive

Also the numeric identifiers of pdisks, hdisks, and the disk drive slots are not related to each other. 

For example, pdisk1 is not necessarily configured as hdisk1 and is not necessarily in slot/bay 1.


In order to physically identify and locate an SSA disk drive, 
there is a utility to "flash" the "check light" of a named
SSA pdisk. 

To start flashing the "check light" on SSA disk pdisk0 :-
ssaidentify -l pdisk0 -y

To stop flashing the "check light" on SSA disk pdisk0 :-
ssaidentify -l pdisk0 -n


Adding a New SSA Physical Disk

cfgmgr
Or use the mkdev command :-
mkdev -c (Class) -s (Subclass) -t (Type) -p (Parent) -w (Connection)
example:
mkdev -c pdisk -s ssar -t 4000mbF -p ssar -w 000629D192FF00D
pdisk3 Available


 Mapping hdisks to pdisks

ssaxlate -l (pdisk name or hdisk name)
example:
ssaxlate -l pdisk2
hdisk1 

ssaxlate -l hdisk1
pdisk2
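
To list the mapping for every pdisk on the system in one pass, a small loop helps (a sketch; it assumes lsdev -Cc pdisk lists your SSA physical disks):

for p in $(lsdev -Cc pdisk | awk '{print $1}')
do
  echo "$p -> $(ssaxlate -l $p)"
done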


Changing the Use of an SSA Physical Disk

In the example below, an SSA Physical disk is being changed from "Array Candidate Disk" (default) to "AIX System Disk".

SSARAID -H -l (adapter name) -n (SSA physical disk name) -a (attribute) -d 
-k (system disk name)

examples :-
To change a pdisk to be an "AIX System Disk"
SSARAID -H -l ssa0 -n pdisk0 -a use=system -d -k hdisk1
000629D192FF00D changed 000629D192FF00D attached hdisk1 Available

To change a pdisk to be an "Array Candidate Disk"
SSARAID -H -l ssa0 -n pdisk0 -a use=free
000629D192FF00D detached


Replacing a Disk within a RAID Array

The example below has 3 SSA disks and the use of "hot spares" is turned off, and no "hot spare" disk is available. The array can either in the exposed state or in the degraded state depending if the array has been written to since the disk failure occurred.
---------------------------------------

Using Command Line
Add the new disk into the system (replace the old disk and put the new one in).
Ensure that the new disk is an "Array Candidate Disk"; see "Changing the Use of an SSA
Physical Disk" above for checking/changing this attribute.

SSARAID -Al (adapter name) -n (RAID Array name) -i exchange -a new_member=(new pdisk name)

example:
SSARAID -Al ssa0 -n hdisk1 -i exchange -a new_member=pdisk3
000629CA4A3900D Changed
The SSA RAID Array will now start to rebuild.
Check the array status for the rebuild.
---------------------------------------------------------------
Using SMIT
Add the new disk into the system (see "Adding a New SSA Physical Disk" above).
Ensure that the new disk is an "Array Candidate Disk".
Enter the SMIT screen "Add a Disk to an SSA RAID Array" using
the smit short cut smit addSSARAID

Note : You cannot enter this smit screen unless there is an SSA RAID Array in the "exposed" or "degraded" state.

This will display the status of all SSA RAID Arrays on the system.
Select the SSA RAID Array required and press Enter.
Move the cursor to the "Disk to Add" field and
press F4 or ESC+4 (select the replaced or new disk).
Select the "Disk to Add" and press Enter.
Press Enter again at the "Add a Disk to an SSA RAID Array" screen.
The SSA RAID Array will now start to rebuild.
See "SSA ARRAY STATES" above for checking
the array status.
After the rebuild:
Enter the SMIT screen "Remove a Disk from an SSA RAID Array"
using the smit short cut smit redSSARAID
Select the SSA RAID Array required and press Enter.
Move the cursor to the "Disk to Remove" field and press F4 or ESC+4.
Select the "Disk to Remove" and press Enter.
Press Enter again at the "Remove a Disk from an SSA RAID Array" screen.


Deleting an SSA Physical Disk

Warning : It is possible to delete a pdisk that is being used within an SSA RAID Array.

The SSA RAID Array will continue to operate and the filesystem can still be accessed but error reporting for that pdisk will not function correctly. The pdisk can be added back into the system again.


Using Command Line
rmdev -l (pdisk) -d
example:
rmdev -l pdisk3 -d

Using SMIT
Enter the SMIT menu "SSA Physical Disks" using the smit short cut :-
smit ssadphys
Select "Remove an SSA Physical Disk" and press Enter
Select the required pdisk and press Enter
Toggle "KEEP definition in database" to NO and press Enter


ERROR LOG ANALYSIS
A command line utility has been provided that allows you to run a manual SSA
error log analysis from the command line or via shell scripts. The command
scans the AIX error log and looks for (and reports) the most significant SSA
error.
Examples :
ssa_ela
ssa0 SRN 49000

ssa_ela -l ssa0
ssa0 SRN 49000

The most significant error in the error log is "B4C00618" which is "RESOURCE UNAVAILABLE"

errpt
errpt -a -j B4C00618

LABEL:          SSA_ARRAY_ERROR
IDENTIFIER:     B4C00618

The SRN (Service Request Number) is generated from the first two
blocks of the sense data (0490 0001), if it is present in the error log. The first number is the number of times that the error has been logged during the previous 24
hours. The next five numbers make up the SRN.

Using the SRN table found in the failing device's maintenance manual,
you can identify the cause of the problem, the failing field-replaceable
units (FRUs), and the service actions that might be needed to solve the problem.

Description: An array is in the Degraded state because a disk drive is not available to the array, and a write command has been sent to that array.

Action: A disk drive might not be available for one of the following reasons:
The disk drive has failed.
The disk drive has been removed from the subsystem.
An SSA link has failed.
A power failure has occurred.

If the SSA service aids are available, run the Link Verification service aid (see "Link Verification Service Aid" on page 12-10) to find any failed
disk drives, failed SSA links, or power failures that might have caused the problem. If you find any faults, go to the Start MAP (or equivalent) in
the unit Installation and Service Guide to isolate the problem, then go to 35 on page 13-48 of MAP 2324: SSA RAID to return the array to the
Good state. If the SSA service aids are not available, or the Link Verification service aid does not find any faults, go to "MAP 2324: SSA RAID" on page 13-30 to isolate the problem.

Note : 

SSA_DISK_ERR2 and SSA_DISK_ERR3 errors do not generate an SRN.

DISK_ERR1 and DISK_ERR4, which are media errors, will generate an SRN if more than a predetermined number of these errors are present in the error log.


When errors do occur, it is important to clear the errpt so that cron jobs don't continue to harass you with messages.

There are two cron jobs that control this action.
The first is diagela, which runs every night (usually at 3 a.m.), writes to the errpt, and sends you a console message. The other is run_ssa_healthcheck:

The run_ssa_healthcheck cron job checks for SSA subsystem problems that do not cause I/O errors but do cause some loss of redundancy or functionality. It reports such errors each hour until the problem is solved. This cron entry sends a command to the adapter. The command causes the adapter to write a new error log entry for any problems that it can detect, although those problems might not be causing any failure in the user's applications. Such problems could include :-

Adapter hardware faults
Adapter configuration problems
RAID array problems
Fast-write cache problems
Open serial link conditions
Link configuration faults
Disk drives that are returning Check status to an Inquiry command
Redundant power failures in SSA enclosures

NOTE:

Emails that come every hour can occur when you have pulled a drive out, or when you defined a hot spare but don't have a drive in the slot you defined.

To make one disk a hot spare you must go back into
smitty ssaraid

Select the drive to change and under new use change to hot spare.

Go back into smitty ssaraid

Select Add an SSA RAID Array and select your adapter.

Look for the field that says enable use of hot spares. Remember: if you did not do this, do not say true; set it to false. You will get an error every hour if it is not set up correctly.



GENERAL SSA ERRORS
SSA_LINK_ERROR          SSA serial link failures
SSA_LINK_OPEN           SSA serial link open
SSA_DETECTED_ERROR      SSA detected failures
SSA_DEVICE_ERROR        SSA device failures
SSA_DEGRADED_ERROR      SSA degraded condition
SSA_HDW_ERROR           SSA hardware error condition
SSA_HDW_RECOVERED       Recovered SSA hardware error
SSA_SOFTWARE_ERROR      SSA software or microcode errors
SSA_LOGGING_ERROR       Unable to log against a pdisk
SSA_ARRAY_ERROR         SSA RAID array detected error
SSA_SETUP_ERROR         SSA configuration error
SSA_CACHE_ERROR         SSA cache error
SSA_DISK_ERR1           DASD detected software error
SSA_DISK_ERR2           DASD statistical data
SSA_DISK_ERR3           Recovered SSA disk media error
SSA_DISK_ERR4           Physical volume hardware error

 

   

(03/2002 slg)