[ILUG] SOLUTION: RE: [SAGE-IE] Dell PERC 3/Di & interrupted reconfiguration

Kenn Humborg kenn at bluetree.ie
Fri Feb 3 10:22:11 GMT 2006


I'm quoting the complete message for the archives.  Answer at
the end.

> I had an array with 3 73GB disks in a RAID 5, PE2650, PERC 3/Di.
> It looked like this:
> 
> Executing: disk show space
> 
> Scsi B:ID:L Usage      Size
> ----------- ---------- -------------
>   0:00:0     Container 64.0KB:68.3GB
>   0:00:0     Free      68.3GB:7.50KB
>   0:01:0     Container 64.0KB:68.3GB
>   0:01:0     Free      68.3GB:7.50KB
>   0:02:0     Container 64.0KB:68.3GB
>   0:02:0     Free      68.3GB:7.50KB
> 
> I inserted two new disks.  As per 
> 
>    http://www1.us.dell.com/content/topics/global.aspx/power/en/ps1q03_michael?c=us&cs=04&l=en&s=bsd
> 
> 
> I rescanned, init-ed the new disks and then did
> 
>    container reconfigure 0 (0,3,0) 
> 
> to bring in the first new disk.  I wanted to expand the RAID 
> to be a 4 x 73GB array.  The reconfigure started and I did
> a disk show space:
> 
> Executing: disk show space
> 
> Scsi B:ID:L Usage      Size
> ----------- ---------- -------------
>   0:00:0     Container 64.0KB:45.5GB
>   0:00:0     Container 64.0KB:68.3GB
>   0:00:0     Free      68.3GB:7.50KB
>   0:00:0     Container 68.3GB:1.00MB
>   0:01:0     Container 64.0KB:68.3GB
>   0:01:0     Container 64.0KB:45.5GB
>   0:01:0     Free      68.3GB:7.50KB
>   0:01:0     Container 68.3GB:1.00MB
>   0:02:0     Container 64.0KB:45.5GB
>   0:02:0     Container 64.0KB:68.3GB
>   0:02:0     Free      68.3GB:7.50KB
>   0:02:0     Container 68.3GB:1.00MB
>   0:03:0     Container 64.0KB:45.5GB
>   0:03:0     Free      45.5GB:22.7GB
>   0:04:0     Free      64.0KB:68.3GB
> 
> When I saw the 45GB stuff mentioned and what looked like 
> overlapping partitions on the disks, I panicked.  I did 
> a task stop to stop the reconfigure.
> 
> Later, after finding out about container list /full, I 
> figured out that it was converting my 3 x 68.3GB array
> into a 4 x 45.5GB array (empty columns removed to avoid
> line wraps):
> 
> Executing: container list /full=TRUE
> Num          Total  Chunk          Scsi   Partition
> Label Type   Size   Size   Usage   B:ID:L Offset:Size   State   Ent
> ----- ------ ------ ------ ------- ------ ------------- ------- ---
>  0    Reconf  136GB        Open
>  /dev/sda
> 62    RAID-5  136GB   32KB None    0:00:0 64.0KB:45.5GB Dest    0
>                                    0:01:0 64.0KB:45.5GB Dest    1
>                                    0:02:0 64.0KB:45.5GB Dest    2
>                                    0:03:0 64.0KB:45.5GB Dest    3
> 63    RAID-5  136GB   32KB None    0:00:0 64.0KB:68.3GB Source  0
>                                    0:01:0 64.0KB:68.3GB Source  1
>                                    0:02:0 64.0KB:68.3GB Source  2
> 61    RAID-5 2.00MB   64KB None    0:00:0 68.3GB:1.00MB Temp    0
>                                    0:01:0 68.3GB:1.00MB Temp    1
>                                    0:02:0 68.3GB:1.00MB Temp    2
> 
> I guess I really should have specified
> 
>    container reconfigure /partition_size=68.3G 0 (0,3,0) 
> 
> Anyway, now I'm left with a container in state "reconf"
> and I can't figure out how to get it to restart the
> reconfiguration.
> 
> AFA0> container reconfigure 0 (0,3,0)
> Executing: container reconfigure 0 (BUS=0,ID=3,LUN=0)
> Command Error: <The specified container type is invalid for the 
> attempted operation.>
> 
> AFA0> container scrub /no_repair 0
> Executing: container scrub /no_repair=TRUE 0
> Command Error: <The requested operation cannot be done because 
> it was in the wrong state.>
> 
> Is there any way to recover this?  The system seems to be operating
> normally and all of the filesystems on it have fscked OK.
> 
> Or am I looking at a complete system rebuild?  Dell say "if it was
> our data, we'd rebuild from scratch and restore from backup".  I 
> can do this, if I have to.
> 
> I'd expect that the same state could occur if a power loss occurred
> during the reconfigure, so I'm thinking there should be some other
> way to recover from this.

Yesterday I rebuilt the machine, but this morning, I found this:

http://docs.us.dell.com/support/edocs/storage/romb26/cli-ug/cli_admk.htm

From this page:

   Restarting a Stopped Reconfigure Task

   The restart reconfigure task attribute indicates the restart of 
   a stopped reconfigure task. The task stop command stops a 
   reconfigure task. See Stopping Tasks for information on stopping 
   a task. In the following example, the container reconfigure 
   command with the /restart switch restarts a container 
   reconfigure task on container 1:

      AFA0> container reconfigure /restart=TRUE /wait=TRUE 1
      Executing: container reconfigure /restart=TRUE /wait=TRUE 1

So all I really needed to do was run "container reconfigure /restart 0"
to let it finish converting the 3x68.3GB container to a 4x45.5GB
container, and then run

   container reconfigure /partition_size=68GB 0

to expand the RAID up to a 4x68GB container.
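
For the archives, the sizes in the listings above are consistent with
the usual RAID-5 capacity rule (usable space is the per-disk partition
size times one less than the number of disks). Below is a minimal
Python sketch of that arithmetic. It assumes the controller keeps the
usable size constant when it restripes onto the extra disk, which the
unchanged 136GB total in the container list suggests but which I have
not confirmed. The function names are made up for illustration and are
not part of any Dell or Adaptec tool.

```python
def raid5_usable_gb(disks: int, partition_gb: float) -> float:
    """Usable capacity of a RAID-5 set: one partition's worth holds parity."""
    if disks < 3:
        raise ValueError("RAID-5 needs at least 3 disks")
    return (disks - 1) * partition_gb


def partition_for_same_capacity(usable_gb: float, disks: int) -> float:
    """Per-disk partition size that keeps usable_gb when spread over `disks`."""
    if disks < 3:
        raise ValueError("RAID-5 needs at least 3 disks")
    return usable_gb / (disks - 1)


# Original array: 3 disks, 68.3GB partition on each.
original_gb = raid5_usable_gb(3, 68.3)

# Assumed default behaviour: same usable size, restriped over 4 disks.
restriped_gb = partition_for_same_capacity(original_gb, 4)

# After growing each partition back to 68.3GB on all 4 disks.
expanded_gb = raid5_usable_gb(4, 68.3)

print(f"3 x 68.3GB RAID-5 usable:    {original_gb:.1f}GB")
print(f"4-disk partition, same size: {restriped_gb:.1f}GB")
print(f"4 x 68.3GB RAID-5 usable:    {expanded_gb:.1f}GB")
```

So the 45.5GB partitions that caused the panic are just the existing
136GB spread over four disks, and the second reconfigure with
/partition_size is what actually adds capacity.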

Unfortunately, I won't have an opportunity to test this.

Later,
Kenn



