Thursday, September 3, 2015

How ASM disk header block repair works !!


source : http://laurent-leturgez.com/2012/11/12/how-asm-disk-header-block-repair-works/


In this post, I will explain some of my recent work on the ASM disk header block.
First, I will create a TEST tablespace in my orcl database. The TEST tablespace’s datafile will be managed by ASM.
[oracle@oel ~]$ . oraenv
ORACLE_SID = [+ASM] ? orcl
The Oracle base remains unchanged with value /u01/app/oracle
[oracle@oel ~]$ sqlplus / as sysdba
SQL> create tablespace test datafile '+data' size 5M autoextend on maxsize unlimited;
Tablespace created.

SQL>  select instance_name from v$instance;
INSTANCE_NAME
----------------
orcl

SQL> select file_id,tablespace_name,file_name,status,online_status from dba_data_files
  2  /

   FILE_ID TABLESPACE_NAME                FILE_NAME                                          STATUS    ONLINE_
---------- ------------------------------ -------------------------------------------------- --------- -------
         4 USERS                          /u02/oradata/orcl/users01.dbf                      AVAILABLE ONLINE
         3 UNDOTBS1                       /u02/oradata/orcl/undotbs01.dbf                    AVAILABLE ONLINE
         2 SYSAUX                         /u02/oradata/orcl/sysaux01.dbf                     AVAILABLE ONLINE
         1 SYSTEM                         /u02/oradata/orcl/system01.dbf                     AVAILABLE SYSTEM
         5 EXAMPLE                        /u02/oradata/orcl/example01.dbf                    AVAILABLE ONLINE
         6 TEST                           +DATA/orcl/datafile/test.258.798578613             AVAILABLE ONLINE

6 rows selected.
My ASM instance manages two diskgroups: DATA (external redundancy) and RL (normal redundancy).
DATA is a two-disk diskgroup managed by asmlib.
SQL> select group_number,name,type,state from v$asm_diskgroup
  2  /

GROUP_NUMBER NAME TYPE   STATE
------------ ---- ------ -----------
           1 DATA EXTERN MOUNTED
           2 RL   NORMAL MOUNTED
SQL> select path from v$asm_disk where group_number=1;

PATH
---------------------------------
ORCL:ASM1
ORCL:ASM2
Now, let’s erase the header block of each disk. The header block is the first block of an ASM disk device, and its default size is 4096 bytes (the size is set by the undocumented _asm_blksize parameter).
[oracle@oel ~]$ dd if=/dev/zero of=/dev/oracleasm/disks/ASM1 bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 2.2e-05 seconds, 186 MB/s
[oracle@oel ~]$ dd if=/dev/zero of=/dev/oracleasm/disks/ASM2 bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.000336 seconds, 12.2 MB/s
Even with the header blocks destroyed, I can still write to my diskgroup and allocate extents in it.
SQL> alter database datafile 6 resize 10M;

Database altered.

SQL> create table t tablespace TEST as select * from dba_source;

Table created.

SQL> alter system checkpoint;

System altered.

SQL> select file#,checkpoint_time from v$datafile_header where file#=6;

     FILE# CHECKPOINT_TIME
---------- -------------------
         6 05/11/2012 19:27:43
Well, let’s try to restart the rdbms instance:
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 1068937216 bytes
Fixed Size                  2235208 bytes
Variable Size             377488568 bytes
Database Buffers          683671552 bytes
Redo Buffers                5541888 bytes
Database mounted.
Database opened.
SQL> select count(*) from t;

  COUNT(*)
----------
    702070
Damn! I can still restart it and read all the file extents.
Something is strange in this demo: my ASM disk headers are invalid and I can still read the files. I will check the state of my headers with kfed (only one ASM disk is shown below, read through both the asmlib device and the underlying partition):
[oracle@oel ~]$ kfed read /dev/oracleasm/disks/ASM1
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
2B469E002400 00000000 00000000 00000000 00000000  [................]
  Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]

[oracle@oel ~]$ sudo /u01/app/oracle/product/11.2.0.3/dbhome_1/bin/kfed read /dev/sdb1
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
2B413559C400 00000000 00000000 00000000 00000000  [................]
  Repeat 255 times
KFED-00322: file not found; arguments: [kfbtTraverseBlock] [Invalid OSM block type] [] [0]
So kfed confirms that my header blocks are invalid. Now, I will dismount and remount the diskgroup to see whether this affects the mount operation.
[oracle@oel ~]$ sqlplus / as sysasm

SQL> alter diskgroup DATA dismount;

Diskgroup altered.

SQL> alter diskgroup DATA mount;
alter diskgroup DATA mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"
So the ASM disk header appears to be required for mounting the diskgroup, but it has no effect on ASM extent allocation once the diskgroup is mounted.
Now I will use the kfed repair operation to restore my ASM header blocks:
[oracle@oel ~]$ kfed repair /dev/oracleasm/disks/ASM1
[oracle@oel ~]$ kfed repair /dev/oracleasm/disks/ASM2

[oracle@oel ~]$ sqlplus / as sysasm

SQL> alter diskgroup DATA mount;

Diskgroup altered.
So KFED is able to restore the header (with the repair operation).
[oracle@oel ~]$ . oraenv
ORACLE_SID = [+ASM] ? orcl
The Oracle base remains unchanged with value /u01/app/oracle

[oracle@oel ~]$ sqlplus / as sysdba

Connected to an idle instance.

SQL> startup
...
Database opened.
SQL> select count(*) from T;

  COUNT(*)
----------
    702070
And the data in table T is still there… no problem!
Well, let’s summarize… I erased the header block of both ASM disks in my diskgroup without any impact on file extent allocation. The only impact was on the ability to mount the diskgroup. Next, I restored the header block of each disk with a kfed repair operation… and yet I had no backup of the disk or of the disk header. So I wonder… where was the backup of my header block?
First, I will try a very simple method: look at all the metadata blocks on my ASM disk device, hoping to find another header block. For this purpose, I will use a basic shell script that analyzes each block. If the script finds a block of type KFBTYP_DISKHEAD, it records the block position, and it prints all recorded positions at the end of execution:
The basic script:
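#!/bin/bash
# Scan an ASM disk block by block with kfed and list the blocks whose
# type is KFBTYP_DISKHEAD.
# Usage: ./asm_surgery.sh <disk device>
export ORAENV_ASK=NO
export ORACLE_SID=+ASM
. oraenv

disk=$1
i=0
blk_typ=''
declare -a t

# Stop at the first block kfed reports as KFBTYP_INVALID, so run this
# on a disk whose header block (block 0) is intact.
while [ "$blk_typ" != "KFBTYP_INVALID" ]
do
  # Field 5 of the kfbh.type line is the block type name
  blk_typ=$(kfed read "$disk" blkn=$i | awk '/kfbh.type/ {print $5}')
  # No type returned (kfed failed or the disk is unreadable): stop looping
  [ -z "$blk_typ" ] && break
  if [ "$blk_typ" = "KFBTYP_DISKHEAD" ]; then
    t[$i]=$i
  fi
  i=$((i + 1))
done
echo "list of header block with KFBTYP_DISKHEAD type"
echo "${t[@]}"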
Execution results:
[oracle@oel ~]$ ./asm_surgery.sh /dev/oracleasm/disks/ASM1
The Oracle base remains unchanged with value /u01/app/oracle
list of header block with KFBTYP_DISKHEAD type
0 510
[oracle@oel ~]$ ./asm_surgery.sh /dev/oracleasm/disks/ASM2
The Oracle base remains unchanged with value /u01/app/oracle
list of header block with KFBTYP_DISKHEAD type
0 510
OK, so there is a copy of the header block in block #510 of my disk. Indeed, as Bane Radulović mentions on his blog, a backup copy of the ASM disk header is kept in the second-to-last block of allocation unit 1. So, with a 1 MB AU and a 4096-byte block size, AU 0 and AU 1 together hold (1048576 / 4096) * 2 = 512 blocks, and the second-to-last of them is the 511th: (1048576 / 4096) * 2 – 1 = 511. As block numbering starts at 0, that is block #510.
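The arithmetic above can be sketched as a small shell helper. This is only an illustration: backup_header_block is a hypothetical function name, not something provided by kfed, and it assumes the backup copy sits in the second-to-last block of allocation unit 1, as described above.

```shell
#!/bin/bash
# Hypothetical helper (not part of kfed): absolute block number of the
# backup disk header, assumed to be the second-to-last block of AU 1.
backup_header_block() {
  local ausize=$1 blksize=$2
  local blocks_per_au=$(( ausize / blksize ))
  # AU 0 holds blocks 0 .. blocks_per_au-1 and AU 1 the next blocks_per_au,
  # so the second-to-last block of AU 1 is 2*blocks_per_au - 2 (0-based).
  echo $(( 2 * blocks_per_au - 2 ))
}

backup_header_block 1048576 4096   # prints 510
```

With a larger AU the position moves accordingly, which is why the block number should be derived from kfdhdb.ausize and kfdhdb.blksize rather than hard-coded.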
To double check this, I will erase the header block of my first ASM disk, and use kfed repair and strace to see what really happens:
[oracle@oel ~]$ dd if=/dev/zero of=/dev/oracleasm/disks/ASM1 bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 2.3e-05 seconds, 178 MB/s

[oracle@oel ~]$ strace kfed repair /dev/oracleasm/disks/ASM1
.../...

stat("/dev/oracleasm/disks/ASM1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 17), ...}) = 0
access("/dev/oracleasm/disks/ASM1", F_OK) = 0
statfs("/dev/oracleasm/disks/ASM1", {f_type=0x958459f6, f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
open("/dev/oracleasm/disks/ASM1", O_RDWR) = 7
lseek(7, 2088960, SEEK_SET)             = 2088960
read(7, "\1\202\1\1\376\200<\206\371\7"..., 4096) = 4096
lseek(7, 0, SEEK_SET)                   = 0
read(7, ""..., 4096) = 4096
lseek(7, 0, SEEK_SET)                   = 0
write(7, "\1\202\1\1\200\302\206\371\7"..., 4096) = 4096
close(7)                                = 0
This is very interesting.
Here is what happens: kfed opens the device, seeks to byte offset 2088960 and reads one 4096-byte block. Since 2088960 / 4096 = 510, it is reading block #510 of the disk. Next it reads the block at offset 0 (the header block), and then writes the content of block #510 over it.
Well, now I know there’s a copy of each header block in a secret position (block #510), and kfed uses this block to repair the main header block.
During my tests, I noticed that ASM disks sometimes lost their ASM label (I don’t know why). As a consequence, the repaired disks are not recreated by oracleasm scandisks (or after a reboot).
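To make the traced sequence concrete, here is a minimal simulation of the same read-then-overwrite pattern using dd on a scratch file. It is a sketch under stated assumptions: the "disk" is a temporary file, the FAKE-DISKHEAD marker is made up, and nothing here touches a real ASM device or claims to reproduce what kfed does internally beyond the lseek/read/write calls shown in the trace.

```shell
#!/bin/bash
# Simulation only: "disk" is a scratch file, never a real ASM device.
set -eu

blksize=4096
backup_blk=510                          # second-to-last block of AU 1 (1 MB AU)
offset=$(( backup_blk * blksize ))      # 2088960, the lseek() target in the trace

disk=$(mktemp)                          # stand-in for the disk
blk=$(mktemp)                           # buffer holding one block

# A 2 MB zero-filled "disk": block 0 plays the erased header.
dd if=/dev/zero of="$disk" bs="$blksize" count=512 2>/dev/null
# Plant a made-up marker where the backup header would live.
printf 'FAKE-DISKHEAD' | dd of="$disk" bs="$blksize" seek="$backup_blk" conv=notrunc 2>/dev/null

# "Repair": read block 510, then write it back over block 0.
dd if="$disk" of="$blk" bs="$blksize" skip="$backup_blk" count=1 2>/dev/null
dd if="$blk" of="$disk" bs="$blksize" seek=0 conv=notrunc 2>/dev/null

restored=$(head -c 13 "$disk")
echo "copied block at byte offset $offset to block 0: $restored"
rm -f "$disk" "$blk"
```

The conv=notrunc option matters in the write steps: without it, dd would truncate the target file instead of overwriting a single block in place.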
[root@oel ~]# ls /dev/oracleasm/disks/*
/dev/oracleasm/disks/ASM1  /dev/oracleasm/disks/ASM2  /dev/oracleasm/disks/ASM3  /dev/oracleasm/disks/ASM4
[root@oel ~]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Cleaning disk "ASM1"
Scanning system for ASM disks...
[root@oel ~]# ls /dev/oracleasm/disks/*
/dev/oracleasm/disks/ASM2  /dev/oracleasm/disks/ASM3  /dev/oracleasm/disks/ASM4
Provided you know which device corresponds to the asmlib disk, you can relabel it with the oracleasm renamedisk command (be very careful with this command):
[root@oel ~]# oracleasm renamedisk -f /dev/sdb1 ASM1
Writing disk header: done
Instantiating disk "ASM1": done

[root@oel ~]# oracleasm scandisks
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...

[root@oel ~]# ls /dev/oracleasm/disks/*
/dev/oracleasm/disks/ASM1  /dev/oracleasm/disks/ASM2  /dev/oracleasm/disks/ASM3  /dev/oracleasm/disks/ASM4

10 responses to “How ASM disk header block repair works”

  1. Prosvet November 23, 2012 at 2:38 PM
    I think it is not entirely correct to say that
    “there is a copy of the header block in the 510th block…”.
    If I’m not mistaken in ASM version 11.1.0.7 and later the second last block from allocation unit 1 is KFBTYP_DISKHEAD ( the backup copy of the ASM disk header ) !!!
    By default you will get 510th block
    kfdhdb.blksize: 4096
    kfdhdb.ausize: 1048576
    and (1048576 / 4096) * 2 – 1 = 511. The first block is 0 so you get 510th block.
    Thanks to Bane Radulović )))
    • Laurent November 23, 2012 at 3:10 PM
      Hello,
      Indeed, in my case, it was the 510th block (or the second last block from allocation unit 1)
      I have updated the post with the correct formula and the reference to Bane’s blog
      Thanks.
  2. Bikash April 27, 2013 at 2:43 PM
    Hello,
    [oracle@rac1 ~]$ blksize=`kfed read /dev/oracleasm/disks/DISK1 | grep blksize | tr -s ' ' | cut -d' ' -f2`
    [oracle@rac1 ~]$ echo $blksize
    4096
    [oracle@rac1 ~]$ ausize=`kfed read /dev/oracleasm/disks//DISK1 | grep ausize | tr -s ' ' | cut -d' ' -f2`
    [oracle@rac1 ~]$ echo $ausize
    1048576
    [oracle@rac1 ~]$ let n=$ausize/$blksize-2
    [oracle@rac1 ~]$ echo $n
    254
    I am unable to understand why there is a multiplication by 2 in (1048576 / 4096) * 2 – 1 = 511 (“The first block is 0 so you get 510th block.”).
    Could you please explain why this block is at 511 instead of 254?
  3. Laurent April 27, 2013 at 9:54 PM
    Hi,
    In my example, I didn’t specify the AU number, and without the aun parameter kfed starts at aun=0.
    In Bane’s blog, he specifies aun=1, so he queries the 255th block of that AU (254, because numbering starts at 0). As I didn’t specify it, I have to skip over aun=0, which counts ($ausize/$blksize) blocks.
    [oracle@oel ~]$ let firstau=$ausize/$blksize-1
    [oracle@oel ~]$ let n=$firstau+$ausize/$blksize-1
    [oracle@oel ~]$ kfed read /dev/oracleasm/disks/ASM1 ausz=$ausize blkn=$n | grep KFBTYP
    kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
    [oracle@oel ~]$ echo $n
    510
    which is equivalent to:
    [oracle@oel ~]$ kfed read /dev/oracleasm/disks/ASM1 ausz=$ausize aun=0 blkn=$n | grep KFBTYP
    kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
    Hope this helps.
  4. Bikash April 28, 2013 at 7:30 AM
    First of all Thank you very much for your reply.
    total number of blocks –> 256 ( 0 to 255) (second last block if we start with 0 it would be 254 or if we start with 1 it would be 255)
    [oracle@rac1 ~]$ ausize=`kfed read /dev/oracleasm/disks/DISK1 | grep ausize | tr -s ' ' | cut -d' ' -f2`
    [oracle@rac1 ~]$
    [oracle@rac1 ~]$ echo $ausize
    1048576
    [oracle@rac1 ~]$ blksize=`kfed read /dev/oracleasm/disks/DISK1 | grep blksize | tr -s ' ' | cut -d' ' -f2`
    [oracle@rac1 ~]$ echo $blksize
    4096
    [oracle@rac1 ~]$ let firstau=$ausize/$blksize-1
    [oracle@rac1 ~]$ echo $firstau
    255
    The only thing I am unable to understand is why the first allocation unit number starts with 255? Are the numbers 0 to 254 reserved for ASM?
    • Laurent April 28, 2013 at 9:12 PM
      Hello
      Each AU is composed of 256 blocks (indexed from 0 to 255), so to find the second-to-last block of allocation unit 1 (the 2nd AU on the disk), we have something like this:
      – The first AU is indexed from 0 to 255. In this AU, the header block is located in AUN=0 (the default in kfed), at relative block# 0.
      – The next AU is also indexed from 0 to 255 (absolute block numbers 256 to 511, i.e. the 257th to 512th blocks of the disk). So the second-to-last block is accessible in AUN=1 at relative block 254 (or absolute block 510). And the last block contains the heartbeat block.
      See below with a relative block number in the AU:
      [oracle@oel ~]$ kfed read /dev/oracleasm/disks/ASM1 aun=1 ausz=$ausize blkn=254 | grep KFBTYP
      kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
      [oracle@oel ~]$ kfed read /dev/oracleasm/disks/ASM1 aun=1 ausz=$ausize blkn=255 | grep KFBTYP
      kfbh.type: 19 ; 0x002: KFBTYP_HBEAT
      See below with an absolute block number in the disk (same result):
      [oracle@oel ~]$ kfed read /dev/oracleasm/disks/ASM1 ausz=$ausize blkn=510 | grep KFBTYP
      kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
      [oracle@oel ~]$ kfed read /dev/oracleasm/disks/ASM1 ausz=$ausize blkn=511 | grep KFBTYP
      kfbh.type: 19 ; 0x002: KFBTYP_HBEAT
      Let me know if I’m clear enough (I have some doubts ;) because for me each AU is composed of 256 blocks)
  5. Bikash April 29, 2013 at 8:29 AM
    Thank you very much for explaining this ..It is clear now. Thanks again.
  6. Sachin October 22, 2014 at 7:16 AM
    Very helpful blog. Just wondering whether this activity (restoring the label) is advisable when the ASM disks have data on them and for some reason (deleted) are not visible and cannot be scanned, or only for new blank ones?
    Any reply will be appreciated.
    Regards
    Sachin
    • Laurent October 22, 2014 at 7:34 AM
      Hi Sachin
      The given method is OK for restoring metadata. If you have data on your disks and, for whatever reason, only the header block is corrupted, it can be used to restore it.
      Of course, it’s at your own risk, because at this level Oracle won’t support you.
      Laurent
