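Contact: mobile/WeChat (+86 17813235971), QQ (107644445)
Title: Physically Addressed Metadata Redundancy on 12c ASM (PHYS_META_REPLICATED)
Author: 惜分飞 © All rights reserved. [Reproduction in any form without my consent is prohibited; I reserve the right to take further legal action.]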
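Starting with version 12.1, ASM keeps a replica of certain physical metadata, specifically the metadata in the first allocation unit (AU 0) of each disk. This means ASM maintains two copies of the disk header, the Free Space Table (FST), and the Allocation Table (AT). Note that ASM replicates this data rather than mirroring it. ASM mirroring places a copy of the data on a different disk, whereas the copy of the physical metadata lives on the same disk, hence the term replication. It follows that physical metadata is replicated even in external redundancy disk groups.

The PST (Partnership and Status Table) is also physical metadata, but ASM protects it by mirroring rather than replication. The PST is therefore redundant only in normal and high redundancy disk groups.

Physical metadata lives in AU 0 of every ASM disk. Once metadata replication is enabled, ASM copies the contents of AU 0 to AU 11 and then keeps both copies up to date. The feature is enabled automatically when a disk group is created with its ASM compatibility attribute set to 12.1 or higher, or when that attribute of an existing disk group is raised to 12.1 or higher. If AU 11 already holds data when the ASM compatibility is raised, ASM first moves that data elsewhere and then copies the physical metadata into AU 11.

Since version 11.1.0.7, ASM has also kept a backup of the disk header in the second-to-last block of AU 1. Version 12.1 still maintains that backup, so every ASM disk now carries three copies of the disk header.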
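The three disk header locations described above can be turned into byte offsets with a little arithmetic. The sketch below is illustrative only: the function names are made up for this example, and it assumes the 1 MiB allocation unit used later in this article together with a 4 KiB ASM metadata block.

```shell
#!/usr/bin/env bash
# Byte offsets of the three disk header copies on one ASM disk.
# Assumed defaults: 1 MiB allocation unit, 4 KiB metadata block.

# Offset of block $2 inside allocation unit $1.
au_block_offset() {
    local aun=$1 blkn=$2 au_size=${3:-1048576} blk_size=${4:-4096}
    echo $(( aun * au_size + blkn * blk_size ))
}

# Print the offsets of the primary header (AU 0, block 0), the
# replica (AU 11, block 0) and the backup block (second-to-last
# block of AU 1).
header_copy_offsets() {
    local au_size=${1:-1048576} blk_size=${2:-4096}
    local blocks_per_au=$(( au_size / blk_size ))
    echo "primary $(au_block_offset 0 0 "$au_size" "$blk_size")"
    echo "replica $(au_block_offset 11 0 "$au_size" "$blk_size")"
    echo "backup $(au_block_offset 1 $(( blocks_per_au - 2 )) "$au_size" "$blk_size")"
}
```

With the defaults, the second-to-last block of AU 1 is block 254, which matches the `blkn=254 aun=1` arguments passed to kfed later in this article.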
Contents of AU 0
Create the disk group from the command line
[grid@localhost ~]$ sqlplus / as sysasm
SQL*Plus: Release 12.2.0.1.0 Production on Sat Apr 22 08:13:31 2017
Copyright (c) 1982, 2016, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> create diskgroup xifenfei external redundancy disk '/dev/xifenfei-sdg','/dev/xifenfei-sdh';
Diskgroup created.
SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
Check the attributes of the xifenfei disk group
[grid@localhost ~]$ asmcmd lsattr -l -G XIFENFEI
Name                     Value
access_control.enabled   FALSE
access_control.umask     066
au_size                  1048576
cell.smart_scan_capable  FALSE
compatible.asm           11.2.0.2.0
compatible.rdbms         10.1.0.0.0
disk_repair_time         3.6h
idp.boundary             auto
idp.type                 dynamic
sector_size              512
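At this point compatible.asm is 11.2.0.2.0 and there is no phys_meta_replicated attribute.
Check the disk header
The kfdhdb.flags entry in the ASM disk header indicates the replication state of the physical metadata:
· kfdhdb.flags = 0: the metadata has not been replicated
· kfdhdb.flags = 1: the metadata has been fully replicated
· kfdhdb.flags = 2: the metadata is currently being replicated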
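As a small convenience, the three values above can be decoded straight from kfed output. This is a minimal sketch rather than an Oracle tool: `replication_state` is a hypothetical helper name, and it only assumes the `kfdhdb.flags: <value> ; ...` line layout that kfed prints in this article.

```shell
#!/usr/bin/env bash
# Translate the kfdhdb.flags line of `kfed read` output (on stdin)
# into the replication state listed above.
replication_state() {
    awk '$1 == "kfdhdb.flags:" {
        if ($2 == 0)      print "not replicated"
        else if ($2 == 1) print "replicated"
        else if ($2 == 2) print "replication in progress"
        else              print "unknown (" $2 ")"
        found = 1
    }
    END { if (!found) print "no kfdhdb.flags line found" }'
}
```

It would be used as `kfed read /dev/xifenfei-sdg | replication_state`.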
[grid@localhost ~]$ for disk in `asmcmd lsdsk -G XIFENFEI --suppressheader`;
> do kfed read $disk | egrep "dskname|flags"; done
kfdhdb.dskname: XIFENFEI_0000 ; 0x028: length=13
kfdhdb.flags: 0 ; 0x0fc: 0x00000000
kfdhdb.dskname: XIFENFEI_0001 ; 0x028: length=13
kfdhdb.flags: 0 ; 0x0fc: 0x00000000
[grid@localhost ~]$ kfed read /dev/xifenfei-sdg|grep kfbh.type
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
[grid@localhost ~]$ kfed read /dev/xifenfei-sdg aun=11|grep kfbh.type
kfbh.type: 8 ; 0x002: KFBTYP_CHNGDIR
[grid@localhost ~]$ kfed read /dev/xifenfei-sdg blkn=254 aun=1|grep kfbh.type
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
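It is clear here that AU 11 does not hold a backup of the first AU: its first block is of type KFBTYP_CHNGDIR rather than KFBTYP_DISKHEAD.
Change compatible.asm to 12.2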
[grid@localhost ~]$ asmcmd setattr -G XIFENFEI compatible.asm 12.2.0.0.0
[grid@localhost ~]$ asmcmd lsattr -l -G XIFENFEI
Name                         Value
access_control.enabled       FALSE
access_control.umask         066
appliance._partnering_type   NULL
au_size                      1048576
cell.smart_scan_capable      FALSE
cell.sparse_dg               allnonsparse
compatible.asm               12.2.0.0.0
compatible.rdbms             10.1.0.0.0
content.check                FALSE
content.type                 data
disk_repair_time             3.6h
failgroup_repair_time        24.0h
idp.boundary                 auto
idp.type                     dynamic
logical_sector_size          512
phys_meta_replicated         true
preferred_read.enabled       FALSE
scrub_async_limit            1
scrub_metadata.enabled       FALSE
sector_size                  512
thin_provisioned             FALSE
After compatible.asm is changed to 12.2, the phys_meta_replicated attribute appears.
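When several disk groups need checking, the attribute can be picked out of the listing automatically. The helper below is a hypothetical sketch (the name `phys_meta_status` is not part of asmcmd); it reads `asmcmd lsattr -l -G <group>` output from stdin and relies only on the two-column layout of that listing.

```shell
#!/usr/bin/env bash
# Read `asmcmd lsattr -l -G <group>` output on stdin and report
# whether phys_meta_replicated is listed and set to true.
phys_meta_status() {
    awk '$1 == "phys_meta_replicated" { val = $2; found = 1 }
         END {
             if (!found)                      print "absent"
             else if (tolower(val) == "true") print "enabled"
             else                             print "disabled (" val ")"
         }'
}
```

For example, `asmcmd lsattr -l -G XIFENFEI | phys_meta_status` would be expected to print `absent` before the compatibility change and `enabled` after it.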
Check the AU backup
[grid@localhost ~]$ for disk in `asmcmd lsdsk -G XIFENFEI --suppressheader`;
> do kfed read $disk | egrep "dskname|flags"; done
kfdhdb.dskname: XIFENFEI_0000 ; 0x028: length=13
kfdhdb.flags: 1 ; 0x0fc: 0x00000001
kfdhdb.dskname: XIFENFEI_0001 ; 0x028: length=13
kfdhdb.flags: 1 ; 0x0fc: 0x00000001
[grid@localhost ~]$ kfed read /dev/xifenfei-sdg aun=11|grep kfbh.type
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
[grid@localhost ~]$ kfed read /dev/xifenfei-sdg blkn=254 aun=1|grep kfbh.type
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
[grid@localhost ~]$ kfed read /dev/xifenfei-sdg|grep kfbh.type
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfdhdb.flags is now 1, and the first block of AU 11 has become a disk header as well.
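Beyond checking the block type, one could compare the header block in AU 0 with its copy in AU 11 byte for byte. The following read-only sketch assumes a 1 MiB AU and a 4 KiB metadata block, and `header_replica_matches` is a made-up name. It also assumes the two blocks are identical while the disk group is quiescent, which is consistent with the kfed output later in this article but is not something Oracle documents; a mismatch on a live disk may simply mean ASM was in the middle of an update.

```shell
#!/usr/bin/env bash
# Compare block 0 of AU 0 with block 0 of AU 11 on a disk or image file.
# Read-only. Returns 0 when both blocks are byte-identical, 1 when they
# differ, 2 when a full block could not be read.
header_replica_matches() {
    local dev=$1 au_size=${2:-1048576} blk_size=${3:-4096}
    local a b
    # cksum prints "<crc> <byte count>" for data read from stdin.
    a=$(dd if="$dev" bs="$blk_size" count=1 skip=0 2>/dev/null | cksum)
    b=$(dd if="$dev" bs="$blk_size" count=1 \
           skip=$(( 11 * au_size / blk_size )) 2>/dev/null | cksum)
    # Guard against short reads, which would otherwise compare as equal.
    [ "${a#* }" = "$blk_size" ] || return 2
    [ "${b#* }" = "$blk_size" ] || return 2
    [ "$a" = "$b" ]
}
```

A call such as `header_replica_matches /dev/xifenfei-sdg && echo "AU 0 and AU 11 headers match"` never writes to the device.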
Simulate complete corruption of the first AU
[grid@localhost ~]$ dd if=/dev/zero of=/dev/xifenfei-sdg bs=1024k count=1 conv=notrunc
1+0 records in
1+0 records out
[grid@localhost ~]$ kfed read /dev/xifenfei-sdg
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
000000000 00000000 00000000 00000000 00000000 [................]
Repeat 255 times
KFED-00322: invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
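Where kfed is not at hand, a quick way to tell whether a block still looks like a disk header is to look for the `ORCLDISK` tag. In the kfed listings in this article the kfbh fields end at offset 0x01c plus four bytes, so kfdhdb.driver.provstr starts at byte 32 of the block. The sketch below uses a hypothetical helper name and is only a coarse check; it does not validate the checksum or any other header field.

```shell
#!/usr/bin/env bash
# Return 0 if the block starting at byte offset $2 (default 0) of $1
# carries the ORCLDISK tag (kfdhdb.driver.provstr), which follows the
# 32-byte kfbh block header. Read-only.
has_asm_header() {
    local dev=$1 offset=${2:-0}
    local tag
    # Strip NUL bytes so a zeroed block yields an empty string.
    tag=$(dd if="$dev" bs=1 skip=$(( offset + 32 )) count=8 2>/dev/null \
          | tr -d '\000')
    [ "$tag" = "ORCLDISK" ]
}
```

On the disk zeroed above, `has_asm_header /dev/xifenfei-sdg` would fail, while `has_asm_header /dev/xifenfei-sdg $((11 * 1048576))` should still succeed because AU 11 was not touched.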
Try to mount the disk group
SQL> alter diskgroup xifenfei mount;
alter diskgroup xifenfei mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "XIFENFEI" cannot be mounted
ORA-15040: diskgroup is incomplete
Alert log
SQL> alter diskgroup xifenfei mount
2017-04-22T08:30:00.889037-04:00
NOTE: cache registered group XIFENFEI 1/0xB15C368B
NOTE: cache began mount (first) of group XIFENFEI 1/0xB15C368B
NOTE: Assigning number (1,1) to disk (/dev/xifenfei-sdh)
2017-04-22T08:30:01.001544-04:00
ERROR: no read quorum in group: required 1, found 0 disks
2017-04-22T08:30:01.001737-04:00
NOTE: cache dismounting (clean) group 1/0xB15C368B (XIFENFEI)
NOTE: messaging CKPT to quiesce pins Unix process pid: 20894, image: oracle@localhost.localdomain (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: LGWR not being messaged to dismount
NOTE: cache dismounted group 1/0xB15C368B (XIFENFEI)
NOTE: cache ending mount (fail) of group XIFENFEI number=1 incarn=0xb15c368b
NOTE: cache deleting context for group XIFENFEI 1/0xb15c368b
2017-04-22T08:30:01.028825-04:00
GMON dismounting group 1 at 2 for pid 23, osid 20894
2017-04-22T08:30:01.029146-04:00
NOTE: Disk XIFENFEI_0001 in mode 0x8 marked for de-assignment
ERROR: diskgroup XIFENFEI was not mounted
ORA-15032: not all alterations performed
ORA-15017: diskgroup "XIFENFEI" cannot be mounted
ORA-15040: diskgroup is incomplete
2017-04-22T08:30:01.036014-04:00
ERROR: alter diskgroup xifenfei mount
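Because the first AU of xifenfei-sdg was completely overwritten by dd, ASM no longer recognizes that disk (only /dev/xifenfei-sdh is assigned a disk number in the alert log), and the xifenfei disk group fails to mount with ORA-15040: diskgroup is incomplete.
Restore from the backup AU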
[grid@localhost ~]$ dd if=/dev/xifenfei-sdg skip=11 bs=1024k count=1 of=/tmp/sdg_header
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.048576 s, 21.6 MB/s
[grid@localhost ~]$ kfed read /tmp/sdg_header |more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 2 ; 0x003: 0x02
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 3085718230 ; 0x00c: 0xb7ec52d6
kfbh.fcn.base: 41 ; 0x010: 0x00000029
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]: 0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]: 0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 203423744 ; 0x020: 0x0c200000
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: XIFENFEI_0000 ; 0x028: length=13
kfdhdb.grpname: XIFENFEI ; 0x048: length=8
kfdhdb.fgname: XIFENFEI_0000 ; 0x068: length=13
[grid@localhost ~]$ dd if=/tmp/sdg_header of=/dev/xifenfei-sdg bs=1024k count=1 conv=notrunc
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0262761 s, 39.9 MB/s
[grid@localhost ~]$ kfed read /dev/xifenfei-sdg|more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 2 ; 0x003: 0x02
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 3085718230 ; 0x00c: 0xb7ec52d6
kfbh.fcn.base: 41 ; 0x010: 0x00000029
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]: 0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]: 0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 203423744 ; 0x020: 0x0c200000
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: XIFENFEI_0000 ; 0x028: length=13
kfdhdb.grpname: XIFENFEI ; 0x048: length=8
kfdhdb.fgname: XIFENFEI_0000 ; 0x068: length=13
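Since the restore writes to a raw device, it can help to lay the steps out before running any of them. The sketch below only prints the commands and never executes them. `print_restore_plan` is a hypothetical helper, the file names under /tmp are placeholders, and the extra first step (saving the damaged AU 0) is an added precaution that is not part of the procedure shown above.

```shell
#!/usr/bin/env bash
# Print (do not run) the commands for restoring AU 0 from the AU 11 copy.
# Assumes a 1 MiB allocation unit unless a size is passed as $2.
print_restore_plan() {
    local dev=$1 au_size=${2:-1048576}
    local name
    name=$(basename "$dev")
    # 1. Keep whatever is currently in AU 0 so the restore can be undone.
    echo "dd if=$dev of=/tmp/${name}_au0.damaged bs=$au_size count=1"
    # 2. Extract the replica from AU 11.
    echo "dd if=$dev of=/tmp/${name}_au11.copy bs=$au_size skip=11 count=1"
    # 3. Confirm the copy starts with a disk header before writing anything.
    echo "kfed read /tmp/${name}_au11.copy | grep KFBTYP_DISKHEAD"
    # 4. Write the copy over AU 0 without truncating the target.
    echo "dd if=/tmp/${name}_au11.copy of=$dev bs=$au_size count=1 conv=notrunc"
}
```

Running `print_restore_plan /dev/xifenfei-sdg` lists four commands that can be reviewed and then executed one at a time.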
The xifenfei disk group now mounts successfully
[grid@localhost ~]$ sqlplus / as sysasm
SQL*Plus: Release 12.2.0.1.0 Production on Sat Apr 22 08:34:53 2017
Copyright (c) 1982, 2016, Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> alter diskgroup xifenfei mount;
Diskgroup altered.
ASM alert log
SQL> alter diskgroup xifenfei mount
2017-04-22T08:34:59.298838-04:00
NOTE: cache registered group XIFENFEI 1/0xFA6C368E
NOTE: cache began mount (first) of group XIFENFEI 1/0xFA6C368E
NOTE: Assigning number (1,0) to disk (/dev/xifenfei-sdg)
NOTE: Assigning number (1,1) to disk (/dev/xifenfei-sdh)
2017-04-22T08:35:05.447528-04:00
NOTE: GMON heartbeating for grp 1 (XIFENFEI)
GMON querying group 1 at 5 for pid 23, osid 21195
2017-04-22T08:35:05.449557-04:00
NOTE: cache is mounting group XIFENFEI created on 2017/04/22 08:13:39
NOTE: cache opening disk 0 of grp 1: XIFENFEI_0000 path:/dev/xifenfei-sdg
NOTE: 04/22/17 08:35:04 XIFENFEI.F1X0 found on disk 0 au 2 fcn 0.0 datfmt 2
NOTE: cache opening disk 1 of grp 1: XIFENFEI_0001 path:/dev/xifenfei-sdh
2017-04-22T08:35:05.450316-04:00
NOTE: cache mounting (first) external redundancy group 1/0xFA6C368E (XIFENFEI)
NOTE: cache recovered group 1 to fcn 0.352
NOTE: redo buffer size is 256 blocks (1056768 bytes)
2017-04-22T08:35:05.504356-04:00
NOTE: LGWR attempting to mount thread 1 for diskgroup 1 (XIFENFEI)
NOTE: LGWR found thread 1 closed at ABA 2.63 lock domain=0 inc#=0 instnum=1
NOTE: LGWR mounted thread 1 for diskgroup 1 (XIFENFEI)
2017-04-22T08:35:05.555647-04:00
NOTE: LGWR opened thread 1 (XIFENFEI) at fcn 0.352 ABA 3.64 lock domain=1 inc#=0 instnum=1 gx.incarn=4201395854 mntstmp=2017/04/22 08:35:05.510000
2017-04-22T08:35:05.556006-04:00
NOTE: cache mounting group 1/0xFA6C368E (XIFENFEI) succeeded
NOTE: cache ending mount (success) of group XIFENFEI number=1 incarn=0xfa6c368e
2017-04-22T08:35:05.596616-04:00
NOTE: Instance updated compatible.asm to 12.2.0.0.0 for grp 1 (XIFENFEI).
2017-04-22T08:35:05.599181-04:00
NOTE: Instance updated compatible.rdbms to 10.1.0.0.0 for grp 1 (XIFENFEI).
2017-04-22T08:35:05.608332-04:00
SUCCESS: diskgroup XIFENFEI was mounted
2017-04-22T08:35:05.635588-04:00
SUCCESS: alter diskgroup xifenfei mount