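Contact: Mobile/WeChat (+86 17813235971) QQ (107644445)
Title: ORA-15130: diskgroup "ORADATA" is being dismounted
Author: 惜分飞 © All rights reserved [No reproduction in any form without my consent; I reserve the right to pursue legal liability.]
The disk group dismounts again immediately after it is mounted.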
Sat Dec 25 17:48:45 2021
SQL> alter diskgroup ORADATA mount
NOTE: cache registered group ORADATA number=5 incarn=0xd4b7ac6a
NOTE: cache began mount (first) of group ORADATA number=5 incarn=0xd4b7ac6a
NOTE: Assigning number (5,24) to disk (/dev/mapper/data31)
NOTE: Assigning number (5,26) to disk (/dev/mapper/data33)
NOTE: Assigning number (5,21) to disk (/dev/mapper/data29)
NOTE: Assigning number (5,23) to disk (/dev/mapper/data30)
NOTE: Assigning number (5,25) to disk (/dev/mapper/data32)
NOTE: Assigning number (5,19) to disk (/dev/mapper/data27)
NOTE: Assigning number (5,20) to disk (/dev/mapper/data28)
NOTE: Assigning number (5,18) to disk (/dev/mapper/data26)
NOTE: Assigning number (5,14) to disk (/dev/mapper/data22)
NOTE: Assigning number (5,17) to disk (/dev/mapper/data25)
NOTE: Assigning number (5,16) to disk (/dev/mapper/data24)
NOTE: Assigning number (5,15) to disk (/dev/mapper/data23)
NOTE: Assigning number (5,13) to disk (/dev/mapper/data21)
NOTE: Assigning number (5,12) to disk (/dev/mapper/data20)
NOTE: Assigning number (5,10) to disk (/dev/mapper/data19)
NOTE: Assigning number (5,9) to disk (/dev/mapper/data18)
NOTE: Assigning number (5,8) to disk (/dev/mapper/data17)
NOTE: Assigning number (5,3) to disk (/dev/mapper/data12)
NOTE: Assigning number (5,22) to disk (/dev/mapper/data3)
NOTE: Assigning number (5,2) to disk (/dev/mapper/data11)
NOTE: Assigning number (5,7) to disk (/dev/mapper/data16)
NOTE: Assigning number (5,28) to disk (/dev/mapper/data5)
NOTE: Assigning number (5,32) to disk (/dev/mapper/data9)
NOTE: Assigning number (5,6) to disk (/dev/mapper/data15)
NOTE: Assigning number (5,5) to disk (/dev/mapper/data14)
NOTE: Assigning number (5,4) to disk (/dev/mapper/data13)
NOTE: Assigning number (5,1) to disk (/dev/mapper/data10)
NOTE: Assigning number (5,30) to disk (/dev/mapper/data7)
NOTE: Assigning number (5,29) to disk (/dev/mapper/data6)
NOTE: Assigning number (5,31) to disk (/dev/mapper/data8)
NOTE: Assigning number (5,11) to disk (/dev/mapper/data2)
NOTE: Assigning number (5,27) to disk (/dev/mapper/data4)
NOTE: Assigning number (5,0) to disk (/dev/mapper/data1)
Sat Dec 25 17:48:52 2021
NOTE: GMON heartbeating for grp 5
GMON querying group 5 at 153 for pid 32, osid 68608
NOTE: cache opening disk 0 of grp 5: ORADATA_0000 path:/dev/mapper/data1
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache opening disk 1 of grp 5: ORADATA_0001 path:/dev/mapper/data10
NOTE: cache opening disk 2 of grp 5: ORADATA_0002 path:/dev/mapper/data11
NOTE: cache opening disk 3 of grp 5: ORADATA_0003 path:/dev/mapper/data12
NOTE: cache opening disk 4 of grp 5: ORADATA_0004 path:/dev/mapper/data13
NOTE: cache opening disk 5 of grp 5: ORADATA_0005 path:/dev/mapper/data14
NOTE: cache opening disk 6 of grp 5: ORADATA_0006 path:/dev/mapper/data15
NOTE: cache opening disk 7 of grp 5: ORADATA_0007 path:/dev/mapper/data16
NOTE: cache opening disk 8 of grp 5: ORADATA_0008 path:/dev/mapper/data17
NOTE: cache opening disk 9 of grp 5: ORADATA_0009 path:/dev/mapper/data18
NOTE: cache opening disk 10 of grp 5: ORADATA_0010 path:/dev/mapper/data19
NOTE: cache opening disk 11 of grp 5: ORADATA_0011 path:/dev/mapper/data2
NOTE: cache opening disk 12 of grp 5: ORADATA_0012 path:/dev/mapper/data20
NOTE: cache opening disk 13 of grp 5: ORADATA_0013 path:/dev/mapper/data21
NOTE: cache opening disk 14 of grp 5: ORADATA_0014 path:/dev/mapper/data22
NOTE: cache opening disk 15 of grp 5: ORADATA_0015 path:/dev/mapper/data23
NOTE: cache opening disk 16 of grp 5: ORADATA_0016 path:/dev/mapper/data24
NOTE: cache opening disk 17 of grp 5: ORADATA_0017 path:/dev/mapper/data25
NOTE: cache opening disk 18 of grp 5: ORADATA_0018 path:/dev/mapper/data26
NOTE: cache opening disk 19 of grp 5: ORADATA_0019 path:/dev/mapper/data27
NOTE: cache opening disk 20 of grp 5: ORADATA_0020 path:/dev/mapper/data28
NOTE: cache opening disk 21 of grp 5: ORADATA_0021 path:/dev/mapper/data29
NOTE: cache opening disk 22 of grp 5: ORADATA_0022 path:/dev/mapper/data3
NOTE: cache opening disk 23 of grp 5: ORADATA_0023 path:/dev/mapper/data30
NOTE: cache opening disk 24 of grp 5: ORADATA_0024 path:/dev/mapper/data31
NOTE: cache opening disk 25 of grp 5: ORADATA_0025 path:/dev/mapper/data32
NOTE: cache opening disk 26 of grp 5: ORADATA_0026 path:/dev/mapper/data33
NOTE: cache opening disk 27 of grp 5: ORADATA_0027 path:/dev/mapper/data4
NOTE: cache opening disk 28 of grp 5: ORADATA_0028 path:/dev/mapper/data5
NOTE: cache opening disk 29 of grp 5: ORADATA_0029 path:/dev/mapper/data6
NOTE: cache opening disk 30 of grp 5: ORADATA_0030 path:/dev/mapper/data7
NOTE: cache opening disk 31 of grp 5: ORADATA_0031 path:/dev/mapper/data8
NOTE: cache opening disk 32 of grp 5: ORADATA_0032 path:/dev/mapper/data9
NOTE: cache mounting (first) external redundancy group 5/0xD4B7AC6A (ORADATA)
Sat Dec 25 17:48:52 2021
* allocate domain 5, invalid = TRUE
kjbdomatt send to inst 2
Sat Dec 25 17:48:52 2021
NOTE: attached to recovery domain 5
NOTE: starting recovery of thread=1 ckpt=92.6417 group=5 (ORADATA)
NOTE: advancing ckpt for group 5 (ORADATA) thread=1 ckpt=92.6418
NOTE: cache recovered group 5 to fcn 0.9502919
NOTE: redo buffer size is 256 blocks (1053184 bytes)
Sat Dec 25 17:48:52 2021
NOTE: LGWR attempting to mount thread 1 for diskgroup 5 (ORADATA)
NOTE: LGWR found thread 1 closed at ABA 92.6417
NOTE: LGWR mounted thread 1 for diskgroup 5 (ORADATA)
NOTE: LGWR opening thread 1 at fcn 0.9502919 ABA 93.6418
NOTE: cache mounting group 5/0xD4B7AC6A (ORADATA) succeeded
NOTE: cache ending mount (success) of group ORADATA number=5 incarn=0xd4b7ac6a
Sat Dec 25 17:48:53 2021
NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 5
SUCCESS: diskgroup ORADATA was mounted
SUCCESS: alter diskgroup ORADATA mount
Sat Dec 25 17:48:53 2021
NOTE: diskgroup resource ora.ORADATA.dg is online
WARNING:cache read a corrupt block: group=5(ORADATA)dsk=5 blk=2 disk=5(ORADATA_0005)incarn=2406 au=0 blk=2 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_48956.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
NOTE: a corrupted block from group ORADATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_48956.trc
WARNING:cache read(retry)a corrupt block:group=5(ORADATA)dsk=5 blk=2 disk=5(ORADATA_0005)incarn=2406 au=0 blk=2 count=1
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_48956.trc:
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
ERROR: cache failed to read group=5(ORADATA) dsk=5 blk=2 from disk(s): 5(ORADATA_0005)
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
NOTE: cache initiating offline of disk 5 group ORADATA
NOTE: process _rbal_+asm1 (48956) initiating offline of disk 5.240607694 (ORADATA_0005) with mask 0x7e in group 5
NOTE: initiating PST update: grp = 5, dsk = 5/0xe5761ce, mask = 0x6a, op = clear
GMON updating disk modes for group 5 at 155 for pid 18, osid 48956
ERROR: Disk 5 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 5)
Sat Dec 25 17:48:55 2021
NOTE: cache dismounting (not clean) group 5/0xD4B7AC6A (ORADATA)
WARNING: Offline for disk ORADATA_0005 in mode 0x7f failed.
Sat Dec 25 17:48:55 2021
NOTE: halting all I/Os to diskgroup 5 (ORADATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 22744, image: oracle@wxzldb1 (B000)
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_48956.trc (incident=1289754):
ORA-15335: ASM metadata corruption detected in disk group 'ORADATA'
ORA-15130: diskgroup "ORADATA" is being dismounted
ORA-15066: offlining disk "ORADATA_0005" in group "ORADATA" may result in a data loss
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483653] [2] [0 != 1]
Incident details in: /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_1289754/+ASM1_rbal_48956_i1289754.trc
NOTE: LGWR doing non-clean dismount of group 5 (ORADATA)
NOTE: LGWR sync ABA=93.6418 last written ABA 93.6418
kjbdomdet send to inst 2
detach from dom 5, sending detach message to inst 2
Sat Dec 25 17:48:56 2021
List of instances:
 1 2
Dirty detach reconfiguration started (new ddet inc 1, cluster inc 4)
Sat Dec 25 17:48:56 2021
Sweep [inc][1289754]: completed
Global Resource Directory partially frozen for dirty detach
* dirty detach - domain 5 invalid = TRUE
 41 GCS resources traversed, 0 cancelled
Dirty Detach Reconfiguration complete
freeing rdom 5
System State dumped to trace file /u01/app/grid/diag/asm/+asm/+ASM1/incident/incdir_1289754/+ASM1_rbal_48956_i1289754.trc
WARNING: dirty detached from domain 5
NOTE: cache dismounted group 5/0xD4B7AC6A (ORADATA)
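When an alert log is this long, the coordinates of the bad block are easy to miss. The following is a minimal Python sketch, not part of the original recovery, that extracts them from the "cache read a corrupt block" warnings. The regular expression is based only on the two warning lines shown in the log above; other ASM versions may word the message differently, and the function name find_corrupt_blocks is purely illustrative.

```python
import re

# Pattern modelled on the two WARNING lines in the log above. Whitespace is
# matched loosely because the two lines differ in spacing. This is an
# assumption about the message format, not an official specification.
CORRUPT_RE = re.compile(
    r"WARNING:\s*cache read\s*(?:\(retry\))?\s*a corrupt block:\s*"
    r"group=\d+\((?P<group>\w+)\)\s*"
    r"dsk=\d+\s+blk=\d+\s+"
    r"disk=\d+\((?P<disk>\w+)\)\s*incarn=\d+\s+"
    r"au=(?P<au>\d+)\s+blk=(?P<blk>\d+)"
)


def find_corrupt_blocks(log_text):
    """Return unique (group, disk, au, blk) tuples in order of appearance."""
    found = []
    for match in CORRUPT_RE.finditer(log_text):
        entry = (
            match.group("group"),
            match.group("disk"),
            int(match.group("au")),
            int(match.group("blk")),
        )
        # The retry warning repeats the same block, so report it only once.
        if entry not in found:
            found.append(entry)
    return found
```

Run against the log above, the first and the retry warning collapse into a single entry for ORADATA_0005, au 0, blk 2.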
The cause is fairly clear: the block at disk=5 au=0 blk=2 is corrupt, so the disk group fails right after it is mounted. Use kfed to examine that block:
C:\Users\XFF>kfed read h:\temp\asmdisk\data14.dd|more
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483653 ; 0x008: disk=5
kfbh.check: 314993330 ; 0x00c: 0x12c66ab2
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]: 0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]: 0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum: 5 ; 0x024: 0x0005
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: ORADATA_0005 ; 0x028: length=12
kfdhdb.grpname: ORADATA ; 0x048: length=7
kfdhdb.fgname: ORADATA_0005 ; 0x068: length=12

C:\Users\XFF>kfed read h:\temp\asmdisk\data14.dd aun=0 blkn=2|more
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
0066D8200 00000000 00000000 00000000 00000000 [................]
Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
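The same header check can be reproduced without kfed on a dd image of the disk. The sketch below is a hedged illustration: the field offsets (0x000 through 0x01c) are taken from the kfed output above, while the 1 MiB allocation unit size, the 4096-byte metadata block size, and little-endian byte order are assumptions that should be confirmed for the actual disk group. All function and constant names are hypothetical.

```python
import struct

AU_SIZE = 1024 * 1024  # assumption: 1 MiB allocation unit
BLOCK_SIZE = 4096      # assumption: 4 KiB ASM metadata block

# 32-byte kfbh header, laid out per the offsets printed by kfed above.
# "<" means little-endian, which is an assumption about the platform.
KFBH = struct.Struct("<BBBBIIIIIII")
FIELDS = ("endian", "hard", "type", "datfmt", "blk", "obj",
          "check", "fcn_base", "fcn_wrap", "spare1", "spare2")


def block_offset(aun, blkn, au_size=AU_SIZE, block_size=BLOCK_SIZE):
    """Byte offset of block blkn inside allocation unit aun."""
    return aun * au_size + blkn * block_size


def parse_kfbh(block):
    """Decode the first 32 bytes of a metadata block into a dict."""
    return dict(zip(FIELDS, KFBH.unpack_from(block, 0)))


def looks_zeroed(header):
    """True when every header field is 0, as in the aun=0 blkn=2 dump above."""
    return all(value == 0 for value in header.values())


def read_kfbh(path, aun, blkn):
    """Read one block from a disk image file and decode its header."""
    with open(path, "rb") as image:
        image.seek(block_offset(aun, blkn))
        return parse_kfbh(image.read(BLOCK_SIZE))

# Example call, using the image path from the kfed session above:
#   header = read_kfbh(r"h:\temp\asmdisk\data14.dd", 0, 2)
#   print(header, looks_zeroed(header))
```

For au=0 the assumed AU size has no effect on the offset, so only the block size assumption matters in this case. A header whose endian field is 0 where 1 is expected matches the `[endian_kfbh] ... [0 != 1]` arguments in the ORA-15196 messages.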
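The kfed output confirms that the block is indeed corrupt. This block mainly records AU allocation information. As long as the space in the ASM disk group does not change and no rebalance runs, ASM generally does not access this block on its own, and if the block is not accessed the disk group does not dismount. Following this line of reasoning, the problem was solved with a patch: once the ORADATA disk group no longer performs rebalance or allocates/frees space, it stays mounted stably.
The database then opened directly, with zero data loss.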