联系:手机/微信(+86 17813235971) QQ(107644445)
作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]
有朋友的asm磁盘组因为以前遗留问题(在另外一套机器上的asm disk被加入到了一个新的asm磁盘组中,导致老的dg直接dismount,新加入asm disk的磁盘组一直在使用,未听建议进行重建),昨天突然意外dismount了
Mon Dec 18 08:38:13 2023 NOTE: No asm libraries found in the system ASM Health Checker found 1 new failures Mon Dec 18 08:38:35 2023 NOTE: client his2:his registered, osid 3998514, mbr 0x1 Thu Jan 04 21:44:55 2024 WARNING: cache read a corrupt block: group=2(DATA) fn=1 blk=6743 disk=8 (DATA_0008) incarn=1428496145 au=3 blk=87 count=1 Errors in file /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4915366.trc: ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743] NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4915366.trc WARNING: cache read (retry) a corrupt block: group=2(DATA) fn=1 blk=6743 disk=8 (DATA_0008) incarn=1428496145 au=3 blk=87 count=1 Errors in file /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4915366.trc: ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743] ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743] ERROR: cache failed to read group=2(DATA) fn=1 blk=6743 from disk(s): 8(DATA_0008) ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743] ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743] NOTE: cache initiating offline of disk 8 group DATA NOTE: process _user4915366_+asm4 (4915366) initiating offline of disk 8.1428496145 (DATA_0008) with mask 0x7e in group 2 NOTE: initiating PST update: grp = 2, dsk = 8/0x55251f11, mask = 0x6a, op = clear Thu Jan 04 21:44:55 2024 GMON updating disk modes for group 2 at 9 for pid 24, osid 4915366 ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy. ERROR: too many offline disks in PST (grp 2) Thu Jan 04 21:44:55 2024 NOTE: cache dismounting (not clean) group 2/0x7F35EE0E (DATA) WARNING: Offline for disk DATA_0008 in mode 0x7f failed. Thu Jan 04 21:44:55 2024 NOTE: halting all I/Os to diskgroup 2 (DATA) NOTE: messaging CKPT to quiesce pins Unix process pid: 3473846, image: oracle@zzzx1 (B000) Errors in file /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4915366.trc (incident=4023553): ORA-15335: ASM metadata corruption detected in disk group 'DATA' ORA-15130: diskgroup "DATA" is being dismounted ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743] ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743] Incident details in: /u01/app/grid/diag/asm/+asm/+ASM4/incident/incdir_4023553/+ASM4_ora_4915366_i4023553.trc Thu Jan 04 21:44:57 2024 ERROR: ORA-15130 in COD recovery for diskgroup 2/0x7f35ee0e (DATA) ERROR: ORA-15130 thrown in RBAL for group number 2 Errors in file /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_rbal_2228716.trc: ORA-15130: diskgroup "DATA" is being dismounted
尝试重新mount 磁盘组,片刻之后自动dismount
Thu Jan 04 23:10:35 2024 NOTE: Instance updated compatible.asm to 11.2.0.0.0 for grp 2 SUCCESS: diskgroup DATA was mounted SUCCESS: alter diskgroup data mount Thu Jan 04 23:10:42 2024 NOTE: diskgroup resource ora.DATA.dg is online Thu Jan 04 23:10:47 2024 NOTE: client his2:his registered, osid 3998052, mbr 0x1 Thu Jan 04 23:11:00 2024 WARNING: cache read a corrupt block: group=2(DATA) fn=1 blk=6743 disk=8 (DATA_0008) incarn=1428496181 au=3 blk=87 count=1 Errors in file /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4129826.trc: ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743] NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4129826.trc WARNING: cache read (retry) a corrupt block: group=2(DATA) fn=1 blk=6743 disk=8 (DATA_0008) incarn=1428496181 au=3 blk=87 count=1 Errors in file /u01/app/grid/diag/asm/+asm/+ASM4/trace/+ASM4_ora_4129826.trc: ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743] ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743] ERROR: cache failed to read group=2(DATA) fn=1 blk=6743 from disk(s): 8(DATA_0008) ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743] ORA-15196: invalid ASM block header [kfc.c:26368] [blk_kfbl] [1] [6743] [6999 != 6743] NOTE: cache initiating offline of disk 8 group DATA NOTE: process _user4129826_+asm4 (4129826) initiating offline of disk 8.1428496181 (DATA_0008) with mask 0x7e in group 2 NOTE: initiating PST update: grp = 2, dsk = 8/0x55251f35, mask = 0x6a, op = clear Thu Jan 04 23:11:01 2024 GMON updating disk modes for group 2 at 21 for pid 35, osid 4129826 ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy. ERROR: too many offline disks in PST (grp 2) Thu Jan 04 23:11:01 2024 NOTE: cache dismounting (not clean) group 2/0x1CB5EE3B (DATA) WARNING: Offline for disk DATA_0008 in mode 0x7f failed. NOTE: messaging CKPT to quiesce pins Unix process pid: 5112822, image: oracle@zzzx1 (B000)
从报错信息看是DATA_0008磁盘的au 3 blkn 87的block异常,应该是block 6743被写成了6999导致了该问题
kfbh.endian: 0 ; 0x000: 0x00 kfbh.hard: 130 ; 0x001: 0x82 kfbh.type: 4 ; 0x002: KFBTYP_FILEDIR kfbh.datfmt: 1 ; 0x003: 0x01 kfbh.block.blk: 6999 ; 0x004: blk=6999 kfbh.block.obj: 1 ; 0x008: file=1 kfbh.check: 3317183844 ; 0x00c: 0xc5b83564 kfbh.fcn.base: 165670551 ; 0x010: 0x09dfee97 kfbh.fcn.wrap: 0 ; 0x014: 0x00000000 kfbh.spare1: 0 ; 0x018: 0x00000000 kfbh.spare2: 0 ; 0x01c: 0x00000000 kfffdb.node.incarn: 1145623147 ; 0x000: A=1 NUMM=0x22246935 kfffdb.node.frlist.number: 4294967295 ; 0x004: 0xffffffff kfffdb.node.frlist.incarn: 0 ; 0x008: A=0 NUMM=0x0 kfffdb.hibytes: 0 ; 0x00c: 0x00000000 kfffdb.lobytes: 83482624 ; 0x010: 0x04f9d800
这个处理比较简单吧
kfbh.block.blk: 6999 ; 0x004: blk=6999 修改为 kfbh.block.blk: 6743; 0x004: blk=6743
通过检查确认磁盘组不再dismount,但是由于后续元数据还有问题,导致asm无法创建新的文件,后续建议:在数据库在mount状态下,rman备份,重建该磁盘组