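A customer did not realize that a disk on their Linux host was in use by ASM, so they partitioned it and created an ext4 file system on it. The commands they ran, taken from the shell history: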
600 fdisk -l
601 fdisk /dev/sdb
602 mkfs ext4 /dev/sdb1
603 fdisk -l
604 mkfs -t ext4 /dev/sdb1
605 cd /
606 mkdir u01
607 mount /dev/sdb1 /u01
608 df -h
Checking the disks confirms that sdb is used directly as an ASM disk (asmdisk1): both device nodes have major/minor number 8, 16.
[grid@racdb3 trace]$ ls -l /dev/asm*
brw-rw---- 1 grid asmadmin 8, 16 Sep 30 14:34 /dev/asmdisk1
[grid@racdb3 trace]$ ls -l /dev/sd*
brw-rw---- 1 root disk 8, 0 Jul 27 2021 /dev/sda
brw-rw---- 1 root disk 8, 1 Jul 27 2021 /dev/sda1
brw-rw---- 1 root disk 8, 2 Jul 27 2021 /dev/sda2
brw-rw---- 1 root disk 8, 16 Sep 30 11:23 /dev/sdb
brw-rw---- 1 root disk 8, 17 Sep 30 11:23 /dev/sdb1
brw-rw---- 1 root disk 8, 32 Jul 27 2021 /dev/sdc
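The match can also be checked mechanically, since two block device nodes with the same major and minor number refer to the same underlying device. The following is a minimal sketch that assumes the `ls -l` layout shown in the listings; `group_by_devnum` and `aliases` are illustrative helper names, not part of any Oracle or Linux tool.

```python
import re
from collections import defaultdict

# Matches block-device lines of `ls -l`, for example:
# "brw-rw---- 1 root disk 8, 16 Sep 30 11:23 /dev/sdb"
LS_BLOCK_RE = re.compile(
    r"^b\S*\s+\d+\s+\S+\s+\S+\s+(\d+),\s*(\d+)\s.*?(/dev/\S+)$"
)


def group_by_devnum(listing):
    """Map (major, minor) -> sorted device paths found in `ls -l` output."""
    groups = defaultdict(list)
    for line in listing.splitlines():
        m = LS_BLOCK_RE.match(line.strip())
        if m:
            key = (int(m.group(1)), int(m.group(2)))
            groups[key].append(m.group(3))
    return {k: sorted(v) for k, v in groups.items()}


def aliases(listing):
    """Keep only device numbers that appear under more than one name."""
    return {k: v for k, v in group_by_devnum(listing).items() if len(v) > 1}


# Lines copied from the two listings in this post.
LISTING = """\
brw-rw---- 1 grid asmadmin 8, 16 Sep 30 14:34 /dev/asmdisk1
brw-rw---- 1 root disk 8, 0 Jul 27 2021 /dev/sda
brw-rw---- 1 root disk 8, 16 Sep 30 11:23 /dev/sdb
brw-rw---- 1 root disk 8, 17 Sep 30 11:23 /dev/sdb1
"""

if __name__ == "__main__":
    print(aliases(LISTING))
```

On a live system the same numbers are available without parsing text, through `os.stat(path).st_rdev` together with `os.major` and `os.minor`.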
The ASM alert log reports:
Fri Sep 30 11:31:41 2022
NOTE: SMON starting instance recovery for group DATA domain 1 (mounted)
NOTE: SMON skipping disk 0 - no header
NOTE: cache initiating offline of disk 0 group DATA
NOTE: process _smon_+asm3 (2989) initiating offline of disk 0.3915953109 (DATA_0000) with mask 0x7e in group 1
NOTE: initiating PST update: grp = 1, dsk = 0/0xe968b3d5, mask = 0x6a, op = clear
Fri Sep 30 11:31:41 2022
GMON updating disk modes for group 1 at 4 for pid 17, osid 2989
ERROR: Disk 0 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
Fri Sep 30 11:31:41 2022
NOTE: cache dismounting (not clean) group 1/0x34F84324 (DATA)
WARNING: Offline for disk DATA_0000 in mode 0x7f failed.
Fri Sep 30 11:31:41 2022
NOTE: halting all I/Os to diskgroup 1 (DATA)
ERROR: No disks with F1X0 found on disk group DATA
NOTE: aborting instance recovery of domain 1 due to diskgroup dismount
NOTE: SMON skipping lock domain (1) validation because diskgroup being dismounted
The database alert log reports:
Fri Sep 30 11:31:44 2022
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_lmon_26356.trc:
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.256.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
Fri Sep 30 11:31:45 2022
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_ckpt_26388.trc:
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.257.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.256.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei3/trace/xifenfei3_ckpt_26388.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.257.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-00206: error in writing (block 5, # blocks 1) of control file
ORA-00202: control file: '+DATA/xifenfei/controlfile/current.256.968794097'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
CKPT (ospid: 26388): terminating the instance due to error 221
Use kfed to inspect the damage to the ASM disk:
[root@racdb3 scsi_host]# kfed read /dev/asmdisk1
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F4FAAD45400 00000000 00000000 00000000 00000000 [................]
Repeat 26 times
7F4FAAD455B0 00000000 00000000 45C222C8 01000000 [.........".E....]
7F4FAAD455C0 FE830001 003FFFFF E9D60000 0000FFFF [......?.........]
7F4FAAD455D0 00000000 00000000 00000000 00000000 [................]
Repeat 1 times
7F4FAAD455F0 00000000 00000000 00000000 AA550000 [..............U.]
7F4FAAD45600 00000000 00000000 00000000 00000000 [................]
Repeat 223 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
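The only non-zero bytes in this block sit between offsets 0x1B0 and 0x1FF, which is where an MBR keeps its partition table and boot signature. The sketch below rebuilds those bytes from the dump and decodes the first partition entry. It assumes that kfed prints each 32-bit word as a little-endian integer (an x86 host), which the source does not state; `rebuild_sector0` and `parse_mbr` are illustrative names.

```python
import struct

# 32-bit words printed by kfed for block offsets 0x1B0-0x1FF of AU 0.
# Assumption: each word is shown as a little-endian integer.
WORDS_1B0_1FF = [
    0x00000000, 0x00000000, 0x45C222C8, 0x01000000,  # 0x1B0
    0xFE830001, 0x003FFFFF, 0xE9D60000, 0x0000FFFF,  # 0x1C0
    0x00000000, 0x00000000, 0x00000000, 0x00000000,  # 0x1D0
    0x00000000, 0x00000000, 0x00000000, 0x00000000,  # 0x1E0 ("Repeat 1 times")
    0x00000000, 0x00000000, 0x00000000, 0xAA550000,  # 0x1F0
]


def rebuild_sector0(words):
    """Rebuild the first 512 bytes: zeros up to 0x1B0, then the dumped words."""
    return bytes(0x1B0) + struct.pack("<%dI" % len(words), *words)


def parse_mbr(sector):
    """Decode the boot signature and the first MBR partition entry."""
    boot, _chs_first, ptype, _chs_last, lba_start, sectors = struct.unpack(
        "<B3sB3sII", sector[0x1BE:0x1CE]
    )
    return {
        "signature_ok": sector[0x1FE:0x200] == b"\x55\xaa",
        "boot_flag": boot,
        "type": ptype,
        "lba_start": lba_start,
        "sectors": sectors,
    }


if __name__ == "__main__":
    info = parse_mbr(rebuild_sector0(WORDS_1B0_1FF))
    print({k: (hex(v) if isinstance(v, int) and not isinstance(v, bool) else v)
           for k, v in info.items()})
```

Under that byte-order assumption the entry decodes as a type 0x83 (Linux) partition starting at sector 63 with a valid 0x55AA signature, which is consistent with the `fdisk /dev/sdb` command in the history. The ASM disk header that normally occupies this block is gone.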
[root@racdb3 scsi_host]# kfed read /dev/asmdisk1 aun=2
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F64E77A0400 00000000 00000000 00000000 00000000 [................]
Repeat 223 times
7F64E77A1200 000081F9 000181F9 000281F9 000381F9 [................]
7F64E77A1210 000481F9 000C81F9 000D81F9 001881F9 [................]
7F64E77A1220 002881F9 003E81F9 007981F9 00AB81F9 [..(...>...y.....]
7F64E77A1230 013881F9 016C81F9 044581F9 04B081F9 [..8...l...E.....]
7F64E77A1240 061A81F9 0CD081F9 1E8481F9 00000000 [................]
7F64E77A1250 00000000 00000000 00000000 00000000 [................]
Repeat 26 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
[root@racdb3 scsi_host]# kfed read /dev/asmdisk1 aun=3
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F8D101FF400 00000000 00000000 00000000 00000000 [................]
Repeat 223 times
7F8D10200200 000082F9 000182F9 000282F9 000382F9 [................]
7F8D10200210 000482F9 000C82F9 000D82F9 001882F9 [................]
7F8D10200220 002882F9 003E82F9 007982F9 00AB82F9 [..(...>...y.....]
7F8D10200230 013882F9 016C82F9 044582F9 04B082F9 [..8...l...E.....]
7F8D10200240 061A82F9 0CD082F9 1E8482F9 00000000 [................]
7F8D10200250 00000000 00000000 00000000 00000000 [................]
Repeat 26 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
[root@racdb3 scsi_host]# kfed read /dev/asmdisk1 aun=4
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F142949C400 00000000 00000000 00000000 00000000 [................]
Repeat 223 times
7F142949D200 000083F9 000183F9 000283F9 000383F9 [................]
7F142949D210 000483F9 000C83F9 000D83F9 001883F9 [................]
7F142949D220 002883F9 003E83F9 007983F9 00AB83F9 [..(...>...y.....]
7F142949D230 013883F9 016C83F9 044583F9 04B083F9 [..8...l...E.....]
7F142949D240 061A83F9 0CD083F9 1E8483F9 00000000 [................]
7F142949D250 00000000 00000000 00000000 00000000 [................]
Repeat 26 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
[root@racdb3 scsi_host]# kfed read /dev/asmdisk1 aun=5
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
7F0615CF6400 00000000 00000000 00000000 00000000 [................]
Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
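AUs 2 to 4 each hold the same short table of 32-bit values, shifted by 0x100 from one AU to the next, while AU 5 is entirely zero. One plausible reading, which is an interpretation and not something the source states, is that these are ext4 block numbers pointing into the block groups that keep metadata backups under the `sparse_super` feature (group 1 and the powers of 3, 5 and 7). The sketch checks that arithmetic, assuming 4 KiB blocks with 32768 blocks per group and a hypothetical group count of 16384 (a volume of about 2 TiB).

```python
BLOCKS_PER_GROUP = 32768  # assumption: ext4 default for 4 KiB blocks
GROUP_COUNT = 16384       # assumption: about 2 TiB / 128 MiB per group

# Non-zero words from `kfed read /dev/asmdisk1 aun=2`, in dump order.
AU2_WORDS = [
    0x000081F9, 0x000181F9, 0x000281F9, 0x000381F9,
    0x000481F9, 0x000C81F9, 0x000D81F9, 0x001881F9,
    0x002881F9, 0x003E81F9, 0x007981F9, 0x00AB81F9,
    0x013881F9, 0x016C81F9, 0x044581F9, 0x04B081F9,
    0x061A81F9, 0x0CD081F9, 0x1E8481F9,
]


def sparse_backup_groups(limit):
    """Backup groups under sparse_super: 1 and powers of 3, 5 and 7 below limit."""
    groups = {1}
    for base in (3, 5, 7):
        g = base
        while g < limit:
            groups.add(g)
            g *= base
    return sorted(g for g in groups if g < limit)


def split_block_numbers(words):
    """Split each value into (block group, block offset inside that group)."""
    return [divmod(w, BLOCKS_PER_GROUP) for w in words]


if __name__ == "__main__":
    pairs = split_block_numbers(AU2_WORDS)
    print("groups :", [g for g, _ in pairs])
    print("offsets:", sorted({hex(off) for _, off in pairs}))
    print("match  :", [g for g, _ in pairs] == sparse_backup_groups(GROUP_COUNT))
```

With these assumptions every value in AU 2 lands at offset 0x1F9 inside one of the backup groups, and the group list is exactly the `sparse_super` sequence. The same holds for AUs 3 and 4 with offsets 0x2F9 and 0x3F9. That supports the conclusion that these AUs were overwritten with ext4 metadata by mkfs; which ext4 structure the table belongs to is not established here.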
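The first few AUs of the disk are badly damaged, and the backup copies of the relevant blocks are damaged as well. For this situation, refer directly to these earlier write-ups:
Recovering an ASM disk damaged by dd
Recovering a completely destroyed ASM disk header
Recovering an ASM disk that was partially zeroed
The data files were recovered at the low level and verified as intact.
The redo logs, control files and other files were then recovered using the AU allocation list: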
H:\TEMP\asm-ext4\other>dir
Volume in drive H is SSD-SX
Volume Serial Number is 84EB-F434
Directory of H:\TEMP\asm-ext4\other
2022-09-30 21:52 25,165,824 256.dd
2022-09-30 21:52 25,165,824 257.dd
2022-09-30 23:52 52,429,312 258.dd.1
2022-09-30 23:54 52,429,312 259.dd.1
2022-09-30 23:55 52,429,312 260.dd.1
2022-09-30 23:55 52,429,312 261.dd.1
2022-09-30 23:56 52,429,312 270.dd.1
2022-09-30 23:57 52,429,312 271.dd.1
2022-09-30 23:57 52,429,312 272.dd.1
2022-09-30 23:57 52,429,312 273.dd.1
2022-09-30 23:58 52,429,312 274.dd.1
2022-10-01 00:01 52,429,312 275.dd.1
2022-10-01 00:00 52,429,312 276.dd.1
2022-10-01 00:00 52,429,312 277.dd.1
2022-10-01 00:00 52,429,312 278.dd.1
2022-09-30 23:59 52,429,312 279.dd.1
2022-09-30 23:59 52,429,312 280.dd.1
2022-09-30 23:59 52,429,312 281.dd.1
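Recovering a file from its AU allocation list amounts to copying the listed AUs out of the disk in order and cutting the result at the file size. The sketch below shows only that copy step. It assumes a 1 MiB AU size, an AU list that has already been obtained by other means, and coarse (AU-by-AU) layout; `extract_file` is a hypothetical helper, not the tool used in this recovery.

```python
import io

AU_SIZE = 1024 * 1024  # assumption: 1 MiB allocation units


def extract_file(disk, out, au_list, file_size, au_size=AU_SIZE):
    """Copy the AUs in au_list, in order, from disk to out, up to file_size bytes."""
    remaining = file_size
    for au in au_list:
        if remaining <= 0:
            break
        want = min(au_size, remaining)
        disk.seek(au * au_size)
        chunk = disk.read(want)
        if len(chunk) != want:
            raise ValueError("AU %d lies beyond the end of the image" % au)
        out.write(chunk)
        remaining -= want
    if remaining > 0:
        raise ValueError("AU list too short: %d bytes missing" % remaining)
    return file_size


if __name__ == "__main__":
    # Toy image: six 16-byte "AUs", each filled with its own AU number.
    image = io.BytesIO(b"".join(bytes([i]) * 16 for i in range(6)))
    result = io.BytesIO()
    extract_file(image, result, au_list=[4, 1, 3], file_size=40, au_size=16)
    print(result.getvalue())
```

For a real disk, `disk` would be the device or a dd image opened read-only in binary mode and `out` a new file on a different volume. Files that use fine-grained striping may need an extra de-interleaving step that this sketch ignores.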
Attempt to restore the database on a separate, new machine:
[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Sat Oct 1 10:18:58 2022
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup mount pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 1519898624 bytes
Fixed Size 2253464 bytes
Variable Size 939527528 bytes
Database Buffers 570425344 bytes
Redo Buffers 7692288 bytes
ORA-00227: corrupt block detected in control file: (block 8, # blocks 1)
ORA-00202: control file: '/oradata/256.dd'
The control file is corrupt, so re-create it:
SQL> CREATE CONTROLFILE REUSE DATABASE "xifenfei" NORESETLOGS NOARCHIVELOG
2 MAXLOGFILES 50
3 MAXLOGMEMBERS 5
4 MAXDATAFILES 100
5 MAXINSTANCES 8
6 MAXLOGHISTORY 226
7 LOGFILE
8 group 7 '/oradata/270.dd.1' size 50M,
9 group 8 '/oradata/272.dd.1' size 50M,
10 group 5 '/oradata/274.dd.1' size 50M,
11 group 6 '/oradata/276.dd.1' size 50M,
12 group 3 '/oradata/278.dd.1' size 50M,
13 group 4 '/oradata/280.dd.1' size 50M,
14 group 1 '/oradata/258.dd.1' size 50M,
15 group 2 '/oradata/260.dd.1' size 50M
16 DATAFILE
17 '/oradata/1',
18 '/oradata/2',
19 '/oradata/3',
20 '/oradata/4',
21 '/oradata/5',
22 '/oradata/6',
23 '/oradata/7',
24 '/oradata/8',
25 '/oradata/9',
26 '/oradata/10',
27 '/oradata/11'
28 CHARACTER SET ZHS16GBK
29 ;
Control file created.
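The `size 50M` clauses can be cross-checked against the recovered files. Each `*.dd.1` file in the directory listing is 52,429,312 bytes, which is 50 MiB plus 512 bytes. The sketch below does that arithmetic, assuming the extra 512 bytes are one header block at a redo block size of 512 bytes; the source does not say so, and `split_size` is an illustrative name.

```python
MIB = 1024 * 1024

REDO_FILE_BYTES = 52_429_312     # size of each *.dd.1 file in the listing
CONTROL_FILE_BYTES = 25_165_824  # size of 256.dd and 257.dd in the listing


def split_size(file_bytes, header_bytes=512):
    """Return (whole MiB, leftover bytes) after removing one assumed header block."""
    return divmod(file_bytes - header_bytes, MIB)


if __name__ == "__main__":
    print("redo   :", split_size(REDO_FILE_BYTES))
    print("control:", divmod(CONTROL_FILE_BYTES, MIB))
```

A leftover of zero bytes for every redo file is a cheap sanity check that the extraction stopped at the right length before the file is named in `CREATE CONTROLFILE`.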
Attempting to open the database fails with errors such as ORA-600 [kqfidps_update_stats:2] and ORA-600 [4193]/[4194]:
SQL> recover database;
Media recovery complete.
SQL> alter database open ;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kqfidps_update_stats:2],
[0x7FFCCBEB3EC0], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4193], [19319], [l.ok
After resolving these errors, the database opens successfully:
SQL> startup mount pfile='/tmp/pfile';
ORACLE instance started.
Total System Global Area 1519898624 bytes
Fixed Size 2253464 bytes
Variable Size 939527528 bytes
Database Buffers 570425344 bytes
Redo Buffers 7692288 bytes
Database mounted.
SQL> alter database open;
Database altered.
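Exporting the database hits two kinds of error, ORA-08103 and ORA-01555, on a few tables. The cause is that a few blocks were destroyed when the file system was created, and the low-level recovery left those blocks zeroed. The affected tables only need to be handled individually.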
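. . exporting table ALBUM
EXP-00056: ORACLE error 8103 encountered
ORA-08103: object no longer exists
. . exporting table M_PUSH_CONTENT
EXP-00056: ORACLE error 1555 encountered
ORA-01555: snapshot too old: rollback segment number with name "" too small
ORA-22924: snapshot too old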
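To find the tables that need separate handling, the export log can be scanned for ORA errors that follow a table's progress line. This is a minimal sketch under the assumption that the log has the shape shown in this post (Chinese or English exp messages); `failed_tables` is a hypothetical helper.

```python
import re

# Progress line for a table, in Chinese or English exp output.
TABLE_RE = re.compile(r"^\. \. (?:正在导出表|exporting table)\s+(\S+)")
ORA_RE = re.compile(r"^(ORA-\d{5})")


def failed_tables(log_text):
    """Map table name -> ORA codes reported after that table's progress line."""
    failures = {}
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        m = TABLE_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = ORA_RE.match(line)
        if m and current is not None:
            failures.setdefault(current, []).append(m.group(1))
    return failures


# Log excerpt as captured in the original session.
SAMPLE_LOG = """\
. . 正在导出表 ALBUM
EXP-00056: 遇到 ORACLE 错误 8103
ORA-08103: 对象不再存在
. . 正在导出表 M_PUSH_CONTENT
EXP-00056: 遇到 ORACLE 错误 1555
ORA-01555: 快照过旧: 回退段号 (名称为 "") 过小
ORA-22924: 快照太旧
"""

if __name__ == "__main__":
    for table, codes in failed_tables(SAMPLE_LOG).items():
        print(table, codes)
```

Tables that export cleanly never appear in the result, so the output is a worklist of the objects affected by the zeroed blocks.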
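These steps recovered the customer's data and kept the loss to a minimum. A reminder once more: if an ASM disk has been hit by a mistaken operation, protect the scene immediately. Avoiding any further writes to the disk gives the best chance of recovering the data.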