非常规恢复 – 第11页

ORA-00354 ORA-00353 ORA-00312异常处理

数据库启动报错
WIN平台oracle 9.2.0.6版本数据库redo log block header损坏，ORA-00354 ORA-00353 ORA-00312错误导致数据库无法启动

SQL >alter database open;
*
ERROR at line 1:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 1892904 change 281470950178815
ORA-00312: online log 3 thread 1: 'D:\ORACLE\ORADATA\ZOYO\REDO03.LOG'

Sun Jan 24 15:44:05 2016
Database mounted in Exclusive Mode.
Completed: alter database mount exclusive
Sun Jan 24 15:44:05 2016
alter database open
Sun Jan 24 15:44:05 2016
Beginning crash recovery of 1 threads
Sun Jan 24 15:44:05 2016
Started redo scan
ORA-354 signalled during: alter database open...
Shutting down instance: further logons disabled
Shutting down instance (immediate)
License high water mark = 3
Sun Jan 24 15:44:32 2016
ALTER DATABASE CLOSE NORMAL
ORA-1109 signalled during: ALTER DATABASE CLOSE NORMAL...

通过分析,确定损坏的redo03为当前redo,无法使用正常方法打开,加上_allow_resetlogs_corruption参数,尝试打开库,依旧失败

数据库报ORA-600 2662错误

Sun Jan 24 16:26:30 2016
SMON: enabling cache recovery
Sun Jan 24 16:26:30 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_640.trc:
ORA-00600: 内部错误代码，参数: [2662], [0], [31563641], [0], [31563654], [4194721], [], []
Sun Jan 24 16:26:31 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_640.trc:
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码，参数: [2662], [0], [31563641], [0], [31563654], [4194721], [], []
Sun Jan 24 16:26:31 2016
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 640
ORA-1092 signalled during: alter database open resetlogs...

ORA 600 2662的错误处理
根据经验,这个错误只需要推scn即可,可以通过bbed,隐含参数,event,oradebug,修改控制文件等方法进行,推scn之后,数据库报熟悉的ORA-00604 ORA-00607 ORA-600 4194错误,以前我们遇到的block大部分是128,这次报异常block为9.实际中跟版本有关系,在ORACLE 9.2.0.6中该错误为file 1 block 9.大部分版本为128

Sun Jan 24 16:29:39 2016
SMON: enabling cache recovery
Sun Jan 24 16:29:39 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_3432.trc:
ORA-00600: 内部错误代码，参数: [4194], [14], [5], [], [], [], [], []
Sun Jan 24 16:29:39 2016
Doing block recovery for fno: 1 blk: 401
Sun Jan 24 16:29:39 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ZOYO\REDO01.LOG
Doing block recovery for fno: 1 blk: 9
Sun Jan 24 16:29:40 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ZOYO\REDO01.LOG
Sun Jan 24 16:29:40 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_3432.trc:
ORA-00604: 递归 SQL 层 1 出现错误
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码，参数: [4194], [14], [5], [], [], [], [], []
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 3432

ORA-00604 ORA-00607 ORA-600 4194分析trace文件

*** 2016-01-24 16:29:40.031
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
Block image after block recovery:
buffer tsn: 0 rdba: 0x00400009 (1/9)
scn: 0x0000.01e112e1 seq: 0x01 flg: 0x04 tail: 0x12e10e01
frmt: 0x02 chkval: 0xba76 type: 0x0e=KTU UNDO HEADER W/UNLIMITED EXTENTS
  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      spare2: 0      #extents: 6      #blocks: 47
                  last map  0x00000000  #maps: 0      offset: 4128
      Highwater::  0x00400191  ext#: 4      blk#: 0      ext size: 8
  #blocks in seg. hdr's freelists: 0
  #blocks below: 0
  mapblk  0x00000000  offset: 4
                   Unlocked
     Map Header:: next  0x00000000  #extents: 6    obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x0040000a  length: 7
   0x00400011  length: 8
   0x00400181  length: 8
   0x00400189  length: 8
   0x00400191  length: 8
   0x00400199  length: 8
  TRN CTL:: seq: 0x008e chd: 0x0060 ctl: 0x0024 inc: 0x00000000 nfb: 0x0001
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00400191.008e.04 scn: 0x0000.01ded29c
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00400191.008e.04 ext: 0x4  spc: 0x1c3e
    uba: 0x00000000.002f.21 ext: 0x5  spc: 0x1334
    uba: 0x00000000.002e.37 ext: 0x4  spc: 0x788
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
  TRN TBL::

从这里可以确定undo segment header中的分配block记录有问题,清除ktuxc.fbp.fbp[N].kuba.kdba相关记录,数据库正常打开

    . struct ktuxc  kernel transaction undo xaction table control with 15 members
    . {
    .   struct kscn   scn with 3 members
    .   {
04148     ub4           bas      = 0X9CD2DE01 = 31380124
04152     ub2           wrp      = 0X0000 = 0
04154     cc32          pad      = 0X0000 = 0
    .   }
    .   struct kuba   uba with 4 members
    .   {
04156     kdba          dba      = 0X91014000 = 0x00400191 file 1 block 401
04160     ub2           seq      = 0X8E00 = 142
04162     ub1           rec      = 0X04 = 4
04163     cc16          pad      = 0X00 = 0
    .   }
04164   sb2           flg      = 0X0100 = 1
04166   ub2           seq      = 0X8E00 = 142
04168   sb2           nfb      = 0X0100 = 1
04170   cc32          pad1     = 0X0000 = 0
04172   ub4           inc      = 0X00000000 = 0
04176   sb2           chd      = 0X6000 = 96
04178   sb2           ctl      = 0X2400 = 36
04180   ub2x          mgc      = 0X0280 = 0x8002
04182   ub2           ver      = 0X0100 = 1
04184   ub2           xts      = 0X6800 = 104
04186   cc32          pad2     = 0X0000 = 0
04188   ub4           opt      = 0XFEFFFF7F = 2147483646
    .   ktufb fbp[5] (array with 5 elements)
    .     struct fbp   [0] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04192         kdba          dba      = 0X91014000 = 0x00400191 file 1 block 401
04196         ub2           seq      = 0X8E00 = 142
04198         ub1           rec      = 0X04 = 4
04199         cc16          pad      = 0X00 = 0
    .       }
04200       sb2           ext      = 0X0400 = 4
04202       sb2           spc      = 0X3E1C = 7230
    .     }
    .     struct fbp   [1] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04204         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04208         ub2           seq      = 0X2F00 = 47
04210         ub1           rec      = 0X21 = 33
04211         cc16          pad      = 0X00 = 0
    .       }
04212       sb2           ext      = 0X0500 = 5
04214       sb2           spc      = 0X3413 = 4916
    .     }
    .     struct fbp   [2] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04216         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04220         ub2           seq      = 0X2E00 = 46
04222         ub1           rec      = 0X37 = 55
04223         cc16          pad      = 0X00 = 0
    .       }
04224       sb2           ext      = 0X0400 = 4
04226       sb2           spc      = 0X8807 = 1928
    .     }
    .     struct fbp   [3] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04228         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04232         ub2           seq      = 0X0000 = 0
04234         ub1           rec      = 0X00 = 0
04235         cc16          pad      = 0X00 = 0
    .       }
04236       sb2           ext      = 0X0000 = 0
04238       sb2           spc      = 0X0000 = 0
    .     }
    .     struct fbp   [4] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04240         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04244         ub2           seq      = 0X0000 = 0
04246         ub1           rec      = 0X00 = 0
04247         cc16          pad      = 0X00 = 0
    .       }
04248       sb2           ext      = 0X0000 = 0
04250       sb2           spc      = 0X0000 = 0
    .     }
    . }

Sun Jan 24 16:44:52 2016
SMON: enabling tx recovery
Sun Jan 24 16:44:52 2016
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: ALTER DATABASE OPEN

oracle asm disk格式化恢复—格式化为ntfs文件系统

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：oracle asm disk格式化恢复—格式化为ntfs文件系统

接到网友请求,由于操作人员粗心把asm disk的磁盘映射到另外的机器上,并且格式化为了win ntfs文件系统,导致asm 磁盘组异常,数据库无法使用
asm 日志报ORA-27072错

Mon Nov 30 12:00:13 2015
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27070: async read/write failed
OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 21) 设备未就绪。
WARNING: IO Failed. group:1 disk(number.incarnation):0.0xf0f0bbfb disk_path:\\.\ORCLDISKDATA0
	 AU:1 disk_offset(bytes):2093056 io_size:4096 operation:Write type:synchronous
	 result:I/O error process_id:868
WARNING: disk 0.4042308603 (DATA_0000) not responding to heart beat
ERROR: too many offline disks in PST (grp 1)
WARNING: Disk DATA_0000 in mode 0x7f will be taken offline
Mon Nov 30 12:00:13 2015
NOTE: process 576:37952 initiating offline of disk 0.4042308603 (DATA_0000) with mask 0x7e in group 1
WARNING: Disk DATA_0000 in mode 0x7f is now being taken offline
NOTE: initiating PST update: grp = 1, dsk = 0/0xf0f0bbfb, mode = 0x15
kfdp_updateDsk(): 5
kfdp_updateDskBg(): 5
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):1.0xf0f0bbfc disk_path:\\.\ORCLDISKDATA1
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):1.0xf0f0bbfc disk_path:\\.\ORCLDISKDATA1
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):2.0xf0f0bbfd disk_path:\\.\ORCLDISKDATA2
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):2.0xf0f0bbfd disk_path:\\.\ORCLDISKDATA2
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):3.0xf0f0bbfe disk_path:\\.\ORCLDISKDATA3
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):3.0xf0f0bbfe disk_path:\\.\ORCLDISKDATA3
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):4.0xf0f0bbff disk_path:\\.\ORCLDISKDATA4
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):4.0xf0f0bbff disk_path:\\.\ORCLDISKDATA4
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):6.0xf0f0bc01 disk_path:\\.\ORCLDISKDATA6
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):6.0xf0f0bc01 disk_path:\\.\ORCLDISKDATA6
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):7.0xf0f0bc02 disk_path:\\.\ORCLDISKDATA7
	 AU:1 disk_offset(bytes):1048576 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
Errors in file c:\app\administrator\diag\asm\+asm\+asm\trace\+asm_gmon_868.trc:
ORA-27072: File I/O error
WARNING: IO Failed. group:1 disk(number.incarnation):7.0xf0f0bc02 disk_path:\\.\ORCLDISKDATA7
	 AU:1 disk_offset(bytes):1052672 io_size:4096 operation:Read type:synchronous
	 result:I/O error process_id:868
ERROR: no PST quorum in group: required 1, found 0
WARNING: Disk DATA_0000 in mode 0x7f offline aborted
Mon Nov 30 12:00:14 2015
SQL> alter diskgroup DATA dismount force /* ASM SERVER */
NOTE: cache dismounting (not clean) group 1/0xBB404B03 (DATA)
Mon Nov 30 12:00:14 2015
NOTE: halting all I/Os to diskgroup DATA
Mon Nov 30 12:00:14 2015
NOTE: LGWR doing non-clean dismount of group 1 (DATA)
NOTE: LGWR sync ABA=367.7265 last written ABA 367.7265
NOTE: cache dismounted group 1/0xBB404B03 (DATA)
kfdp_dismount(): 6
kfdp_dismountBg(): 6
NOTE: De-assigning number (1,0) from disk (\\.\ORCLDISKDATA0)
NOTE: De-assigning number (1,1) from disk (\\.\ORCLDISKDATA1)
NOTE: De-assigning number (1,2) from disk (\\.\ORCLDISKDATA2)
NOTE: De-assigning number (1,3) from disk (\\.\ORCLDISKDATA3)
NOTE: De-assigning number (1,4) from disk (\\.\ORCLDISKDATA4)
NOTE: De-assigning number (1,5) from disk (\\.\ORCLDISKDATA5)
NOTE: De-assigning number (1,6) from disk (\\.\ORCLDISKDATA6)
NOTE: De-assigning number (1,7) from disk (\\.\ORCLDISKDATA7)
SUCCESS: diskgroup DATA was dismounted
NOTE: cache deleting context for group DATA 1/-1153414397
SUCCESS: alter diskgroup DATA dismount force /* ASM SERVER */
ERROR: PST-initiated MANDATORY DISMOUNT of group DATA

这里的asm日志很明显由于asm disk无法正常访问,报ORA-27072错误,磁盘组强制dismount.

分析磁盘情况
asm-disk1
asm-disk2

通过与客户沟通,确定从I到O本为asm disk 被格式化为了NTFS文件系统的磁盘,结合asmtool分析可以发现还有一个asm disk没有格式化掉,该磁盘组中一个共有8个磁盘格式化掉了7个.

通过kfed分析磁盘信息

C:\Users\Administrator>kfed read '\\.\J:'
kfbh.endian:                        235 ; 0x000: 0xeb
kfbh.hard:                           82 ; 0x001: 0x52
kfbh.type:                          144 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                         78 ; 0x003: 0x4e
kfbh.block.blk:               542328404 ; 0x004: T=0 NUMB=0x20534654
kfbh.block.obj:                 2105376 ; 0x008: TYPE=0x0 NUMB=0x2020
kfbh.check:                        2050 ; 0x00c: 0x00000802
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                    63488 ; 0x014: 0x0000f800
kfbh.spare1:                   16711743 ; 0x018: 0x00ff003f
kfbh.spare2:                       2048 ; 0x01c: 0x00000800
ERROR!!!, failed to get the oracore error message
C:\Users\Administrator>kfed read '\\.\J:' blkn=2
kfbh.endian:                         70 ; 0x000: 0x46
kfbh.hard:                           73 ; 0x001: 0x49
kfbh.type:                           76 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                         69 ; 0x003: 0x45
kfbh.block.blk:                  196656 ; 0x004: T=0 NUMB=0x30030
kfbh.block.obj:                33563364 ; 0x008: TYPE=0x0 NUMB=0x22e4
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                    65537 ; 0x010: 0x00010001
kfbh.fcn.wrap:                    65592 ; 0x014: 0x00010038
kfbh.spare1:                        416 ; 0x018: 0x000001a0
kfbh.spare2:                       1024 ; 0x01c: 0x00000400
ERROR!!!, failed to get the oracore error message
C:\Users\Administrator>kfed read '\\.\J:' blkn=256
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                           13 ; 0x002: KFBTYP_PST_NONE
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:              2147483648 ; 0x004: T=1 NUMB=0x0
kfbh.block.obj:              2147483654 ; 0x008: TYPE=0x8 NUMB=0x6
kfbh.check:                    17662471 ; 0x00c: 0x010d8207
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
ERROR!!!, failed to get the oracore error message
C:\Users\Administrator>kfed read '\\.\J:' blkn=510
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     254 ; 0x004: T=0 NUMB=0xfe
kfbh.block.obj:              2147483654 ; 0x008: TYPE=0x8 NUMB=0x6
kfbh.check:                   717599272 ; 0x00c: 0x2ac5b228
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:    ORCLDISKDATA6 ; 0x000: length=13
kfdhdb.driver.reserved[0]:   1096040772 ; 0x008: 0x41544144
kfdhdb.driver.reserved[1]:           54 ; 0x00c: 0x00000036
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
…………

通过分析,可以确定asm disk的备份block没有被覆盖,原则上可以通过备份block实现磁盘组恢复,从而减小了恢复难度

kfed恢复磁盘头

C:\Users\Administrator> kfed repair '\\.\J:'
C:\Users\Administrator>kfed read '\\.\J:'
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     254 ; 0x004: T=0 NUMB=0xfe
kfbh.block.obj:              2147483654 ; 0x008: TYPE=0x8 NUMB=0x6
kfbh.check:                   717599272 ; 0x00c: 0x2ac5b228
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:    ORCLDISKDATA6 ; 0x000: length=13
kfdhdb.driver.reserved[0]:   1096040772 ; 0x008: 0x41544144
kfdhdb.driver.reserved[1]:           54 ; 0x00c: 0x00000036
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
…………

确定asm disk相关信息
对于7个被格式化的磁盘都进行类似处理之后,通过工具看到相关磁盘信息如下
asm-disk3

恢复处理
根据ntfs的文件系统分布,我们可以知道,虽然asm disk header备份block正常,但是asm disk中间部分依旧有不少au会被破坏
ntfs

这样的情况,不合适直接使用工具拷贝出来datafile(由于可能记录block的字典正好被覆盖,导致拷贝出来的文件异常,在恢复过程中我们也做了试验小文件拷贝ok,大文件拷贝然后使用dbv检测有很多坏块),我们采用工具(asm disk header 彻底损坏恢复)从底层扫描直接重组出来asm disk中的数据文件,然后结合拷贝出来的控制文件,redo文件,参数文件,然后通过重命名相关路径,然后直接open数据库

Q:\>sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on 星期三 1月 22 16:08:18 2014
Copyright (c) 1982, 2010, Oracle.  All rights reserved.
连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> set pages 1000
SQL> col name for a100
SQL> set lines 150
SQL> select file#,name from v$datafile;
     FILE# NAME
---------- --------------------------------------------------------------------
         1 +DATA/vspdb/datafile/system.256.778520603
         2 +DATA/vspdb/datafile/sysaux.257.778520603
         3 +DATA/vspdb/datafile/undotbs1.258.778520603
         4 +DATA/vspdb/datafile/users.259.778520603
         5 +DATA/vspdb/datafile/vsp_tbs.293.779926097
        …………
       147 +DATA/vspdb/datafile/index_dg.418.864665747
       148 +DATA/vspdb/datafile/data_dg.419.864667053
       149 +DATA/vspdb/datafile/vsp_mm_tbs.420.890410367
       150 +DATA/vspdb/datafile/vsp_mm_tbs.421.890410457
SQL> select member from v$logfile;
MEMBER
-------------------------------------------------------------------------------------
+DATA/vspdb/onlinelog/group_7.263.862676593
+DATA/vspdb/onlinelog/group_7.262.862676601
+DATA/vspdb/onlinelog/group_4.410.862652291
+DATA/vspdb/onlinelog/group_4.411.862652307
+DATA/vspdb/onlinelog/group_5.412.862653715
+DATA/vspdb/onlinelog/group_5.413.862653727
+DATA/vspdb/onlinelog/group_6.414.862676425
+DATA/vspdb/onlinelog/group_6.415.862676433
重命名数据文件和redo文件，open数据库
SQL> recover database;
完成介质恢复。
SQL> alter database open;
数据库已更改。
已用时间:  00: 00: 04.51

由于部分block被覆盖,使用空块代替,导致数据访问到该block就会出现ora-8103(模拟普通ORA-08103并解决,模拟极端ORA-08103并解决)错误,对于该种对象,最简单处理方法就是直接通过dul抽出来数据然后truncate table重新导入数据,当然如果你想彻底安全逻辑方式重建库最靠谱
如果您遇到此类情况,无法解决请联系我们，提供专业ORACLE数据库恢复技术支持
Phone:17813235971 Q Q:107644445 E-Mail:dba@xifenfei.com

分享某客户存储异常恢复之后oracle故障恢复—ORA-600 4155

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：分享某客户存储异常恢复之后oracle故障恢复—ORA-600 4155

某客户使用win 2003，Oracle 11.2.0.1+ASM架构方式,由于存储异常并且做了存储恢复之后,ASM可以正常mount起来,但是数据库无法打开
使用dbv检查system发现有少量坏块

DBVERIFY - 开始验证: FILE = +DATA/xifenfei/datafile/system.256.764288125
页 3117 标记为损坏
Corrupt block relative dba: 0x00400c2d (file 1, block 3117)
Bad header found during dbv:
Data in bad block:
 type: 11 format: 2 rdba: 0x00400001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000b01
 check value in block header: 0xfeec
 computed block checksum: 0x0
Corrupt block relative dba: 0x0042002d (file 1, block 131117)
Bad header found during dbv:
Data in bad block:
 type: 11 format: 2 rdba: 0x00400001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000b01
 check value in block header: 0xfeec
 computed block checksum: 0x0
Corrupt block relative dba: 0x0042003d (file 1, block 131133)
Bad header found during dbv:
Data in bad block:
 type: 11 format: 2 rdba: 0x00400001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000b01
 check value in block header: 0xfeec
 computed block checksum: 0x0
DBVERIFY - 验证完成
检查的页总数: 222208
处理的页总数 (数据): 188939
失败的页总数 (数据): 19
处理的页总数 (索引): 17375
失败的页总数 (索引): 0
处理的页总数 (其他): 3190
处理的总页数 (段)  : 1
失败的总页数 (段)  : 0
空的页总数: 12701
标记为损坏的总页数: 3
流入的页总数: 0
加密的总页数        : 0
最高块 SCN            : 0 (0.0)

很多”页 131125 失败, 校验代码为 6125″类似错误忽略.
我们对于这些坏块进行分析，这些坏块未涉及oracle 最核心的基表数据,从理论上可以open数据库

尝试打开数据库

ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 32 processes
Started redo scan
Mon Nov 16 14:12:45 2015
NOTE: dependency between database xifenfei and diskgroup resource ora.DATA.dg is established
Errors in file d:\oracle\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_5672.trc  (incident=937262):
ORA-00353: 日志损坏接近块 12509 更改 14199034494312 时间 02/05/2015 03:09:12
ORA-00312: 联机日志 2 线程 1: '+DATA/xifenfei/onlinelog/group_2.265.764288315'
ORA-00312: 联机日志 2 线程 1: '+DATA/xifenfei/onlinelog/group_2.264.764288315'
Incident details in: d:\oracle\diag\rdbms\xifenfei\xifenfei\incident\incdir_937262\xifenfei_ora_5672_i937262.trc
Media Recovery failed with error 399
ORA-355 signalled during: ALTER DATABASE RECOVER  DATABASE  ...

可以确定存储恢复的redo有问题(ORA-00353,ORA-00312),数据库无法直接打开

使用参数_allow_resetlogs_corruption屏蔽redo异常resetlogs库

Mon Nov 16 15:26:41 2015
alter database open resetlogs
Mon Nov 16 15:26:41 2015
Starting background process ASMB
Mon Nov 16 15:26:41 2015
ASMB started with pid=25, OS id=6612
Starting background process RBAL
Mon Nov 16 15:26:41 2015
RBAL started with pid=26, OS id=6940
NOTE: initiating MARK startup
Starting background process MARK
Mon Nov 16 15:26:41 2015
MARK started with pid=27, OS id=1720
NOTE: MARK has subscribed
NOTE: Loaded library: System
SUCCESS: diskgroup DATA was mounted
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
Mon Nov 16 15:26:44 2015
NOTE: dependency between database xifenfei and diskgroup resource ora.DATA.dg is established
Archived Log entry 1 added for thread 1 sequence 1390429 ID 0xf6320db5 dest 1:
Archived Log entry 2 added for thread 1 sequence 1390427 ID 0xf6320db5 dest 1:
ARCH: Log corruption near block 16350 change 14199035261082 time ?
CORRUPTION DETECTED: thread 1 sequence 1390428 log 3 at block 16350. Arch found corrupt blocks
Errors in file d:\oracle\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_8132.trc  (incident=951271):
ORA-00353: 日志损坏接近块 16350 更改 14199035261082 时间 02/05/2015 03:12:49
ORA-00312: 联机日志 3 线程 1: '+DATA/xifenfei/onlinelog/group_3.267.764288315'
ORA-00312: 联机日志 3 线程 1: '+DATA/xifenfei/onlinelog/group_3.266.764288315'
Incident details in: d:\oracle\diag\rdbms\xifenfei\xifenfei\incident\incdir_951271\xifenfei_ora_8132_i951271.trc
Errors in file d:\oracle\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_8132.trc:
ORA-00354: 损坏重做日志块标头
ORA-00353: 日志损坏接近块 16350 更改 14199035261082 时间 02/05/2015 03:12:49
ORA-00312: 联机日志 3 线程 1: '+DATA/xifenfei/onlinelog/group_3.267.764288315'
ORA-00312: 联机日志 3 线程 1: '+DATA/xifenfei/onlinelog/group_3.266.764288315'
ARCH: All Archive destinations made inactive due to error 354
Committing creation of archivelog '+DATA/xifenfei/archivelog/2015_11_16/thread_1_seq_1390428.276.895937207' (error 354)
Deleted Oracle managed file +DATA/xifenfei/archivelog/2015_11_16/thread_1_seq_1390428.276.895937207
******************************************************
Detected premature EOF of log 3 at block 16350; re-trying archival
******************************************************
Mon Nov 16 15:26:49 2015
Sweep [inc][951271]: completed
Mon Nov 16 15:26:49 2015
Trace dumping is performing id=[cdmp_20151116152649]
Archived Log entry 3 added for thread 1 sequence 1390428 ID 0xf6320db5 dest 1:
RESETLOGS after incomplete recovery UNTIL CHANGE 14199033899179
Resetting resetlogs activation ID 4130475445 (0xf6320db5)
Errors in file d:\oracle\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_m001_800.trc  (incident=951319):
ORA-00353: log corruption near block 2270 change 14199035131016 time 02/05/2015 03:12:58
ORA-00334: archived log: '+DATA/xifenfei/archivelog/2015_11_16/thread_1_seq_1390428.276.895937209'
Incident details in: d:\oracle\diag\rdbms\xifenfei\xifenfei\incident\incdir_951319\xifenfei_m001_800_i951319.trc
Errors in file d:\oracle\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_m001_800.trc  (incident=951320):
ORA-00355: change numbers out of order
ORA-00353: log corruption near block 2270 change 14199035131016 time 02/05/2015 03:12:58
ORA-00334: archived log: '+DATA/xifenfei/archivelog/2015_11_16/thread_1_seq_1390428.276.895937209'
Incident details in: d:\oracle\diag\rdbms\xifenfei\xifenfei\incident\incdir_951320\xifenfei_m001_800_i951320.trc
Trace dumping is performing id=[cdmp_20151116152651]
Mon Nov 16 15:26:51 2015
Sweep [inc][951320]: completed
Sweep [inc][951319]: completed
Sweep [inc2][951320]: completed
Sweep [inc2][951319]: completed
Sweep [inc2][951271]: completed
Trace dumping is performing id=[cdmp_20151116152653]
Checker run found 1 new persistent data failures
Mon Nov 16 15:26:53 2015
Setting recovery target incarnation to 2
Mon Nov 16 15:26:54 2015
Assigning activation ID 4262085362 (0xfe0a42f2)
LGWR: STARTING ARCH PROCESSES
Mon Nov 16 15:26:54 2015
ARC0 started with pid=29, OS id=2896
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Mon Nov 16 15:26:55 2015
ARC1 started with pid=30, OS id=1748
Mon Nov 16 15:26:55 2015
ARC2 started with pid=31, OS id=1920
ARC1: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: +DATA/xifenfei/onlinelog/group_1.262.764288315
  Current log# 1 seq# 1 mem# 1: +DATA/xifenfei/onlinelog/group_1.263.764288315
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Nov 16 15:26:55 2015
SMON: enabling cache recovery
Mon Nov 16 15:26:55 2015
ARC3 started with pid=32, OS id=7236
ARC2: Archival started
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ARC0: Becoming the heartbeat ARCH
Errors in file d:\oracle\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_8132.trc  (incident=951272):
ORA-00600: 内部错误代码, 参数: [4155], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\xifenfei\xifenfei\incident\incdir_951272\xifenfei_ora_8132_i951272.trc
Errors in file d:\oracle\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_8132.trc:
ORA-00600: 内部错误代码, 参数: [4155], [], [], [], [], [], [], [], [], [], [], []
Errors in file d:\oracle\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_8132.trc:
ORA-00600: 内部错误代码, 参数: [4155], [], [], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
Trace dumping is performing id=[cdmp_20151116152658]
USER (ospid: 8132): terminating the instance due to error 600
Mon Nov 16 15:27:07 2015
Instance terminated by USER, pid = 8132
ORA-1092 signalled during: alter database open resetlogs...

在resetlogs过程中由于遭遇了ORA-600[4155]导致数据库无法正常打开.

分析相关trace文件

*** 2015-11-16 15:26:56.921
*** SESSION ID:(145.3) 2015-11-16 15:26:56.921
*** CLIENT ID:() 2015-11-16 15:26:56.921
*** SERVICE NAME:(SYS$USERS) 2015-11-16 15:26:56.921
*** MODULE NAME:(sqlplus.exe) 2015-11-16 15:26:56.921
*** ACTION NAME:() 2015-11-16 15:26:56.921
Dump continued from file: d:\oracle\diag\rdbms\xifenfei\xifenfei\trace\xifenfei_ora_8132.trc
ORA-00600: 内部错误代码, 参数: [4155], [], [], [], [], [], [], [], [], [], [], []
========= Dump for incident 951272 (ORA 600 [4155]) ========
----- Beginning of Customized Incident Dump(s) -----
XID passed in =xid: 0x000b.001.00fcfb45
XID from Undo block = xid: 0x000b.009.00fcc561
----- End of Customized Incident Dump(s) -----
*** 2015-11-16 15:26:57.203
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=7j16t46cacjt9) -----
alter database open resetlogs
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst1()+129        CALL???  skdstdst()           009233DA2 000000000 000000000
                                                   000000000
ksedst()+69          CALL???  ksedst1()            000000002 000000000 006F605E0
                                                   000000000
dbkedDefDump()+4536  CALL???  ksedst()             000000287 000000000 000000000
                                                   000000000
ksedmp()+43          CALL???  dbkedDefDump()       000000003 000000002 000000000
                                                   000468E71
ksfdmp()+87          CALL???  ksedmp()             000000000 000000000 000000000
                                                   000000000
dbgexPhaseII()+1819  CALL???  ksfdmp()             000000000 000000000 000000000
                                                   000000000
dbgexExplicitEndInc  CALL???  dbgexPhaseII()       000000000 000000000 000000000
()+755                                             000000000
dbgeEndDDEInvocatio  CALL???  dbgexExplicitEndInc  00CFB0570 00CFB7540 01F9012D8
nImpl()+748                   ()                   01F901300
dbgeEndDDEInvocatio  CALL???  dbgeEndDDEInvocatio  00CFB0570 00CFB7540 01F903130
n()+47                        nImpl()              00000000A
ktugce()+610         CALL???  dbgeEndDDEInvocatio  006E24498 01F8FF91E 00000002E
                              n()                  035636366
ktdgti()+609         CALL???  ktugce()             000000000 000001F68 000001F68
                                                   000000001
k2vGetCollectingInf  CALL???  ktdgti()             000000000 008A34E71 01F902470
o()+324                                            000000018
k2vcbk()+182         CALL???  k2vGetCollectingInf  000000000 000000000 000000000
                              o()                  000000008
kturRecoverTxn()+82  CALL???  k2vcbk()             000000000 FCFB450000000B
67                                                 000610001 7FFDF055860
kturRecoverUndoSegm  CALL???  kturRecoverTxn()     01F903448 009460001 000000000
ent()+1371                                         00147AE14
ktuiup()+1520        CALL???  kturRecoverUndoSegm  7FF0000000B 01F903628
                              ent()                000000000 00147AE14
ktuini()+80          CALL???  ktuiup()             000000001 00000000E 01F9038A0
                                                   003CB3D06
adbdrv()+44263       CALL???  ktuini()             000000000 01F9090C8 000000000
                                                   000000000
opiexe()+20842       CALL???  adbdrv()             000000023 000000003
                                                   7FF00000102 000000000
opiosq0()+5129       CALL???  opiexe()+16981       000000004 000000000 01F90A8E0
                                                   009361AB3
kpooprx()+357        CALL???  opiosq0()            000000003 00000000E 01F90ABB0
                                                   0000000A4
kpoal8()+940         CALL???  kpooprx()            000020C80 01E65CCD0 00CE91AD8
                                                   000000001
opiodr()+1662        CALL???  kpoal8()             00000005E 00000001C 01F90E120
                                                   00A4EF224
ttcpip()+1325        CALL???  opiodr()             480000000000005E
                                                   49004D000000001C 01F90E120
                                                   4100200000000000
opitsk()+2040        CALL???  ttcpip()             01E735200 000000000 000000000
                                                   000000000
opiino()+1258        CALL???  opitsk()             00000001E 000000000 000000000
                                                   01F90FA18
opiodr()+1662        CALL???  opiino()             00000003C 000000004 01F90FAD0
                                                   000000000
opidrv()+864         CALL???  opiodr()             00000003C 000000004 01F90FAD0
                                                   6F5C3A6400000000
sou2o()+98           CALL???  opidrv()+150         00000003C 000000004 01F90FAD0
                                                   000000000
opimai_real()+158    CALL???  sou2o()              01F90FB00 01F90FBC4
                                                   F0010000B07DF 1009C002B0019
opimai()+191         CALL???  opimai_real()        00000001B 01F90FC88 000000036
                                                   000000000
OracleThreadStart()  CALL???  opimai()             01F90FE90 01F60FF38 000000002
+724                                               01F90FC88
0000000078D3B6DA     CALL???  OracleThreadStart()  01F60FF38 000000000 000000000
                                                   01F90FFA8
--------------------- Binary Stack Dump ---------------------
UNDO BLK:
xid: 0x000b.009.00fcc561  seq: 0x513d cnt: 0x2f  irb: 0x2f  icl: 0x0   flg: 0x0000
 Rec Offset      Rec Offset      Rec Offset      Rec Offset      Rec Offset
---------------------------------------------------------------------------
0x01 0x1f64     0x02 0x1ee4     0x03 0x1e30     0x04 0x1d7c     0x05 0x1cc8
0x06 0x1c14     0x07 0x1b60     0x08 0x1aac     0x09 0x19f8     0x0a 0x1944
0x0b 0x1890     0x0c 0x17dc     0x0d 0x1728     0x0e 0x1674     0x0f 0x15c0
0x10 0x150c     0x11 0x1458     0x12 0x13a4     0x13 0x12f0     0x14 0x123c
0x15 0x1188     0x16 0x10d4     0x17 0x1050     0x18 0x0fd0     0x19 0x0f1c
0x1a 0x0e98     0x1b 0x0de4     0x1c 0x0d30     0x1d 0x0c7c     0x1e 0x0bc8
0x1f 0x0b44     0x20 0x0ac4     0x21 0x0a10     0x22 0x095c     0x23 0x08a8
0x24 0x07f4     0x25 0x0770     0x26 0x06f0     0x27 0x063c     0x28 0x0588
0x29 0x04d4     0x2a 0x0420     0x2b 0x039c     0x2c 0x031c     0x2d 0x0298
0x2e 0x0218     0x2f 0x0164

ORA-600 4155是由于在恢复过程中发现事务的id和undo segment中的事务表中id不匹配从而出现此类问题.针对此问题,可以通过bbed修改事务表记录,或者直接丢弃该事务,从而绕过该错误.

处理掉异常事务id后,继续open库

Mon Nov 16 16:16:07 2015
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
parallel recovery started with 32 processes
Started redo scan
Completed redo scan
read 8 KB redo, 0 data blocks need recovery
Started redo application at
Thread 1: logseq 1, block 2, scn 14199161880578
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1 Reading mem 0
Mem# 0: +DATA/xifenfei/onlinelog/group_1.262.764288315
Mem# 1: +DATA/xifenfei/onlinelog/group_1.263.764288315
Completed redo application of 0.00MB
Completed crash recovery at
Thread 1: logseq 1, block 19, scn 14199161900601
0 data blocks read, 0 data blocks written, 8 redo k-bytes read
Current SCN is not changed: _minimum_giga_scn (scn 14199161880576) is too small
Mon Nov 16 16:16:08 2015
LGWR: STARTING ARCH PROCESSES
Mon Nov 16 16:16:08 2015
ARC0 started with pid=62, OS id=7612
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Mon Nov 16 16:16:09 2015
ARC1 started with pid=63, OS id=5620
Mon Nov 16 16:16:09 2015
ARC2 started with pid=64, OS id=7308
ARC1: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
Thread 1 advanced to log sequence 2 (thread open)
Thread 1 opened at log sequence 2
Current log# 2 seq# 2 mem# 0: +DATA/xifenfei/onlinelog/group_2.264.764288315
Current log# 2 seq# 2 mem# 1: +DATA/xifenfei/onlinelog/group_2.265.764288315
Successful open of redo thread 1
Archived Log entry 1 added for thread 1 sequence 1 ID 0xfe09f6df dest 1:
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Nov 16 16:16:10 2015
SMON: enabling cache recovery
Dictionary check beginning
Tablespace 'TEMP' #3 found in data dictionary,
but not in the controlfile. Adding to controlfile.
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
This condition can occur when a backup controlfile has
Mon Nov 16 16:16:09 2015
ARC3 started with pid=65, OS id=7064
been restored. It may be necessary to add files to these
ARC2: Archival started
tablespaces. That can be done using the SQL statement:
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
ARC0: Becoming the heartbeat ARCH
Alternatively, if these temporary tablespaces are no longer
needed, then they can be dropped.
Empty temporary tablespace: TEMP
*********************************************************************
Database Characterset is ZHS16GBK
No Resource Manager plan active
**********************************************************
WARNING: Files may exists in db_recovery_file_dest
that are not known to the database. Use the RMAN command
CATALOG RECOVERY AREA to re-catalog any such files.
If files cannot be cataloged, then manually delete them
using OS command.
One of the following events caused this:
1. A backup controlfile was restored.
2. A standby controlfile was restored.
3. The controlfile was re-created.
4. db_recovery_file_dest had previously been enabled and
then disabled.
**********************************************************
replication_dependency_tracking turned off (no async multimaster replication found)
WARNING: AQ_TM_PROCESSES is set to 0. System operation might be adversely affected.
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Completed: ALTER DATABASE OPEN

因为该数据库是经过了存储恢复,除system之外,其他文件也有大量坏块,因为恢复过程相对比较麻烦,除了上面列出来的ORA-00353,ORA-00312,ORA-600 4155等各种错误之外,还有大量的ORA-01578,ORA-01110.由于11G比较常见的ORA-00283,ORA-16433问题。

某医院存储掉线导致Oracle数据库故障恢复

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：某医院存储掉线导致Oracle数据库故障恢复

xx医院存储突然掉线,导致数据库异常,现场工程师折腾了一天,问题依旧没有解决,无奈之下找到我们,希望我们能够帮忙恢复数据库.
启动报ORA-00600[2131]错误

Fri Nov 06 14:50:59 2015
ALTER DATABASE   MOUNT
This instance was first to mount
Fri Nov 06 14:50:59 2015
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=192.168.4.4)(PORT=1521))' SCOPE=MEMORY SID='xifenfei1';
NOTE: Loaded library: System
SUCCESS: diskgroup DATA was mounted
NOTE: dependency between database xifenfei and diskgroup resource ora.DATA.dg is established
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13221.trc  (incident=191085):
ORA-00600: internal error code, arguments: [2131], [33], [32], [], [], [], [], [], [], [], [], []
Incident details in: /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_191085/xifenfei1_ora_13221_i191085.trc
Fri Nov 06 14:51:10 2015
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORA-600 signalled during: ALTER DATABASE   MOUNT...

出现该错误的原因是由于：We are attempting to write a controlfile checkpoint progress record, but find we do not have the progress record generating this exception.由于控制文件异常导致,出现此类情况,我们一般使用单个控制文件一次尝试,如果都不可以考虑重建控制文件

由于坏块(逻辑/物理)导致数据库实例恢复无法进行

Beginning crash recovery of 2 threads
Started redo scan
kcrfr_rnenq: use log nab 393216
kcrfr_rnenq: use log nab 2
Completed redo scan
 read 4427 KB redo, 500 data blocks need recovery
Started redo application at
 Thread 1: logseq 5731, block 391398
 Thread 2: logseq 4252, block 520815
Recovery of Online Redo Log: Thread 1 Group 2 Seq 5731 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_2.266.835331047
Recovery of Online Redo Log: Thread 2 Group 8 Seq 4252 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_8.331.835330421
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc  (incident=197486):
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], [], [], [], [], []
Incident details in:/home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_197486/xifenfei1_ora_13770_i197486.trc
Fri Nov 06 15:03:09 2015
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc  (incident=197487):
ORA-01578: ORACLE data block corrupted (file # 2, block # 65207)
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 81045
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], [], [], [], [], []
Incident details in:/home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_197487/xifenfei1_ora_13770_i197487.trc
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc:
ORA-01578: ORACLE data block corrupted (file # 2, block # 65207)
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 81045
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], [], [], [], [], []
Recovery of Online Redo Log: Thread 2 Group 3 Seq 4253 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_3.332.835330505
Hex dump of (file 14, block 62536) in trace file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc
Reading datafile '+DATA/xifenfei/datafile/ht01.dbf' for corruption at rdba: 0x0380f448 (file 14, block 62536)
Reread (file 14, block 62536) found same corrupt data (logically corrupt)
RECOVERY OF THREAD 1 STUCK AT BLOCK 62536 OF FILE 14
Fri Nov 06 15:03:13 2015
Abort recovery for domain 0
Aborting crash recovery due to error 1172
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc:
ORA-01172: recovery of thread 1 stuck at block 62536 of file 14
ORA-01151: use media recovery to recover block, restore backup if needed
Abort recovery for domain 0
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13770.trc:
ORA-01172: recovery of thread 1 stuck at block 62536 of file 14
ORA-01151: use media recovery to recover block, restore backup if needed
ORA-1172 signalled during: ALTER DATABASE OPEN...

查看资料发现和Bug 14301592 – Several errors by corrupt blocks shifted by 2 bytes in buffer cache during recovery caused by INDEX redo apply，可以通过ALLOW 1 CORRUPTION临时解决

使用ALLOW 1 CORRUPTION进行恢复,出现ORA-07445[kdxlin]错误

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
+DATA/xifenfei/onlinelog/group_3.332.835330505
ORA-00279: change 700860458 generated at 11/05/2015 21:20:15 needed for thread
1
ORA-00289: suggestion : +ARCHIVE/xifenfei/xifenfei_1_5731_835324843.arc
ORA-00280: change 700860458 for thread 1 is in sequence #5731
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
+DATA/xifenfei/onlinelog/group_2.266.835331047
ORA-00283: recovery session canceled due to errors
ORA-10562: Error occurred while applying redo to data block (file# 2, block#
70104)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATA/xifenfei/datafile/sysaux.257.835324753'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 82289
ORA-00607: Internal error occurred while making a change to a data block
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kdxlin()+4088] [SIGSEGV]
[ADDR:0xC] [PC:0x95FB572] [Address not mapped to object] []
ORA-01112: media recovery not started

ORA-07445[kdxlin()+4088]未找到类似说明,到了这一步,无法简单的恢复成功,只能通过设置隐含参数跳过实例恢复,尝试resetlog库

通过设置_allow_resetlogs_corruption参数继续恢复

SQL> startup pfile='/tmp/pfile.ora' mount;
ORACLE instance started.
Total System Global Area 7315603456 bytes
Fixed Size                  2267384 bytes
Variable Size            2566915848 bytes
Database Buffers         4731174912 bytes
Redo Buffers               15245312 bytes
Database mounted.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927],
[0], [700860464], [], [], [], [], [], [], []
Process ID: 13563
Session ID: 157 Serial number: 3

alert日志报错

Fri Nov 06 19:26:39 2015
SMON: enabling cache recovery
Instance recovery: looking for dead threads
Instance recovery: lock domain invalid but no dead threads
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13563.trc  (incident=319140):
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927], [0], [700860464], [], [], [], [], [], [], []
Incident details in:/home/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_319140/xifenfei1_ora_13563_i319140.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Redo thread 2 internally disabled at seq 1 (CKPT)
ARC1: Archiving disabled thread 2 sequence 1
Archived Log entry 9956 added for thread 2 sequence 1 ID 0x0 dest 1:
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13563.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927], [0], [700860464], [], [], [], [], [], [], []
Errors in file /home/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_13563.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [0], [700869927], [0], [700860464], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 13563): terminating the instance due to error 600
Fri Nov 06 19:26:42 2015
Instance terminated by USER, pid = 13563
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (13563) as a result of ORA-1092
Fri Nov 06 19:26:42 2015
ORA-1092 : opitsk aborting process

这里是比较熟悉的ora-600[kclchkblk_4]错误,和ora-600[2662]错误类似,需要调整scn,由于数据库版本为11.2.0.4,无法使用常规方法调整scn,在修改控制文件,oradebug,bbed方法可供选择

使用oradebug方法处理
因为是asm环境,其他方法处理起来都相对麻烦

[oracle@wisetop1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Fri Nov 6 19:30:59 2015
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup pfile='/tmp/pfile.ora' mount;
ORACLE instance started.
Total System Global Area 7315603456 bytes
Fixed Size                  2267384 bytes
Variable Size            2566915848 bytes
Database Buffers         4731174912 bytes
Redo Buffers               15245312 bytes
Database mounted.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug poke 0x06001AE70 4 0x2FAF0800
BEFORE: [06001AE70, 06001AE74) = 00000000
AFTER:  [06001AE70, 06001AE74) = 2FAF0800
SQL> alter database open;
Database altered.

至此数据库open成功,后续就是处理一些坏块的工作,并建议客户逻辑重建库.

12C sysaux 异常恢复—ORA-01190错误恢复

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：12C sysaux 异常恢复—ORA-01190错误恢复

有朋友请求支援，他们数据库由于file 3 大量坏块，然后直接使用rman 备份还原了file 3，但是在recover过程中发现归档丢失，而且整个库在丢失归档的scn之后，还做过resetlogs操作，导致现在整个库无法正常启动，报ORA-01190错误，希望帮忙把file 3 给online起来，整个库正常open【当然在丢失sysaux的情况下，数据库可以open起来，但是这种情况下，迁移数据比较麻烦】

SQL> startup;
ORACLE 例程已经启动。
Total System Global Area 3.1868E+10 bytes
Fixed Size                  3601144 bytes
Variable Size            2.8655E+10 bytes
Database Buffers         3154116608 bytes
Redo Buffers               54804480 bytes
数据库装载完毕。
ORA-01190: 控制文件或数据文件 3 来自最后一个 RESETLOGS 之前
ORA-01110: 数据文件 3: 'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF'

Oracle Database Recovery Check Result结果显示[脚本]
wrong+resetlogs

尝试不完全恢复并使用隐含参数打开库

Fri Oct 02 19:10:12 2015
ALTER DATABASE RECOVER  database until cancel
Fri Oct 02 19:10:12 2015
Media Recovery Start
 Started logmerger process
Fri Oct 02 19:10:12 2015
Media Recovery failed with error 16433
Fri Oct 02 19:10:14 2015
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Fri Oct 02 19:10:37 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_5176.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:10:37 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_5176.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
ALTER DATABASE RECOVER  database until cancel
Fri Oct 02 19:11:18 2015
Media Recovery Start
 Started logmerger process
Fri Oct 02 19:11:18 2015
Media Recovery failed with error 16433
Fri Oct 02 19:11:19 2015
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
alter database open resetlogs
ORA-1139 signalled during: alter database open resetlogs...
alter database open
Fri Oct 02 19:11:49 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_ora_4252.trc:
ORA-01190: 控制文件或数据文件 3 来自最后一个 RESETLOGS 之前
ORA-01110: 数据文件 3: 'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF'
ORA-1190 signalled during: alter database open...
Fri Oct 02 19:15:38 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_5292.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:15:38 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_5292.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:20:39 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_2276.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:20:39 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_2276.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:25:40 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_4804.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:25:40 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_4804.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:30:41 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_876.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:30:41 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_m000_876.trc:
ORA-16433: The database or pluggable database must be opened in read/write mode.
Fri Oct 02 19:32:40 2015
Shutting down instance (abort)

数据库遭遇ORA-16433,此类方法无法打开数据库，根据经验值出现此类问题，可能需要重建控制文件，但是由于其中file 3的resetlogs scn不正确，无法包含该文件重建控制文件

Fri Oct 02 20:10:55 2015
WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Fri Oct 02 20:10:55 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_ora_5004.trc:
ORA-01189: ????????????? RESETLOGS
ORA-01110: ???? 3: 'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF'
ORA-1503 signalled during:  CREATE CONTROLFILE REUSE DATABASE "orcl" RESETLOGS FORCE LOGGING ARCHIVELOG
     MAXLOGFILES 16
     MAXLOGMEMBERS 3
     MAXDATAFILES 100
     MAXINSTANCES 8
     MAXLOGHISTORY 2921
 LOGFILE
 GROUP 3 'E:\APP\ORAADM\ORADATA\ORCL\REDO03.LOG' size 50M,
 GROUP 2 'E:\APP\ORAADM\ORADATA\ORCL\REDO02.LOG'  size 50M,
 GROUP 1 'E:\APP\ORAADM\ORADATA\ORCL\REDO01.LOG'  size 50M
 DATAFILE
'E:\APP\ORAADM\ORADATA\ORCL\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBSEED\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBSEED\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\UNDOTBS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\USERS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SAMPLE_SCHEMA_USERS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\EXAMPLE01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\NMSA_BACKUP01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE1.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE02.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE03.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE04.DBF'
 CHARACTER SET AL32UTF8
 ...

除掉file 3 继续重建控制文件

Fri Oct 02 20:33:11 2015
Successful mount of redo thread 1, with mount id 1419796614
Completed:  CREATE CONTROLFILE REUSE DATABASE "orcl" RESETLOGS FORCE LOGGING ARCHIVELOG
     MAXLOGFILES 16
     MAXLOGMEMBERS 3
     MAXDATAFILES 100
     MAXINSTANCES 8
     MAXLOGHISTORY 2921
 LOGFILE
 GROUP 3 'E:\APP\ORAADM\ORADATA\ORCL\REDO03.LOG' size 50M,
 GROUP 2 'E:\APP\ORAADM\ORADATA\ORCL\REDO02.LOG'  size 50M,
 GROUP 1 'E:\APP\ORAADM\ORADATA\ORCL\REDO01.LOG'  size 50M
 DATAFILE
'E:\APP\ORAADM\ORADATA\ORCL\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBSEED\SYSTEM01.DBF',
--'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBSEED\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\UNDOTBS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\USERS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SAMPLE_SCHEMA_USERS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\EXAMPLE01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\NMSA_BACKUP01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE1.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE02.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE03.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE04.DBF'
 CHARACTER SET AL32UTF8

继续恢复数据库

ALTER DATABASE OPEN
Fri Oct 02 20:34:57 2015
…………
Archived Log entry 3 added for thread 1 sequence 8 ID 0x54a083a3 dest 1:
Fri Oct 02 20:35:16 2015
Tablespace 'SYSAUX' #1 found in data dictionary,
but not in the controlfile. Adding to controlfile.
Tablespace 'TEMP' #3 found in data dictionary,
but not in the controlfile. Adding to controlfile.
File #3 found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00003' in the controlfile.
Corrected file 15 plugged in read-only status in control file
Corrected file 16 plugged in read-only status in control file
Corrected file 17 plugged in read-only status in control file
Corrected file 18 plugged in read-only status in control file
Corrected file 19 plugged in read-only status in control file
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
Fri Oct 02 20:35:19 2015
SMON: enabling tx recovery
Fri Oct 02 20:35:19 2015
*********************************************************************
WARNING: The following temporary tablespaces in container(CDB$ROOT)
         contain no files.
Starting background process SMCO
Fri Oct 02 20:35:19 2015
SMCO started with pid=45, OS id=1500
         This condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
Database Characterset is AL32UTF8
No Resource Manager plan active
Fri Oct 02 20:35:21 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_ora_2220.trc:
ORA-00376: 此时无法读取文件 3
ORA-01111: 数据文件 3 名称未知 - 请重命名以更正文件
ORA-01110: 数据文件 3: 'C:\APP\ORAADM\PRODUCT\12.1.0\DBHOME_1\DATABASE\MISSING00003'
Fri Oct 02 20:35:21 2015
Errors in file E:\APP\ORAADM\diag\rdbms\orcl\oaorcl\trace\oaorcl_ora_2220.trc:
ORA-00376: 此时无法读取文件 3
ORA-01111: 数据文件 3 名称未知 - 请重命名以更正文件
ORA-01110: 数据文件 3: 'C:\APP\ORAADM\PRODUCT\12.1.0\DBHOME_1\DATABASE\MISSING00003'
Error 376 happened during db open, shutting down database
USER (ospid: 2220): terminating the instance due to error 376
Fri Oct 02 20:35:26 2015
Instance terminated by USER, pid = 2220
ORA-1092 signalled during: ALTER DATABASE OPEN...
opiodr aborting process unknown ospid (2220) as a result of ORA-1092

此时由于file 3 未包含在控制文件中，但是存在数据字典中，因此在数据库open的时候出现了默认文件名MISSING0003，尝试重命名改文件指定为存在的file 3,并且尝试恢复

SQL> startup mount;
ORACLE 例程已经启动。
Total System Global Area 3.1868E+10 bytes
Fixed Size                  3601144 bytes
Variable Size            2.8655E+10 bytes
Database Buffers         3154116608 bytes
Redo Buffers               54804480 bytes
数据库装载完毕。
SQL> alter database datafile 3 offline;
数据库已更改。
SQL> alter database rename file 'C:\APP\ORAADM\PRODUCT\12.1.0\DBHOME_1\DATABASE\
MISSING00003' to 'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF';
数据库已更改。
SQL> recover database until cancel;
ORA-00279: 更改 617412726 (在 10/02/2015 20:35:06 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ORAADM\FAST_RECOVERY_AREA\ORCL\ARCHIVELOG\2015_10_02\O1_MF_1_9_%U_.ARC
ORA-00280: 更改 617412726 (用于线程 1) 在序列 #9 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ORAADM\ORADATA\ORCL\REDO01.LOG
ORA-00310: archived log contains sequence 7; sequence 9 required
ORA-00334: archived log: 'E:\APP\ORAADM\ORADATA\ORCL\REDO01.LOG'
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: 'E:\APP\ORAADM\ORADATA\ORCL\SYSTEM01.DBF'
SQL> recover database until cancel;
ORA-00279: 更改 617412726 (在 10/02/2015 20:35:06 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ORAADM\FAST_RECOVERY_AREA\ORCL\ARCHIVELOG\2015_10_02\O1_MF_1_9_%U_.ARC
ORA-00280: 更改 617412726 (用于线程 1) 在序列 #9 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ORAADM\ORADATA\ORCL\REDO03.LOG
已应用的日志。
完成介质恢复。
SQL> alter database datafile 3 online;
数据库已更改。
SQL> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-01122: 数据库文件 3 验证失败
ORA-01110: 数据文件 3: 'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF'
ORA-01202: 此文件的原型错误 - 创建时间错误

这里比较明显ORA-01202，由于创建控制文件之时没有file 3信息，因此导致控制文件中关于file 3的信息和该文件头的创建时间不一致（此处之时显示了时间不一致，如果通过bbed修改时间，后续可能还有很多东西不一致，因此通过bbed 一个个修改一个个尝试，理论可行，但实际可操作性不好）,因此尝试直接使用bbed修改file 3文件头（由于是win环境，操作稍微麻烦点），把resetlogs信息修改和其他的一样

BBED> m /x 3c6b2b35
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  112 to  143           Dba:0x00c00002
------------------------------------------------------------------------
 3c6b2b35 386b2200 00000000 00000000 00000000 00000000 00004000 bb460000
 <32 bytes per line>
BBED> set offset 116
        OFFSET          116
BBED> m /x 3137ca24
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  116 to  147           Dba:0x00c00002
------------------------------------------------------------------------
 3137ca24 00000000 00000000 00000000 00000000 00004000 bb460000 7dc12b35
 <32 bytes per line>
BBED> m /x b9f8
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  484 to  515           Dba:0x00c00002
------------------------------------------------------------------------
 b9f8a424 00000000 e65e2435 01000000 d3410000 b89b0000 10000900 02000000
 <32 bytes per line>
BBED> set offset +2
        OFFSET          486
BBED> m /x cc24
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  486 to  517           Dba:0x00c00002
------------------------------------------------------------------------
 cc240000 0000e65e 24350100 0000d341 0000b89b 00001000 09000200 00000000
 <32 bytes per line>
BBED> m /x 87df offset 492
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  492 to  523           Dba:0x00c00002
------------------------------------------------------------------------
 87df2435 01000000 d3410000 b89b0000 10000900 02000000 00000000 00000000
 <32 bytes per line>
BBED>
BBED> m /x 2b35 offset +2
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  494 to  525           Dba:0x00c00002
------------------------------------------------------------------------
 2b350100 0000d341 0000b89b 00001000 09000200 00000000 00000000 00000000
 <32 bytes per line>
BBED> d offset 140
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  140 to  171           Dba:0x00c00002
------------------------------------------------------------------------
 bb460000 7dc12b35 ba460000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> m /x 4248
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  140 to  171           Dba:0x00c00002
------------------------------------------------------------------------
 42480000 7dc12b35 ba460000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> d offset 148
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  148 to  179           Dba:0x00c00002
------------------------------------------------------------------------
 ba460000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> m /x 4148
 File: SYSAUX01.dbf (3)
 Block: 2                Offsets:  148 to  179           Dba:0x00c00002
------------------------------------------------------------------------
 41480000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> sum apply
Check value for File 3, Block 2:
current = 0xd0c8, required = 0xd0c8
BBED> verify
DBVERIFY - Verification starting
FILE = SYSAUX01.dbf
BLOCK = 1
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED

修改完file 3的文件头之后，再次重建控制文件,此次包含file 3

Fri Oct 02 21:19:58 2015
Successful mount of redo thread 1, with mount id 1419797885
Completed:  CREATE CONTROLFILE REUSE DATABASE "orcl" NORESETLOGS FORCE LOGGING ARCHIVELOG
     MAXLOGFILES 16
     MAXLOGMEMBERS 3
     MAXDATAFILES 100
     MAXINSTANCES 8
     MAXLOGHISTORY 2921
 LOGFILE
 GROUP 3 'E:\APP\ORAADM\ORADATA\ORCL\REDO03.LOG' size 50M,
 GROUP 2 'E:\APP\ORAADM\ORADATA\ORCL\REDO02.LOG'  size 50M,
 GROUP 1 'E:\APP\ORAADM\ORADATA\ORCL\REDO01.LOG'  size 50M
 DATAFILE
'E:\APP\ORAADM\ORADATA\ORCL\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBSEED\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBSEED\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\UNDOTBS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\USERS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SYSTEM01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SYSAUX01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\SAMPLE_SCHEMA_USERS01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\PDBORCL\EXAMPLE01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\NMSA_BACKUP01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE1.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE01.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE02.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE03.DBF',
'E:\APP\ORAADM\ORADATA\ORCL\V3XSPACE04.DBF'
 CHARACTER SET AL32UTF8

继续恢复数据库,数据库正常open，而且file 3 已经正常online，数据库可以直接导出来，至此恢复大体完成

在win 64位平台上运行bbed(支持ORACLE 10g 11g 12c)

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：在win 64位平台上运行bbed(支持ORACLE 10g 11g 12c)

很多朋友反馈在win 64位操作系统之上无法使用bbed(包括9i,10g,11g,12c数据库版本),以前写过一篇文章,完美实现了在win平台的各个版本的数据库版本之上实现使用bbed(在win中运行bbed程序),可惜很遗憾没有注明平台信息,留下了不少疑问,今天在自己的电脑上再次实现此功能,用来证明win 64位的平台之上也可以运行bbed程序(数据库版本包括10g,11g,12c,在10g之前x86架构中无win 64位版本数据库,因此我也无能为力).

操作系统版本64位
本机测试为win 7 64位操作系统
win-64

win-64-2

数据库版本64位
本机测试数据库版本为12.1.0.2 64位版本(因为12c都支持,那对于10g/11g更是不在话下）
db-64

bbed运行情况
这里的bbed只是运行起来,并未加载数据文件,因此这里看到的FILENAME为空,但是不妨碍证明bbed可以在win平台,64位的数据库中运行
bbed-win64

整体证明win 64位平台,64位数据库运行bbed
一图抵上千言万语,让我们使用一幅完整截图来说明,bbed是可以运行在win 64位平台的64位版本的数据库之上(而且这里使用了目前最新的12.1.0.2版本)
oracle-64-bbed-10g-11g-12c

ORA-01052: required destination LOG_ARCHIVE_DUPLEX_DEST is not specified 恢复思路

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：ORA-01052: required destination LOG_ARCHIVE_DUPLEX_DEST is not specified 恢复思路

今天一网友找到我,说数据库恢复在推scn的过程中遇到了ORA-01052: required destination LOG_ARCHIVE_DUPLEX_DEST is not specified错误无法解决,让我给予支持.这不禁让我想起,现在由于数据库bug和dblink原因导致了很多数据库scn很大,距离天花板非常近,从而使得数据库恢复过程中无法直接简单的推scn,这里正好结合该例子,简单说明下ORA-01052故障的处理.类似文档以前也写过:ORA-01052发生原因的类似文章

由于坏块导致数据库进行实例恢复无法进行

Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
 read 1901 KB redo, 276 data blocks need recovery
Started redo application at
 Thread 1: logseq 1004, block 172771
Recovery of Online Redo Log: Thread 1 Group 2 Seq 1004 Reading mem 0
  Mem# 0: F:\APP\ADMINISTRATOR\ORADATA\XFF\REDO02.LOG
Fri May 29 10:59:56 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 439938 OF FILE 19
Fri May 29 11:00:00 2015
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x2048] [PC:0x6215134, __intel_new_memcpy()+260]
Fri May 29 11:00:12 2015
Trace dumping is performing id=[cdmp_20150529110012]
Fri May 29 11:00:12 2015
Slave exiting with ORA-1172 exception
Errors in file f:\app\administrator\diag\rdbms\XFF\XFF\trace\XFF_p007_1612.trc:
ORA-01172: 线程 1 的恢复停止在块 439938 (在文件 19 中)
ORA-01151: 如果需要, 请使用介质恢复以恢复块和还原备份
Fri May 29 11:00:27 2015
ORA-01578: ORACLE 数据块损坏 (文件号 19, 块号 450245)
ORA-01110: 数据文件 19: 'F:\APP\ADMINISTRATOR\ORADATA\XFF\PSTORE_02.DBF'
ORA-10564: tablespace PSTORE
ORA-01110: 数据文件 19: 'F:\APP\ADMINISTRATOR\ORADATA\XFF\PSTORE_02.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 91642
ORA-00607: 当更改数据块时出现内部错误
ORA-00602: 内部编程异常错误
ORA-07445: 出现异常错误: 核心转储 [_intel_new_memcpy()+260] [ACCESS_VIOLATION] [ADDR:0x2048]
[PC:0x6215134] [UNABLE_TO_READ] []
Fri May 29 11:00:27 2015
Aborting crash recovery due to slave death, attempting serial crash recovery
RECOVERY OF THREAD 1 STUCK AT BLOCK 439938 OF FILE 19
Fri May 29 11:00:45 2015
Trace dumping is performing id=[cdmp_20150529110045]
Aborting crash recovery due to error 1172
ORA-1172 signalled during: alter database open...

设置_allow_resetlogs_corruption并resetlogs尝试打开数据库

Assigning activation ID 4272042346 (0xfea2316a)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: F:\APP\ADMINISTRATOR\ORADATA\XFF\REDO01.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri May 29 11:30:47 2015
SMON: enabling cache recovery
Fri May 29 11:30:47 2015
Errors in file f:\app\administrator\diag\rdbms\XFF\XFF\trace\XFF_ora_3004.trc  (incident=181236):
ORA-00600: ??????, ??: [2662], [3360], [2233437186], [3360], [2235447064], [4194545], [], [], [], [], [], []
Incident details in: f:\app\administrator\diag\rdbms\XFF\XFF\incident\incdir_181236\XFF_ora_3004_i181236.trc
Errors in file f:\app\administrator\diag\rdbms\XFF\XFF\trace\XFF_ora_3004.trc:
ORA-00704: ????????
ORA-00704: ????????
ORA-00600: ??????, ??: [2662], [3360], [2233437186], [3360], [2235447064], [4194545], [], [], [], [], [], []
Errors in file f:\app\administrator\diag\rdbms\XFF\XFF\trace\XFF_ora_3004.trc:
ORA-00704: ????????
ORA-00704: ????????
ORA-00600: ??????, ??: [2662], [3360], [2233437186], [3360], [2235447064], [4194545], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 3004): terminating the instance due to error 704
Instance terminated by USER, pid = 3004
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (3004) as a result of ORA-1092

这里可以看到数据库通过设置_allow_resetlogs_corruption参数，进行不完全恢复,跳过数据库启动的实例恢复,然后强制拉库,然后遭遇大家熟悉的ORA-600[2662]错误,使得恢复失败,根据经验,通过推scn来绕过该错误

使用_minimum_giga_scn尝试推SCN

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
--------------------------------
*._minimum_giga_scn=13443
--------------------------------
SQL> startup pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area  236000356 bytes
Fixed Size                   451684 bytes
Variable Size             201326592 bytes
Database Buffers           33554432 bytes
Redo Buffers                 667648 bytes
Database mounted.
ORA-01052: required destination LOG_ARCHIVE_DUPLEX_DEST is not specified

通过运行Oracle Database Recovery Check检查发现数据库的scn已经非常大,距离天花板较近
_minimum_giga_scn
这里最大允许的推进的scn为13442.7,但是常规的最小的推scn的方法最小值为1024*1024*1024的倍数,因此这里遇到麻烦.
这里数据库遭遇了ORA-01052错误,导致推scn不成功,数据库无法正常启动.出现这类情况,由于scn可以增加的空间非常小,因此可以使用使用oradebug修改数据库scn或者直接修改控制文件scn的方式来精确控制推scn的值(可以实现任何值的scn增加,只要不超过天花板),也可以通过方法修改数据库scn距离天花板的距离,从而实现大幅度使用_minimum_giga_scn来推scn.另外还有一种解决方法:由于ORA-01052是由于scn过大导致(超过了数据库现在的天花板scn),因此出现了ORA-01052.所以另外一种变通的方法,就是通过调整数据库的天花板scn,从而使得_minimum_giga_scn可以继续推scn.在本次恢复中使用最为简单的增加天花板scn的方式来恢复(不过该方法恢复之后需要重建库,其实已经使用了隐含参数屏蔽redo恢复,本身就建议重建库保证数据字典一致性)

kfed恢复误删除磁盘组

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：kfed恢复误删除磁盘组

在某些情况下,可能因为误操作,不小先drop diskgroup,这个时候千万别紧张,出现此类故障,可以通过kfed进行完美恢复(数据0丢失).如果进一步损坏了相关asm disk,那后续恢复就很麻烦了,可能需要使用dul扫描磁盘来进行抢救性恢复,而且可能导致数据丢失.
创建测试磁盘组xifenfei

[grid@xifenfei ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Thu Apr 30 15:12:08 2015
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Automatic Storage Management option
SQL>  select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
                               /dev/asm-disk3                 CANDIDATE
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER
SQL> create diskgroup xifenfei external redundancy disk '/dev/asm-disk3';
Diskgroup created.
SQL> select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
XIFENFEI_0000                  /dev/asm-disk3                 MEMBER
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER

使用/dev/asm-disk3这个磁盘创建磁盘组xifenfei

创建表,存储在xifenfei磁盘组中

[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Thu Apr 30 15:14:55 2015
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
SQL> create tablespace xifenfei datafile '+xifenfei' size 100M;
Tablespace created.
SQL> select name from v$datafile;
NAME
--------------------------------------------------------------------------------
+DATA/xifenfei/datafile/system.256.878224279
+DATA/xifenfei/datafile/sysaux.257.878224279
+DATA/xifenfei/datafile/undotbs1.258.878224279
+DATA/xifenfei/datafile/users.259.878224279
+XIFENFEI/xifenfei/datafile/xifenfei.256.878397315
SQL> create table t_xifenfei tablespace xifenfei
  2  as select * from dba_objects;
Table created.
SQL> select count(*) from t_xifenfei;
  COUNT(*)
----------
     86259

通过在磁盘组中创建表空间,从而实现表xifenfei存放在测试磁盘组中

尝试删除磁盘组xifenfei

SQL> drop diskgroup xifenfei;
drop diskgroup xifenfei
*
ERROR at line 1:
ORA-15039: diskgroup not dropped
ORA-15053: diskgroup "XIFENFEI" contains existing files
SQL> drop diskgroup xifenfei  including contents;
drop diskgroup xifenfei  including contents
*
ERROR at line 1:
ORA-15039: diskgroup not dropped
ORA-15027: active use of diskgroup "XIFENFEI" precludes its dismount
[grid@xifenfei ~]$ asmcmd
ASMCMD> lsof
DB_Name   Instance_Name  Path
xifenfei  xifenfei       +data/xifenfei/controlfile/current.260.878224379
xifenfei  xifenfei       +data/xifenfei/datafile/sysaux.257.878224279
xifenfei  xifenfei       +data/xifenfei/datafile/system.256.878224279
xifenfei  xifenfei       +data/xifenfei/datafile/undotbs1.258.878224279
xifenfei  xifenfei       +data/xifenfei/datafile/users.259.878224279
xifenfei  xifenfei       +data/xifenfei/onlinelog/group_1.261.878224381
xifenfei  xifenfei       +data/xifenfei/onlinelog/group_2.262.878224383
xifenfei  xifenfei       +data/xifenfei/onlinelog/group_3.263.878224385
xifenfei  xifenfei       +data/xifenfei/tempfile/temp.264.878224395
xifenfei  xifenfei       +xifenfei/xifenfei/datafile/xifenfei.256.878397315

由于xifenfei磁盘组被实例使用,因此磁盘组无法删除,报ORA-15027错误
由于xifenfei磁盘组中有文件,因此磁盘组无法删除,报ORA-15053错误
如果这两个阻止你误删除磁盘组的警告依然不能救你,那我也不好多说啥了,只能向我一样继续往下

关闭数据库实例,删除磁盘组

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> drop diskgroup xifenfei;
drop diskgroup xifenfei
*
ERROR at line 1:
ORA-15039: diskgroup not dropped
ORA-15053: diskgroup "XIFENFEI" contains existing files
SQL> drop diskgroup xifenfei  including contents;
Diskgroup dropped.
SQL>  select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
                               /dev/asm-disk3                 FORMER
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER
SQL> alter diskgroup xifenfei mount;
alter diskgroup xifenfei mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "XIFENFEI" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup
"XIFENFEI"

磁盘组被drop之后,无法正常mount,mount之时报ORA-15063凑无

kfed恢复删除磁盘组

[grid@xifenfei ~]$ kfed read /dev/asm-disk3 >/tmp/disk3-0-0
[grid@xifenfei ~]$ kfed  read /dev/asm-disk3  blkn=1 >/tmp/disk3-0-1
[grid@xifenfei ~]$ kfed  read /dev/asm-disk3  aun=1 >/tmp/disk3-1-0
通过vi修改这些/tmp/disk3-*中的部分值
[grid@xifenfei ~]$ kfed merge /dev/asm-disk3 text=/tmp/disk3-0-0
[grid@xifenfei ~]$ kfed merge /dev/asm-disk3  blkn=1 text=/tmp/disk3-0-1
[grid@xifenfei ~]$ kfed merge /dev/asm-disk3 aun=1 text=/tmp/disk3-1-0

查询修复后的asm disk

SQL> col path for a30
SQL> set lines 150
SQL> select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
                               /dev/asm-disk3                 MEMBER
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER

尝试mount xifenfei 磁盘组

SQL> alter diskgroup xifenfei mount;
Diskgroup altered.
SQL> select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
XIFENFEI_0000                  /dev/asm-disk3                 MEMBER
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER

测试恢复后磁盘组

SQL> startup
ORACLE instance started.
Total System Global Area  952020992 bytes
Fixed Size                  2258960 bytes
Variable Size             306186224 bytes
Database Buffers          637534208 bytes
Redo Buffers                6041600 bytes
Database mounted.
Database opened.
SQL>  select count(*) from t_xifenfei;
  COUNT(*)
----------
     86259

这里证明,当磁盘组被误删除后,立即停止进一步损坏,可以通过kfed进行完美恢复
如果您遇到此类情况,无法解决请联系我们，提供专业ORACLE数据库恢复技术支持
Phone:17813235971 Q Q:107644445 E-Mail:dba@xifenfei.com

分享I_OBJ4 ORA-8102故障恢复案例

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：分享I_OBJ4 ORA-8102故障恢复案例

在测试环境中对于OBJ$中i_obj4中出现ORA-8102进行了重新并恢复测试,认为自己已经比较清楚的掌握了I_OBJ4的ORA-8102问题处理,可是实际的一个案例,还是比较比实验中复杂,这里贴出来主要操作供大家参考,再次证明数据库恢复的场景不可大意,客户的故障只有你想不到的,没有遇不到的
通过bbed修改obj$中dataobj$重现I_OBJ4索引报ORA-08102错误
 使用bbed 修复I_OBJ4 index 报ORA-8102
数据库创建表提示ORA-8102错误

SQL> startup
ORACLE instance started.
Total System Global Area 2.6991E+10 bytes
Fixed Size		    2213976 bytes
Variable Size		 1.9327E+10 bytes
Database Buffers	 7516192768 bytes
Redo Buffers		  145174528 bytes
Database mounted.
Database opened.
SQL> create table t1 as select * from dual;
create table t1 as select * from dual
                                 *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)

分析ORA-08102错误

SQL> select object_name,object_type from dba_objects where object_id=39;
OBJECT_NAME                    OBJECT_TYPE
------------------------------ -------------------
I_OBJ4                         INDEX
SQL> create table t1 as select * from dual;
create table t1 as select * from dual
                                 *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)
SQL> select /*+ index(t i_obj4) */ DATAOBJ#,type#,owner# from obj$  t
minus
select /*+ full(t1) */ DATAOBJ#,type#,owner# from obj$  t1;
  2    3
  DATAOBJ#	TYPE#	  OWNER#
---------- ---------- ----------
     97109	    0	       0
SQL>  select /*+ full(t1) */ DATAOBJ#,type#,owner# from obj$  t1
  minus
  select /*+ index(t i_obj4) */ DATAOBJ#,type#,owner# from obj$  t
  ;
  2    3    4
  DATAOBJ#	TYPE#	  OWNER#
---------- ---------- ----------
     97094	    0	       0
SQL> SET LINES 122
COL INDEX_OWNER FOR A20
COL INDEX_NAME FOR A30
COL TABLE_OWNER FOR A20
COL COLUMN_NAME FOR A25
SELECT TABLE_OWNER,INDEX_NAME,COLUMN_NAME,COLUMN_POSITION
  FROM Dba_Ind_Columns
 WHERE table_name = upper('&TABLE_NAME') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION
and index_name='I_OBJ4';
SQL> SQL> SQL> SQL> SQL>   2    3
Enter value for table_name: OBJ$
old   3:  WHERE table_name = upper('&TABLE_NAME') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION
new   3:  WHERE table_name = upper('OBJ$') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION
TABLE_OWNER	     INDEX_NAME 		    COLUMN_NAME 	      COLUMN_POSITION
-------------------- ------------------------------ ------------------------- ---------------
SYS		     I_OBJ4			    DATAOBJ#				    1
SYS		     I_OBJ4			    TYPE#				    2
SYS		     I_OBJ4			    OWNER#				    3
SQL> SELECT DATAOBJ# FROM OBJ$ WHERE OBJ#=97109;
no rows selected
SQL> SELECT DATAOBJ# FROM OBJ$ WHERE OBJ#=97094;
  DATAOBJ#
----------
     97094
SQL> select /*+ index(t i_obj4) */ rowid,DATAOBJ#,type#,owner# from obj$  t
minus
select /*+ full(t1) */ rowid,DATAOBJ#,type#,owner# from obj$  t1;
  2    3
ROWID		     DATAOBJ#	   TYPE#     OWNER#
------------------ ---------- ---------- ----------
AAAAASAABAAAADxAAb	97109	       0	  0
SQL>  select /*+ full(t1) */ rowid,DATAOBJ#,type#,owner# from obj$  t1
  minus
  select /*+ index(t i_obj4) */ rowid,DATAOBJ#,type#,owner# from obj$  t
  ;
  2    3    4
ROWID		     DATAOBJ#	   TYPE#     OWNER#
------------------ ---------- ---------- ----------
AAAAASAABAAAADxAAb	97094	       0	  0
SQL> select name,obj#,dataobj# from obj$ where rowid='AAAAASAABAAAADxAAb';
NAME				     OBJ#   DATAOBJ#
------------------------------ ---------- ----------
_NEXT_OBJECT				1      97094

到此也比较清楚,rowid为AAAAASAABAAAADxAAb的dataobj#记录在obj$表中为97094而在I_OBJ4中记录为97109,因此两者不一致,从而出现ORA-8102错误

尝试bbed解决ORA-8102问题
尝试修改obj$和i_obj4中的dataobj#记录一致，这里修改obj$中的对应记录

SQL> select dbms_rowid.rowid_relative_fno(rowid) file#,dbms_rowid.rowid_block_number(rowid) block#,
dbms_rowid.rowid_row_number(rowid) row#
from obj$ where rowid='AAAAASAABAAAADxAAb'  2    3
  4  /
     FILE#     BLOCK#	    ROW#
---------- ---------- ----------
	 1	  241	      27
SQL> select dump(97109,16) from dual;
DUMP(97109,16)
----------------------
Typ=2 Len=4: c3,a,48,a
SQL> select dump(97094,16) from dual;
DUMP(97094,16)
-----------------------
Typ=2 Len=4: c3,a,47,5f
-bash-4.1$ bbed blocksize=8192 mode=edit filename=/u01/app/oracle/oradata/oa/system01.dbf
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Sat Mar 14 19:30:18 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> show all
	FILE#          	0
	BLOCK#         	1
	OFFSET         	0
	DBA            	0x00000000 (0 0,1)
	FILENAME       	/u01/app/oracle/oradata/oa/system01.dbf
	BIFILE         	bifile.bbd
	LISTFILE
	BLOCKSIZE      	8192
	MODE           	Edit
	EDIT           	Unrecoverable
	IBASE          	Dec
	OBASE          	Dec
	WIDTH          	80
	COUNT          	512
	LOGFILE        	log.bbd
	SPOOL          	No
BBED> set block 241
	BLOCK#         	241
BBED> map
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 241                                   Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 48 bytes                     @20
 struct kdbh, 14 bytes                      @68
 struct kdbt[1], 4 bytes                    @82
 sb2 kdbr[105]                              @86
 ub1 freespace[87]                          @296
 ub1 rowdata[7805]                          @383
 ub4 tailchk                                @8188
BBED> p *kdbr[27]
rowdata[0]
----------
ub1 rowdata[0]                              @383      0x2c
BBED> x /rnnncnnncc
rowdata[0]                                  @383
----------
flag@383:  0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@384:  0x00
cols@385:    18
col    0[2] @386: 1
col    1[4] @389: 97094
col    2[1] @394: 0
col   3[12] @396: _NEXT_OBJECT
col    4[2] @409: 1
col    5[0] @412: *NULL*
col    6[1] @413: 0
col    7[7] @415: xm....4
col    8[7] @423: xs....6
col    9[7] @431: xm....4
col   10[1] @439: .
col   11[0] @441: *NULL*
col   12[0] @442: *NULL*
col   13[1] @443: .
col   14[0] @445: *NULL*
col   15[1] @446: .
col   16[4] @448: ..8$
col   17[1] @453: .
BBED> set count 32
	COUNT          	32
BBED> set offset 389
	OFFSET         	389
BBED> d
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 241              Offsets:  389 to  420           Dba:0x00000000
------------------------------------------------------------------------
 04c30a47 5f01800c 5f4e4558 545f4f42 4a454354 02c102ff 01800778 6d080f01
 <32 bytes per line>
BBED> set offset +3
	OFFSET         	392
BBED> d
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 241              Offsets:  392 to  423           Dba:0x00000000
------------------------------------------------------------------------
 475f0180 0c5f4e45 58545f4f 424a4543 5402c102 ff018007 786d080f 01113407
 <32 bytes per line>
BBED> m /x 480a
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 241              Offsets:  392 to  423           Dba:0x00000000
------------------------------------------------------------------------
 480a0180 0c5f4e45 58545f4f 424a4543 5402c102 ff018007 786d080f 01113407
 <32 bytes per line>
BBED>  p *kdbr[27]
rowdata[0]
----------
ub1 rowdata[0]                              @383      0x2c
BBED> x /rnnncnnncc
rowdata[0]                                  @383
----------
flag@383:  0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@384:  0x00
cols@385:    18
col    0[2] @386: 1
col    1[4] @389: 97109
col    2[1] @394: 0
col   3[12] @396: _NEXT_OBJECT
col    4[2] @409: 1
col    5[0] @412: *NULL*
col    6[1] @413: 0
col    7[7] @415: xm....4
col    8[7] @423: xs....6
col    9[7] @431: xm....4
col   10[1] @439: .
col   11[0] @441: *NULL*
col   12[0] @442: *NULL*
col   13[1] @443: .
col   14[0] @445: *NULL*
col   15[1] @446: .
col   16[4] @448: ..8$
col   17[1] @453: .
BBED> sum apply
Check value for File 0, Block 241:
current = 0x913d, required = 0x913d

验证bbed修改后效果

SQL> startup
ORACLE instance started.
Total System Global Area 2.6991E+10 bytes
Fixed Size		    2213976 bytes
Variable Size		 1.9327E+10 bytes
Database Buffers	 7516192768 bytes
Redo Buffers		  145174528 bytes
Database mounted.
Database opened.
SQL> select /*+ index(t i_obj4) */ rowid,DATAOBJ#,type#,owner# from obj$  t
minus
select /*+ full(t1) */ rowid,DATAOBJ#,type#,owner# from obj$  t1;
  2    3
no rows selected
SQL>  select /*+ full(t1) */ rowid,DATAOBJ#,type#,owner# from obj$  t1
  minus
  select /*+ index(t i_obj4) */ rowid,DATAOBJ#,type#,owner# from obj$  t
  ;
  2    3    4
no rows selected
SQL> ANALYZE TABLE sys.obj$ VALIDATE STRUCTURE CASCADE;
ANALYZE TABLE sys.obj$ VALIDATE STRUCTURE CASCADE
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file
SQL> create table t_xifenfei as select * from dual;
create table t_xifenfei as select * from dual
                                         *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)

这里比较悲剧,我们看到i_obj4和obj$中对应记录已经一致(条数和数据值),但是依然不能执行创建表操作,依旧报ORA-8102错误.

进一步分析错误原因

SQL> ALTER SESSION SET EVENTS '802 trace name errorstack level 3';
Session altered.
SQL> create table t as select * from dual;
create table t as select * from dual
                                *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)
SQL> select value from v$diag_info where name='Default Trace File';
VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6163.trc
oer 8102.2 - obj# 39, rdba: 0x00416e92(afn 1, blk# 93842)
kdk key 8102.2:
  ncol: 4, len: 16
  key: (16):  04 c3 0a 48 0a 01 80 01 80 06 00 40 00 f1 00 1b
--这里可以看出来,提示ORA-8102错误依旧在I_OBJ4,97109记录上
SQL> select max(dataobj#) from obj$;
MAX(DATAOBJ#)
-------------
	96815
SQL> select name,obj#,dataobj# from obj$ where rowid='AAAAASAABAAAADxAAb';
NAME				     OBJ#   DATAOBJ#
------------------------------ ---------- ----------
_NEXT_OBJECT				1      97109
--这里很奇怪，通过rowid查询我们已经的出来在obj$中有dataobj#为97109,而通过max(dataobj#)只有96815
分析max(dataobj#)执行计划
SQL> SET AUTOT TRACE
SQL> select max(dataobj#) from obj$;
Execution Plan
----------------------------------------------------------
Plan hash value: 721075849
------------------------------------------------------------------------------------
| Id  | Operation		   | Name   | Rows  | Bytes | Cost (%CPU)| Time    |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	   |	    |	  1 |	  2 |	  2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE 	   |	    |	  1 |	  2 |		 |    |
|   2 |   INDEX FULL SCAN (MIN/MAX)| I_OBJ4 |	  1 |	  2 |	  2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
	  2  consistent gets
	  0  physical reads
	  0  redo size
	533  bytes sent via SQL*Net to client
	520  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed
--这里知晓,由于max(dataobj#)使用了INDEX FULL SCAN (MIN/MAX)执行计划,从而的出来最大值为96815,
--而我们从ORA-8102错误中可以看到index中有dataobj#为97109,证明index中的链表可能出问题
--为什么怀疑是链表有问题呢？因为该index的ffs正常

尝试把ORA-8102报错标记坏块尝试
并且通过event和隐含参数屏蔽index坏块

-bash-4.1$ bbed blocksize=8192 mode=edit filename=/u01/app/oracle/oradata/oa/system01.dbf
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Sat Mar 14 20:30:58 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set block 93842
	BLOCK#         	93842
BBED> set offset 8188
	OFFSET         	8188
BBED> d
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 93842            Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 010675ad
 <32 bytes per line>
BBED> m /x 02
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 93842            Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 020675ad
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 93842:
current = 0x9186, required = 0x9186
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/app/oracle/oradata/oa/system01.dbf
BLOCK = 93842
Block 93842 is corrupt
Corrupt block relative dba: 0x00416e92 (file 0, block 93842)
Fractured block found during verification
Data in bad block:
 type: 6 format: 2 rdba: 0x00416e92
 last change scn: 0x0000.c007ad75 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xad750602
 check value in block header: 0x9186
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
-bash-4.1$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on Sat Mar 14 20:33:19 2015
Copyright (c) 1982, 2009, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 2.6991E+10 bytes
Fixed Size		    2213976 bytes
Variable Size		 1.9327E+10 bytes
Database Buffers	 7516192768 bytes
Redo Buffers		  145174528 bytes
Database mounted.
Database opened.
SQL> create table t as select * from dual;
create table t as select * from dual
                                *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 93842)
ORA-01110: data file 1: '/u01/app/oracle/oradata/oa/system01.dbf'

通过这一步测试证明,在该ora-8102的错误中,通过坏块是无法解决绕过去该错误,只是把错误从ORA-8102转变为了ORA-1578

修复好制造坏块block

BBED> m /x 01
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 93842            Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 010675ad
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 93842:
current = 0x9185, required = 0x9185
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/app/oracle/oradata/oa/system01.dbf
BLOCK = 93842
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 1
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED
SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 2.6991E+10 bytes
Fixed Size		    2213976 bytes
Variable Size		 1.9327E+10 bytes
Database Buffers	 7516192768 bytes
Redo Buffers		  145174528 bytes
Database mounted.
Database opened.
SQL> create table t1 as select * from dual;
create table t1 as select * from dual
                                 *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)

至此我们大体出来信息：
1. ORA-8102的是I_OBJ4中的_NEXT_OBJECT记录异常和该index链表异常
2. 通过bbed修改I_OBJ4和obj$中相应记录无法解决该问题,因为还有链表异常
3. 通过标记为坏块也无法绕过该问题
由于后续如果继续修复修复i_obj4可能工作量过大,而且可以也比较急,通过人工直接删除I_OBJ4数据字典记录,然后结合bootstrap$核心index(I_OBJ1,I_USER1,I_FILE#_BLOCK#,I_IND1,I_TS#,I_CDEF1等)异常恢复—ORA-00701错误解决处理实现完美恢复

分类目录归档：非常规恢复