pvid=yes导致asm无法mount

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:pvid=yes导致asm无法mount

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

今天凌晨接到客户恢复请求,对于aix rac,两个ibm存储做mirror的环境中,客户做存储容灾演练,发现磁盘的名称发生改变,然后对其中一个磁盘设置pvid,结果悲剧了导致asm一个磁盘组无法正常起来。然后又aix端删除这些设备,然后重新扫描设备。结果不是一个磁盘组不能mount,而是整个gi就无法正常启动。希望我们给予技术支持。
查看asm 日志,确定asm disk信息
asm-disk1
asm-disk2


从这里可以确定,一共有两个asm diskgroup,每个group有两个磁盘,hdisk2和hdisk3 为hisdata,hdisk4,和hdisk5为emrdata.

使用kfed分析磁盘头

dd if=/dev/rhdisk2 of=/tmp/xifenfei/rhdisk2.dd bs=1024k count=10
dd if=/dev/rhdisk3 of=/tmp/xifenfei/rhdisk3.dd bs=1024k count=10
dd if=/dev/rhdisk4 of=/tmp/xifenfei/rhdisk4.dd bs=1024k count=10
dd if=/dev/rhdisk5 of=/tmp/xifenfei/rhdisk5.dd bs=1024k count=10
--传输到我电脑上分析
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk2.dd|grep name
kfdhdb.dskname:            HISDATA_0000 ; 0x028: length=12
kfdhdb.grpname:                 HISDATA ; 0x048: length=7
kfdhdb.fgname:             HISDATA_0000 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk3.dd|grep name
kfdhdb.dskname:            HISDATA_0001 ; 0x028: length=12
kfdhdb.grpname:                 HISDATA ; 0x048: length=7
kfdhdb.fgname:             HISDATA_0001 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk4.dd|grep name
kfdhdb.dskname:            EMRDATA_0000 ; 0x028: length=12
kfdhdb.grpname:                 EMRDATA ; 0x048: length=7
kfdhdb.fgname:             EMRDATA_0000 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd|grep name
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd
kfbh.endian:                        201 ; 0x000: 0xc9
kfbh.hard:                          194 ; 0x001: 0xc2
kfbh.type:                          212 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                        193 ; 0x003: 0xc1
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
000000000 C1D4C2C9 00000000 00000000 00000000  [................]
000000010 00000000 00000000 00000000 00000000  [................]
  Repeat 254 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][212]
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd blkn=2|grep kfbh
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                33554432 ; 0x004: blk=33554432
kfbh.block.obj:                16777344 ; 0x008: file=128
kfbh.check:                  2654889601 ; 0x00c: 0x9e3e6681
kfbh.fcn.base:               1696071680 ; 0x010: 0x65180000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd blkn=510|grep name
kfdhdb.dskname:            EMRDATA_0001 ; 0x028: length=12
kfdhdb.grpname:                 EMRDATA ; 0x048: length=7
kfdhdb.fgname:             EMRDATA_0001 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0

通过上述分析,基本上确定由于对hdisk5设置了pvid导致该asm disk的磁盘头损坏.这个可以直接使用asm repair功能修复(注意要clear pvid)

C:\Users\FAL>kfed read H:\temp\xifenfei\tmp\xifenfei\rhdisk5.dd |grep name
kfdhdb.dskname:            EMRDATA_0001 ; 0x028: length=12
kfdhdb.grpname:                 EMRDATA ; 0x048: length=7
kfdhdb.fgname:             EMRDATA_0001 ; 0x068: length=12
kfdhdb.capname:                         ; 0x088: length=0

启动crs到cssd进程报错分析
1. 由于删除磁盘,扫描设备导致hdisk[2-5] 权限和用户组不对
2. 由于删除,扫描磁盘导致磁盘共享模式不对
修复磁盘头和解决这两个问题之后,gi启动正常,磁盘组也正常mount,数据库也正常启动,数据0丢失,至此完美恢复
oracle-open


类似客户恢复案例:asm disk误设置pvid导致asm diskgroup无法mount恢复
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

ORA-01555 ORA-600 kdiulk:kcbz_objdchk ORA-600 kdBlkCheckError等错误恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-01555 ORA-600 kdiulk:kcbz_objdchk ORA-600 kdBlkCheckError等错误恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动ORA-00704,0RA-00604,ORA-01555导致数据库无法启动

Tue May 31 17:32:42 2016
SMON: enabling cache recovery
SUCCESS: diskgroup RECOVERY was mounted
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0004.3af84bee):
select ctime, mtime, stime from obj$ where obj# = :1
Archived Log entry 5 added for thread 1 sequence 10 ID 0x86a261e7 dest 1:
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_12779.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 7 with name "_SYSSMU7_1592079335$" too small
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_12779.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 7 with name "_SYSSMU7_1592079335$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 12779): terminating the instance due to error 704

通过bbed修改事务之后启动数据库

Tue May 31 17:35:49 2016
SMON: enabling tx recovery
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
         This condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
Updating character set in controlfile to AL32UTF8
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p021_13862.trc  (incident=166002):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p010_13818.trc  (incident=165914):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p004_13794.trc  (incident=165866):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p011_13822.trc  (incident=165922):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p016_13842.trc  (incident=165962):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []

ORA-600 [kdiulk:kcbz_objdchk] trace文件

*** SESSION ID:(3.5) 2016-05-31 17:35:50.068
OBJD MISMATCH typ=6, seg.obj=-2, diskobj=222225, dsflg=0, dsobj=285890, tid=285890, cls=1
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Parallel Transaction recovery server caught exception 600
begin Parallel Recovery Context Dump
nsi: 48, nsactive: 48
, nirsi: 1, nidti: 1, ndt: 1, rescan: 0, ptrs: 48
[ktprsi] wdone: 50
[ktpritp 378651b8] ktprsi:
37903b60 37903b78 37903b90 37903ba8 37903bc0 37903bd8 37903bf0 37903c08 37903c20 37903c38 37903c50
37903c68 37903c80 37903c98 37903cb0 37903cc8 37903ce0 37903cf8 37903d10 37903d28 37903d40 37903d58
37903d70 37903d88 37903da0 37903db8 37903dd0 37903de8 37903e00 37903e18 37903e30 37903e48 37903e60
37903e78 37903e90 37903ea8 37903ec0 37903ed8 37903ef0 37903f08 37903f20 37903f38 37903f50 37903f68
37903f80 37903f98 37903fb0 37903fc8
[ktprht] nhb: 47, nfl: 20247, flg: 2
*** 2016-05-31 17:36:08.584
[ktprhb] nfl: 1, nelem: 97, flg: 0, sqn: 1
flist: 37698940
nhe: [ktprhe 32] sqn: -1297235803
[kturur] uoff: -1797708320, sqn: 4
uba: 0x098004cd.07e4.0b
*-----------------------------
* Rec #0xb  slt: 0x07  objn: 123986(0x0001e452)  objd: 285891  tblspc: 10(0x0000000a)
*       Layer:  10 (Index)   opc: 22   rci 0x0a
Undo type:  Regular undo   Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
rdba: 0x00000000

这里基本上可以确定是由于undo index中的dataobj#和block中的dataobj# 不匹配.在数据库undo回滚之时出现该错误.可以通过跳过undo回滚,然后重建对象

Tue May 31 17:36:06 2016
Simulated error on redo application.
Block recovery from logseq 12, block 959 to scn 20401094719
Recovery of Online Redo Log: Thread 1 Group 3 Seq 12 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_3.263.802446627
Block recovery completed at rba 12.1012.16, scn 4.3221225536
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Simulated error for redo application done.
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p009_13814.trc  (incident=165906):
ORA-00600: 内部错误代码, 参数: [kdBlkCheckError], [26], [950417], [18025], [], [], [], [], [], [], [], []

这些错误是由于数据库block逻辑异常导致,错过参数含义
在10g中ORA-600 kddummy_blkchk 在11g中ORA-600 kdBlkCheckError

ARGUMENTS:
Arg [a] Absolute file number
Arg [b] Bock number
Arg  Internal error code returned from kcbchk() which indicates the problem encountered.
See Note 46389.1 for details of block check codes.

根据QREF kddummy_blkchk / kdBlkCheckError Check Codes Listing (Full) (Doc ID 1264040.1)分析
这里的18025是代码的KCBTEMAP_EC_START + KTS4_EC_SBFREE部分异常,主要表现在Incorrect firstfree or nfree 可以通过设置一些参数进行屏蔽

在恢复过程中还有其他错误

ORA-600 encountered when generating server alert SMG-4128
ORA-00600: internal error code, arguments: [ktcpoptx:!cmt top lvl], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4406], [0x1026B65348], [0x000000000], [2], [6215], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [ktcpoptx:!cmt top lvl], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
ORACLE Instance xifenfei (pid = 15) - Error 600 encountered while recovering transaction (10, 7) on object 123986.
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kturbleurec1], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kewrose_1], [600],
  [ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.

通过整体分析错误主要是由于undo异常导致,通过设置_corrupted_rollback_segments设置db_block_checking等相关参数,清理SMON_SCN_TIME等操作数据库没有其他异常报错,让其通过逻辑方式重建库

强制关机导致数据库无法正常启动恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:强制关机导致数据库无法正常启动恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户qq找到我,说有朋友推荐,让我帮他们恢复数据库.由于强制关机后,数据库无法正常启动.
数据库recover database失败

Mon Mar 28 10:20:33 2016
ALTER DATABASE RECOVER  database
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 32 slaves
Mon Mar 28 10:20:36 2016
Recovery of Online Redo Log: Thread 1 Group 2 Seq 18686 Reading mem 0
  Mem# 0: E:\ORACLE_DATA\YCCY\REDO02.LOG
Recovery of Online Redo Log: Thread 1 Group 3 Seq 18687 Reading mem 0
  Mem# 0: E:\ORACLE_DATA\YCCY\REDO03.LOG
Recovery of Online Redo Log: Thread 1 Group 1 Seq 18688 Reading mem 0
  Mem# 0: E:\ORACLE_DATA\YCCY\REDO01.LOG
Mon Mar 28 10:20:38 2016
Hex dump of (file 45, block 7431) in trace file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0q_2968.trc
Corrupt block relative dba: 0x0b401d07 (file 45, block 7431)
Mon Mar 28 10:20:38 2016
Hex dump of (file 45, block 7836) in trace file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr01_2220.trc
Bad header found during media recovery
Corrupt block relative dba: 0x0b401e9c (file 45, block 7836)
Data in bad block:
Bad header found during media recovery
 type: 0 format: 0 rdba: 0x1d070000
 last change scn: 0x4917.f8dc0b40 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0xc7f7
 consistency value in tail: 0x06010000
 check value in block header: 0x601
 block checksum disabled
Reading datafile 'E:\ORACLE_DATA\YCCY\DT_SYS_IDX12.DBF' for corruption at rdba: 0x0b401d07 (file 45, block 7431)
Reread (file 45, block 7431) found valid data
Repaired corruption at (file 45, block 7431)
Hex dump of (file 45, block 7556) in trace file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0q_2968.trc
Corrupt block relative dba: 0x0b401d84 (file 45, block 7556)
Bad header found during media recovery
Data in bad block:
 type: 106 format: 3 rdba: 0x1d840000
 last change scn: 0x461d.391a0b40 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0x2499
 consistency value in tail: 0x06013999
 check value in block header: 0x401
 block checksum disabled
Reading datafile 'E:\ORACLE_DATA\YCCY\DT_SYS_IDX12.DBF' for corruption at rdba: 0x0b401d84 (file 45, block 7556)
Reread (file 45, block 7556) found valid data
Repaired corruption at (file 45, block 7556)
Mon Mar 28 10:20:38 2016
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x1334748, kcbzfw()+3094]
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0k_3900.trc  (incident=131189):
ORA-00600: internal error code, arguments: [kcbr_validate_read_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131189\yccy_pr0k_3900_i131189.trc
ERROR: Unable to normalize symbol name for the following short stack (at offset 199):
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0r_3060.trc  (incident=131245):
ORA-07445: exception encountered: core dump [kcbzfw()+3094] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x1334748] [UNABLE_TO_READ] []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 169345, file offset is 1387274240 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: data file 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131245\yccy_pr0r_3060_i131245.trc
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C, kcbzdh()+942]
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0d_2112.trc  (incident=131133):
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131133\yccy_pr0d_2112_i131133.trc
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0e_3260.trc  (incident=131141):
ORA-00600: internal error code, arguments: [3020], [5], [163457], [21134977], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 163457, file offset is 1339039744 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: data file 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131141\yccy_pr0e_3260_i131141.trc
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr04_3980.trc  (incident=131021):
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131021\yccy_pr04_3980_i131021.trc
Data in bad block:
 type: 0 format: 0 rdba: 0x1e9c0000
 last change scn: 0x4915.f8320b40 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0x8029
 consistency value in tail: 0x0602e40c
 check value in block header: 0x602
 block checksum disabled
Reading datafile 'E:\ORACLE_DATA\YCCY\DT_SYS_IDX12.DBF' for corruption at rdba: 0x0b401e9c (file 45, block 7836)
Reread (file 45, block 7836) found valid data
Repaired corruption at (file 45, block 7836)
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0f_816.trc  (incident=131149):
ORA-00600: internal error code, arguments: [kcbr_validate_read_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131149\yccy_pr0f_816_i131149.trc
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C, kcbzdh()+942]
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0i_2132.trc  (incident=131173):
ORA-00600: internal error code, arguments: [3020], [5], [154240], [21125760], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 154240, file offset is 1263534080 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: data file 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131173\yccy_pr0i_2132_i131173.trc
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0k_3900.trc  (incident=131190):
ORA-07445: exception encountered: core dump [kcbzdh()+942] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kcbr_validate_read_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131190\yccy_pr0k_3900_i131190.trc
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr01_2220.trc  (incident=131037):
ORA-00600: internal error code, arguments: [kcbrapply_14], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131037\yccy_pr01_2220_i131037.trc
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C, kcbzdh()+942]
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0f_816.trc  (incident=131150):
ORA-07445: exception encountered: core dump [kcbzdh()+942] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kcbr_validate_read_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131150\yccy_pr0f_816_i131150.trc
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr01_2220.trc  (incident=131038):
ORA-07445: exception encountered: core dump [kcbzdh()+942] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kcbrapply_14], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131038\yccy_pr01_2220_i131038.trc
Mon Mar 28 10:20:39 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0h_4036.trc  (incident=131165):
ORA-00600: internal error code, arguments: [kcbr_validate_read_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131165\yccy_pr0h_4036_i131165.trc
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C, kcbzdh()+942]
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC13B, kcbzpnd()+299]
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x1351BB9, kcbs_dump_adv_state()+1529]
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC13B, kcbzpnd()+299]
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0h_4036.trc  (incident=131166):
ORA-07445: exception encountered: core dump [kcbzdh()+942] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC62C] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kcbr_validate_read_1], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131166\yccy_pr0h_4036_i131166.trc
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC13B, kcbzpnd()+299]
Mon Mar 28 10:20:40 2016
Checker run found 60 new persistent data failures
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0d_2112.trc  (incident=131134):
ORA-07445: exception encountered: core dump [kcbzpnd()+299] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC13B] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131134\yccy_pr0d_2112_i131134.trc
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr04_3980.trc  (incident=131022):
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+1529] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x1351BB9] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131022\yccy_pr04_3980_i131022.trc
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0e_3260.trc  (incident=131142):
ORA-07445: exception encountered: core dump [kcbzpnd()+299] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC13B] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [3020], [5], [163457], [21134977], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 163457, file offset is 1339039744 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: data file 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131142\yccy_pr0e_3260_i131142.trc
Mon Mar 28 10:20:41 2016
Trace dumping is performing id=[cdmp_20160328102041]
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr0i_2132.trc  (incident=131174):
ORA-07445: exception encountered: core dump [kcbzpnd()+299] [ACCESS_VIOLATION] [ADDR:0xFFFFFFFFFFFFFFFF] [PC:0x12EC13B] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [3020], [5], [154240], [21125760], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 154240, file offset is 1263534080 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: data file 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131174\yccy_pr0i_2132_i131174.trc
Mon Mar 28 10:20:41 2016
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x2E7FFFFFE] [PC:0x74CAE3F0, 0000000074CAE3F0]
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pr06_2684.trc  (incident=131077):
ORA-07445: exception encountered: core dump [PC:0x74CAE3F0] [ACCESS_VIOLATION] [ADDR:0x2E7FFFFFE] [PC:0x74CAE3F0] [UNABLE_TO_READ] []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131077\yccy_pr06_2684_i131077.trc
Mon Mar 28 10:20:42 2016
Exception [type: ACCESS_VIOLATION, UNABLE_TO_WRITE] [ADDR:0x0] [PC:0x4D20D2, kslgetl()+54]
Mon Mar 28 10:20:42 2016
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_pmon_3856.trc  (incident=130853):
ORA-07445: exception encountered: core dump [kslgetl()+54] [ACCESS_VIOLATION] [ADDR:0x0] [PC:0x4D20D2] [UNABLE_TO_WRITE] []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_130853\yccy_pmon_3856_i130853.trc
Trace dumping is performing id=[cdmp_20160328102042]
Errors in file d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_131077\yccy_pr06_2684_i131077.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [PC:0x74CAE3F0] [ACCESS_VIOLATION] [ADDR:0x2E7FFFFFE] [PC:0x74CAE3F0] [UNABLE_TO_READ] []
Process debug not enabled via parameter _debug_enable
Trace dumping is performing id=[cdmp_20160328102043]
Mon Mar 28 10:21:01 2016
RECO (ospid: 3524): terminating the instance due to error 472
Instance terminated by RECO, pid = 3524

通过观察这段日志,基本上可以发现主要是FILE 45,虽然提示坏块但是最终验证确定为正常块(类似:Reread (file 45, block 7836) found valid data),这里主要是file 5,报了大量的ORA-600[3020].

对数据文件逐个进行recover操作

SQL> startup mount;
ORACLE 例程已经启动。
Total System Global Area 1.7103E+10 bytes
Fixed Size                  2192864 bytes
Variable Size            9059699232 bytes
Database Buffers         8019509248 bytes
Redo Buffers               21762048 bytes
数据库装载完毕。
SQL> recover datafile 1;
完成介质恢复。
SQL> recover  datafile 2;
ORA-03113: 通信通道的文件结尾
进程 ID: 1652
会话 ID: 551 序列号: 55
SQL> recover datafile 3;
完成介质恢复。
SQL> recover datafile 4;
完成介质恢复。
SQL> recover datafile 5;
ORA-03113: 通信通道的文件结尾
进程 ID: 4900
会话 ID: 551 序列号: 56131
SQL> recover datafile 6;
完成介质恢复。
…………
SQL> recover datafile 63;
完成介质恢复。
SQL> recover datafile 64;
完成介质恢复。

除掉datafile 2,5之外,其他文件全部recover成功.

对于file 2 尝试处理
无法通过recover成功,只能暂时放弃,后续考虑先offline open库,然后把这个文件强制online

SQL> recover  datafile 2 ;
ORA-03113: 通信通道的文件结尾
进程 ID: 5020
会话 ID: 551 序列号: 3
Mon Mar 28 10:47:12 2016
ALTER DATABASE RECOVER  datafile 2
Media Recovery Start
Serial Media Recovery started
Recovery of Online Redo Log: Thread 1 Group 1 Seq 18688 Reading mem 0
  Mem# 0: E:\ORACLE_DATA\YCCY\REDO01.LOG
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x2E7FFFFFE] [PC:0x74CAE3F0, 0000000074CAE3F0]
Errors in file d:\oracle\diag\rdbms\yccy\yccy\trace\yccy_ora_3508.trc  (incident=143022):
ORA-07445: 出现异常错误: 核心转储 [PC:0x74CAE3F0] [ACCESS_VIOLATION] [ADDR:0x2E7FFFFFE] [PC:0x74CAE3F0] [UNABLE_TO_READ] []
Incident details in: d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_143022\yccy_ora_3508_i143022.trc
Errors in file d:\oracle\diag\rdbms\yccy\yccy\incident\incdir_143022\yccy_ora_3508_i143022.trc:
ORA-00607: 当更改数据块时出现内部错误
ORA-00602: 内部编程异常错误
ORA-07445: 出现异常错误: 核心转储 [PC:0x74CAE3F0] [ACCESS_VIOLATION] [ADDR:0x2E7FFFFFE] [PC:0x74CAE3F0] [UNABLE_TO_READ] []

对于file 5处理

SQL> recover datafile 5;
ORA-00283: 恢复会话因错误而取消
ORA-00600: 内部错误代码, 参数: [3020], [5], [163457], [21134977], [], [], [],
[], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 163457, file
offset is 1339039744 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: 数据文件 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
SQL> recover  datafile 5 allow 1 corruption;
ORA-00283: 恢复会话因错误而取消
ORA-00600: 内部错误代码, 参数: [3020], [5], [162433], [21133953], [], [], [],
[], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 162433, file
offset is 1330651136 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: 数据文件 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
SQL> recover  datafile 5 allow 1 corruption;
ORA-00283: 恢复会话因错误而取消
ORA-00600: 内部错误代码, 参数: [3020], [5], [166272], [21137792], [], [], [],
[], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 166272, file
offset is 1362100224 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: 数据文件 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
SQL> recover  datafile 5 allow 1 corruption;
ORA-00283: 恢复会话因错误而取消
ORA-00600: 内部错误代码, 参数: [3020], [5], [169346], [21140866], [], [], [],
[], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 5, block# 169346, file
offset is 1387282432 bytes)
ORA-10564: tablespace DT_SYS_DAT
ORA-01110: 数据文件 5: 'E:\ORACLE_DATA\YCCY\DT_SYS_DAT.ORA'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
SQL> recover  datafile 5 allow 1 corruption;
完成介质恢复。

open数据库并online datafile 2

SQL> startup pfile='d:/pfile.txt' mount;
ORACLE 例程已经启动。
Total System Global Area 1.7103E+10 bytes
Fixed Size                  2192864 bytes
Variable Size            9059699232 bytes
Database Buffers         8019509248 bytes
Redo Buffers               21762048 bytes
数据库装载完毕。
SQL> alter database datafile 2 offline;
数据库已更改。
SQL> alter database open;
数据库已更改。
SQL> shutdown immediate;
ORA-03113: 通信通道的文件结尾
SQL> conn / as sysdba
已连接到空闲例程。
SQL> startup pfile='d:/pfile.txt' mount;
ORACLE 例程已经启动。
Total System Global Area 1.7103E+10 bytes
Fixed Size                  2192864 bytes
Variable Size            9059699232 bytes
Database Buffers         8019509248 bytes
Redo Buffers               21762048 bytes
数据库装载完毕。
SQL> select group#,status from v$log;
    GROUP# STATUS
---------- ----------------
         1 INACTIVE
         3 INACTIVE
         2 CURRENT
SQL> recover database until cancel;
ORA-00279: 更改 1226478477 (在 03/28/2016 20:23:37 生成) 对于线程 1 是必需的
ORA-00289: 建议:
D:\ORACLE\FLASH_RECOVERY_AREA\YCCY\ARCHIVELOG\2016_03_28\O1_MF_1_18689_%U_.ARC
ORA-00280: 更改 1226478477 (用于线程 1) 在序列 #18689 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
E:\ORACLE_DATA\YCCY\REDO02.LOG
已应用的日志。
完成介质恢复。
SQL> alter database datafile 2 online;
数据库已更改。
SQL> alter database open resetlogs;
数据库已更改。

数据库基本上属于正常打开,处理掉3020部分的坏块基本ok

ORA-01172 ORA-01151 故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-01172 ORA-01151 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户存储异常断电,导致数据库启动报ORA-01172错,导致数据库无法open
数据库启动报ORA-01172错误

Wed Mar 23 14:16:23 2016
ALTER DATABASE OPEN
Wed Mar 23 14:16:24 2016
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Wed Mar 23 14:16:24 2016
Started redo scan
Wed Mar 23 14:16:25 2016
Completed redo scan
 62588 redo blocks read, 15 data blocks need recovery
Wed Mar 23 14:16:25 2016
Started redo application at
 Thread 1: logseq 15050, block 2, scn 2439828667
Wed Mar 23 14:16:25 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 15050 Reading mem 0
  Mem# 0 errs 0: /oracle/oradata/orcl/redo01.log
Wed Mar 23 14:16:25 2016
Completed redo application
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 26185 OF FILE 3
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 69385 OF FILE 3
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 566 OF FILE 2
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 89 OF FILE 2
Wed Mar 23 14:16:25 2016
RECOVERY OF THREAD 1 STUCK AT BLOCK 53769 OF FILE 3
Wed Mar 23 14:16:26 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p012_6540.trc:
ORA-01172: recovery of thread 1 stuck at block 566 of file 2
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:26 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p008_6532.trc:
ORA-01172: recovery of thread 1 stuck at block 53769 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:26 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p011_6538.trc:
ORA-01172: recovery of thread 1 stuck at block 69385 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:26 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p005_6526.trc:
ORA-01172: recovery of thread 1 stuck at block 26185 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:27 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p014_6544.trc:
ORA-01172: recovery of thread 1 stuck at block 89 of file 2
ORA-01151: use media recovery to recover block, restore backup if needed
Wed Mar 23 14:16:27 2016
Aborting crash recovery due to slave death, attempting serial crash recovery
Wed Mar 23 14:16:27 2016
Beginning crash recovery of 1 threads
Wed Mar 23 14:16:27 2016
Started redo scan
Wed Mar 23 14:16:27 2016
Completed redo scan
 62588 redo blocks read, 15 data blocks need recovery
Wed Mar 23 14:16:27 2016
Started redo application at
 Thread 1: logseq 15050, block 2, scn 2439828667
Wed Mar 23 14:16:27 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 15050 Reading mem 0
  Mem# 0 errs 0: /oracle/oradata/orcl/redo01.log
RECOVERY OF THREAD 1 STUCK AT BLOCK 566 OF FILE 2
Wed Mar 23 14:16:27 2016
Aborting crash recovery due to error 1172
Wed Mar 23 14:16:27 2016
Errors in file /oracle/admin/orcl/udump/orcl_ora_6514.trc:
ORA-01172: recovery of thread 1 stuck at block 566 of file 2
ORA-01151: use media recovery to recover block, restore backup if needed
ORA-1172 signalled during: ALTER DATABASE OPEN...

ALTER DATABASE RECOVER datafile 1 报错
尝试recover datafile 1之后报ORA-600 kcbrapply_4,ORA-600 kcfrbd_3,ORA-600 kcbrapply_12等错误,从报错信息看,出现这些错误的原因,是由于断电导致坏块引起.

Thu Mar 24 21:50:18 2016
ALTER DATABASE RECOVER  datafile 1
Thu Mar 24 21:50:18 2016
Media Recovery Start
 parallel recovery started with 15 processes
Thu Mar 24 21:50:18 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 15050 Reading mem 0
  Mem# 0 errs 0: /oracle/oradata/orcl/redo01.log
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p004_13391.trc:
ORA-00600: internal error code, arguments: [kcbrapply_4], [2], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p010_13403.trc:
ORA-00600: internal error code, arguments: [kcbrapply_4], [0], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p000_13383.trc:
ORA-00600: internal error code, arguments: [kcbrapply_4], [0], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p009_13401.trc:
ORA-00600: internal error code, arguments: [kcbrapply_4], [3], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Hex dump of (file 1, block 61562) in trace file /oracle/admin/orcl/bdump/orcl_p001_13385.trc
Corrupt block relative dba: 0x0040f07a (file 1, block 61562)
Bad header found during media recovery
Data in bad block:
 type: 0 format: 0 rdba: 0xf07a0000
 last change scn: 0x916c.dc4b0040 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0xb088
 consistency value in tail: 0x06010fc1
 check value in block header: 0x601
 block checksum disabled
Thu Mar 24 21:50:19 2016
Hex dump of (file 1, block 55706) in trace file /oracle/admin/orcl/bdump/orcl_p014_13411.trc
Corrupt block relative dba: 0x0040d99a (file 1, block 55706)
Bad header found during media recovery
Data in bad block:
 type: 0 format: 0 rdba: 0xd99a0000
 last change scn: 0x916c.e1ad0040 seq: 0x0 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0xa520
 consistency value in tail: 0x06012222
 check value in block header: 0x601
 block checksum disabled
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p006_13395.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [3342335], [1], [0], [64000], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p003_13389.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [3932159], [1], [0], [64000], [], []
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p002_13387.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [2293759], [1], [0], [64000], [], []
Reread of rdba: 0x0040d99a (file 1, block 55706) found valid data
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p014_13411.trc:
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], []
Thu Mar 24 21:50:19 2016
Reread of rdba: 0x0040f07a (file 1, block 61562) found valid data
Thu Mar 24 21:50:19 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13385.trc:
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p014_13411.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9782BF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p006_13395.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9C82BF4] [] []
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [3342335], [1], [0], [64000], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p009_13401.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9A02BF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_4], [3], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p003_13389.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9F02AF4] [] []
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [3932159], [1], [0], [64000], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p004_13391.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xBA182AF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_4], [2], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p010_13403.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xBA402AF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_4], [0], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p000_13383.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9282AF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_4], [0], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13385.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+450] [SIGSEGV] [Address not mapped to object] [0xB9C82AF4] [] []
ORA-00600: internal error code, arguments: [kcbrapply_12], [], [], [], [], [], [], []
Thu Mar 24 21:50:23 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p002_13387.trc:
ORA-10562: Error occurred while applying redo to data block (file# 1, block# 11042)
ORA-10564: tablespace SYSTEM
ORA-01110: data file 1: '/oracle/oradata/orcl/system01.dbf'
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kcfrbd_3], [1], [2293759], [1], [0], [64000], [], []

ALTER DATABASE RECOVER datafile 3 报错
该文件恢复主要报ORA-600 kcbrsearchflist_2,ORA-600 kdxlin:psno out of range,ORA-600 kcbs_dump_adv_state等错误

Thu Mar 24 21:52:04 2016
ALTER DATABASE RECOVER  datafile 3
Thu Mar 24 21:52:04 2016
Media Recovery Start
 parallel recovery started with 15 processes
Thu Mar 24 21:52:04 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 15050 Reading mem 0
  Mem# 0 errs 0: /oracle/oradata/orcl/redo01.log
Thu Mar 24 21:52:05 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p007_13462.trc:
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []
Thu Mar 24 21:52:05 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13450.trc:
ORA-00600: internal error code, arguments: [kcbrsearchflist_2], [], [], [], [], [], [], []
Thu Mar 24 21:52:05 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p007_13462.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9F076F4] [] []
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []
Thu Mar 24 21:52:05 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13450.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9C874F4] [] []
ORA-00600: internal error code, arguments: [kcbrsearchflist_2], [], [], [], [], [], [], []
Thu Mar 24 21:52:06 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p007_13462.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9F066F4] [] []
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9F076F4] [] []
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []
Thu Mar 24 21:52:06 2016
Errors in file /oracle/admin/orcl/bdump/orcl_p001_13450.trc:
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9C864F4] [] []
ORA-07445: exception encountered: core dump [kcbs_dump_adv_state()+464] [SIGSEGV] [Address not mapped to object] [0xB9C874F4] [] []
ORA-00600: internal error code, arguments: [kcbrsearchflist_2], [], [], [], [], [], [], []

恢复过程

SQL> startup mount
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size          1220432 bytes
Variable Size             369098928 bytes
Database Buffers         1761607680 bytes
Redo Buffers               15556608 bytes
Database mounted.
SQL> select file# from v$datafile;
     FILE#
----------
         1
         2
         3
         4
         5
         6
6 rows selected.
SQL> recover datafile 1;
ORA-03113: end-of-file on communication channel
SQL> startup mount
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size          1220432 bytes
Variable Size             369098928 bytes
Database Buffers         1761607680 bytes
Redo Buffers               15556608 bytes
Database mounted.
SQL> recover datafile 3;
ORA-03113: end-of-file on communication channel
SQL> startup mount
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size          1220432 bytes
Variable Size             369098928 bytes
Database Buffers         1761607680 bytes
Redo Buffers               15556608 bytes
Database mounted.
SQL> recover datafile 5;
Media recovery complete.
SQL> recover datafile 6;
Media recovery complete.
SQL> recover datafile 4;
Media recovery complete.
SQL> recover datafile 2;
Media recovery complete.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [57], [11], [], [], [], [],
[]
SQL> select open_mode from v$database;
OPEN_MODE
----------
READ WRITE

这次运气不错,system坏的是mon_mods$,undo异常可以重建,基本上可以说没有数据丢失,数据库恢复完成.
重要的库,通过open过程报错信息,分析可能的坏块所属对象,然后确定处理方法,以免造成永久性数据块损坏.

Solaris rm datafile recovery—利用句柄误删除数据文件恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:Solaris rm datafile recovery—利用句柄误删除数据文件恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

今天早上接到有客户恢复请求,他一不小心在solaris系统中使用rm -rf oradata命令把一个分区下面的所有数据文件全部删除了。现在不知道怎么办,请求我们给予支持.上去检查发现比较幸运,数据库没有crash,看来直接考虑使用句柄进行恢复

root@CNISORCLSVR # uname -a
SunOS CNISORCLSVR 5.9 Generic_112233-08 sun4u sparc SUNW,Sun-Fire-880
root@CNISORCLSVR # ps -ef|grep lgwr
  oracle   597     1  0   Mar 05 ?       17:14 ora_lgwr_xifenfei
    root 28069 28043  0 18:51:17 pts/2    0:00 grep lgwr
root@CNISORCLSVR # ls -ltr
total 189348454
-r--r--r--   1 oracle   dba       657920 Apr 26  2002 12
c---------   1 root     sys       13, 12 Mar 27  2004 8
c---------   1 root     sys       13, 12 Mar 27  2004 10
-rw-r-----   0 oracle   dba      34359730176 Nov 12  2013 291
-rw-r-----   0 oracle   dba      1073750016 Nov 13  2013 293
D---------   1 root     root           0 Mar  5 19:31 11
-rw-r-----   1 oracle   dba         1758 Mar  5 22:04 9
-rw-rw----   1 oracle   dba           24 Mar  5 22:04 13
s---------   0 root     root           0 Mar  8 00:45 14
-rw-r-----   1 oracle   dba      1887444992 Mar 12 03:27 289
-rw-r-----   1 oracle   dba      943726592 Mar 12 11:17 290
-rw-r-----   0 oracle   dba      4294975488 Mar 13 00:09 292
-rw-r-----   0 oracle   dba      268443648 Mar 13 01:33 288
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 279
-rw-r-----   1 oracle   dba      134225920 Mar 13 01:33 278
-rw-r-----   0 oracle   dba      134225920 Mar 13 01:33 269
-rw-r-----   1 oracle   dba      268443648 Mar 13 01:33 267
-rw-r-----   1 oracle   dba      148119552 Mar 13 01:33 266
-rw-r-----   1 oracle   dba      10493952 Mar 13 01:33 265
-rw-r-----   1 oracle   dba      26222592 Mar 13 01:33 264
-rw-r-----   1 oracle   dba      62922752 Mar 13 01:33 263
-rw-r-----   1 oracle   dba      20979712 Mar 13 01:33 262
-rw-r-----   0 oracle   dba      134225920 Mar 13 01:33 287
-rw-r-----   1 oracle   dba      209723392 Mar 13 01:33 285
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 283
-rw-r-----   1 oracle   dba      67117056 Mar 13 01:33 282
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 281
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 280
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 276
-rw-r-----   0 oracle   dba      1073750016 Mar 13 01:33 275
-rw-r-----   0 oracle   dba      2214600704 Mar 13 01:33 274
-rw-r-----   0 oracle   dba      134225920 Mar 13 01:33 273
-rw-r-----   0 oracle   dba      536879104 Mar 13 01:33 272
c---------   1 root     sys       13,  2 Mar 13 02:00 5
c---------   1 root     sys       13,  2 Mar 13 02:00 4
c---------   1 root     sys       13,  2 Mar 13 02:00 3
c---------   1 root     sys       13,  2 Mar 13 02:00 2
c---------   1 root     sys       13,  2 Mar 13 02:00 1
c---------   1 root     sys       13,  2 Mar 13 02:00 0
--w-------   1 oracle   dba      4640842 Mar 13 04:43 7
--w-------   1 oracle   dba      4640842 Mar 13 04:43 6
-rw-r-----   0 oracle   dba      1207967744 Mar 13 18:21 271
-rw-r-----   0 oracle   dba      15929974784 Mar 13 18:39 284
-rw-r-----   0 oracle   dba      134225920 Mar 13 18:45 277
-rw-r-----   0 oracle   dba      2122326016 Mar 13 18:46 286
-rw-r-----   0 oracle   dba      9261031424 Mar 13 18:47 270
-rw-r-----   0 oracle   dba      18253619200 Mar 13 18:47 268
-rw-r-----   1 oracle   dba      134225920 Mar 13 18:51 261
-rw-r-----   1 oracle   dba      524296192 Mar 13 18:51 260
-rw-r-----   1 oracle   dba      104858112 Mar 13 18:52 259
-rw-r-----   1 oracle   dba      1941504 Mar 13 18:52 258
-rw-r-----   1 oracle   dba      1941504 Mar 13 18:52 257
-rw-r-----   1 oracle   dba      1941504 Mar 13 18:52 256
SQL> select file#,name from v$datafile wehre name like '/disk%';
     FILE# NAME
---------- --------------------------------------------------
         9 /disk/oradata/xifenfei/xifenfei.dbf
        10 /disk/oradata/xifenfei/CSSN.dbf
        11 /disk/oradata/xifenfei/NCSSN.dbf
        12 /disk/oradata/xifenfei/CSIC_RDS.dbf
        13 /disk/oradata/xifenfei/CSIC_CSSN.dbf
        14 /disk/oradata/xifenfei/CNIS_I.dbf
        15 /disk/oradata/xifenfei/CNIS.dbf
        16 /disk/oradata/xifenfei/TRSWCM6_CSSN.dbf
        17 /disk/oradata/xifenfei/TRSWCM6_CSSN_PLUGINS.dbf
        18 /disk/oradata/xifenfei/DIGIREF.dbf
        20 /disk/oradata/xifenfei/TRSWCM.dbf
        21 /disk/oradata/xifenfei/TRSWCM52_NSLC.dbf
        22 /disk/oradata/xifenfei/TRSWCM52_PLUGINS_NSLC.dbf
        24 /disk/oradata/xifenfei/TRSWCM_PLUGINS.dbf
        25 /disk/oradata/xifenfei/CNIS_ALL.dbf
        27 /disk/oradata/xifenfei/undotbs01.dbf
        28 /disk/oradata/xifenfei/TRS_IDS02.dbf
        29 /disk/oradata/xifenfei/xdb02.dbf

在solaris中比较郁闷,虽然进入了fd目录,但是无法知道哪些文件句柄是删除,哪些是正常的,因此没有办法,只能使用lsof进一步分析

root@CNISORCLSVR # pkgadd -d lsof-4.80-sol9-sparc-local
The following packages are available:
  1  IBMlsof     lsof
                 (sparc) 4.80
Select package(s) you wish to process (or 'all' to process
all packages). (default: all) [?,??,q]: all
Processing package instance <IBMlsof> from </tmp/lsof-4.80-sol9-sparc-local>
lsof
(sparc) 4.80
Vic Abell
Using </usr/local> as the package base directory.
## Processing package information.
## Processing system information.
## Verifying disk space requirements.
## Checking for conflicts with packages already installed.
The following files are already installed on the system and are being
used by another package:
* /usr/local/bin <attribute change only>
* - conflict with a file which does not belong to any package.
Do you want to install these conflicting files [y,n,?,q] y
## Checking for setuid/setgid programs.
The following files are being installed with setuid and/or setgid
permissions:
 /usr/local/bin/lsof <setgid sys>
 /usr/local/bin/sparcv7/lsof <setgid sys>
 /usr/local/bin/sparcv9/lsof <setgid sys>
Do you want to install these as setuid/setgid files [y,n,?,q] y
Installing lsof as <IBMlsof>
## Installing part 1 of 1.
/usr/local/bin/lsof
/usr/local/bin/sparcv7/lsof
/usr/local/bin/sparcv9/lsof
/usr/local/doc/lsof/00.README.FIRST
/usr/local/doc/lsof/00CREDITS
/usr/local/doc/lsof/00DCACHE
/usr/local/doc/lsof/00DIALECTS
/usr/local/doc/lsof/00DIST
/usr/local/doc/lsof/00FAQ
/usr/local/doc/lsof/00LSOF-L
/usr/local/doc/lsof/00MANIFEST
/usr/local/doc/lsof/00PORTING
/usr/local/doc/lsof/00QUICKSTART
/usr/local/doc/lsof/00README
/usr/local/doc/lsof/00TEST
/usr/local/doc/lsof/00XCONFIG
/usr/local/doc/lsof/lsof.man
/usr/local/man/man8/lsof.8
[ verifying class <none> ]
Installation of <IBMlsof> was successful.
root@CNISORCLSVR # ./lsof -p 597
COMMAND PID   USER   FD   TYPE        DEVICE    SIZE/OFF   NODE NAME
oracle  597 oracle  cwd   VDIR          85,5        2048 106299 /export/home/oracle/app/product/9.2.0/dbs
oracle  597 oracle  txt   VREG          85,5    61272672   2332 /export/home/oracle/app/product/9.2.0/bin/oracle
…………
oracle  597 oracle  260u  VREG          85,5   524296192 106517 /export/home/oracle/oradata/xifenfei/system01.dbf
oracle  597 oracle  261u  VREG          85,5   134225920 106518 /export/home/oracle/oradata/xifenfei/undotbs01.dbf
…………
oracle  597 oracle  268u  VREG        118,70 18253619200  109 /disk (/dev/dsk/c2t600A0B800029CEFA0000036C491B270Bd0s6)
oracle  597 oracle  269u  VREG        118,70   134225920  110 /disk (/dev/dsk/c2t600A0B800029CEFA0000036C491B270Bd0s6)
oracle  597 oracle  270u  VREG        118,70  9261031424  111 /disk (/dev/dsk/c2t600A0B800029CEFA0000036C491B270Bd0s6)
…………
oracle  597 oracle  293u  VREG        118,70  1073750016   14 /disk (/dev/dsk/c2t600A0B800029CEFA0000036C491B270Bd0s6)

到这一步,基本上定位/disk部分是我们需要恢复的数据,从而可以定位到句柄,然后结合数据文件信息,直接使用cp命令恢复出来文件.然后数据库层面recover并且online.

cd /proc/597/fd
cp 269 /disk/oradata/cnisora2/CSSN.dbf
chown oracle:dba /disk/oradata/xifenfei/CSSN.dbf
SQL> recover datafile 10;
ORA-00283: 恢复会话因错误而取消
ORA-01124: 无法恢复数据文件 10 - 文件在使用中或在恢复中
ORA-01110: 数据文件 10: '/disk/oradata/xifenfei/CSSN.dbf'
SQL> alter database datafile 10 offline;
数据库已更改。
SQL> recover datafile 10;
完成介质恢复。
SQL> alter database datafile 10 online;
数据库已更改。

至此基本上恢复完成,万幸是数据库没有crash,遇到此类问题,千万不要盲目关闭数据库.另外数据库备份重于一切

ORA-10562 故障恢复—allow 1 corruption

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-10562 故障恢复—allow 1 corruption

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

朋友数据库由于存储变动,导致数据库瞬间hang住,然后直接crash,之后无法正常启动,请求技术支持.
数据库报ORA-00600[2131]错误
不能mount,可以通过重建控制文件解决

Mon Nov 30 20:35:38 2015
alter database mount
Mon Nov 30 20:35:38 2015
NOTE: Loaded library: System
Mon Nov 30 20:35:38 2015
SUCCESS: diskgroup DATADG was mounted
Mon Nov 30 20:35:38 2015
NOTE: dependency between database xifenfei and diskgroup resource ora.DATADG.dg is established
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_26450.trc  (incident=3032256):
ORA-00600: internal error code, arguments: [2131], [33], [32], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: alter database mount...

尝试recover数据库

Mon Nov 30 20:45:53 2015
ALTER DATABASE RECOVER  database
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 80 slaves
Mon Nov 30 20:45:56 2015
Recovery of Online Redo Log: Thread 2 Group 11 Seq 617 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo011.log
Recovery of Online Redo Log: Thread 1 Group 4 Seq 5410 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo04.log
Recovery of Online Redo Log: Thread 1 Group 5 Seq 5411 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo05.log
Mon Nov 30 20:46:07 2015
Recovery of Online Redo Log: Thread 1 Group 6 Seq 5412 Reading mem 0
  Mem# 0: +DATADG/xifenfei/redo06.log
Mon Nov 30 20:46:13 2015
Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0xC] [PC:0x95FB502, kdxlin()+4088] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr13_30480.trc  (incident=3032568):
ORA-07445: 出现异常错误: 核心转储 [kdxlin()+4088] [SIGSEGV] [ADDR:0xC] [PC:0x95FB502] [Address not mapped to object] []
Mon Nov 30 20:46:17 2015
Sweep [inc][3032568]: completed
Sweep [inc2][3032568]: completed
Mon Nov 30 20:46:31 2015
Slave exiting with ORA-10562 exception
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr13_30480.trc:
ORA-10562: Error occurred while applying redo to data block (file# 2, block# 165054)
ORA-10564: tablespace SYSAUX
ORA-01110: 数据文件 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 271
ORA-00607: 当更改数据块时出现内部错误
ORA-00602: 内部编程异常错误
ORA-07445: 出现异常错误: 核心转储 [kdxlin()+4088] [SIGSEGV] [ADDR:0xC] [PC:0x95FB502] [Address not mapped to object] []
Mon Nov 30 20:46:31 2015
Recovery Slave PR13 previously exited with exception 10562
Mon Nov 30 20:46:33 2015
Checker run found 28 new persistent data failures
Mon Nov 30 20:46:35 2015
Media Recovery failed with error 448
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr00_30400.trc:
ORA-00283: 恢复会话因错误而取消
ORA-00448: 后台进程正常结束
ORA-10562 signalled during: ALTER DATABASE RECOVER  database  ...

通过这里可以看到,由于在recover 操作之时,由于某种原因redo的数据无法apply到file 2 block 165054中,导致数据库recover database失败.

按照数据文件recover操作

SQL> recover datafile 1;
Media recovery complete.
SQL> recover datafile 3,4,5,6,7;
Media recovery complete.
SQL> recover datafile 8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28;
Media recovery complete.
SQL> recover datafile 2;
ORA-00283: recovery session canceled due to errors
ORA-10562: Error occurred while applying redo to data block (file# 2, block#
165054)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 271
ORA-00607: Internal error occurred while making a change to a data block
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kdxlin()+4088] [SIGSEGV]
[ADDR:0xC] [PC:0x95FB502] [Address not mapped to object] []

错误提示和recover database一样,那我们只能让恢复跳过该block继续恢复,因为根据经验判断data object# 271不是系统核心对象,不会影响数据库的启动

跳过坏块继续恢复

SQL> recover  datafile 2 allow 1 corruption;
ORA-00283: recovery session canceled due to errors
ORA-00600: internal error code, arguments: [3020], [2], [69793], [8458401], [],
[], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 2, block# 69793, file
offset is 571744256 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 272
SQL> recover  datafile 2 allow 1 corruption;
Media recovery complete.
SQL> alter database open;
Database altered.

出现了ORA-600[3020] 继续跳过坏块,然后数据库顺利open,别忘记加tempfile

处理异常对象

SQL> select object_name,object_type from dba_objects where data_object_id in(272,271);
OBJECT_NAME
--------------------------------------------------------------------------------
OBJECT_TYPE
-------------------
SMON_SCN_TIME_TIM_IDX
INDEX
SMON_SCN_TIME_SCN_IDX
INDEX
SQL> select index_name from dba_indexes where table_name='SMON_SCN_TIME';
INDEX_NAME
------------------------------
SMON_SCN_TIME_TIM_IDX
SMON_SCN_TIME_SCN_IDX
SQL> set pages 1000
SQL> set long 1000
SQL> Select dbms_metadata.get_ddl('TABLE','SMON_SCN_TIME','SYS') FROM DUAL ;
DBMS_METADATA.GET_DDL('TABLE','SMON_SCN_TIME','SYS')
--------------------------------------------------------------------------------
  CREATE TABLE "SYS"."SMON_SCN_TIME"
   (    "THREAD" NUMBER,
        "TIME_MP" NUMBER,
        "TIME_DP" DATE,
        "SCN_WRP" NUMBER,
        "SCN_BAS" NUMBER,
        "NUM_MAPPINGS" NUMBER,
        "TIM_SCN_MAP" RAW(1200),
        "SCN" NUMBER DEFAULT 0,
        "ORIG_THREAD" NUMBER DEFAULT 0           /* for downgrade */
   ) CLUSTER "SYS"."SMON_SCN_TO_TIME_AUX" ("THREAD")
SQL> analyze table smon_scn_time validate structure cascade online;
analyze table smon_scn_time validate structure cascade online
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 2, block # 165054)
ORA-01110: data file 2: '+DATADG/xifenfei/datafile/sysaux.265.861925867'
SQL> truncate CLUSTER "SYS"."SMON_SCN_TO_TIME_AUX";
Cluster truncated.

关于SMON_SCN_TIME部分处理,可以参考:关于SMON_SCN_TIME若干问题说明.至此数据库基本上恢复完成,而且运气非常好,恢复的非常完美,数据实现0丢失.

ORA-01122 ORA-01210 故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-01122 ORA-01210 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友数据文件头出现错误ORA-01122和ORA-01210等错误,数据库无法正常open。
ORA-01210


因为平台是win,他们找我咨询win bbed,因为回老家电脑没有带,无法提供win的bbed.我通过dd部分文件头,然后在linux平台分析发现是该文件的文件头block大量坏块

bbed分析坏块情况

BBED> show all
        FILE#           0
        BLOCK#          1
        OFFSET          0
        DBA             0x00000000 (0 0,1)
        FILENAME        /tmp/30.dbf
        BIFILE          bifile.bbd
        LISTFILE
        BLOCKSIZE       8192
        MODE            Browse
        EDIT            Unrecoverable
        IBASE           Dec
        OBASE           Dec
        WIDTH           80
        COUNT           512
        LOGFILE         log.bbd
        SPOOL           No
BBED> set count 64
        COUNT           64
BBED> map
 File: /tmp/30.dbf (0)
 Block: 1                                     Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (27)
BBED> d
 File: /tmp/30.dbf (0)
 Block: 1                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 03004400 bffd8a1d 0000000c acba0000 008f4500 00003455 fc020000
 02040000 00000000 00008001 04000000 00000000 00000000 949400b4 94514005
 <32 bytes per line>
BBED> set block +1
        BLOCK#          2
BBED> map
 File: /tmp/30.dbf (0)
 Block: 2                                     Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (27)
BBED> d
 File: /tmp/30.dbf (0)
 Block: 2                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 04004400 bffd8a1d 0000000c a6e00000 008f4500 00003455 fc020000
 0204e81f 00000000 0000241e 05000000 00000000 00000000 11fc297f b426fe2b
 <32 bytes per line>
BBED> set block +1
        BLOCK#          3
BBED> d
 File: /tmp/30.dbf (0)
 Block: 3                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 05004400 bffd8a1d 0000000c 780a0000 008f4500 00003455 fc020000
 0204e81f 00000000 0000c001 06000000 00000000 00000000 2969a0d2 d30168a2
 <32 bytes per line>
BBED> set block +1
        BLOCK#          4
BBED> d
 File: /tmp/30.dbf (0)
 Block: 4                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 06004400 bffd8a1d 0000000c 6c5a0000 008f4500 00003455 fc020000
 0204e81f 00000000 0000f81d 07000000 00000000 00000000 7b51d409 6dc7ca4d
 <32 bytes per line>
BBED> set block +1
        BLOCK#          5
BBED> d
 File: /tmp/30.dbf (0)
 Block: 5                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 07004400 bffd8a1d 0000000c c5600000 008f4500 00003455 fc020000
 02040000 00000000 0000c001 08000000 00000000 00000000 14514005 25145200
 <32 bytes per line>
BBED> set block +1
        BLOCK#          6
BBED> d
 File: /tmp/30.dbf (0)
 Block: 6                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 08004400 bffd8a1d 0000000c 60480000 008f4500 00003455 fc020000
 0204e81f 00000000 0000c301 09000000 00000000 00000000 c2a1606a 7615130a
 <32 bytes per line>
BBED> set block +1
        BLOCK#          7
BBED> d
 File: /tmp/30.dbf (0)
 Block: 7                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 09004400 bffd8a1d 0000000c e3430000 008f4500 00003455 fc020000
 0204e81f 00000000 00000002 0a000000 00000000 00000000 00a28a28 00a28a28
 <32 bytes per line>
BBED> set block +1
        BLOCK#          8
BBED> d
 File: /tmp/30.dbf (0)
 Block: 8                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 0a004400 07fe8a1d 0000000c fc000000 008f4500 00003455 fc020000
 0205e81f 00000000 0000f41d 00000000 00000000 00000000 ffd8ffe0 00104a46
 <32 bytes per line>
BBED> set block +1
        BLOCK#          9
BBED> d
 File: /tmp/30.dbf (0)
 Block: 9                Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 0b004400 07fe8a1d 0000000c 48da0000 008f4500 00003455 fc020000
 0205e81f 00000000 0000c601 01000000 00000000 00000000 b47d69d3 7fa96a6f
 <32 bytes per line>
BBED> set block +1
        BLOCK#          10
BBED> d
 File: /tmp/30.dbf (0)
 Block: 10               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 0c004400 07fe8a1d 0000000c be0f0000 008f4500 00003455 fc020000
 0205e81f 00000000 0000181d 02000000 00000000 00000000 9de3e868 4782d83a
 <32 bytes per line>
BBED> set block +1
        BLOCK#          11
BBED> d
 File: /tmp/30.dbf (0)
 Block: 11               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 0d004400 07fe8a1d 0000000c 9cd00000 008f4500 00003455 fc020000
 0205e81f 00000000 0000241e 03000000 00000000 00000000 dead1259 5919e385
 <32 bytes per line>
BBED> set block +1
        BLOCK#          12
BBED> d
 File: /tmp/30.dbf (0)
 Block: 12               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 0e004400 07fe8a1d 0000000c df450000 008f4500 00003455 fc020000
 0205e81f 00000000 00004001 04000000 00000000 00000000 31d9a292 9698828a
 <32 bytes per line>
BBED> set block +1
        BLOCK#          13
BBED> d
 File: /tmp/30.dbf (0)
 Block: 13               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 1ba20000 0f004400 07fe8a1d 0000000c 18790000 008f4500 00003455 fc020000
 02050000 00000000 00000002 05000000 00000000 00000000 b93f8235 5ea063b7
 <32 bytes per line>

拿block 1的rdba(04004400–倒序存储)分析[win文件拷贝到linux后使用bbed查看相差1 block]可以的出来block信息为file=1, block=262148,明显错误.

通过dul分析文件头损坏情况

Data UnLoader: 10.2.0.6.9 - Internal Only - on Tue Sep 29 22:15:22 2015
with 64-bit io functions
Copyright (c) 1994 2015 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
DUL: Warning: Recreating file "dul.log"
Reading SCANNEDLOBPAGE.dat 1204 entries loaded and sorted 1204 entries
Reading SEG.dat 0 entries loaded
Reading EXT.dat 44 entries loaded and sorted 44 entries
Reading COMPATSEG.dat 0 entries loaded
DUL: Warning: Wrong DBA  0X00440004 (file=1, block=262148) (Ignored)
DUL: Error: While processing file# 30 block# 1
DUL: Warning: Found mismatch while checking file E:\TEMP\shebao\30.dbf
DUL: Warning: DUL osd_parameter or control.dul configuration error
DUL: Warning: Given file number(30) in control file does not match file# in dba(1)
DUL: Warning: Wrong DBA  0X00440004 (file=1, block=262148) (Ignored)
DUL: Error: While processing file# 30 block# 1
DUL>

通过bbed和dul证明文件头大量损坏,而且尚未有任何该文件的物理备份,因此恢复起来难道较大。

分析Oracle Database Recovery Check Result
通过对Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)的分析结果,我们意外的发现,人品不错,发现异常的文件创建时间为2015-09-26 19:39:33,进一步和客户沟通,这个文件存储为图片,少量丢失可以允许,优先恢复业务
Oracle-Database-Recovery-Check


有了这个结论,那处理起来就easy了,直接offline异常文件,然后分析丢失的表
从而确定时lob字典的少量extent数据分配到了file 30上
lob


为了避免查询对应lob之时出现错误,通过update 对应lob为空规避该问题

create table corrupt_lobs (corrupt_rowid rowid,table_name varchar2(100));
declare
  n number;
begin
  for cursor_lob in (select rowid r, xff_lob from xff.t_xifenfei) loop
  begin
    n:=dbms_lob.instr(cursor_lob.xff_lob,hextoraw('889911'));
  exception
    when others then
      insert into corrupt_lobs values (cursor_lob.r,'xff.t_xifenfei');
      commit;
    end;
  end loop;
end;
/
update xff.t_xifenfei
     set xff_lob = empty_blob()
     where rowid in (select corrupted_rowid from corrupt_lobs);

本次恢复是由于运气好,遇到异常文件刚好是最近加入,而且都是图片,客户允许少量丢失,如果是不允许丢失的数据文件,可能需要通过找历史的该文件的备份(Oracle 12C的第一次异常恢复—文件头坏块),在某些情况下,如果也没有此类备份,只能通过bbed重构block 1(如果有其他异常块一次处理,如果太多无法处理,最少也需要重构block 1),然后尝试open数据库或者使用dul之类工具处理(因为文件头损坏,工具可能不能识别文件无法恢复)

ORA-600 k2vcbk_2 故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-600 k2vcbk_2 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友找到我说他们数据库无法启动,数据库启动报ORA-600[k2vcbk_2]错误,数据库版本为11.2.0.2 RAC,操作系统是AIX 6.1

SQL> recover database;
Media recovery complete.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [],
[], [], [], [], []
Process ID: 7930020
Session ID: 49 Serial number: 14761

数据库节点1日志

Mon Sep 21 15:45:41 2015
Thread 1 advanced to log sequence 54076 (LGWR switch)
  Current log# 13 seq# 54076 mem# 0: +DG01/xifenfei/onlinelog/group_13.332.779459035
  Current log# 13 seq# 54076 mem# 1: +DG01/xifenfei/onlinelog/group_13.344.779582621
Mon Sep 21 15:45:44 2015
Archived Log entry 74655 added for thread 1 sequence 54075 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:56:18 2015
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_18088342.trc  (incident=184348):
ORA-00600: 内部错误代码, 参数: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184348/xifenfei1_ora_18088342_i184348.trc
Mon Sep 21 15:56:34 2015
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Error 600 trapped in 2PC on transaction 7.16.120119. Cleaning up.
Error stack returned to user:
ORA-00600: 内部错误代码, 参数: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_18088342.trc  (incident=184349):
ORA-00603: ORACLE 服务器会话因致命错误而终止
ORA-00600: 内部错误代码, 参数: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
Mon Sep 21 15:56:34 2015
Dumping diagnostic data in directory=[cdmp_20150921155634], requested by (instance=1, osid=18088342), summary=[incident=184348].
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184349/xifenfei1_ora_18088342_i184349.trc
Mon Sep 21 15:56:35 2015
Sweep [inc][184349]: completed
Sweep [inc][184348]: completed
Sweep [inc2][184348]: completed
opiodr aborting process unknown ospid (18088342) as a result of ORA-603
Mon Sep 21 15:57:12 2015
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_smon_7536810.trc  (incident=184274):
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184274/xifenfei1_smon_7536810_i184274.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Mon Sep 21 15:57:16 2015
Dumping diagnostic data in directory=[cdmp_20150921155716], requested by (instance=1, osid=7536810 (SMON)), summary=[incident=184274].
Fatal internal error happened while SMON was doing active transaction recovery.
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_smon_7536810.trc:
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 7536810): terminating the instance due to error 474
Mon Sep 21 15:57:18 2015
ORA-1092 : opitsk aborting process

数据库节点2日志

Mon Sep 21 15:21:50 2015
Archived Log entry 74653 added for thread 2 sequence 23559 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:44:28 2015
Thread 2 advanced to log sequence 23561 (LGWR switch)
  Current log# 12 seq# 23561 mem# 0: +DG01/xifenfei/onlinelog/group_12.338.779457003
  Current log# 12 seq# 23561 mem# 1: +DG01/xifenfei/onlinelog/group_12.265.779582493
Mon Sep 21 15:44:31 2015
Archived Log entry 74654 added for thread 2 sequence 23560 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:45:31 2015
DISTRIB TRAN xifenfei.1ebab0a5.20.3.1533822
  is local tran 20.3.1533822 (hex=14.03.17677e)
  insert pending committed tran, scn=14590688068086 (hex=d45.28c781f6)
Mon Sep 21 15:45:31 2015
DISTRIB TRAN xifenfei.1ebab0a5.20.3.1533822
  is local tran 20.3.1533822 (hex=14.03.17677e))
  delete pending committed tran, scn=14590688068086 (hex=d45.28c781f6)
Mon Sep 21 15:56:35 2015
Dumping diagnostic data in directory=[cdmp_20150921155634], requested by (instance=1, osid=18088342), summary=[incident=184348].
Mon Sep 21 15:57:10 2015
Error 3135 trapped in 2PC on transaction 20.11.1534704. Cleaning up.
Error stack returned to user:
ORA-03135: 连接失去联系
opidcl aborting process unknown ospid (9175532) as a result of ORA-604
Mon Sep 21 15:57:17 2015
Dumping diagnostic data in directory=[cdmp_20150921155716], requested by (instance=1, osid=7536810 (SMON)), summary=[incident=184274].
Mon Sep 21 15:57:23 2015
Reconfiguration started (old inc 18, new inc 20)
List of instances:
 2 (myinst: 2)
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
Mon Sep 21 15:57:23 2015
 LMS 2: 3 GCS shadows cancelled, 1 closed, 0 Xw survived
Mon Sep 21 15:57:23 2015
 LMS 0: 2 GCS shadows cancelled, 0 closed, 0 Xw survived
Mon Sep 21 15:57:23 2015
 LMS 1: 3 GCS shadows cancelled, 1 closed, 0 Xw survived
 Set master node info
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
Mon Sep 21 15:57:23 2015
minact-scn: Inst 2 is now the master inc#:20 mmon proc-id:6816208 status:0x7
minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0d45.28c2bb5c gcalc-scn:0x0d45.28c3bd2e
minact-scn: master found reconf/inst-rec before recscn scan old-inc#:20 new-inc#:20
Mon Sep 21 15:57:23 2015
Instance recovery: looking for dead threads
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
Beginning instance recovery of 1 threads
 parallel recovery started with 31 processes
Started redo scan
Completed redo scan
 read 12626 KB redo, 1724 data blocks need recovery
Started redo application at
 Thread 1: logseq 54076, block 184416
Recovery of Online Redo Log: Thread 1 Group 13 Seq 54076 Reading mem 0
  Mem# 0: +DG01/xifenfei/onlinelog/group_13.332.779459035
  Mem# 1: +DG01/xifenfei/onlinelog/group_13.344.779582621
Completed redo application of 9.78MB
Completed instance recovery at
 Thread 1: logseq 54076, block 209669, scn 14590688357285
 1633 data blocks read, 1794 data blocks written, 12626 redo k-bytes read
Thread 1 advanced to log sequence 54077 (thread recovery)
Mon Sep 21 15:57:33 2015
Error 3113 trapped in 2PC on transaction 21.18.1965522. Cleaning up.
Redo thread 1 internally disabled at seq 54077 (SMON)
Error stack returned to user:
ORA-02050: 事务处理 21.18.1965522 已回退, 某些远程数据库可能有问题
ORA-03113: 通信通道的文件结尾
ORA-02063: 紧接着 line (起自 ZSK)
Mon Sep 21 15:57:34 2015
Archived Log entry 74656 added for thread 1 sequence 54076 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:57:34 2015
ARC0: Archiving disabled thread 1 sequence 54077
Archived Log entry 74657 added for thread 1 sequence 54077 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:57:35 2015
Thread 2 advanced to log sequence 23562 (LGWR switch)
  Current log# 8 seq# 23562 mem# 0: +DG01/xifenfei/onlinelog/group_8.334.779456945
  Current log# 8 seq# 23562 mem# 1: +DG01/xifenfei/onlinelog/group_8.267.779582453
Mon Sep 21 15:57:36 2015
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_smon_6750672.trc  (incident=200218):
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei2/incident/incdir_200218/xifenfei2_smon_6750672_i200218.trc
Archived Log entry 74658 added for thread 2 sequence 23561 ID 0x5a0bc0e1 dest 1:
Mon Sep 21 15:57:38 2015
minact-scn: master continuing after IR
Mon Sep 21 15:57:41 2015
Dumping diagnostic data in directory=[cdmp_20150921155741], requested by (instance=2, osid=6750672 (SMON)), summary=[incident=200218].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fatal internal error happened while SMON was doing instance transaction recovery.
Errors in file /oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_smon_6750672.trc:
ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
SMON (ospid: 6750672): terminating the instance due to error 474
Mon Sep 21 15:57:41 2015
ORA-1092 : opitsk aborting process
Mon Sep 21 15:57:42 2015
ORA-1092 : opitsk aborting process
Mon Sep 21 15:57:42 2015
License high water mark = 291
Instance terminated by SMON, pid = 6750672
USER (ospid: 18874814): terminating the instance

通过数据库日志大概可以看出来,由于节点2的分布式事事务异常,而在11.2.0.2中,分布式事务跨节点,引起节点2的pmon清理异常事务,但是由于bug,使得异常事务无法被清理掉,从而引起节点1 crash,节点1 crash之后节点2进行恢复,也因为分布式事务bug,导致smon回滚失败,实例也crash。无法进行回滚导致数据库无法正常启动,通过查询mos发现定位到是Bug 10222544 ORA-600 [k2vpci_2] from multi-branch distributed transaction
ORA-600-k2vpci_2


对于这类问题,由于分布事务无法清理,处理方法就是找出来事务人工提交或者直接屏蔽掉这个事务解决该问题

system01.dbf文件被offline,导致数据库报ORA-01245 ORA-01110故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:system01.dbf文件被offline,导致数据库报ORA-01245 ORA-01110故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友找到我,说数据库做recover报ORA-01245和ORA-01110错误,无法继续恢复,请求支持

SQL> recover database using backup controlfile until cancel;
…………
第 1 行出现错误:
ORA-01245: RESETLOGS 完成时脱机文件 1 将丢失
ORA-01110: 数据文件 1: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\SYSTEM01.DBF'

通过Oracle Database Recovery Check检查数据库情况,发现datafile 1处于offline状态
oracle_recovery_check


Wed Aug 26 23:11:00 2015
alter database datafile 1 offline drop
Completed: alter database datafile 1 offline drop

从这里基本上可以知道为什么出现ORA-01245错误了,由于system表空间中文件被offline导致.

redo信息
oracle_recovery_check_redo

Mon Aug 24 22:38:35 2015
alter database clear unarchived logfile group 2
Clearing online log 2 of thread 1 sequence number 5705
Completed: alter database clear unarchived logfile group 2
Wed Aug 26 23:13:23 2015
alter database clear logfile group 3
Clearing online log 3 of thread 1 sequence number 5706
Completed: alter database clear logfile group 3

除当前redo之外,其他redo被clear

尝试恢复

SQL> alter database datafile 1 online;
数据库已更改。
SQL> recover database;
ORA-00283: 恢复会话因错误而取消
ORA-01610: 使用 BACKUP CONTROLFILE 选项的恢复必须已完成
SQL> recover database using backup controlfile;
ORA-00279: 更改 63960710 (在 08/23/2015 17:01:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_570
5_%U_.ARC
ORA-00280: 更改 63960710 (用于线程 1) 在序列 #5705 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO03.LOG
ORA-00310: 归档日志包含序列 5706; 要求序列 5705
ORA-00334: 归档日志: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO03.LOG'
SQL> recover database using backup controlfile;
ORA-00279: 更改 63960710 (在 08/23/2015 17:01:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_570
5_%U_.ARC
ORA-00280: 更改 63960710 (用于线程 1) 在序列 #5705 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO02.LOG
ORA-00339: 归档日志未包含任何重做
ORA-00334: 归档日志: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO02.LOG'
SQL> recover database using backup controlfile;
ORA-00279: 更改 63960710 (在 08/23/2015 17:01:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_570
5_%U_.ARC
ORA-00280: 更改 63960710 (用于线程 1) 在序列 #5705 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO01.LOG
ORA-00310: 归档日志包含序列 5707; 要求序列 5705
ORA-00334: 归档日志: 'E:\APP\ADMINISTRATOR\ORADATA\HXV10\REDO01.LOG'

数据库做恢复需要seq 5705的redo,但是redo已经被clear,导致现在数据库常规手段无法恢复,只用使用隐含参数屏蔽数据库前滚(一致性检查)

再次尝试打开数据库

ORACLE 例程已经启动。
Total System Global Area  778387456 bytes
Fixed Size                  1374808 bytes
Variable Size             486540712 bytes
Database Buffers          285212672 bytes
Redo Buffers                5259264 bytes
数据库装载完毕。
SQL> recover database using backup controlfile;
ORA-00279: 更改 63960710 (在 08/23/2015 17:01:25 生成) 对于线程 1 是必需的
ORA-00289: 建议:
E:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\HXV10\ARCHIVELOG\2015_08_27\O1_MF_1_570
5_%U_.ARC
ORA-00280: 更改 63960710 (用于线程 1) 在序列 #5705 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
介质恢复已取消。
SQL> alter database open resetlogs;
数据库已更改。

在数据库恢复中,请不要对system表空间数据文件进行offline操作,如果对此类文件进行offline操作,讲在数据库恢复过程中出现ORA-01245和ORA-01110错误,而且文件还会出现SYSOFF状态

aix平台 ORA-01115 ORA-01110 ORA-27067 故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:aix平台 ORA-01115 ORA-01110 ORA-27067 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到朋友恢复请求,aix 5.3,Oracle 10.2.0.1平台,数据库启动报ORA-01115 ORA-01110 ORA-27067错误,数据库无法正常打开,通过分析,是由于10201在aix上面的bug导致,通过技巧规避,完美解决给问题,数据0丢失
数据库报错alert日志

Mon Aug 10 13:25:22 2015
ALTER DATABASE   MOUNT
Mon Aug 10 13:25:29 2015
Setting recovery target incarnation to 1
Mon Aug 10 13:25:29 2015
Successful mount of redo thread 1, with mount id 432339141
Mon Aug 10 13:25:29 2015
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Mon Aug 10 13:25:36 2015
alter database open
Mon Aug 10 13:25:36 2015
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Mon Aug 10 13:25:37 2015
Started redo scan
Mon Aug 10 13:25:52 2015
Completed redo scan
 7889582 redo blocks read, 75305 data blocks need recovery
Mon Aug 10 13:25:53 2015
Errors in file /dc/admin/datacent/bdump/datacent_p002_144124.trc:
ORA-01115: IO error reading block from file 2 (block # 40704)
ORA-01110: data file 2: '/dc/oradata/datacent/undotbs01.dbf'
ORA-27067: size of I/O buffer is invalid
Additional information: 2
Additional information: 1572864
Mon Aug 10 13:25:53 2015
Aborting crash recovery due to slave death, attempting serial crash recovery
Mon Aug 10 13:25:53 2015
Beginning crash recovery of 1 threads
Mon Aug 10 13:25:53 2015
Started redo scan
Mon Aug 10 13:26:09 2015
Completed redo scan
 7889582 redo blocks read, 75305 data blocks need recovery
Mon Aug 10 13:26:12 2015
Aborting crash recovery due to error 1115
Mon Aug 10 13:26:12 2015
Errors in file /dc/admin/datacent/udump/datacent_ora_123384.trc:
ORA-01115: IO error reading block from file 2 (block # 39077)
ORA-01110: data file 2: '/dc/oradata/datacent/undotbs01.dbf'
ORA-27067: size of I/O buffer is invalid
Additional information: 2
Additional information: 1310720
ORA-1115 signalled during: alter database open...

这里报的前面两个错误ORA-01115 ORA-01110我们都非常熟悉,类似数据库启动遇到坏块或者io错误之时可能就会报如此错误。但是ORA-27067确实不多见,从mos上看,很多是由于rman备份之时的bug可能导致该错误。

dbv检测undo坏块文件

DBVERIFY: Release 10.2.0.1.0 - Production on Mon Aug 10 23:18:15 2015
Copyright (c) 1982, 2003, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /dc/oradata/datacent/undotbs01.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 329600
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 327504
Total Pages Processed (Seg)  : 17
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2096
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1887888 (0.1887888)

这里可以看到,undo文件本身并没有逻辑和物理的坏块,证明因为数据库异常的原因,可能是由于ORA-27067: size of I/O buffer is invalid导致。根据官方文档ORA-01115 ORA-27067 DURING PARALLEL INSTANCE RECOVERY AFTER INSTANCE CRASH中的解释,我们基本上可以确定很可能是由于10.2.0.1在aix平台的jfs2系统中,由于大量事务操作,突然abort掉数据库(也可能断电),从而数据库在启动的时候进行实例恢复,而由于内部的bug,导致实例恢复无法成功。通过我们处理后的,数据库完美启动,数据0丢失

数据库启动日志

Mon Aug 10 16:34:14 2015
alter database open
Mon Aug 10 16:34:14 2015
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Mon Aug 10 16:34:14 2015
Started redo scan
Mon Aug 10 16:34:27 2015
Completed redo scan
 7889582 redo blocks read, 0 data blocks need recovery
Mon Aug 10 16:34:27 2015
Started redo application at
 Thread 1: logseq 664704, block 1286922
Mon Aug 10 16:34:27 2015
Recovery of Online Redo Log: Thread 1 Group 4 Seq 664704 Reading mem 0
  Mem# 0 errs 0: /dev/rredo04
Mon Aug 10 16:34:32 2015
Recovery of Online Redo Log: Thread 1 Group 5 Seq 664705 Reading mem 0
  Mem# 0 errs 0: /dev/rredo05
Mon Aug 10 16:34:38 2015
Recovery of Online Redo Log: Thread 1 Group 6 Seq 664706 Reading mem 0
  Mem# 0 errs 0: /dev/rredo06
Mon Aug 10 16:34:40 2015
Completed redo application
Mon Aug 10 16:34:40 2015
Completed crash recovery at
 Thread 1: logseq 664706, block 1017805, scn 8554793334
 0 data blocks read, 0 data blocks written, 7889582 redo blocks read
Mon Aug 10 16:34:40 2015
Thread 1 advanced to log sequence 664707
Thread 1 opened at log sequence 664707
  Current log# 1 seq# 664707 mem# 0: /dev/rredo01
Successful open of redo thread 1
Mon Aug 10 16:34:40 2015
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Aug 10 16:34:40 2015
SMON: enabling cache recovery
Mon Aug 10 16:34:40 2015
Successfully onlined Undo Tablespace 1.
Mon Aug 10 16:34:40 2015
SMON: enabling tx recovery
Mon Aug 10 16:34:41 2015
Database Characterset is ZHS32GB18030
replication_dependency_tracking turned off (no async multimaster replication found)
WARNING: AQ_TM_PROCESSES is set to 0. System operation might be adversely affected.
Mon Aug 10 16:34:41 2015
SMON: Parallel transaction recovery tried
Mon Aug 10 16:34:42 2015
db_recovery_file_dest_size of 2048 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Mon Aug 10 16:34:42 2015
Completed: alter database open