oracle 8.1.6因断电无法启动恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:oracle 8.1.6因断电无法启动恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到一个oralce 8.1.6的数据库恢复请求,由于断电之后,系统无法正常恢复,通过尝试recover datafile报错如下

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle8i Enterprise Edition Release 8.1.6.0.0 - Production
PL/SQL Release 8.1.6.0.0 - Production
CORE	8.1.6.0.0	Production
TNS for 32-bit Windows: Version 8.1.6.0.0 - Production
NLSRTL Version 3.4.1.0.0 - Production
Tue Oct 16 14:17:01 2018
Media Recovery Datafile: 1
Media Recovery Start
Media Recovery Log
Recovery of Online Redo Log: Thread 1 Group 1 Seq 146597 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Tue Oct 16 14:17:02 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA03040.TRC:
ORA-00600: internal error code, arguments: [kcoapl_blkchk], [1], [30547], [6101], [], [], [], []
Tue Oct 16 14:17:03 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA03040.TRC:
ORA-01578: ORACLE data block corrupted (file # 1, block # 30547)
ORA-01110: data file 1: 'D:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kcoapl_blkchk], [1], [30547], [6101], [], [], [], []
Tue Oct 16 14:17:03 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA03040.TRC:
ORA-00314: log 2 of thread 1, expected sequence# 146598 doesn't match 146415
ORA-00312: online log 2 thread 1: 'D:\ORACLE\ORADATA\ORCL\REDO02.LOG'

这里可以获取到两个信息:1)system文件可能有坏块,导致数据库recover的时候报ORA-600 kcoapl_blkchk错误,2)数据库的redo可能异常了,由于8i数据库默认redo 10m,在业务繁忙时候切换较为频繁,而文件系统的cache导致redo信息比较老,而数据文件需要redo比较新,从而无法正常恢复成功。比较明显对于当前这样的情况只能是屏蔽数据一致性,强制拉库

ARC0: media recovery disabled
Tue Oct 16 14:17:39 2018
SMON: enabling cache recovery
Tue Oct 16 14:17:39 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA02256.TRC:
ORA-00600: internal error code, arguments: [2662], [1], [1712082681], [1], [1712107587], [8388610], [], []
Tue Oct 16 14:17:41 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA02256.TRC:
ORA-00600: internal error code, arguments: [2662], [1], [1712082682], [1], [1712107587], [8388610], [], []
ORA-00600: internal error code, arguments: [2662], [1], [1712082681], [1], [1712107587], [8388610], [], []
Tue Oct 16 14:17:43 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA02256.TRC:
ORA-00600: internal error code, arguments: [2662], [1], [1712082682], [1], [1712107587], [8388610], [], []
ORA-00600: internal error code, arguments: [2662], [1], [1712082681], [1], [1712107587], [8388610], [], []
Tue Oct 16 14:17:45 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA02256.TRC:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [1], [1712082682], [1], [1712107587], [8388610], [], []
ORA-00600: internal error code, arguments: [2662], [1], [1712082681], [1], [1712107587], [8388610], [], []

ORA-600 2662这个错误比较常见,直接推数据库scn,启动库

Tue Oct 16 16:09:11 2018
Errors in file D:\Oracle\admin\orcl\bdump\orclSMON.TRC:
ORA-00600: internal error code, arguments: [4193], [29469], [29477], [], [], [], [], []
Recovery of Online Redo Log: Thread 1 Group 3 Seq 1 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO03.LOG
Tue Oct 16 16:09:12 2018
Errors in file D:\Oracle\admin\orcl\bdump\orclSMON.TRC:
ORA-01595: error freeing extent (3) of rollback segment (2))
ORA-00600: internal error code, arguments: [4193], [29469], [29477], [], [], [], [], []
Tue Oct 16 16:09:13 2018
Completed: ALTER DATABASE OPEN
Tue Oct 16 16:12:06 2018
CREATE ROLLBACK SEGMENT nrbs1 TABLESPACE rbs
Tue Oct 16 16:12:06 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA02252.TRC:
ORA-00600: internal error code, arguments: [4194], [79], [38], [], [], [], [], []

错误比较明显ORA-600 4194,而且已经告知是由于rollback segment 2异常,通过屏蔽回滚段,open数据库,删除老回滚段,创建新回滚段(8i无undo自动管理)

SQL> startup
ORACLE instance started.
Total System Global Area 1549432076 bytes
Fixed Size                    70924 bytes
Variable Size             500707328 bytes
Database Buffers         1048576000 bytes
Redo Buffers                  77824 bytes
Database mounted.
Database opened.
SQL> drop rollback segment "RBS1";
Rollback segment dropped.
SQL> drop rollback segment "RBS2";
Rollback segment dropped.
SQL> drop rollback segment "RBS3";
Rollback segment dropped.
SQL> drop rollback segment "RBS4";
Rollback segment dropped.
SQL> drop rollback segment "RBS5";
Rollback segment dropped.
SQL> drop rollback segment "RBS6";
Rollback segment dropped.
SQL> CREATE ROLLBACK SEGMENT nrbs1 TABLESPACE rbs;
Rollback segment created.
SQL> CREATE ROLLBACK SEGMENT nrbs2 TABLESPACE rbs;
Rollback segment created.
SQL> CREATE ROLLBACK SEGMENT nrbs3 TABLESPACE rbs;
Rollback segment created.
SQL> CREATE ROLLBACK SEGMENT nrbs4 TABLESPACE rbs;
Rollback segment created.
SQL> CREATE ROLLBACK SEGMENT nrbs5 TABLESPACE rbs;
Rollback segment created.
SQL> CREATE ROLLBACK SEGMENT nrbs6 TABLESPACE rbs;
Rollback segment created.
SQL> CREATE ROLLBACK SEGMENT nrbs7 TABLESPACE rbs;
Rollback segment created.

客户安排导出导入,至此该库恢复完成

ORA-604 ORA-607 ORA-600

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-604 ORA-607 ORA-600

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户数据库经过一系列恢复之后,出现ORA-604 ORA-607 ORA-600的错误,尝试各种方法无法打开,希望我们介入处理

Sat May 12 21:18:56 2018
SMON: enabling cache recovery
Sat May 12 21:18:57 2018
Errors in file d:\oracle\admin\xifenfei\udump\xifenfei_ora_3448.trc:
ORA-00600: 内部错误代码,参数: [4194], [34], [28], [], [], [], [], []
Sat May 12 21:19:00 2018
Recovery of Online Redo Log: Thread 1 Group 1 Seq 644 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\xifenfei\REDO01.LOG
Recovery of Online Redo Log: Thread 1 Group 1 Seq 644 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\xifenfei\REDO01.LOG
Sat May 12 21:19:01 2018
Errors in file d:\oracle\admin\xifenfei\udump\xifenfei_ora_3448.trc:
ORA-00604: 递归 SQL 层 1 出现错误
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码,参数: [4194], [34], [28], [], [], [], [], []
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Sat May 12 21:19:02 2018
Errors in file d:\oracle\admin\xifenfei\bdump\xifenfei_pmon_1840.trc:
ORA-00604: error occurred at recursive SQL level
Instance terminated by USER, pid = 3448
ORA-1092 signalled during: alter database open...

ORA-600 4194 trace文件
分析trace文件,确定ORA-600 4194对应的sql语句为update undo$ set name=:2……

*** 2018-05-12 21:18:57.000
ksedmp: internal or fatal error
ORA-00600: 内部错误代码,参数: [4194], [34], [28], [], [], [], [], []
Current SQL statement for this session:
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,
xactsqn=:8,scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
_ksedmp+147          CALLrel  _ksedst+0
_ksfdmp.108+e        CALLrel  _ksedmp+0            3
_kgeriv+89           CALLreg  00000000             212778 3
_kseipre.107+3f      CALLrel  _kgeriv+0
_ksesic2+24          CALLrel  _kseipre.107+0
__VInfreq__kturdb+8  CALLrel  _ksesic2+0           1062 0 22 0 1C
b
_kcoapl+1df          CALLreg  00000000             30A0F94 30A100A 11 6BB7E014
_kcbapl+71           CALLrel  _kcoapl+0            30A0F90 6BB7E000 1 0 2000
_kcrfwr+734          CALLrel  _kcbapl+0            30A0F90 6BBFC844 3014F9C
_kcbchg1+7ec         CALLrel  _kcrfwr+0
_ktuchg+630          CALLrel  _kcbchg1+0           0 4 3015224 301523C 0 0
_ktbchg2+75          CALLrel  _ktuchg+0            2 672D50F4 1 310DCD8 310DCE0
                                                   30A0F90 310D2F0 30A0ED0 0 0
_kddchg+18f          CALLrel  _ktbchg2+0           0 672D50F4 310DCD8 310DCE0
                                                   30A0F90 310D2E8 30A0ED0 0 0
_kduovw.53+6e3       CALLrel  _kddchg+0            310D2AC 310DCD8 310DCE0
                                                   30A0F90 30A0ED0 0 0
_kduurp.53+61a       CALLrel  _kduovw.53+0         310D2AC
_kdusru+aa5          CALLrel  _kduurp.53+0         310D2AC 672D514C
_kauupd+12e          CALLrel  _kdusru+0            310D6E0 672D514C 310D2AC 0
_updrow+729          CALLrel  _kauupd+0            310D6DC 672D514C 310D2AC 0
                                                   672D539C E F 672E4E48 12
                                                   301BBA0 301BBA4
_qerupFetch+107      CALLrel  _updrow+0
_updaul+202          CALL???  00000000             672DE9C4 0 672EC040 7FFF
_updThreePhaseExe+b  CALLrel  _updaul+0            672EBDD4 301BD30 0
6
_updexe+105          CALLrel  _updThreePhaseExe+0  672EBDD4 0 310D2AC 301BE0C
                                                   672EBDD4 1 301BE0C 0
_opiexe+f97          CALLrel  _updexe+0            672EBDD4 301BF48
_opiodr+4cd          CALLreg  00000000             4 3 301C894
_rpidrus.43+99       CALLrel  _opiodr+0            4 3 301C894 B
_skgmstack+71        CALLreg  00000000             301C484
_rpidru+6d           CALLrel  _skgmstack+0         301C49C 212600 F618 778198
                                                   301C484
_rpiswu2+17e         CALLreg  00000000             301C7BC
_rpidrv+109          CALLrel  _rpiswu2+0
_rpiexe+33           CALLrel  _rpidrv+0            B 4 301C894 8
_ktuscu+2a8          CALLrel  _rpiexe+0            B
_kqrcmt+2c2          CALL???  00000000             672EA898 3
..1.18_2.filter.95+  CALLrel  _kqrcmt+0            67B9C5F4 1 0 212778 212778 FF
159                                                0 0 0
..1.23_5.filter.99+  CALLrel  _ktcrcm+0            67B9C5F4 0 0 0 0 1 0 0
14d
_ktuini+64           CALLrel  _ktuiup.99+0         301D990
_adbdrv+2665         CALLrel  _ktuini+0            301D990
..1.5_1.filter.29+2  CALLrel  _adbdrv+0
9d
_opiosq0+9a4         CALLrel  _opiexe+0            4 0 301DDD8
_kpooprx+c6          CALLrel  _opiosq0+0           3 E 301DE70 24
_kpoal8+225          CALLrel  _kpooprx+0           301E738 301E680 13 1 0 24
_opiodr+4cd          CALLreg  00000000             5E 14 301E734
_ttcpip+a86          CALLreg  00000000             5E 14 301E734 0
_opitsk+2f4          CALLrel  _ttcpip+0
_opiino+5fc          CALLrel  _opitsk+0            0 0 2188C8 30CF020 E6 0
_opiodr+4cd          CALLreg  00000000             3C 4 301FBD4
_opidrv+233          CALLrel  _opiodr+0            3C 4 301FBD4 0
_sou2o+19            CALLrel  _opidrv+0
_opimai+10a          CALLrel  _sou2o+0
_OracleThreadStart@  CALLrel  _opimai+0
4+35c
7C80B726             CALLreg  00000000
--------------------- Binary Stack Dump ---------------------

进一步分析确定为system rollback segment header 异常

Block image after block recovery:
buffer tsn: 0 rdba: 0x00400009 (1/9)
scn: 0x0000.d794070f seq: 0x01 flg: 0x04 tail: 0x070f0e01
frmt: 0x02 chkval: 0x2320 type: 0x0e=KTU UNDO HEADER W/UNLIMITED EXTENTS
  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      spare2: 0      #extents: 6      #blocks: 47
                  last map  0x00000000  #maps: 0      offset: 4128
      Highwater::  0x00400183  ext#: 2      blk#: 2      ext size: 8
  #blocks in seg. hdr's freelists: 0
  #blocks below: 0
  mapblk  0x00000000  offset: 2
                   Unlocked
     Map Header:: next  0x00000000  #extents: 6    obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x0040000a  length: 7
   0x00400011  length: 8
   0x00400181  length: 8
   0x00400189  length: 8
   0x00400191  length: 8
   0x00400199  length: 8
  TRN CTL:: seq: 0x0056 chd: 0x0054 ctl: 0x0052 inc: 0x00000000 nfb: 0x0001
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00400183.0056.1b scn: 0x0000.d77996ab
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00400183.0056.1b ext: 0x2  spc: 0x794
    uba: 0x00000000.002f.21 ext: 0x5  spc: 0x1334
    uba: 0x00000000.002e.37 ext: 0x4  spc: 0x788
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
…………
ORA-00604: 递归 SQL 层 1 出现错误
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码,参数: [4194], [34], [28], [], [], [], [], []

这种问题需要通过通过bbed/ue修改ktuxc的相关内容,实现数据库open成功,可以参考另外几篇文章:
使用bbed解决ORA-00607/ORA-00600[4194]故障
通过bbed模拟ORA-00607/ORA-00600 4194 故障
ORA-607/ORA-600[4194]不一定是重大灾难
数据库报ORA-00607/ORA-00600[4194]错误

ORA-600 4194/ORA-600 4193/ORA-600 4137故障解决

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-600 4194/ORA-600 4193/ORA-600 4137故障解决

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

对于常见的undo异常错误,ORA-600 4193,ORA-600 4194,ORA-600 4137等错误的处理一般步骤.
适用版本

Oracle Database - Enterprise Edition - Version 9.2.0.1 to 11.2.0.4 [Release 9.2 to 11.2]
Information in this document applies to any platform.

报错现象

The following error is occurring in the alert.log right before the database crashes.
ORA-00600: internal error code, arguments: [4194], [#], [#], [], [], [], [], []
This error indicates that a mismatch has been detected between redo records and rollback (undo) records.
ARGUMENTS:
Arg [a] - Maximum Undo record number in Undo block
Arg [b] - Undo record number from Redo block
Since we are adding a new undo record to our undo block, we would expect that the new record number
 is equal to the maximum record number in the undo block plus one. Before Oracle can add
a new undo record to the undo block it validates that this is correct. If this validation fails,
 then an ORA-600 [4194] will be triggered.

报错原因

This also can be cause by the following defect
Bug 8240762 Abstract: Undo corruptions with ORA-600 [4193]/ORA-600 [4194] or ORA-600 [4137] after SHRINK
Details:
Undo corruption may be caused after a shrink and the same undo block may be used
for two different transactions causing several internal errors like:
ORA-600 [4193] / ORA-600 [4194] for new transactions
ORA-600 [4137] for a transaction rollback

处理步骤

Best practice to create a new undo tablespace.
This method includes segment check.
Create pfile from spfile to edit
>create pfile from spfile;
1. Shutdown the instance
2. set the following parameters in the pfile
    undo_management = manual
    event = '10513 trace name context forever, level 2'
3. >startup restrict pfile=<initsid.ora>
4. >select tablespace_name, status, segment_name from dba_rollback_segs where status != 'OFFLINE';
This is critical - we are looking for all undo segments to be offline - System will always be online.
If any are 'PARTLY AVAILABLE' or 'NEEDS RECOVERY' - Please open an issue with Oracle Support or update the current SR.
If all offline then continue to the next step
5. Create new undo tablespace - example
>create undo tablespace <new undo tablespace> datafile <datafile> size 2000M;
6. Drop old undo tablespace
>drop tablespace <old undo tablespace> including contents and datafiles;
7. >shutdown immediate;
8 >startup nomount;  --> Using your Original spfile
9 modify the spfile with the new undo tablespace name
  Alter system set undo_tablespace = '<new tablespace created in step 5>' scope=spfile;
10. >shutdown immediate;
11. >startup;  --> Using spfile
The reason we create a new undo tablespace first is to use new undo segment numbers
 that are higher then the current segments being used.
This way when a transaction goes to do block clean-out
the reference to that undo segment does not exist and continues with the block clean-out.

参考:tep by step to resolve ORA-600 4194 4193 4197 on database crash (Doc ID 1428786.1)

How to resolve ORA-600 [4194] errors

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:How to resolve ORA-600 [4194] errors

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在oracle恢复中ORA-600 4194是一个非常常见的错误,该错误的主要原因是由于redo记录和undo(rollback)记录不匹配.
ORA 600 4194错误原因以及含义

ERROR:
  Format: ORA-600 [4194] [a] [b]
VERSIONS:
  versions 6.0 to 12.1
DESCRIPTION:
  A mismatch has been detected between Redo records and rollback (Undo)
  records.
  We are validating the Undo record number relating to the change being
  applied against the maximum undo record number recorded in the undo block.
  This error is reported when the validation fails.
ARGUMENTS:
  Arg [a] Maximum Undo record number in Undo block
  Arg [b] Undo record number from Redo block

ORA 600 4194 错误处理思路
第一步

Confirm whether the database is up and running or not.  If the database fails to start or crashes shortly
after startup due to this error occurring, then try setting event 10513 at level 2 in the init.ora/spfile
to disable transaction recovery and restart the instance, e.g.:
      event = "10513 trace name context forever, level 2"
This may allow the database to successfully open and stay up so that
the required diagnostics/actions can be performed.

第二步

In the trace file there should be an undo segment header dump, and so check
to see if the undo segment header shows an active transaction after recovery, e.g.:
TRN TBL    <---- Represents the Transaction table for the particular undo segment
index state cflags wrap# uel scn dba
---------------------------------------------------------------------------------------------
0x41 9 0x80 0x35ab6 0xffff 0x0695.38f6b959 0x1081e796
0x42 9 0x80 0x35bb1 0x000e 0x0695.38f6b028 0x1081e793
0x43 9 0x80 0x35b11 0x005d 0x0695.38f6b7ae 0x1081e795
0x44 9 0x80 0x359f0 0x0036 0x0695.38f69a91 0x1081e78e
0x45 10 0x80 0x35b1b 0x0000 0x0695.3a0aba4d 0x1081e796
0x46 9 0x80 0x35bb7 0x001c 0x0695.38f69bde 0x1081e78f
===================================
State ---> This column specifies the status of the transaction
                  9 -----> represents a commited transaction
                  10 ---> Represents a active transaction
Dba -----> Undo block containing the undo records
                  Strictly speaking this is the block at the end of the undo chain.
You can see from the transaction table that there is an active transaction
for this particular rollback/undo segment after recovery.
Therefore this rollback/undo segment and/or undo tablespace cannot be dropped without corrupting the database!
Therefore recreating the UNDO tablespace is not an option.

第三步

From the trace file determine the affected undo segment, e.g.:
Block image after block recovery:
UNDO BLK:
xid: 0x0015.02b.0001544b seq: 0x163e cnt: 0x12 irb: 0x12 icl: 0x0 flg: 0x0000
XID ==> Undo segment no + Slot no + Sequence no
Therefore, in this case the Undo Segment is:
USN# 0x15 (Hex) ==> 21 (Dec)  ==> _SYSSMU21$
So if and ONLY IF the transaction table shows no active transaction can the
 rollback/undo segment be offlined and dropped.Note however,
that before you can confirm if the entire UNDO tablespace can be dropped, you would need to check the
transaction tables of ALL active rollback/undo segments in the same wasy as the above.
The steps required to drop the rollback/undo segment are fully detailed in Note:179952.1,
but are briefly listed here for completeness:
If using Automatic Undo Management
Offline the undo segment using the _OFFLINE_ROLLBACK_SEGMENTS parameter and bounce the database as follows:
1.  Create  and edit the init.ora file for the instance to set the following parameters:
UNDO_MANAGEMENT=MANUAL
_OFFLINE_ROLLBACK_SEGMENTS=(_SYSSMU21$)
2.  Open the database in restricted mode to prevent user access, e.g.:
connect / as sysdba
startup restrict pfile = '<Full path to init.ora file>';
3.  Drop the rollback/undo segment, e.g.:
drop rollback segment "_SYSSMU21";
4.  Shutdown the instance, and remove the init.ora parameters added in point 1 and restart the instance, e.g.:
shutdown immediate
startup
If SMON was recovering the transaction then this may not work as we cannot open the database if it is determined
to be in an inconsistent state. I have reviewed a number of SRs where this approach was successful,
so it is important to try it first but understand that it may fail and you will have to resort to
a point in time recovery or forcing open the DB and recreating it.

第四步

Now we need to dump the undo block to see which object was affected.
We noted in Step 2 that this is the active transaction (from the trace file):
TRN TBL
index state cflags wrap# uel scn dba
0x45 10 0x80 0x35b1b 0x0000 0x0695.3a0aba4d 0x1081e796
Dba----------------> Undo block containing the undo records
dba--->0x1081e796 is the block containing the active transaction .
Use the WebIV tools to convert this RDBA to block number (block#) and file number (file#), e.g.:
V SPLIT ==> DBA (Hex) = File#,Block# (Hex File#,Block#)
= ===== === ===== ============
V8 10,10 ==> 276948886 (0x1081e796) = 66,124822 (0x42 0x1e796)
So the file# is 66 and the block# is 124822, so dump the block by issuing:
SQL> Alter system dump datafile 66 block 124822;
This will generate a trace file in the user_dump_dest.  The following is a sample of the information in the undo block:
UNDO BLK:
xid: 0x000c.045.00035b1b seq: 0x1e14 cnt: 0x17 irb: 0x17 icl: 0x0 flg: 0x0000
Rec Offset Rec Offset Rec Offset Rec Offset Rec Offset
---------------------------------------------------------------------------
0x01 0x1f8c 0x02 0x1f30 0x03 0x1ed4 0x04 0x1e78 0x05 0x1e1c
0x06 0x1dc0 0x07 0x1d64 0x08 0x1d08 0x09 0x1cac 0x0a 0x1c50
0x0b 0x1bf4 0x0c 0x1b98 0x0d 0x1b3c 0x0e 0x1ae0 0x0f 0x1a74
0x10 0x1a18 0x11 0x19bc 0x12 0x1960 0x13 0x1904 0x14 0x187c
0x15 0x181c 0x16 0x1798 0x17 0x173c
* Rec #0x16 slt: 0x45 objn: 1485619(0x0016ab33) objd: 1485619 tblspc: 71(0x00000047)
* Layer: 11 (Row) opc: 1 rci 0x00
Undo type: Regular undo Begin trans Last buffer split: No
Temp Object: No
Tablespace Undo: No
rdba: 0x00000000
*-----------------------------
uba: 0x1081e796.1e14.14 ctl max scn: 0x0695.38f69853 prv tx scn: 0x0695.38f698a1
KDO undo record:
KTB Redo
op: 0x04 ver: 0x01
op: L itl: scn: 0x0019.009.00034237 uba: 0x36c0cce4.1d2f.19
flg: C--- lkc: 0 scn: 0x0695.38f6b96b
KDO Op code: URP xtype: XA bdba: 0x35406893 hdba: 0x35406892
itli: 1 ispac: 0 maxfr: 4863
tabn: 0 slot: 0(0x0) flag: 0x2c lock: 0 ckix: 0
ncol: 1 nnew: 1 size: -1
col 0: [ 4] c3 0e 36 2e
*-----------------------------
* Rec #0x17 slt: 0x45 objn: 1485619(0x0016ab33) objd: 1485619 tblspc: 71(0x00000047)
* Layer: 11 (Row) opc: 1 rci 0x16
Undo type: Regular undo Last buffer split: No
Temp Object: No
Tablespace Undo: No
rdba: 0x00000000
*-----------------------------
From the trace file above:
UNDO BLK:
xid: 0x000c.045.00035b1b seq: 0x1e14 cnt: 0x17 irb: 0x17 icl: 0x0 flg: 0x0000
The undo segment with the active transaction is segment is 0x000c (Hex) which is 12 (Dec) as the XID is:
      Undo segment no + Slot no + Sequence no
This step is often skipped because it was performed earlier in step 3, however it is a good idea to do this
again now to make sure that the XID from the UNDO block matches the UNDO SEGMENT HEADER,
this way you have followed all the chain, from the UNDO SEGMENT to UNDO BLOCK, back and forth.
If there is a conflict here please check and make sure that the customer dumped the correct undo block.
Check for the value of irb which is an index which points you to the latest change done to the undo block.
This is the point from which a rollback would begin if one was issued.
From the trace file we see: 'irb: 0x17' so this points to record 0x17,
so search for this particular string i.e 0x17 and it will take you to undo record 'REC #0x17', e.g.:
* Rec #0x17 slt: 0x45 objn: 1485619(0x0016ab33) objd: 1485619 tblspc: 71(0x00000047)
* Layer: 11 (Row) opc: 1 rci 0x16
Undo type: Regular undo Last buffer split: No
Temp Object: No
Tablespace Undo: No
rdba: 0x00000000
*-----------------------------
Note the slot number (slt) is 0x45, the object number (objn) is the OBJECT_ID from dba_objects
and data object number (objd) is the DATA_OBJECT_ID from dba_objects.
These numbers may be the same but not necessarily, and so if the database is open then identify this object, e.g.:
        select object_name, owner, object_type, data_object_id from dba_objects where object_id = <objn>;
This is the object, which has an active transaction.  Note in the above trace file extract that rci
has a value of 0x16 which means that this record is at the end of an undo chain.
This means that the chain continues in another UNDO BLOCK.
Please refer to unpublished Note:281504.1 for information on Undo chains.
So the next record that needs to be rolled back is present in REC #X016.
If rci is 0x00 then it means that this is the first record present in the undo chain
and so you can check to see if there is rdba info, e.g.:
* Rec #0x16 slt: 0x45 objn: 1485619(0x0016ab33) objd: 1485619 tblspc: 71(0x00000047)
* Layer: 11 (Row) opc: 1 rci 0x00
Undo type: Regular undo Begin trans Last buffer split: No
Temp Object: No
Tablespace Undo: No
rdba: 0x00000000
*-----------------------------
uba: 0x1081e796.1e14.14 ctl max scn: 0x0695.38f69853 prv tx scn: 0x0695.38f698a1
KDO undo record:
KTB Redo
op: 0x04 ver: 0x01
op: L itl: scn: 0x0019.009.00034237 uba: 0x36c0cce4.1d2f.19
flg: C--- lkc: 0 scn: 0x0695.38f6b96b
KDO Op code: URP xtype: XA bdba: 0x35406893 hdba: 0x35406892
itli: 1 ispac: 0 maxfr: 4863
tabn: 0 slot: 0(0x0) flag: 0x2c lock: 0 ckix: 0
ncol: 1 nnew: 1 size: -1
col 0: [ 4] c3 0e 36 2e
*-----------------------------
If the object is an Index, drop and recreate it.  If it is a table,
then again the table would need to be dropped and recreated (or truncated)
so that its object number changes and hence the rollback/undo is no longer required.
If this isn't possible, then you have two options:
First take a backup of the database in its current state.
This is critical in case anything goes wrong and you lose the opportunity to salvage the data!
Option 1
 - Restore the undo segment datafile and the datafile containing the object and perform a full recovery.
   This can only be done if you have all the archived redo as you will need to do full recovery on these files.
OR
Option 2
If option 1 is not possible, you can use the unsupported method, e.g.:
Specify the undo segment in the _OFFLINE_ROLLBACK_SEGMENTS parameter and try to drop the rollback segment.
If there is an active transaction then this is not likely to work and you will probably need
to set the _CORRUPTED_ROLLBACK_SEGMENTS parameter as well

温馨提示:
1.隐含参数_OFFLINE_ROLLBACK_SEGMENTS/_CORRUPTED_ROLLBACK_SEGMENTS属于Oracle内部隐含参数,建议在Oracle support认可的情况下使用,因为使用之后可能导致数据库事务完整性彻底损坏
2.进行屏蔽事务之前,如果条件允许建议使用txchecker检查
2.使用上述方法恢复数据库之后,建议通过逻辑方式导出导入重建数据库

Exadata火线救援:10TB级数据恢复—强制拉库篇

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:Exadata火线救援:10TB级数据恢复—强制拉库篇

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

这个库的恢复有一些历史故事(【力荐】Exadata火线救援:10TB级数据修复经典案例详解!):xx运营商x2的1/4配置的oracle exadata机器,跑了近6年,最近有一个cell节点主机异常,在rebalance过程中,只有两个节点的cell其中一个节点坏了一个硬盘导致.导致asm diskgroup无法正常mount,最后该运营商运维三方通过amdu把该一体机中的数据文件全部抽出来,然后在恢复过程中出现大量错误无法解决,请求我们支持
数据库open过程报ORA-01555错误

Thu Jul  14 00:01:04 2016
alter database open
Thu Jul  14 00:01:04 2016
Thread 1 advanced to log sequence 2 (thread open)
Thread 1 opened at log sequence 2
  Current log# 2 seq# 2 mem# 0: /data/amdu/redo/DATA_EC_260.f
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Jul  14 00:01:05 2016
SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0b26.9f080238):
select ctime, mtime, stime from obj$ where obj# = :1
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_59546.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 83 with name &quot;_SYSSMU83_1078760807$&quot; too small
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_59546.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 83 with name &quot;_SYSSMU83_1078760807$&quot; too small
Error 704 happened during db open, shutting down database
USER (ospid: 59546): terminating the instance due to error 704
Instance terminated by USER, pid = 59546
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (59546) as a result of ORA-1092

这个错误比较常见,可以通过推scn就可以解决,由于已经安装了scn patch,通过oradebug推scn解决该问题.

ORA-600 4194
这个ora 600 4194相对比较特殊,在SMON: enabling cache recovery之后立马报出来,然后实例直接open失败.

Thu Jul  14 00:06:15 2016
alter database open
Thu Jul  14 00:06:15 2016
Thread 1 advanced to log sequence 3 (thread open)
Thread 1 opened at log sequence 3
  Current log# 3 seq# 3 mem# 0: /data/amdu/redo/DATA_EC_263.f
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Jul  14 00:06:15 2016
SMON: enabling cache recovery
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_60038.trc  (incident=1080450):
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Block recovery from logseq 3, block 3 to scn 12269096776739
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0
  Mem# 0: /data/amdu/redo/DATA_EC_263.f
Block recovery stopped at EOT rba 3.5.16
Block recovery completed at rba 3.5.16, scn 2856.2670179361
Block recovery from logseq 3, block 3 to scn 12269096776736
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0
  Mem# 0: /data/amdu/redo/DATA_EC_263.f
Block recovery completed at rba 3.5.16, scn 2856.2670179361
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_60038.trc:
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_60038.trc:
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 60038): terminating the instance due to error 600
Instance terminated by USER, pid = 60038
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (60038) as a result of ORA-1092

trace文件分析
打开数据库报ORA-600[4194]错误,对启动过程进行10046跟踪并且分析trace文件发现

PARSING IN CURSOR #140375370511672 len=148 dep=1 uid=0 oct=6 lid=0 tim=3501342849457766 hv=3540833987
ad='a47df47a8' sqlid='5ansr7r9htpq3'
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,scnbas=:9,
scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
END OF STMT
PARSE #140375370511672:c=27996,e=28041,p=66,cr=224,cu=0,mis=1,r=0,dep=1,og=4,plh=0,tim=3501342849457765
BINDS #140375370511672:
 Bind#0
  oacdty=01 mxl=32(20) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=178 siz=32 off=0
  kxsbbbfp=a47e093ca  bln=32  avl=20  flg=09
  value=&quot;_SYSSMU1_2856534670$&quot;
 Bind#1
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fabae3b92b0  bln=24  avl=03  flg=05
  value=1024
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fabae3b9280  bln=24  avl=03  flg=05
  value=128
 Bind#3
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fabae3b9248  bln=24  avl=02  flg=05
  value=5
 Bind#4
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fabae3b9218  bln=24  avl=02  flg=05
  value=1
 Bind#5
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fabae3b91e8  bln=24  avl=03  flg=05
  value=3398
 Bind#6
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fabae3b91b8  bln=24  avl=05  flg=05
  value=1485261
 Bind#7
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fabae3b9180  bln=24  avl=06  flg=05
  value=1946693999
 Bind#8
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fabae3b8ec8  bln=24  avl=03  flg=05
  value=2847
 Bind#9
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fabae3b8e98  bln=24  avl=02  flg=05
  value=1
 Bind#10
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fabae3b8e68  bln=24  avl=02  flg=05
  value=2
 Bind#11
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fabae3b8e38  bln=24  avl=02  flg=05
  value=2
 Bind#12
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7fabae3b92e0  bln=22  avl=02  flg=05
  value=1
WAIT #140375370511672: nam='db file sequential read' ela= 21 file#=1 block#=179020 blocks=1 obj#=0 tim=3501342849459353
*** 2016-07-14 03:14:09.548
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []

很明显数据库是在update undo$的时候,读取到file 1 block 179020的时候报错,继续分析trace文件

Error 600 in redo application callback
Dump of change vector:
TYP:0 CLS:16 AFN:1 DBA:0x0042bb4c OBJ:4294967295 SCN:0x0b26.a88e815e SEQ:1 OP:5.1 ENC:0 RBL:0
ktudb redo: siz: 272 spc: 4906 flg: 0x0012 seq: 0x0f4c rec: 0x0d
            xid:  0x0000.01b.00000b63
ktubl redo: slt: 27 rci: 0 opc: 11.1 [objn: 15 objd: 15 tsn: 0]
Undo type:  Regular undo        Begin trans    Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
             0x00000000  prev ctl uba: 0x0042bb4c.0f4c.0c
prev ctl max cmt scn:  0x0b23.ccb9eb89  prev tx cmt scn:  0x0b23.ccb9ebac
txn start scn:  0xffff.ffffffff  logon user: 0  prev brb: 4373321  prev bcl: 0 BuExt idx: 0 flg2: 0
KDO undo record:
KTB Redo
op: 0x04  ver: 0x01
compat bit: 4 (post-11) padding: 1
op: L  itl: xid:  0x0000.01c.00000b65 uba: 0x0042bb4a.0f4c.1d
                      flg: C---    lkc:  0     scn: 0x0b25.7391ee5c
KDO Op code: URP row dependencies Disabled
  xtype: XA flags: 0x00000000  bdba: 0x004000e1  hdba: 0x004000e0
itli: 2  ispac: 0  maxfr: 4863
tabn: 0 slot: 1(0x1) flag: 0x2c lock: 0 ckix: 0
ncol: 17 nnew: 12 size: 0
col  1: [20]  5f 53 59 53 53 4d 55 31 5f 32 38 35 36 35 33 34 36 37 30 24
col  2: [ 2]  c1 02
col  3: [ 3]  c2 0b 19
col  4: [ 3]  c2 02 1d
col  5: [ 6]  c5 14 2f 46 28 64
col  6: [ 3]  c2 1d 30
col  7: [ 5]  c4 02 31 35 3e
col  8: [ 3]  c2 22 63
col  9: [ 2]  c1 02
col 10: [ 2]  c1 04
col 11: [ 2]  c1 03
col 16: [ 2]  c1 03
Block after image is corrupt:
buffer tsn: 0 rdba: 0x0042bb4c (1/179020)
scn: 0x0b26.a88e815e seq: 0x01 flg: 0x04 tail: 0x815e0201
frmt: 0x02 chkval: 0xf022 type: 0x02=KTU UNDO BLOCK
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x000000094E07A000 to 0x000000094E07C000
94E07A000 0000A202 0042BB4C A88E815E 04010B26  [....L.B.^...&amp;...]
94E07A010 0000F022 004E0000 00000B5B 1D1D0F4C  [&quot;.....N.[...L...]

我们知道在file 1 block 179020的时候redo和undo信息不匹配,出现了上述的ORA 600 4194的错误.进一步分析

Block image after block recovery:
buffer tsn: 0 rdba: 0x00400080 (1/128)
scn: 0x0b26.6389e19d seq: 0x01 flg: 0x04 tail: 0xe19d0e01
frmt: 0x02 chkval: 0x7c95 type: 0x0e=KTU UNDO HEADER W/UNLIMITED EXTENTS
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x000000094DAB2000 to 0x000000094DAB4000
94DAB2000 0000A20E 00400080 6389E19D 04010B26  [......@....c&amp;...]
94DAB2010 00007C95 00000000 00000000 00000000  [.|..............]
94DAB2020 00000000 00000015 000002FF 00001020  [............ ...]
94DAB2030 0000000D 0000004C 00000080 0042BB4C  [....L.......L.B.]
  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      spare2: 0      #extents: 21     #blocks: 767
                  last map  0x00000000  #maps: 0      offset: 4128
      Highwater::  0x0042bb4c  ext#: 13     blk#: 76     ext size: 128
  #blocks in seg. hdr's freelists: 0
  #blocks below: 0
  mapblk  0x00000000  offset: 13
                   Unlocked
     Map Header:: next  0x00000000  #extents: 21   obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x00400081  length: 7
   0x00413a38  length: 8
   0x00400088  length: 8
   0x00413a30  length: 8
   0x0042b888  length: 8
   0x0042b890  length: 8
   0x0042b898  length: 8
   0x0042b8a0  length: 8
   0x0042b8a8  length: 8
   0x0042b8b0  length: 8
   0x0042b8b8  length: 8
   0x0042b8c0  length: 8
   0x0042ba80  length: 128
   0x0042bb00  length: 128
   0x0042bc00  length: 128
   0x0042bc80  length: 128
   0x0042bb80  length: 128
   0x00400210  length: 8
   0x00400218  length: 8
   0x00400220  length: 8
   0x00400228  length: 8
  TRN CTL:: seq: 0x0f4c chd: 0x001b ctl: 0x0043 inc: 0x00000000 nfb: 0x0001
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x0042bb4c.0f4c.0c scn: 0x0b23.ccb9eb89
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x0042bb4c.0f4c.0c ext: 0xd  spc: 0x132a
    uba: 0x00000000.0f4c.0c ext: 0xd  spc: 0x12f6
    uba: 0x00000000.0f4c.01 ext: 0xd  spc: 0x1ec8
    uba: 0x00000000.0f4c.04 ext: 0xd  spc: 0x1b86
    uba: 0x00000000.0f4c.09 ext: 0xd  spc: 0x162c
  TRN TBL::
  index  state cflags  wrap#    uel         scn            dba            parent-xid    nub     stmt_num
  ------------------------------------------------------------------------------------------------
   0x00    9    0x00  0x0b62  0x0011  0x0b25.2dacaf61  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x01    9    0x00  0x0b64  0x0024  0x0b24.a6a2cf7b  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x02    9    0x00  0x0b65  0x0036  0x0b25.7391eda0  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x03    9    0x00  0x0b4f  0x0007  0x0b24.337bf49b  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x04    9    0x00  0x0b64  0x0051  0x0b23.ff22c637  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x05    9    0x00  0x0b64  0x0022  0x0b26.4393eb1e  0x0042bb4c  0x0000.000.00000000  0x00000001   0x00000000
   0x06    9    0x00  0x0b66  0x0058  0x0b24.335c794d  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x07    9    0x00  0x0b4f  0x001d  0x0b24.4e05f2af  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x08    9    0x00  0x0b65  0x005e  0x0b23.ff22c618  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x09    9    0x00  0x0b5f  0x0035  0x0b24.337bf3d9  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x0a    9    0x00  0x0b64  0x004f  0x0b25.7391ee5f  0x0042bb4c  0x0000.000.00000000  0x00000001   0x00000000
   0x0b    9    0x00  0x0b64  0x0040  0x0b24.335c7bd7  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x0c    9    0x00  0x0b65  0x0002  0x0b25.7391e929  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x0d    9    0x00  0x0b4f  0x0033  0x0b24.a6a2caa5  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x0e    9    0x00  0x0b65  0x0008  0x0b23.ff22c616  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x0f    9    0x00  0x0b61  0x0038  0x0b26.6389e195  0x0042bb4c  0x0000.000.00000000  0x00000001   0x00000000
   0x10    9    0x00  0x0b4f  0x002a  0x0b24.bcff18b3  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x11    9    0x00  0x0b5c  0x0059  0x0b25.2dacaf69  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x12    9    0x00  0x0b65  0x0026  0x0b25.2dacb0a8  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x13    9    0x00  0x0b66  0x0021  0x0b24.a6a2caaa  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x14    9    0x00  0x0b62  0x0009  0x0b24.337bf3d7  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x15    9    0x00  0x0b63  0x0031  0x0b25.1b4e13ba  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x16    9    0x00  0x0b66  0x003b  0x0b25.2dacee5d  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x17    9    0x00  0x0b63  0x0034  0x0b26.6389e199  0x0042bb4c  0x0000.000.00000000  0x00000001   0x00000000
   0x18    9    0x00  0x0b5d  0x002f  0x0b24.bcff18af  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x19    9    0x00  0x0b5b  0x004d  0x0b24.d60f78e3  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x1a    9    0x00  0x0b60  0x005b  0x0b25.1b4e13be  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x1b    9    0x00  0x0b62  0x003e  0x0b23.ccb9ebac  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x1c    9    0x00  0x0b65  0x000a  0x0b25.7391ee5c  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x1d    9    0x00  0x0b64  0x002c  0x0b24.4e05f2b1  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x1e    9    0x00  0x0b64  0x0045  0x0b24.33255ae9  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x1f    9    0x00  0x0b64  0x0015  0x0b25.1b4e13b5  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x20    9    0x00  0x0b63  0x0050  0x0b24.335c79a4  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x21    9    0x00  0x0b5d  0x0001  0x0b24.a6a2caf0  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x22    9    0x00  0x0b65  0x000f  0x0b26.4393eb20  0x0042bb4c  0x0000.000.00000000  0x00000001   0x00000000
   0x23    9    0x00  0x0b62  0x0042  0x0b24.337bf3d2  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x24    9    0x00  0x0b65  0x003c  0x0b24.a6a2d137  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x25    9    0x00  0x0b62  0x0020  0x0b24.335c795d  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x26    9    0x00  0x0b63  0x0052  0x0b25.2dacee48  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x27    9    0x00  0x0b4e  0x003a  0x0b25.7391ee58  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x28    9    0x00  0x0b65  0x0049  0x0b25.2dacb089  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x29    9    0x00  0x0b61  0x0030  0x0b24.bcff18bb  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x2a    9    0x00  0x0b65  0x0057  0x0b24.bcff18b5  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x2b    9    0x00  0x0b64  0x0054  0x0b25.2dacee55  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x2c    9    0x00  0x0b61  0x000d  0x0b24.4e05f2b3  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x2d    9    0x00  0x0b64  0x000e  0x0b23.ff22c611  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x2e    9    0x00  0x0b65  0x0053  0x0b25.7391e78a  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x2f    9    0x00  0x0b66  0x0010  0x0b24.bcff18b1  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x30    9    0x00  0x0b63  0x0019  0x0b24.d60f78e1  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x31    9    0x00  0x0b65  0x001a  0x0b25.1b4e13bc  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x32    9    0x00  0x0b65  0x0037  0x0b24.a6a2d369  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x33    9    0x00  0x0b4f  0x0013  0x0b24.a6a2caa7  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x34    9    0x00  0x0b63  0x0043  0x0b26.6389e19b  0x0042bb4c  0x0000.000.00000000  0x00000001   0x00000000
   0x35    9    0x00  0x0b65  0x0046  0x0b24.337bf409  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x36    9    0x00  0x0b52  0x0061  0x0b25.7391eda2  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x37    9    0x00  0x0b64  0x0018  0x0b24.bcff18ad  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x38    9    0x00  0x0b65  0x0017  0x0b26.6389e197  0x0042bb4c  0x0000.000.00000000  0x00000001   0x00000000
   0x39    9    0x00  0x0b62  0x0006  0x0b24.335c7947  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x3a    9    0x00  0x0b64  0x001c  0x0b25.7391ee5a  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x3b    9    0x00  0x0b4d  0x002e  0x0b25.7391e730  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x3c    9    0x00  0x0b65  0x0032  0x0b24.a6a2d144  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x3d    9    0x00  0x0b65  0x0016  0x0b25.2dacee59  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x3e    9    0x00  0x0b63  0x002d  0x0b23.ff22c60b  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x3f    9    0x00  0x0b63  0x005f  0x0b24.335c7b57  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x40    9    0x00  0x0b65  0x0044  0x0b24.335c7bd9  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x41    9    0x00  0x0b65  0x0029  0x0b24.bcff18b9  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x42    9    0x00  0x0b61  0x0014  0x0b24.337bf3d4  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x43    9    0x00  0x0b5b  0xffff  0x0b26.6389e19d  0x0042bb4c  0x0000.000.00000000  0x00000001   0x00000000
   0x44    9    0x00  0x0b4c  0x004b  0x0b24.335c7bdb  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x45    9    0x00  0x0b61  0x0039  0x0b24.335c7945  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x46    9    0x00  0x0b65  0x005a  0x0b24.337bf421  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x47    9    0x00  0x0b65  0x0027  0x0b25.7391eda8  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x48    9    0x00  0x0b62  0x004e  0x0b24.335c7953  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x49    9    0x00  0x0b63  0x0012  0x0b25.2dacb09f  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x4a    9    0x00  0x0b63  0x0023  0x0b24.335c7bf8  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x4b    9    0x00  0x0b5b  0x004a  0x0b24.335c7bf3  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x4c    9    0x00  0x0b63  0x003f  0x0b24.335c7b55  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x4d    9    0x00  0x0b61  0x001f  0x0b25.1b4e13b3  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x4e    9    0x00  0x0b5a  0x0025  0x0b24.335c795b  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x4f    9    0x00  0x0b5f  0x005c  0x0b25.8ff588bb  0x0042bb4c  0x0000.000.00000000  0x00000001   0x00000000
   0x50    9    0x00  0x0b63  0x004c  0x0b24.335c79a7  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x51    9    0x00  0x0b64  0x005d  0x0b24.0c84cbf4  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x52    9    0x00  0x0b65  0x002b  0x0b25.2dacee49  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x53    9    0x00  0x0b64  0x000c  0x0b25.7391e927  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x54    9    0x00  0x0b63  0x003d  0x0b25.2dacee56  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x55    9    0x00  0x0b64  0x0005  0x0b26.4393eada  0x0042bb4c  0x0000.000.00000000  0x00000001   0x00000000
   0x56    9    0x00  0x0b64  0x0055  0x0b26.4393ead3  0x0042bb4c  0x0000.000.00000000  0x00000001   0x00000000
   0x57    9    0x00  0x0b60  0x0041  0x0b24.bcff18b7  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x58    9    0x00  0x0b65  0x0060  0x0b24.335c794f  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x59    9    0x00  0x0b60  0x0028  0x0b25.2dacaf74  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x5a    9    0x00  0x0b61  0x0003  0x0b24.337bf499  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
   0x5b    9    0x00  0x0b62  0x0000  0x0b25.1b4e13c0  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x5c    9    0x00  0x0b62  0x0056  0x0b25.950a6e4b  0x0042bb4c  0x0000.000.00000000  0x00000001   0x00000000
   0x5d    9    0x00  0x0b63  0x001e  0x0b24.33255ae7  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x5e    9    0x00  0x0b5f  0x0004  0x0b23.ff22c635  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x5f    9    0x00  0x0b64  0x000b  0x0b24.335c7bd5  0x0042bb49  0x0000.000.00000000  0x00000001   0x00000000
   0x60    9    0x00  0x0b61  0x0048  0x0b24.335c7950  0x0042bb4b  0x0000.000.00000000  0x00000001   0x00000000
   0x61    9    0x00  0x0b65  0x0047  0x0b25.7391eda6  0x0042bb4a  0x0000.000.00000000  0x00000001   0x00000000
KQRCMT: Write failed with error=600 po=0xa47e092c0 cid=3
diagnostics : cid=3 hash=35e74caf flag=2a
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []

到这里我们基本上明白了报错的file 1 block 179020是由FREE BLOCK POOL分配出来的,现在解决给问题的思路就是直接使用bbed分配一个新块即可.

ORA-600 6711
数据库无法正常open报ORA 600 6711错误

Thu Jul  14 04:04:28 2016
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 32 processes
Started redo scan
Completed redo scan
 read 1 KB redo, 0 data blocks need recovery
Started redo application at
 Thread 1: logseq 2, block 3
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: /data/amdu/redo/DATA_EC_260.f
Completed redo application of 0.00MB
Completed crash recovery at
 Thread 1: logseq 2, block 5, scn 12269633687653
 0 data blocks read, 0 data blocks written, 1 redo k-bytes read
Thu Jul  14 04:04:29 2016
Thread 1 advanced to log sequence 3 (thread open)
Thread 1 opened at log sequence 3
  Current log# 3 seq# 3 mem# 0: /data/amdu/redo/DATA_EC_263.f
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
After successful startup of the database, please remove
the parameters _allow_error_simulation and _smu_debug_mode
and restart the database
Thu Jul  14 04:04:29 2016
SMON: enabling cache recovery
Undo initialization finished serial:0 start:270626334 end:270626544 diff:210 (2 seconds)
Dictionary check beginning
Tablespace 'TEMP' #3 found in data dictionary,
but not in the controlfile. Adding to controlfile.
Corrected file 19 plugged in read-only status in control file
Corrected file 81 plugged in read-only status in control file
Corrected file 88 plugged in read-only status in control file
Corrected file 93 plugged in read-only status in control file
Corrected file 128 plugged in read-only status in control file
Corrected file 130 plugged in read-only status in control file
Corrected file 131 plugged in read-only status in control file
Corrected file 163 plugged in read-only status in control file
Corrected file 181 plugged in read-only status in control file
Corrected file 184 plugged in read-only status in control file
Corrected file 186 plugged in read-only status in control file
Corrected file 191 plugged in read-only status in control file
Corrected file 214 plugged in read-only status in control file
Corrected file 220 plugged in read-only status in control file
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
         This condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
         ALTER TABLESPACE &lt;tablespace_name&gt; ADD TEMPFILE
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
Updating character set in controlfile to ZHS16GBK
WARNING: event 8105 is set. This event disables failed
         online index [re]build cleanup
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Thu Jul  14 04:04:30 2016
QMNC started with pid=57, OS id=3549
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_3401.trc  (incident=1440450):
ORA-00600: internal error code, arguments: [6711], [4293062], [1], [4318348], [0], [], [], [], [], [], [], []
Incident details in: /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei/incident/incdir_1440450/xifenfei_ora_3401_i1440450.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Error 600 in kwqmnpartition(), aborting txn
Thu Jul  14 04:04:32 2016
Dumping diagnostic data in directory=[cdmp_20801214040432], requested by (instance=1, osid=3401), summary=[incident=1440450].
Thu Jul  14 04:04:32 2016
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_mmon_3329.trc  (incident=1440178):
ORA-00600: internal error code, arguments: [6711], [4293062], [1], [4318348], [0], [], [], [], [], [], [], []
Incident details in: /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei/incident/incdir_1440178/xifenfei_mmon_3329_i1440178.trc
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_3401.trc  (incident=1440451):
ORA-00600: internal error code, arguments: [6711], [4293062], [1], [4318348], [0], [], [], [], [], [], [], []
Incident details in: /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei/incident/incdir_1440451/xifenfei_ora_3401_i1440451.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_mmon_3329.trc  (incident=1440179):
ORA-00600: internal error code, arguments: [6711], [4293062], [1], [4318348], [0], [], [], [], [], [], [], []
Incident details in: /oracle/app/oracle/diag/rdbms/xifenfei/xifenfei/incident/incdir_1440179/xifenfei_mmon_3329_i1440179.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORA-600 signalled during: alter database open...

数据库正常open失败,但是可以upgrade启动.根据对trace文件的分析,定位到问题在HISTGRM$表上面,进一步分析该表结构为

CREATE TABLE SYS.HISTGRM$
(
  OBJ#      NUMBER,
  COL#      NUMBER,
  ROW#      NUMBER,
  BUCKET    NUMBER,
  ENDPOINT  NUMBER,
  INTCOL#   NUMBER,
  EPVALUE   VARCHAR2(1000 BYTE),
  SPARE1    NUMBER,
  SPARE2    NUMBER
)
CLUSTER SYS.C_OBJ#_INTCOL#(OBJ#, INTCOL#);

这个和ORA-600 6711错误相匹配了

ERROR:
ORA-600 [6711] [a] [b]  [d]
VERSIONS:
versions 6.0 to 12.1
DESCRIPTION:
This error is generated when we find more blocks on a cluster key
chain than are supposed to be there.
Usually this indicates that the chain contains a loop within itself. We
cannot have more than 65535 blocks in a chain.
ARGUMENTS:
Arg [a] beginning DBA
Arg [b] table slot number (in table index)
Arg {c} dba of key next in chain
Arg [d] row slot of key next in chain

需要处理该错误,也就是需要处理CLUSTER SYS.C_OBJ#_INTCOL#,这个可以通过重建来实现,但是当重建之时发生

ORA-00600: internal error code, arguments: [kkoipt:invalid aptyp], [0], [0], [], [], [], [], [], [], [], [], []
ORA-08102: index key not found, obj# 39, file 1, block 1374829 (2)

这里错误比较明显obj#=39为obj$的i_obj4的index的记录和表不匹配,导致任何ddl无法执行,因此如果要处理C_OBJ#_INTCOL#就必须要先处理i_obj4的问题.通过一些技巧重建i_obj4,然后重建C_OBJ#_INTCOL#,数据库终于可以正常打开.由于大量数据字典不一致,exp/expdp导出依旧有问题,通过dblink直接拉数据到新库,完成本次恢复
补充说明:1. 在这个库的恢复过程中,我们还使用了大量的event和隐含参数,因为比较常规而且不涉及核心环节,因为未列举出来;2. 由于当时操作记录未能够保留日志因此相关操作步骤无法贴出来,本文只能提供恢复处理思路
再次提醒各位数据库做好备份,做好巡检工作,哪怕是强大的Oracle exadata也禁不起无备份折腾,数据重于一切

ORA-01555 ORA-600 kdiulk:kcbz_objdchk ORA-600 kdBlkCheckError等错误恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-01555 ORA-600 kdiulk:kcbz_objdchk ORA-600 kdBlkCheckError等错误恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动ORA-00704,0RA-00604,ORA-01555导致数据库无法启动

Tue May 31 17:32:42 2016
SMON: enabling cache recovery
SUCCESS: diskgroup RECOVERY was mounted
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ORA-01555 caused by SQL statement below (SQL ID: 4krwuz0ctqxdt, SCN: 0x0004.3af84bee):
select ctime, mtime, stime from obj$ where obj# = :1
Archived Log entry 5 added for thread 1 sequence 10 ID 0x86a261e7 dest 1:
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_12779.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 7 with name "_SYSSMU7_1592079335$" too small
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_ora_12779.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 7 with name "_SYSSMU7_1592079335$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 12779): terminating the instance due to error 704

通过bbed修改事务之后启动数据库

Tue May 31 17:35:49 2016
SMON: enabling tx recovery
*********************************************************************
WARNING: The following temporary tablespaces contain no files.
         This condition can occur when a backup controlfile has
         been restored.  It may be necessary to add files to these
         tablespaces.  That can be done using the SQL statement:
         ALTER TABLESPACE <tablespace_name> ADD TEMPFILE
         Alternatively, if these temporary tablespaces are no longer
         needed, then they can be dropped.
           Empty temporary tablespace: TEMP
*********************************************************************
Updating character set in controlfile to AL32UTF8
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p021_13862.trc  (incident=166002):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p010_13818.trc  (incident=165914):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p004_13794.trc  (incident=165866):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p011_13822.trc  (incident=165922):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Tue May 31 17:35:50 2016
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p016_13842.trc  (incident=165962):
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []

ORA-600 [kdiulk:kcbz_objdchk] trace文件

*** SESSION ID:(3.5) 2016-05-31 17:35:50.068
OBJD MISMATCH typ=6, seg.obj=-2, diskobj=222225, dsflg=0, dsobj=285890, tid=285890, cls=1
ORA-00600: 内部错误代码, 参数: [kdiulk:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Parallel Transaction recovery server caught exception 600
begin Parallel Recovery Context Dump
nsi: 48, nsactive: 48
, nirsi: 1, nidti: 1, ndt: 1, rescan: 0, ptrs: 48
[ktprsi] wdone: 50
[ktpritp 378651b8] ktprsi:
37903b60 37903b78 37903b90 37903ba8 37903bc0 37903bd8 37903bf0 37903c08 37903c20 37903c38 37903c50
37903c68 37903c80 37903c98 37903cb0 37903cc8 37903ce0 37903cf8 37903d10 37903d28 37903d40 37903d58
37903d70 37903d88 37903da0 37903db8 37903dd0 37903de8 37903e00 37903e18 37903e30 37903e48 37903e60
37903e78 37903e90 37903ea8 37903ec0 37903ed8 37903ef0 37903f08 37903f20 37903f38 37903f50 37903f68
37903f80 37903f98 37903fb0 37903fc8
[ktprht] nhb: 47, nfl: 20247, flg: 2
*** 2016-05-31 17:36:08.584
[ktprhb] nfl: 1, nelem: 97, flg: 0, sqn: 1
flist: 37698940
nhe: [ktprhe 32] sqn: -1297235803
[kturur] uoff: -1797708320, sqn: 4
uba: 0x098004cd.07e4.0b
*-----------------------------
* Rec #0xb  slt: 0x07  objn: 123986(0x0001e452)  objd: 285891  tblspc: 10(0x0000000a)
*       Layer:  10 (Index)   opc: 22   rci 0x0a
Undo type:  Regular undo   Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
rdba: 0x00000000

这里基本上可以确定是由于undo index中的dataobj#和block中的dataobj# 不匹配.在数据库undo回滚之时出现该错误.可以通过跳过undo回滚,然后重建对象

Tue May 31 17:36:06 2016
Simulated error on redo application.
Block recovery from logseq 12, block 959 to scn 20401094719
Recovery of Online Redo Log: Thread 1 Group 3 Seq 12 Reading mem 0
  Mem# 0: +DATA/xifenfei/onlinelog/group_3.263.802446627
Block recovery completed at rba 12.1012.16, scn 4.3221225536
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Simulated error for redo application done.
Errors in file /opt/app/oracle/diag/rdbms/xifenfei/xifenfei/trace/xifenfei_p009_13814.trc  (incident=165906):
ORA-00600: 内部错误代码, 参数: [kdBlkCheckError], [26], [950417], [18025], [], [], [], [], [], [], [], []

这些错误是由于数据库block逻辑异常导致,错过参数含义
在10g中ORA-600 kddummy_blkchk 在11g中ORA-600 kdBlkCheckError

ARGUMENTS:
Arg [a] Absolute file number
Arg [b] Bock number
Arg  Internal error code returned from kcbchk() which indicates the problem encountered.
See Note 46389.1 for details of block check codes.

根据QREF kddummy_blkchk / kdBlkCheckError Check Codes Listing (Full) (Doc ID 1264040.1)分析
这里的18025是代码的KCBTEMAP_EC_START + KTS4_EC_SBFREE部分异常,主要表现在Incorrect firstfree or nfree 可以通过设置一些参数进行屏蔽

在恢复过程中还有其他错误

ORA-600 encountered when generating server alert SMG-4128
ORA-00600: internal error code, arguments: [ktcpoptx:!cmt top lvl], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4406], [0x1026B65348], [0x000000000], [2], [6215], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [ktcpoptx:!cmt top lvl], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
ORACLE Instance xifenfei (pid = 15) - Error 600 encountered while recovering transaction (10, 7) on object 123986.
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kturbleurec1], [], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kewrose_1], [600],
  [ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.

通过整体分析错误主要是由于undo异常导致,通过设置_corrupted_rollback_segments设置db_block_checking等相关参数,清理SMON_SCN_TIME等操作数据库没有其他异常报错,让其通过逻辑方式重建库

ORA-00354 ORA-00353 ORA-00312异常处理

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-00354 ORA-00353 ORA-00312异常处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报错
WIN平台oracle 9.2.0.6版本数据库redo log block header损坏,ORA-00354 ORA-00353 ORA-00312错误导致数据库无法启动

SQL >alter database open;
*
ERROR at line 1:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 1892904 change 281470950178815
ORA-00312: online log 3 thread 1: 'D:\ORACLE\ORADATA\ZOYO\REDO03.LOG'
Sun Jan 24 15:44:05 2016
Database mounted in Exclusive Mode.
Completed: alter database mount exclusive
Sun Jan 24 15:44:05 2016
alter database open
Sun Jan 24 15:44:05 2016
Beginning crash recovery of 1 threads
Sun Jan 24 15:44:05 2016
Started redo scan
ORA-354 signalled during: alter database open...
Shutting down instance: further logons disabled
Shutting down instance (immediate)
License high water mark = 3
Sun Jan 24 15:44:32 2016
ALTER DATABASE CLOSE NORMAL
ORA-1109 signalled during: ALTER DATABASE CLOSE NORMAL...

通过分析,确定损坏的redo03为当前redo,无法使用正常方法打开,加上_allow_resetlogs_corruption参数,尝试打开库,依旧失败

数据库报ORA-600 2662错误

Sun Jan 24 16:26:30 2016
SMON: enabling cache recovery
Sun Jan 24 16:26:30 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_640.trc:
ORA-00600: 内部错误代码,参数: [2662], [0], [31563641], [0], [31563654], [4194721], [], []
Sun Jan 24 16:26:31 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_640.trc:
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码,参数: [2662], [0], [31563641], [0], [31563654], [4194721], [], []
Sun Jan 24 16:26:31 2016
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 640
ORA-1092 signalled during: alter database open resetlogs...

ORA 600 2662的错误处理
根据经验,这个错误只需要推scn即可,可以通过bbed,隐含参数,event,oradebug,修改控制文件等方法进行,推scn之后,数据库报熟悉的ORA-00604 ORA-00607 ORA-600 4194错误,以前我们遇到的block大部分是128,这次报异常block为9.实际中跟版本有关系,在ORACLE 9.2.0.6中该错误为file 1 block 9.大部分版本为128

Sun Jan 24 16:29:39 2016
SMON: enabling cache recovery
Sun Jan 24 16:29:39 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_3432.trc:
ORA-00600: 内部错误代码,参数: [4194], [14], [5], [], [], [], [], []
Sun Jan 24 16:29:39 2016
Doing block recovery for fno: 1 blk: 401
Sun Jan 24 16:29:39 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ZOYO\REDO01.LOG
Doing block recovery for fno: 1 blk: 9
Sun Jan 24 16:29:40 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ZOYO\REDO01.LOG
Sun Jan 24 16:29:40 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_3432.trc:
ORA-00604: 递归 SQL 层 1 出现错误
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码,参数: [4194], [14], [5], [], [], [], [], []
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 3432

ORA-00604 ORA-00607 ORA-600 4194分析trace文件

*** 2016-01-24 16:29:40.031
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
Block image after block recovery:
buffer tsn: 0 rdba: 0x00400009 (1/9)
scn: 0x0000.01e112e1 seq: 0x01 flg: 0x04 tail: 0x12e10e01
frmt: 0x02 chkval: 0xba76 type: 0x0e=KTU UNDO HEADER W/UNLIMITED EXTENTS
  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      spare2: 0      #extents: 6      #blocks: 47
                  last map  0x00000000  #maps: 0      offset: 4128
      Highwater::  0x00400191  ext#: 4      blk#: 0      ext size: 8
  #blocks in seg. hdr's freelists: 0
  #blocks below: 0
  mapblk  0x00000000  offset: 4
                   Unlocked
     Map Header:: next  0x00000000  #extents: 6    obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x0040000a  length: 7
   0x00400011  length: 8
   0x00400181  length: 8
   0x00400189  length: 8
   0x00400191  length: 8
   0x00400199  length: 8
  TRN CTL:: seq: 0x008e chd: 0x0060 ctl: 0x0024 inc: 0x00000000 nfb: 0x0001
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00400191.008e.04 scn: 0x0000.01ded29c
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00400191.008e.04 ext: 0x4  spc: 0x1c3e
    uba: 0x00000000.002f.21 ext: 0x5  spc: 0x1334
    uba: 0x00000000.002e.37 ext: 0x4  spc: 0x788
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
  TRN TBL::

从这里可以确定undo segment header中的分配block记录有问题,清除ktuxc.fbp.fbp[N].kuba.kdba相关记录,数据库正常打开

    . struct ktuxc  kernel transaction undo xaction table control with 15 members
    . {
    .   struct kscn   scn with 3 members
    .   {
04148     ub4           bas      = 0X9CD2DE01 = 31380124
04152     ub2           wrp      = 0X0000 = 0
04154     cc32          pad      = 0X0000 = 0
    .   }
    .   struct kuba   uba with 4 members
    .   {
04156     kdba          dba      = 0X91014000 = 0x00400191 file 1 block 401
04160     ub2           seq      = 0X8E00 = 142
04162     ub1           rec      = 0X04 = 4
04163     cc16          pad      = 0X00 = 0
    .   }
04164   sb2           flg      = 0X0100 = 1
04166   ub2           seq      = 0X8E00 = 142
04168   sb2           nfb      = 0X0100 = 1
04170   cc32          pad1     = 0X0000 = 0
04172   ub4           inc      = 0X00000000 = 0
04176   sb2           chd      = 0X6000 = 96
04178   sb2           ctl      = 0X2400 = 36
04180   ub2x          mgc      = 0X0280 = 0x8002
04182   ub2           ver      = 0X0100 = 1
04184   ub2           xts      = 0X6800 = 104
04186   cc32          pad2     = 0X0000 = 0
04188   ub4           opt      = 0XFEFFFF7F = 2147483646
    .   ktufb fbp[5] (array with 5 elements)
    .     struct fbp   [0] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04192         kdba          dba      = 0X91014000 = 0x00400191 file 1 block 401
04196         ub2           seq      = 0X8E00 = 142
04198         ub1           rec      = 0X04 = 4
04199         cc16          pad      = 0X00 = 0
    .       }
04200       sb2           ext      = 0X0400 = 4
04202       sb2           spc      = 0X3E1C = 7230
    .     }
    .     struct fbp   [1] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04204         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04208         ub2           seq      = 0X2F00 = 47
04210         ub1           rec      = 0X21 = 33
04211         cc16          pad      = 0X00 = 0
    .       }
04212       sb2           ext      = 0X0500 = 5
04214       sb2           spc      = 0X3413 = 4916
    .     }
    .     struct fbp   [2] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04216         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04220         ub2           seq      = 0X2E00 = 46
04222         ub1           rec      = 0X37 = 55
04223         cc16          pad      = 0X00 = 0
    .       }
04224       sb2           ext      = 0X0400 = 4
04226       sb2           spc      = 0X8807 = 1928
    .     }
    .     struct fbp   [3] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04228         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04232         ub2           seq      = 0X0000 = 0
04234         ub1           rec      = 0X00 = 0
04235         cc16          pad      = 0X00 = 0
    .       }
04236       sb2           ext      = 0X0000 = 0
04238       sb2           spc      = 0X0000 = 0
    .     }
    .     struct fbp   [4] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04240         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04244         ub2           seq      = 0X0000 = 0
04246         ub1           rec      = 0X00 = 0
04247         cc16          pad      = 0X00 = 0
    .       }
04248       sb2           ext      = 0X0000 = 0
04250       sb2           spc      = 0X0000 = 0
    .     }
    . }
Sun Jan 24 16:44:52 2016
SMON: enabling tx recovery
Sun Jan 24 16:44:52 2016
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: ALTER DATABASE OPEN

重建控制文件丢失数据文件导致悲剧

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:重建控制文件丢失数据文件导致悲剧

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在Oracle职业生涯中,恢复过生产环境数据库也有几百个.对于Oracle恢复我还是相当的自信,今天因为自己的一时过于自信,对于环境错了错误的判断,简单问题复杂化,差点变成悲剧
数据库最初故障

Thu Sep 25 09:27:26 2014
MMON started with pid=15, OS id=1968
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
starting up 1 shared server(s) ...
ORACLE_BASE from environment = F:\oracle
Thu Sep 25 09:27:26 2014
ALTER DATABASE   MOUNT
Thu Sep 25 09:27:26 2014
MMNL started with pid=16, OS id=5976
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_4624.trc:
ORA-00202: ????: ''F:\ORACLE\ORADATA\ORCL\CONTROL01.CTL
ORA-27070: ????/????
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 23) 数据错误(循环冗余检查)。
Thu Sep 25 09:28:31 2014
ORA-204 signalled during: ALTER DATABASE   MOUNT...

因为硬件或者系统层面问题,导致控制文件无法正常访问

重建控制文件

Fri Sep 26 12:28:44 2014
Successful mount of redo thread 1, with mount id 1387065723
Completed: CREATE CONTROLFILE REUSE DATABASE "orcl" RESETLOGS ARCHIVELOG
    MAXLOGFILES 5
    MAXLOGMEMBERS 3
    MAXDATAFILES 100
    MAXINSTANCES 2
    MAXLOGHISTORY 226
LOGFILE
  GROUP 1 'F:\oracle\oradata\orcl\REDO01.LOG'  SIZE 50M,  --redo log ????
  GROUP 2 'F:\oracle\oradata\orcl\REDO02.LOG'  SIZE 50M,  --redo log ????
  GROUP 3 'F:\oracle\oradata\orcl\REDO03.LOG'  SIZE 50M  --redo log ????
-- STANDBY LOGFILE
DATAFILE
  'F:\oracle\oradata\orcl\SYSAUX01.DBF',  --sysaux???????
  'F:\oracle\oradata\orcl\SYSTEM01.DBF',
  'F:\oracle\oradata\orcl\USERS01.DBF', --user????????
  'F:\oracle\oradata\orcl\UNDOTBS01.DBF' --undo???????
CHARACTER SET ZHS16GBK
Fri Sep 26 12:29:55 2014
alter database open resetlogs
ORA-1194 signalled during: alter database open resetlogs...

埋下了雷,创建控制文件中未全部列举出来所有数据文件

进行不完全恢复,尝试resetlogs库发现redo异常

Fri Sep 26 14:13:24 2014
ALTER DATABASE   MOUNT
Fri Sep 26 14:13:24 2014
MMNL started with pid=16, OS id=9024
Successful mount of redo thread 1, with mount id 1387037444
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Fri Sep 26 14:14:08 2014
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
Fri Sep 26 14:15:16 2014
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00333: 重做日志读取块 2049 计数 6143 出错
ORA-00312: 联机日志 1 线程 1: 'F:\ORACLE\ORADATA\ORCL\REDO01.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 23) 数据错误(循环冗余检查)。
Fri Sep 26 14:16:24 2014
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00333: 重做日志读取块 1 计数 8191 出错
ORA-00312: 联机日志 1 线程 1: 'F:\ORACLE\ORADATA\ORCL\REDO01.LOG'
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 23) 数据错误(循环冗余检查)。
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00333: 重做日志读取块 1 计数 8191 出错
ARCH: All Archive destinations made inactive due to error 333

使用隐含参数尝试拉库,报ORA-600[2662]

Fri Sep 26 14:16:45 2014
SMON: enabling cache recovery
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc  (incident=57761):
ORA-00600: 内部错误代码, 参数: [2662], [0], [38221304], [0], [38352371], [4194545], [], [], [], [], [], []
Incident details in: f:\oracle\diag\rdbms\orcl\orcl\incident\incdir_57761\orcl_ora_3720_i57761.trc
Fri Sep 26 14:16:45 2014
ARC3 started with pid=23, OS id=9692
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [2662], [0], [38221304], [0], [38352371], [4194545], [], [], [], [], [], []
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_3720.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [2662], [0], [38221304], [0], [38352371], [4194545], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 3720): terminating the instance due to error 704
Instance terminated by USER, pid = 3720
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (3720) as a result of ORA-1092

数据库在未使用所有数据文件的情况下,进行了resetlogs操作,悲剧的本质已经注定,我的失误是没有评估好现状,还继续在错误的道路上越走越远.

我开始接手该库现况

Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Fri Sep 26 14:18:55 2014
alter database open
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_8968.trc:
ORA-01113: 文件 1 需要介质恢复
ORA-01110: 数据文件 1: 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'
ORA-1113 signalled during: alter database open...
Fri Sep 26 14:19:31 2014
alter database open
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_8968.trc:
ORA-01113: 文件 1 需要介质恢复
ORA-01110: 数据文件 1: 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'
ORA-1113 signalled during: alter database open ...
Fri Sep 26 14:22:26 2014
ALTER DATABASE RECOVER  database
Media Recovery Start
 started logmerger process
Fri Sep 26 14:22:26 2014
Media Recovery failed with error 16433
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database  ...
Fri Sep 26 14:24:25 2014
ALTER DATABASE RECOVER  datafile 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'
Media Recovery Start
Media Recovery failed with error 16433
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'  ...
Fri Sep 26 14:28:47 2014
alter database open read write
Errors in file f:\oracle\diag\rdbms\orcl\orcl\trace\orcl_ora_8968.trc:
ORA-01113: 文件 1 需要介质恢复
ORA-01110: 数据文件 1: 'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'
ORA-1113 signalled during: alter database open read write...
Fri Sep 26 14:31:48 2014
ALTER DATABASE RECOVER  datafile 'F:\oracle\oradata\orcl\SYSTEM01.DBF'
Media Recovery Start
Media Recovery failed with error 16433
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 'F:\oracle\oradata\orcl\SYSTEM01.DBF'  ...

提示ORA-01110: 数据文件 1需要恢复,尝试recover操作

尝试recover操作

连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> recover database ;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database must be opened in read/write mode.
SQL> alter database backup controlfile to trace as 'd:\ctl.txt';
alter database backup controlfile to trace as 'd:\ctl.txt'
*
第 1 行出现错误:
ORA-16433: 必须以读/写模式打开数据库。
SQL> recover database using backup controlfile;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database must be opened in read/write mode.

重建控制文件

SQL> shutdown immediate;
ORA-01109: 数据库未打开
已经卸载数据库。
ORACLE 例程已经关闭。
SQL> STARTUP NOMOUNT
ORACLE 例程已经启动。
Total System Global Area  970895360 bytes
Fixed Size                  1375452 bytes
Variable Size             603980580 bytes
Database Buffers          360710144 bytes
Redo Buffers                4829184 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE orcl NORESETLOGS FORCE LOGGING ARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 2921
  7  LOGFILE
  8    GROUP 1 'F:\ORACLE\ORADATA\ORCL\REDO01.LOG'  SIZE 50M,
  9    GROUP 2 'F:\ORACLE\ORADATA\ORCL\REDO02.LOG'  SIZE 50M,
 10    GROUP 3 'F:\ORACLE\ORADATA\ORCL\REDO03.LOG'  SIZE 50M
 11  DATAFILE
 12    'F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF',
 13    'F:\ORACLE\ORADATA\ORCL\SYSAUX01.DBF',
 14    'F:\ORACLE\ORADATA\ORCL\UNDOTBS01.DBF',
 15    'F:\ORACLE\ORADATA\ORCL\USERS01.DBF'
 16  CHARACTER SET ZHS16GBK
 17  ;
控制文件已创建。

这一步严重发错,在恢复前未认真看alert日志,太依赖v$datafile查询出来结果,导致重建控制文件丢失数据文件,埋下大雷。根据前面alert日志报错ORA-600 2662,决定一并处理该问题,然后进行恢复

SQL> shutdown immediate;
ORA-01109: ??????
已经卸载数据库。
ORACLE 例程已经关闭。
SQL> startup pfile='d:\pfile.txt'  mount;
ORACLE 例程已经启动。
Total System Global Area  970895360 bytes
Fixed Size                  1375452 bytes
Variable Size             603980580 bytes
Database Buffers          360710144 bytes
Redo Buffers                4829184 bytes
数据库装载完毕。
SQL> recover database;
完成介质恢复。
SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [4194], [], [

数据库报ORA-600 4194,直接修改undo_management=manual,然后尝试启动数据库

SQL> conn / as sysdba
已连接到空闲例程。
SQL> startup pfile='d:\pfile.txt'
ORACLE 例程已经启动。
Total System Global Area  970895360 bytes
Fixed Size                  1375452 bytes
Variable Size             603980580 bytes
Database Buffers          360710144 bytes
Redo Buffers                4829184 bytes
数据库装载完毕。
数据库已经打开。
SQL> select name from v$datafile;
NAME
--------------------------------------------------------------------------------
F:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF
F:\ORACLE\ORADATA\ORCL\SYSAUX01.DBF
F:\ORACLE\ORADATA\ORCL\UNDOTBS01.DBF
F:\ORACLE\ORADATA\ORCL\USERS01.DBF
F:\ORACLE\PRODUCT\11.2.0\DBHOME_1\DATABASE\MISSING00005
F:\ORACLE\PRODUCT\11.2.0\DBHOME_1\DATABASE\MISSING00006
已选择6行。
SQL> alter database rename file 'F:\ORACLE\PRODUCT\11.2.0\DBHOME_1\DATABASE\MISSING00005'
2   to 'F:\oracle\oradata\SOURCE_DATA1.DBF';
数据库已更改。
SQL> alter database rename file 'F:\ORACLE\PRODUCT\11.2.0\DBHOME_1\DATABASE\MISSING00006'
2   to 'F:\oracle\oradata\SOURCE_idx1.DBF';
数据库已更改。
SQL> shutdown immediate;
数据库已经关闭。
已经卸载数据库。
ORACLE 例程已经关闭。
SQL> startup mount pfile='d:\pfile.txt'
ORACLE 例程已经启动。
Total System Global Area  970895360 bytes
Fixed Size                  1375452 bytes
Variable Size             603980580 bytes
Database Buffers          360710144 bytes
Redo Buffers                4829184 bytes
数据库装载完毕。
SQL> alter datafile 5 online;
alter datafile 5 online
      *
第 1 行出现错误:
ORA-00940: 无效的 ALTER 命令
SQL> alter database datafile 5 online;
数据库已更改。
SQL> alter database datafile 6 online;
数据库已更改。
SQL> recover database until cancel;
ORA-00283: recovery session canceled due to errors
ORA-19909: datafile 5 belongs to an orphan incarnation
ORA-01110: data file 5: 'F:\ORACLE\ORADATA\SOURCE_DATA1.DBF'
SQL> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-01139: RESETLOGS 选项仅在不完全数据库恢复后有效
SQL> alter database datafile 6 offline;
数据库已更改。
SQL> alter database datafile 5 offline;
数据库已更改。
SQL> recover database until cancel;
完成介质恢复。
SQL> alter database datafile 6 online;
数据库已更改。
SQL> alter database datafile 5 online;
数据库已更改。
SQL> alter database open resetlogs;
数据库已更改。

还好结合一些隐含参数侥幸恢复成功,差点到了要使用bbed的程度

这次的恢复告诉我:Oracle数据库恢复千万比大意,需要认真分析alert日志和咨询客户做了那些操作,不然可能导致万劫不复之禁地

又一起存储故障导致ORA-00333 ORA-00312恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:又一起存储故障导致ORA-00333 ORA-00312恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报ORA-00333 ORA-00312错误,无法正常open数据库

Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc0_4724.trc:
ORA-00333: redo log read error block 63489 count 2048
ORA-00312: online log 2 thread 1: 'F:\ORADATA\SZCG\REDO02.LOG'
ORA-27091: unable to queue I/O
ORA-27070: async read/write failed
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc0_4724.trc:
ORA-00333: redo log read error block 63489 count 2048
Thu Aug 07 10:42:03  2014
ARC0: All Archive destinations made inactive due to error 333
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_1856.trc:
ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止
ORA-00340: 处理联机日志  (用于线程 ) 时出现 I/O 错误
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_6548.trc:
ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止
ORA-00340: 处理联机日志  (用于线程 ) 时出现 I/O 错误
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_8104.trc:
ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止
ORA-00340: 处理联机日志  (用于线程 ) 时出现 I/O 错误
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_lgwr_884.trc:
ORA-00340: IO error processing online log 3 of thread 1
ORA-00345: redo log write error block 65238 count 13
ORA-00312: online log 3 thread 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: async read/write failed
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 10:42:03  2014
LGWR: terminating instance due to error 340
Thu Aug 07 10:42:05  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_8104.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LGWR' unexpectedly terminated with error 340
ORA-00340: IO error processing online log  of thread
Thu Aug 07 10:42:05  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_1856.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LGWR' unexpectedly terminated with error 340
ORA-00340: IO error processing online log  of thread
Thu Aug 07 10:42:05  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_6548.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LGWR' unexpectedly terminated with error 340
ORA-00340: IO error processing online log  of thread
Thu Aug 07 17:40:05  2014
ALTER DATABASE OPEN
Thu Aug 07 17:40:05  2014
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Thu Aug 07 17:40:06  2014
Started redo scan
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27091: 无法将 I/O 排队
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 17:40:06  2014
Aborting crash recovery due to error 333
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-333 signalled during: ALTER DATABASE OPEN...

进一步检查发现在7月6日系统就已经报io异常

Sun Jul 06 10:05:23  2014
ARC0: All Archive destinations made inactive due to error 333
Sun Jul 06 10:06:07  2014
KCF: write/open error block=0xd03 online=1
     file=3 F:\ORADATA\SZCG\SYSAUX01.DBF
     error=27070 txt: 'OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。'
Automatic datafile offline due to write error on
file 3: F:\ORADATA\SZCG\SYSAUX01.DBF
Sun Jul 06 10:06:23  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc1_2676.trc:
ORA-00333: redo log read error block 63489 count 2048
ORA-00312: online log 2 thread 1: 'F:\ORADATA\SZCG\REDO02.LOG'
ORA-27091: unable to queue I/O
ORA-27070: async read/write failed
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 10:36:54  2014
ARC1: All Archive destinations made inactive due to error 333
Thu Aug 07 10:37:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_m000_5832.trc:
ORA-01135: file 3 accessed for DML/query is offline
ORA-01110: data file 3: 'F:\ORADATA\SZCG\SYSAUX01.DBF'

检查硬件发现raid一块盘完全损坏,另外一块盘也处于告警状态,保护现场拷贝文件过程中发现redo02,redo03,sysaux无法拷贝,使用rman检查发现
sysaux-block


因为redo完全损坏,使用工具跳过坏块,拷贝相关有坏块文件到其他目录,重命名相关文件尝试启动数据库,依然报ORA-00333 ORA-00312

Started redo scan
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27091: 无法将 I/O 排队
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 17:40:06  2014
Aborting crash recovery due to error 333
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-333 signalled during: ALTER DATABASE OPEN...

设置隐含参数_allow_resetlogs_corruption,尝试强制拉库

Started redo scan
Fri Aug 08 12:13:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
Fri Aug 08 12:13:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27091: 无法将 I/O 排队
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Fri Aug 08 12:13:25  2014
Aborting crash recovery due to error 333
Fri Aug 08 12:13:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-333 signalled during: ALTER DATABASE OPEN...
Fri Aug 08 12:13:45  2014
ALTER DATABASE RECOVER  database until cancel
Fri Aug 08 12:13:45  2014
Media Recovery Start
 parallel recovery started with 15 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Fri Aug 08 12:13:55  2014
ALTER DATABASE RECOVER    CANCEL
Fri Aug 08 12:13:59  2014
ORA-1547 signalled during: ALTER DATABASE RECOVER    CANCEL  ...
Fri Aug 08 12:13:59  2014
ALTER DATABASE RECOVER CANCEL
ORA-1112 signalled during: ALTER DATABASE RECOVER CANCEL ...
Fri Aug 08 12:14:12  2014
alter database open resetlogs
Fri Aug 08 12:14:13  2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Fri Aug 08 12:54:11  2014
alter tablespace sysaux offline
Fri Aug 08 12:54:11  2014
ORA-1109 signalled during: alter tablespace sysaux offline...
Fri Aug 08 13:05:30  2014
alter database open
Fri Aug 08 13:05:30  2014
ORA-1589 signalled during: alter database open...

在offline过程中,数据库检查到sysaux数据文件为offline状态,当表空间只有一个数据文件,而且该数据文件为offline,数据库将会尝试offline sysaux表空间,但是发现该表空间文件非正常scn,无法offline 表空间,导致resetlogs操作失败。这里是操作失误应该先online相关数据文件,然后再进行resetlogs操作

Sat Aug 09 11:56:03  2014
alter database datafile 3 online
Sat Aug 09 11:56:04  2014
Completed: alter database datafile 3 online
Sat Aug 09 11:56:08  2014
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
Sat Aug 09 11:56:18  2014
ARCH: Encountered disk I/O error 19502
Sat Aug 09 11:56:18  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512)
ORA-27072: 文件 I/O 错误
OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 1) 函数不正确。
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512)
Sat Aug 09 11:56:18  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512)
ORA-27072: 文件 I/O 错误
OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 1) 函数不正确。
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512)
ARCH: I/O error 19502 archiving log 3 to 'F:\ARCHIVE\ARC01745_0814618167.001'
Sat Aug 09 11:56:18  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-00265: 要求实例恢复, 无法设置 ARCHIVELOG 模式
Archive all online redo logfiles failed:265
RESETLOGS after incomplete recovery UNTIL CHANGE 77983856
Resetting resetlogs activation ID 3562192628 (0xd452bef4)
Online log F:\ORADATA\SZCG\REDO01.LOG: Thread 1 Group 1 was previously cleared
Online log F:\ORADATA\SZCG\REDO02.LOG: Thread 1 Group 2 was previously cleared
Online log D:\REDO04.LOG: Thread 1 Group 4 was previously cleared
Sat Aug 09 11:56:22  2014
Setting recovery target incarnation to 3
Sat Aug 09 11:56:23  2014
Assigning activation ID 3602586269 (0xd6bb1a9d)
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=33, OS id=5900
Sat Aug 09 11:56:23  2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=34, OS id=5776
Sat Aug 09 11:56:24  2014
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: F:\ORADATA\SZCG\REDO01.LOG
Successful open of redo thread 1
Sat Aug 09 11:56:24  2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Aug 09 11:56:24  2014
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
Sat Aug 09 11:56:24  2014
ARC0: Becoming the heartbeat ARCH
Sat Aug 09 11:56:24  2014
SMON: enabling cache recovery
Sat Aug 09 11:56:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [77983864], [0], [77992379], [8388617], [], []
Sat Aug 09 11:56:26  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [77983864], [0], [77992379], [8388617], [], []
Sat Aug 09 11:56:26  2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 4516
ORA-1092 signalled during: alter database open resetlogs...

ORA-600 2662这个错误很熟悉,直接推SCN,数据库open,但是报ORA-600 4194

Sat Aug 09 12:01:28  2014
SMON: enabling cache recovery
Dictionary check complete
Sat Aug 09 12:01:32  2014
SMON: enabling tx recovery
Sat Aug 09 12:01:32  2014
Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=34, OS id=6116
Sat Aug 09 12:01:34  2014
LOGSTDBY: Validating controlfile with logical metadata
Sat Aug 09 12:01:34  2014
LOGSTDBY: Validation complete
Sat Aug 09 12:01:34  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_smon_920.trc:
ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], []
Sat Aug 09 12:01:36  2014
Doing block recovery for file 2 block 319
Resuming block recovery (PMON) for file 2 block 319
Block recovery from logseq 2, block 56 to scn 1073742003
Sat Aug 09 12:01:36  2014
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: F:\ORADATA\SZCG\REDO02.LOG
Block recovery stopped at EOT rba 2.79.16
Block recovery completed at rba 2.79.16, scn 0.1073742002
Doing block recovery for file 2 block 153
Resuming block recovery (PMON) for file 2 block 153
Block recovery from logseq 2, block 56 to scn 1073741986
Sat Aug 09 12:01:36  2014
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: F:\ORADATA\SZCG\REDO02.LOG
Block recovery completed at rba 2.66.16, scn 0.1073741988
Sat Aug 09 12:01:36  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_smon_920.trc:
ORA-01595: error freeing extent (4) of rollback segment (10))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], []
Sat Aug 09 12:01:36  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5272.trc:
ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], []
Sat Aug 09 12:01:36  2014
Completed: alter database open

尝试重建undo表空间并切换undo_tabspace到新undo表空间解决,因为数据库在恢复过程中使用了隐含参数强制拉库,不能保证数据一致性,强烈建议逻辑方式重建数据库
在本次故障中,所幸的是只有redo和sysaux文件损坏,如果是业务数据文件或者system数据文件损坏,恢复的后果可能更加麻烦,丢失数据可能更加多。再次说明:数据库备份非常重要,数据的安全性不能完全寄希望于硬件之上

ORACLE 8.1.7 数据库ORA-600 4194故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORACLE 8.1.7 数据库ORA-600 4194故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一个817数据库报ORA-600 4194 无法正常启动

Fri Jul 25 10:49:47 2014
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Fri Jul 25 10:49:58 2014
ALTER DATABASE RECOVER  database
Fri Jul 25 10:49:58 2014
Media Recovery Start
Media Recovery Log
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3320 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Media Recovery Complete
Completed: ALTER DATABASE RECOVER  database
Fri Jul 25 10:50:09 2014
alter database open
Beginning crash recovery of 1 threads
Fri Jul 25 10:50:09 2014
Thread recovery: start rolling forward thread 1
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3320 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Fri Jul 25 10:50:09 2014
Thread recovery: finish rolling forward thread 1
Thread recovery: 0 data blocks read, 0 data blocks written, 3 redo blocks read
Crash recovery completed successfully
Fri Jul 25 10:50:09 2014
Thread 1 advanced to log sequence 3321
Thread 1 opened at log sequence 3321
  Current log# 3 seq# 3321 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1.
Fri Jul 25 10:50:09 2014
SMON: enabling cache recovery
Fri Jul 25 10:50:10 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA03216.TRC:
ORA-00600: ??????????: [4194], [12], [37], [], [], [], [], []
Fri Jul 25 10:50:10 2014
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3321 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Fri Jul 25 10:50:10 2014
SMON: disabling cache recovery
Fri Jul 25 10:50:10 2014
ORA-600 signalled during: alter database open

ORA-600 4194这个错误在数据库异常恢复中非常常见,因为库不是很重要,因此就是直接屏蔽掉故障回滚段,然后强制拉库,该库的恢复过程中,也直接使用隐含参数屏蔽回滚段
_corrupted_rollback_segments= RBS0, RBS1, RBS2, RBS3, RBS4, RBS5, RBS6, RBS_HDSYS,数据库依然无法open,进一步分析trace文件

Fri Jul 25 11:26:07 2014
ORACLE V8.1.7.0.0 - Production vsnsta=0
vsnsql=e vsnxtr=3
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Oracle8i Release 8.1.7.0.0 - Production
JServer Release 8.1.7.0.0 - Production
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 14
Windows thread id: 3648, image: ORACLE.EXE
*** SESSION ID:(11.1) 2014-07-25 11:26:07.843
*** 2014-07-25 11:26:07.843
ksedmp: internal or fatal error
ORA-00600: ??????????: [4194], [12], [37], [], [], [], [], []
Current SQL statement for this session:
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,
scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12 where us#=:1
----- Call Stack Trace -----

这里很明显看出来,数据库是在open过程中,update undo$表遭遇到ORA-600 4194,因为该过程需要使用系统回滚段,但是由于其所对应的undo和redo信息不一致,所以无法正常启动数据库.继续读trace文件

  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      space2: 0      #extents: 5      #blocks: 49
                  last map  0x00000000  #maps: 0      offset: 4128
      Highwater::  0x00400006  ext#: 0      blk#: 3      ext size: 9
  #blocks in seg. hdr's freelists: 0
  #blocks below: 0
  mapblk  0x00000000  offset: 0
                   Unlocked
     Map Header:: next  0x00000000  #extents: 5    obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x00400003  length: 9
   0x0040000c  length: 10
   0x0040008f  length: 10
   0x00400099  length: 10
   0x004000a3  length: 10
  TRN CTL:: seq: 0x003c chd: 0x004e ctl: 0x0050 inc: 0x00000000 nfb: 0x0000
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00400006.003c.25 scn: 0x0000.009a4009
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00000000.003c.24 ext: 0x0  spc: 0x196
    uba: 0x00000000.001f.14 ext: 0x1  spc: 0x16f6
    uba: 0x00000000.0018.02 ext: 0x4  spc: 0x1f1a
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
  TRN TBL::

通过这里可以看出来,数据库在启动的时候,使用system undo的block为为0x00400006,使用bbed清除掉该uba记录,让数据库启动的时候重新分配system undo block给数据库执行update undo$使用,数据库open成功

BBED> m /x 0x00000000
 File: D:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF (0)
 Block: 2                Offsets: 4188 to 4192           Dba:0x00000000
------------------------------------------------------------------------
 00000000 3c002400 00009601 00000000 1f001400 0100f616 00000000 18000200
BBED> m /x 0x0000
 File: D:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF (0)
 Block: 2                Offsets: 4028 to 4032           Dba:0x00000000
------------------------------------------------------------------------
 00000000 00000000 3c005000 02800100 68000000 feffff7f 06004000 3c002400
Sat Jul 26 12:09:21 2014
Thread recovery: start rolling forward thread 1
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3326 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Sat Jul 26 12:09:21 2014
Thread recovery: finish rolling forward thread 1
Thread recovery: 0 data blocks read, 0 data blocks written, 3 redo blocks read
Crash recovery completed successfully
Sat Jul 26 12:09:22 2014
Thread 1 advanced to log sequence 3327
Thread 1 opened at log sequence 3327
  Current log# 3 seq# 3327 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1.
Sat Jul 26 12:09:22 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Jul 26 12:09:39 2014
Completed: alter database open