Another ORA-00333 / ORA-00312 recovery caused by a storage failure

Contact: mobile/WeChat (+86 17813235971), QQ (107644445)

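Author: 惜分飞 © All rights reserved. [No reproduction in any form without the author's consent; the author reserves the right to pursue legal action.]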

At startup the database reported ORA-00333 and ORA-00312 and could not be opened:

Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc0_4724.trc:
ORA-00333: redo log read error block 63489 count 2048
ORA-00312: online log 2 thread 1: 'F:\ORADATA\SZCG\REDO02.LOG'
ORA-27091: unable to queue I/O
ORA-27070: async read/write failed
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc0_4724.trc:
ORA-00333: redo log read error block 63489 count 2048
Thu Aug 07 10:42:03  2014
ARC0: All Archive destinations made inactive due to error 333
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_1856.trc:
ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止
ORA-00340: 处理联机日志  (用于线程 ) 时出现 I/O 错误
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_6548.trc:
ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止
ORA-00340: 处理联机日志  (用于线程 ) 时出现 I/O 错误
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_8104.trc:
ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止
ORA-00340: 处理联机日志  (用于线程 ) 时出现 I/O 错误
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_lgwr_884.trc:
ORA-00340: IO error processing online log 3 of thread 1
ORA-00345: redo log write error block 65238 count 13
ORA-00312: online log 3 thread 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: async read/write failed
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 10:42:03  2014
LGWR: terminating instance due to error 340
Thu Aug 07 10:42:05  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_8104.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LGWR' unexpectedly terminated with error 340
ORA-00340: IO error processing online log  of thread
Thu Aug 07 10:42:05  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_1856.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LGWR' unexpectedly terminated with error 340
ORA-00340: IO error processing online log  of thread
Thu Aug 07 10:42:05  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_6548.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LGWR' unexpectedly terminated with error 340
ORA-00340: IO error processing online log  of thread
Thu Aug 07 17:40:05  2014
ALTER DATABASE OPEN
Thu Aug 07 17:40:05  2014
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Thu Aug 07 17:40:06  2014
Started redo scan
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27091: 无法将 I/O 排队
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 17:40:06  2014
Aborting crash recovery due to error 333
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-333 signalled during: ALTER DATABASE OPEN...

Further checking of the alert log showed that the system had already been reporting I/O errors since July 6:

Sun Jul 06 10:05:23  2014
ARC0: All Archive destinations made inactive due to error 333
Sun Jul 06 10:06:07  2014
KCF: write/open error block=0xd03 online=1
     file=3 F:\ORADATA\SZCG\SYSAUX01.DBF
     error=27070 txt: 'OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。'
Automatic datafile offline due to write error on
file 3: F:\ORADATA\SZCG\SYSAUX01.DBF
Sun Jul 06 10:06:23  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc1_2676.trc:
ORA-00333: redo log read error block 63489 count 2048
ORA-00312: online log 2 thread 1: 'F:\ORADATA\SZCG\REDO02.LOG'
ORA-27091: unable to queue I/O
ORA-27070: async read/write failed
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 10:36:54  2014
ARC1: All Archive destinations made inactive due to error 333
Thu Aug 07 10:37:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_m000_5832.trc:
ORA-01135: file 3 accessed for DML/query is offline
ORA-01110: data file 3: 'F:\ORADATA\SZCG\SYSAUX01.DBF'
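
To find when a given error first showed up, as was done by hand for the July 6 entries above, the alert log can also be scanned programmatically. The following is a minimal Python sketch and not part of the original recovery. The function name `first_seen` and the inline sample are illustrative assumptions, and the sketch only recognizes five-digit `ORA-`/`OSD-` codes under timestamp lines in the format shown in this log.

```python
import re

# Timestamp lines in this alert log look like "Sun Jul 06 10:06:23  2014".
STAMP = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} +\d{4}$")
# Five-digit codes only; short forms such as "ORA-333 signalled" are ignored.
CODE = re.compile(r"\b((?:ORA|OSD)-\d{5})\b")


def first_seen(lines):
    """Map each error code to the timestamp line under which it first appears."""
    seen = {}
    stamp = None
    for raw in lines:
        line = raw.strip()
        if STAMP.match(line):
            # Collapse the double space before the year.
            stamp = " ".join(line.split())
            continue
        for code in CODE.findall(line):
            seen.setdefault(code, stamp)
    return seen


# Hypothetical excerpt assembled from the log lines above.
sample = r"""Sun Jul 06 10:06:23  2014
ORA-00333: redo log read error block 63489 count 2048
ORA-00312: online log 2 thread 1: 'F:\ORADATA\SZCG\REDO02.LOG'
Thu Aug 07 10:37:25  2014
ORA-01135: file 3 accessed for DML/query is offline
""".splitlines()

for code, when in sorted(first_seen(sample).items()):
    print(code, "first seen:", when)
```

On a real system the lines would come from the alert log file itself; the earliest timestamp per code makes it easy to see how long a storage problem went unnoticed.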

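A hardware check found that one disk in the RAID array had failed completely and another was in a warning state. While copying files to preserve the current state, redo02, redo03 and sysaux could not be copied. A check with RMAN found:
[Screenshot: sysaux-block]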


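Because the redo logs were completely damaged, a tool was used to skip the bad blocks and copy the affected files to another directory. After renaming the files and trying to start the database, ORA-00333 and ORA-00312 were still reported: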

Started redo scan
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27091: 无法将 I/O 排队
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 17:40:06  2014
Aborting crash recovery due to error 333
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-333 signalled during: ALTER DATABASE OPEN...

Set the hidden parameter _allow_resetlogs_corruption and try to force the database open:

Started redo scan
Fri Aug 08 12:13:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
Fri Aug 08 12:13:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27091: 无法将 I/O 排队
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Fri Aug 08 12:13:25  2014
Aborting crash recovery due to error 333
Fri Aug 08 12:13:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-333 signalled during: ALTER DATABASE OPEN...
Fri Aug 08 12:13:45  2014
ALTER DATABASE RECOVER  database until cancel
Fri Aug 08 12:13:45  2014
Media Recovery Start
 parallel recovery started with 15 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Fri Aug 08 12:13:55  2014
ALTER DATABASE RECOVER    CANCEL
Fri Aug 08 12:13:59  2014
ORA-1547 signalled during: ALTER DATABASE RECOVER    CANCEL  ...
Fri Aug 08 12:13:59  2014
ALTER DATABASE RECOVER CANCEL
ORA-1112 signalled during: ALTER DATABASE RECOVER CANCEL ...
Fri Aug 08 12:14:12  2014
alter database open resetlogs
Fri Aug 08 12:14:13  2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Fri Aug 08 12:54:11  2014
alter tablespace sysaux offline
Fri Aug 08 12:54:11  2014
ORA-1109 signalled during: alter tablespace sysaux offline...
Fri Aug 08 13:05:30  2014
alter database open
Fri Aug 08 13:05:30  2014
ORA-1589 signalled during: alter database open...

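During the resetlogs, the database found that the sysaux datafile was offline. When a tablespace has only one datafile and that datafile is offline, the database tries to take the tablespace itself offline. Here it found that the datafile's SCN was not in a normal state, could not take the tablespace offline, and the resetlogs failed (the ORA-1245 above). This was an operator mistake: the datafile should have been brought online first, and only then should the resetlogs have been run.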

Sat Aug 09 11:56:03  2014
alter database datafile 3 online
Sat Aug 09 11:56:04  2014
Completed: alter database datafile 3 online
Sat Aug 09 11:56:08  2014
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
Sat Aug 09 11:56:18  2014
ARCH: Encountered disk I/O error 19502
Sat Aug 09 11:56:18  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512)
ORA-27072: 文件 I/O 错误
OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 1) 函数不正确。
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512)
Sat Aug 09 11:56:18  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512)
ORA-27072: 文件 I/O 错误
OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 1) 函数不正确。
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512)
ARCH: I/O error 19502 archiving log 3 to 'F:\ARCHIVE\ARC01745_0814618167.001'
Sat Aug 09 11:56:18  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-00265: 要求实例恢复, 无法设置 ARCHIVELOG 模式
Archive all online redo logfiles failed:265
RESETLOGS after incomplete recovery UNTIL CHANGE 77983856
Resetting resetlogs activation ID 3562192628 (0xd452bef4)
Online log F:\ORADATA\SZCG\REDO01.LOG: Thread 1 Group 1 was previously cleared
Online log F:\ORADATA\SZCG\REDO02.LOG: Thread 1 Group 2 was previously cleared
Online log D:\REDO04.LOG: Thread 1 Group 4 was previously cleared
Sat Aug 09 11:56:22  2014
Setting recovery target incarnation to 3
Sat Aug 09 11:56:23  2014
Assigning activation ID 3602586269 (0xd6bb1a9d)
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=33, OS id=5900
Sat Aug 09 11:56:23  2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=34, OS id=5776
Sat Aug 09 11:56:24  2014
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: F:\ORADATA\SZCG\REDO01.LOG
Successful open of redo thread 1
Sat Aug 09 11:56:24  2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Aug 09 11:56:24  2014
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
Sat Aug 09 11:56:24  2014
ARC0: Becoming the heartbeat ARCH
Sat Aug 09 11:56:24  2014
SMON: enabling cache recovery
Sat Aug 09 11:56:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [77983864], [0], [77992379], [8388617], [], []
Sat Aug 09 11:56:26  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [77983864], [0], [77992379], [8388617], [], []
Sat Aug 09 11:56:26  2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 4516
ORA-1092 signalled during: alter database open resetlogs...
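
The ORA-600 [2662] arguments in the log above can be read as a pair of SCNs plus a block address. The Python sketch below assumes the commonly documented layout (current SCN wrap and base, dependent SCN wrap and base, then the DBA of the block that carries the dependent SCN) and the usual 10-bit file / 22-bit block split of a DBA in a smallfile tablespace. The function name is illustrative, and none of this comes from the original recovery notes.

```python
def decode_ora600_2662(args):
    """Decode the five numeric arguments that follow [2662]."""
    cur_wrap, cur_base, dep_wrap, dep_base, dba = args
    # An SCN is a 16-bit wrap and a 32-bit base; combine them into one number.
    current_scn = (cur_wrap << 32) | cur_base
    dependent_scn = (dep_wrap << 32) | dep_base
    return {
        "current_scn": current_scn,
        "dependent_scn": dependent_scn,
        "gap": dependent_scn - current_scn,
        # Assumed DBA layout: top 10 bits relative file number, low 22 bits block number.
        "rfile": dba >> 22,
        "block": dba & 0x3FFFFF,
    }


# Arguments copied from the ORA-00600 line in the log above.
info = decode_ora600_2662([0, 77983864, 0, 77992379, 8388617])
print(info)
```

For the values in this log the dependent SCN is ahead of the database SCN by 8515, and the block address resolves to relative file 2, block 9. A positive gap of this kind is why advancing the SCN is the usual response.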

ORA-600 [2662] is a familiar error: advance the SCN directly. The database then opened, but reported ORA-600 [4194]:

Sat Aug 09 12:01:28  2014
SMON: enabling cache recovery
Dictionary check complete
Sat Aug 09 12:01:32  2014
SMON: enabling tx recovery
Sat Aug 09 12:01:32  2014
Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=34, OS id=6116
Sat Aug 09 12:01:34  2014
LOGSTDBY: Validating controlfile with logical metadata
Sat Aug 09 12:01:34  2014
LOGSTDBY: Validation complete
Sat Aug 09 12:01:34  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_smon_920.trc:
ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], []
Sat Aug 09 12:01:36  2014
Doing block recovery for file 2 block 319
Resuming block recovery (PMON) for file 2 block 319
Block recovery from logseq 2, block 56 to scn 1073742003
Sat Aug 09 12:01:36  2014
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: F:\ORADATA\SZCG\REDO02.LOG
Block recovery stopped at EOT rba 2.79.16
Block recovery completed at rba 2.79.16, scn 0.1073742002
Doing block recovery for file 2 block 153
Resuming block recovery (PMON) for file 2 block 153
Block recovery from logseq 2, block 56 to scn 1073741986
Sat Aug 09 12:01:36  2014
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: F:\ORADATA\SZCG\REDO02.LOG
Block recovery completed at rba 2.66.16, scn 0.1073741988
Sat Aug 09 12:01:36  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_smon_920.trc:
ORA-01595: error freeing extent (4) of rollback segment (10))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], []
Sat Aug 09 12:01:36  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5272.trc:
ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], []
Sat Aug 09 12:01:36  2014
Completed: alter database open

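The ORA-600 [4194] was handled by rebuilding the undo tablespace and switching undo_tablespace to the new one. Because hidden parameters were used to force the database open during recovery, data consistency cannot be guaranteed, and rebuilding the database by logical means is strongly recommended.
In this incident, fortunately, only the redo and sysaux files were damaged. Had business datafiles or the system datafile been damaged, the recovery would have been more troublesome and more data could have been lost. Once again: database backups are very important, and data safety cannot rest entirely on the hardware.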

Recovering an ORACLE 8.1.7 database from ORA-600 [4194]

An 8.1.7 database reported ORA-600 [4194] and could not start normally:

Fri Jul 25 10:49:47 2014
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Fri Jul 25 10:49:58 2014
ALTER DATABASE RECOVER  database
Fri Jul 25 10:49:58 2014
Media Recovery Start
Media Recovery Log
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3320 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Media Recovery Complete
Completed: ALTER DATABASE RECOVER  database
Fri Jul 25 10:50:09 2014
alter database open
Beginning crash recovery of 1 threads
Fri Jul 25 10:50:09 2014
Thread recovery: start rolling forward thread 1
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3320 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Fri Jul 25 10:50:09 2014
Thread recovery: finish rolling forward thread 1
Thread recovery: 0 data blocks read, 0 data blocks written, 3 redo blocks read
Crash recovery completed successfully
Fri Jul 25 10:50:09 2014
Thread 1 advanced to log sequence 3321
Thread 1 opened at log sequence 3321
  Current log# 3 seq# 3321 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1.
Fri Jul 25 10:50:09 2014
SMON: enabling cache recovery
Fri Jul 25 10:50:10 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA03216.TRC:
ORA-00600: ??????????: [4194], [12], [37], [], [], [], [], []
Fri Jul 25 10:50:10 2014
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3321 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Fri Jul 25 10:50:10 2014
SMON: disabling cache recovery
Fri Jul 25 10:50:10 2014
ORA-600 signalled during: alter database open

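ORA-600 [4194] is very common in abnormal database recovery. Because this database was not very important, the plan was simply to mask the faulty rollback segments and force the database open, so the rollback segments were masked directly with the hidden parameter
_corrupted_rollback_segments= RBS0, RBS1, RBS2, RBS3, RBS4, RBS5, RBS6, RBS_HDSYS
The database still could not be opened, so the trace file was analyzed further: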

Fri Jul 25 11:26:07 2014
ORACLE V8.1.7.0.0 - Production vsnsta=0
vsnsql=e vsnxtr=3
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Oracle8i Release 8.1.7.0.0 - Production
JServer Release 8.1.7.0.0 - Production
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 14
Windows thread id: 3648, image: ORACLE.EXE
*** SESSION ID:(11.1) 2014-07-25 11:26:07.843
*** 2014-07-25 11:26:07.843
ksedmp: internal or fatal error
ORA-00600: ??????????: [4194], [12], [37], [], [], [], [], []
Current SQL statement for this session:
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,
scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12 where us#=:1
----- Call Stack Trace -----

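It is clear here that the database hit ORA-600 [4194] while updating undo$ during open. That update needs the SYSTEM rollback segment, but the corresponding undo and redo information is inconsistent, so the database cannot start normally. Reading further in the trace file: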

  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      space2: 0      #extents: 5      #blocks: 49
                  last map  0x00000000  #maps: 0      offset: 4128
      Highwater::  0x00400006  ext#: 0      blk#: 3      ext size: 9
  #blocks in seg. hdr's freelists: 0
  #blocks below: 0
  mapblk  0x00000000  offset: 0
                   Unlocked
     Map Header:: next  0x00000000  #extents: 5    obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x00400003  length: 9
   0x0040000c  length: 10
   0x0040008f  length: 10
   0x00400099  length: 10
   0x004000a3  length: 10
  TRN CTL:: seq: 0x003c chd: 0x004e ctl: 0x0050 inc: 0x00000000 nfb: 0x0000
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00400006.003c.25 scn: 0x0000.009a4009
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00000000.003c.24 ext: 0x0  spc: 0x196
    uba: 0x00000000.001f.14 ext: 0x1  spc: 0x16f6
    uba: 0x00000000.0018.02 ext: 0x4  spc: 0x1f1a
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
  TRN TBL::

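This shows that at startup the database uses SYSTEM undo block 0x00400006. bbed was used to clear that uba record so that, at startup, the database allocates a new SYSTEM undo block for the update of undo$. The database then opened successfully: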

BBED> m /x 0x00000000
 File: D:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF (0)
 Block: 2                Offsets: 4188 to 4192           Dba:0x00000000
------------------------------------------------------------------------
 00000000 3c002400 00009601 00000000 1f001400 0100f616 00000000 18000200
BBED> m /x 0x0000
 File: D:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF (0)
 Block: 2                Offsets: 4028 to 4032           Dba:0x00000000
------------------------------------------------------------------------
 00000000 00000000 3c005000 02800100 68000000 feffff7f 06004000 3c002400
Sat Jul 26 12:09:21 2014
Thread recovery: start rolling forward thread 1
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3326 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Sat Jul 26 12:09:21 2014
Thread recovery: finish rolling forward thread 1
Thread recovery: 0 data blocks read, 0 data blocks written, 3 redo blocks read
Crash recovery completed successfully
Sat Jul 26 12:09:22 2014
Thread 1 advanced to log sequence 3327
Thread 1 opened at log sequence 3327
  Current log# 3 seq# 3327 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1.
Sat Jul 26 12:09:22 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Jul 26 12:09:39 2014
Completed: alter database open
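
The addresses in the trace earlier in this case are packed values. As a reading aid, the Python sketch below splits a uba of the form dba.seq.rec, such as the 0x00400006.003c.25 shown under TRN CTL, and unpacks the dba into a relative file number and a block number. It assumes the usual 10-bit file / 22-bit block layout of a smallfile DBA; the function names are illustrative and not part of any Oracle tool.

```python
def decode_dba(dba):
    """Split a data block address into (relative file number, block number)."""
    # Assumed layout: top 10 bits are the file number, low 22 bits the block number.
    return dba >> 22, dba & 0x3FFFFF


def decode_uba(text):
    """Decode an undo block address written as 'dba.seq.rec' in hexadecimal."""
    dba_hex, seq_hex, rec_hex = text.split(".")
    rfile, block = decode_dba(int(dba_hex, 16))
    return {
        "rfile": rfile,
        "block": block,
        "seq": int(seq_hex, 16),
        "rec": int(rec_hex, 16),
    }


# The uba from the TRN CTL section of the trace.
print(decode_uba("0x00400006.003c.25"))
```

For this trace the result is file 1, block 6, sequence 60, record 37. The record number 37 equals the second argument of the ORA-00600 [4194], [12], [37] reported at open.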

Summary of undo errors and recovery approach

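Undo errors come in every shape imaginable. This is a summary of the more common ones I have run into, for reference only. Every database recovery is different, so these cannot be applied mechanically.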
ORA-00704/ORA-00376
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/oracle/oradata/ora11g/undotbs01.dbf'
Error 704 happened during db open, shutting down database
USER (ospid: 17864): terminating the instance due to error 704
Instance terminated by USER, pid = 17864
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (17864) as a result of ORA-1092

ORA-00600[4097]
Fri Aug 31 23:14:10 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Fri Aug 31 23:14:12 2012
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.

ORA-01595/ORA-00600[4194]
Fri Aug 31 23:14:14 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-01595: error freeing extent (2) of rollback segment (4))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], []

ORA-00600[4193]
Tue Feb 14 09:35:34 2012
Errors in file d:\oracle\product\10.2.0\admin\interlib\udump\interlib_ora_2824.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [4193], [2005], [2008], [], [], [], [], []

ORA-00600[kcfrbd_3]
Wed Dec 05 10:26:35 2012
SMON: enabling tx recovery
Wed Dec 05 10:26:35 2012
Database Characterset is ZHS16GBK
Wed Dec 05 10:26:35 2012
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_548.trc:
ORA-00600: internal error code, arguments: [kcfrbd_3], [2], [2279045], [1], [2277120], [2277120], [], []
SMON: terminating instance due to error 474

ORA-00600[4137]
Fri Jul 6 18:00:40 2012
SMON: ignoring slave err,downgrading to serial rollback
Fri Jul 6 18:00:41 2012
Errors in file /usr/local/oracle/admin/techdb/bdump/techdb_smon_16636.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
ORACLE Instance techdb (pid = 8) - Error 600 encountered while recovering transaction (3, 17).

ORA-01595/ORA-01594
Sat May 12 21:54:17 2012
Errors in file /oracle/app/admin/prmdb/bdump/prmdb2_smon_483522.trc:
ORA-01595: error freeing extent (2) of rollback segment (19))
ORA-01594: attempt to wrap into rollback segment (19) extent (2) which is being freed

ORA-00704/ORA-01555
Fri May 4 21:04:21 2012
select ctime, mtime, stime from obj$ where obj# = :1
Fri May 4 21:04:21 2012
Errors in file /oracle/admin/standdb/udump/perfdb_ora_1286288.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 40 with name "_SYSSMU40$" too small
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1286288
ORA-1092 signalled during: alter database open resetlogs...

ORA-00607/ORA-00600[4194]
Block recovery completed at rba 3994.5.16, scn 0.89979533
Thu Jul 26 13:21:11 2012
Errors in file /orasvr/admin/mispdata/udump/mispdata_ora_2865.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [31], [2], [], [], [], [], []
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 2865
ORA-1092 signalled during: ALTER DATABASE OPEN...

ORA-00704/ORA-00600[4000]
Thu Feb 28 19:29:13 2013
Errors in file /u1/PROD/prodora/db/tech_st/10.2.0/admin/PROD_oracle/udump/prod_ora_20989.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [50], [], [], [], [], [], []
Thu Feb 28 19:29:13 2013
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 20989
ORA-1092 signalled during: ALTER DATABASE OPEN RESETLOGS...

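Approach to recovering from undo errors
Apart from the rare cases of corrupt undo blocks or lost undo files, most undo errors occur because redo was not rolled forward properly, so undo rollback fails and the database cannot be opened. Solving this kind of problem usually also calls for redo recovery techniques. The general approach to undo errors is:
1. Switch to undo_management= MANUAL and try to start the database; if that fails, go to 2
2. Set events such as 10513 and try to start the database; if that fails, go to 3
3. Use _offline_rollback_segments/_corrupted_rollback_segments to mask the rollback segments
4. If the database still cannot be opened, consider using bbed to commit transactions, change rollback segment status, and so on
5. If the database still cannot be opened, consider using dul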
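
Steps 1 to 3 above come down to a handful of init.ora settings. The Python sketch below only renders candidate parameter lines for each step so they can be reviewed before being placed in a pfile; it does not touch a database. The helper name, the level 2 on event 10513, and the example segment names are assumptions for illustration (the list above does not give an event level).

```python
ROLLBACK_PARAMS = ("_offline_rollback_segments", "_corrupted_rollback_segments")


def pfile_lines(step, segments=(), param="_corrupted_rollback_segments"):
    """Return candidate init.ora lines for steps 1-3 of the list above."""
    if step == 1:
        return ["undo_management=MANUAL"]
    if step == 2:
        # Level 2 is an assumption; the article does not state a level.
        return ['event="10513 trace name context forever, level 2"']
    if step == 3:
        if param not in ROLLBACK_PARAMS:
            raise ValueError("unexpected parameter: " + param)
        if not segments:
            raise ValueError("step 3 needs at least one rollback segment name")
        return [param + "=(" + ",".join(segments) + ")"]
    raise ValueError("only steps 1-3 map to parameter settings")


# Example segment names are placeholders; real names come from the database.
for step in (1, 2, 3):
    for line in pfile_lines(step, segments=["RBS0", "RBS1"]):
        print(line)
```

Keeping the steps separate mirrors the escalation in the list: each setting is tried on its own, and the hidden parameters are reached only when the milder options have failed.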

Companion articles
Recovery from various ORACLE REDO errors
Recovery when lost files keep an ORACLE database from opening

ORA-607/ORA-600[4194] is not necessarily a major disaster

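I had previously both fixed and simulated ORA-607/ORA-600[4194] errors, so I had settled into the fixed belief that ORA-607/ORA-600[4194] probably means a major disaster. This case shows that it can also be about as routine an error as they come. A reader's database lost power unexpectedly, and ORA-00607/ORA-00600[4194]/ORA-00600[4097] errors during startup caused the startup to fail: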

SMON: enabling tx recovery
Fri Aug 31 23:14:08 2012
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=19, OS id=15619
Fri Aug 31 23:14:10 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Fri Aug 31 23:14:12 2012
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Fri Aug 31 23:14:12 2012
Completed: alter database open
Fri Aug 31 23:14:14 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-01595: error freeing extent (2) of rollback segment (4))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], []
Fri Aug 31 23:29:41 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [17], [10], [], [], [], [], []
Fri Aug 31 23:29:43 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], []
Fri Aug 31 23:29:44 2012
Errors in file /u01/oradata/orcl/bdump/orcl_pmon_15577.trc:
ORA-00474: SMON process terminated with error
Fri Aug 31 23:29:44 2012
PMON: terminating instance due to error 474
Instance terminated by PMON, pid = 15577

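The alert log points to a possible problem with the SMON_SCN_TIME table or its rollback. Analyzing the alert log together with the trace files shows that the main SQL statements involved in these errors are: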

ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], []
Current SQL statement for this session:
UPDATE SYS.COL_USAGE$
   SET EQUALITY_PREDS    = EQUALITY_PREDS +
                           DECODE(BITAND(:FLAG, 1), 0, 0, 1),
       EQUIJOIN_PREDS    = EQUIJOIN_PREDS +
                           DECODE(BITAND(:FLAG, 2), 0, 0, 1),
       NONEQUIJOIN_PREDS = NONEQUIJOIN_PREDS +
                           DECODE(BITAND(:FLAG, 4), 0, 0, 1),
       RANGE_PREDS       = RANGE_PREDS + DECODE(BITAND(:FLAG, 8), 0, 0, 1),
       LIKE_PREDS        = LIKE_PREDS + DECODE(BITAND(:FLAG, 16), 0, 0, 1),
       NULL_PREDS        = NULL_PREDS + DECODE(BITAND(:FLAG, 32), 0, 0, 1),
       TIMESTAMP         = :TIME
 WHERE OBJ# = :OBJN
   AND INTCOL# = :COLN
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4194], [17], [10], [], [], [], [], []
Current SQL statement for this session:
UPDATE SYS.MON_MODS$
   SET INSERTS       = INSERTS + :INS,
       UPDATES       = UPDATES + :UPD,
       DELETES       = DELETES + :DEL,
       FLAGS        =
       (DECODE(BITAND(FLAGS, :FLAG), :FLAG, FLAGS, FLAGS + :FLAG)),
       DROP_SEGMENTS = DROP_SEGMENTS + :DROPSEG,
       TIMESTAMP     = :TIME
 WHERE OBJ# = :OBJN
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Current SQL statement for this session:
INSERT INTO SMON_SCN_TIME
  (THREAD,
   TIME_MP,
   TIME_DP,
   SCN,
   SCN_WRP,
   SCN_BAS,
   NUM_MAPPINGS,
   TIM_SCN_MAP)
VALUES
  (0, :1, :2, :3, :4, :5, :6, :7)

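Operations on three Oracle tables are involved here:
COL_USAGE$: used during statistics gathering as a reference for whether column histograms need to be collected
MON_MODS$: used by Oracle to record which tables have had data changes, to support statistics gathering
SMON_SCN_TIME: records the mapping between SCN and time
From this analysis, the data in these three tables is not critical base-table information for the database. It can be cleaned out while the database is running; at worst, database performance degrades or conversion between SCN and time is incomplete.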

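Approach
The full sequence for handling undo errors
1. The alert log shows that the database went down after open because SMON failed while rolling back the SQL statements above, so try starting the database on the SYSTEM rollback segment and see whether that masks the problem
2. If method 1 does not work, use an event to block SMON's operations on the rollback segments so that the database starts normally
3. If a special transaction is present and the event cannot mask it, try hidden parameters
4. If hidden parameters still cannot solve the problem, consider bbed
5. If bbed cannot solve it, the only option left is dul or a similar tool
In this case the failure was clearly caused by rollback problems with the three SQL statements above. Testing showed that methods 1 and 2 both resolve it (after opening the database: rebuild undo, drop the problem undo tablespace, adjust the parameters [possibly including the event], and switch the undo tablespace). Because several ORA-607/ORA-600[4194] cases I had met were caused by a damaged SYSTEM rollback segment, I initially assumed this would also be a fairly complicated recovery; it turned out to be a very routine one. Experience with ORACLE database recovery helps locate a problem quickly, but thinking along a fixed track can lead into a dead end.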

Using bbed to resolve ORA-00607/ORA-00600[4194]

The ORA-00607/ORA-00600[4194] error
ORA-00607/ORA-00600[4194] at startup kept the database from opening normally:

Fri Nov  4 23:10:37 2011
SMON: enabling cache recovery
Fri Nov  4 23:10:37 2011
ARC2: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ARC0: Becoming the heartbeat ARCH
ARC2 started with pid=18, OS id=21535
Fri Nov  4 23:10:38 2011
Errors in file /u01/oracle/admin/XFF/udump/xff_ora_21529.trc:
ORA-00600: internal error code, arguments: [4194], [35], [6], [], [], [], [], []
Fri Nov  4 23:10:41 2011
Doing block recovery for file 1 block 18
Block recovery from logseq 2, block 48668 to scn 458453
Fri Nov  4 23:10:41 2011
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: /u01/oracle/oradata/XFF/redo01.log
Block recovery stopped at EOT rba 2.48670.16
Block recovery completed at rba 2.48670.16, scn 0.458451
Doing block recovery for file 1 block 9
Block recovery from logseq 2, block 48668 to scn 458450
Fri Nov  4 23:10:41 2011
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: /u01/oracle/oradata/XFF/redo01.log
Block recovery completed at rba 2.48670.16, scn 0.458451
Fri Nov  4 23:10:41 2011
Errors in file /u01/oracle/admin/XFF/udump/xff_ora_21529.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [35], [6], [], [], [], [], []
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 21529
ORA-1092 signalled during: ALTER DATABASE OPEN...

Analyzing the trace file

*** SESSION ID:(159.3) 2011-11-04 23:10:37.648
tkcrrsarc: (WARN) Failed to find ARCH for message (message:0x1)
tkcrrpa: (WARN) Failed initial attempt to send ARCH message (message:0x1)
*** ktuc_diag_dmp: dump of current change vector
ktudb redo: siz: 252 spc: 7200 flg: 0x0012 seq: 0x0037 rec: 0x06
            xid:  0x0000.022.00000028
ktubl redo: slt: 34 rci: 0 opc: 11.1 objn: 15 objd: 15 tsn: 0
Undo type:  Regular undo        Begin trans    Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
             0x00000000  prev ctl uba: 0x00400012.0037.1f
prev ctl max cmt scn:  0x0000.0006c75b  prev tx cmt scn:  0x0000.0006c75d
txn start scn:  0xffff.ffffffff  logon user: 0  prev brb: 4194318  prev bcl: 0 KDO undo record:
KTB Redo
op: 0x04  ver: 0x01
op: L  itl: xid:  0x0000.020.00000029 uba: 0x00400013.0037.05
                      flg: C---    lkc:  0     scn: 0x0000.0006fecb
KDO Op code: URP row dependencies Disabled
  xtype: XA flags: 0x00000000  bdba: 0x0040006a  hdba: 0x00400069
itli: 1  ispac: 0  maxfr: 4863
tabn: 0 slot: 1(0x1) flag: 0x2c lock: 0 ckix: 191
ncol: 17 nnew: 12 size: 0
col  1: [ 9]  5f 53 59 53 53 4d 55 31 24
col  2: [ 2]  c1 02
col  3: [ 2]  c1 03
col  4: [ 2]  c1 0a
col  5: [ 4]  c3 2e 55 0a
col  6: [ 1]  80
col  7: [ 3]  c2 02 59
col  8: [ 3]  c2 02 02
col  9: [ 1]  80
col 10: [ 2]  c1 03
col 11: [ 2]  c1 02
col 16: [ 2]  c1 02
*** 2011-11-04 23:10:38.086
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4194], [35], [6], [], [], [], [], []
Current SQL statement for this session:
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,scnbas=:9,
scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst()+27          call     ksedst1()            0 ? 1 ?
ksedmp()+557         call     ksedst()             0 ? 0 ? 0 ? 0 ? 0 ? 0 ?
ksfdmp()+19          call     ksedmp()             3 ? BFFA8C28 ? AC152C0 ?
                                                   CBD2DA0 ? 3 ? BFFA9764 ?
kgeriv()+188         call     00000000             CBD2DA0 ? 3 ?
kseipre()+42         call     kgeriv()             CBD2DA0 ? B6A50020 ? 1062 ?
                                                   2 ? BFFA8C68 ? BFFA8C5C ?
ksesic2()+21         call     kseipre()            1062 ? 2 ? BFFA8C68 ?
                                                   32B36940 ? BFFA8D38 ?
                                                   8C4A3A9 ?
kturdb()+1757        call     ksesic2()            1062 ? 0 ? 23 ? 0 ? 0 ? 6 ?
                                                   0 ?
kco_issue_callback(  call     00000000             B6A09FA4 ? B6A0A01E ? 11 ?
)+176                                              2D306014 ? B6A387C0 ?
kcoapl()+2440        call     kco_issue_callback(  B6A09FA0 ? 2D306000 ?
                              )                    B6A387C0 ?
kcbapl()+322         call     kcoapl()             B6A09FA0 ? 2D306000 ? 1 ? 0 ?
                                                   2000 ? 0 ? B6A387C0 ?
kcrfw_redo_gen()+94  call     kcbapl()             B6A09FA0 ? 2D3F6A1C ?
10                                                 CBE3AE8 ? 0 ? B6A387C0 ?
kcbchg1_main()+8669  call     kcrfw_redo_gen()     3 ? BFFA9358 ? BFFA9370 ?
                                                   CBE3AE8 ? 0 ? BFFA9390 ?
kcbchg1()+63         call     kcbchg1_main()       0 ? 3 ? BFFA97B0 ? BFFA9798 ?
                                                   0 ? 0 ?
ktuchg()+3344        call     kcbchg1()            0 ? 3 ? BFFA97B0 ? BFFA9798 ?
                                                   0 ? 0 ?
ktbchg2()+493        call     ktuchg()             2 ? 2F9EEF8C ? 3 ? B6A0CA98 ?
                                                   B6A0CAA0 ? B6A09FA0 ?
                                                   B6A387C0 ? B6A0C7A0 ? 0 ? 0 ?
kddchg()+1661        call     ktbchg2()            0 ? 2F9EEF8C ? B6A0CA98 ?
                                                   B6A0CAA0 ? B6A09FA0 ?
                                                   B6A387B8 ? B6A0C7A0 ? 0 ? 0 ?
kduovw()+7960        call     kddchg()             B6A3877C ? B6A0CA98 ?
                                                   B6A0CAA0 ? B6A09FA0 ?
                                                   B6A0C7A0 ? 0 ? 0 ? BFFA9C58 ?
kduurp()+2316        call     kduovw()             B6A3877C ? 0 ? 10 ?
                                                   B6A357A4 ? 0 ? B6A3877C ?
kdusru()+4339        call     kduurp()             B6A3877C ? 958412D ?
                                                   CBDC720 ? BFFA9FEC ? B8 ?
                                                   B6A40380 ?
kauupd()+366         call     kdusru()             B6A357A4 ? 2F9EEFF8 ?
                                                   B6A3877C ? 0 ?
updrow()+5889        call     kauupd()             B6A357A0 ? 2F9EEFF8 ?
                                                   B6A3877C ? 0 ? 2FA479FC ? E ?
                                                   F ? 2F9EF31C ? 12 ?
                                                   BFFB0544 ? BFFB04E4 ?
qerupRowProcedure()  call     updrow()             2F9E5B64 ? 7FFF ? DB4 ? 48 ?
+62                                                2F9EFBF4 ? BFFB08B4 ?
qerupFetch()+1187    call     00000000             2F9EF4B0 ? 7FFF ?
updaul()+3474        call     00000000             2F9EF4B0 ? 0 ? 2F9EF370 ?
                                                   7FFF ?
updThreePhaseExe()+  call     updaul()             2F9E5B64 ? BFFB0D2C ? 0 ?
3470
updexe()+813         call     updThreePhaseExe()   2F9E5B64 ? 0 ? B6A3877C ?
                                                   BFFB0E00 ? 2F9E5B64 ? 1 ?
                                                   BFFB0E00 ? 0 ?
opiexe()+17967       call     updexe()             2F9E5B64 ? BFFB1074 ?
opiodr()+2347        call     00000000             4 ? 4 ? BFFB25A8 ?
rpidrus()+434        call     opiodr()             4 ? 4 ? BFFB25A8 ? 2 ?
skgmstack()+210      call     00000000             BFFB2004 ? 97492FE ?
                                                   CBD2E9C ? BFFB1FE8 ?
                                                   BFFB24EC ? BFFB2004 ?
rpidru()+98          call     skgmstack()          BFFB1FE8 ? CBD2B60 ? F618 ?
                                                   9749546 ? BFFB2004 ?
rpiswu2()+1061       call     00000000             BFFB24EC ? BFFB25E8 ?
                                                   BFFB2500 ? 2 ? BFFB24B0 ?
                                                   5953 ?
rpidrv()+1915        call     rpiswu2()            32F0A1D4 ? 0 ? BFFB24B0 ? 2 ?
                                                   BFFB2528 ? 0 ? BFFB24B0 ? 0 ?
                                                   9749800 ? 97498DC ?
                                                   BFFB24EC ? 8 ?
rpiexe()+65          call     rpidrv()             2 ? 4 ? BFFB25A8 ? 8 ?
ktuscu()+697         call     rpiexe()             2 ? 1C ? 2A ? 32FF3404 ? 0 ?
                                                   BFFB2710 ?
kqrcmt()+945         call     00000000             32AFA70C ? 3 ?
ktcrcm()+945         call     kqrcmt()             31A2B84C ? 1 ? 0 ?
ktuswr()+1855        call     ktcrcm()             31A2B84C ? 0 ? 0 ? 0 ? 0 ?
                                                   1 ? 0 ? 0 ?
ktusmous_online_und  call     ktuswr()             1 ? 0 ? 0 ? 0 ? 0 ? 0 ?
oseg()+951
ktusmout_online_ut(  call     ktusmous_online_und  1 ? A ? 0 ? 3 ?
)+737                         oseg()
ktusmiut_init_ut()+  call     ktusmout_online_ut(  1 ? 0 ? 0 ?
1084                          )
ktuini()+688         call     ktusmiut_init_ut()   0 ? BFFB4744 ? CBD2E9C ?
                                                   CBD2E9C ? CBD2DA0 ? 7 ?
adbdrv()+5699        call     ktuini()             0 ? 0 ? 0 ? 0 ? 64000000 ?
                                                   3 ?
opiexe()+18301       call     adbdrv()             59D4 ? 0 ? 9EE16E2F ? 494C4 ?
                                                   32B33CD0 ? 0 ?
opiosq0()+3918       call     opiexe()             4 ? 0 ? BFFB8988 ?
kpooprx()+250        call     opiosq0()            3 ? E ? BFFB8B90 ? A4 ?
kpoal8()+867         call     kpooprx()            BFFBAD68 ? BFFB990C ? 13 ?
                                                   1 ? 0 ? A4 ?
opiodr()+2347        call     00000000             5E ? 17 ? BFFBAD64 ?
ttcpip()+4227        call     00000000             5E ? 17 ? BFFBAD64 ? 0 ?
                                                   DABCA66 ? 93 ?
opitsk()+1991        call     ttcpip()             CBDA5A0 ? 5E ? BFFBAD64 ? 0 ?
                                                   BFFBA244 ? BFFBAE88 ?
opiino()+1387        call     opitsk()             0 ? 0 ?
opiodr()+2347        call     00000000             3C ? 4 ? BFFBB950 ?
opidrv()+915         call     opiodr()             3C ? 4 ? BFFBB950 ? 0 ?
sou2o()+113          call     opidrv()             3C ? 4 ? BFFBB950 ?
opimai_real()+212    call     sou2o()              BFFBB934 ? 3C ? 4 ?
                                                   BFFBB950 ?
main()+111           call     opimai_real()        2 ? BFFBB980 ?
__libc_start_main()  call     00000000             2 ? BFFBBA44 ? BFFBBA50 ?
+220                                               47D9A828 ? 0 ? 1 ?
--------------------- Binary Stack Dump ---------------------

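When the database opens, it has to change the status of the undo segment in undo$ from 2 to 3 (offline -> online), and that update itself uses the system rollback segment. The failure occurs when the system rollback segment is used with uba=0x00400012, so the database cannot open and ORA-00600 [4194] is raised. The most likely cause is a problem with file 1 block 18. What we need to do is keep the database from using file 1 block 18 at startup and have it allocate a different undo block instead.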
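
The mapping from uba 0x00400012 to file 1 block 18 can be checked by hand. A minimal sketch, assuming the usual layout of a 32-bit data block address (upper 10 bits relative file number, lower 22 bits block number); the helper name decode_dba is only illustrative:

```python
def decode_dba(dba):
    """Split a 32-bit data block address into (relative file#, block#)."""
    file_no = dba >> 22          # upper 10 bits: relative file number
    block_no = dba & 0x3FFFFF    # lower 22 bits: block number within the file
    return file_no, block_no


# Addresses that appear in the trace and in the bbed output.
for dba in (0x00400009, 0x00400012, 0x00400013):
    print(hex(dba), decode_dba(dba))
```

The result for 0x00400013 agrees with what bbed itself reports for that address (4194323 1,19).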

Clearing the rollback segment's block allocation information with bbed

[oracle@xifenfei ~]$ bbed listfile=list mode=edit password=blockedit
BBED: Release 2.0.0.0.0 - Limited Production on Sat Nov 5 01:11:49 2011
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set file 1 block 9
        FILE#           1
        BLOCK#          9
BBED> map
 File: /u01/oracle/oradata/XFF/system01.dbf (1)
 Block: 9                                     Dba:0x00400009
------------------------------------------------------------
 Unlimited Undo Segment Header
 struct kcbh, 20 bytes                      @0
 struct ktech, 72 bytes                     @20
 struct ktemh, 16 bytes                     @92
 struct ktetb[6], 48 bytes                  @108
 struct ktuxc, 104 bytes                    @4148
 struct ktuxe[255], 10200 bytes             @4252
 ub4 tailchk                                @8188
BBED> p ktuxc
struct ktuxc, 104 bytes                     @4148
   struct ktuxcscn, 8 bytes                 @4148
      ub4 kscnbas                           @4148     0x0006c75b
      ub2 kscnwrp                           @4152     0x0000
   struct ktuxcuba, 8 bytes                 @4156
      ub4 kubadba                           @4156     0x00400012
      ub2 kubaseq                           @4160     0x0037
      ub1 kubarec                           @4162     0x1f
   sb2 ktuxcflg                             @4164     1 (KTUXCFSK)
   ub2 ktuxcseq                             @4166     0x0037
   sb2 ktuxcnfb                             @4168     1
   ub4 ktuxcinc                             @4172     0x00000000
   sb2 ktuxcchd                             @4176     34
   sb2 ktuxcctl                             @4178     32
   ub2 ktuxcmgc                             @4180     0x8002
   ub4 ktuxcopt                             @4188     0x7ffffffe
   struct ktuxcfbp[0], 12 bytes             @4192
      struct ktufbuba, 8 bytes              @4192
         ub4 kubadba                        @4192     0x00400012
         ub2 kubaseq                        @4196     0x0037
         ub1 kubarec                        @4198     0x05
      sb2 ktufbext                          @4200     1
      sb2 ktufbspc                          @4202     7200
   struct ktuxcfbp[1], 12 bytes             @4204
      struct ktufbuba, 8 bytes              @4204
         ub4 kubadba                        @4204     0x00000000
         ub2 kubaseq                        @4208     0x0035
         ub1 kubarec                        @4210     0x2a
      sb2 ktufbext                          @4212     5
      sb2 ktufbspc                          @4214     3446
   struct ktuxcfbp[2], 12 bytes             @4216
      struct ktufbuba, 8 bytes              @4216
         ub4 kubadba                        @4216     0x00000000
         ub2 kubaseq                        @4220     0x0035
         ub1 kubarec                        @4222     0x37
      sb2 ktufbext                          @4224     5
      sb2 ktufbspc                          @4226     1336
   struct ktuxcfbp[3], 12 bytes             @4228
      struct ktufbuba, 8 bytes              @4228
         ub4 kubadba                        @4228     0x00000000
         ub2 kubaseq                        @4232     0x0000
         ub1 kubarec                        @4234     0x00
      sb2 ktufbext                          @4236     0
      sb2 ktufbspc                          @4238     0
   struct ktuxcfbp[4], 12 bytes             @4240
      struct ktufbuba, 8 bytes              @4240
         ub4 kubadba                        @4240     0x00000000
         ub2 kubaseq                        @4244     0x0000
         ub1 kubarec                        @4246     0x00
      sb2 ktufbext                          @4248     0
      sb2 ktufbspc                          @4250     0
BBED> set count 16
        COUNT           16
########################################################
Modify the relevant parameters with bbed
########################################################

Start the database

SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1219160 bytes
Variable Size              96470440 bytes
Database Buffers          213909504 bytes
Redo Buffers                7168000 bytes
Database mounted.
Database opened.
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

Simulating the ORA-00607/ORA-00600 [4194] failure with bbed

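In database recovery work, running into a damaged system rollback segment is like winning the lottery: it is troublesome to deal with, and in some cases it cannot be fixed at all. This post reproduces ORA-00607/ORA-00600 [4194] in a test. I ran into a similar error once during a database recovery for UnionPay (银联), but my understanding was not deep enough at the time and I got part of the analysis wrong.
Simulating the ORA-00607/ORA-00600 [4194] error with bbed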

[oracle@xifenfei ~]$ bbed listfile=list mode=edit password=blockedit
BBED: Release 2.0.0.0.0 - Limited Production on Fri Nov 4 22:59:51 2011
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> info
 File#  Name                                                        Size(blks)
 -----  ----                                                        ----------
     1  /u01/oracle/oradata/XFF/system01.dbf                                 0
     2  /u01/oracle/oradata/XFF/undotbs01.dbf                                0
     3  /u01/oracle/oradata/XFF/sysaux01.dbf                                 0
     4  /u01/oracle/oradata/XFF/users01.dbf                                  0
     5  /u01/oracle/oradata/XFF/datfttuser.dbf                               0
BBED> set block 9
        BLOCK#          9
BBED> map
 File: /u01/oracle/oradata/XFF/system01.dbf (1)
 Block: 9                                     Dba:0x00400009
------------------------------------------------------------
 Unlimited Undo Segment Header
 struct kcbh, 20 bytes                      @0
 struct ktech, 72 bytes                     @20
 struct ktemh, 16 bytes                     @92
 struct ktetb[6], 48 bytes                  @108
 struct ktuxc, 104 bytes                    @4148
 struct ktuxe[255], 10200 bytes             @4252
 ub4 tailchk                                @8188
BBED> p ktuxc
struct ktuxc, 104 bytes                     @4148
   struct ktuxcscn, 8 bytes                 @4148
      ub4 kscnbas                           @4148     0x0006c75b
      ub2 kscnwrp                           @4152     0x0000
   struct ktuxcuba, 8 bytes                 @4156
      ub4 kubadba                           @4156     0x00400012
      ub2 kubaseq                           @4160     0x0037
      ub1 kubarec                           @4162     0x1f
   sb2 ktuxcflg                             @4164     1 (KTUXCFSK)
   ub2 ktuxcseq                             @4166     0x0037
   sb2 ktuxcnfb                             @4168     1            <==free undo block num
   ub4 ktuxcinc                             @4172     0x00000000
   sb2 ktuxcchd                             @4176     34
   sb2 ktuxcctl                             @4178     32
   ub2 ktuxcmgc                             @4180     0x8002
   ub4 ktuxcopt                             @4188     0x7ffffffe
   struct ktuxcfbp[0], 12 bytes             @4192
      struct ktufbuba, 8 bytes              @4192
         ub4 kubadba                        @4192     0x00400013    <==uba (changed to a different uba address in this simulation)
         ub2 kubaseq                        @4196     0x0037        <==uba sequence
         ub1 kubarec                        @4198     0x05
      sb2 ktufbext                          @4200     1
      sb2 ktufbspc                          @4202     7200
   struct ktuxcfbp[1], 12 bytes             @4204
      struct ktufbuba, 8 bytes              @4204
         ub4 kubadba                        @4204     0x00000000
         ub2 kubaseq                        @4208     0x0035
         ub1 kubarec                        @4210     0x2a
      sb2 ktufbext                          @4212     5
      sb2 ktufbspc                          @4214     3446
   struct ktuxcfbp[2], 12 bytes             @4216
      struct ktufbuba, 8 bytes              @4216
         ub4 kubadba                        @4216     0x00000000
         ub2 kubaseq                        @4220     0x0035
         ub1 kubarec                        @4222     0x37
      sb2 ktufbext                          @4224     5
      sb2 ktufbspc                          @4226     1336
   struct ktuxcfbp[3], 12 bytes             @4228
      struct ktufbuba, 8 bytes              @4228
         ub4 kubadba                        @4228     0x00000000
         ub2 kubaseq                        @4232     0x0000
         ub1 kubarec                        @4234     0x00
      sb2 ktufbext                          @4236     0
      sb2 ktufbspc                          @4238     0
   struct ktuxcfbp[4], 12 bytes             @4240
      struct ktufbuba, 8 bytes              @4240
         ub4 kubadba                        @4240     0x00000000
         ub2 kubaseq                        @4244     0x0000
         ub1 kubarec                        @4246     0x00
      sb2 ktufbext                          @4248     0
      sb2 ktufbspc                          @4250     0
BBED> set dba 0x00400013
        DBA             0x00400013 (4194323 1,19)
BBED> p ktubh
struct ktubh, 26 bytes                      @20
   struct ktubhxid, 8 bytes                 @20
      ub2 kxidusn                           @20       0x0000
      ub2 kxidslt                           @22       0x0020
      ub4 kxidsqn                           @24       0x00000029
   ub2 ktubhseq                             @28       0x0037    <==uba seq
   ub1 ktubhcnt                             @30       0x05
   ub1 ktubhirb                             @31       0x05
   ub1 ktubhicl                             @32       0x00
   ub1 ktubhflg                             @33       0x00
   ub2 ktubhidx[0]                          @34       0x1fe8
   ub2 ktubhidx[1]                          @36       0x1f2c
   ub2 ktubhidx[2]                          @38       0x1e70
   ub2 ktubhidx[3]                          @40       0x1db4
   ub2 ktubhidx[4]                          @42       0x1cf8
   ub2 ktubhidx[5]                          @44       0x1c3c
BBED> set dba 0x00400012
        DBA             0x00400012 (4194322 1,18)
BBED> p ktubh
struct ktubh, 86 bytes                      @20
   struct ktubhxid, 8 bytes                 @20
      ub2 kxidusn                           @20       0x0000
      ub2 kxidslt                           @22       0x0020
      ub4 kxidsqn                           @24       0x00000029
   ub2 ktubhseq                             @28       0x0037
   ub1 ktubhcnt                             @30       0x23
   ub1 ktubhirb                             @31       0x23
   ub1 ktubhicl                             @32       0x00
   ub1 ktubhflg                             @33       0x00
   ub2 ktubhidx[0]                          @34       0x1fe8
   …………
   ub2 ktubhidx[35]                         @104      0x00b4
BBED> set block 9
        BLOCK#          9
BBED> set count 16
        COUNT           16
BBED> m /x 12004000 offset 4192
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/XFF/system01.dbf (1)
 Block: 9                Offsets: 4192 to 4207           Dba:0x00400009
------------------------------------------------------------------------
 12004000 37000500 0100201c 00000000
 <32 bytes per line>
BBED>  p ktuxc
struct ktuxc, 104 bytes                     @4148
   struct ktuxcscn, 8 bytes                 @4148
      ub4 kscnbas                           @4148     0x0006c75b
      ub2 kscnwrp                           @4152     0x0000
   struct ktuxcuba, 8 bytes                 @4156
      ub4 kubadba                           @4156     0x00400012
      ub2 kubaseq                           @4160     0x0037
      ub1 kubarec                           @4162     0x1f
   sb2 ktuxcflg                             @4164     1 (KTUXCFSK)
   ub2 ktuxcseq                             @4166     0x0037
   sb2 ktuxcnfb                             @4168     1
   ub4 ktuxcinc                             @4172     0x00000000
   sb2 ktuxcchd                             @4176     34
   sb2 ktuxcctl                             @4178     32
   ub2 ktuxcmgc                             @4180     0x8002
   ub4 ktuxcopt                             @4188     0x7ffffffe
   struct ktuxcfbp[0], 12 bytes             @4192
      struct ktufbuba, 8 bytes              @4192
         ub4 kubadba                        @4192     0x00400012  <==uba has been modified
         ub2 kubaseq                        @4196     0x0037
         ub1 kubarec                        @4198     0x05
      sb2 ktufbext                          @4200     1
      sb2 ktufbspc                          @4202     7200
   struct ktuxcfbp[1], 12 bytes             @4204
      struct ktufbuba, 8 bytes              @4204
         ub4 kubadba                        @4204     0x00000000
         ub2 kubaseq                        @4208     0x0035
         ub1 kubarec                        @4210     0x2a
      sb2 ktufbext                          @4212     5
      sb2 ktufbspc                          @4214     3446
   struct ktuxcfbp[2], 12 bytes             @4216
      struct ktufbuba, 8 bytes              @4216
         ub4 kubadba                        @4216     0x00000000
         ub2 kubaseq                        @4220     0x0035
         ub1 kubarec                        @4222     0x37
      sb2 ktufbext                          @4224     5
      sb2 ktufbspc                          @4226     1336
   struct ktuxcfbp[3], 12 bytes             @4228
      struct ktufbuba, 8 bytes              @4228
         ub4 kubadba                        @4228     0x00000000
         ub2 kubaseq                        @4232     0x0000
         ub1 kubarec                        @4234     0x00
      sb2 ktufbext                          @4236     0
      sb2 ktufbspc                          @4238     0
   struct ktuxcfbp[4], 12 bytes             @4240
      struct ktufbuba, 8 bytes              @4240
         ub4 kubadba                        @4240     0x00000000
         ub2 kubaseq                        @4244     0x0000
         ub1 kubarec                        @4246     0x00
      sb2 ktufbext                          @4248     0
      sb2 ktufbspc                          @4250     0
BBED> sum apply
Check value for File 1, Block 9:
current = 0xe686, required = 0xe686
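
The value passed to the modify command is the DBA in on-disk byte order, and the 16 bytes bbed echoes back can be read as one free block pool entry. A minimal sketch, assuming a little-endian platform and a single padding byte at offset 4199 (inferred from the offsets bbed prints, not stated in the session); parse_ktuxcfbp is a hypothetical helper name:

```python
import struct

# First 12 of the bytes bbed shows at offsets 4192-4207 after the modify.
raw = bytes.fromhex("12004000" "37000500" "0100201c")


def parse_ktuxcfbp(entry):
    """Unpack one 12-byte ktuxcfbp entry (little-endian assumed)."""
    # ub4 kubadba, ub2 kubaseq, ub1 kubarec, 1 pad byte, sb2 ktufbext, sb2 ktufbspc
    dba, seq, rec, ext, spc = struct.unpack("<IHBxhh", entry)
    return {"kubadba": dba, "kubaseq": seq, "kubarec": rec,
            "ktufbext": ext, "ktufbspc": spc}


entry = parse_ktuxcfbp(raw)
print({name: hex(value) for name, value in entry.items()})

# The argument given to "m /x" is the DBA 0x00400012 in file byte order.
print(struct.pack("<I", 0x00400012).hex())
```

The decoded fields line up with the p ktuxc output that follows the modify: kubadba 0x00400012, kubaseq 0x0037, kubarec 0x05, ktufbext 1, ktufbspc 7200.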

Start the database

SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1219160 bytes
Variable Size              96470440 bytes
Database Buffers          213909504 bytes
Redo Buffers                7168000 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced

Alert log

Fri Nov  4 23:10:37 2011
SMON: enabling cache recovery
Fri Nov  4 23:10:37 2011
ARC2: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ARC0: Becoming the heartbeat ARCH
ARC2 started with pid=18, OS id=21535
Fri Nov  4 23:10:38 2011
Errors in file /u01/oracle/admin/XFF/udump/xff_ora_21529.trc:
ORA-00600: internal error code, arguments: [4194], [35], [6], [], [], [], [], []
Fri Nov  4 23:10:41 2011
Doing block recovery for file 1 block 18
Block recovery from logseq 2, block 48668 to scn 458453
Fri Nov  4 23:10:41 2011
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: /u01/oracle/oradata/XFF/redo01.log
Block recovery stopped at EOT rba 2.48670.16
Block recovery completed at rba 2.48670.16, scn 0.458451
Doing block recovery for file 1 block 9
Block recovery from logseq 2, block 48668 to scn 458450
Fri Nov  4 23:10:41 2011
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: /u01/oracle/oradata/XFF/redo01.log
Block recovery completed at rba 2.48670.16, scn 0.458451
Fri Nov  4 23:10:41 2011
Errors in file /u01/oracle/admin/XFF/udump/xff_ora_21529.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [35], [6], [], [], [], [], []
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 21529
ORA-1092 signalled during: ALTER DATABASE OPEN...
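
The arguments [35] and [6] line up with the bbed output above: block 18 (0x00400012) has ktubhcnt 0x23 = 35, while the free block pool entry, like block 19 it originally pointed to, records 0x05, so the next undo record would be number 6. The sketch below is only a simplified model of that mismatch under this reading; it is not Oracle's actual check, and check_undo_apply is a hypothetical name:

```python
def check_undo_apply(block_record_count, redo_record_number):
    """Simplified model: a new undo record must be the block's next record.

    Returns None when the numbers agree, otherwise a tuple shaped like the
    ORA-00600 [4194] arguments (max record number in block, record in redo).
    """
    if redo_record_number == block_record_count + 1:
        return None
    return ("4194", block_record_count, redo_record_number)


# Block 19 as originally referenced: 5 records, redo adds record 6.
print(check_undo_apply(0x05, 6))
# Block 18 after the bbed change: 35 records, redo still adds record 6.
print(check_undo_apply(0x23, 6))
```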

Database reports ORA-00607/ORA-00600 [4194] errors

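Last night I handled a rather unusual ORA-00600 [4194] database recovery case. The customer's database had only just gone live and, through an oversight, had not been backed up. As luck would have it, even a system like this ran into trouble (the datafiles total 5 GB, redo log sequence#=9). The lesson: a DBA should never count on luck, and backups matter more than anything else.
The database reports ORA-00607/ORA-00600 [4194] errors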

Thu Jul 26 13:21:11 2012
SMON: enabling cache recovery
Thu Jul 26 13:21:11 2012
Errors in file /orasvr/admin/mispdata/udump/mispdata_ora_2865.trc:
ORA-00600: internal error code, arguments: [4194], [31], [2], [], [], [], [], []
Thu Jul 26 13:21:11 2012
Doing block recovery for file 1 block 18
Block recovery from logseq 3994, block 3 to scn 89979535
Thu Jul 26 13:21:11 2012
Recovery of Online Redo Log: Thread 1 Group 1 Seq 3994 Reading mem 0
  Mem# 0: /orasvr/mispdata/redo01.log
Block recovery stopped at EOT rba 3994.5.16
Block recovery completed at rba 3994.5.16, scn 0.89979533
Doing block recovery for file 1 block 9
Block recovery from logseq 3994, block 3 to scn 89979532
Thu Jul 26 13:21:11 2012
Recovery of Online Redo Log: Thread 1 Group 1 Seq 3994 Reading mem 0
  Mem# 0: /orasvr/mispdata/redo01.log
Block recovery completed at rba 3994.5.16, scn 0.89979533
Thu Jul 26 13:21:11 2012
Errors in file /orasvr/admin/mispdata/udump/mispdata_ora_2865.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [31], [2], [], [], [], [], []
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 2865
ORA-1092 signalled during: ALTER DATABASE OPEN...

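The alert log shows that ORA-00600 [4194] is what keeps the database from opening. What is different this time is that ORA-00607 appears ahead of ORA-00600 in the error stack, which suggests that a basic data block is damaged. ORA-00600 [4194] itself is caused by undo and redo being inconsistent. Taking the two errors together, the rough assessment is that a damaged internal object raises ORA-607, which leaves undo and redo inconsistent and produces ORA-00600 [4194].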

Trace file analysis

--dump redo
DUMP OF REDO FROM FILE '/orasvr/mispdata/redo02.log'
 Opcodes *.*
 DBAs (file#, block#):
      (1, 18)
 RBAs: 0x000000.00000000.0000 thru 0xffffffff.ffffffff.ffff
 SCNs: scn: 0x0000.00000000 thru scn: 0xffff.ffffffff
 Times: creation thru eternity
 FILE HEADER:
	Compatibility Vsn = 169870080=0xa200300
	Db ID=658120234=0x273a1e2a, Db Name='MISPDATA'
	Activation ID=658142762=0x273a762a
	Control Seq=16668=0x411c, File size=102400=0x19000
	File Number=2, Blksiz=512, File Type=2 LOG
 descrip:"Thread 0001, Seq# 0000003992, SCN 0x0000055c5e3c-0x0000055cac62"
 thread: 1 nab: 0x5 seq: 0x00000f98 hws: 0x6 eot: 0 dis: 0
 resetlogs count: 0x2d42646a scn: 0x0000.00000001 (1)
 resetlogs terminal rcv count: 0x0 scn: 0x0000.00000000
 prev resetlogs count: 0x0 scn: 0x0000.00000000
 prev resetlogs terminal rcv count: 0x0 scn: 0x0000.00000000
 Low  scn: 0x0000.055c5e3c (89939516) 07/26/2012 11:17:42
 Next scn: 0x0000.055cac62 (89959522) 07/26/2012 13:16:19
 Enabled scn: 0x0000.00000001 (1) 08/16/2011 11:50:10
 Thread closed scn: 0x0000.055cac61 (89959521) 07/26/2012 11:17:42
 Disk cksum: 0x3088 Calc cksum: 0x3088
 Terminal recovery stop scn: 0x0000.00000000
 Terminal recovery  01/01/1988 00:00:00
 Most recent redo scn: 0x0000.00000000
 Largest LWN: 0 blocks
 End-of-redo stream : No
 Unprotected mode
 Miscellaneous flags: 0x0
 Thread internal enable indicator: thr: 0, seq: 0 scn: 0x0000.00000000
--ORA-00600 error message
*** 2012-07-26 13:21:11.566
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4194], [31], [2], [], [], [], [], []
Current SQL statement for this session:
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,
xactsqn=:8,scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12,spare1=:13 where us#=:1
--ora-607
Error 607 in redo application callback
TYP:0 CLS:16 AFN:1 DBA:0x00400012 OBJ:4294967295 SCN:0x0000.0551610e SEQ:  1 OP:5.1
ktudb redo: siz: 256 spc: 7892 flg: 0x0012 seq: 0x003d rec: 0x02
            xid:  0x0000.026.00000035
ktubl redo: slt: 38 rci: 0 opc: 11.1 objn: 15 objd: 15 tsn: 0
Undo type:  Regular undo        Begin trans    Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
             0x00000000  prev ctl uba: 0x00400012.003d.01
prev ctl max cmt scn:  0x0000.0550709b  prev tx cmt scn:  0x0000.0550709c
txn start scn:  0xffff.ffffffff  logon user: 0  prev brb: 4194318  prev bcl: 0 KDO undo record:
KTB Redo
op: 0x04  ver: 0x01
op: L  itl: xid:  0x0000.01e.00000035 uba: 0x00400012.003d.01
                      flg: C---    lkc:  0     scn: 0x0000.05511296
KDO Op code: URP row dependencies Disabled
  xtype: XA flags: 0x00000000  bdba: 0x0040006a  hdba: 0x00400069
itli: 1  ispac: 0  maxfr: 4863
tabn: 0 slot: 1(0x1) flag: 0x2c lock: 0 ckix: 0
ncol: 17 nnew: 12 size: 0
col  1: [ 9]  5f 53 59 53 53 4d 55 31 24
col  2: [ 2]  c1 02
col  3: [ 2]  c1 03
col  4: [ 2]  c1 0a
col  5: [ 5]  c4 5a 12 5a 14
col  6: [ 1]  80
col  7: [ 4]  c3 08 5f 3d
col  8: [ 4]  c3 02 38 52
col  9: [ 1]  80
col 10: [ 2]  c1 04
col 11: [ 2]  c1 02
col 16: [ 2]  c1 02
Block after image is corrupt:
buffer tsn: 0 rdba: 0x00400012 (1/18)
scn: 0x0000.0551610e seq: 0x01 flg: 0x04 tail: 0x610e0201
frmt: 0x02 chkval: 0x65f8 type: 0x02=KTU UNDO BLOCK

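There is a lot of information here:
1. The redo dump section shows that file 1 block 18 may be damaged.
2. The ORA-600 section shows that the database raises the error while performing the rollback operation for an update on undo$.
3. The ORA-607 information shows that the undo$ row being updated is in file 1 block 106 (bdba 0x0040006a; hdba 0x00400069 is the segment header block). The query below was run on a database of the same version. In other words, the error occurs when the change to undo$ is rolled back.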

SQL> SELECT OWNER, SEGMENT_NAME, SEGMENT_TYPE, TABLESPACE_NAME, A.PARTITION_NAME
  2    FROM DBA_EXTENTS A
  3   WHERE FILE_ID = &FILE_ID
  4     AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1;
Enter value for file_id: 1
old   3:  WHERE FILE_ID = &FILE_ID
new   3:  WHERE FILE_ID = 1
Enter value for block_id: 106
old   4:    AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1
new   4:    AND 106 BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1
OWNER
------------------------------
SEGMENT_NAME
--------------------------------------------------------------------------------
SEGMENT_TYPE       TABLESPACE_NAME                PARTITION_NAME
------------------ ------------------------------ ------------------------------
SYS
UNDO$
TABLE              SYSTEM

4. dba 0x00400012 identifies the corrupt block as file 1 block 18. The object that owns the corrupt block is:

SQL> SELECT OWNER, SEGMENT_NAME, SEGMENT_TYPE, TABLESPACE_NAME, A.PARTITION_NAME
  2    FROM DBA_EXTENTS A
  3   WHERE FILE_ID = &FILE_ID
  4     AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1;
Enter value for file_id: 1
old   3:  WHERE FILE_ID = &FILE_ID
new   3:  WHERE FILE_ID = 1
Enter value for block_id: 18
old   4:    AND &BLOCK_ID BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1
new   4:    AND 18 BETWEEN BLOCK_ID AND BLOCK_ID + BLOCKS - 1
OWNER
------------------------------
SEGMENT_NAME
--------------------------------------------------------------------------------
SEGMENT_TYPE       TABLESPACE_NAME                PARTITION_NAME
------------------ ------------------------------ ------------------------------
SYS
SYSTEM
ROLLBACK           SYSTEM

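From this analysis we can roughly determine the cause of the failure:
Because the ROLLBACK block (file 1 block 18) is corrupt, applying redo to that undo block fails with ORA-607, which leaves undo and redo inconsistent and raises ORA-00600 [4194]. As a result, an update transaction on undo$ (file 1 block 106) can be neither committed nor rolled back, and the database cannot be opened.
Because the damage is in the ROLLBACK segment, hidden parameters cannot be used to mask it. Since this database is very small, we chose to extract the data directly from the datafiles. With a larger database, one could try using bbed to commit the transaction in the undo$ data block (file 1 block 106) and see whether, with some luck, the database then starts normally.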
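
The col lines in the KDO undo record of the trace are raw column bytes, and reading them makes the undo record easier to follow. A minimal sketch that decodes the non-negative integer values using the standard Oracle NUMBER encoding (exponent byte minus 0xC1, then base-100 digits each stored as digit + 1); decode_number is an illustrative helper and deliberately rejects negative or fractional values:

```python
def decode_number(hex_bytes):
    """Decode a non-negative integer stored in Oracle NUMBER format."""
    data = bytes.fromhex(hex_bytes.replace(" ", ""))
    if data == b"\x80":                      # a single 0x80 byte encodes zero
        return 0
    exponent = data[0] - 0xC1                # power of 100 of the first digit
    digits = data[1:]
    if exponent < 0 or len(digits) > exponent + 1:
        raise ValueError("negative or fractional NUMBER not handled here")
    value = 0
    for i, byte in enumerate(digits):
        value += (byte - 1) * 100 ** (exponent - i)
    return value


# col 1 of the undo record is plain ASCII text.
print(bytes.fromhex("5f535953534d553124").decode("ascii"))
# A few numeric columns taken from the two trace dumps.
for col in ("c1 02", "c1 0a", "80", "c3 2e 55 0a", "c4 5a 12 5a 14"):
    print(col, "->", decode_number(col))
```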

Resolving ORA-600 [4194]/[4193]

A friend's database reported ORA-600 [4194]/[4193] errors at startup

Tue Feb 14 09:34:11 2012
Errors in file d:\oracle\product\10.2.0\admin\interlib\bdump\interlib_smon_2784.trc:
ORA-01595: error freeing extent (2) of rollback segment (3))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [6], [30], [], [], [], [], []
Tue Feb 14 09:35:34 2012
Errors in file d:\oracle\product\10.2.0\admin\interlib\udump\interlib_ora_2824.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [4193], [2005], [2008], [], [], [], [], []
ORA-00600: internal error code, arguments: [4193], [2005], [2008], [], [], [], [], []
Tue Feb 14 09:36:30 2012
DEBUG: Replaying xcb 0x1fa24174, pmd 0x1fba06d4 for failed op 8
Doing block recovery for file 2 block 177
No block recovery was needed
Tue Feb 14 09:37:30 2012
Errors in file d:\oracle\product\10.2.0\admin\interlib\bdump\interlib_pmon_2732.trc:
ORA-00600: internal error code, arguments: [4193], [2005], [2008], [], [], [], [], []
Tue Feb 14 09:37:31 2012
Errors in file d:\oracle\product\10.2.0\admin\interlib\bdump\interlib_pmon_2732.trc:
ORA-00600: internal error code, arguments: [4193], [2005], [2008], [], [], [], [], []

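The log shows ORA-600[4194]/[4193], so the first impression is that something is wrong with undo.
4193: undo and redo are inconsistent (Arg [a] Undo record seq number, Arg [b] Redo record seq number);
4194: undo and redo are also inconsistent (Arg [a] Maximum Undo record number in Undo block, Arg [b] Undo record number from Redo block).
I do not know why sometimes only one of the two appears; answers are welcome.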
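
When the alert log is long, it helps to pull out the two arguments and label them with the meanings listed above. This is a rough sketch only: the regular expression is written against the message format seen in this alert log, and the names `ARG_MEANING` and `explain` are made up for illustration.

```python
import re

# Argument meanings as listed above for each error code
ARG_MEANING = {
    "4193": ("undo record seq number", "redo record seq number"),
    "4194": ("maximum undo record number in undo block",
             "undo record number from redo block"),
}

# Matches the ORA-00600 line format that appears in this alert log
PATTERN = re.compile(
    r"ORA-00600: internal error code, arguments: "
    r"\[(4193|4194)\], \[(\d+)\], \[(\d+)\]"
)


def explain(line):
    """Return a dict labelling Arg [a] and Arg [b], or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    code = m.group(1)
    name_a, name_b = ARG_MEANING[code]
    return {"code": code, name_a: int(m.group(2)), name_b: int(m.group(3))}


line = "ORA-00600: internal error code, arguments: [4194], [6], [30], [], [], [], [], []"
print(explain(line))
```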

After simply setting the parameters below, the database unexpectedly opened successfully. This friend was fairly lucky.

undo_tablespace=SYSTEM
undo_management=MANUAL

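With the database open, create a new undo tablespace, drop the damaged one, then change the parameters as follows to complete the recovery:

undo_tablespace=<new undo tablespace>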
undo_management=AUTO

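In extreme cases, the following may also be necessary:
1. Use _offline_rollback_segments and _corrupted_rollback_segments to mask the problematic undo segments.
2. ORA-600[2662] may then appear, in which case the SCN has to be advanced.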

Resolving an ORA-00600[4194] failure

Author: 惜分飞 © All rights reserved. [Reproduction in any form without my consent is prohibited; I reserve the right to pursue legal action.]

After a power failure, a friend's database would start normally and then go down by itself a few moments later.
I. Alert log

Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 2
Autotune of undo retention is turned on.
IMODE=BR
ILAT =18
LICENSE_MAX_USERS = 0
SYS auditing is disabled
ksdpec: called for event 13740 prior to event group initialization
Starting up ORACLE RDBMS Version: 10.2.0.1.0.
System parameters with non-default values:
  processes                = 150
  __shared_pool_size       = 58720256
  __large_pool_size        = 4194304
  __java_pool_size         = 4194304
  __streams_pool_size      = 4194304
  nls_date_format          = yyyy-mm-dd hh24:mi:ss
  sga_target               = 335544320
  control_files            = /u02/ezhou/control01.ctl
  db_block_size            = 8192
  compatible               = 10.2.0.1.0
  log_archive_dest         = /u02/arch
  log_archive_max_processes= 10
  db_file_multiblock_read_count= 16
  fast_start_mttr_target   = 300
  undo_management          = AUTO
  undo_tablespace          = UNDOTBS1
  remote_login_passwordfile= EXCLUSIVE
  db_domain                =
  dispatchers              = (PROTOCOL=TCP) (SERVICE=ezhouXDB)
  job_queue_processes      = 10
  background_dump_dest     = /u01/pp/oracle/admin/ezhou/bdump
  user_dump_dest           = /u01/pp/oracle/admin/ezhou/udump
  core_dump_dest           = /u01/pp/oracle/admin/ezhou/cdump
  audit_file_dest          = /u01/pp/oracle/admin/ezhou/adump
  db_name                  = ezhou
  open_cursors             = 400
  sql_trace                = TRUE
  pga_aggregate_target     = 94371840
MMAN started with pid=4, OS id=5539
PMON started with pid=2, OS id=5535
DBW0 started with pid=5, OS id=5541
LGWR started with pid=6, OS id=5543
SMON started with pid=8, OS id=5547
CJQ0 started with pid=10, OS id=5577
RECO started with pid=9, OS id=5575
Sat Dec 10 17:15:40 2011
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'
MMNL started with pid=12, OS id=5581
MMON started with pid=11, OS id=5579
Sat Dec 10 17:15:40 2011
starting up 1 shared server(s) ...
PSP0 started with pid=3, OS id=5537
CKPT started with pid=7, OS id=5545
Sat Dec 10 17:15:42 2011
ALTER DATABASE   MOUNT
Sat Dec 10 17:15:46 2011
Setting recovery target incarnation to 3
Sat Dec 10 17:15:47 2011
Successful mount of redo thread 1, with mount id 4055654398
Sat Dec 10 17:15:47 2011
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Sat Dec 10 17:15:47 2011
ALTER DATABASE OPEN
Sat Dec 10 17:15:47 2011
Beginning crash recovery of 1 threads
Sat Dec 10 17:15:47 2011
Started redo scan
Sat Dec 10 17:15:48 2011
Completed redo scan
 319 redo blocks read, 98 data blocks need recovery
Sat Dec 10 17:15:50 2011
Started redo application at
 Thread 1: logseq 24, block 3
Sat Dec 10 17:15:50 2011
Recovery of Online Redo Log: Thread 1 Group 3 Seq 24 Reading mem 0
  Mem# 0 errs 0: /u02/ezhou/redo03.log
Sat Dec 10 17:15:50 2011
Completed redo application
Sat Dec 10 17:15:51 2011
Completed crash recovery at
 Thread 1: logseq 24, block 322, scn 6168722
 98 data blocks read, 98 data blocks written, 319 redo blocks read
Sat Dec 10 17:15:51 2011
LGWR: STARTING ARCH PROCESSES
ARC1 started with pid=17, OS id=5645
ARC0 started with pid=16, OS id=5643
ARC3 started with pid=19, OS id=5649
ARC4 started with pid=20, OS id=5651
ARC2 started with pid=18, OS id=5647
ARC6 started with pid=22, OS id=5655
ARC7 started with pid=23, OS id=5657
ARC5 started with pid=21, OS id=5653
ARC8 started with pid=24, OS id=5659
Sat Dec 10 17:15:52 2011
ARC0: Archival started
ARC1: Archival started
ARC2: Archival started
ARC3: Archival started
ARC4: Archival started
ARC5: Archival started
ARC6: Archival started
ARC7: Archival started
ARC8: Archival started
ARC9: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC9 started with pid=25, OS id=5661
Sat Dec 10 17:15:52 2011
Thread 1 advanced to log sequence 25
Sat Dec 10 17:15:53 2011
ARC2: STARTING ARCH PROCESSES
Sat Dec 10 17:15:53 2011
ARC6: Becoming the 'no FAL' ARCH
ARC6: Becoming the 'no SRL' ARCH
Sat Dec 10 17:15:53 2011
ARC3: Becoming the heartbeat ARCH
Sat Dec 10 17:15:53 2011
Thread 1 opened at log sequence 25
  Current log# 1 seq# 25 mem# 0: /u02/ezhou/redo01.log
  Current log# 1 seq# 25 mem# 1: /u02/ezhou/redo01a.rdo
Successful open of redo thread 1
Sat Dec 10 17:15:53 2011
SMON: enabling cache recovery
Sat Dec 10 17:15:54 2011
ARCa: Archival started
ARC2: STARTING ARCH PROCESSES COMPLETE
ARCa started with pid=26, OS id=5663
Sat Dec 10 17:15:57 2011
Successfully onlined Undo Tablespace 1.
Sat Dec 10 17:15:57 2011
SMON: enabling tx recovery
Sat Dec 10 17:15:57 2011
Database Characterset is AL32UTF8
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=27, OS id=5666
Sat Dec 10 17:16:13 2011
Errors in file /u01/pp/oracle/admin/ezhou/bdump/ezhou_smon_5547.trc:
ORA-00600: internal error code, arguments: [4194], [30], [27], [], [], [], [], []
Sat Dec 10 17:16:17 2011
Completed: ALTER DATABASE OPEN
Sat Dec 10 17:16:27 2011
Doing block recovery for file 2 block 4124
Block recovery from logseq 25, block 68 to scn 6168829
Sat Dec 10 17:16:27 2011
Recovery of Online Redo Log: Thread 1 Group 1 Seq 25 Reading mem 0
  Mem# 0 errs 0: /u02/ezhou/redo01.log
  Mem# 1 errs 0: /u02/ezhou/redo01a.rdo
Block recovery stopped at EOT rba 25.126.16
Block recovery completed at rba 25.126.16, scn 0.6168829
Doing block recovery for file 2 block 73
Block recovery from logseq 25, block 68 to scn 6168786
Sat Dec 10 17:16:28 2011
Recovery of Online Redo Log: Thread 1 Group 1 Seq 25 Reading mem 0
  Mem# 0 errs 0: /u02/ezhou/redo01.log
  Mem# 1 errs 0: /u02/ezhou/redo01a.rdo
Block recovery completed at rba 25.69.16, scn 0.6168789
Sat Dec 10 17:16:28 2011
Errors in file /u01/pp/oracle/admin/ezhou/bdump/ezhou_smon_5547.trc:
ORA-01595: error freeing extent (2) of rollback segment (5))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [30], [27], [], [], [], [], []
Sat Dec 10 17:16:30 2011
Errors in file /u01/pp/oracle/admin/ezhou/bdump/ezhou_j002_5690.trc:
ORA-00600: internal error code, arguments: [4194], [30], [27], [], [], [], [], []
Sat Dec 10 17:16:37 2011
Doing block recovery for file 2 block 4124
Block recovery from logseq 25, block 68 to scn 6168829
Sat Dec 10 17:16:37 2011
Recovery of Online Redo Log: Thread 1 Group 1 Seq 25 Reading mem 0
  Mem# 0 errs 0: /u02/ezhou/redo01.log
  Mem# 1 errs 0: /u02/ezhou/redo01a.rdo
Block recovery completed at rba 25.126.16, scn 0.6168830
Doing block recovery for file 2 block 73
Block recovery from logseq 25, block 68 to scn 6168841
Sat Dec 10 17:16:37 2011
Recovery of Online Redo Log: Thread 1 Group 1 Seq 25 Reading mem 0
  Mem# 0 errs 0: /u02/ezhou/redo01.log
  Mem# 1 errs 0: /u02/ezhou/redo01a.rdo
Block recovery completed at rba 25.149.16, scn 0.6168843
Sat Dec 10 17:16:37 2011
Errors in file /u01/pp/oracle/admin/ezhou/bdump/ezhou_j002_5690.trc:
ORA-12012: error on auto execute of job 8886
ORA-00607: Internal error occurred while making a change to a data block
Sat Dec 10 17:16:41 2011
Errors in file /u01/pp/oracle/admin/ezhou/bdump/ezhou_j003_5692.trc:
ORA-00600: internal error code, arguments: [4194], [30], [27], [], [], [], [], []
Sat Dec 10 17:16:42 2011
DEBUG: Replaying xcb 0x32a2b17c, pmd 0x32bdbd24 for failed op 8
Doing block recovery for file 2 block 4124
Block recovery from logseq 25, block 68 to scn 6168829
Sat Dec 10 17:16:42 2011
Recovery of Online Redo Log: Thread 1 Group 1 Seq 25 Reading mem 0
  Mem# 0 errs 0: /u02/ezhou/redo01.log
  Mem# 1 errs 0: /u02/ezhou/redo01a.rdo
Block recovery completed at rba 25.126.16, scn 0.6168830
Sat Dec 10 17:16:43 2011
Errors in file /u01/pp/oracle/admin/ezhou/bdump/ezhou_j003_5692.trc:
ORA-00600: internal error code, arguments: [4194], [30], [27], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [30], [27], [], [], [], [], []
Sat Dec 10 17:16:46 2011
Errors in file /u01/pp/oracle/admin/ezhou/bdump/ezhou_j003_5692.trc:
ORA-00600: internal error code, arguments: [4194], [30], [27], [], [], [], [], []
ORA-00600: internal error code, arguments: [4194], [30], [27], [], [], [], [], []
Sat Dec 10 17:17:46 2011
DEBUG: Replaying xcb 0x32a2b17c, pmd 0x32bdbd24 for failed op 8
Doing block recovery for file 2 block 4124
Block recovery from logseq 25, block 68 to scn 6168829
Sat Dec 10 17:17:46 2011
Recovery of Online Redo Log: Thread 1 Group 1 Seq 25 Reading mem 0
  Mem# 0 errs 0: /u02/ezhou/redo01.log
  Mem# 1 errs 0: /u02/ezhou/redo01a.rdo
Block recovery completed at rba 25.126.16, scn 0.6168830
Sat Dec 10 17:17:48 2011
Errors in file /u01/pp/oracle/admin/ezhou/bdump/ezhou_pmon_5535.trc:
ORA-00600: internal error code, arguments: [4194], [30], [27], [], [], [], [], []
Sat Dec 10 17:17:49 2011
Errors in file /u01/pp/oracle/admin/ezhou/bdump/ezhou_pmon_5535.trc:
ORA-00600: internal error code, arguments: [4194], [30], [27], [], [], [], [], []
PMON: terminating instance due to error 472
Instance terminated by PMON, pid = 5535
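
The alert log above keeps retrying block recovery on the same few blocks before PMON terminates the instance. A small sketch for tallying those targets follows. It assumes the "Doing block recovery for file N block M" wording seen in this log, and `block_recovery_targets` is a hypothetical helper name. In a default 10g layout file 2 is usually the UNDOTBS1 datafile, but that should be confirmed against v$datafile rather than assumed.

```python
import re
from collections import Counter

# Wording taken from the alert log in this post
BLOCK_RECOVERY = re.compile(r"Doing block recovery for file (\d+) block (\d+)")


def block_recovery_targets(alert_lines):
    """Count how often each (file#, block#) pair was the target of block recovery."""
    counts = Counter()
    for line in alert_lines:
        m = BLOCK_RECOVERY.search(line)
        if m:
            counts[(int(m.group(1)), int(m.group(2)))] += 1
    return counts


# A few lines copied from the alert log as sample input
sample = [
    "Doing block recovery for file 2 block 4124",
    "Block recovery from logseq 25, block 68 to scn 6168829",
    "Doing block recovery for file 2 block 73",
    "Doing block recovery for file 2 block 4124",
]
for (file_no, block_no), n in block_recovery_targets(sample).most_common():
    print(f"file {file_no} block {block_no}: {n} attempt(s)")
```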

II. MOS note

ERROR:
  ORA-600 [4194] [a] [b]
VERSIONS:
  versions 6.0 to 10.1
DESCRIPTION:
  A mismatch has been detected between Redo records and rollback (Undo)
  records.
  We are validating the Undo record number relating to the change being
  applied against the maximum undo record number recorded in the undo block.
  This error is reported when the validation fails.
ARGUMENTS:
  Arg [a] Maximum Undo record number in Undo block
  Arg [b] Undo record number from Redo block

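III. Solution
1. Modify the parameters:
undo_management= MANUAL
undo_tablespace= SYSTEM
2. Open the database, drop the current undo tablespace, and create a new undo tablespace.
3. Modify the parameters:
undo_management= AUTO
undo_tablespace= UNDOTBSNEW
4. Restart the database.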
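
Step 2 can be written out as concrete statements. The sketch below only builds the statement text and does not connect to any database. UNDOTBS1 and UNDOTBSNEW come from the alert log and the parameters above, while the datafile path and size are placeholders assumed for illustration and must be adapted. The helper name `undo_rebuild_statements` is hypothetical.

```python
def undo_rebuild_statements(old_ts, new_ts, datafile, size):
    """Return the step-2 statements as text, in the order the step lists them."""
    return [
        # Drop the damaged undo tablespace together with its datafiles
        f"DROP TABLESPACE {old_ts} INCLUDING CONTENTS AND DATAFILES;",
        # Create the replacement undo tablespace
        f"CREATE UNDO TABLESPACE {new_ts} DATAFILE '{datafile}' SIZE {size};",
    ]


# Datafile path and size below are assumptions, not values from the original case
for stmt in undo_rebuild_statements(
    "UNDOTBS1", "UNDOTBSNEW", "/u02/ezhou/undotbsnew01.dbf", "500M"
):
    print(stmt)
```

Review the statements and run them manually in SQL*Plus as SYSDBA, and take a backup first, since dropping the old tablespace cannot be undone.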