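Author: 惜分飞 © All rights reserved [May not be reproduced in any form without my consent; otherwise I reserve the right to pursue legal liability.]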
ORA-00600 [4137] and ORA-00600 [4198] errors in the alert log
The database reported the errors below, and after running for a while the instance went down on its own.
Fri Jul 6 18:00:40 2012
SMON: ignoring slave err,downgrading to serial rollback
Fri Jul 6 18:00:41 2012
Errors in file /usr/local/oracle/admin/techdb/bdump/techdb_smon_16636.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
ORACLE Instance techdb (pid = 8) - Error 600 encountered while recovering transaction (3, 17).
Fri Jul 6 18:00:41 2012
Errors in file /usr/local/oracle/admin/techdb/bdump/techdb_smon_16636.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Fri Jul 6 18:05:53 2012
SMON: Restarting fast_start parallel rollback
Fri Jul 6 18:05:54 2012
Errors in file /usr/local/oracle/admin/techdb/bdump/techdb_p000_17124.trc:
ORA-00600: internal error code, arguments: [4198], [9], [], [], [], [], [], []
…………
Wed Jul 6 18:50:38 2012
Errors in file /usr/local/oracle/admin/techdb/bdump/techdb_pmon_4473.trc:
ORA-00474: SMON process terminated with error
Wed Jul 6 18:50:38 2012
PMON: terminating instance due to error 474
Three pieces of evidence point to undo segment 3 being the damaged one:
1. The trace file
SMON: about to recover undo segment 3
Parallel Transaction recovery caught exception 12801
Parallel Transaction recovery caught error 30317
*** 2012-07-06 17:55:19.042
SMON: Restarting fast_start parallel rollback
SMON: about to recover undo segment 3
SMON: mark undo segment 3 as available
SMON: about to recover undo segment 3
SMON: mark undo segment 3 as available
Parallel Transaction recovery caught exception 12801
Parallel Transaction recovery caught error 607
*** 2012-07-06 17:55:19.761
SMON: ignoring slave err,downgrading to serial rollback
SMON: about to recover undo segment 3
XID passed in =xid: 0x0003.011.00003c2b
XID from Undo block =xid: 0x0004.020.00002b35
2. The alert log reports "while recovering transaction (3, 17)"
3. Querying dba_rollback_segs shows _SYSSMU3$ in NEEDS RECOVERY status
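The two XID lines at the end of the trace can be read as three hexadecimal fields: undo segment number, transaction slot, and sequence. The sketch below is only an illustration of that reading; the names `Xid` and `parse_xid` are made up for this example and are not part of any Oracle tool.

```python
from typing import NamedTuple


class Xid(NamedTuple):
    """Transaction id as printed in the trace: usn.slot.sqn (all hex)."""
    usn: int   # undo segment number
    slot: int  # slot in the segment header's transaction table
    sqn: int   # sequence (wrap) number of that slot


def parse_xid(text: str) -> Xid:
    """Parse a string such as '0x0003.011.00003c2b' into its three fields."""
    body = text.strip()
    if body.lower().startswith("0x"):
        body = body[2:]
    usn, slot, sqn = (int(part, 16) for part in body.split("."))
    return Xid(usn, slot, sqn)


# Values copied from the SMON trace above.
passed_in = parse_xid("0x0003.011.00003c2b")
from_undo = parse_xid("0x0004.020.00002b35")

print(passed_in)
print(from_undo)
print("XIDs match:", passed_in == from_undo)
```

The XID passed in decodes to undo segment 3, slot 17 (0x11), which is the same pair as "transaction (3, 17)" in the alert log. The undo block, however, carries an XID for undo segment 4, slot 32, and this mismatch is the condition that ORA-600 [4137] reports.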
Attempting to drop _SYSSMU3$
The instance was opened with the hidden parameter _offline_rollback_segments= _SYSSMU3$
Fri Jul 6 18:16:19 2012
Completed: ALTER DATABASE OPEN
Fri Jul 6 18:16:56 2012
drop rollback segment "_SYSSMU3$"
Fri Jul 6 18:16:57 2012
Errors in file /usr/local/oracle/admin/techdb/udump/techdb_ora_17381.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [2], [41], [38508], [], [], [], []
Fri Jul 6 18:16:57 2012
Doing block recovery for file 2 block 41
Block recovery from logseq 209591, block 183 to scn 7788878085
Fri Jul 6 18:16:57 2012
Recovery of Online Redo Log: Thread 1 Group 1 Seq 209591 Reading mem 0
Mem# 0 errs 0: /usr/local/oracle/oradata/techdb/redo01.log
Block recovery completed at rba 209591.225.16, scn 1.3493910790
ORA-607 signalled during: drop rollback segment "_SYSSMU3$"...
Fri Jul 6 18:16:57 2012
Corrupt Block Found
TSN = 1, TSNAME = UNDOTBS1
RFN = 2, BLK = 41, RDBA = 8388649
OBJN = 0, OBJD = -1, OBJECT = _NEXT_OBJECT, SUBOBJECT =
SEGMENT OWNER = SYS, SEGMENT TYPE = Invalid Type
Fri Jul 6 18:16:57 2012
Errors in file /usr/local/oracle/admin/techdb/bdump/techdb_smon_17367.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [2], [41], [38508], [], [], [], []
Doing block recovery for file 2 block 41
Block recovery from logseq 209591, block 183 to scn 7788878085
Fri Jul 6 18:17:46 2012
Errors in file /usr/local/oracle/admin/techdb/bdump/techdb_pmon_17355.trc:
ORA-00474: SMON process terminated with error
Fri Jul 6 18:17:46 2012
PMON: terminating instance due to error 474
Fri Jul 6 18:17:46 2012
Errors in file /usr/local/oracle/admin/techdb/bdump/techdb_dbw0_17361.trc:
ORA-00474: SMON process terminated with error
Fri Jul 6 18:17:46 2012
Errors in file /usr/local/oracle/admin/techdb/bdump/techdb_lgwr_17363.trc:
ORA-00474: SMON process terminated with error
Instance terminated by PMON, pid = 17355
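This shows that when the hidden parameter was used to drop the damaged rollback segment, the corrupt block in that segment raised ORA-00600 [kddummy_blkchk] and brought the database down. The database was restarted several times, and each time it went straight down with this same error.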
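The alert log names the corrupt block three ways: "file 2 block 41", "RFN = 2, BLK = 41", and "RDBA = 8388649". A small sketch shows that these are the same address. It assumes the usual smallfile layout, where the upper 10 bits of the 32-bit RDBA hold the relative file number and the lower 22 bits hold the block number. The function names are hypothetical.

```python
from typing import Tuple

BLOCK_BITS = 22                      # assumed: low 22 bits are the block number
BLOCK_MASK = (1 << BLOCK_BITS) - 1   # 0x3FFFFF


def decode_rdba(rdba: int) -> Tuple[int, int]:
    """Split a relative data block address into (relative file#, block#)."""
    return rdba >> BLOCK_BITS, rdba & BLOCK_MASK


def encode_rdba(file_no: int, block_no: int) -> int:
    """Inverse of decode_rdba, for cross-checking."""
    return (file_no << BLOCK_BITS) | block_no


# RDBA copied from the "Corrupt Block Found" section of the alert log.
file_no, block_no = decode_rdba(8388649)
print(f"rdba 8388649 (0x{8388649:08x}) -> file {file_no}, block {block_no}")
```

The result, file 2 block 41, is also what the first two arguments after [kddummy_blkchk] carry. Inside the database, the DBMS_UTILITY functions DATA_BLOCK_ADDRESS_FILE and DATA_BLOCK_ADDRESS_BLOCK perform the same split.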
The trace file shows:
SMON: about to recover undo segment 3
SMON: mark undo segment 3 as needs recovery
*** 2012-07-06 18:16:57.734
Block Checking: DBA = 8388649, Block Type = System Managed Segment Header Block
ERROR: SMU Segment Header Corrupted. Error Code = 38508
ktu4smck: starting extent(0x77) of txn slot #0x11 is invalid.
valid value (0 - 0x76)
TRN CTL:: seq: 0xed38 chd: 0x0020 ctl: 0x002a inc: 0x00000000 nfb: 0x0000
mgc: 0x8201 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
uba: 0x00a6610a.ed38.1d scn: 0x0001.d030de86
Version: 0x01
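The header block of undo segment 3 is corrupt. Even with the hidden parameter set to skip recovery of that segment, the SMON process still reads the segment header, so the same error occurs and the instance goes straight down.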
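The ktu4smck message in the trace is easier to follow with its hexadecimal values converted to decimal. The sketch below only redoes that arithmetic with the numbers copied from the trace. The helper `extent_is_valid` is a made-up name, and the range check is a reading of the message text, not Oracle's actual implementation.

```python
def extent_is_valid(extent: int, low: int, high: int) -> bool:
    """True when the starting extent lies inside the inclusive valid range."""
    return low <= extent <= high


# Values copied from the trace line
# "starting extent(0x77) of txn slot #0x11 is invalid. valid value (0 - 0x76)"
slot = int("0x11", 16)
starting_extent = int("0x77", 16)
valid_low, valid_high = 0, int("0x76", 16)

print("transaction slot:", slot)
print("starting extent:", starting_extent)
print("valid range:", valid_low, "to", valid_high)
print("extent valid:", extent_is_valid(starting_extent, valid_low, valid_high))
```

Slot 0x11 is slot 17, the same slot as in "transaction (3, 17)" from the alert log. Its starting extent, 119, is one past the highest valid value of 118, which is why the block check rejects the segment header with error code 38508.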
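Resolution
1. Use the hidden parameter to mask the damaged rollback segment: _offline_rollback_segments= _SYSSMU3$
2. Set undo_tablespace=SYSTEM and undo_management=MANUAL
3. Start the database and promptly drop the undo tablespace that contains _SYSSMU3$
4. Create a new undo tablespace
5. Set undo_tablespace=new_undo and undo_management=AUTO, and remove the hidden parameter
6. Restart the database with the new parameter file
7. Recommendation: rebuild the database with a logical export and import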
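Additional note: while handling this failure I forgot to try using an event to suppress the rollback, so I do not know whether that method would also prevent reads of the rollback segment header.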
event = 10513 trace name context forever,level 2
ORA-600 [4137] “XID in Undo and Redo Does Not Match” [ID 43914.1]
ERROR:
Format: ORA-600 [4137]
VERSIONS:
versions 7.0 to 10.1
DESCRIPTION:
While backing out an undo record (i.e. at the time of rollback) we found a transaction id mis-match indicating either a corruption in the rollback segment or corruption in an object which the rollback segment is trying to apply undo records on.
This would indicate a corrupted rollback segment.
FUNCTIONALITY:
Kernel Transaction Undo Recovery
IMPACT:
POSSIBLE PHYSICAL CORRUPTION in Rollback segments
SUGGESTIONS:
Signalled during rollback (also rollback for consistent read).
The consistency check that compares the transaction id of the transaction being rolled back against the transaction id in undo block being applied is failing.
A possible cause is a lost write to the undo segment.
The main approach is to identify the file containing the bad undo segment block and treat it as if the file is corrupt. Consult the trace file for this information.
If in archivelog mode, restore the file & roll forward.
If in Noarchivelog mode, restore from a cold backup taken before the error was reported.
Alternatively, you can look at dba_rollback_segs data dictionary view. If the status column that describes what state the rollback segment is currently in is "needs recovery" then lookup the following article for possible solution.
Note:28812.1 Rollback Segment Needs Recovery
If the Known Issues section below does not help in terms of identifying a solution, please submit the trace files and alert.log to Oracle Support Services for further analysis.