联系:手机/微信(+86 17813235971) QQ(107644445)
标题:ORA-00600[kcratr1_lostwrt]/ORA-00600[3020]错误恢复
作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]
open数据库alert日志报ORA-00600[kcratr1_lostwrt]错误
Mon May 14 14:57:28 2012 ALTER DATABASE OPEN Mon May 14 14:57:29 2012 Beginning crash recovery of 1 threads Mon May 14 14:57:29 2012 Started redo scan Mon May 14 14:57:29 2012 Errors in file d:\oracle\admin\cqgasold\udump\cqgasold_ora_504.trc: ORA-00600: 内部错误代码,参数: [kcratr1_lostwrt], [], [], [], [], [], [], [] ORA-600 signalled during: alter database open...
查询相关SCN
同一个查询中SCN相同,省略
SQL> select file#,online_status "STATUS",to_char(change#,'9999999999999999') "SCN",
2 To_char(time,'yyyy-mm-dd hh24:mi:ss')"TIME" from v$recover_file;
未选定行
SQL> select file#,to_char(checkpoint_change#,'999999999999999') "SCN",
2 to_char(last_change#,'999999999999999')"STOP_SCN" from v$datafile;
FILE# SCN STOP_SCN
---------- ---------------- ----------------
1 7842987188
2 7842987188
3 7842987188
SQL> select file#,to_char(checkpoint_change#,'9999999999999999') "SCN",
2 to_char(RESETLOGS_CHANGE#,'9999999999999999') "RESETLOGS SCN"
3 from v$datafile_header;
FILE# SCN RESETLOGS SCN
---------- ----------------- -----------------
1 7842991811 1
2 7842991811 1
3 7842991811 1
这里看到奇怪现象datafile scn小于datafile_header scn,数据库异常断电一般来说也不会出现这样的情况,个人猜测是错误的恢复或者使用历史控制文件导致,对于这样的现状,我先尝试着使用using backup controlfile方式恢复,结果失败.估计控制文件有异常,本着先拉起库原则,重建控制文件.
进行完全恢复
SQL> recover database; ORA-00283: 恢复会话因错误而取消 ORA-00600: 内部错误代码,参数: [3020], [8388617], [1], [23403], [25], [112],[], [] ORA-10567: Redo is inconsistent with data block (file# 2, block# 9) ORA-10564: tablespace UNDOTBS1 ORA-01110: 数据文件 2: 'D:\ORACLE\ORADATA\CQGASOLD\UNDO_1.DBF' ORA-10560: block type 'KTU SMU HEADER BLOCK'
尝试跳过坏块继续恢复
SQL> recover database allow 1 corruption; ORA-00283: 恢复会话因错误而取消 ORA-00600: 内部错误代码,参数: [3020], [8388610], [1], [23403], [2264], [16],[], [] ORA-10567: Redo is inconsistent with data block (file# 2, block# 2) ORA-10564: tablespace UNDOTBS1 ORA-01110: 数据文件 2: 'D:\ORACLE\ORADATA\CQGASOLD\UNDO_1.DBF' ORA-10560: block type 'KTFB Bitmapped File Space Header'
使用dbv检查坏块数量
C:\>dbv file='d:\oracle\oradata\cqgasold\undo_1.dbf' blocksize=8192 DBVERIFY: Release 9.2.0.5.0 - Production on 星期二 5月 15 19:43:42 2012 Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved. DBVERIFY - 验证正在开始 : FILE = d:\oracle\oradata\cqgasold\undo_1.dbf DBV-00200: 块, dba 8388617, 已经标记为崩溃 汇入的页369 - 可能是介质损坏 *** Corrupt block relative dba: 0x00800171 (file 2, block 369) Fractured block found during dbv: Data in bad block - type: 2 format: 2 rdba: 0x00800171 last change scn: 0x0001.d37c57db seq: 0x1 flg: 0x04 consistency value in tail: 0x4e930260 check value in block header: 0x8202, computed block checksum: 0x4e5f spare1: 0x0, spare2: 0x0, spare3: 0x0 *** 汇入的页417 - 可能是介质损坏 *** Corrupt block relative dba: 0x008001a1 (file 2, block 417) Fractured block found during dbv: Data in bad block - type: 2 format: 2 rdba: 0x008001a1 last change scn: 0x0001.d37c53d4 seq: 0x2 flg: 0x04 consistency value in tail: 0x4b6b0201 check value in block header: 0x6ae7, computed block checksum: 0x5abc spare1: 0x0, spare2: 0x0, spare3: 0x0 *** ………… --类此记录很多,我放弃了跳过坏块修复的方法
恢复过程中提示坏块数据库文件离线恢复
SQL> alter database datafile 'd:\oracle\oradata\cqgasold\undo_1.dbf' offline; 数据库已更改。 SQL> recover database; 完成介质恢复。 SQL> alter database open; alter database open * ERROR 位于第 1 行: ORA-00604: 递归 SQL 层 1 出现错误 ORA-00376: 此时无法读取文件 2 ORA-01110: 数据文件 2: 'D:\ORACLE\ORADATA\CQGASOLD\UNDO_1.DBF'
到了这一步,根据经验,数据库被open的可能性很多了,很有可能是open以后因为smon回滚导致数据库down
查看日志,屏蔽回滚段,完成恢复
Tue May 15 19:59:52 2012 alter database open Tue May 15 19:59:52 2012 Beginning crash recovery of 1 threads Tue May 15 19:59:52 2012 Started redo scan Tue May 15 19:59:52 2012 Completed redo scan 323 redo blocks read, 82 data blocks need recovery Tue May 15 19:59:52 2012 Started recovery at Thread 1: logseq 23404, block 3, scn 0.0 Recovery of Online Redo Log: Thread 1 Group 4 Seq 23404 Reading mem 0 Mem# 0 errs 0: F:\ORACLE\ORADATA\LOGCQGASOLD4.ORA Tue May 15 19:59:52 2012 Completed redo application Tue May 15 19:59:52 2012 Ended recovery at Thread 1: logseq 23404, block 326, scn 1.3548264979 82 data blocks read, 82 data blocks written, 323 redo blocks read Crash recovery completed successfully Tue May 15 19:59:53 2012 Thread 1 advanced to log sequence 23405 Thread 1 opened at log sequence 23405 Current log# 2 seq# 23405 mem# 0: D:\ORACLE\ORADATA\CQGASOLD\REDO02.LOG Successful open of redo thread 1 Tue May 15 19:59:53 2012 SMON: enabling cache recovery SMON: enabling tx recovery Tue May 15 19:59:54 2012 Database Characterset is ZHS16GBK Tue May 15 19:59:55 2012 replication_dependency_tracking turned off (no async multimaster replication found) ORA-604 signalled during: alter database open... Tue May 15 19:59:56 2012 SMON: about to recover undo segment 1 SMON: mark undo segment 1 as needs recovery SMON: about to recover undo segment 2 SMON: mark undo segment 2 as needs recovery SMON: about to recover undo segment 3 SMON: mark undo segment 3 as needs recovery SMON: about to recover undo segment 4 SMON: mark undo segment 4 as needs recovery SMON: about to recover undo segment 5 SMON: mark undo segment 5 as needs recovery SMON: about to recover undo segment 6 SMON: mark undo segment 6 as needs recovery SMON: about to recover undo segment 7 SMON: mark undo segment 7 as needs recovery SMON: about to recover undo segment 8 SMON: mark undo segment 8 as needs recovery SMON: about to recover undo segment 9 SMON: mark undo segment 9 as needs recovery SMON: about to recover undo segment 10 SMON: mark undo segment 10 as needs recovery Tue May 15 20:00:37 2012 Shutting down instance (abort)
看到这里,可以大概确定是因为undo文件离线,导致回滚段异常.
这个问题,基本上可以确定通过隐含参数屏蔽回滚段,然后open数据库,重建undo删除异常undo,数据库恢复完成。
ORA-600 [3020] “Stuck Recovery” [ID 30866.1]
Applies to: Oracle Server - Enterprise Edition - Version 9.2.0.1 to 11.2.0.2 [Release 9.2 to 11.2] Information in this document applies to any platform. Symptoms Recovery session fails: SQL> recover database; ORA-00283: recovery session canceled due to errors ORA-00600: internal error code, arguments: [3020], [13204236], [1], [1], [33082], [236], [], [] ORA-10567: Redo is inconsistent with data block (file# 3, block# 621324) ORA-10564: tablespace DATA ORA-01110: data file 3: '<dir>/PROD_Data01.dbf' ORA-10561: block type 'TRANSACTION MANAGED DATA BLOCK', data object# 25535 Cause This is stuck recovery - redo cannot be applied to a block because the scn of the block, when read from disk is NOT the scn that we expect it to be. Solution If the affected file belongs to SYSTEM or UNDO tablespace you should restore the database from backup and do point in time recovery to just before the problem log is applied. If you do not have a backup available, than the options are VERY limited. Please open a Service Request with Oracle Support Services for assistance on this. If the affected file belongs to SYSAUX, this is the ONLY file affected and it is offline then a better option than point in time recovery of the whole database would be to create a NEW database and then use Transportable Tablespace feature to plug in all other tablespaces. Another point worth noting: if SYSAUX is ONLINE but contains corrupt blocks, then before considering restore and recovery, use the SQL in point 3 below to identify the affected object and raise a call with Oracle Support Services to ask if it is possible to drop and recreate the object (unpublished Note 333665.1). For Data Guard environments, refer to Note:1265884.1 Otherwise: 1. Use Trial Recovery to determine the extent of the problem: SQL> recover database test; This will tell you how many blocks (n) would be left corrupted after recovery - check the alert log for details of the blocks affected. If there are a large number of corruptions reported you may decide to restore from backup and issue point in time recovery. However, if only a few blocks are reported as corrupt you could proceed with recovery : 2. Skip the corrupted block(s) SQL> recover database allow 1 corruption; Do this <n> times, <n> being the number of blocks reported as corrupt in step 1 above. This will allow recovery to continue, 'skipping' the blocks that cannot be recovered and leaving them marked as 'corrupt' after which the database can be opened. 3. Take the corrupt blocks reported in the alert log and for each, identify the object that the block belongs to: SQL> SELECT tablespace_name, segment_type, owner, segment_name FROM dba_extents WHERE file_id = <file#> and <block> between block_id AND block_id + blocks - 1; Replacing <file#> and <block> with the file# and block ids reported in the alert log. For each object identified - take steps to resolve the corruption: - if it belongs to an index then simply drop and recreate the index - if it belong to a user object then consider recreating the object or extracting what you can from the object; if necessary raise a Service Request with Oracle and request assistance with extracting data from a corrupt object. Additional related documents : Note 1265884.1 Resolving ORA-752 or ORA-600 [3020] During Standby Recovery Note 283269.1 Stuck recovery of database ORA-00600[3020]600[3020]恢复过程alert日志记录