难见的oracle 9i恢复—2023年

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:难见的oracle 9i恢复—2023年

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

时过境迁,以前恢复大量oracle 8/9版本的库,现在一套oracle 9i的库都比较稀奇了.今天恢复客户一套9.2.0.6的aix环境rac库,通过分析确认主要问题:
1. 重建控制文件,resetlogs库遗漏数据文件
missing_dbf


2. 数据库启动主要报错ORA-600 2663和ORA-600 kclchkblk_4

Tue Nov  8 09:10:05 2022
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
Tablespace 'TEMP' #2 found in data dictionary,
but not in the controlfile. Adding to controlfile.
File #84 found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00084' in the controlfile.
This file can no longer be recovered so it must be dropped.
Dictionary check complete
Tue Nov  8 09:10:05 2022
SMON: enabling tx recovery
Tue Nov  8 09:10:05 2022
Database Characterset is ZHS16GBK
Tue Nov  8 09:10:05 2022
Errors in file /u01/prod/proddb/9.2.0/admin/udump/prod1_ora_536662.trc:
ORA-00600: internal error code, arguments: [2663], [3301], [2638369768], [3301], [2640322622], [], [], []
Tue Nov  8 09:10:06 2022
Errors in file /u01/prod/proddb/9.2.0/admin/bdump/prod1_smon_647352.trc:
ORA-00600: internal error code, arguments: [kclchkblk_4], [3301], [18446744072061740072],[3301],[18446744072052954088]
Tue Nov  8 09:10:06 2022
Errors in file /u01/prod/proddb/9.2.0/admin/udump/prod1_ora_536662.trc:
ORA-00600: internal error code, arguments: [2663], [3301], [2638369768], [3301], [2640322622], [], [], []
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 536662
ORA-1092 signalled during: alter database open...

根据客户文件名称的规则,推算出来84号文件实际的文件名(因为使用的是lv[aix的hacmp管理的lv的裸设备方式]),通过dbv确认文件无坏块

DBVERIFY: Release 9.2.0.6.0 - Production on Sat May 13 16:44:09 2023

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

DBVERIFY - Verification starting : FILE = /dev/ra_txn_ind12.dbf


DBVERIFY - Verification complete

Total Pages Examined         : 256000
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 299
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 13
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 255688
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 11177081099136 (2602.1576194944)

bbed验证文件该文件是否是84号文件

$ bbed blocksize=8192 filename='/dev/ra_txn_ind12.dbf'   
Password: 

BBED: Release 2.0.0.0.0 - Limited Production on Mon May 15 09:45:44 2023

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

************* !!! For Oracle Internal Use only !!! ***************

BBED> map
 File: /dev/ra_txn_ind12.dbf (0)
 Block: 1                                     Dba:0x00000000
------------------------------------------------------------
 Data File Header

 struct kcvfh, 608 bytes                    @0       

 ub4 tailchk                                @8188    


BBED> p kcvfh
struct kcvfh, 608 bytes                     @0       
   struct kcvfhbfh, 20 bytes                @0       
      ub1 type_kcbh                         @0        0x0b
      ub1 frmt_kcbh                         @1        0x02
      ub1 spare1_kcbh                       @2        0x00
      ub1 spare2_kcbh                       @3        0x00
      ub4 rdba_kcbh                         @4        0x15000001
      ub4 bas_kcbh                          @8        0x00000000
      ub2 wrp_kcbh                          @12       0x0000
      ub1 seq_kcbh                          @14       0x01
      ub1 flg_kcbh                          @15       0x04 (KCBHFCKV)
      ub2 chkval_kcbh                       @16       0x1b4a
      ub2 spare3_kcbh                       @18       0x0000
   struct kcvfhhdr, 76 bytes                @20      
      ub4 kccfhswv                          @20       0x09200000
      ub4 kccfhcvn                          @24       0x08000000
      ub4 kccfhdbi                          @28       0x05d15ccf
      ……
      ub4 kccfhcsq                          @40       0x00525a20
      ub4 kccfhfsz                          @44       0x0003e800
      s_blkz kccfhbsz                       @48       0x00
      ub2 kccfhfno                          @52       0x0054
      ub2 kccfhtyp                          @54       0x0003
   ……
   ub4 kcvfhrfn                             @528      0x00000054  ---确认是84号文件
  ……

通过bbed修改文件相关信息,然后尝试rename文件,但是recover datafile 84报错

Mon May 15 09:49:44 2023
alter database rename file '/u01/prod/proddb/9.2.0/dbs/MISSING00084' to '/dev/ra_txn_ind12.dbf'
Mon May 15 09:49:44 2023
Completed: alter database rename file '/u01/prod/proddb/9.2.0
Mon May 15 09:51:15 2023
ALTER DATABASE RECOVER  datafile 84  
Media Recovery Start
Mon May 15 09:51:15 2023
Errors in file /u01/prod/proddb/9.2.0/admin/udump/prod1_ora_467190.trc:
ORA-07445: exception encountered: core dump [] [] [] [] [] []

通过处理之后,数据库recover 正常,但是open报ORA-600 4193错误

Mon May 15 09:57:53 2023
ALTER DATABASE RECOVER  DATABASE  
Media Recovery Start
Mon May 15 09:57:53 2023
Recovery of Online Redo Log: Thread 1 Group 1 Seq 4 Reading mem 0
  Mem# 0 errs 0: /dev/rlog01a.dbf
  Mem# 1 errs 0: /dev/rlog01b.dbf
Media Recovery Complete
Completed: ALTER DATABASE RECOVER  DATABASE  
Mon May 15 09:59:24 2023
alter database open
Mon May 15 09:59:24 2023
Beginning crash recovery of 1 threads
Mon May 15 09:59:24 2023
Started redo scan
Mon May 15 09:59:24 2023
Completed redo scan
 75 redo blocks read, 0 data blocks need recovery
Mon May 15 09:59:24 2023
Started recovery at
 Thread 1: logseq 4, block 2, scn 3301.2638369687
Mon May 15 09:59:24 2023
Recovery of Online Redo Log: Thread 1 Group 1 Seq 4 Reading mem 0
  Mem# 0 errs 0: /dev/rlog01a.dbf
  Mem# 1 errs 0: /dev/rlog01b.dbf
Mon May 15 09:59:24 2023
Completed redo application
Mon May 15 09:59:24 2023
Ended recovery at
 Thread 1: logseq 4, block 77, scn 3301.2638389765
 0 data blocks read, 0 data blocks written, 75 redo blocks read
Crash recovery completed successfully
Mon May 15 09:59:25 2023
Thread 1 advanced to log sequence 5
Thread 1 opened at log sequence 5
  Current log# 2 seq# 5 mem# 0: /dev/rlog02a.dbf
  Current log# 2 seq# 5 mem# 1: /dev/rlog02b.dbf
Successful open of redo thread 1
Mon May 15 09:59:25 2023
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon May 15 09:59:25 2023
SMON: enabling cache recovery
Mon May 15 09:59:25 2023
ARC0: Media recovery disabled
Mon May 15 09:59:25 2023
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
Tablespace 'TEMP' #2 found in data dictionary,
but not in the controlfile. Adding to controlfile.
Dictionary check complete
Mon May 15 09:59:25 2023
SMON: enabling tx recovery
Mon May 15 09:59:25 2023
Database Characterset is ZHS16GBK
Mon May 15 09:59:25 2023
Errors in file /u01/prod/proddb/9.2.0/admin/bdump/prod1_smon_413872.trc:
ORA-00600: internal error code, arguments: [4193], [781], [6399], [], [], [], [], []
Mon May 15 09:59:25 2023
Errors in file /u01/prod/proddb/9.2.0/admin/udump/prod1_ora_844004.trc:
ORA-00600: internal error code, arguments: [4193], [56042], [1895], [], [], [], [], []
Mon May 15 09:59:26 2023
Doing block recovery for fno: 12 blk: 153
Mon May 15 09:59:26 2023
Doing block recovery for fno: 12 blk: 2893
Mon May 15 09:59:26 2023
Recovery of Online Redo Log: Thread 1 Group 2 Seq 5 Reading mem 0
  Mem# 0 errs 0: /dev/rlog02a.dbf
Mon May 15 09:59:26 2023
Recovery of Online Redo Log: Thread 1 Group 2 Seq 5 Reading mem 0
Mon May 15 09:59:26 2023
  Mem# 1 errs 0: /dev/rlog02b.dbf
Mon May 15 09:59:26 2023
  Mem# 0 errs 0: /dev/rlog02a.dbf
  Mem# 1 errs 0: /dev/rlog02b.dbf
Doing block recovery for fno: 12 blk: 3009
Mon May 15 09:59:26 2023
Recovery of Online Redo Log: Thread 1 Group 2 Seq 5 Reading mem 0
  Mem# 0 errs 0: /dev/rlog02a.dbf
  Mem# 1 errs 0: /dev/rlog02b.dbf
Mon May 15 09:59:26 2023
Doing block recovery for fno: 12 blk: 89
Mon May 15 09:59:26 2023
Recovery of Online Redo Log: Thread 1 Group 2 Seq 5 Reading mem 0
  Mem# 0 errs 0: /dev/rlog02a.dbf
  Mem# 1 errs 0: /dev/rlog02b.dbf
Mon May 15 09:59:26 2023
Errors in file /u01/prod/proddb/9.2.0/admin/udump/prod1_ora_844004.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4193], [56042], [1895], [], [], [], [], []
Error 607 happened during db open, shutting down database
USER: terminating instance due to error 607
Instance terminated by USER, pid = 844004
ORA-1092 signalled during: alter database open...

绕过该错误之后,数据库启动报ORA-600 2662错误

$ sqlplus "/ as sysdba"

SQL*Plus: Release 9.2.0.6.0 - Production on Mon May 15 10:04:44 2023

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

Connected to an idle instance.

SQL> startup mount pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area 1102023336 bytes
Fixed Size                   744104 bytes
Variable Size             922746880 bytes
Database Buffers          167772160 bytes
Redo Buffers               10760192 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Mon May 15 10:05:03 2023
SMON: enabling cache recovery
Mon May 15 10:05:03 2023
ARC0: Media recovery disabled
Mon May 15 10:05:03 2023
SMON: enabling tx recovery
Mon May 15 10:05:03 2023
Database Characterset is ZHS16GBK
Mon May 15 10:05:03 2023
Errors in file /u01/prod/proddb/9.2.0/admin/bdump/prod1_smon_413880.trc:
ORA-00600: internal error code, arguments: [2662], [3301], [2638409995], [3301], [2644132966], [4195678]
Mon May 15 10:05:04 2023
Non-fatal internal error happenned while SMON was doing temporary segment drop.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Mon May 15 10:05:04 2023
Errors in file /u01/prod/proddb/9.2.0/admin/bdump/prod1_smon_413880.trc:
ORA-00600: internal error code, arguments: [2662], [3301], [2638409998], [3301], [2644132966], [4195678]
Mon May 15 10:05:04 2023
Errors in file /u01/prod/proddb/9.2.0/admin/bdump/prod1_smon_413880.trc:
ORA-00600: internal error code, arguments: [2662], [3301], [2638409998], [3301], [2644132966], [4195678]
SMON: terminating instance due to error 600
Instance terminated by SMON, pid = 413880

解决该错误之后,数据库open正常

$ sqlplus "/ as sysdba"

SQL*Plus: Release 9.2.0.6.0 - Production on Mon May 15 10:10:30 2023

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

Connected to an idle instance.

SQL> startup mount pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area 1102023336 bytes
Fixed Size                   744104 bytes
Variable Size             922746880 bytes
Database Buffers          167772160 bytes
Redo Buffers               10760192 bytes
Database mounted.
SQL> alter database open;

Database altered.

逻辑方式导出数据,本次恢复任务基本完成.
以前有过的类似恢复案例(类似较多选择典型几个):
ORA-600 2663
ORA-600 2663 故障恢复
ORA-600 2662
ora-600 2662和ora-600 kclchkblk_4恢复
redo异常 ORA-600 kclchkblk_4 故障恢复
ORA-600 4193 错误说明和解决
ORA-00600 [2662]和ORA-00600 [4194]恢复

存储重启,oracle无法启动故障处理

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:存储重启,oracle无法启动故障处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有客户由于机房要停电,正常关闭两个节点数据库,通过数据库alert日志均可看到类似如下记录,证明数据库确实是正常shutdown immediate
1
2
然后关闭存储,启动存储之后发现数据库无法正常启动(数据scn不一致).相关信息如下:
20220707175525


最初报ORA-214错

that ORACLE_BASE be set in the environment
Wed Jul 06 00:50:02 2022
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1521))' SCOPE=MEMORY SID='xffdb2';
ALTER DATABASE MOUNT /* db agent *//* {1:42392:203} */
This instance was first to mount
NOTE: Loaded library: System 
SUCCESS: diskgroup DATA1 was mounted
SUCCESS: diskgroup DATA2 was mounted
ORA-214 signalled during: ALTER DATABASE MOUNT /* db agent *//* {1:42392:203} */...
NOTE: dependency between database xffdb and diskgroup resource ora.DATA1.dg is established
NOTE: dependency between database xffdb and diskgroup resource ora.DATA2.dg is established

提示ctl不存在,通过处理之后报ORA-600 2131错误

Wed Jul 06 01:55:45 2022
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=10.10.10.10)(PORT=1521))' SCOPE=MEMORY SID='xffdb2';
ALTER DATABASE MOUNT /* db agent *//* {1:42392:663} */
This instance was first to mount
NOTE: Loaded library: System 
SUCCESS: diskgroup DATA1 was mounted
SUCCESS: diskgroup DATA2 was mounted
NOTE: dependency between database xffdb and diskgroup resource ora.DATA1.dg is established
NOTE: dependency between database xffdb and diskgroup resource ora.DATA2.dg is established
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb2/trace/xffdb2_ora_47746.trc  (incident=576488):
ORA-00600: internal error code, arguments: [2131], [33], [32], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/xffdb/xffdb2/incident/incdir_576488/xffdb2_ora_47746_i576488.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORA-600 signalled during: ALTER DATABASE MOUNT /* db agent *//* {1:42392:663} */...

重建控制文件后恢复报错

Parallel Media Recovery started with 127 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER  database using BACKUP CONTROLFILE  ...
Wed Jul 06 02:41:04 2022
ALTER DATABASE RECOVER    LOGFILE '+DATA3/xffdb/archivelog/2022_07_05/thread_2_seq_40889.18030.1109269215'  
Media Recovery Log +DATA3/xffdb/archivelog/2022_07_05/thread_2_seq_40889.18030.1109269215
Wed Jul 06 02:41:04 2022
Errors with log +DATA3/xffdb/archivelog/2022_07_05/thread_2_seq_40889.18030.1109269215
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_pr00_96503.trc:
ORA-00325: archived log for thread 1, wrong thread # 2 in header
ORA-00334: archived log: '+DATA3/xffdb/archivelog/2022_07_05/thread_2_seq_40889.18030.1109269215'
ORA-325 signalled during: ALTER DATABASE RECOVER    LOGFILE '+DATA3/thread_2_seq_40889.18030.1109269215'  ...
ALTER DATABASE RECOVER CANCEL 
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL 
…………
Wed Jul 06 02:22:25 2022
ALTER DATABASE RECOVER  DATABASE  
Media Recovery Start
 started logmerger process
Only allocated 127 recovery slaves (requested 128)
Parallel Media Recovery started with 127 slaves
Wed Jul 06 02:22:28 2022
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_pr00_77044.trc:
ORA-00313: open failed for members of log group 7 of thread 1
Media Recovery failed with error 313
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_pr00_77044.trc:
ORA-00283: recovery session canceled due to errors
ORA-00313: open failed for members of log group 7 of thread 1
Wed Jul 06 02:22:28 2022
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_m000_77318.trc:
ORA-00322: log 4 of thread 2 is not current copy
ORA-00312: online log 4 thread 2: '+DATA3/xffdb/onlinelog/group_4.16148.1107795635'
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_m000_77318.trc:
ORA-00322: log 7 of thread 1 is not current copy
ORA-00312: online log 7 thread 1: '+DATA3/xffdb/onlinelog/group_7.18959.1107796013'
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_m000_77318.trc:
ORA-00314: log 9 of thread 1, expected sequence# 133495 doesn't match 133490
ORA-00312: online log 9 thread 1: '+DATA3/xffdb/onlinelog/group_9.3142.1107796071'
Checker run found 208 new persistent data failures
ORA-10877 signalled during: ALTER DATABASE RECOVER  DATABASE  ...
…………
Only allocated 127 recovery slaves (requested 128)
Parallel Media Recovery started with 127 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile  ...
Wed Jul 06 06:15:26 2022
ALTER DATABASE RECOVER    LOGFILE '+DATA3/xffdb/onlinelog/group_4.16442.1107795653'  
Media Recovery Log +DATA3/xffdb/onlinelog/group_4.16442.1107795653
ORA-279 signalled during: ALTER DATABASE RECOVER    LOGFILE '+DATA3/xffdb/onlinelog/group_4.16442.1107795653'  ...
Wed Jul 06 06:15:43 2022
ALTER DATABASE RECOVER    LOGFILE '+DATA3/xffdb/onlinelog/group_7.18959.1107796013'  
Media Recovery Log +DATA3/xffdb/onlinelog/group_7.18959.1107796013
Wed Jul 06 06:15:50 2022
Errors with log +DATA3/xffdb/onlinelog/group_7.18959.1107796013
Wed Jul 06 06:15:50 2022
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_pr29_306479.trc  (incident=961030):
ORA-00600: internal error code, arguments: [6102], [13], [17], [], [], [], [], [], [], [], [], []
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Jul 06 06:15:54 2022
Sweep [inc][961030]: completed
Sweep [inc2][961030]: completed
Slave exiting with ORA-10562 exception
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_pr29_306479.trc:
ORA-10562: Error occurred while applying redo to data block (file# 159, block# 3591756)
ORA-10564: tablespace LIS
ORA-01110: data file 159: '+DATA1/xffdb/datafile/lis.379.1080445903'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 138875
ORA-00600: internal error code, arguments: [6102], [13], [17], [], [], [], [], [], [], [], [], []
Wed Jul 06 06:15:59 2022
Recovery Slave PR29 previously exited with exception 10562

基于上述情况,很可能是由于存储重启之后,cache或者某些数据没有写入到数据文件和redo中,数据库重启之后redo不是最新的[ORA-00322错误可以证明,],数据文件也需要进行恢复(不是数据库正常关闭之后该有的情况),而且redo和数据文件还不一致[ORA-00600 6102可以证明],对于类似这样的情况,只能尝试强制打开数据库,报ORA-600 2663

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2663], [1393], [4159455578],
[1393], [4160374753], [], [], [], [], [], [], []
Process ID: 357910
Session ID: 1585 Serial number: 7
Wed Jul 06 06:57:25 2022
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_ora_357910.trc  (incident=1056360):
ORA-00600: internal error code, arguments: [2663], [1393],[4159455578],[1393],[4160374753],[], [], [], []
Redo thread 2 internally disabled at seq 1 (CKPT)
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_arc3_360348.trc:
ORA-00600: internal error code, arguments: [ORA_NPI_ERROR],[600], 
  [ORA-00600: internal error code, arguments: [kffbAddBlk04]
Unable to create archive log file '+DATA3'
ARC3: Error 19504 Creating archive log file to '+DATA3'
ARCH: Archival error occurred on a closed thread. Archiver continuing
ORACLE Instance xffdb1 - Archival Error. Archiver continuing.
ARCH: Archival error occurred on a closed thread. Archiver continuing
ORACLE Instance xffdb1 - Archival Error. Archiver continuing.
Wed Jul 06 06:57:34 2022
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/incident/incdir_1056360/xffdb1_ora_357910_i1056360.trc:
ORA-00339: archived log does not contain any redo
ORA-00334: archived log: '+DATA1/xffdb/onlinelog/group_4.424.1109314453'
ORA-00600: internal error code, arguments: [2663], [1393], [4159455578], [1393], [4160374753], [], [],
Wed Jul 06 06:57:34 2022
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_ora_357910.trc:
ORA-00600: internal error code, arguments: [2663], [1393], [4159455578], [1393], [4160374753], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/xffdb/xffdb1/trace/xffdb1_ora_357910.trc:
ORA-00600: internal error code, arguments: [2663], [1393], [4159455578], [1393], [4160374753], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 357910): terminating the instance due to error 600
Instance terminated by USER, pid = 357910
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (357910) as a result of ORA-1092
Wed Jul 06 06:57:35 2022
ORA-1092 : opitsk aborting process

该错误比较常见,参考:ORA-600 2663,也可以利用我的Patch_SCN小工具快速解决,后续数据库报ORA-03113错

SQL> alter database open ;
alter database open 
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 369324
Session ID: 1585 Serial number: 1

查看alert日志,确认具体报错为kgegpa

Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Jul 06 07:17:08 2022
SMON: enabling cache recovery
ARC1: Archiving disabled thread 2 sequence 1
Archived Log entry 1 added for thread 1 sequence 1 ID 0x36317f52 dest 1:
Archived Log entry 2 added for thread 1 sequence 2 ID 0x36317f52 dest 1:
Archived Log entry 3 added for thread 2 sequence 1 ID 0x0 dest 1:
Exception [type:SIGSEGV, Address not mapped to object][ADDR:0x4D562123][PC:0x983CDD6,kgegpa()+40][flags: 0x0,count:1]
Exception [type:SIGSEGV, Address not mapped to object][ADDR:0x4D562123][PC:0x983B84A, kgebse()+776][flags: 0x2,count:2]
Exception [type:SIGSEGV, Address not mapped to object][ADDR:0x4D562123][PC:0x983B84A, kgebse()+776][flags: 0x2,count:2]
Wed Jul 06 07:17:11 2022
PMON (ospid: 377647): terminating the instance due to error 397

该问题有过类似的案例通过处理数据库open成功:
在数据库恢复遭遇ORA-07445 kgegpa错误
Exception [type: SIGSEGV, Address not mapped to object] [] [ kgegpa()+36]

ORA-600 3020错误引起ORA-600 2663

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-600 3020错误引起ORA-600 2663

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库recover异常ORA-600 3020

SQL> recover database using backup controlfile until cancel;
ORA-00279: change 5693717234723 generated at 01/19/2021 10:44:52 needed for
thread 1
ORA-00289: suggestion : +RECOVER/arch/1_294845_938895110.dbf
ORA-00280: change 5693717234723 for thread 1 is in sequence #294845


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
+BACKUP/xifenfei/onlinelog/group_5.258.973180257
ORA-00279: change 5693717234723 generated at 01/15/2021 11:41:15 needed for
thread 2
ORA-00289: suggestion : +RECOVER/arch/2_336576_938895110.dbf
ORA-00280: change 5693717234723 for thread 2 is in sequence #336576


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
+DATA1/xifenfei/onlinelog/group_8.298.962885887
ORA-00600: internal error code, arguments: [3020], [128], [248606],
[537119518], [], [], [], [], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 128, block# 248606, file
offset is 2036580352 bytes)
ORA-10564: tablespace UNDOTBS1
ORA-01110: data file 128: '+DATA1/xifenfei/datafile/undotbs1_02.dbf'
ORA-10560: block type 'KTU UNDO BLOCK'


ORA-01112: media recovery not started

这个错误比较简单,一般是允许坏块继续恢复

SQL> recover database using backup controlfile allow 1 corruption;
ORA-00279: change 5693717234839 generated at 01/19/2021 10:44:52 needed for
thread 1
ORA-00289: suggestion : +RECOVER/arch/1_294845_938895110.dbf
ORA-00280: change 5693717234839 for thread 1 is in sequence #294845


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
+BACKUP/xifenfei/onlinelog/group_5.258.973180257
ORA-00279: change 5693717234839 generated at 01/15/2021 11:41:15 needed for
thread 2
ORA-00289: suggestion : +RECOVER/arch/2_336576_938895110.dbf
ORA-00280: change 5693717234839 for thread 2 is in sequence #336576


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
+DATA1/xifenfei/onlinelog/group_8.298.962885887
ORA-00279: change 5693717637654 generated at 01/19/2021 10:47:25 needed for
thread 1
ORA-00289: suggestion : +RECOVER/arch/1_294846_938895110.dbf
ORA-00280: change 5693717637654 for thread 1 is in sequence #294846
ORA-00278: log file '+BACKUP/xifenfei/onlinelog/group_5.258.973180257' no longer
needed for this recovery


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
+RECOVER/xifenfei/onlinelog/group_3.258.973180321
ORA-00279: change 5693717705759 generated at 01/19/2021 10:48:07 needed for
thread 1
ORA-00289: suggestion : +RECOVER/arch/1_294847_938895110.dbf
ORA-00280: change 5693717705759 for thread 1 is in sequence #294847
ORA-00278: log file '+RECOVER/xifenfei/onlinelog/group_3.258.973180321' no
longer needed for this recovery


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
+BACKUP/xifenfei/onlinelog/group_7.265.973181365
Log applied.
Media recovery complete.

后续重建ctl,尝试recover库,报ORA-10877错误

SQL> startup mount pfile='/tmp/pfile'
ORACLE instance started.

Total System Global Area 1.0088E+10 bytes
Fixed Size		    2261928 bytes
Variable Size		 2181041240 bytes
Database Buffers	 7851737088 bytes
Redo Buffers		   53149696 bytes
Database mounted.
SQL> recover database;
ORA-10877: error signaled in parallel recovery slave


--对应的alert日志
Wed Jan 20 13:34:04 2021
ALTER DATABASE RECOVER  database  
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 64 slaves
Wed Jan 20 13:34:06 2021
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr00_50593.trc:
ORA-00313: open failed for members of log group 7 of thread 1
Media Recovery failed with error 313
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_pr00_50593.trc:
ORA-00283: recovery session canceled due to errors
ORA-00313: open failed for members of log group 7 of thread 1
ORA-10877 signalled during: ALTER DATABASE RECOVER  database  ...

resetlogs失败open数据库失败,ORA-600 2663

Wed Jan 20 13:42:34 2021
Setting recovery target incarnation to 2
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Warning - High Database SCN: Current SCN value is 5693718057561, threshold SCN value is 0
If you have not previously reported this warning on this database, please notify Oracle Support so that additional diagnosis can be performed.
Wed Jan 20 13:42:35 2021
Assigning activation ID 3801294256 (0xe29325b0)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: +RECOVER/xifenfei/onlinelog/group_1.260.973179783
  Current log# 1 seq# 1 mem# 1: +BACKUP/xifenfei/onlinelog/group_1.260.973179787
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Jan 20 13:42:35 2021
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_52800.trc  (incident=189187):
ORA-00600: internal error code, arguments: [2663], [1325], [2886390384], [1325], [2886403118], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_189187/xifenfei1_ora_52800_i189187.trc
Wed Jan 20 13:42:38 2021
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_52800.trc:
ORA-00600: internal error code, arguments: [2663], [1325], [2886390384], [1325], [2886403118], [], [], [], [], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_52800.trc:
ORA-00600: internal error code, arguments: [2663], [1325], [2886390384], [1325], [2886403118], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 52800): terminating the instance due to error 600

这个错误比较明显,由于scn的异常导致,通过调整scn,数据库正常open成功,然后使用hcheck检查数据库字典一致(运气不错),没有太大问题,后续建议客户进行逻辑迁移
20210121234330


ORA-600 2663

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-600 2663

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在大家熟悉的ORA-600 2662错误中还有一个类似的ORA-600 2663的错误,以下是对该2663的解释说明

ERROR:
  Format: ORA-600 [2663] [a] [b] {c} [d]
VERSIONS:
  versions 10.1 to 11.1
DESCRIPTION:
  A data block SCN is ahead of the current SCN.
  The ORA-600 [2663] occurs when an SCN is compared to the dependent SCN
  stored in a UGA variable.
  If the SCN is less than the dependent SCN then we signal the ORA-600 [2663]
  internal error.
ARGUMENTS:
  Arg [a]  Current SCN WRAP
  Arg [b]  Current SCN BASE
  Arg {c}  dependent SCN WRAP
  Arg [d]  dependent SCN BASE
FUNCTIONALITY:
  Kernel Cache Redo File Redo Generation
IMPACT:
  INSTANCE FAILURE
  POSSIBLE PHYSICAL CORRUPTION
SUGGESTIONS:
  There are different situations where ORA-600 [2663] can be raised.
  It can be raised on startup or duing database operation.
  If not using RAC, Real Application Clusters,, check that 2 instances
  have not mounted the same database.
  Check for SMON traces and have the alert.log and trace files ready
  to send to support.
  Check the SCN difference [argument d]-[argument b].
  If the SCNs in the error are very close, then try to shutdown and startup
  the instance several times.
  In some situations, the SCN increment during startup may permit the
  database to open. Keep track of the number of times you attempted a
  startup.

SYSTEM表空间坏块恢复—C_TS#对象坏块恢复(file 1 block 60)

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:SYSTEM表空间坏块恢复—C_TS#对象坏块恢复(file 1 block 60)

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一朋友给我电话,说他们客户公司数据库故障,被另外一家公司恢复了一天不能正常恢复,请求我协助解决.接手一看数据库已经被破坏的不像样子了,根据alert日志信息大概分析了故障原因和上家公司处理情况。后面接手后通过bbed修复block数据库恢复过程,在本次恢复中出现大量ORA-600错误,主要包括ORA-00600 400,ORA-00600 2662,ORA-00600 2663,ORA-00600 krhpfh_03-1209,ORA-00600 3600,ORA-00600 ktsitbs_info1,ORA-00600 4137,ORA-00600 4511,ORA-00600 4198,ORA-00600 6807等
故障原因redo文件丢失

Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
LGWR: terminating instance due to error 313
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_pmon_1394.trc:
ORA-00313: open failed for members of log group  of thread
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_ckpt_1406.trc:
ORA-00313: open failed for members of log group  of thread
Instance terminated by LGWR, pid = 1404

尝试clear redo文件方式恢复

Thu Nov 20 13:04:16 2014
alter database clear logfile group 9
Thu Nov 20 13:04:16 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:45 2014
alter database clear logfile group 9
Thu Nov 20 13:04:46 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:59 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:04:59 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...
Thu Nov 20 13:05:00 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:05:00 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...

不完全恢复resetlogs尝试打开数据库

ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile  ...
Thu Nov 20 13:49:01 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER CANCEL
Thu Nov 20 13:49:03 2014
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL
Thu Nov 20 13:49:33 2014
alter database open resetlogs
Thu Nov 20 13:49:34 2014
ORA-1113 signalled during: alter database open resetlogs...

使用隐含参数

_allow_resetlogs_corruption= TRUE

进行不完全恢复,尝试open数据库报ORA-600 4000错误

Thu Nov 20 14:35:02 2014
ALTER DATABASE   MOUNT
Thu Nov 20 14:35:07 2014
Setting recovery target incarnation to 2
Thu Nov 20 14:35:07 2014
Successful mount of redo thread 1, with mount id 4039504598
Thu Nov 20 14:35:07 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 14:40:33 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 14:40:33 2014
Media Recovery Start
Thu Nov 20 14:40:33 2014
Media Recovery failed with error 1610
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 14:41:23 2014
ALTER DATABASE RECOVER  database using backup controlfile until cancel
Thu Nov 20 14:43:08 2014
alter database open resetlogs
Thu Nov 20 14:43:08 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31293973571
Resetting resetlogs activation ID 3855216310 (0xe5c9eeb6)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Online log /data2/oradata/redo0902.log: Thread 1 Group 9 was previously cleared
Thu Nov 20 14:43:14 2014
Setting recovery target incarnation to 3
Thu Nov 20 14:43:14 2014
Assigning activation ID 4039504598 (0xf0c5f2d6)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 14:43:14 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 14:43:14 2014
SMON: enabling cache recovery
Thu Nov 20 14:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1844
ORA-1092 signalled during: alter database open resetlogs...

尝试隐含屏蔽回滚段

_corrupted_rollback_segments= _SYSSMU1$, _SYSSMU2$,…………

错误依旧ORA-600 4000

Thu Nov 20 15:09:21 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 15:09:21 2014
SMON: enabling cache recovery
Thu Nov 20 15:09:21 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 624
ORA-1092 signalled during: alter database open

多次重启,resetlogs后,数据库出现ORA-600 2662错误

Successful open of redo thread 1
Thu Nov 20 17:13:24 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 17:13:24 2014
SMON: enabling cache recovery
Thu Nov 20 17:13:24 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 7967
ORA-1092 signalled during: ALTER DATABASE OPEN...
Thu Nov 20 17:18:23 2014
USER: terminating instance due to error 1092
Instance terminated by USER, pid = 7967

offline undo相关文件,尝试打开数据库

Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 17:52:31 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 17:52:31 2014
Media Recovery Start
 parallel recovery started with 15 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 17:53:42 2014
ALTER DATABASE RECOVER CANCEL
Thu Nov 20 17:53:44 2014
ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL ...
Thu Nov 20 17:56:34 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:56:35 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:57:01 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:02 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:26 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:27 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:43 2014
alter database open resetlogs
Thu Nov 20 17:57:43 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:58:58 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:58:58 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:59:15 2014
alter database open resetlogs
Thu Nov 20 17:59:15 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:59:35 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:35 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:50 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 17:59:50 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 18:00:07 2014
alter database open resetlogs
Thu Nov 20 18:00:07 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31294173628
Resetting resetlogs activation ID 4039492628 (0xf0c5c414)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Thu Nov 20 18:00:14 2014
Setting recovery target incarnation to 8
Thu Nov 20 18:00:14 2014
Assigning activation ID 4039504142 (0xf0c5f10e)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 18:00:15 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 18:00:15 2014
SMON: enabling cache recovery
Thu Nov 20 18:00:15 2014
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
File #2 is offline, but is part of an online tablespace.
data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
File #100 is offline, but is part of an online tablespace.
data file 100: '/data2/oradata/undotbs02.dbf'
Thu Nov 20 18:00:28 2014
File #185 is offline, but is part of an online tablespace.
data file 185: '/data2/oradata/undotbs03.dbf'
Dictionary check complete
Thu Nov 20 18:00:35 2014
SMON: enabling tx recovery
Thu Nov 20 18:00:36 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_28472.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_28450.trc:
ORA-00604: error occurred at recursive SQL level
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_28446.trc:
ORA-00604: error occurred at recursive SQL level
Instance terminated by USER, pid = 28472
ORA-1092 signalled during: alter database open resetlogs...

不知道做了什么操作出现file 1 block 60坏块,很可能bbed修改错误导致

Thu Nov 20 19:18:15 2014
SMON: enabling cache recovery
Thu Nov 20 19:18:16 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Successfully onlined Undo Tablespace 1.
Thu Nov 20 19:18:16 2014
SMON: enabling tx recovery
Thu Nov 20 19:18:17 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 19:18:17 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 13232
ORA-1092 signalled during: alter database open...

尝试不完全恢复,并resetlogs操作

ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 19:33:41 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:33:41 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:33:42 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:34:06 2014
alter database open resetlogs
Thu Nov 20 19:34:06 2014
ORA-1139 signalled during: alter database open resetlogs...
Thu Nov 20 19:34:17 2014
alter database open
Thu Nov 20 19:34:17 2014
ORA-1190 signalled during: alter database open...
Thu Nov 20 19:35:57 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 19:35:57 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:35:58 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:35:59 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:37:19 2014
alter database open resetlogs
Thu Nov 20 19:37:19 2014
ORA-1139 signalled during: alter database open resetlogs...

继续打开报 ORA-600 3600错误

Thu Nov 20 19:43:14 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 19:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
DBW0: terminating instance due to error 471
Instance terminated by DBW0, pid = 20856

<strong>中间多次重启和resetlogs,还出现ORA-600 2663错误</strong>

Fri Nov 21 12:35:12 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Nov 21 12:35:12 2014
SMON: enabling cache recovery
Fri Nov 21 12:35:13 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mman_15572.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw1_15576.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Instance terminated by USER, pid = 15596
ORA-1092 signalled during: ALTER DATABASE OPEN..

继续尝试打开数据库出现ORA-600 ktsitbs_info1错误

SMON: enabling cache recovery
Fri Nov 21 13:54:25 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Fri Nov 21 13:54:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Fri Nov 21 13:54:27 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 21111
ORA-1092 signalled during: alter database open...

以上是客户数据库故障原因和问题大概的处理过程,下面是我接手后的处理过程


dbv 检查system01.dbf文件,得到结果

HNDX-DB% dbv file=/opt/oracle/oradata/xifenfei/system01.dbf
DBVERIFY: Release 10.2.0.1.0 - Production on Fri Nov 21 16:22:37 2014
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /opt/oracle/oradata/xifenfei/system01.dbf
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv:
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
DBVERIFY - Verification complete
Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 2
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 2
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)

这里知道数据库有两个坏块,而且根据对于bootstrap$的经验,可以大概确定60坏块很可能是C_TS#,第一反应type异常,498可能是seq$

对数据库启动过程做10046,得到trace文件

PARSING IN CURSOR #1 len=275 dep=2 uid=0 oct=3 lid=0 tim=27978051403575 hv=3408408745 ad='7df93cd0'
select name,online$,contents$,undofile#,undoblock#,blocksize,dflmaxext,dflinit,dflincr,dflextpct,dflminext,
dflminlen, owner#,scnwrp,scnbas, NVL(pitrscnwrp, 0), NVL(pitrscnbas, 0), dflogging, bitmapped, inc#, flags,
plugged, NVL(spare1,0), NVL(spare2,0) from ts$ where ts#=:1
END OF STMT
PARSE #1:c=0,e=92,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051403569
BINDS #1:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=ffffffff7dbac9a8  bln=22  avl=02  flg=05
  value=2
EXEC #1:c=0,e=310,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051404296
WAIT #1: nam='db file sequential read' ela= 42 file#=1 block#=60 blocks=1 obj#=-1 tim=27978051404449
Hex dump of (file 1, block 60)
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
FETCH #1:c=10000,e=4072,p=1,cr=2,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051408438
STAT #1 id=1 cnt=0 pid=0 pos=1 obj=16 op='TABLE ACCESS CLUSTER TS$ (cr=2 pr=1 pw=0 time=4075 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=7 op='INDEX UNIQUE SCAN I_TS# (cr=1 pr=0 pw=0 time=13 us)'
*** 2014-11-22 14:44:43.235
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Current SQL statement for this session:
select max(maxconcurrency) from sys.wrh$_undostat  where instance_number = :1 and dbid = :2
and snap_id in   (select snap_id from dba_hist_snapshot where end_interval_time >
(select max(end_interval_time)-7 from dba_hist_snapshot))

这里显示了数据库启动报ORA-00600[ktsitbs_info1],[2],明显的表示了b中的2是表示表空间号,由于ts$坏块,无法读取ts$中表空间信息,从而出现数据字典不一致,从而出现该错误。所以恢复该库的关键是修复file 1 block 60.

bbed尝试修复file 1 block 60

HNDX-DB% bbed password=blockedit mode=edit
BBED: Release 2.0.0.0.0 - Limited Production on Sat Nov 22 15:16:26 2014
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename '/opt/oracle/oradata/xifenfei/system01.dbf'
        FILENAME        /opt/oracle/oradata/xifenfei/system01.dbf
BBED> set block 8192
        BLOCK#          8192
BBED> set block 60
        BLOCK#          60
BBED> set count 64
        COUNT           64
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (128)
BBED> set block 61
        BLOCK#          61
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7959]                        @122
 ub1 rowdata[107]                           @8081
 ub4 tailchk                                @8188
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x0040003d
   ub4 bas_kcbh                             @8        0x0000235b
   ub2 wrp_kcbh                             @12       0x0000
   ub1 seq_kcbh                             @14       0x01
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x7a85
   ub2 spare3_kcbh                          @18       0x0000
BBED> set block 60
        BLOCK#          60
BBED> d
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 80000000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000
 <32 bytes per line>
BBED> d block 61
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003d 0000235b 00000104 7a850000 01000000 00000006 00001837
 00001738 00020200 00000000 0007002e 00000002 00800075 00012300 80000000
 <32 bytes per line>
BBED> set block 60
        BLOCK#          60
BBED> m /x 06a2
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000
 <32 bytes per line>
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7598]                        @122
 ub1 rowdata[468]                           @7720
 ub4 tailchk                                @8188
BBED> sum apply
Check value for File 0, Block 60:
current = 0xe908, required = 0xe908
BBED> verify
DBVERIFY - Verification starting
FILE = /opt/oracle/oradata/xifenfei/system01.dbf
BLOCK = 60
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
BBED>

尝试启动数据库

Sat Nov 22 15:51:33 2014
alter database open
Sat Nov 22 15:51:34 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 15:51:34 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 15:51:34 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 15:51:34 2014
Database Characterset is ZHS16CGB231280
Hex dump of (file 1, block 498) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_2818.trc
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
Reread of rdba: 0x004001f2 (file 1, block 498) found same corrupted data
Sat Nov 22 15:51:35 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_2803.trc:
ORA-00600: internal error code, arguments: [4000], [12], [], [], [], [], [], []
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=18, OS id=3000
Sat Nov 22 15:51:36 2014
Completed: alter database open
Sat Nov 22 15:51:36 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_3010.trc:
ORA-00600: internal error code, arguments: [6807], [AUDSES$], [144], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [6807], [WRI$_ALERT_SEQUENCE], [8783], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Non-fatal internal error happenned while SMON was doing non-existent object cleanup.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Sat Nov 22 15:51:38 2014
ORA-600 encountered when generating server alert SMG-3000
Sat Nov 22 15:51:38 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [ktcpoptx_0], [0x772705E60], [], [], [], [], [], []

只要出现ORA-600 4000和ORA-600 6807错误,其中ORA-600 6807错误比较明显是由于seq$坏块,导致AUDSES$ seq异常导致。ORA-600 4000应该是回滚段异常,继续分析回滚段

SQL> select name,ts#,status$ from undo$;
NAME                                  TS#    STATUS$
------------------------------ ---------- ----------
SYSTEM                                  0          2
_SYSSMU1$                               1          2
_SYSSMU2$                               1          2
_SYSSMU3$                               1          2
…………
_SYSSMU168$                             1          2
_SYSSMU169$                             1          2

这里很异常,system回滚段在数据库open之后,按照常理不可能处于STATUS$=2(OFFLINE)状态。而且其他回滚段全部为OFFLINE状态也属于异常情况.而且尝试drop undo报ORA-01561,另外在dba_rollback_segs中无SYSTEM(查询结果忘记保存)

SQL> drop tablespace undotbs1 including contents;
drop tablespace undotbs1 including contents
*
ERROR at line 1:
ORA-01561: failed to remove all objects in the tablespace specified

通过这一系列很怀疑是由于bbed 修改了undo$等相关基表信息导致现在system中的undo信息混乱.信息反馈给客户后,客户想起来昨天给他们恢复的公司在bbed操作前备份了system01.dbf.突然感觉救星来了.实在怕不懂bbed的人折腾bbed

dbv检测备份文件

DBVERIFY - Verification starting : FILE = /data3/backup/system01.dbf_bak
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv:
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Block Checking: DBA = 4194802, Block Type = KTB-managed data block
data header at 0x1002ef05c
kdbchk: row locked by non-existent transaction
        table=0   slot=4
        lockid=1   ktbbhitc=2
Page 498 failed with check code 6101
DBVERIFY - Verification complete
Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 1
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)

好家伙只有一个物理坏块和一个逻辑坏块,而对于物理坏块block 60已经知道如何修复,逻辑坏块可以尝试设置隐含参数跳过去,bbed修改相关block(同上步骤)

再次启动数据库

dd if=/opt/oracle/oradata/xifenfei/system01.dbf bs=8192 count=2 of=/tmp/system01.2
dd if=/tmp/system01.2 of=/data3/backup/system01.dbf_bak bs=8192 count=2 conv=notrunc
Sat Nov 22 17:52:50 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Sat Nov 22 17:53:38 2014
alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:53:39 2014
Completed: alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:55:43 2014
alter database open
Sat Nov 22 17:55:48 2014
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=18, OS id=15858
Sat Nov 22 17:56:10 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=17, OS id=15879
Sat Nov 22 17:56:19 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 17:56:19 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 17:56:19 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 17:56:20 2014
ARC1: STARTING ARCH PROCESSES
Sat Nov 22 17:56:20 2014
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Sat Nov 22 17:56:22 2014
Database Characterset is ZHS16CGB231280
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Nov 22 17:56:33 2014
ARC2: Archival started
ARC1: STARTING ARCH PROCESSES COMPLETE
ARC1: Becoming the heartbeat ARCH
ARC2 started with pid=23, OS id=15928
QMNC started with pid=25, OS id=15996
Sat Nov 22 17:57:11 2014
Completed: alter database open
Sat Nov 22 17:57:18 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16010.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:57:26 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16012.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:58:17 2014
Starting background process EMN0
Sat Nov 22 18:00:03 2014
Shutting down instance: further logons disabled
EMN0 started with pid=71, OS id=16421
Sat Nov 22 18:00:12 2014
SMON: Restarting fast_start parallel rollback
Sat Nov 22 18:00:23 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_p000_15951.trc:
ORA-00600: internal error code, arguments: [4198], [9], [], [], [], [], [], []
Sat Nov 22 18:00:24 2014
Stopping background process CJQ0
Sat Nov 22 18:00:24 2014
Stopping background process QMNC
Sat Nov 22 18:00:27 2014
Doing block recovery for file 2 block 41
Block recovery from logseq 7, block 180883 to scn 214748389244
Sat Nov 22 18:00:27 2014
Recovery of Online Redo Log: Thread 1 Group 8 Seq 7 Reading mem 0
  Mem# 0 errs 0: /data2/oradata/redo0802.log
Block recovery stopped at EOT rba 7.180988.16
Block recovery completed at rba 7.180988.16, scn 50.24441
Sat Nov 22 18:00:32 2014
Stopping background process MMNL
Sat Nov 22 18:00:38 2014
Stopping background process MMON
Sat Nov 22 18:00:41 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sat Nov 22 18:00:42 2014
ORACLE Instance xifenfei (pid = 9) - Error 600 encountered while recovering transaction (3, 4).
Sat Nov 22 18:00:42 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []

这里都是很常规的错误,查询undo$也已经正常,重建新undo表空间删除老undo,然后alert日志中无其他报错,数据库恢复至此完成,建议客户导出导入重建数据库

ORA-600 2663 故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-600 2663 故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

朋友数据库启动遭遇ORA-00600[2663]

Mon Sep 22 19:24:20 2014
Thread 1 advanced to log sequence 17 (thread open)
Thread 1 opened at log sequence 17
  Current log# 17 seq# 17 mem# 0: /u02/orayali2/redo17.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Sep 22 19:24:20 2014
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/orayali2/orayali2/trace/orayali2_ora_20722.trc  (incident=336180):
ORA-00600: internal error code, arguments: [2663], [13], [3140023138], [13], [3141216403], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/orayali2/orayali2/incident/incdir_336180/orayali2_ora_20722_i336180.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/orayali2/orayali2/trace/orayali2_ora_20722.trc:
ORA-00600: internal error code, arguments: [2663], [13], [3140023138], [13], [3141216403], [], [], [], [], [], [], []
Errors in file /u01/app/oracle/diag/rdbms/orayali2/orayali2/trace/orayali2_ora_20722.trc:
ORA-00600: internal error code, arguments: [2663], [13], [3140023138], [13], [3141216403], [], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 20722): terminating the instance due to error 600
Instance terminated by USER, pid = 20722
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (20722) as a result of ORA-1092
Mon Sep 22 19:24:24 2014
ORA-1092 : opitsk aborting process

ORA-600[2663]与常见的ORA-600[2662]类似,都是由于block的scn大于文件头的scn导致,只不过错误的对象不一样而已.对于该类问题,我们的处理方法一般就是简单的推scn,但是这个库比较特殊11.2.0.3.5版本,一般方法无法推scn,因为收集操作日志有限,贴出核心操作步骤

[oracle@orayali2 OPatch]$ uname -a
Linux orayali2 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
[oracle@orayali2 OPatch]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Mon Sep 22 19:09:18 2014
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup mount
ORACLE instance started.
Total System Global Area 1.6535E+10 bytes
Fixed Size                  2244792 bytes
Variable Size            9898561352 bytes
Database Buffers         6610223104 bytes
Redo Buffers               24256512 bytes
Database mounted.
SQL> oradebug setmypid
Statement processed.
SQL>  oradebug DUMPvar SGA kcsgscn_
kcslf kcsgscn_ [060019598, 0600195C8) = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 60019278 00000000
SQL> oradebug poke 0x060019598 8 0x0000000000000040
BEFORE: [060019598, 0600195A0) = 00000000 00000000
AFTER:  [060019598, 0600195A0) = 00000040 00000000
SQL> oradebug DUMPvar SGA kcsgscn_
kcslf kcsgscn_ [060019598, 0600195C8) = 00000040 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 60019278 00000000
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-30012: undo tablespace 'SYSTEM' does not exist or of wrong type
Process ID: 21174
Session ID: 1563 Serial number: 3

现在错误已经改变,而是出现了ORA-30012的错误

alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 31 processes
Started redo scan
Completed redo scan
 read 4 KB redo, 0 data blocks need recovery
Started redo application at
 Thread 1: logseq 17, block 2, scn 58974597984
Recovery of Online Redo Log: Thread 1 Group 17 Seq 17 Reading mem 0
  Mem# 0: /u02/orayali2/redo17.log
Completed redo application of 0.00MB
Completed crash recovery at
 Thread 1: logseq 17, block 3, scn 58974617986
 0 data blocks read, 0 data blocks written, 4 redo k-bytes read
Mon Sep 22 19:30:05 2014
Thread 1 advanced to log sequence 18 (thread open)
Thread 1 opened at log sequence 18
  Current log# 18 seq# 18 mem# 0: /u02/orayali2/redo18.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Mon Sep 22 19:30:05 2014
SMON: enabling cache recovery
Undo initialization errored: err:30012 serial:0 start:1143146928 end:1143147338 diff:410 (4 seconds)
Errors in file /u01/app/oracle/diag/rdbms/orayali2/orayali2/trace/orayali2_ora_21174.trc:
ORA-30012: undo tablespace 'SYSTEM' does not exist or of wrong type
Errors in file /u01/app/oracle/diag/rdbms/orayali2/orayali2/trace/orayali2_ora_21174.trc:
ORA-30012: undo tablespace 'SYSTEM' does not exist or of wrong type
Error 30012 happened during db open, shutting down database
USER (ospid: 21174): terminating the instance due to error 30012
Instance terminated by USER, pid = 21174
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (21174) as a result of ORA-1092
Mon Sep 22 19:30:08 2014
ORA-1092 : opitsk aborting process

猜测原因是undo设置有问题导致,检查果然发现undo_management=auto,而undo_tablespace=SYSTEM

SQL> startup mount
ORACLE instance started.
Total System Global Area 1.6535E+10 bytes
Fixed Size                  2244792 bytes
Variable Size            9898561352 bytes
Database Buffers         6610223104 bytes
Redo Buffers               24256512 bytes
Database mounted.
SQL> show parameter undo;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_management                      string      AUTO
undo_retention                       integer     10800
undo_tablespace                      string      SYSTEM
SQL> alter system set undo_management=manual scope=spfile;
System altered.
SQL> shutdown immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 1.6535E+10 bytes
Fixed Size                  2244792 bytes
Variable Size            9898561352 bytes
Database Buffers         6610223104 bytes
Redo Buffers               24256512 bytes
Database mounted.
Database opened.

解决该问题修改undo_management=manual即可