ORA-600 2662 – 第4页

记录一次system表空间坏块(ORA-01578)数据库恢复

半夜朋友打来求救电话,说xx医院his系统因为存储异常导致system坏块无法正常启动,因为是win平台无法使用bbed,无法修复system 坏块,请求技术支持
dbv检查system文件报坏块

对应具体地址为:file 1 block 39041和66738

判断控制文件异常
通过数据库恢复检查脚本(Oracle Database Recovery Check)脚本检测数据库发现控制文件明显异常(checkpoint scn)

尝试恢复数据库

因此对该库进行了不完全恢复,然后尝试resetlogs打开数据库,数据库报ORA-600 2662错误

Fri Aug 29 02:35:08 2014
alter database open resetlogs
Fri Aug 29 02:35:11 2014
RESETLOGS after complete recovery through change 451371288
Resetting resetlogs activation ID 1232269761 (0x4972f1c1)
Fri Aug 29 02:35:15 2014
Setting recovery target incarnation to 3
Fri Aug 29 02:35:15 2014
Assigning activation ID 1384652231 (0x52881dc7)
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=17, OS id=1084
Fri Aug 29 02:35:15 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=18, OS id=2836
Fri Aug 29 02:35:15 2014
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: Z:\ORACLE\PRODUCT\10.2.0\ORCL\REDO01.LOG
Successful open of redo thread 1
Fri Aug 29 02:35:15 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Aug 29 02:35:15 2014
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
Fri Aug 29 02:35:15 2014
ARC0: Becoming the heartbeat ARCH
Fri Aug 29 02:35:15 2014
SMON: enabling cache recovery
Fri Aug 29 02:35:16 2014
Errors in file d:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_4824.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [451371311], [0], [451374534], [8388977], [], []
Fri Aug 29 02:35:16 2014
Errors in file d:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_4824.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [451371311], [0], [451374534], [8388977], [], []
Fri Aug 29 02:35:16 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Fri Aug 29 02:35:17 2014
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_2928.trc:
ORA-00600: ??????, ??: [], [], [], [], [], [], [], []
Instance terminated by USER, pid = 4824
ORA-1092 signalled during: alter database open resetlogs...

ORA-600 2662 该错误解决思路很明显,推进scn,数据库报ORA-01578

Fri Aug 29 02:42:47 2014
SMON: enabling cache recovery
Fri Aug 29 02:42:47 2014
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
Dictionary check complete
Fri Aug 29 02:42:49 2014
SMON: enabling tx recovery
Fri Aug 29 02:42:49 2014
Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 16
replication_dependency_tracking turned off (no async multimaster replication found)
Fri Aug 29 02:42:50 2014
Errors in file d:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_4804.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-01578: ORACLE 数据块损坏 (文件号 1, 块号 39041)
ORA-01110: 数据文件 1: 'Z:\ORACLE\PRODUCT\10.2.0\ORCL\SYSTEM01.DBF'
Fri Aug 29 02:42:50 2014
LOGSTDBY: Validating controlfile with logical metadata
Fri Aug 29 02:42:51 2014
LOGSTDBY: Validation complete
ORA-604 signalled during: alter database open...

使用event跳过坏块，启动数据库成功

Fri Aug 29 02:48:59 2014
SMON: enabling cache recovery
Fri Aug 29 02:49:00 2014
Successfully onlined Undo Tablespace 1.
Fri Aug 29 02:49:00 2014
SMON: enabling tx recovery
Fri Aug 29 02:49:00 2014
Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 16
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=34, OS id=3096
Fri Aug 29 02:49:01 2014
db_recovery_file_dest_size of 4096 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Fri Aug 29 02:49:01 2014
Completed: alter database open

查询坏块对象

因为这些对象均不是核心对象,直接进行truncate然后插入老数据

后续还有大量错误修复

ORA-12012: error on auto execute of job 1
ORA-08102: index key not found, obj# 239, file 1, block 1674 (2)
ORA-00600: 内部错误代码, 参数: [kcbz_check_objd_typ], [0], [0], [1], [], [], [], []
ORA-00600: internal error code, arguments: [6749], [3], [12606796], [173], [], [], [], []
ORA-00600: 内部错误代码, 参数: [13013], [52898], [52895], [38288618], [44], [38288618], [17], []
ORA-00600: 内部错误代码, 参数: [13013], [5001], [52895], [38286476], [5], [38286476], [17], []

再次说明,很多时候数据库恢复不要看成多神秘,就是几个参数搞定,更加不要神化有坏块就bbed修复,当然非常极端,使用N中工具,N种尝试的也存在.做好备份重于一切

又一起存储故障导致ORA-00333 ORA-00312恢复

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：又一起存储故障导致ORA-00333 ORA-00312恢复

数据库启动报ORA-00333 ORA-00312错误，无法正常open数据库

Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc0_4724.trc:
ORA-00333: redo log read error block 63489 count 2048
ORA-00312: online log 2 thread 1: 'F:\ORADATA\SZCG\REDO02.LOG'
ORA-27091: unable to queue I/O
ORA-27070: async read/write failed
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc0_4724.trc:
ORA-00333: redo log read error block 63489 count 2048
Thu Aug 07 10:42:03  2014
ARC0: All Archive destinations made inactive due to error 333
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_1856.trc:
ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止
ORA-00340: 处理联机日志  (用于线程 ) 时出现 I/O 错误
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_6548.trc:
ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止
ORA-00340: 处理联机日志  (用于线程 ) 时出现 I/O 错误
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_8104.trc:
ORA-00449: 后台进程 'LGWR' 因错误 340 异常终止
ORA-00340: 处理联机日志  (用于线程 ) 时出现 I/O 错误
Thu Aug 07 10:42:03  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_lgwr_884.trc:
ORA-00340: IO error processing online log 3 of thread 1
ORA-00345: redo log write error block 65238 count 13
ORA-00312: online log 3 thread 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: async read/write failed
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 10:42:03  2014
LGWR: terminating instance due to error 340
Thu Aug 07 10:42:05  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_8104.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LGWR' unexpectedly terminated with error 340
ORA-00340: IO error processing online log  of thread
Thu Aug 07 10:42:05  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_1856.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LGWR' unexpectedly terminated with error 340
ORA-00340: IO error processing online log  of thread
Thu Aug 07 10:42:05  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_6548.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00449: background process 'LGWR' unexpectedly terminated with error 340
ORA-00340: IO error processing online log  of thread
Thu Aug 07 17:40:05  2014
ALTER DATABASE OPEN
Thu Aug 07 17:40:05  2014
Beginning crash recovery of 1 threads
 parallel recovery started with 15 processes
Thu Aug 07 17:40:06  2014
Started redo scan
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27091: 无法将 I/O 排队
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 17:40:06  2014
Aborting crash recovery due to error 333
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-333 signalled during: ALTER DATABASE OPEN...

进一步检查发现在7月6日系统就已经报io异常

Sun Jul 06 10:05:23  2014
ARC0: All Archive destinations made inactive due to error 333
Sun Jul 06 10:06:07  2014
KCF: write/open error block=0xd03 online=1
     file=3 F:\ORADATA\SZCG\SYSAUX01.DBF
     error=27070 txt: 'OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。'
Automatic datafile offline due to write error on
file 3: F:\ORADATA\SZCG\SYSAUX01.DBF
Sun Jul 06 10:06:23  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_arc1_2676.trc:
ORA-00333: redo log read error block 63489 count 2048
ORA-00312: online log 2 thread 1: 'F:\ORADATA\SZCG\REDO02.LOG'
ORA-27091: unable to queue I/O
ORA-27070: async read/write failed
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 10:36:54  2014
ARC1: All Archive destinations made inactive due to error 333
Thu Aug 07 10:37:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_m000_5832.trc:
ORA-01135: file 3 accessed for DML/query is offline
ORA-01110: data file 3: 'F:\ORADATA\SZCG\SYSAUX01.DBF'

检查硬件发现raid一块盘完全损坏，另外一块盘也处于告警状态，保护现场拷贝文件过程中发现redo02,redo03,sysaux无法拷贝,使用rman检查发现
sysaux-block

因为redo完全损坏，使用工具跳过坏块，拷贝相关有坏块文件到其他目录，重命名相关文件尝试启动数据库，依然报ORA-00333 ORA-00312

Started redo scan
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27091: 无法将 I/O 排队
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Thu Aug 07 17:40:06  2014
Aborting crash recovery due to error 333
Thu Aug 07 17:40:06  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5168.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-333 signalled during: ALTER DATABASE OPEN...

设置隐含参数_allow_resetlogs_corruption,尝试强制拉库

Started redo scan
Fri Aug 08 12:13:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27070: 异步读取/写入失败
OSD-04016: 异步 I/O 请求排队时出错。
O/S-Error: (OS 1) 函数不正确。
Fri Aug 08 12:13:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-00312: 联机日志 3 线程 1: 'F:\ORADATA\SZCG\REDO03.LOG'
ORA-27091: 无法将 I/O 排队
ORA-27070: 异步读取/写入失败
OSD-04006: ReadFile() 失败, 无法读取文件
O/S-Error: (OS 1) 函数不正确。
Fri Aug 08 12:13:25  2014
Aborting crash recovery due to error 333
Fri Aug 08 12:13:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_3892.trc:
ORA-00333: 重做日志读取块 63016 计数 8192 出错
ORA-333 signalled during: ALTER DATABASE OPEN...
Fri Aug 08 12:13:45  2014
ALTER DATABASE RECOVER  database until cancel
Fri Aug 08 12:13:45  2014
Media Recovery Start
 parallel recovery started with 15 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Fri Aug 08 12:13:55  2014
ALTER DATABASE RECOVER    CANCEL
Fri Aug 08 12:13:59  2014
ORA-1547 signalled during: ALTER DATABASE RECOVER    CANCEL  ...
Fri Aug 08 12:13:59  2014
ALTER DATABASE RECOVER CANCEL
ORA-1112 signalled during: ALTER DATABASE RECOVER CANCEL ...
Fri Aug 08 12:14:12  2014
alter database open resetlogs
Fri Aug 08 12:14:13  2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Fri Aug 08 12:54:11  2014
alter tablespace sysaux offline
Fri Aug 08 12:54:11  2014
ORA-1109 signalled during: alter tablespace sysaux offline...
Fri Aug 08 13:05:30  2014
alter database open
Fri Aug 08 13:05:30  2014
ORA-1589 signalled during: alter database open...

在offline过程中，数据库检查到sysaux数据文件为offline状态，当表空间只有一个数据文件，而且该数据文件为offline，数据库将会尝试offline sysaux表空间，但是发现该表空间文件非正常scn，无法offline 表空间，导致resetlogs操作失败。这里是操作失误应该先online相关数据文件，然后再进行resetlogs操作

Sat Aug 09 11:56:03  2014
alter database datafile 3 online
Sat Aug 09 11:56:04  2014
Completed: alter database datafile 3 online
Sat Aug 09 11:56:08  2014
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
Sat Aug 09 11:56:18  2014
ARCH: Encountered disk I/O error 19502
Sat Aug 09 11:56:18  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512)
ORA-27072: 文件 I/O 错误
OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 1) 函数不正确。
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512)
Sat Aug 09 11:56:18  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512)
ORA-27072: 文件 I/O 错误
OSD-04008: WriteFile() 失败, 无法写入文件
O/S-Error: (OS 1) 函数不正确。
ORA-19502: 文件 "F:\ARCHIVE\ARC01745_0814618167.001", 块编号 55297 写错误 (块大小 = 512)
ARCH: I/O error 19502 archiving log 3 to 'F:\ARCHIVE\ARC01745_0814618167.001'
Sat Aug 09 11:56:18  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-00265: 要求实例恢复, 无法设置 ARCHIVELOG 模式
Archive all online redo logfiles failed:265
RESETLOGS after incomplete recovery UNTIL CHANGE 77983856
Resetting resetlogs activation ID 3562192628 (0xd452bef4)
Online log F:\ORADATA\SZCG\REDO01.LOG: Thread 1 Group 1 was previously cleared
Online log F:\ORADATA\SZCG\REDO02.LOG: Thread 1 Group 2 was previously cleared
Online log D:\REDO04.LOG: Thread 1 Group 4 was previously cleared
Sat Aug 09 11:56:22  2014
Setting recovery target incarnation to 3
Sat Aug 09 11:56:23  2014
Assigning activation ID 3602586269 (0xd6bb1a9d)
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=33, OS id=5900
Sat Aug 09 11:56:23  2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=34, OS id=5776
Sat Aug 09 11:56:24  2014
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: F:\ORADATA\SZCG\REDO01.LOG
Successful open of redo thread 1
Sat Aug 09 11:56:24  2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Aug 09 11:56:24  2014
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
Sat Aug 09 11:56:24  2014
ARC0: Becoming the heartbeat ARCH
Sat Aug 09 11:56:24  2014
SMON: enabling cache recovery
Sat Aug 09 11:56:25  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [77983864], [0], [77992379], [8388617], [], []
Sat Aug 09 11:56:26  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_4516.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [77983864], [0], [77992379], [8388617], [], []
Sat Aug 09 11:56:26  2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 4516
ORA-1092 signalled during: alter database open resetlogs...

ORA-600 2662这个错误很熟悉，直接推SCN,数据库open,但是报ORA-600 4194

Sat Aug 09 12:01:28  2014
SMON: enabling cache recovery
Dictionary check complete
Sat Aug 09 12:01:32  2014
SMON: enabling tx recovery
Sat Aug 09 12:01:32  2014
Database Characterset is ZHS16GBK
Opening with internal Resource Manager plan
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=34, OS id=6116
Sat Aug 09 12:01:34  2014
LOGSTDBY: Validating controlfile with logical metadata
Sat Aug 09 12:01:34  2014
LOGSTDBY: Validation complete
Sat Aug 09 12:01:34  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_smon_920.trc:
ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], []
Sat Aug 09 12:01:36  2014
Doing block recovery for file 2 block 319
Resuming block recovery (PMON) for file 2 block 319
Block recovery from logseq 2, block 56 to scn 1073742003
Sat Aug 09 12:01:36  2014
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: F:\ORADATA\SZCG\REDO02.LOG
Block recovery stopped at EOT rba 2.79.16
Block recovery completed at rba 2.79.16, scn 0.1073742002
Doing block recovery for file 2 block 153
Resuming block recovery (PMON) for file 2 block 153
Block recovery from logseq 2, block 56 to scn 1073741986
Sat Aug 09 12:01:36  2014
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0: F:\ORADATA\SZCG\REDO02.LOG
Block recovery completed at rba 2.66.16, scn 0.1073741988
Sat Aug 09 12:01:36  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\bdump\szcg_smon_920.trc:
ORA-01595: error freeing extent (4) of rollback segment (10))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], []
Sat Aug 09 12:01:36  2014
Errors in file d:\oracle\product\10.2.0\admin\szcg\udump\szcg_ora_5272.trc:
ORA-00600: internal error code, arguments: [4194], [21], [53], [], [], [], [], []
Sat Aug 09 12:01:36  2014
Completed: alter database open

尝试重建undo表空间并切换undo_tabspace到新undo表空间解决，因为数据库在恢复过程中使用了隐含参数强制拉库,不能保证数据一致性,强烈建议逻辑方式重建数据库
在本次故障中，所幸的是只有redo和sysaux文件损坏，如果是业务数据文件或者system数据文件损坏，恢复的后果可能更加麻烦，丢失数据可能更加多。再次说明：数据库备份非常重要,数据的安全性不能完全寄希望于硬件之上

分享一次ORA-01113 ORA-01110故障处理过程

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：分享一次ORA-01113 ORA-01110故障处理过程

数据库报ORA-01113 ORA-01110无法正常open

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'

alert日志

Wed Jul 16 17:17:56 2014
ALTER DATABASE   MOUNT
Successful mount of redo thread 1, with mount id 1380870980
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Wed Jul 16 17:19:01 2014
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_m000_3540.trc  (incident=367575):
ORA-00700: soft internal error, arguments: [dbgrmblcp_corrupt_page], [D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\metadata\HM_FINDING.ams],
[1245], [], [], [], [], [], [], [], [], []
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_367575\orcl_m000_3540_i367575.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_m000_3540.trc  (incident=367576):
ORA-00600: internal error code, arguments: [dbgrmblgp_get_page_1], [1245], [0], [80], [], [], [], [], [], [], [], []
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_367576\orcl_m000_3540_i367576.trc
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_m000_3540.trc:
ORA-51106: check failed to complete due to an error.  See error below
ORA-00600: internal error code, arguments: [dbgrmblgp_get_page_1], [1245], [0], [80], [], [], [], [], [], [], [], []
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'

这里出现ORA-00700[dbgrmblcp_corrupt_page]与ORA-00600[dbgrmblgp_get_page_1]是Bug 10321285 : ORA-600 [DBGRMBLCP_CORRUPT_PAGE]

尝试recover database 遭遇ORA-00283 ORA-16433

SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database must be opened in read/write mode.

alert日志信息

Wed Jul 16 17:36:33 2014
ALTER DATABASE RECOVER  database
Media Recovery Start
 started logmerger process
Wed Jul 16 17:36:33 2014
Media Recovery failed with error 16433
Slave exiting with ORA-283 exception
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_pr00_3868.trc:
ORA-00283: recovery session canceled due to errors
ORA-16433: The database must be opened in read/write mode.
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database  ...
Wed Jul 16 17:36:45 2014
alter database open
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_3236.trc:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'
ORA-1113 signalled during: alter database open...

重建控制文件后恢复,遭遇ORA-600 2662错误

SQL> shutdown abort
ORACLE instance shut down.
SQL> startup nomount
ORACLE instance started.
Total System Global Area      10255212544 bytes
Fixed Size                        2264088 bytes
Variable Size                  4529849320 bytes
Database Buffers               5704253440 bytes
Redo Buffers                     18845696 bytes
SQL> @d:\ctl.sql
Control file created.
SQL> recover database using backup controlfile;
ORA-00279: change 13953689551797 generated at 07/16/2014 17:53:41 needed for
thread 1
ORA-00289: suggestion :
D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\ORCL\ARCHIVELOG\2014_07_16\O1_MF_1_1_%U_
.ARC
ORA-00280: change 13953689551797 for thread 1 is in sequence #1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
d:\app\administrator\oradata\orcl\redo01.log
Log applied.
Media recovery complete.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [3248], [3635774399],
[3249], [3580470066], [4194545], [], [], [], [], [], []
Process ID: 3932
Session ID: 200 Serial number: 1

因为该数据库是11.2.0.3未打上scn patch,直接推进scn(可以使用event,隐含参数,oradebug,bbed等),数据库就正常open

记录一次ORA-600 3004 恢复过程和处理思路

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：记录一次ORA-600 3004 恢复过程和处理思路

10.1.0.2数据库在启动的时候报ORA-00600[3004]错误

SQL> startup
ORACLE 例程已经启动。
Total System Global Area 1042254360 bytes
Fixed Size 743960 bytes
Variable Size 503316480 bytes
Database Buffers 536870912 bytes
Redo Buffers 1323008 bytes
数据库装载完毕。
ORA-00600: 内部错误代码，参数: [3004], [1], [0], [0], [], [], [], []

alert日志信息

Tue Jul 15 10:57:11 2014
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Tue Jul 15 10:57:12 2014
ALTER DATABASE OPEN
Tue Jul 15 10:57:12 2014
Beginning crash recovery of 1 threads
 attempting to start a parallel recovery with 3 processes
 parallel recovery started with 3 processes
Tue Jul 15 10:57:12 2014
Started first pass scan
Tue Jul 15 10:57:12 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\udump\orcl_ora_1084.trc:
ORA-00600: 内部错误代码, 参数: [3004], [1], [0], [0], [], [], [], []
Tue Jul 15 10:57:13 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_p000_3780.trc:
ORA-10388: parallel query server interrupt (failure)
Tue Jul 15 10:57:13 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_p002_3420.trc:
ORA-10388: parallel query server interrupt (failure)
Tue Jul 15 10:57:14 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_p001_3784.trc:
ORA-10388: parallel query server interrupt (failure)
Tue Jul 15 10:57:14 2014
Aborting crash recovery due to error 600
Tue Jul 15 10:57:14 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\udump\orcl_ora_1084.trc:
ORA-00600: 内部错误代码, 参数: [3004], [1], [0], [0], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...

根据老杨的blog,出现该问题，很可能是crontrolfile异常导致,所以这里可以考虑通过重建控制文件或者把当前控制文件当作备份控制文件来使用

Tue Jul 15 13:42:31 2014
ALTER DATABASE RECOVER database using backup controlfile
Tue Jul 15 13:42:31 2014
Media Recovery Start
WARNING! Recovering data file 1 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 2 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 3 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 4 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 5 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 6 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 7 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 8 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 9 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 10 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 11 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
attempting to start a parallel recovery with 3 processes
parallel recovery started with 3 processes
Starting datafile 1 with incarnation depth 0 in thread 1 sequence 794
Datafile 1: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\SYSTEM01.DBF'
Starting datafile 2 with incarnation depth 0 in thread 1 sequence 794
Datafile 2: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\UNDOTBS01.DBF'
Starting datafile 3 with incarnation depth 0 in thread 1 sequence 794
Datafile 3: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\SYSAUX01.DBF'
Starting datafile 4 with incarnation depth 0 in thread 1 sequence 794
Datafile 4: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\USERS01.DBF'
Starting datafile 5 with incarnation depth 0 in thread 1 sequence 794
Datafile 5: 'E:\ORADB\USER1_1.ORA'
Starting datafile 6 with incarnation depth 0 in thread 1 sequence 794
Datafile 6: 'E:\ORADB\USER1_2.ORA'
Starting datafile 7 with incarnation depth 0 in thread 1 sequence 794
Datafile 7: 'E:\ORADB\USER1_3.ORA'
Starting datafile 8 with incarnation depth 0 in thread 1 sequence 794
Datafile 8: 'E:\ORADB\USER1_4.ORA'
Starting datafile 9 with incarnation depth 0 in thread 1 sequence 794
Datafile 9: 'E:\ORADB\INDEX1_1.ORA'
Starting datafile 10 with incarnation depth 0 in thread 1 sequence 794
Datafile 10: 'E:\ORADB\INDEX1_2.ORA'
Starting datafile 11 with incarnation depth 0 in thread 1 sequence 794
Datafile 11: 'E:\ORADB\TEMP1_1.ORA'
Media Recovery Log
ORA-279 signalled during: ALTER DATABASE RECOVER database using backup cont...
Tue Jul 15 13:43:03 2014
ALTER DATABASE RECOVER LOGFILE 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\REDO01.LOG'
Tue Jul 15 13:43:03 2014
Media Recovery Log D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\REDO01.LOG
Completed: ALTER DATABASE RECOVER LOGFILE 'D:\ORACLE\PRODU
Tue Jul 15 13:43:17 2014
alter database open resetlogs
RESETLOGS after complete recovery through change 3142208
Resetting resetlogs activation ID 1378452168 (0x522982c8)
Setting recovery target incarnation to 3
Tue Jul 15 13:43:18 2014
Setting recovery target incarnation to 3
Tue Jul 15 13:43:18 2014
Flashback Database Disabled
Tue Jul 15 13:43:19 2014
Assigning activation ID 1380750902 (0x524c9636)
Maximum redo generation record size = 120832 bytes
Maximum redo generation change vector size = 116476 bytes
Private_strands 7 at log switch
Thread 1 opened at log sequence 1
Current log# 1 seq# 1 mem# 0: D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1
Tue Jul 15 13:43:19 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Jul 15 13:43:19 2014
SMON: enabling cache recovery
Tue Jul 15 13:43:20 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\udump\orcl_ora_1484.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [3142214], [0], [3142438], [8388617], [], []
Tue Jul 15 13:43:21 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\udump\orcl_ora_1484.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [3142214], [0], [3142438], [8388617], [], []
Tue Jul 15 13:43:21 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600

这里出现ORA-600[2662]因为scn相差较小,直接重启几次,然后数据库出现如下启动正常但是出现ORA-01595和ORA-00600[4194]错误

Tue Jul 15 13:48:26 2014
Starting background process MMON
Starting background process MMNL
MMON started with pid=17, OS id=2896
MMNL started with pid=18, OS id=3828
Tue Jul 15 13:48:26 2014
Block recovery completed at rba 2.90.16, scn 0.3162303
Tue Jul 15 13:48:26 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_smon_3724.trc:
ORA-01595: error freeing extent (3) of rollback segment (1))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [57], [6], [], [], [], [], []
Tue Jul 15 13:48:26 2014
Completed: alter database open
Tue Jul 15 13:48:27 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_j000_2956.trc:
ORA-00600: internal error code, arguments: [4193], [1375], [1379], [], [], [], [], []

通过设置undo_management=manual,重建undo表空间解决,至此本数据库恢复完成,建议对其进行逻辑方式重建

ORACLE 12C redo异常恢复测试—部分pdb未正常open

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：ORACLE 12C redo异常恢复测试—部分pdb未正常open

为了验证当前redo丢失的情况下ORACLE 12C CDB数据库恢复的情况,做了一个小实验，三个会话，分别操作为在root pdb中执行一个delete 不提交；另外一个会话在user pdb中delete记录不提交；最后一个会话中直接abort数据库，然后进行数据库恢复，验证数据库是否可以都正常open(所有pdb)
会话1(root pdb中操作)

CDB_CDB$ROOT@SYS> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB                            READ WRITE NO
CDB_CDB$ROOT@SYS> show con_name
CON_NAME
------------------------------
CDB$ROOT
CDB_CDB$ROOT@SYS> create table t_xifenfei as select * from dba_objects;
表已创建。
CDB_CDB$ROOT@SYS> delete from t_xifenfei;
已删除 91032 行。
CDB_CDB$ROOT@SYS>

会话2(user pdb中操作)

CDB_CDB$ROOT@SYS> show con_name
CON_NAME
------------------------------
PDB
CDB_CDB$ROOT@SYS> drop table t_xifenfei purge;
表已删除。
CDB_CDB$ROOT@SYS> create table t_xifenfei as select * from dba_objects;
表已创建。
CDB_CDB$ROOT@SYS> delete from t_xifenfei;
已删除 91144 行。

会话3(直接abort数据库)

CDB_CDB$ROOT@SYS> shutdown abort;
ORACLE 例程已经关闭。

删除数据库联机日志

E:\app\XIFENFEI\oradata\cdb>dir redo*.log
 驱动器 E 中的卷是 本地磁盘
 卷的序列号是 000C-3B41
 E:\app\XIFENFEI\oradata\cdb 的目录
2013-08-07  01:41        52,429,312 REDO01.LOG
2013-08-07  01:40        52,429,312 REDO02.LOG
2013-08-07  01:40        52,429,312 REDO03.LOG
2014-03-20  22:47        52,432,896 REDO04.LOG
2014-03-20  22:47        52,432,896 REDO05.LOG
2014-03-20  22:47        52,432,896 REDO06.LOG
               6 个文件    314,586,624 字节
               0 个目录 21,359,374,336 可用字节
E:\app\XIFENFEI\oradata\cdb>del redo*.log
E:\app\XIFENFEI\oradata\cdb>dir redo*.log
 驱动器 E 中的卷是 本地磁盘
 卷的序列号是 000C-3B41
 E:\app\XIFENFEI\oradata\cdb 的目录
找不到文件

启动数据库报ORA-00313错误

idle> startup
ORACLE 例程已经启动。
Total System Global Area  521936896 bytes
Fixed Size                  2404552 bytes
Variable Size             205524792 bytes
Database Buffers          306184192 bytes
Redo Buffers                7823360 bytes
数据库装载完毕。
ORA-00313: 无法打开日志组 6 (用于线程 1) 的成员
ORA-00312: 联机日志 6 线程 1: 'E:\APP\XIFENFEI\ORADATA\CDB\REDO06.LOG'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。

使用_allow_resetlogs_corruption参数尝试open数据库

idle> shutdown abort
ORACLE 例程已经关闭。
idle> startup pfile='d:/pfile.txt' mount
ORACLE 例程已经启动。
Total System Global Area  521936896 bytes
Fixed Size                  2404552 bytes
Variable Size             205524792 bytes
Database Buffers          306184192 bytes
Redo Buffers                7823360 bytes
数据库装载完毕。
idle> show parameter resetlogs;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
_allow_resetlogs_corruption          boolean     TRUE
idle> recover database until cancel;
ORA-00279: 更改 12696935641735 (在 03/20/2014 22:38:52 生成) 对于线程 1
是必需的
ORA-00289: 建议:
E:\APP\XIFENFEI\FAST_RECOVERY_AREA\CDB\ARCHIVELOG\2014_03_20\O1_MF_1_872_9LOZSK9
X_.ARC
ORA-00280: 更改 12696935641735 (用于线程 1) 在序列 #872 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: 警告: RECOVER 成功但 OPEN RESETLOGS 将出现如下错误
ORA-01194: 文件 1 需要更多的恢复来保持一致性
ORA-01110: 数据文件 1: 'E:\APP\XIFENFEI\ORADATA\CDB\SYSTEM01.DBF'
ORA-01112: 未启动介质恢复
idle> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [2956], [1012314778],
[2956], [1012314903], [268435600], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [2662], [2956], [1012314777],
[2956], [1012314903], [268435600], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [2956], [1012314768],
[2956], [1012314903], [268435600], [], [], [], [], [], []
进程 ID: 9268
会话 ID: 92 序列号: 3

ora-600[2662]很熟悉，文件头的scn小于文件中某个block的csn，通过bbed等工具修改文件scn，尝试启动数据库

CDB_CDB$ROOT@SYS> SHUTDOWN ABORT
ORACLE 例程已经关闭。
CDB_CDB$ROOT@SYS> STARTUP
ORACLE 例程已经启动。
Total System Global Area  521936896 bytes
Fixed Size                  2404552 bytes
Variable Size             205524792 bytes
Database Buffers          306184192 bytes
Redo Buffers                7823360 bytes
数据库装载完毕。
数据库已经打开。

尝试open user pdb

CDB_CDB$ROOT@SYS> alter session set container=pdb;
会话已更改。
CDB_CDB$ROOT@SYS> alter database open;
alter database open
*
第 1 行出现错误:
ORA-00600: 内部错误代码, 参数: [kcvfdb_pdb_set_clean_scn: cleanckpt], [3],
[2956], [1012312995], [2956], [1012334778], [2], [], [], [], [], []

查询mos得到结论:
Bug 16784143 ORA-600 [kcvfdb_pdb_set_clean_scn: cleanckpt] with PDBs
在12.2/12.1.0.2/12.1.0.1.1中修复，后续尝试打patch,来进一步恢复.

测试结果证明
1.root pdb的open过程和以前版本数据库无大差异,但是scn推进比较费劲,以前的event和隐含参数方法无效
2.user pdb再是看是因为bug无法open,后续继续验证
3.12.1的base 版本可能确实问题很多，生产使用慎重

ORACLE REDO各种异常恢复

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：ORACLE REDO各种异常恢复

redo是oracle数据库比较核心文件,当redo异常之后,数据库无法正常启动,而且有丢失数据的风险,强烈建议条件允许redo多路镜像.redo文件异常的故障可以说是千奇百怪,但是总体上可以分为几类:
数据库归档/非归档模式下inactive redo异常
ORA-00316 ORA-00327

SQL> startup mount
ORACLE instance started.
Total System Global Area  260046848 bytes
Fixed Size                  1266896 bytes
Variable Size              83888944 bytes
Database Buffers          167772160 bytes
Redo Buffers                7118848 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00316: log 2 of thread 1, type  in header is not log file
ORA-00312: online log 2 thread 1: '/u01/oracle/oradata/XFF/redo02.log'
SQL> col member for a40
SQL> set lines 120
SQL>   SELECT thread#,
  2           a.sequence#,
  3           a.group#,
  4           TO_CHAR (first_change#, '9999999999999999') "SCN",
  5           a.status,
  6           MEMBER
  7      FROM v$log a, v$logfile b
  8     WHERE a.group# = B.GROUP#
  9  ORDER BY a.sequence# DESC;
   THREAD#  SEQUENCE#     GROUP# SCN               STATUS           MEMBER
---------- ---------- ---------- ----------------- ---------------- -----------------------------------
         1         15          3            665697 CURRENT          /u01/oracle/oradata/XFF/redo03.log
         1         14          2            645619 INACTIVE         /u01/oracle/oradata/XFF/redo02.log
         1         13          1            625540 INACTIVE         /u01/oracle/oradata/XFF/redo01.log
SQL> ALTER DATABASE CLEAR LOGFILE GROUP 2;
Database altered.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00327: log 2 of thread 1, physical size  less than needed
ORA-00312: online log 2 thread 1: '/u01/oracle/oradata/XFF/redo02.log'
SQL> alter database drop logfile group 2;
Database altered.
SQL> alter database open;
Database altered.
SQL> alter database add logfile group 2 ('/u01/oracle/oradata/XFF/redo02.log') size 50M reuse;
Database altered.

正常关闭数据库current redo异常
ORA-00316 ORA-01623

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00316: log 1 of thread 1, type  in header is not log file
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'
SQL>   SELECT thread#,
  2           a.sequence#,
  3           a.group#,
  4           TO_CHAR (first_change#, '9999999999999999') "SCN",
  5           a.status,
  6           MEMBER
  7      FROM v$log a, v$logfile b
  8     WHERE a.group# = B.GROUP#
  9  ORDER BY a.sequence# DESC;
   THREAD#  SEQUENCE#     GROUP# SCN               STATUS           MEMBER
---------- ---------- ---------- ----------------- ---------------- ----------------------------------
         1         16          1            685918 CURRENT          /u01/oracle/oradata/XFF/redo01.log
         1         15          3            665697 INACTIVE         /u01/oracle/oradata/XFF/redo03.log
         1          0          2                 0 UNUSED           /u01/oracle/oradata/XFF/redo02.log
SQL> ALTER DATABASE CLEAR LOGFILE GROUP 1;
ALTER DATABASE CLEAR LOGFILE GROUP 1
*
ERROR at line 1:
ORA-00316: log 1 of thread 1, type 0 in header is not log file
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'
SQL> ALTER DATABASE drop logfile group 1;
ALTER DATABASE drop logfile group 1
*
ERROR at line 1:
ORA-01623: log 1 is current log for instance XFF (thread 1) - cannot drop
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'
SQL> recover database until cancel;
Media recovery complete.
SQL> alter database open resetlogs;
Database altered.

数据库异常关闭current/active redo异常
ORA-00316 ORA-01624 ORA-01194

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00316: log 1 of thread 1, type 0 in header is not log file
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'
SQL>   SELECT thread#,
  2           a.sequence#,
  3           a.group#,
  4           TO_CHAR (first_change#, '9999999999999999') "SCN",
  5           a.status,
  6           MEMBER
  7      FROM v$log a, v$logfile b
  8     WHERE a.group# = B.GROUP#
  9  ORDER BY a.sequence# DESC;
   THREAD#  SEQUENCE#     GROUP# SCN               STATUS           MEMBER
---------- ---------- ---------- ----------------- ---------------- -----------------------------------
         1          8          2            686310 CURRENT          /u01/oracle/oradata/XFF/redo02.log
         1          7          1            686294 ACTIVE           /u01/oracle/oradata/XFF/redo01.log
         1          6          3            686289 INACTIVE         /u01/oracle/oradata/XFF/redo03.log
SQL> ALTER DATABASE CLEAR LOGFILE GROUP 1;
ALTER DATABASE CLEAR LOGFILE GROUP 1
*
ERROR at line 1:
ORA-01624: log 1 needed for crash recovery of instance XFF (thread 1)
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'
SQL> ALTER DATABASE drop logfile group 1;
ALTER DATABASE drop logfile group 1
*
ERROR at line 1:
ORA-01624: log 1 needed for crash recovery of instance XFF (thread 1)
ORA-00312: online log 1 thread 1: '/u01/oracle/oradata/XFF/redo01.log'
SQL> recover database until cancel
ORA-00279: change 686294 generated at 04/20/2013 01:37:16 needed for thread 1
ORA-00289: suggestion : /u01/oracle/oracle/product/10.2.0/db_1/dbs/arch1_7_813202529.dbf
ORA-00280: change 686294 for thread 1 is in sequence #7
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u01/oracle/oradata/XFF/redo01.log
ORA-00308: cannot open archived log '/u01/oracle/oradata/XFF/redo01.log'
ORA-27047: unable to read the header block of file
Additional information: 2
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/XFF/system01.dbf'
ORA-01112: media recovery not started
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/XFF/system01.dbf'
SQL> alter system set "_allow_resetlogs_corruption"=true scope=spfile;
System altered.
SQL> shutdown immediate;
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area  260046848 bytes
Fixed Size                  1266896 bytes
Variable Size              83888944 bytes
Database Buffers          167772160 bytes
Redo Buffers                7118848 bytes
Database mounted.
SQL> recover database until cancel
ORA-00279: change 686294 generated at 04/20/2013 01:37:16 needed for thread 1
ORA-00289: suggestion : /u01/oracle/oracle/product/10.2.0/db_1/dbs/arch1_7_813202529.dbf
ORA-00280: change 686294 for thread 1 is in sequence #7
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/oracle/oradata/XFF/system01.dbf'
ORA-01112: media recovery not started
SQL> alter database open resetlogs;
Database altered.

在这样的情况下,数据库异常关闭,current/active redo异常,通过使用隐含参数可能可以侥幸的恢复数据库,但是也可能导致数据丢失.这里因为是模拟情况,无业务所以在很多较为繁忙的业务系统中,使用隐含参数resetlogs过程中可能还会遇到如下很多常见的错误,进一步增加了恢复难度

current/active redo异常后附带其他错误
ORA-600[2662]

Wed Dec 07 13:02:49 2011
SMON: enabling cache recovery
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_3388.trc  (incident=216664):
ORA-00600: 内部错误代码, 参数: [2662], [2], [1153862134], [2], [1153864845], [12582921], [], []
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_216664\hzyl_ora_3388_i216664.trc
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_3388.trc:
ORA-00600: 内部错误代码, 参数: [2662], [2], [1153862134], [2], [1153864845], [12582921], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 3388): terminating the instance due to error 600

ORA-00600[4000]

Thu Feb 28 19:29:10 2013
SMON: enabling cache recovery
Thu Feb 28 19:29:11 2013
Errors in file /u1/PROD/prodora/db/tech_st/10.2.0/admin/PROD_oracle/udump/prod_ora_20989.trc:
ORA-00600: internal error code, arguments: [4000], [50], [], [], [], [], [], []
Thu Feb 28 19:29:13 2013
Incremental checkpoint up to RBA [0x1.3.0], current log tail at RBA [0x1.3.0]
Thu Feb 28 19:29:13 2013
Errors in file /u1/PROD/prodora/db/tech_st/10.2.0/admin/PROD_oracle/udump/prod_ora_20989.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [50], [], [], [], [], [], []

ORA-00704 ORA-00604 ORA-01555

Fri May  4 21:04:21 2012
select ctime, mtime, stime from obj$ where obj# = :1
Fri May  4 21:04:21 2012
Errors in file /oracle/admin/standdb/udump/perfdb_ora_1286288.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number 40 with name "_SYSSMU40$" too small
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1286288
ORA-1092 signalled during: alter database open resetlogs...

current/active redo异常还可能报如下错误
redo文件损坏报错

Started redo scan
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_2960.trc  (incident=214262):
ORA-00353: 日志损坏接近块 12014 更改 9743799889 时间 12/05/2011 09:21:11
ORA-00312: 联机日志 3 线程 1: 'R:\ORADATA\HZYL\REDO03.LOG'
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_214262\hzyl_ora_2960_i214262.trc
Aborting crash recovery due to error 368
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_2960.trc:
ORA-00368: 重做日志块中的校验和错误
ORA-00353: 日志损坏接近块 12014 更改 9743799889 时间 12/05/2011 09:21:11
ORA-00312: 联机日志 3 线程 1: 'R:\ORADATA\HZYL\REDO03.LOG'
ORA-368 signalled during: ALTER DATABASE OPEN...

redo文件被其他实例占用报错

Wed May 16 17:03:11 2012
Started redo scan
Wed May 16 17:03:11 2012
Errors in file /oracle/admin/odsdb/udump/odsdb1_ora_2040024.trc:
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_2'
ORA-00305: log 14 of thread 1 inconsistent; belongs to another database
ORA-00312: online log 14 thread 1: '/dev/rods_redo1_2_1'
ORA-305 signalled during: ALTER DATABASE OPEN...

存储整体异常

Mon Oct 17 09:35:09 2011
Errors in file /oracle/app/admin/orcl/bdump/orcl2_lgwr_348814.trc:
ORA-00340: IO error processing online log 4 of thread 2
ORA-00345: redo log write error block 6732 count 2
ORA-00312: online log 4 thread 2: '/dev/rredo21'
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 6: No such device or address
Additional information: -1
Additional information: 1024
Mon Oct 17 09:35:09 2011
LGWR: terminating instance due to error 340

存储IO异常

Fri Feb 21 08:44:42 2014
Thread 1 advanced to log sequence 591 (LGWR switch)
  Current log# 1 seq# 591 mem# 0: J:\ORADATA\ORCL\REDO01.LOG
Fri Feb 21 15:31:20 2014
Errors in file c:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_10312.trc:
ORA-00316: log 1 of thread 1, type 286 in header is not log file
ORA-00312: online log 1 thread 1: 'J:\ORADATA\ORCL\REDO01.LOG'

使用_disable_logging参数

Sat May 14 23:16:49 2005
Errors in file d:\oracle\admin\rman\bdump\rman_arc0_736.trc:
ORA-16038: log 3 sequence# 72 cannot be archived
ORA-00354: corrupt redo log block header
ORA-00312: online log 3 thread 1: 'D:\ORACLE\ORADATA\RMAN\REDO03.LOG'

如果你在使用这些思路进行恢复遇到突发情况不能自行解决,请联系我们,将为您提供专业数据库技术支持
Phone:17813235971 Q Q:107644445 E-Mail:dba@xifenfei.com

姊妹篇
undo异常总结和恢复思路
 ORACLE丢失各种文件导致数据库不能OPEN恢复

使用bbed解决ORA-00600[2662]

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：使用bbed解决ORA-00600[2662]

一、数据库启动报ORA-00600[2662]

[oracle@node1 ora11g]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Thu Dec 22 14:37:00 2011
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 2137886720 bytes
Fixed Size                  2230072 bytes
Variable Size            1493174472 bytes
Database Buffers          637534208 bytes
Redo Buffers                4947968 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [2], [2147510731], [2],
[2164287937], [4194432], [], [], [], [], [], []
Process ID: 16829
Session ID: 96 Serial number: 3

二.alert日志错误显示

Thu Dec 22 14:37:09 2011
ALTER DATABASE OPEN
LGWR: STARTING ARCH PROCESSES
Thu Dec 22 14:37:09 2011
ARC0 started with pid=20, OS id=16831
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Thu Dec 22 14:37:10 2011
ARC1 started with pid=21, OS id=16833
Thu Dec 22 14:37:10 2011
ARC2 started with pid=22, OS id=16835
Thu Dec 22 14:37:10 2011
ARC3 started with pid=23, OS id=16837
ARC1: Archival started
ARC2: Archival started
ARC2: Becoming the 'no FAL' ARCH
ARC2: Becoming the 'no SRL' ARCH
ARC1: Becoming the heartbeat ARCH
Thread 1 opened at log sequence 17
  Current log# 2 seq# 17 mem# 0: /opt/oracle/oradata/ora11g/redo02.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
SMON: enabling cache recovery
Errors in file /opt/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_16829.trc  (incident=36156):
ORA-00600: internal error code, arguments: [2662], [2], [2147510731], [2], [2164287937], [4194432], [], [], [], [], [], []
Incident details in: /opt/oracle/diag/rdbms/ora11g/ora11g/incident/incdir_36156/ora11g_ora_16829_i36156.trc
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /opt/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_16829.trc  (incident=36157):
ORA-00600: internal error code, arguments: [2662], [2], [2147510731], [2], [2164287937], [4194432], [], [], [], [], [], []
Incident details in: /opt/oracle/diag/rdbms/ora11g/ora11g/incident/incdir_36157/ora11g_ora_16829_i36157.trc
Dumping diagnostic data in directory=[cdmp_20111222143713], requested by (instance=1, osid=16829), summary=[incident=36156].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Undo initialization errored: err:600 serial:0 start:176607884 end:176611234 diff:3350 (33 seconds)
Errors in file /opt/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_16829.trc:
ORA-00600: internal error code, arguments: [2662], [2], [2147510731], [2], [2164287937], [4194432], [], [], [], [], [], []
Errors in file /opt/oracle/diag/rdbms/ora11g/ora11g/trace/ora11g_ora_16829.trc:
ORA-00600: internal error code, arguments: [2662], [2], [2147510731], [2], [2164287937], [4194432], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 16829): terminating the instance due to error 600
Instance terminated by USER, pid = 16829
ORA-1092 signalled during: ALTER DATABASE OPEN...
opiodr aborting process unknown ospid (16829) as a result of ORA-1092
Thu Dec 22 14:37:15 2011
ORA-1092 : opitsk aborting process

三.分析日志
ORA-00600[2662]主要参数说明见:ORA-00600 [2662]
这里补充说明:e表示出现异常问题的数据块的DBA,这里的4194432就是一个数据块的DBA

--通过DBA地址查询数据块和文件号
SQL> select dbms_utility.data_block_address_block(4194432) "blick",
  2    dbms_utility.data_block_address_file(4194432) "file" from dual;
     blick       file
---------- ----------
       128          1
--当前数据库SCN
SQL> select to_char(2147510731,'xxxxxxxxxxx') from dual;
TO_CHAR(2147
------------
    800069cb
--当前数据块SCN
SQL> select to_char(2164287937,'xxxxxxxxxxx') from dual;
TO_CHAR(2164
------------
    810069c1

四.bbed查看相关SCN

[oracle@node1 ora11g]$ bbed
Password:
BBED-00113: Invalid password. Please rerun utility with the correct password.
[oracle@node1 ora11g]$ bbed
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Thu Dec 22 14:49:24 2011
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename "/opt/oracle/oradata/ora11g/system01.dbf"
        FILENAME        /opt/oracle/oradata/ora11g/system01.dbf
BBED> set block 1
        BLOCK#          1
BBED> p kcvfhckp
struct kcvfhckp, 36 bytes                   @484
   struct kcvcpscn, 8 bytes                 @484
      ub4 kscnbas                           @484      0x800069c8
      ub2 kscnwrp                           @488      0x0002
   ub4 kcvcptim                             @492      0x2dedee96
   ub2 kcvcpthr                             @496      0x0001
   union u, 12 bytes                        @500
      struct kcvcprba, 12 bytes             @500
         ub4 kcrbaseq                       @500      0x00000011
         ub4 kcrbabno                       @504      0x0000210f
         ub2 kcrbabof                       @508      0x0010
   ub1 kcvcpetb[0]                          @512      0x02
   ub1 kcvcpetb[1]                          @513      0x00
   ub1 kcvcpetb[2]                          @514      0x00
   ub1 kcvcpetb[3]                          @515      0x00
   ub1 kcvcpetb[4]                          @516      0x00
   ub1 kcvcpetb[5]                          @517      0x00
   ub1 kcvcpetb[6]                          @518      0x00
   ub1 kcvcpetb[7]                          @519      0x00
BBED> set block 128
        BLOCK#          128
BBED> p bas_kcbh
ub4 bas_kcbh                                @8        0x810069c1
BBED> p wrp_kcbh
ub2 wrp_kcbh                                @12       0x0002

这里看到的SCN(16进制)和我们在alert日志中看到的有一定的出入原因是在数据库启动的时候，当前SCN增加了，但是因为数据库直接abort，没有写入到数据文件中。导致数据文件头部的SCN比alert中显示的稍微小一点(还有可能，系统当前的scn比system01.dbf的scn大一点)。通过对比数据块和数据文件头部的SCN也可以说明当数据块的SCN>数据块当前SCN导致ORA-00600[2662]

五.bbed修改数据块的SCN

BBED> set offset 8
        OFFSET          8
BBED> m /x c8690080
BBED-00215: editing not allowed in BROWSE mode
BBED> set mode edit
        MODE            Edit
BBED> m /x c8690080
BBED-00209: invalid number (c8690080)
--分开修改，曲线救国策略
BBED> m /x c869
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /opt/oracle/oradata/ora11g/system01.dbf (0)
 Block: 128              Offsets:    8 to  519           Dba:0x00000000
------------------------------------------------------------------------
 c8690081 02000104 2f8f0000 00000000 00000000 00000000 00000000 06000000
 2f000000 20100000 00000000 00000000 07000000 81004000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 06000000 00000000 00000000
 00000040 81004000 07000000 88004000 08000000 10024000 08000000 18024000
 08000000 20024000 08000000 28024000 08000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> set offset +2
        OFFSET          10
BBED> m /x 0080
 File: /opt/oracle/oradata/ora11g/system01.dbf (0)
 Block: 128              Offsets:   10 to  521           Dba:0x00000000
------------------------------------------------------------------------
 00800200 01042f8f 00000000 00000000 00000000 00000000 00000600 00002f00
 00002010 00000000 00000000 00000700 00008100 40000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000600 00000000 00000000 00000000
 00408100 40000700 00008800 40000800 00001002 40000800 00001802 40000800
 00002002 40000800 00002802 40000800 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> p tailchk
ub4 tailchk                                 @8188     0x69c10e01
BBED> set offset 8188
        OFFSET          8188
BBED> m /x 010ec869
 File: /opt/oracle/oradata/ora11g/system01.dbf (0)
 Block: 128              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 010ec869
 <32 bytes per line>
BBED> p tailchk
ub4 tailchk                                 @8188     0x69c80e01
BBED> p bas_kcbh
ub4 bas_kcbh                                @8        0x800069c8
BBED> sum apply
Check value for File 0, Block 128:
current = 0x8e2f, required = 0x8e2f
BBED> exit

六.启动数据库

[oracle@node1 ora11g]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Thu Dec 22 14:58:10 2011
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 2137886720 bytes
Fixed Size                  2230072 bytes
Variable Size            1493174472 bytes
Database Buffers          637534208 bytes
Redo Buffers                4947968 bytes
Database mounted.
Database opened.

七.补充说明
一般遇到ORA-00600[2662]都是使用alter session set events ‘10015 trace name adjust_scn level N’;方法处理，但是有时候会遇到ORA-01031错误，那就需要请bbed帮忙处理

OS Pid: 30268 executed alter session set events '10051 trace name adjust_scn level 2'
Thu Dec 22 12:04:07 2011
Errors in file /ora101/diag/rdbms/ora11/ora11/trace/ora11_ora_30268.trc:
ORA-01031: insufficient privileges
Thu Dec 22 12:04:43 2011
Errors in file /ora101/diag/rdbms/ora11/ora11/trace/ora11_ora_846.trc:
ORA-01031: insufficient privileges

异常断电导致current redo损坏处理

联系：手机/微信(+86 17813235971) QQ(107644445)

标题：异常断电导致current redo损坏处理

一、数据库启动异常
网友因异常断电导致数据库不能正常启动，处理过程记录
1、sqlplus中提示

C:\Documents and Settings\Administrator>sqlplus "/as sysdba"
SQL*Plus: Release 11.1.0.6.0 - Production on 星期三 12月 7 12:57:02 2011
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
已连接到空闲例程。
SQL> startup
ORACLE 例程已经启动。
Total System Global Area 1581916160 bytes
Fixed Size                  1336060 bytes
Variable Size             964693252 bytes
Database Buffers          603979776 bytes
Redo Buffers               11907072 bytes
数据库装载完毕。
ORA-00368: 重做日志块中的校验和错误
ORA-00353: 日志损坏接近块 12014 更改 9743799889 时间 12/05/2011 09:21:11
ORA-00312: 联机日志 3 线程 1: 'R:\ORADATA\HZYL\REDO03.LOG'

2、alert日志文件

ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 7 processes
Started redo scan
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_2960.trc  (incident=214262):
ORA-00353: 日志损坏接近块 12014 更改 9743799889 时间 12/05/2011 09:21:11
ORA-00312: 联机日志 3 线程 1: 'R:\ORADATA\HZYL\REDO03.LOG'
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_214262\hzyl_ora_2960_i214262.trc
Aborting crash recovery due to error 368
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_2960.trc:
ORA-00368: 重做日志块中的校验和错误
ORA-00353: 日志损坏接近块 12014 更改 9743799889 时间 12/05/2011 09:21:11
ORA-00312: 联机日志 3 线程 1: 'R:\ORADATA\HZYL\REDO03.LOG'
ORA-368 signalled during: ALTER DATABASE OPEN...
Wed Dec 07 12:57:17 2011
Trace dumping is performing id=[cdmp_20111207125717]
Wed Dec 07 12:57:17 2011
Sweep Incident[214262]: completed
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_m000_2400.trc  (incident=214326):
ORA-00353: log corruption near block 12014 change 9743799889 time 12/05/2011 09:21:11
ORA-00312: online log 3 thread 1: 'R:\ORADATA\HZYL\REDO03.LOG'
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_214326\hzyl_m000_2400_i214326.trc
Trace dumping is performing id=[cdmp_20111207125718]

二、查询是损坏日志组

SQL> col member for a35
SQL> select a.group#,a.status,b.member from v$log a,v$logfile b
   2 where a.group#=b.group#;
    GROUP# STATUS           MEMBER
---------- ---------------- -----------------------------------
         3 CURRENT          R:\ORADATA\HZYL\REDO03.LOG
         2 INACTIVE         R:\ORADATA\HZYL\REDO02.LOG
         1 INACTIVE         R:\ORADATA\HZYL\REDO01.LOG

确定是因为断电导致当前redo 损坏，需要使用隐含参数+不完全恢复处理(经验)

三、不完全恢复

SQL> recover database until cancel;
ORA-00279: 更改 9743796718 (在 12/05/2011 09:07:40 生成) 对于线程 1 是必需的
ORA-00289: 建议: S:\ORAFLASH\HZYL\ARCHIVELOG\2011_12_07\O1_MF_1_15114_%U_.ARC
ORA-00280: 更改 9743796718 (用于线程 1) 在序列 #15114 中
指定日志: {<RET>=suggested | filename | AUTO | CANCEL}
auto
ORA-00308: 无法打开归档日志
'S:\ORAFLASH\HZYL\ARCHIVELOG\2011_12_07\O1_MF_1_15114_%U_.ARC'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。
ORA-00308: 无法打开归档日志
'S:\ORAFLASH\HZYL\ARCHIVELOG\2011_12_07\O1_MF_1_15114_%U_.ARC'
ORA-27041: 无法打开文件
OSD-04002: 无法打开文件
O/S-Error: (OS 2) 系统找不到指定的文件。
ORA-01547: 警告: RECOVER 成功但 OPEN RESETLOGS 将出现如下错误
ORA-01194: 文件 1 需要更多的恢复来保持一致性
ORA-01110: 数据文件 1: 'R:\ORADATA\HZYL\SYSTEM01.DBF'

四、启用隐含参数，尝试打开数据库

SQL> create pfile='d:/1.txt' from spfile;
文件已创建。
#############添加隐含参数##################
*._allow_resetlogs_corruption=TRUE
*._allow_error_simulation=TRUE
##########################################
SQL> shutdown immediate
ORA-01109: 数据库未打开
已经卸载数据库。
ORACLE 例程已经关闭。
SQL> startup pfile ='d:\1.txt';
ORACLE 例程已经启动。
Total System Global Area 1581916160 bytes
Fixed Size                  1336060 bytes
Variable Size             964693252 bytes
Database Buffers          603979776 bytes
Redo Buffers               11907072 bytes
数据库装载完毕。
ORA-01589: 要打开数据库则必须使用 RESETLOGS 或 NORESETLOGS 选项
SQL> alter database open resetlogs;
alter database open resetlogs
*
第 1 行出现错误:
ORA-00603: ORACLE 服务器会话因致命错误而终止
进程 ID: 3388
会话 ID: 335 序列号: 5

五、查看alert日志

Wed Dec 07 13:02:49 2011
SMON: enabling cache recovery
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_3388.trc  (incident=216664):
ORA-00600: 内部错误代码, 参数: [2662], [2], [1153862134], [2], [1153864845], [12582921], [], []
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_216664\hzyl_ora_3388_i216664.trc
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_3388.trc:
ORA-00600: 内部错误代码, 参数: [2662], [2], [1153862134], [2], [1153864845], [12582921], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 3388): terminating the instance due to error 600
Wed Dec 07 13:02:50 2011
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_mmnl_3884.trc  (incident=216640):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_216640\hzyl_mmnl_3884_i216640.trc
Wed Dec 07 13:02:50 2011
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_diag_4020.trc  (incident=216552):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_216552\hzyl_diag_4020_i216552.trc
Wed Dec 07 13:02:50 2011
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_dia0_2896.trc  (incident=216584):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_216584\hzyl_dia0_2896_i216584.trc
Wed Dec 07 13:02:51 2011
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_pmon_2964.trc  (incident=216536):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_216536\hzyl_pmon_2964_i216536.trc
Wed Dec 07 13:02:51 2011
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_mmon_3764.trc  (incident=216632):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_216632\hzyl_mmon_3764_i216632.trc
Wed Dec 07 13:02:51 2011
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_dbrm_2060.trc  (incident=216560):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Wed Dec 07 13:02:51 2011
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_psp0_1476.trc  (incident=216568):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Wed Dec 07 13:02:52 2011
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_mman_3040.trc  (incident=216576):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Wed Dec 07 13:02:52 2011
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_dbw0_2116.trc  (incident=216592):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_lgwr_2624.trc  (incident=216600):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_smon_592.trc  (incident=216616):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ckpt_3052.trc  (incident=216608):
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Instance terminated by USER, pid = 3388
ORA-1092 signalled during: alter database open resetlogs...
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_3388.trc  (incident=216665):
ORA-00600: 内部错误代码, 参数: [], [], [], [], [], [], [], []
ORA-01092: ORACLE 实例终止。强制断开连接
ORA-00600: 内部错误代码, 参数: [2662], [2], [1153862134], [2], [1153864845], [12582921], [], []
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_3388.trc  (incident=216666):
ORA-00600: 内部错误代码, 参数: [], [], [], [], [], [], [], []
ORA-00600: 内部错误代码, 参数: [], [], [], [], [], [], [], []
ORA-01092: ORACLE 实例终止。强制断开连接
ORA-00600: 内部错误代码, 参数: [2662], [2], [1153862134], [2], [1153864845], [12582921], [], []
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_3388.trc:
ORA-00600: 内部错误代码, 参数: [], [], [], [], [], [], [], []
ORA-00600: 内部错误代码, 参数: [], [], [], [], [], [], [], []
ORA-01092: ORACLE 实例终止。强制断开连接
ORA-00600: 内部错误代码, 参数: [2662], [2], [1153862134], [2], [1153864845], [12582921], [], []
Wed Dec 07 13:02:53 2011
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_3388.trc  (incident=218920):
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [2], [1153862134], [2], [1153864845], [12582921], [], []
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_218920\hzyl_ora_3388_i218920.trc
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_218920\hzyl_ora_3388_i218920.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [2], [1153862134], [2], [1153864845], [12582921], [], []

这里看到熟悉的ora-600 [2662]，处理方法增加scn，参考ORA-00600 [2662]

六、增加SCN，打开数据库

SQL> exit
从 Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options 断
开
C:\Documents and Settings\Administrator>sqlplus "/as sysdba"
SQL*Plus: Release 11.1.0.6.0 - Production on 星期三 12月 7 13:04:51 2011
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
已连接到空闲例程。
SQL> startup pfile ='d:\1.txt' mount;
ORACLE 例程已经启动。
Total System Global Area 1581916160 bytes
Fixed Size                  1336060 bytes
Variable Size             964693252 bytes
Database Buffers          603979776 bytes
Redo Buffers               11907072 bytes
数据库装载完毕。
SQL> alter session set events '10015 trace name adjust_scn level 1';
会话已更改。
SQL> alter database open;
数据库已更改。
SQL> select status from v$instance;
STATUS
------------
OPEN

七、启动过程中，alert日志现ORA-00600[2256]

Completed: ALTER DATABASE   MOUNT
Wed Dec 07 13:06:07 2011
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 7 processes
Started redo scan
Completed redo scan
 1 redo blocks read, 0 data blocks need recovery
Started redo application at
 Thread 1: logseq 1, block 2, scn 9743796722
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1 Reading mem 0
  Mem# 0: R:\ORADATA\HZYL\REDO01.LOG
Completed redo application
Completed crash recovery at
 Thread 1: logseq 1, block 3, scn 9743816724
 0 data blocks read, 0 data blocks written, 1 redo blocks read
Wed Dec 07 13:06:08 2011
Thread 1 advanced to log sequence 2
Thread 1 opened at log sequence 2
  Current log# 2 seq# 2 mem# 0: R:\ORADATA\HZYL\REDO02.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Wed Dec 07 13:06:08 2011
SMON: enabling cache recovery
Errors in file d:\app\administrator\diag\rdbms\hzyl\hzyl\trace\hzyl_ora_3096.trc  (incident=219072):
ORA-00600: 内部错误代码, 参数: [2256], [0], [1073741824], [2], [1153882138], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\hzyl\hzyl\incident\incdir_219072\hzyl_ora_3096_i219072.trc
Dictionary check beginning
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Wed Dec 07 13:06:09 2011
Trace dumping is performing id=[cdmp_20111207130609]
Opening with internal Resource Manager plan
Wed Dec 07 13:06:10 2011
Sweep Incident[219072]: completed
Starting background process FBDA
Starting background process SMCO
Wed Dec 07 13:06:10 2011
FBDA started with pid=26, OS id=2436
Wed Dec 07 13:06:10 2011
SMCO started with pid=27, OS id=2876
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Wed Dec 07 13:06:11 2011
QMNC started with pid=28, OS id=2812
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Wed Dec 07 13:06:15 2011
db_recovery_file_dest_size of 10000 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Wed Dec 07 13:06:25 2011
Completed: alter database open

打开过程中，出现ORA-00600[2256]，但是数据库还是正常打开，具体原因待研究

标签归档：ORA-600 2662