ORA-15096: lost disk write detected

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:ORA-15096: lost disk write detected

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

又一例由于存储掉电导致asm磁盘组,由于ORA-15096: lost disk write detected,导致无法mount的恢复请求

SQL> ALTER DISKGROUP DATA MOUNT  /* asm agent *//* {1:45277:148} */
NOTE: cache registered group DATA number=2 incarn=0x73886b6a
NOTE: cache began mount (first) of group DATA number=2 incarn=0x73886b6a
NOTE: Assigning number (2,2) to disk (/dev/asm-data3)
NOTE: Assigning number (2,1) to disk (/dev/asm-data2)
NOTE: Assigning number (2,0) to disk (/dev/asm-data1)
Fri Nov 06 19:06:56 2020
NOTE: GMON heartbeating for grp 2
GMON querying group 2 at 94 for pid 30, osid 11596
NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/asm-data1
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/asm-data2
NOTE: cache opening disk 2 of grp 2: DATA_0002 path:/dev/asm-data3
NOTE: cache mounting (first) external redundancy group 2/0x73886B6A (DATA)
Fri Nov 06 19:06:57 2020
* allocate domain 2, invalid = TRUE
kjbdomatt send to inst 2
Fri Nov 06 19:06:57 2020
NOTE: attached to recovery domain 2
NOTE: starting recovery of thread=1 ckpt=25.7986 group=2 (DATA)
NOTE: starting recovery of thread=2 ckpt=33.364 group=2 (DATA)
NOTE: BWR validation signaled ORA-15096
Errors in file /u01/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_ora_11596.trc:
ORA-15096: lost disk write detected
NOTE: crash recovery signalled OER-15096
ERROR: ORA-15096 signalled during mount of diskgroup DATA
NOTE: cache dismounting (clean) group 2/0x73886B6A (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 11596, image: oracle@db1 (TNS V1-V3)
NOTE: lgwr not being msg'd to dismount
kjbdomdet send to inst 2
detach from dom 2, sending detach message to inst 2
freeing rdom 2
NOTE: detached from domain 2
NOTE: cache dismounted group 2/0x73886B6A (DATA)
NOTE: cache ending mount (fail) of group DATA number=2 incarn=0x73886b6a
NOTE: cache deleting context for group DATA 2/0x73886b6a
GMON dismounting group 2 at 95 for pid 30, osid 11596
NOTE: Disk DATA_0000 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0001 in mode 0x7f marked for de-assignment
NOTE: Disk DATA_0002 in mode 0x7f marked for de-assignment
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15096: lost disk write detected
ERROR: ALTER DISKGROUP DATA MOUNT  /* asm agent *//* {1:45277:148} */

通过判断,通过一系列处理之后,数据库进行了mount操作发现报错ORA-600 2130

Fri Nov 06 17:03:27 2020
ALTER DATABASE RECOVER  database
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 40 slaves
Fri Nov 06 17:03:29 2020
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_pr00_7393.trc  (incident=195869):
ORA-00600: internal error code, arguments: [2130], [2], [1], [2], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/ynhis/ynhis1/incident/incdir_195869/ynhis1_pr00_7393_i195869.trc
Fri Nov 06 17:03:30 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Media Recovery failed with error 600
ORA-10877 signalled during: ALTER DATABASE RECOVER  database  ...

判断redo异常,通过resetlogs打开库,发现报错ORA-00600 2662

Fri Nov 06 18:21:32 2020
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 8670753264
Resetting resetlogs activation ID 306909514 (0x124b114a)
Redo thread 2 enabled by open resetlogs or standby activation
Fri Nov 06 18:21:39 2020
Setting recovery target incarnation to 2
Initializing SCN for created control file
Database SCN compatibility initialized to 3
Warning - High Database SCN: Current SCN value is 8670753267, threshold SCN value is 0
Fri Nov 06 18:21:39 2020
Assigning activation ID 408224320 (0x18550240)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: /orabak/data/group_1.289.954514319
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Nov 06 18:21:40 2020
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_ora_24310.trc  (incident=231847):
ORA-00600: internal error code, arguments: [2662], [2], [80818679], [2], [93545365], [4194545], [], [], [], [], [],[]
Incident details in: /u01/app/oracle/diag/rdbms/ynhis/ynhis1/incident/incdir_231847/ynhis1_ora_24310_i231847.trc
Fri Nov 06 18:21:42 2020
Dumping diagnostic data in directory=[cdmp_20201106182142],requested by(instance=1,osid=24310),summary=[incident=231847]
Fri Nov 06 18:21:43 2020
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_ora_24310.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [2], [80818679], [2], [93545365],[4194545],[],[],[],[],[],[]
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_ora_24310.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [2], [80818679], [2], [93545365],[4194545],[],[],[],[],[],[]
Error 704 happened during db open, shutting down database
USER (ospid: 24310): terminating the instance due to error 704
Instance terminated by USER, pid = 24310
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (24310) as a result of ORA-1092

处理该错误之后,数据库resetlog之后,数据库open成功但是报错ORA-00600 4137

Database Characterset is ZHS16GBK
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_smon_26195.trc  (incident=255799):
ORA-00600: internal error code, arguments: [4137], [25.33.122556], [0], [0], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/ynhis/ynhis1/incident/incdir_255799/ynhis1_smon_26195_i255799.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
No Resource Manager plan active
Fri Nov 06 18:30:46 2020
replication_dependency_tracking turned off (no async multimaster replication found)
ORACLE Instance ynhis1 (pid = 23) - Error 600 encountered while recovering transaction (25, 33).
Errors in file /u01/app/oracle/diag/rdbms/ynhis/ynhis1/trace/ynhis1_smon_26195.trc:
ORA-00600: internal error code, arguments: [4137], [25.33.122556], [0], [0], [], [], [], [], [], [], [], []

对异常undo进行处理,数据库可以正常启动关闭,然后安排数据导出导入新库操作,恢复完成.

系统故障oracle数据库恢复

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:系统故障oracle数据库恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

由于系统故障,导致操作系统进入,客户通过其他方式进入系统拷贝出来数据文件,redo,ctl等文件,安装版本相同的数据库,修改相关路径,启动数据库,但是启动报错,让我们给予技术支持.数据库open报ORA-600 2662错误

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [2], [2313731576], [2],
[2313735660], [12583040], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [2662], [2], [2313731575], [2],
[2313735660], [12583040], [], [], [], [], [], []
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [2], [2313731573], [2],
[2313735660], [12583040], [], [], [], [], [], []
Process ID: 22446
Session ID: 577 Serial number: 1

alert日志报错

Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /home/oracle/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_22446.trc:
ORA-00600: internal error code, arguments:[2662],[2],[2313731573],[2],[2313735660],[12583040],[],[],[],[],[],[]
Errors in file /home/oracle/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_22446.trc:
ORA-00600: internal error code, arguments:[2662],[2],[2313731573],[2],[2313735660],[12583040],[],[],[],[],[],[]
Error 600 happened during db open, shutting down database
USER (ospid: 22446): terminating the instance due to error 600

这个错误比较常见,特别是使用了_allow_resetlogs_corruption屏蔽一致性强制拉库的时候.解决该问题比较简单,修改数据库scn,然后open数据库成功,参考部分案例_allow_resetlogs_corruption

[oracle@localhost ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Sat Jun 20 08:31:55 2020

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup mount pfile='/tmp/pfile';
ORACLE instance started.

Total System Global Area 4275781632 bytes
Fixed Size                  2235208 bytes
Variable Size            2902459576 bytes
Database Buffers         1325400064 bytes
Redo Buffers               45686784 bytes
Database mounted.
SQL> alter database open;

Database altered.

尝试导出数据

[oracle@localhost ~]$ tail -f nohup.out 
. exporting foreign function library names for user XIFENFEI 
. exporting PUBLIC type synonyms
. exporting private type synonyms
. exporting object type definitions for user XIFENFEI 
About to export LIOVBJ2017's objects ...
. exporting database links
. exporting sequence numbers
. exporting cluster definitions
. about to export LIOVBJ2017's tables via Conventional Path ...
. . exporting table                           ABCD
                                                            1 rows exported
. . exporting table               TB_D_RECORD
EXP-00008: ORACLE error 1578 encountered
ORA-01578: ORACLE data block corrupted (file # 1, block # 290344)
ORA-01110: data file 1: '/home/oracle/app/oracle/oradata/orcl/system01.dbf'
. . exporting table                      TB_DRIVER
EXP-00008: ORACLE error 1578 encountered
ORA-01578: ORACLE data block corrupted (file # 1, block # 290344)
ORA-01110: data file 1: '/home/oracle/app/oracle/oradata/orcl/system01.dbf'
. . exporting table               TB_XFF
EXP-00008: ORACLE error 1578 encountered
ORA-01578: ORACLE data block corrupted (file # 1, block # 290344)
ORA-01110: data file 1: '/home/oracle/app/oracle/oradata/orcl/system01.dbf'
. . exporting table              TB_XFF_TM_REL
EXP-00008: ORACLE error 1578 encountered
ORA-01578: ORACLE data block corrupted (file # 1, block # 290344)
ORA-01110: data file 1: '/home/oracle/app/oracle/oradata/orcl/system01.dbf'
. . exporting table                    TB_LOCATION

由于file # 1, block # 290344坏块导致数据无法导出,通过dbv检查数据文件

[oracle@localhost trace]$ dbv file=/home/oracle/app/oracle/oradata/orcl/system01.dbf

DBVERIFY: Release 11.2.0.3.0 - Production on Sat Jun 20 08:43:49 2020

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /home/oracle/app/oracle/oradata/orcl/system01.dbf
Page 290344 is influx - most likely media corrupt
Corrupt block relative dba: 0x00446e28 (file 1, block 290344)
Fractured block found during dbv: 
Data in bad block:
 type: 6 format: 2 rdba: 0x00446e28
 last change scn: 0x0002.89e2b718 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x61980601
 check value in block header: 0xe118
 computed block checksum: 0xd680



DBVERIFY - Verification complete

Total Pages Examined         : 298240
Total Pages Processed (Data) : 257035
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 13457
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 3598
Total Pages Processed (Seg)  : 1
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 24149
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 1
Total Pages Encrypted        : 0
Highest block SCN            : 3221247831 (2.3221247831)

确认只有一个坏块,尝试通过bbed进行坏块修复

BBED> set blocksize 8192
        BLOCKSIZE       8192

BBED> set block 290344
        BLOCK#          290344

BBED> map
 File: /home/oracle/app/oracle/oradata/orcl/system01.dbf (0)
 Block: 290344                                Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Index Leaf)

 struct kcbh, 20 bytes                      @0       

 struct ktbbh, 72 bytes                     @20      

 struct kdxle, 32 bytes                     @92      

 sb2 kd_off[231]                            @124     

 ub1 freespace[3026]                        @586     

 ub1 rowdata[4508]                          @3612    

 ub4 tailchk                                @8188    


BBED> verify
DBVERIFY - Verification starting
FILE = /home/oracle/app/oracle/oradata/orcl/system01.dbf
BLOCK = 290344

Block 290344 is corrupt
Corrupt block relative dba: 0x00446e28 (file 0, block 290344)
Fractured block found during verification
Data in bad block:
 type: 6 format: 2 rdba: 0x00446e28
 last change scn: 0x0002.89e2b718 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x61980601
 check value in block header: 0xe118
 computed block checksum: 0xd680


DBVERIFY - Verification complete

Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED


BBED> set mode edit
        MODE            Edit

BBED> p kcbh
struct kcbh, 20 bytes                       @0       
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x00446e28
   ub4 bas_kcbh                             @8        0x89e2b718
   ub2 wrp_kcbh                             @12       0x0002
   ub1 seq_kcbh                             @14       0x01
   ub1 flg_kcbh                             @15       0x06 (KCBHFDLC, KCBHFCKV)
   ub2 chkval_kcbh                          @16       0xe118
   ub2 spare3_kcbh                          @18       0x0000

BBED> p tailchk
ub4 tailchk                                 @8188     0x61980601

BBED> d /v
 File: /home/oracle/app/oracle/oradata/orcl/system01.dbf (0)
 Block: 290344  Offsets: 8188 to 8191  Dba:0x00000000
-------------------------------------------------------
 01069861                            l ...a

 <16 bytes per line>

BBED> m /x 010618b7
 File: /home/oracle/app/oracle/oradata/orcl/system01.dbf (0)
 Block: 290344           Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 010618b7 

 <32 bytes per line>

BBED> sum apply
Check value for File 0, Block 290344:
current = 0xe118, required = 0xe118

BBED> verify 
DBVERIFY - Verification starting
FILE = /home/oracle/app/oracle/oradata/orcl/system01.dbf
BLOCK = 290344


DBVERIFY - Verification complete

Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 1
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED

继续尝试导出数据,遭遇ORA-08103,参考相关文章:
模拟普通ORA-08103并解决
模拟极端ORA-08103并解决
数据库启动ORA-08103故障恢复

EXP-00056: ORACLE error 8103 encountered
ORA-08103: object no longer exists

通过对其进行处理,恢复该记录之外的所有记录,客户创建新库导入数据,数据库恢复基本完成

在数据库恢复遭遇ORA-07445 kgegpa错误

联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

标题:在数据库恢复遭遇ORA-07445 kgegpa错误

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到客户恢复请求,数据库启动报ORA-600 2662错误

Fri Apr 24 19:52:58 2020
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 15491509441794
Resetting resetlogs activation ID 1460987657 (0x5714e709)
Fri Apr 24 19:52:59 2020
Setting recovery target incarnation to 3
Fri Apr 24 19:52:59 2020
Assigning activation ID 1566342598 (0x5d5c7dc6)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: Y:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Apr 24 19:52:59 2020
SMON: enabling cache recovery
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_3860.trc  (incident=8561):
ORA-00600: 内部错误代码, 参数: [2662], [3606], [3857372426], [3606], [3857377059], [12583040], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_8561\orcl_ora_3860_i8561.trc
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_3860.trc:
ORA-00600: 内部错误代码, 参数: [2662], [3606], [3857372426], [3606], [3857377059], [12583040], [], [], [], [], [], []
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_ora_3860.trc:
ORA-00600: 内部错误代码, 参数: [2662], [3606], [3857372426], [3606], [3857377059], [12583040], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER (ospid: 3860): terminating the instance due to error 600
Instance terminated by USER, pid = 3860
ORA-1092 signalled during: alter database open resetlogs...

这个错误比较常见,通过对数据库scn进行调整,顺利规避该错误,继续启动报如下错误

SQL> startup mount pfile='d:/pfile.txt';
ORACLE 例程已经启动。

Total System Global Area 1.3696E+10 bytes
Fixed Size                  2188768 bytes
Variable Size            6878661152 bytes
Database Buffers         6777995264 bytes
Redo Buffers               37044224 bytes
数据库装载完毕。
SQL> alter database open;
alter database open
*
第 1 行出现错误:
ORA-03113: 通信通道的文件结尾
进程 ID: 5884
会话 ID: 66 序列号: 3
Fri Apr 24 20:57:49 2020
SMON: enabling cache recovery
Successfully onlined Undo Tablespace 2.
Dictionary check beginning
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
No Resource Manager plan active
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x898ADE43] [PC:0x9287D88, kgegpa()+38]
Dump file d:\app\administrator\diag\rdbms\orcl\orcl\trace\alert_orcl.log
Fri Apr 24 20:57:49 2020
ORACLE V11.2.0.1.0 - 64bit Production vsnsta=0
vsnsql=16 vsnxtr=3
Windows NT Version V6.1  
CPU                 : 16 - type 8664, 16 Physical Cores
Process Affinity    : 0x0x0000000000000000
Memory (Avail/Total): Ph:21429M/32767M, Ph+PgF:54255M/65533M 
Fri Apr 24 20:57:49 2020
Errors in file 
ORA-07445: caught exception [ACCESS_VIOLATION] at [kgegpa()+38] [0x0000000009287D88]
Fri Apr 24 20:57:52 2020
PMON (ospid: 2496): terminating the instance due to error 397
Instance terminated by PMON, pid = 2496

这里的主要错误是由于ORA-07445 kgegpa,根据以前恢复经验,该问题很可能和undo有关,对undo进行处理之后启动库

SQL> startup mount pfile='d:/pfile.txt' ;
ORACLE 例程已经启动。

Total System Global Area 1.3696E+10 bytes
Fixed Size                  2188768 bytes
Variable Size            6878661152 bytes
Database Buffers         6777995264 bytes
Redo Buffers               37044224 bytes
数据库装载完毕。
SQL> recover database;
完成介质恢复。
SQL> alter database open;

数据库已更改。

SMON: enabling tx recovery
Database Characterset is ZHS16GBK
SMON: Restarting fast_start parallel rollback
Fri Apr 24 21:01:28 2020
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_p000_4360.trc  (incident=13377):
ORA-00600: internal error code, arguments: [4198], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_13377\orcl_p000_4360_i13377.trc
Stopping background process MMNL
Doing block recovery for file 3 block 296
Resuming block recovery (PMON) for file 3 block 296
Block recovery from logseq 3, block 25 to scn 15491947056761
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0
  Mem# 0: Y:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG
Block recovery completed at rba 3.25.16, scn 3607.20090
Doing block recovery for file 6 block 165592
Resuming block recovery (PMON) for file 6 block 165592
Block recovery from logseq 3, block 33 to scn 15491947056769
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3 Reading mem 0
  Mem# 0: Y:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO03.LOG
Block recovery completed at rba 3.58.16, scn 3607.20098
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_4912.trc  (incident=13321):
ORA-00600: internal error code, arguments: [4198], [], [], [], [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_13321\orcl_smon_4912_i13321.trc
SMON: Parallel transaction recovery slave got internal error
SMON: Downgrading transaction recovery to serial
Stopping background process MMON
Fri Apr 24 21:01:29 2020
Trace dumping is performing id=[cdmp_20200424210129]
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_4912.trc  (incident=13322):
ORA-00600: internal error code, arguments: [4137], [12.30.1712324], [0], [0], [], [], [], [], [], [], [], []
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_13322\orcl_smon_4912_i13322.trc
ORACLE Instance orcl (pid = 14) - Error 600 encountered while recovering transaction (12, 30).
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_smon_4912.trc:
ORA-00600: internal error code, arguments: [4137], [12.30.1712324], [0], [0], [], [], [], [], [], [], [], []
Completed: alter database open upgrade
Fri Apr 24 21:01:30 2020
MMON started with pid=16, OS id=4980 
Fri Apr 24 21:01:31 2020
Sweep [inc][13322]: completed
Corrupt block relative dba: 0x00c395ee (file 3, block 234990)
Fractured block found during buffer read
Data in bad block:
 type: 2 format: 2 rdba: 0x00c395ee
 last change scn: 0x0e16.e5ead38b seq: 0x2b flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xdb720232
 check value in block header: 0xebe2
 computed block checksum: 0xb60b
Reading datafile'Y:\APP\ADMINISTRATOR\ORADATA\ORCL\UNDOTBS01.DBF'for corruption at rdba: 0x00c395ee (file 3,block 234990)
Reread (file 3, block 234990) found same corrupt data
Corrupt Block Found
         TSN = 2, TSNAME = UNDOTBS1
         RFN = 3, BLK = 234990, RDBA = 12817902
         OBJN = 0, OBJD = -1, OBJECT = , SUBOBJECT = 
         SEGMENT OWNER = , SEGMENT TYPE = 
Errors in file d:\app\administrator\diag\rdbms\orcl\orcl\trace\orcl_m001_4852.trc  (incident=13641):
ORA-01578: ORACLE data block corrupted (file # 3, block # 234990)
ORA-01110: data file 3: 'Y:\APP\ADMINISTRATOR\ORADATA\ORCL\UNDOTBS01.DBF'
Incident details in: d:\app\administrator\diag\rdbms\orcl\orcl\incident\incdir_13641\orcl_m001_4852_i13641.trc
SQL> create undo tablespace undotbs2 datafile 
2   'Y:\APP\ADMINISTRATOR\ORADATA\ORCL\undo_xff02.dbf' size 128M autoextend on;

表空间已创建。

SQL> drop tablespace undotbs1 including contents and datafiles;

表空间已删除。

SQL> shutdown immediate;
数据库已经关闭。
已经卸载数据库。
ORACLE 例程已经关闭。
SQL> create spfile from pfile='d:/pfile.txt';

文件已创建。

SQL> startup mount
ORACLE 例程已经启动。

Total System Global Area 1.3696E+10 bytes
Fixed Size                  2188768 bytes
Variable Size            6878661152 bytes
Database Buffers         6777995264 bytes
Redo Buffers               37044224 bytes
数据库装载完毕。
SQL> alter database open;

数据库已更改。

数据库启动之后继续报出来的ORA-600 4198和ORA-600 4137以及undo坏块均证明是由于undo异常引起的问题,通过重建新undo,数据库open正常,安排客户进行数据导出导入到新库

正常open库报ORA-600 2662,ORA-600 4097错误

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:正常open库报ORA-600 2662,ORA-600 4097错误

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

又一起数据库正常open之后继续报ORA-600 2662

Sat Dec 29 04:30:40 2018
QMNC started with pid=36, OS id=12985
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Completed: alter database open
Sat Dec 29 04:30:42 2018
Starting background process CJQ0
Sat Dec 29 04:30:42 2018
CJQ0 started with pid=41, OS id=12999
Sat Dec 29 04:30:43 2018
Errors in file /home/app/diag/rdbms/orcl/orcl/trace/orcl_m000_12989.trc  (incident=365101):
ORA-00600: internal error code, arguments: [2662], [0], [193646286], [0], [193662077], [50331650], [], [], [], [], [], []
Sat Dec 29 04:30:43 2018
Errors in file /home/app/diag/rdbms/orcl/orcl/trace/orcl_m002_12993.trc  (incident=365117):
ORA-00600: internal error code, arguments: [2662], [0], [193646286], [0], [193662077], [50331650], [], [], [], [], [], []
Incident details in: /home/app/diag/rdbms/orcl/orcl/incident/incdir_365101/orcl_m000_12989_i365101.trc
Incident details in: /home/app/diag/rdbms/orcl/orcl/incident/incdir_365117/orcl_m002_12993_i365117.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /home/app/diag/rdbms/orcl/orcl/trace/orcl_m000_12989.trc  (incident=365102):
ORA-00600: internal error code, arguments: [2662], [0], [193646286], [0], [193662077], [50331650], [], [], [], [], [], []
Incident details in: /home/app/diag/rdbms/orcl/orcl/incident/incdir_365102/orcl_m000_12989_i365102.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /home/app/diag/rdbms/orcl/orcl/trace/orcl_m002_12993.trc  (incident=365118):
ORA-00600: internal error code, arguments: [2662], [0], [193646286], [0], [193662077], [50331650], [], [], [], [], [], []
Incident details in: /home/app/diag/rdbms/orcl/orcl/incident/incdir_365118/orcl_m002_12993_i365118.trc

在open状态处理掉ORA-00600 2662错误之后,又出现了ORA-600 4097 以及ORA-00353 ORA-00312错误,这个也是一种不常见的现象,一般都是数据库open之前出现类似错误,但是这里在open之后出现(以前类似处理参考:ORA-600 4097故障处理ORA-00322 ORA-00312 ,ORA-00354 ORA-00353 ORA-00312异常处理

Sat Dec 29 05:55:22 2018
Errors in file /home/app/diag/rdbms/orcl/orcl/trace/orcl_j000_19136.trc  (incident=432567):
ORA-00600: internal error code, arguments: [4097], [3], [15], [76480], [], [], [], [], [], [], [], []
Incident details in: /home/app/diag/rdbms/orcl/orcl/incident/incdir_432567/orcl_j000_19136_i432567.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /home/app/diag/rdbms/orcl/orcl/trace/orcl_j000_19136.trc  (incident=432568):
ORA-00353: log corruption near block 15682 change 193647101 time 12/29/2018 04:40:31
ORA-00312: online log 3 thread 1: '/home/app/oradata/orcl/redo03.log'
ORA-00312: online log 3 thread 1: '/home/app/oradata/orcl/redo03.log'
ORA-00600: internal error code, arguments: [4097], [3], [15], [76480], [], [], [], [], [], [], [], []
Incident details in: /home/app/diag/rdbms/orcl/orcl/incident/incdir_432568/orcl_j000_19136_i432568.trc
Errors in file /home/app/diag/rdbms/orcl/orcl/incident/incdir_432567/orcl_j000_19136_i432567.trc:
ORA-00399: corrupt change description in redo log
ORA-00353: log corruption near block 15682 change 193647101 time 12/29/2018 04:40:31
ORA-00312: online log 3 thread 1: '/home/app/oradata/orcl/redo03.log'
ORA-00312: online log 3 thread 1: '/home/app/oradata/orcl/redo03.log'
ORA-00600: internal error code, arguments: [4097], [3], [15], [76480], [], [], [], [], [], [], [], []
Errors in file /home/app/diag/rdbms/orcl/orcl/trace/orcl_j000_19136.trc  (incident=432569):
ORA-00353: log corruption near block 15682 change 193647101 time 12/29/2018 04:40:31
ORA-00334: archived log: '/home/app/oradata/orcl/redo03.log'
ORA-00312: online log 3 thread 1: '/home/app/oradata/orcl/redo03.log'
ORA-00600: internal error code, arguments: [4097], [3], [15], [76480], [], [], [], [], [], [], [], []
Incident details in: /home/app/diag/rdbms/orcl/orcl/incident/incdir_432569/orcl_j000_19136_i432569.trc

处理完成上面报错之后,继续出现比较少见的ORA-600 ktecgsc:kcbz_objdchk错误,具体参考:ORA-00600 [ktecgsc:kcbz_objdchk] on 11.2 (Doc ID 1562473.1)

Sat Dec 29 06:00:17 2018
DBMS_STATS: GATHER_STATS_JOB encountered errors.  Check the trace file.
Errors in file /home/app/diag/rdbms/orcl/orcl/trace/orcl_j001_19436.trc:
ORA-20011: Approximate NDV failed: ORA-08103: object no longer exists
Sat Dec 29 06:00:21 2018
Errors in file /home/app/diag/rdbms/orcl/orcl/trace/orcl_j002_19479.trc  (incident=432631):
ORA-00600: internal error code, arguments: [ktecgsc:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], []
Incident details in: /home/app/diag/rdbms/orcl/orcl/incident/incdir_432631/orcl_j002_19479_i432631.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.

oracle 8.1.6因断电无法启动恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:oracle 8.1.6因断电无法启动恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到一个oralce 8.1.6的数据库恢复请求,由于断电之后,系统无法正常恢复,通过尝试recover datafile报错如下

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle8i Enterprise Edition Release 8.1.6.0.0 - Production
PL/SQL Release 8.1.6.0.0 - Production
CORE	8.1.6.0.0	Production
TNS for 32-bit Windows: Version 8.1.6.0.0 - Production
NLSRTL Version 3.4.1.0.0 - Production
Tue Oct 16 14:17:01 2018
Media Recovery Datafile: 1
Media Recovery Start
Media Recovery Log
Recovery of Online Redo Log: Thread 1 Group 1 Seq 146597 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Tue Oct 16 14:17:02 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA03040.TRC:
ORA-00600: internal error code, arguments: [kcoapl_blkchk], [1], [30547], [6101], [], [], [], []
Tue Oct 16 14:17:03 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA03040.TRC:
ORA-01578: ORACLE data block corrupted (file # 1, block # 30547)
ORA-01110: data file 1: 'D:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF'
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [kcoapl_blkchk], [1], [30547], [6101], [], [], [], []
Tue Oct 16 14:17:03 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA03040.TRC:
ORA-00314: log 2 of thread 1, expected sequence# 146598 doesn't match 146415
ORA-00312: online log 2 thread 1: 'D:\ORACLE\ORADATA\ORCL\REDO02.LOG'

这里可以获取到两个信息:1)system文件可能有坏块,导致数据库recover的时候报ORA-600 kcoapl_blkchk错误,2)数据库的redo可能异常了,由于8i数据库默认redo 10m,在业务繁忙时候切换较为频繁,而文件系统的cache导致redo信息比较老,而数据文件需要redo比较新,从而无法正常恢复成功。比较明显对于当前这样的情况只能是屏蔽数据一致性,强制拉库

ARC0: media recovery disabled
Tue Oct 16 14:17:39 2018
SMON: enabling cache recovery
Tue Oct 16 14:17:39 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA02256.TRC:
ORA-00600: internal error code, arguments: [2662], [1], [1712082681], [1], [1712107587], [8388610], [], []
Tue Oct 16 14:17:41 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA02256.TRC:
ORA-00600: internal error code, arguments: [2662], [1], [1712082682], [1], [1712107587], [8388610], [], []
ORA-00600: internal error code, arguments: [2662], [1], [1712082681], [1], [1712107587], [8388610], [], []
Tue Oct 16 14:17:43 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA02256.TRC:
ORA-00600: internal error code, arguments: [2662], [1], [1712082682], [1], [1712107587], [8388610], [], []
ORA-00600: internal error code, arguments: [2662], [1], [1712082681], [1], [1712107587], [8388610], [], []
Tue Oct 16 14:17:45 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA02256.TRC:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [2662], [1], [1712082682], [1], [1712107587], [8388610], [], []
ORA-00600: internal error code, arguments: [2662], [1], [1712082681], [1], [1712107587], [8388610], [], []

ORA-600 2662这个错误比较常见,直接推数据库scn,启动库

Tue Oct 16 16:09:11 2018
Errors in file D:\Oracle\admin\orcl\bdump\orclSMON.TRC:
ORA-00600: internal error code, arguments: [4193], [29469], [29477], [], [], [], [], []
Recovery of Online Redo Log: Thread 1 Group 3 Seq 1 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO03.LOG
Tue Oct 16 16:09:12 2018
Errors in file D:\Oracle\admin\orcl\bdump\orclSMON.TRC:
ORA-01595: error freeing extent (3) of rollback segment (2))
ORA-00600: internal error code, arguments: [4193], [29469], [29477], [], [], [], [], []
Tue Oct 16 16:09:13 2018
Completed: ALTER DATABASE OPEN
Tue Oct 16 16:12:06 2018
CREATE ROLLBACK SEGMENT nrbs1 TABLESPACE rbs
Tue Oct 16 16:12:06 2018
Errors in file D:\Oracle\admin\orcl\udump\ORA02252.TRC:
ORA-00600: internal error code, arguments: [4194], [79], [38], [], [], [], [], []

错误比较明显ORA-600 4194,而且已经告知是由于rollback segment 2异常,通过屏蔽回滚段,open数据库,删除老回滚段,创建新回滚段(8i无undo自动管理)

SQL> startup
ORACLE instance started.
Total System Global Area 1549432076 bytes
Fixed Size                    70924 bytes
Variable Size             500707328 bytes
Database Buffers         1048576000 bytes
Redo Buffers                  77824 bytes
Database mounted.
Database opened.
SQL> drop rollback segment "RBS1";
Rollback segment dropped.
SQL> drop rollback segment "RBS2";
Rollback segment dropped.
SQL> drop rollback segment "RBS3";
Rollback segment dropped.
SQL> drop rollback segment "RBS4";
Rollback segment dropped.
SQL> drop rollback segment "RBS5";
Rollback segment dropped.
SQL> drop rollback segment "RBS6";
Rollback segment dropped.
SQL> CREATE ROLLBACK SEGMENT nrbs1 TABLESPACE rbs;
Rollback segment created.
SQL> CREATE ROLLBACK SEGMENT nrbs2 TABLESPACE rbs;
Rollback segment created.
SQL> CREATE ROLLBACK SEGMENT nrbs3 TABLESPACE rbs;
Rollback segment created.
SQL> CREATE ROLLBACK SEGMENT nrbs4 TABLESPACE rbs;
Rollback segment created.
SQL> CREATE ROLLBACK SEGMENT nrbs5 TABLESPACE rbs;
Rollback segment created.
SQL> CREATE ROLLBACK SEGMENT nrbs6 TABLESPACE rbs;
Rollback segment created.
SQL> CREATE ROLLBACK SEGMENT nrbs7 TABLESPACE rbs;
Rollback segment created.

客户安排导出导入,至此该库恢复完成

记录正常open库报ORA-600 2662

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:记录正常open库报ORA-600 2662

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库版本10.2.0.3 32位

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Prod
PL/SQL Release 10.2.0.3.0 - Production
CORE    10.2.0.3.0      Production
TNS for 32-bit Windows: Version 10.2.0.3.0 - Production
NLSRTL Version 10.2.0.3.0 - Production

数据库启动alert日志报大量ORA-600和ORA-07445错误

Tue Jul 31 09:56:45 2018
Started redo application at
 Thread 1: logseq 7593, block 46691
Tue Jul 31 09:56:45 2018
Recovery of Online Redo Log: Thread 1 Group 2 Seq 7593 Reading mem 0
  Mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO02.LOG
Tue Jul 31 09:56:45 2018
RECOVERY OF THREAD 1 STUCK AT BLOCK 779 OF FILE 2
Tue Jul 31 09:56:45 2018
RECOVERY OF THREAD 1 STUCK AT BLOCK 4430 OF FILE 2
Tue Jul 31 09:56:45 2018
Hex dump of (file 3, block 23704) in trace file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p002_248.trc
Corrupt block relative dba: 0x00c05c98 (file 3, block 23704)
Fractured block found during crash/instance recovery
Data in bad block:
 type: 6 format: 2 rdba: 0x00c05c98
 last change scn: 0x0000.05c3cad2 seq: 0x16 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xe4be0601
 check value in block header: 0x5c03
 computed block checksum: 0x2e7b
Tue Jul 31 09:56:45 2018
Completed redo application
Tue Jul 31 09:56:46 2018
Reread of rdba: 0x00c05c98 (file 3, block 23704) found same corrupted data
RECOVERY OF THREAD 1 STUCK AT BLOCK 40841 OF FILE 8
Tue Jul 31 09:56:46 2018
RECOVERY OF THREAD 1 STUCK AT BLOCK 297 OF FILE 2
Tue Jul 31 09:56:46 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p002_248.trc:
ORA-00600: internal error code, arguments: [kssadpm1], [], [], [], [], [], [], []
Tue Jul 31 09:56:46 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p001_2320.trc:
ORA-00600: internal error code, arguments: [545], [0xCA15AE84], [3], [16], [], [], [], []
ORA-00600: internal error code, arguments: [545], [0xCA15AE84], [3], [16], [], [], [], []
ORA-00600: internal error code, arguments: [545], [0xCA15AE84], [3], [16], [], [], [], []
ORA-00600: internal error code, arguments: [545], [0xCA15AE84], [3], [8], [], [], [], []
Tue Jul 31 09:56:46 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p004_2116.trc:
ORA-00600: internal error code, arguments: [kssdmc: null so], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kssdmc: null so], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kssdmc: null so], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kssdmc: null so], [], [], [], [], [], [], []
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [unable_to_trans_pc] [PC:0x7C3429C1] [ADDR:0xCA800000] [UNABLE_TO_READ] []
Tue Jul 31 09:56:46 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p000_220.trc:
ORA-00600: internal error code, arguments: [ksfdchkfobrerr1], [0xBB9852DC], [0xA7EC530C], [], [], [], [], []
Tue Jul 31 09:56:46 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p003_252.trc:
ORA-00600: internal error code, arguments: [kssrc_test_cleanup:popall], [0xCA000304], [], [], [], [], [], []
Tue Jul 31 09:56:46 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p001_228.trc:
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [_kghalf+895] [PC:0x603BEBF3] [ADDR:0xFFFFFFF8] [UNABLE_TO_READ] []
Tue Jul 31 09:56:46 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_1848.trc:
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [_ksarcv+243] [PC:0x5B2223] [ADDR:0x206C000C] [UNABLE_TO_READ] []
Tue Jul 31 09:56:47 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p000_220.trc:
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [_kslwlmod+166] [PC:0x46946A] [ADDR:0x54F8] [UNABLE_TO_WRITE] []
ORA-00081: address range [0x75C80047, 0x75C8004B) is not readable
ORA-00600: internal error code, arguments: [ksfdchkfobrerr1], [0xBB9852DC], [0xA7EC530C], [], [], [], [], []
Tue Jul 31 09:56:47 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_ckpt_1852.trc:
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [_kcbs_dump_adv_state+471] [PC:0x59B403] [ADDR:0xCC60CBAD] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kssrc_test_cleanup:popall], [0xCA000304], [], [], [], [], [], []
Tue Jul 31 09:56:47 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_pmon_1828.trc:
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [_kews_idle_wait+378] [PC:0x60BE2E] [ADDR:0x1820D468] [UNABLE_TO_WRITE] []
Tue Jul 31 09:56:47 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_p004_236.trc:
ORA-00081: address range [0x75C80041, 0x75C80045) is not readable
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [unable_to_trans_pc] [PC:0x7C3429C1] [ADDR:0xCA800000] [UNABLE_TO_READ] []

检查主要坏块
主要影响数据库恢复的坏块,system不言而喻,block 2也比较敏感

E:\>dbv file=E:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\TTTS.DAT
DBVERIFY: Release 10.2.0.3.0 - Production on 星期三 8月 1 00:24:52 2018
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
DBVERIFY - 开始验证: FILE = E:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\TTTS.DAT
页 2 流入 - 很可能是介质损坏
Corrupt block relative dba: 0x02000002 (file 8, block 2)
Fractured block found during dbv:
Data in bad block:
 type: 29 format: 2 rdba: 0x02000002
 last change scn: 0x0000.05c45a54 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9e601d02
 check value in block header: 0xea86
 computed block checksum: 0xc434
DBVERIFY - 验证完成
检查的页总数: 121600
处理的页总数 (数据): 58606
失败的页总数 (数据): 0
处理的页总数 (索引): 45192
失败的页总数 (索引): 0
处理的页总数 (其它): 4453
处理的总页数 (段)  : 0
失败的总页数 (段)  : 0
空的页总数: 13348
标记为损坏的总页数: 1
流入的页总数: 1
最高块 SCN            : 96716847 (0.96716847)
E:\>dbv FILE = E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\SYSTEM01.DBF
DBVERIFY: Release 10.2.0.3.0 - Production on 星期三 8月 1 00:28:28 2018
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
DBVERIFY - 开始验证: FILE = E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\SYSTEM01.DBF
页 68377 流入 - 很可能是介质损坏
Corrupt block relative dba: 0x00410b19 (file 1, block 68377)
Fractured block found during dbv:
Data in bad block:
 type: 6 format: 2 rdba: 0x00410b19
 last change scn: 0x0000.05c45bdf seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4f3b0601
 check value in block header: 0x465b
 computed block checksum: 0x11c7
DBV-00200: 块 dba 4262777 已标记为损坏
DBVERIFY - 验证完成
检查的页总数: 74240
处理的页总数 (数据): 46209
失败的页总数 (数据): 0
处理的页总数 (索引): 9729
失败的页总数 (索引): 0
处理的页总数 (其它): 1923
处理的总页数 (段)  : 0
失败的总页数 (段)  : 0
空的页总数: 16378
标记为损坏的总页数: 2
流入的页总数: 1
最高块 SCN            : 96757289 (0.96757289)

尝试recover数据库,open数据库

E:\>sqlplus / as sysdba
SQL*Plus: Release 10.2.0.3.0 - Production on 星期二 7月 31 15:40:05 2018
Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.
连接到:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - Production
With the Partitioning, OLAP and Data Mining options
SQL> recover database;
完成介质恢复。
SQL> alter database open;
数据库已更改。

数据库报ORA-600 2662错误
该数据库没有增加隐含隐含参数(屏蔽一致性,屏蔽事务),数据库直接启动之后报ORA-600 2662

Tue Jul 31 15:40:15 2018
SMON: enabling cache recovery
Tue Jul 31 15:40:16 2018
Successfully onlined Undo Tablespace 1.
Tue Jul 31 15:40:16 2018
SMON: enabling tx recovery
Tue Jul 31 15:40:16 2018
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=32, OS id=1152
Tue Jul 31 15:40:17 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_1288.trc:
ORA-00600: internal error code, arguments: [2662], [0], [96736891], [0], [96752794], [4264138], [], []
Tue Jul 31 15:40:18 2018
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Tue Jul 31 15:40:18 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_1288.trc:
ORA-00600: internal error code, arguments: [2662], [0], [96736892], [0], [96752794], [4264138], [], []
Non-fatal internal error happenned while SMON was doing extent coalescing.
SMON encountered 3 out of maximum 100 non-fatal internal errors.
Tue Jul 31 15:40:19 2018
db_recovery_file_dest_size of 2048 MB is 0.00% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Hex dump of (file 8, block 2) in trace file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_mmon_1740.trc
Corrupt block relative dba: 0x02000002 (file 8, block 2)
Fractured block found during buffer read
Data in bad block:
 type: 29 format: 2 rdba: 0x02000002
 last change scn: 0x0000.05c45a54 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9e601d02
 check value in block header: 0xea86
 computed block checksum: 0xc434
Reread of rdba: 0x02000002 (file 8, block 2) found same corrupted data
Tue Jul 31 15:40:20 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_1288.trc:
ORA-00600: internal error code, arguments: [2662], [0], [96736902], [0], [96752794], [4264138], [], []
Tue Jul 31 15:40:20 2018
Completed: alter database open
Tue Jul 31 15:40:20 2018
Non-fatal internal error happenned while SMON was doing extent coalescing.
SMON encountered 4 out of maximum 100 non-fatal internal errors.
Tue Jul 31 15:40:21 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_1288.trc:
ORA-00600: internal error code, arguments: [2662], [0], [96736922], [0], [96752794], [4264138], [], []
Non-fatal internal error happenned while SMON was doing extent coalescing.
SMON encountered 5 out of maximum 100 non-fatal internal errors.
Tue Jul 31 15:40:22 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j000_1076.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [96736954], [0], [96754130], [12615382], [], []
Tue Jul 31 15:40:22 2018
DEBUG: Replaying xcb 0xbba2d2e4, pmd 0xab1920fc for failed op 8
Reconstructing  Uhdr 0x800019 for xcb 0xbba2d2e4, pmd 0xab1920fc
Doing block recovery for file 2 block 25
Block recovery from logseq 7594, block 63 to scn 96736910
Tue Jul 31 15:40:22 2018
Recovery of Online Redo Log: Thread 1 Group 3 Seq 7594 Reading mem 0
  Mem# 0: E:\ORACLE\PRODUCT\10.2.0\ORADATA\ORCL\REDO03.LOG
Block recovery completed at rba 7594.82.16, scn 0.96736911
DEBUG: Restoring block headers for xcb 0xbba2d2e4, pmd 0xab1920fc
DEBUG: Finished replay for xcb 0xbba2d2e4, pmd 0xab1920fc for failed op 8
Tue Jul 31 15:40:22 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j000_1076.trc:
ORA-00603: ORACLE 服务器会话因致命错误而终止
ORA-00600: 内部错误代码, 参数: [2662], [0], [96736955], [0], [96754130], [12615382], [], []
ORA-00600: 内部错误代码, 参数: [2662], [0], [96736954], [0], [96754130], [12615382], [], []
Tue Jul 31 15:40:31 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_1288.trc:
ORA-00600: internal error code, arguments: [2662], [0], [96736963], [0], [96752794], [4264138], [], []
Tue Jul 31 15:40:31 2018
Non-fatal internal error happenned while SMON was doing extent coalescing.
SMON encountered 6 out of maximum 100 non-fatal internal errors.
Tue Jul 31 15:40:41 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_smon_1288.trc:
ORA-00600: internal error code, arguments: [2662], [0], [96736967], [0], [96752794], [4264138], [], []
Tue Jul 31 15:41:03 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_pmon_1448.trc:
ORA-00474: SMON process terminated with error
Tue Jul 31 15:41:03 2018
PMON: terminating instance due to error 474
Tue Jul 31 15:41:03 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j004_1344.trc:
ORA-00474: SMON process terminated with error
Tue Jul 31 15:41:03 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j003_2104.trc:
ORA-00474: SMON process terminated with error
Tue Jul 31 15:41:03 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j002_932.trc:
ORA-00474: SMON process terminated with error
Tue Jul 31 15:41:04 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_j001_1680.trc:
ORA-00474: SMON process terminated with error
Tue Jul 31 15:41:04 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_q000_2336.trc:
ORA-00474: SMON process terminated with error
Tue Jul 31 15:41:05 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_lgwr_552.trc:
ORA-00474: SMON process terminated with error
Tue Jul 31 15:41:05 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_dbw0_1420.trc:
ORA-00474: SMON process terminated with error
Tue Jul 31 15:41:05 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_dbw1_2556.trc:
ORA-00474: SMON process terminated with error
Tue Jul 31 15:41:05 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_psp0_1812.trc:
ORA-00474: SMON process terminated with error
Tue Jul 31 15:41:05 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_mman_1924.trc:
ORA-00474: SMON process terminated with error
Tue Jul 31 15:41:05 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_ckpt_2940.trc:
ORA-00474: SMON process terminated with error
Tue Jul 31 15:41:10 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\bdump\orcl_reco_660.trc:
ORA-00474: SMON process terminated with error
Instance terminated by PMON, pid = 1448
Tue Jul 31 15:41:57 2018

这是一个不常见的错误,没有使用隐含参数强制拉库,数据库正常open成功,但是由于system坏块,导致数据库启动之后报ORA-600 2662错误(再次证明2662不一定只有强制拉库发生,正常open的库也有可能,主要取决block scn和datafile scn,如果不是open过程必须要访问的block,那可能在open之后由于访问需要再报出来该错误).这个问题比较简单,因为open成功之后再crash,而且该坏块引起smon报错但是并没有ora-600 4xxx相关错误,因此根据经验,直接在数据库open之后,处理掉system异常报错对象和坏块对象即可.另外block 2 比较特殊,需要对其上面的对象进行处理

Tue Jul 31 15:55:54 2018
Hex dump of (file 8, block 2) in trace file e:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_3664.trc
Corrupt block relative dba: 0x02000002 (file 8, block 2)
Completely zero block found during reading space header
Reread of blocknum=2, file=E:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\TTTS.DAT. found same corrupt data
Reread of blocknum=2, file=E:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\TTTS.DAT. found same corrupt data
Reread of blocknum=2, file=E:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\TTTS.DAT. found same corrupt data
Reread of blocknum=2, file=E:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\TTTS.DAT. found same corrupt data
Reread of blocknum=2, file=E:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\TTTS.DAT. found same corrupt data
Tue Jul 31 15:55:54 2018
Errors in file e:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_3664.trc:
ORA-07445: 出现异常错误: 核心转储 [ACCESS_VIOLATION] [_krbodmpcx+243] [PC:0x2317857] [ADDR:0x8] [UNABLE_TO_READ] []
ORA-19880: 数据文件 E:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\TTTS.DAT 的空间标头已损坏, 块 2 备份已中止

补充ora-600 2662 block记录

          ----------------------------------------
          SO: BB97A97C, type: 24, owner: BB882B10, flag: INIT/-/-/0x00
          (buffer) (CR) PR: BB8029D8 FLG: 0x100000
          class bit: 00000000
          kcbbfbp: [BH: B9FCF268, LINK: BB97A9A0]
          where: kdswh06: kdscgr, why: 0
          BH (B9FCF268) file#: 1 rdba: 0x004110ca (1/69834) class: 1 ba: B9832000
            set: 9 blksize: 8192 bsi: 0 set-flg: 2 pwbcnt: 0
            dbwrid: 0 obj: 573 objn: 575 tsn: 0 afn: 1
            hash: [bb9151ec,bb9151ec] lru: [b9fcf378,b9fcf200]
            ckptq: [NULL] fileq: [NULL] objq: [b9fcf3d0,aa776524]
            use: [bb97a9a0,bb97a9a0] wait: [NULL]
            st: XCURRENT md: SHR tch: 0
            flags:
            LRBA: [0x0.0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
            Using State Objects
              ----------------------------------------
              SO: BB97A97C, type: 24, owner: BB882B10, flag: INIT/-/-/0x00
              (buffer) (CR) PR: BB8029D8 FLG: 0x100000
              class bit: 00000000
              kcbbfbp: [BH: B9FCF268, LINK: BB97A9A0]
              where: kdswh06: kdscgr, why: 0
            buffer tsn: 0 rdba: 0x004110ca (1/69834)
            scn: 0x0000.05c4549a seq: 0x01 flg: 0x06 tail: 0x549a0601
            frmt: 0x02 chkval: 0xc60a type: 0x06=trans data

ORA-600 2662

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-600 2662

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在数据库恢复中,ORA-600 2662我想是很多人都非常熟悉的错误,下文是对于该错误的一些解释
ORA-600 2662解释说明

ERROR:
  Format: ORA-600 [2662] [a] [b] {c} [d] [e]
VERSIONS:
  versions 6.0 to 12.1
DESCRIPTION:
  A data block SCN is ahead of the current SCN.
  The ORA-600 [2662] occurs when an SCN is compared to the dependent SCN
  stored in a UGA variable.
  If the SCN is less than the dependent SCN then we signal the ORA-600 [2662]
  internal error.
ARGUMENTS:
  Arg [a]  Current SCN WRAP
  Arg [b]  Current SCN BASE
  Arg {c}  dependent SCN WRAP
  Arg [d]  dependent SCN BASE
  Arg [e]  Where present this is the DBA where the dependent SCN came from.

出现ORA-600 2662可能的原因

  (1) doing an open resetlogs with _ALLOW_RESETLOGS_CORRUPTION enabled
  (2) a hardware problem, like a faulty controller, resulting in a failed
      write to the control file or the redo logs
  (3) restoring parts of the database from backup and not doing the
      appropriate recovery
  (4) restoring a control file and not doing a RECOVER DATABASE USING BACKUP
      CONTROLFILE
  (5) having _DISABLE_LOGGING set during crash recovery
  (6) problems with the DLM in a parallel server environment
  (7) a bug

ORA-600 2662解决方法

   (1) if the SCNs in the error are very close, attempting a startup several
       times will bump up the dscn every time we open the database even if
       open fails. The database will open when dscn=scn.
   (2)You can bump the SCN either on open or while the database is open
      using Event:ADJUST_SCN
      Be aware that you should rebuild the database if you use this
      option.

ORA-00354 ORA-00353 ORA-00312异常处理

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-00354 ORA-00353 ORA-00312异常处理

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

数据库启动报错
WIN平台oracle 9.2.0.6版本数据库redo log block header损坏,ORA-00354 ORA-00353 ORA-00312错误导致数据库无法启动

SQL >alter database open;
*
ERROR at line 1:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 1892904 change 281470950178815
ORA-00312: online log 3 thread 1: 'D:\ORACLE\ORADATA\ZOYO\REDO03.LOG'
Sun Jan 24 15:44:05 2016
Database mounted in Exclusive Mode.
Completed: alter database mount exclusive
Sun Jan 24 15:44:05 2016
alter database open
Sun Jan 24 15:44:05 2016
Beginning crash recovery of 1 threads
Sun Jan 24 15:44:05 2016
Started redo scan
ORA-354 signalled during: alter database open...
Shutting down instance: further logons disabled
Shutting down instance (immediate)
License high water mark = 3
Sun Jan 24 15:44:32 2016
ALTER DATABASE CLOSE NORMAL
ORA-1109 signalled during: ALTER DATABASE CLOSE NORMAL...

通过分析,确定损坏的redo03为当前redo,无法使用正常方法打开,加上_allow_resetlogs_corruption参数,尝试打开库,依旧失败

数据库报ORA-600 2662错误

Sun Jan 24 16:26:30 2016
SMON: enabling cache recovery
Sun Jan 24 16:26:30 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_640.trc:
ORA-00600: 内部错误代码,参数: [2662], [0], [31563641], [0], [31563654], [4194721], [], []
Sun Jan 24 16:26:31 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_640.trc:
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码,参数: [2662], [0], [31563641], [0], [31563654], [4194721], [], []
Sun Jan 24 16:26:31 2016
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 640
ORA-1092 signalled during: alter database open resetlogs...

ORA 600 2662的错误处理
根据经验,这个错误只需要推scn即可,可以通过bbed,隐含参数,event,oradebug,修改控制文件等方法进行,推scn之后,数据库报熟悉的ORA-00604 ORA-00607 ORA-600 4194错误,以前我们遇到的block大部分是128,这次报异常block为9.实际中跟版本有关系,在ORACLE 9.2.0.6中该错误为file 1 block 9.大部分版本为128

Sun Jan 24 16:29:39 2016
SMON: enabling cache recovery
Sun Jan 24 16:29:39 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_3432.trc:
ORA-00600: 内部错误代码,参数: [4194], [14], [5], [], [], [], [], []
Sun Jan 24 16:29:39 2016
Doing block recovery for fno: 1 blk: 401
Sun Jan 24 16:29:39 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ZOYO\REDO01.LOG
Doing block recovery for fno: 1 blk: 9
Sun Jan 24 16:29:40 2016
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ZOYO\REDO01.LOG
Sun Jan 24 16:29:40 2016
Errors in file d:\oracle\admin\zoyo\udump\zoyo_ora_3432.trc:
ORA-00604: 递归 SQL 层 1 出现错误
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码,参数: [4194], [14], [5], [], [], [], [], []
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 3432

ORA-00604 ORA-00607 ORA-600 4194分析trace文件

*** 2016-01-24 16:29:40.031
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2 Reading mem 0
Block image after block recovery:
buffer tsn: 0 rdba: 0x00400009 (1/9)
scn: 0x0000.01e112e1 seq: 0x01 flg: 0x04 tail: 0x12e10e01
frmt: 0x02 chkval: 0xba76 type: 0x0e=KTU UNDO HEADER W/UNLIMITED EXTENTS
  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      spare2: 0      #extents: 6      #blocks: 47
                  last map  0x00000000  #maps: 0      offset: 4128
      Highwater::  0x00400191  ext#: 4      blk#: 0      ext size: 8
  #blocks in seg. hdr's freelists: 0
  #blocks below: 0
  mapblk  0x00000000  offset: 4
                   Unlocked
     Map Header:: next  0x00000000  #extents: 6    obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x0040000a  length: 7
   0x00400011  length: 8
   0x00400181  length: 8
   0x00400189  length: 8
   0x00400191  length: 8
   0x00400199  length: 8
  TRN CTL:: seq: 0x008e chd: 0x0060 ctl: 0x0024 inc: 0x00000000 nfb: 0x0001
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00400191.008e.04 scn: 0x0000.01ded29c
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00400191.008e.04 ext: 0x4  spc: 0x1c3e
    uba: 0x00000000.002f.21 ext: 0x5  spc: 0x1334
    uba: 0x00000000.002e.37 ext: 0x4  spc: 0x788
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
  TRN TBL::

从这里可以确定undo segment header中的分配block记录有问题,清除ktuxc.fbp.fbp[N].kuba.kdba相关记录,数据库正常打开

    . struct ktuxc  kernel transaction undo xaction table control with 15 members
    . {
    .   struct kscn   scn with 3 members
    .   {
04148     ub4           bas      = 0X9CD2DE01 = 31380124
04152     ub2           wrp      = 0X0000 = 0
04154     cc32          pad      = 0X0000 = 0
    .   }
    .   struct kuba   uba with 4 members
    .   {
04156     kdba          dba      = 0X91014000 = 0x00400191 file 1 block 401
04160     ub2           seq      = 0X8E00 = 142
04162     ub1           rec      = 0X04 = 4
04163     cc16          pad      = 0X00 = 0
    .   }
04164   sb2           flg      = 0X0100 = 1
04166   ub2           seq      = 0X8E00 = 142
04168   sb2           nfb      = 0X0100 = 1
04170   cc32          pad1     = 0X0000 = 0
04172   ub4           inc      = 0X00000000 = 0
04176   sb2           chd      = 0X6000 = 96
04178   sb2           ctl      = 0X2400 = 36
04180   ub2x          mgc      = 0X0280 = 0x8002
04182   ub2           ver      = 0X0100 = 1
04184   ub2           xts      = 0X6800 = 104
04186   cc32          pad2     = 0X0000 = 0
04188   ub4           opt      = 0XFEFFFF7F = 2147483646
    .   ktufb fbp[5] (array with 5 elements)
    .     struct fbp   [0] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04192         kdba          dba      = 0X91014000 = 0x00400191 file 1 block 401
04196         ub2           seq      = 0X8E00 = 142
04198         ub1           rec      = 0X04 = 4
04199         cc16          pad      = 0X00 = 0
    .       }
04200       sb2           ext      = 0X0400 = 4
04202       sb2           spc      = 0X3E1C = 7230
    .     }
    .     struct fbp   [1] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04204         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04208         ub2           seq      = 0X2F00 = 47
04210         ub1           rec      = 0X21 = 33
04211         cc16          pad      = 0X00 = 0
    .       }
04212       sb2           ext      = 0X0500 = 5
04214       sb2           spc      = 0X3413 = 4916
    .     }
    .     struct fbp   [2] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04216         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04220         ub2           seq      = 0X2E00 = 46
04222         ub1           rec      = 0X37 = 55
04223         cc16          pad      = 0X00 = 0
    .       }
04224       sb2           ext      = 0X0400 = 4
04226       sb2           spc      = 0X8807 = 1928
    .     }
    .     struct fbp   [3] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04228         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04232         ub2           seq      = 0X0000 = 0
04234         ub1           rec      = 0X00 = 0
04235         cc16          pad      = 0X00 = 0
    .       }
04236       sb2           ext      = 0X0000 = 0
04238       sb2           spc      = 0X0000 = 0
    .     }
    .     struct fbp   [4] with 3 members
    .     {
    .       struct kuba   uba with 4 members
    .       {
04240         kdba          dba      = 0X00000000 = 0x00000000 file 0 block 0
04244         ub2           seq      = 0X0000 = 0
04246         ub1           rec      = 0X00 = 0
04247         cc16          pad      = 0X00 = 0
    .       }
04248       sb2           ext      = 0X0000 = 0
04250       sb2           spc      = 0X0000 = 0
    .     }
    . }
Sun Jan 24 16:44:52 2016
SMON: enabling tx recovery
Sun Jan 24 16:44:52 2016
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: ALTER DATABASE OPEN

记录一次由于坏块和不恰当恢复引起各种ORA-600案例

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:记录一次由于坏块和不恰当恢复引起各种ORA-600案例

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

朋友让我帮忙处理一个不能open的库,打开alert日志一看,傻眼了,里面是各种ORA-600的错误应有尽有,被折腾的够惨
故障后重启,无法启动主要表现在block坏块,引起的各种ORA-600等错误

Mon Mar 02 16:09:27 2015
ALTER DATABASE OPEN
Beginning crash recovery of 1 threads
 parallel recovery started with 23 processes
Started redo scan
Completed redo scan
 read 962 KB redo, 256 data blocks need recovery
Started redo application at
 Thread 1: logseq 726, block 37343
Recovery of Online Redo Log: Thread 1 Group 3 Seq 726 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/oa/redo03.log
Mon Mar 02 16:09:27 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 1673 OF FILE 3
Completed redo application of 0.27MB
Mon Mar 02 16:09:27 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 3104 OF FILE 3
Mon Mar 02 16:09:27 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 3613 OF FILE 3
Mon Mar 02 16:09:28 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 272 OF FILE 3
Mon Mar 02 16:09:28 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 2512 OF FILE 3
Hex dump of (file 2, block 92889) in trace file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_dbw2_4158.trc
Corrupt block relative dba: 0x00816ad9 (file 2, block 92889)
Bad header found during preparing block for write
Data in bad block:
 type: 0 format: 0 rdba: 0x6ad90000
 last change scn: 0x0000.00c6a052 seq: 0x1 flg: 0x00
 spare1: 0x6 spare2: 0xa2 spare3: 0x5d7e
 consistency value in tail: 0xa0520001
 check value in block header: 0x0
 block checksum disabled
Mon Mar 02 16:09:28 2015
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p007_4196.trc  (incident=3833):
ORA-00600: internal error code, arguments: [4502], [1], [], [], [], [], [], [], [], [], [], []
Mon Mar 02 16:09:28 2015
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p013_4208.trc  (incident=3881):
ORA-00600: internal error code, arguments: [2037], [4259067], [4244307968], [159], [243], [0], [2162032704], [100728832], [], [], [], []
Slave exiting with ORA-1172 exception
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p009_4200.trc:
ORA-01172: recovery of thread 1 stuck at block 3613 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p001_4184.trc:
ORA-01172: recovery of thread 1 stuck at block 2512 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p021_4224.trc:
ORA-10388: parallel query server interrupt (failure)
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p021_4224.trc:
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_dbw2_4158.trc  (incident=3697):
ORA-00600: internal error code, arguments: [kcbzpbuf_1], [4], [1], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oa/oa/incident/incdir_3697/oa_dbw2_4158_i3697.trc
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0xD2DDB7, kcbs_shrink_pool()+705] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_mman_4152.trc  (incident=3673):
ORA-07445: exception encountered: core dump [kcbs_shrink_pool()+705] [SIGSEGV] [ADDR:0x0] [PC:0xD2DDB7] [SI_KERNEL(general_protection)] []
Incident details in: /u01/app/oracle/diag/rdbms/oa/oa/incident/incdir_3673/oa_mman_4152_i3673.trc
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_dbw2_4158.trc:
Mon Mar 02 16:09:34 2015
Instance terminated by DBW2, pid = 4158

第二次重启后增加新错误ORA-00600[17182]

Mon Mar 02 16:39:50 2015
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_p002_4321.trc  (incident=4993):
ORA-00600: internal error code, arguments: [17182], [0x7F548C2BDBA8], [], [], [], [], [], [], [], [], [], []

进行了一些恢复处理后,日志中报错
主要体现在进行了不完全恢复,而且应该是对redo进行了重命名或者redo头损坏锁引起的一系列提示

Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
 read 962 KB redo, 256 data blocks need recovery
Started redo application at
 Thread 1: logseq 726, block 37343
Recovery of Online Redo Log: Thread 1 Group 3 Seq 726 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/oa/redo03.log
RECOVERY OF THREAD 1 STUCK AT BLOCK 1673 OF FILE 3
Aborting crash recovery due to error 1172
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6644.trc:
ORA-01172: recovery of thread 1 stuck at block 1673 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6644.trc:
ORA-01172: recovery of thread 1 stuck at block 1673 of file 3
ORA-01151: use media recovery to recover block, restore backup if needed
ORA-1172 signalled during: alter  database open...
Tue Mar 03 11:17:59 2015
Sweep [inc][17178]: completed
Sweep [inc][17177]: completed
Sweep [inc2][17178]: completed
Tue Mar 03 11:18:00 2015
ALTER DATABASE RECOVER  database until cancel
Media Recovery Start
 started logmerger process
Parallel Media Recovery started with 24 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Tue Mar 03 11:18:06 2015
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_pr00_6701.trc:
ORA-00266: name of archived log file needed
ORA-266 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
ALTER DATABASE RECOVER CANCEL
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_pr00_6701.trc:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/app/oracle/oradata/oa/system01.dbf'
Slave exiting with ORA-1547 exception
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_pr00_6701.trc:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 1 needs more recovery to be consistent
ORA-01110: data file 1: '/u01/app/oracle/oradata/oa/system01.dbf'
ORA-10879 signalled during: ALTER DATABASE RECOVER CANCEL ...
Tue Mar 03 11:18:06 2015
Checker run found 4 new persistent data failures
Tue Mar 03 11:18:13 2015
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 12986989
Resetting resetlogs activation ID 3278679642 (0xc36cae5a)
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6644.trc:
ORA-00367: checksum error in log file header
ORA-00322: log 1 of thread 1 is not current copy
ORA-00312: online log 1 thread 1: '/u01/app/oracle/oradata/oa/redo01.log'
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6644.trc:

再一步折腾,增加了_allow_resetlogs_corruption= TRUE之后数据库报ORA-600[2662]

Tue Mar 03 11:19:26 2015
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6864.trc  (incident=18195):
ORA-00600: internal error code, arguments: [2662], [0], [13007002], [0], [13016626], [4194545], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/oa/oa/incident/incdir_18195/oa_ora_6864_i18195.trc
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6864.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [0], [13007002], [0], [13016626], [4194545], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 6864): terminating the instance due to error 704
Instance terminated by USER, pid = 6864
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (6864) as a result of ORA-1092
Tue Mar 03 11:19:29 2015
ORA-1092 : opitsk aborting process

进一步折腾,可以看出来undo已经被其offline,无法正常访问,导致系统报ORA-704和ORA-00376

Wed Mar 04 21:10:58 2015
SMON: enabling cache recovery
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17074.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/app/oracle/oradata/oa/undotbs01.dbf'
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17074.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/app/oracle/oradata/oa/undotbs01.dbf'
Error 704 happened during db open, shutting down database
USER (ospid: 17074): terminating the instance due to error 704
Instance terminated by USER, pid = 17074
ORA-1092 signalled during: alter database open...
opiodr aborting process unknown ospid (17074) as a result of ORA-1092
Wed Mar 04 21:11:00 2015
ORA-1092 : opitsk aborting process

通过Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)检测结果见附件(xifenfei_db_recover_20150304),这里可以知道undo 不知道怎么折腾的数据文件scn较大而且还offline,
通过一些列方法(bbed,隐含参数等)调整数据库scn,强制启动数据库,报如下错误

Wed Mar 04 22:50:23 2015
SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: 3nkd3g3ju5ph1, SCN: 0x0000.4000003e):
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17807.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17807.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 17807): terminating the instance due to error 704
Instance terminated by USER, pid = 17807
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (17807) as a result of ORA-1092

根据经验,该错误怀疑是文件头scn不够大,块延迟清理导致,进一步增加scn尝试,最后依旧是ORA-00704/ORA-00604/ORA-01555错误

Wed Mar 04 22:50:23 2015
SMON: enabling cache recovery
ORA-01555 caused by SQL statement below (SQL ID: 3nkd3g3ju5ph1, SCN: 0x0000.4000003e):
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17807.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
Errors in file /u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_17807.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
Error 704 happened during db open, shutting down database
USER (ospid: 17807): terminating the instance due to error 704
Instance terminated by USER, pid = 17807
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (17807) as a result of ORA-1092

根据经验,在scn上做手脚估计难以解决给问题,对其启动过程做10046和errorstack分析发现

PARSING IN CURSOR #3 len=202 dep=2 uid=0 oct=3 lid=0 tim=1425481940448439 hv=3819099649 ad='64ff91af8' sqlid='3nkd3g3ju5ph1'
select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null
END OF STMT
PARSE #3:c=1000,e=334,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=4,plh=0,tim=1425481940448439
BINDS #3:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f5b3253a6f0  bln=22  avl=01  flg=05
  value=0
 Bind#1
  oacdty=01 mxl=32(06) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=852 siz=32 off=0
  kxsbbbfp=7f5b3253a6b8  bln=32  avl=06  flg=05
  value="PROPS$"
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=7f5b3253a688  bln=24  avl=02  flg=05
  value=1
EXEC #3:c=0,e=640,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=4,plh=2853959010,tim=1425481940449147
WAIT #3: nam='db file sequential read' ela= 5 file#=1 block#=345 blocks=1 obj#=37 tim=1425481940449186
WAIT #3: nam='db file sequential read' ela= 4 file#=1 block#=44528 blocks=1 obj#=37 tim=1425481940449221
WAIT #3: nam='db file sequential read' ela= 3 file#=1 block#=5505 blocks=1 obj#=37 tim=1425481940449247
*** 2015-03-04 23:12:20.450
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
ORA-00604: error occurred at recursive SQL level 2
ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10_3550978943$" too small
----- Current SQL Statement for this session (sql_id=g64r07v2jn8nq) -----
SELECT NULL FROM PROPS$ WHERE NAME='BOOTSTRAP_UPGRADE_ERROR'

这里可以发现是数据库在启动的过程中需要执行SELECT NULL FROM PROPS$ WHERE NAME=’BOOTSTRAP_UPGRADE_ERROR’语句,而该语句递归调用了select obj#,type#,ctime,mtime,stime, status, dataobj#, flags, oid$, spare1, spare2 from obj$ where owner#=:1 and name=:2 and namespace=:3 and remoteowner is null and linkname is null and subname is null 语句。既然这样通过一些方法避免数据库启动之时查询SELECT NULL FROM PROPS$ WHERE NAME=’BOOTSTRAP_UPGRADE_ERROR’语句,果然数据库启动成功。

知识点补充
ORA-600 [4502] [a]

Arg [a] ITL entry with a lock count
Meaning: During ITL cleanout we clear all row locks but the ITL entry
	 still thinks there is an uncleared lock. Ie: ITL has a locked
	 row but there are no locked rows in the block

大体意思是数据库发现undo 的itl已经被清除,但是block中的itl依然存在,从而出现ORA-600[4502],引起该问题除bug外主要原因是坏块

ORA-600 [2037] [a] [b] {c} [d] [e] [f] [g]

Arg [a] Relative Data Block Address (RDBA) that the redo vector is for
Arg [b] The Block format
Arg {c} RDBA in the block itself
Arg [d] The block type
Arg [e] The sequence number
Arg [f] Flags, if set
Arg [g] The return value from the block head/tail checker.
DESCRIPTION:
  During recovery we are examining a block to ensure that it is not
  corrupt prior to applying any change vectors.
  The block has failed this check and this exception is raised

大体意思是在恢复过程中,正在检查的块,以确保它在应用任何变化向量之前不损坏。如果检查失败排除该异常ORA-600[2037],引起该问题除bug外主要原因是坏块

ORA-600 [kcbzpbuf_1],[a],[b]

Arg [a] Corruption reason
Arg [b] Calculate checksum flag
Corruption reason:
#define KCBH_GOOD    0                                     /* block is valid */
#define KCBH_ZERO    1             /* block header was entirely zero on disk */
#define KCBH_BROKEN  2      /* corruption could be from a partial disk write */
#define KCBH_CHKVAL  3               /* The check value for the block failed */
#define KCBH_CORRUPT 4     /* this is the wrong block or is not a data block */
#define KCBH_ZERONG  5               /* all zero block and it is not allowed */
Calculate checksum flag:
The possible values are 1 (Generate Checksum - db_block_checksum is enabled - default value)
                        0 (do not generate checksum - db_block_checksum=false)

kcbzpbuf_1是该错误的源码函数

ORA-600 [17182] [a] [b] {c} [d] [e]

DESCRIPTION:
  Oracle has detected that the magic number in a memory chunk header has been overwritten.
  This is a heap (in memory) corruption and there is no underlying data corruption.
  The error may occur in the one of the process specific heaps
  (the Call heap, PGA heap, or session heap) or in the shared heap (SGA).

ORACLE 发现在内存中重要的块头被重新,但是没有基础数据损坏,大部分和数据块或者内存损坏有关系.

ORA-600 [4552] [a] [b] {c} [d] [e]

DESCRIPTION:
  This assertion is raised because we are trying to unlock the rows in a
  block, but receive an incorrect block type.
  The second argument is the block type received.

ORACLE尝试对某行进行解锁但是接收到了不正确的数据块类型,Arg [b]是接收到的数据块类型

ORA-600 [2662] [a] [b] {c} [d] [e]

DESCRIPTION:
  A data block SCN is ahead of the current SCN.
  The ORA-600 [2662] occurs when an SCN is compared to the dependent SCN
  stored in a UGA variable.
  If the SCN is less than the dependent SCN then we signal the ORA-600 [2662]
  internal error.
ARGUMENTS:
  Arg [a]  Current SCN WRAP
  Arg [b]  Current SCN BASE
  Arg {c}  dependent SCN WRAP
  Arg [d]  dependent SCN BASE
  Arg [e]  Where present this is the DBA where the dependent SCN came from.

主要的含义就是oracle文件头scn比某个block dependent scn小从而出现该问题

SYSTEM表空间坏块恢复—C_TS#对象坏块恢复(file 1 block 60)

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:SYSTEM表空间坏块恢复—C_TS#对象坏块恢复(file 1 block 60)

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一朋友给我电话,说他们客户公司数据库故障,被另外一家公司恢复了一天不能正常恢复,请求我协助解决.接手一看数据库已经被破坏的不像样子了,根据alert日志信息大概分析了故障原因和上家公司处理情况。后面接手后通过bbed修复block数据库恢复过程,在本次恢复中出现大量ORA-600错误,主要包括ORA-00600 400,ORA-00600 2662,ORA-00600 2663,ORA-00600 krhpfh_03-1209,ORA-00600 3600,ORA-00600 ktsitbs_info1,ORA-00600 4137,ORA-00600 4511,ORA-00600 4198,ORA-00600 6807等
故障原因redo文件丢失

Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
LGWR: terminating instance due to error 313
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_pmon_1394.trc:
ORA-00313: open failed for members of log group  of thread
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_ckpt_1406.trc:
ORA-00313: open failed for members of log group  of thread
Instance terminated by LGWR, pid = 1404

尝试clear redo文件方式恢复

Thu Nov 20 13:04:16 2014
alter database clear logfile group 9
Thu Nov 20 13:04:16 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:45 2014
alter database clear logfile group 9
Thu Nov 20 13:04:46 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:59 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:04:59 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...
Thu Nov 20 13:05:00 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:05:00 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...

不完全恢复resetlogs尝试打开数据库

ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile  ...
Thu Nov 20 13:49:01 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER CANCEL
Thu Nov 20 13:49:03 2014
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL
Thu Nov 20 13:49:33 2014
alter database open resetlogs
Thu Nov 20 13:49:34 2014
ORA-1113 signalled during: alter database open resetlogs...

使用隐含参数

_allow_resetlogs_corruption= TRUE

进行不完全恢复,尝试open数据库报ORA-600 4000错误

Thu Nov 20 14:35:02 2014
ALTER DATABASE   MOUNT
Thu Nov 20 14:35:07 2014
Setting recovery target incarnation to 2
Thu Nov 20 14:35:07 2014
Successful mount of redo thread 1, with mount id 4039504598
Thu Nov 20 14:35:07 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 14:40:33 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 14:40:33 2014
Media Recovery Start
Thu Nov 20 14:40:33 2014
Media Recovery failed with error 1610
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 14:41:23 2014
ALTER DATABASE RECOVER  database using backup controlfile until cancel
Thu Nov 20 14:43:08 2014
alter database open resetlogs
Thu Nov 20 14:43:08 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31293973571
Resetting resetlogs activation ID 3855216310 (0xe5c9eeb6)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Online log /data2/oradata/redo0902.log: Thread 1 Group 9 was previously cleared
Thu Nov 20 14:43:14 2014
Setting recovery target incarnation to 3
Thu Nov 20 14:43:14 2014
Assigning activation ID 4039504598 (0xf0c5f2d6)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 14:43:14 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 14:43:14 2014
SMON: enabling cache recovery
Thu Nov 20 14:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1844
ORA-1092 signalled during: alter database open resetlogs...

尝试隐含屏蔽回滚段

_corrupted_rollback_segments= _SYSSMU1$, _SYSSMU2$,…………

错误依旧ORA-600 4000

Thu Nov 20 15:09:21 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 15:09:21 2014
SMON: enabling cache recovery
Thu Nov 20 15:09:21 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 624
ORA-1092 signalled during: alter database open

多次重启,resetlogs后,数据库出现ORA-600 2662错误

Successful open of redo thread 1
Thu Nov 20 17:13:24 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 17:13:24 2014
SMON: enabling cache recovery
Thu Nov 20 17:13:24 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 7967
ORA-1092 signalled during: ALTER DATABASE OPEN...
Thu Nov 20 17:18:23 2014
USER: terminating instance due to error 1092
Instance terminated by USER, pid = 7967

offline undo相关文件,尝试打开数据库

Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 17:52:31 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 17:52:31 2014
Media Recovery Start
 parallel recovery started with 15 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 17:53:42 2014
ALTER DATABASE RECOVER CANCEL
Thu Nov 20 17:53:44 2014
ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL ...
Thu Nov 20 17:56:34 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:56:35 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:57:01 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:02 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:26 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:27 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:43 2014
alter database open resetlogs
Thu Nov 20 17:57:43 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:58:58 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:58:58 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:59:15 2014
alter database open resetlogs
Thu Nov 20 17:59:15 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:59:35 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:35 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:50 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 17:59:50 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 18:00:07 2014
alter database open resetlogs
Thu Nov 20 18:00:07 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31294173628
Resetting resetlogs activation ID 4039492628 (0xf0c5c414)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Thu Nov 20 18:00:14 2014
Setting recovery target incarnation to 8
Thu Nov 20 18:00:14 2014
Assigning activation ID 4039504142 (0xf0c5f10e)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 18:00:15 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 18:00:15 2014
SMON: enabling cache recovery
Thu Nov 20 18:00:15 2014
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
File #2 is offline, but is part of an online tablespace.
data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
File #100 is offline, but is part of an online tablespace.
data file 100: '/data2/oradata/undotbs02.dbf'
Thu Nov 20 18:00:28 2014
File #185 is offline, but is part of an online tablespace.
data file 185: '/data2/oradata/undotbs03.dbf'
Dictionary check complete
Thu Nov 20 18:00:35 2014
SMON: enabling tx recovery
Thu Nov 20 18:00:36 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_28472.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_28450.trc:
ORA-00604: error occurred at recursive SQL level
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_28446.trc:
ORA-00604: error occurred at recursive SQL level
Instance terminated by USER, pid = 28472
ORA-1092 signalled during: alter database open resetlogs...

不知道做了什么操作出现file 1 block 60坏块,很可能bbed修改错误导致

Thu Nov 20 19:18:15 2014
SMON: enabling cache recovery
Thu Nov 20 19:18:16 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Successfully onlined Undo Tablespace 1.
Thu Nov 20 19:18:16 2014
SMON: enabling tx recovery
Thu Nov 20 19:18:17 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 19:18:17 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 13232
ORA-1092 signalled during: alter database open...

尝试不完全恢复,并resetlogs操作

ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 19:33:41 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:33:41 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:33:42 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:34:06 2014
alter database open resetlogs
Thu Nov 20 19:34:06 2014
ORA-1139 signalled during: alter database open resetlogs...
Thu Nov 20 19:34:17 2014
alter database open
Thu Nov 20 19:34:17 2014
ORA-1190 signalled during: alter database open...
Thu Nov 20 19:35:57 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 19:35:57 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:35:58 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:35:59 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:37:19 2014
alter database open resetlogs
Thu Nov 20 19:37:19 2014
ORA-1139 signalled during: alter database open resetlogs...

继续打开报 ORA-600 3600错误

Thu Nov 20 19:43:14 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 19:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
DBW0: terminating instance due to error 471
Instance terminated by DBW0, pid = 20856

<strong>中间多次重启和resetlogs,还出现ORA-600 2663错误</strong>

Fri Nov 21 12:35:12 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Nov 21 12:35:12 2014
SMON: enabling cache recovery
Fri Nov 21 12:35:13 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mman_15572.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw1_15576.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Instance terminated by USER, pid = 15596
ORA-1092 signalled during: ALTER DATABASE OPEN..

继续尝试打开数据库出现ORA-600 ktsitbs_info1错误

SMON: enabling cache recovery
Fri Nov 21 13:54:25 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Fri Nov 21 13:54:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Fri Nov 21 13:54:27 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 21111
ORA-1092 signalled during: alter database open...

以上是客户数据库故障原因和问题大概的处理过程,下面是我接手后的处理过程


dbv 检查system01.dbf文件,得到结果

HNDX-DB% dbv file=/opt/oracle/oradata/xifenfei/system01.dbf
DBVERIFY: Release 10.2.0.1.0 - Production on Fri Nov 21 16:22:37 2014
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /opt/oracle/oradata/xifenfei/system01.dbf
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv:
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
DBVERIFY - Verification complete
Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 2
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 2
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)

这里知道数据库有两个坏块,而且根据对于bootstrap$的经验,可以大概确定60坏块很可能是C_TS#,第一反应type异常,498可能是seq$

对数据库启动过程做10046,得到trace文件

PARSING IN CURSOR #1 len=275 dep=2 uid=0 oct=3 lid=0 tim=27978051403575 hv=3408408745 ad='7df93cd0'
select name,online$,contents$,undofile#,undoblock#,blocksize,dflmaxext,dflinit,dflincr,dflextpct,dflminext,
dflminlen, owner#,scnwrp,scnbas, NVL(pitrscnwrp, 0), NVL(pitrscnbas, 0), dflogging, bitmapped, inc#, flags,
plugged, NVL(spare1,0), NVL(spare2,0) from ts$ where ts#=:1
END OF STMT
PARSE #1:c=0,e=92,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051403569
BINDS #1:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=ffffffff7dbac9a8  bln=22  avl=02  flg=05
  value=2
EXEC #1:c=0,e=310,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051404296
WAIT #1: nam='db file sequential read' ela= 42 file#=1 block#=60 blocks=1 obj#=-1 tim=27978051404449
Hex dump of (file 1, block 60)
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
FETCH #1:c=10000,e=4072,p=1,cr=2,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051408438
STAT #1 id=1 cnt=0 pid=0 pos=1 obj=16 op='TABLE ACCESS CLUSTER TS$ (cr=2 pr=1 pw=0 time=4075 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=7 op='INDEX UNIQUE SCAN I_TS# (cr=1 pr=0 pw=0 time=13 us)'
*** 2014-11-22 14:44:43.235
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Current SQL statement for this session:
select max(maxconcurrency) from sys.wrh$_undostat  where instance_number = :1 and dbid = :2
and snap_id in   (select snap_id from dba_hist_snapshot where end_interval_time >
(select max(end_interval_time)-7 from dba_hist_snapshot))

这里显示了数据库启动报ORA-00600[ktsitbs_info1],[2],明显的表示了b中的2是表示表空间号,由于ts$坏块,无法读取ts$中表空间信息,从而出现数据字典不一致,从而出现该错误。所以恢复该库的关键是修复file 1 block 60.

bbed尝试修复file 1 block 60

HNDX-DB% bbed password=blockedit mode=edit
BBED: Release 2.0.0.0.0 - Limited Production on Sat Nov 22 15:16:26 2014
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename '/opt/oracle/oradata/xifenfei/system01.dbf'
        FILENAME        /opt/oracle/oradata/xifenfei/system01.dbf
BBED> set block 8192
        BLOCK#          8192
BBED> set block 60
        BLOCK#          60
BBED> set count 64
        COUNT           64
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (128)
BBED> set block 61
        BLOCK#          61
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7959]                        @122
 ub1 rowdata[107]                           @8081
 ub4 tailchk                                @8188
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x0040003d
   ub4 bas_kcbh                             @8        0x0000235b
   ub2 wrp_kcbh                             @12       0x0000
   ub1 seq_kcbh                             @14       0x01
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x7a85
   ub2 spare3_kcbh                          @18       0x0000
BBED> set block 60
        BLOCK#          60
BBED> d
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 80000000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000
 <32 bytes per line>
BBED> d block 61
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003d 0000235b 00000104 7a850000 01000000 00000006 00001837
 00001738 00020200 00000000 0007002e 00000002 00800075 00012300 80000000
 <32 bytes per line>
BBED> set block 60
        BLOCK#          60
BBED> m /x 06a2
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000
 <32 bytes per line>
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7598]                        @122
 ub1 rowdata[468]                           @7720
 ub4 tailchk                                @8188
BBED> sum apply
Check value for File 0, Block 60:
current = 0xe908, required = 0xe908
BBED> verify
DBVERIFY - Verification starting
FILE = /opt/oracle/oradata/xifenfei/system01.dbf
BLOCK = 60
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
BBED>

尝试启动数据库

Sat Nov 22 15:51:33 2014
alter database open
Sat Nov 22 15:51:34 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 15:51:34 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 15:51:34 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 15:51:34 2014
Database Characterset is ZHS16CGB231280
Hex dump of (file 1, block 498) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_2818.trc
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
Reread of rdba: 0x004001f2 (file 1, block 498) found same corrupted data
Sat Nov 22 15:51:35 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_2803.trc:
ORA-00600: internal error code, arguments: [4000], [12], [], [], [], [], [], []
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=18, OS id=3000
Sat Nov 22 15:51:36 2014
Completed: alter database open
Sat Nov 22 15:51:36 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_3010.trc:
ORA-00600: internal error code, arguments: [6807], [AUDSES$], [144], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [6807], [WRI$_ALERT_SEQUENCE], [8783], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Non-fatal internal error happenned while SMON was doing non-existent object cleanup.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Sat Nov 22 15:51:38 2014
ORA-600 encountered when generating server alert SMG-3000
Sat Nov 22 15:51:38 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [ktcpoptx_0], [0x772705E60], [], [], [], [], [], []

只要出现ORA-600 4000和ORA-600 6807错误,其中ORA-600 6807错误比较明显是由于seq$坏块,导致AUDSES$ seq异常导致。ORA-600 4000应该是回滚段异常,继续分析回滚段

SQL> select name,ts#,status$ from undo$;
NAME                                  TS#    STATUS$
------------------------------ ---------- ----------
SYSTEM                                  0          2
_SYSSMU1$                               1          2
_SYSSMU2$                               1          2
_SYSSMU3$                               1          2
…………
_SYSSMU168$                             1          2
_SYSSMU169$                             1          2

这里很异常,system回滚段在数据库open之后,按照常理不可能处于STATUS$=2(OFFLINE)状态。而且其他回滚段全部为OFFLINE状态也属于异常情况.而且尝试drop undo报ORA-01561,另外在dba_rollback_segs中无SYSTEM(查询结果忘记保存)

SQL> drop tablespace undotbs1 including contents;
drop tablespace undotbs1 including contents
*
ERROR at line 1:
ORA-01561: failed to remove all objects in the tablespace specified

通过这一系列很怀疑是由于bbed 修改了undo$等相关基表信息导致现在system中的undo信息混乱.信息反馈给客户后,客户想起来昨天给他们恢复的公司在bbed操作前备份了system01.dbf.突然感觉救星来了.实在怕不懂bbed的人折腾bbed

dbv检测备份文件

DBVERIFY - Verification starting : FILE = /data3/backup/system01.dbf_bak
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv:
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Block Checking: DBA = 4194802, Block Type = KTB-managed data block
data header at 0x1002ef05c
kdbchk: row locked by non-existent transaction
        table=0   slot=4
        lockid=1   ktbbhitc=2
Page 498 failed with check code 6101
DBVERIFY - Verification complete
Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 1
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)

好家伙只有一个物理坏块和一个逻辑坏块,而对于物理坏块block 60已经知道如何修复,逻辑坏块可以尝试设置隐含参数跳过去,bbed修改相关block(同上步骤)

再次启动数据库

dd if=/opt/oracle/oradata/xifenfei/system01.dbf bs=8192 count=2 of=/tmp/system01.2
dd if=/tmp/system01.2 of=/data3/backup/system01.dbf_bak bs=8192 count=2 conv=notrunc
Sat Nov 22 17:52:50 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Sat Nov 22 17:53:38 2014
alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:53:39 2014
Completed: alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:55:43 2014
alter database open
Sat Nov 22 17:55:48 2014
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=18, OS id=15858
Sat Nov 22 17:56:10 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=17, OS id=15879
Sat Nov 22 17:56:19 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 17:56:19 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 17:56:19 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 17:56:20 2014
ARC1: STARTING ARCH PROCESSES
Sat Nov 22 17:56:20 2014
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Sat Nov 22 17:56:22 2014
Database Characterset is ZHS16CGB231280
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Nov 22 17:56:33 2014
ARC2: Archival started
ARC1: STARTING ARCH PROCESSES COMPLETE
ARC1: Becoming the heartbeat ARCH
ARC2 started with pid=23, OS id=15928
QMNC started with pid=25, OS id=15996
Sat Nov 22 17:57:11 2014
Completed: alter database open
Sat Nov 22 17:57:18 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16010.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:57:26 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16012.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:58:17 2014
Starting background process EMN0
Sat Nov 22 18:00:03 2014
Shutting down instance: further logons disabled
EMN0 started with pid=71, OS id=16421
Sat Nov 22 18:00:12 2014
SMON: Restarting fast_start parallel rollback
Sat Nov 22 18:00:23 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_p000_15951.trc:
ORA-00600: internal error code, arguments: [4198], [9], [], [], [], [], [], []
Sat Nov 22 18:00:24 2014
Stopping background process CJQ0
Sat Nov 22 18:00:24 2014
Stopping background process QMNC
Sat Nov 22 18:00:27 2014
Doing block recovery for file 2 block 41
Block recovery from logseq 7, block 180883 to scn 214748389244
Sat Nov 22 18:00:27 2014
Recovery of Online Redo Log: Thread 1 Group 8 Seq 7 Reading mem 0
  Mem# 0 errs 0: /data2/oradata/redo0802.log
Block recovery stopped at EOT rba 7.180988.16
Block recovery completed at rba 7.180988.16, scn 50.24441
Sat Nov 22 18:00:32 2014
Stopping background process MMNL
Sat Nov 22 18:00:38 2014
Stopping background process MMON
Sat Nov 22 18:00:41 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sat Nov 22 18:00:42 2014
ORACLE Instance xifenfei (pid = 9) - Error 600 encountered while recovering transaction (3, 4).
Sat Nov 22 18:00:42 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []

这里都是很常规的错误,查询undo$也已经正常,重建新undo表空间删除老undo,然后alert日志中无其他报错,数据库恢复至此完成,建议客户导出导入重建数据库