ORACLE 8.1.7 数据库ORA-600 4194故障恢复

一个817数据库报ORA-600 4194 无法正常启动

Fri Jul 25 10:49:47 2014
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Fri Jul 25 10:49:58 2014
ALTER DATABASE RECOVER  database
Fri Jul 25 10:49:58 2014
Media Recovery Start
Media Recovery Log
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3320 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Media Recovery Complete
Completed: ALTER DATABASE RECOVER  database
Fri Jul 25 10:50:09 2014
alter database open
Beginning crash recovery of 1 threads
Fri Jul 25 10:50:09 2014
Thread recovery: start rolling forward thread 1
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3320 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Fri Jul 25 10:50:09 2014
Thread recovery: finish rolling forward thread 1
Thread recovery: 0 data blocks read, 0 data blocks written, 3 redo blocks read
Crash recovery completed successfully
Fri Jul 25 10:50:09 2014
Thread 1 advanced to log sequence 3321
Thread 1 opened at log sequence 3321
  Current log# 3 seq# 3321 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1.
Fri Jul 25 10:50:09 2014
SMON: enabling cache recovery
Fri Jul 25 10:50:10 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA03216.TRC:
ORA-00600: ??????????: [4194], [12], [37], [], [], [], [], []
Fri Jul 25 10:50:10 2014
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3321 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Fri Jul 25 10:50:10 2014
SMON: disabling cache recovery
Fri Jul 25 10:50:10 2014
ORA-600 signalled during: alter database open

ORA-600 4194这个错误在数据库异常恢复中非常常见,因为库不是很重要,因此就是直接屏蔽掉故障回滚段,然后强制拉库,该库的恢复过程中,也直接使用隐含参数屏蔽回滚段
_corrupted_rollback_segments= RBS0, RBS1, RBS2, RBS3, RBS4, RBS5, RBS6, RBS_HDSYS,数据库依然无法open,进一步分析trace文件

Fri Jul 25 11:26:07 2014
ORACLE V8.1.7.0.0 - Production vsnsta=0
vsnsql=e vsnxtr=3
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Oracle8i Release 8.1.7.0.0 - Production
JServer Release 8.1.7.0.0 - Production
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 14
Windows thread id: 3648, image: ORACLE.EXE
*** SESSION ID:(11.1) 2014-07-25 11:26:07.843
*** 2014-07-25 11:26:07.843
ksedmp: internal or fatal error
ORA-00600: ??????????: [4194], [12], [37], [], [], [], [], []
Current SQL statement for this session:
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,
scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12 where us#=:1
----- Call Stack Trace -----

这里很明显看出来,数据库是在open过程中,update undo$表遭遇到ORA-600 4194,因为该过程需要使用系统回滚段,但是由于其所对应的undo和redo信息不一致,所以无法正常启动数据库.继续读trace文件

  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      space2: 0      #extents: 5      #blocks: 49
                  last map  0x00000000  #maps: 0      offset: 4128
      Highwater::  0x00400006  ext#: 0      blk#: 3      ext size: 9
  #blocks in seg. hdr's freelists: 0
  #blocks below: 0
  mapblk  0x00000000  offset: 0
                   Unlocked
     Map Header:: next  0x00000000  #extents: 5    obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x00400003  length: 9
   0x0040000c  length: 10
   0x0040008f  length: 10
   0x00400099  length: 10
   0x004000a3  length: 10
  TRN CTL:: seq: 0x003c chd: 0x004e ctl: 0x0050 inc: 0x00000000 nfb: 0x0000
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00400006.003c.25 scn: 0x0000.009a4009
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00000000.003c.24 ext: 0x0  spc: 0x196
    uba: 0x00000000.001f.14 ext: 0x1  spc: 0x16f6
    uba: 0x00000000.0018.02 ext: 0x4  spc: 0x1f1a
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
  TRN TBL::

通过这里可以看出来,数据库在启动的时候,使用system undo的block为为0x00400006,使用bbed清除掉该uba记录,让数据库启动的时候重新分配system undo block给数据库执行update undo$使用,数据库open成功

BBED> m /x 0x00000000
 File: D:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF (0)
 Block: 2                Offsets: 4188 to 4192           Dba:0x00000000
------------------------------------------------------------------------
 00000000 3c002400 00009601 00000000 1f001400 0100f616 00000000 18000200
BBED> m /x 0x0000
 File: D:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF (0)
 Block: 2                Offsets: 4028 to 4032           Dba:0x00000000
------------------------------------------------------------------------
 00000000 00000000 3c005000 02800100 68000000 feffff7f 06004000 3c002400
Sat Jul 26 12:09:21 2014
Thread recovery: start rolling forward thread 1
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3326 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Sat Jul 26 12:09:21 2014
Thread recovery: finish rolling forward thread 1
Thread recovery: 0 data blocks read, 0 data blocks written, 3 redo blocks read
Crash recovery completed successfully
Sat Jul 26 12:09:22 2014
Thread 1 advanced to log sequence 3327
Thread 1 opened at log sequence 3327
  Current log# 3 seq# 3327 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1.
Sat Jul 26 12:09:22 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Jul 26 12:09:39 2014
Completed: alter database open

数据库启动ORA-08103故障恢复

数据库在open过程报ORA-08103错误导致数据库无法正确启动

Fri Jul 18 22:02:51 2014
SMON: enabling tx recovery
Fri Jul 18 22:02:51 2014
Errors in file d:\oracle\product\10.2.0\admin\kemu3\udump\kemu3_ora_29788.trc:
ORA-00604: ?? SQL ?? 1 ????
ORA-08103: ??????
Fri Jul 18 22:02:51 2014
Database Characterset is ZHS16GBK
Fri Jul 18 22:02:51 2014
Errors in file d:\oracle\product\10.2.0\admin\kemu3\bdump\kemu3_smon_29704.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-08103: object no longer exists
Fri Jul 18 22:02:51 2014
Errors in file d:\oracle\product\10.2.0\admin\kemu3\bdump\kemu3_smon_29704.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-08103: object no longer exists
Fri Jul 18 22:02:51 2014
Errors in file d:\oracle\product\10.2.0\admin\kemu3\bdump\kemu3_smon_29704.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-08103: object no longer exists
Fri Jul 18 22:02:52 2014
Errors in file d:\oracle\product\10.2.0\admin\kemu3\bdump\kemu3_smon_29704.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-08103: object no longer exists
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=18, OS id=29876
Fri Jul 18 22:02:53 2014
Errors in file d:\oracle\product\10.2.0\admin\kemu3\bdump\kemu3_smon_29704.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-08103: object no longer exists
Fri Jul 18 22:02:54 2014
ORA-604 signalled during: alter database open...

对数据库启动过程做10046

PARSING IN CURSOR #22 len=210 dep=2 uid=0 oct=3 lid=0 tim=20960424464 hv=864012087 ad='3063f0b4'
select /*+ rule */ bucket_cnt, row_cnt, cache_cnt, null_cnt, timestamp#, sample_size, minimum,
maximum, distcnt, lowval, hival, density, col#, spare1, spare2, avgcln from hist_head$ where obj#=:1 and intcol#=:2
END OF STMT
EXEC #22:c=0,e=80,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=3,tim=20960424461
WAIT #22: nam='db file sequential read' ela= 5452 file#=1 block#=60213 blocks=1 obj#=4586 tim=20960429962
FETCH #22:c=0,e=5967,p=1,cr=1,cu=0,mis=0,r=0,dep=2,og=3,tim=20960430462
*** KEWUXS - encountered error: (ORA-00604: 递归 SQL 级别 2 出现错误
ORA-08103: 对象不再存在
)
*** kewrwdbi_1: Error=13515 encountered during run_once
BINDS #21:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=0a217744  bln=22  avl=01  flg=05
  value=0
 Bind#1
  oacdty=01 mxl=32(20) mxlc=00 mal=00 scl=00 pre=00
  oacflg=18 fl2=0001 frm=01 csi=852 siz=32 off=0
  kxsbbbfp=0a217718  bln=32  avl=20  flg=05
  value="WRI$_ADV_DEFINITIONS"
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=0a2176f4  bln=24  avl=02  flg=05
  value=1

这里很明显数据库启动过程,由于hist_head$的file 1 block 60213中的object_id 与 data_object_id 不匹配,从而出现ORA-08103错误,导致数据库无法正常启动,这里的故障的对象为hist_head$,非oracle核心对象,因此直接标记该block 为坏块(模拟普通ORA-08103并解决,模拟极端ORA-08103并解决,rman制造坏块,bbed修复坏块,bbed破坏数据文件),然后启动数据库,备份hist_head$表数据,然后truncate hist_head$,再插入hist_head$,整体完工.
在数据库open过程中,如果遇到ora-8103错误,导致数据库无法正常open,可以对其做10046定位到故障block和对象,然后判断对象是否数据库启动必须的对象,甚至是bootstarp$中对象,然后采取不同的处理方法.

数据文件的三个创建SCN一点点探讨

在给一个朋友数据库恢复的过程中语句该库大量删除表空间,然后创建表空,由于在创建控制文件的时候,列出来不正确文件,导致出现v$datafile_header.error出现WRONG FILE CREATE错误.通过试验重现了该错误,并且进一步测试如果真的需要历史数据文件,该如何狸猫换太子(本实验为了进一步理解数据文件创建scn相关信息)
创建xifenfei表空间,然后删除表空间,但不删除数据文件,然后创建重名表空间

SQL> select to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') today,'www.xifenfei.com' xifenfei from dual;
TODAY               XIFENFEI
------------------- ----------------
2014-07-16 15:54:26 www.xifenfei.com
SQL> create tablespace xifenfei datafile '/u01/app/oracle/oradata/ORCL/xifenfei_old.dbf' size 10m;
Tablespace created.
SQL> select file#,name from v$datafile;
     FILE# NAME
---------- --------------------------------------------------
         1 /u01/app/oracle/oradata/ORCL/system01.dbf
         2 /u01/app/oracle/oradata/ORCL/sysaux01.dbf
         3 /u01/app/oracle/oradata/ORCL/undotbs01.dbf
         4 /u01/app/oracle/oradata/ORCL/users01.dbf
         5 /u01/app/oracle/oradata/ORCL/xifenfei_old.dbf
SQL> select file#,CREATION_CHANGE#,to_char(CREATION_TIME,'yyyy-mm-dd hh24:mi:ss') CREATION_TIME from v$datafile;
     FILE# CREATION_CHANGE# CREATION_TIME
---------- ---------------- -------------------
         1               18 2014-07-14 21:53:05
         2             2338 2014-07-14 21:53:42
         3             3130 2014-07-14 21:53:51
         4            15268 2014-07-14 21:54:25
         5           593520 2014-07-16 16:00:54
SQL> select file#,CREATION_CHANGE#,to_char(CREATION_TIME,'yyyy-mm-dd hh24:mi:ss') CREATION_TIME from v$datafile_header;
     FILE# CREATION_CHANGE# CREATION_TIME
---------- ---------------- -------------------
         1               18 2014-07-14 21:53:05
         2             2338 2014-07-14 21:53:42
         3             3130 2014-07-14 21:53:51
         4            15268 2014-07-14 21:54:25
         5           593520 2014-07-16 16:00:54
SQL> drop tablespace xifenfei;
Tablespace dropped.
SQL>  create tablespace xifenfei datafile '/u01/app/oracle/oradata/ORCL/xifenfei_new.dbf' size 10m;
Tablespace created.
SQL> select file#,CREATION_CHANGE#,to_char(CREATION_TIME,'yyyy-mm-dd hh24:mi:ss') CREATION_TIME from v$datafile;
     FILE# CREATION_CHANGE# CREATION_TIME
---------- ---------------- -------------------
         1               18 2014-07-14 21:53:05
         2             2338 2014-07-14 21:53:42
         3             3130 2014-07-14 21:53:51
         4            15268 2014-07-14 21:54:25
         5           593613 2014-07-16 16:02:45
SQL> select file#,CREATION_CHANGE#,to_char(CREATION_TIME,'yyyy-mm-dd hh24:mi:ss') CREATION_TIME from v$datafile_header;
     FILE# CREATION_CHANGE# CREATION_TIME
---------- ---------------- -------------------
         1               18 2014-07-14 21:53:05
         2             2338 2014-07-14 21:53:42
         3             3130 2014-07-14 21:53:51
         4            15268 2014-07-14 21:54:25
         5           593613 2014-07-16 16:02:45

rename xifenfei表空间数据文件到老数据文件

SQL> alter database datafile 5 offline drop;
Database altered.
SQL> alter database rename file '/u01/app/oracle/oradata/ORCL/xifenfei_new.dbf'
 2   to '/u01/app/oracle/oradata/ORCL/xifenfei_old.dbf';
Database altered.
SQL> alter database datafile 5 online;
alter database datafile 5 online
*
ERROR at line 1:
ORA-01122: database file 5 failed verification check
ORA-01110: data file 5: '/u01/app/oracle/oradata/ORCL/xifenfei_old.dbf'
ORA-01203: wrong incarnation of this file - wrong creation SCN
SQL> select file#,CREATION_CHANGE#,to_char(CREATION_TIME,'yyyy-mm-dd hh24:mi:ss') CREATION_TIME from v$datafile;
     FILE# CREATION_CHANGE# CREATION_TIME
---------- ---------------- -------------------
         1               18 2014-07-14 21:53:05
         2             2338 2014-07-14 21:53:42
         3             3130 2014-07-14 21:53:51
         4            15268 2014-07-14 21:54:25
         5           593613 2014-07-16 16:02:45
SQL> select file#,CREATION_CHANGE#,to_char(CREATION_TIME,'yyyy-mm-dd hh24:mi:ss') CREATION_TIME from v$datafile_header;
     FILE# CREATION_CHANGE# CREATION_TIME
---------- ---------------- -------------------
         1               18 2014-07-14 21:53:05
         2             2338 2014-07-14 21:53:42
         3             3130 2014-07-14 21:53:51
         4            15268 2014-07-14 21:54:25
         5           593520 2014-07-16 16:00:54
SQL> select file#,error from v$datafile_header;
     FILE# ERROR
---------- -----------------------------------------------------------------
         1
         2
         3
         4
         5 WRONG FILE CREATE

至此今天数据库恢复的故障已经模拟出来,就是因为数据文件头的scn和控制文件中scn不一致,从而出现了v$datafile_header.error报WRONG FILE CREATE的现象.

因为控制文件中数据文件scn和数据文件头scn不一致,因此通过重建控制文件来实现两者scn一致

SQL> alter database backup controlfile to trace as '/tmp/ctl';
Database altered.
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> STARTUP NOMOUNT
ORACLE instance started.
Total System Global Area  718225408 bytes
Fixed Size                  2292432 bytes
Variable Size             373294384 bytes
Database Buffers          339738624 bytes
Redo Buffers                2899968 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE "ORCL" NORESETLOGS  NOARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 292
  7  LOGFILE
  8    GROUP 1 '/u01/app/oracle/oradata/ORCL/redo01.log'  SIZE 50M BLOCKSIZE 512,
  9    GROUP 2 '/u01/app/oracle/oradata/ORCL/redo02.log'  SIZE 50M BLOCKSIZE 512,
 10    GROUP 3 '/u01/app/oracle/oradata/ORCL/redo03.log'  SIZE 50M BLOCKSIZE 512
 11  DATAFILE
 12    '/u01/app/oracle/oradata/ORCL/system01.dbf',
 13    '/u01/app/oracle/oradata/ORCL/sysaux01.dbf',
 14    '/u01/app/oracle/oradata/ORCL/undotbs01.dbf',
 15    '/u01/app/oracle/oradata/ORCL/users01.dbf',
 16    '/u01/app/oracle/oradata/ORCL/xifenfei_old.dbf'
 17  CHARACTER SET ZHS16GBK
 18  ;
Control file created.
SQL> select file#,CREATION_CHANGE#,to_char(CREATION_TIME,'yyyy-mm-dd hh24:mi:ss') CREATION_TIME from v$datafile_header;
     FILE# CREATION_CHANGE# CREATION_TIME
---------- ---------------- -------------------
         1               18 2014-07-14 21:53:05
         2             2338 2014-07-14 21:53:42
         3             3130 2014-07-14 21:53:51
         4            15268 2014-07-14 21:54:25
         5           593520 2014-07-16 16:00:54
SQL> select file#,CREATION_CHANGE#,to_char(CREATION_TIME,'yyyy-mm-dd hh24:mi:ss') CREATION_TIME from v$datafile;
     FILE# CREATION_CHANGE# CREATION_TIME
---------- ---------------- -------------------
         1               18 2014-07-14 21:53:05
         2             2338 2014-07-14 21:53:42
         3             3130 2014-07-14 21:53:51
         4            15268 2014-07-14 21:54:25
         5           593520 2014-07-16 16:00:54
SQL> select file#,error from v$datafile_header;
     FILE# ERROR
---------- -----------------------------------------------------------------
         1
         2
         3
         4
         5

通过重建控制文件消除了v$datafile_header.error报WRONG FILE CREATE错误,继续尝试online文件

SQL> recover datafile 5;
Media recovery complete.
SQL> alter database datafile 5 online;
Database altered.
SQL> select file#,name from v$datafile;
     FILE# NAME
---------- --------------------------------------------------
         1 /u01/app/oracle/oradata/ORCL/system01.dbf
         2 /u01/app/oracle/oradata/ORCL/sysaux01.dbf
         3 /u01/app/oracle/oradata/ORCL/undotbs01.dbf
         4 /u01/app/oracle/oradata/ORCL/users01.dbf
         5 /u01/app/oracle/oradata/ORCL/xifenfei_old.dbf
SQL> alter database open;
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-01177: data file does not match dictionary - probably old incarnation
ORA-01110: data file 5: '/u01/app/oracle/oradata/ORCL/xifenfei_old.dbf'
Process ID: 7437
Session ID: 7 Serial number: 5

出现这个错误,是由于数据库中,还有file$中也记录了数据文件创建scn,而这个scn现在和数据文件头和控制文件中的scn不相等,因此无法启动数据库成功.现在需要做的就是在数据库未启动状态下修改file$中的数据文件创建scn相关值,让其和数据文件头(控制文件中记录)一致

使用第三方工具定位file$记录

1|2|89600|0|1|4194302|1280|0|18||4194306|0x004000e9|0
2|2|70400|1|2|4194302|1280|0|2338||8388610|0x004000e9|1
3|2|25600|2|3|4194302|640|0|3130||12582914|0x004000e9|2
4|2|640|4|4|4194302|160|0|15268||16777218|0x004000e9|3
5|2|1280|7|5|0|0|0|593613||20971522|0x004000e9|4
6|1|3840|||0|0|0|586295||25165826|0x004000e9|5
7|1|3840|||3932160|1280|0|587030||29360130|0x004000e9|6
对应file$结构确定每列含义,以及确定需要修改的列
每行倒数第二列为rdba地址,可以通过转换为file and block,这里对应的就是file 1 block 233
每行最后一列为该条记录在该rdba中的记录顺序

使用工具修改593613为593520,使得file$中的scn与现在控制文件和数据文件头一致,具体参考bbed修改数据内容

修改好file$中数据文件创建scn后,尝试继续操作

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 5 needs media recovery
ORA-01110: data file 5: '/u01/app/oracle/oradata/ORCL/xifenfei_old.dbf'
SQL> recover datafile 5;
Media recovery complete.
SQL> alter database open;
Database altered.
SQL> select file#,name from v$datafile;
     FILE# NAME
---------- --------------------------------------------------
         1 /u01/app/oracle/oradata/ORCL/system01.dbf
         2 /u01/app/oracle/oradata/ORCL/sysaux01.dbf
         3 /u01/app/oracle/oradata/ORCL/undotbs01.dbf
         4 /u01/app/oracle/oradata/ORCL/users01.dbf
         5 /u01/app/oracle/oradata/ORCL/xifenfei_old.dbf

通过这里的简单测试,发现几个问题
1.v$datafile_header.error报WRONG FILE CREATE错误 不一定就是数据文件异常,而其本质是数据文件头scn和控制文件中scn不一致
2.数据文件online需要file$,v$datafile_header,v$datafile中关于数据文件创建scn都一致
3.通过该分析,证明在一些极端情况下,考虑考虑该替换思路实现删除数据文件重新加入数据库

分享一次ORA-01113 ORA-01110故障处理过程

数据库报ORA-01113 ORA-01110无法正常open

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'

alert日志

Wed Jul 16 17:17:56 2014
ALTER DATABASE   MOUNT
Successful mount of redo thread 1, with mount id 1380870980
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
Wed Jul 16 17:19:01 2014
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_m000_3540.trc  (incident=367575):
ORA-00700: soft internal error, arguments: [dbgrmblcp_corrupt_page], [D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\metadata\HM_FINDING.ams],
[1245], [], [], [], [], [], [], [], [], []
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_367575\orcl_m000_3540_i367575.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_m000_3540.trc  (incident=367576):
ORA-00600: internal error code, arguments: [dbgrmblgp_get_page_1], [1245], [0], [80], [], [], [], [], [], [], [], []
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'
Incident details in: D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\incident\incdir_367576\orcl_m000_3540_i367576.trc
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_m000_3540.trc:
ORA-51106: check failed to complete due to an error.  See error below
ORA-00600: internal error code, arguments: [dbgrmblgp_get_page_1], [1245], [0], [80], [], [], [], [], [], [], [], []
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'

这里出现ORA-00700[dbgrmblcp_corrupt_page]与ORA-00600[dbgrmblgp_get_page_1]是Bug 10321285 : ORA-600 [DBGRMBLCP_CORRUPT_PAGE]

尝试recover database 遭遇ORA-00283 ORA-16433

SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database must be opened in read/write mode.

alert日志信息

Wed Jul 16 17:36:33 2014
ALTER DATABASE RECOVER  database
Media Recovery Start
 started logmerger process
Wed Jul 16 17:36:33 2014
Media Recovery failed with error 16433
Slave exiting with ORA-283 exception
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_pr00_3868.trc:
ORA-00283: recovery session canceled due to errors
ORA-16433: The database must be opened in read/write mode.
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER  database  ...
Wed Jul 16 17:36:45 2014
alter database open
Errors in file D:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_3236.trc:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: 'D:\APP\ADMINISTRATOR\ORADATA\ORCL\SYSTEM01.DBF'
ORA-1113 signalled during: alter database open...

重建控制文件后恢复,遭遇ORA-600 2662错误

SQL> shutdown abort
ORACLE instance shut down.
SQL> startup nomount
ORACLE instance started.
Total System Global Area      10255212544 bytes
Fixed Size                        2264088 bytes
Variable Size                  4529849320 bytes
Database Buffers               5704253440 bytes
Redo Buffers                     18845696 bytes
SQL> @d:\ctl.sql
Control file created.
SQL> recover database using backup controlfile;
ORA-00279: change 13953689551797 generated at 07/16/2014 17:53:41 needed for
thread 1
ORA-00289: suggestion :
D:\APP\ADMINISTRATOR\FAST_RECOVERY_AREA\ORCL\ARCHIVELOG\2014_07_16\O1_MF_1_1_%U_
.ARC
ORA-00280: change 13953689551797 for thread 1 is in sequence #1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
d:\app\administrator\oradata\orcl\redo01.log
Log applied.
Media recovery complete.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [2662], [3248], [3635774399],
[3249], [3580470066], [4194545], [], [], [], [], [], []
Process ID: 3932
Session ID: 200 Serial number: 1

因为该数据库是11.2.0.3未打上scn patch,直接推进scn(可以使用event,隐含参数,oradebug,bbed等),数据库就正常open

ORA-00600[17182],ORA-00600[25027],ORA-00600[kghfrempty:ds]故障处理

数据库不能启动(或者启动后马上crash),alert日志报错ORA-00600[17182],ORA-00600[25027],ORA-00600[kghfrempty:ds]等错误

Tue Jul 08 23:36:06 2014
alter database open
Beginning crash recovery of 1 threads
 parallel recovery started with 32 processes
Started redo scan
Completed redo scan
 read 1409 KB redo, 0 data blocks need recovery
Started redo application at
 Thread 1: logseq 2855, block 3
Recovery of Online Redo Log: Thread 1 Group 5 Seq 2855 Reading mem 0
  Mem# 0: /backup/oradata/ztmdb/redo05.log
Completed redo application of 0.00MB
Completed crash recovery at
 Thread 1: logseq 2855, block 2822, scn 104627804
 0 data blocks read, 0 data blocks written, 1409 redo k-bytes read
Tue Jul 08 23:36:12 2014
Thread 1 advanced to log sequence 2856 (thread open)
Thread 1 opened at log sequence 2856
  Current log# 1 seq# 2856 mem# 0: /backup/oradata/ztmdb/redo01.log
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Jul 08 23:36:13 2014
SMON: enabling cache recovery
[15126] Successfully onlined Undo Tablespace 2.
Undo initialization finished serial:0 start:10340864 end:10340964 diff:100 (1 seconds)
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is ZHS16GBK
Errors in file /home/oracle/diag/rdbms/ztmdb/ztmdb/trace/ztmdb_smon_15100.trc  (incident=62061):
ORA-00600: internal error code, arguments: [17182], [0x2B6DFD23D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /home/oracle/diag/rdbms/ztmdb/ztmdb/incident/incdir_62061/ztmdb_smon_15100_i62061.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
No Resource Manager plan active
Errors in file /home/oracle/diag/rdbms/ztmdb/ztmdb/trace/ztmdb_ora_15126.trc  (incident=62093):
ORA-00600: internal error code, arguments: [25027], [0], [875836979], [], [], [], [], [], [], [], [], []
Incident details in: /home/oracle/diag/rdbms/ztmdb/ztmdb/incident/incdir_62093/ztmdb_ora_15126_i62093.trc
Block recovery from logseq 2856, block 98 to scn 104627948
Recovery of Online Redo Log: Thread 1 Group 1 Seq 2856 Reading mem 0
  Mem# 0: /backup/oradata/ztmdb/redo01.log
Block recovery completed at rba 2856.99.16, scn 0.104627949
ORACLE Instance ztmdb (pid = 32) - Error 600 encountered while recovering transaction (14, 2) on object 15113.
Errors in file /home/oracle/diag/rdbms/ztmdb/ztmdb/trace/ztmdb_smon_15100.trc  (incident=62066):
ORA-00600: internal error code, arguments: [kghfrempty:ds], [0x2B6DFD23D790], [], [], [], [], [], [], [], [], [], []
ORA-07445: exception encountered: core dump [kghrst()+1835] [SIGSEGV] [ADDR:0x0] [PC:0x97EE385] [SI_KERNEL(general_protection)] []
ORA-00600: internal error code, arguments: [17182], [0x2B6DFD23D7A0], [], [], [], [], [], [], [], [], [], []
Incident details in: /home/oracle/diag/rdbms/ztmdb/ztmdb/incident/incdir_62066/ztmdb_smon_15100_i62066.trc
PMON (ospid: 14978): terminating the instance due to error 474
Tue Jul 08 23:36:26 2014
System state dump requested by (instance=1, osid=14978 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /home/oracle/diag/rdbms/ztmdb/ztmdb/trace/ztmdb_diag_14996_20140708233626.trc
Dumping diagnostic data in directory=[cdmp_20140708233626], requested by (instance=1, osid=14978 (PMON)), summary=[abnormal instance termination].
Tue Jul 08 23:36:27 2014
Instance terminated by PMON, pid = 14978

这里的错误比较明显,数据库进行完前滚后,smon进行回滚发现有部分block CORRUPTED 导致该问题,解决该问题思路就是:
1.设置event 屏蔽回滚
2.如果1不行,设置回滚段相关隐含参数,屏蔽回滚
3.逻辑方式重建数据库

补充知识点
ORA-600 [25027]

ERROR:
  Format: ORA-600 [25027] [a] [b]
VERSIONS:
  versions 9.2 and above
ARGUMENTS:
  Arg [a]  Tablespace Number (TSN)
  Arg [b]  Decimal Relative Data Block Address (RDBA)

记录一次ORA-600 3004 恢复过程和处理思路

10.1.0.2数据库在启动的时候报ORA-00600[3004]错误

SQL> startup
ORACLE 例程已经启动。
Total System Global Area 1042254360 bytes
Fixed Size 743960 bytes
Variable Size 503316480 bytes
Database Buffers 536870912 bytes
Redo Buffers 1323008 bytes
数据库装载完毕。
ORA-00600: 内部错误代码,参数: [3004], [1], [0], [0], [], [], [], []

alert日志信息

Tue Jul 15 10:57:11 2014
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Tue Jul 15 10:57:12 2014
ALTER DATABASE OPEN
Tue Jul 15 10:57:12 2014
Beginning crash recovery of 1 threads
 attempting to start a parallel recovery with 3 processes
 parallel recovery started with 3 processes
Tue Jul 15 10:57:12 2014
Started first pass scan
Tue Jul 15 10:57:12 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\udump\orcl_ora_1084.trc:
ORA-00600: 内部错误代码, 参数: [3004], [1], [0], [0], [], [], [], []
Tue Jul 15 10:57:13 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_p000_3780.trc:
ORA-10388: parallel query server interrupt (failure)
Tue Jul 15 10:57:13 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_p002_3420.trc:
ORA-10388: parallel query server interrupt (failure)
Tue Jul 15 10:57:14 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_p001_3784.trc:
ORA-10388: parallel query server interrupt (failure)
Tue Jul 15 10:57:14 2014
Aborting crash recovery due to error 600
Tue Jul 15 10:57:14 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\udump\orcl_ora_1084.trc:
ORA-00600: 内部错误代码, 参数: [3004], [1], [0], [0], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...

根据老杨的blog,出现该问题,很可能是crontrolfile异常导致,所以这里可以考虑通过重建控制文件或者把当前控制文件当作备份控制文件来使用

Tue Jul 15 13:42:31 2014
ALTER DATABASE RECOVER  database using backup controlfile
Tue Jul 15 13:42:31 2014
Media Recovery Start
WARNING! Recovering data file 1 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 2 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 3 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 4 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 5 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 6 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 7 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 8 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 9 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 10 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
WARNING! Recovering data file 11 from a fuzzy file. If not the current file
it might be an online backup taken without entering the begin backup command.
 attempting to start a parallel recovery with 3 processes
 parallel recovery started with 3 processes
Starting datafile 1 with incarnation depth 0 in thread 1 sequence 794
Datafile 1: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\SYSTEM01.DBF'
Starting datafile 2 with incarnation depth 0 in thread 1 sequence 794
Datafile 2: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\UNDOTBS01.DBF'
Starting datafile 3 with incarnation depth 0 in thread 1 sequence 794
Datafile 3: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\SYSAUX01.DBF'
Starting datafile 4 with incarnation depth 0 in thread 1 sequence 794
Datafile 4: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\USERS01.DBF'
Starting datafile 5 with incarnation depth 0 in thread 1 sequence 794
Datafile 5: 'E:\ORADB\USER1_1.ORA'
Starting datafile 6 with incarnation depth 0 in thread 1 sequence 794
Datafile 6: 'E:\ORADB\USER1_2.ORA'
Starting datafile 7 with incarnation depth 0 in thread 1 sequence 794
Datafile 7: 'E:\ORADB\USER1_3.ORA'
Starting datafile 8 with incarnation depth 0 in thread 1 sequence 794
Datafile 8: 'E:\ORADB\USER1_4.ORA'
Starting datafile 9 with incarnation depth 0 in thread 1 sequence 794
Datafile 9: 'E:\ORADB\INDEX1_1.ORA'
Starting datafile 10 with incarnation depth 0 in thread 1 sequence 794
Datafile 10: 'E:\ORADB\INDEX1_2.ORA'
Starting datafile 11 with incarnation depth 0 in thread 1 sequence 794
Datafile 11: 'E:\ORADB\TEMP1_1.ORA'
Media Recovery Log
ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup cont...
Tue Jul 15 13:43:03 2014
ALTER DATABASE RECOVER    LOGFILE 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\REDO01.LOG'
Tue Jul 15 13:43:03 2014
Media Recovery Log D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\REDO01.LOG
Completed: ALTER DATABASE RECOVER    LOGFILE 'D:\ORACLE\PRODU
Tue Jul 15 13:43:17 2014
alter database open resetlogs
RESETLOGS after complete recovery through change 3142208
Resetting resetlogs activation ID 1378452168 (0x522982c8)
Setting recovery target incarnation to 3
Tue Jul 15 13:43:18 2014
Setting recovery target incarnation to 3
Tue Jul 15 13:43:18 2014
Flashback Database Disabled
Tue Jul 15 13:43:19 2014
Assigning activation ID 1380750902 (0x524c9636)
Maximum redo generation record size = 120832 bytes
Maximum redo generation change vector size = 116476 bytes
Private_strands 7 at log switch
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: D:\ORACLE\PRODUCT\10.1.0\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1
Tue Jul 15 13:43:19 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Jul 15 13:43:19 2014
SMON: enabling cache recovery
Tue Jul 15 13:43:20 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\udump\orcl_ora_1484.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [3142214], [0], [3142438], [8388617], [], []
Tue Jul 15 13:43:21 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\udump\orcl_ora_1484.trc:
ORA-00600: 内部错误代码, 参数: [2662], [0], [3142214], [0], [3142438], [8388617], [], []
Tue Jul 15 13:43:21 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600

这里出现ORA-600[2662]因为scn相差较小,直接重启几次,然后数据库出现如下启动正常但是出现ORA-01595和ORA-00600[4194]错误

Tue Jul 15 13:48:26 2014
Starting background process MMON
Starting background process MMNL
MMON started with pid=17, OS id=2896
MMNL started with pid=18, OS id=3828
Tue Jul 15 13:48:26 2014
Block recovery completed at rba 2.90.16, scn 0.3162303
Tue Jul 15 13:48:26 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_smon_3724.trc:
ORA-01595: error freeing extent (3) of rollback segment (1))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [57], [6], [], [], [], [], []
Tue Jul 15 13:48:26 2014
Completed: alter database open
Tue Jul 15 13:48:27 2014
Errors in file d:\oracle\product\10.1.0\admin\orcl\bdump\orcl_j000_2956.trc:
ORA-00600: internal error code, arguments: [4193], [1375], [1379], [], [], [], [], []

通过设置undo_management=manual,重建undo表空间解决,至此本数据库恢复完成,建议对其进行逻辑方式重建

误drop tablespace后使用flashback database闪回异常处理

有朋友夜间打电话请求技术支持,数据库表空间被删除,然后使用flashback database 无法正常恢复。通过分析alert日志发现,创建表空间(xifenfei 别名),发现已经存在,就删除了该表空间(其实是业务核心表空间,误删除了,是否是连接错了数据库?)

Sat Jul 05 17:10:06 2014
create tablespace XIFENFEI datafile 'D:\Oracle\oradata\orcl\HANDBB.DBF'
size 50M autoextend on next 50M maxsize 1536M extent management local
Sat Jul 05 17:10:06 2014
ORA-1543 signalled during: create tablespace XIFENFEI datafile 'D:\Oracle\oradata\orcl\HANDBB.DBF'
size 50M autoextend on next 50M maxsize 1536M extent management local
...
Sat Jul 05 17:10:59 2014
drop tablespace XIFENFEI
Sat Jul 05 17:10:59 2014
ORA-1549 signalled during: drop tablespace XIFENFEI
...
Sat Jul 05 17:11:05 2014
drop tablespace XIFENFEI
ORA-1549 signalled during: drop tablespace XIFENFEI
...
Sat Jul 05 17:11:24 2014
drop tablespace XIFENFEI including contents
Sat Jul 05 17:11:36 2014
Thread 1 advanced to log sequence 186895 (LGWR switch)
  Current log# 1 seq# 186895 mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\GZSERVER\REDO01.LOG
Sat Jul 05 17:11:36 2014
ARC3: Warning.  Log sequence in archive filename wrapped
to fix length as indicated by %S in LOG_ARCHIVE_FORMAT.
Old log archive with same name might be overwritten.
Sat Jul 05 17:11:43 2014
LNS: Standby redo logfile selected for thread 1 sequence 186895 for destination LOG_ARCHIVE_DEST_4
Sat Jul 05 17:11:49 2014
LNS: Standby redo logfile selected for thread 1 sequence 186895 for destination LOG_ARCHIVE_DEST_2
Sat Jul 05 17:12:09 2014
Starting control autobackup
Control autobackup written to DISK device
	handle 'D:\FULLBACK\C-1342406147-20140705-00'
Completed: drop tablespace XIFENFEI including contents

通过这里可以发现删除表空间时间点为2014年7月5日17:12:09

闪回数据库到删除之前时间点

Sat Jul 05 18:16:54 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Sat Jul 05 18:19:23 2014
FLASHBACK DATABASE TO TIMESTAMP TO_TIMESTAMP('2014-07-05 17:09:00','YYYY-MM-DD HH24:MI:SS')
Sat Jul 05 18:19:25 2014
Flashback Restore Start
Sat Jul 05 18:20:52 2014
--闪回时的控制文件中无表空间XIFENFEI信息(因为已经被删除),
--但是由于闪回的system 数据字典里面有相关文件信息,因此数据库在控制文件里面创建相关文件信息
Flashback: created tablespace #6: 'XIFENFEI' in the controlfile.
Flashback: created OFFLINE file 'UNNAMED00012' for tablespace #6 in the controlfile.
Filename was:
'D:\ORACLE\PRODUCT\10.2.0\ORADATA\GZSERVER\XIFENFEI4.DBF' when dropped.
File will have to be restored from a backup and recovered.
Flashback: created OFFLINE file 'UNNAMED00010' for tablespace #6 in the controlfile.
Filename was:
'D:\ORACLE\PRODUCT\10.2.0\ORADATA\GZSERVER\XIFENFEI3.DBF' when dropped.
File will have to be restored from a backup and recovered.
Flashback: created OFFLINE file 'UNNAMED00008' for tablespace #6 in the controlfile.
Filename was:
'D:\ORACLE\PRODUCT\10.2.0\ORADATA\GZSERVER\XIFENFEI2.DBF' when dropped.
File will have to be restored from a backup and recovered.
Flashback: created OFFLINE file 'UNNAMED00005' for tablespace #6 in the controlfile.
Filename was:
'D:\ORACLE\PRODUCT\10.2.0\ORADATA\GZSERVER\XIFENFEI.DBF' when dropped.
File will have to be restored from a backup and recovered.
Flashback Restore Complete
Flashback Media Recovery Start
 parallel recovery started with 15 processes
Flashback Media Recovery Log D:\ORACLE\PRODUCT\10.2.0\ORADATA\ARCHIVE\ARC\ARC86891_0766797318.001
Flashback Media Recovery Log D:\ORACLE\PRODUCT\10.2.0\ORADATA\ARCHIVE\ARC\ARC86892_0766797318.001
Sat Jul 05 18:21:40 2014
Flashback Media Recovery Log D:\ORACLE\PRODUCT\10.2.0\ORADATA\ARCHIVE\ARC\ARC86893_0766797318.001
Sat Jul 05 18:21:47 2014
WARNING: inbound connection timed out (ORA-3136)
Sat Jul 05 18:22:11 2014
Recovery of Online Redo Log: Thread 1 Group 3 Seq 186894 Reading mem 0
  Mem# 0: D:\ORACLE\PRODUCT\10.2.0\ORADATA\GZSERVER\REDO03.LOG
Sat Jul 05 18:22:39 2014
Incomplete Recovery applied until change 9078991241
Flashback Media Recovery Complete
ORA-38795 signalled during: FLASHBACK DATABASE TO TIMESTAMP TO_TIMESTAMP('2014-07-05 17:09:00','YYYY-MM-DD HH24:MI:SS')...
Sat Jul 05 18:30:11 2014
ALTER DATABASE OPEN RESETLOGS
Sat Jul 05 18:30:11 2014
ORA-1245 signalled during: ALTER DATABASE OPEN RESETLOGS...
--重命名相关UNNAMExxxxx文件名到硬盘上被删除表空间文件
Sat Jul 05 18:39:31 2014
alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00005'
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI.DBF'
Sat Jul 05 18:39:31 2014
Completed: alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00005'
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI.DBF'
Sat Jul 05 18:39:47 2014
alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00008'
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI2.DBF'
Sat Jul 05 18:39:47 2014
Completed: alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00008'
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI2.DBF'
Sat Jul 05 18:39:59 2014
alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00010'
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI3.DBF'
Sat Jul 05 18:39:59 2014
Completed: alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00010'
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI3.DBF'
Sat Jul 05 18:40:12 2014
alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00012'
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI4.DBF'
Sat Jul 05 18:40:12 2014
Completed: alter database rename file 'D:\ORACLE\PRODUCT\10.2.0\DB_1\DATABASE\UNNAMED00012'
to 'D:\oracle\product\10.2.0\oradata\gzserver\XIFENFEI4.DBF'
Sat Jul 05 18:41:25 2014
ALTER DATABASE OPEN RESETLOGS
Sat Jul 05 18:41:25 2014
ORA-1245 signalled during: ALTER DATABASE OPEN RESETLOGS...

到这里,可以看出来,因为数据库整体已经闪回,但是被drop 表空间的四个数据文件未被正常闪回,因此该四个文件的scn可能异常,通过数据库恢复检查脚本(Oracle Database Recovery Check)检查结果如下
datafile_scn
datafile_header_scn
这里很明显控制文件中的scn信息混乱不做过多参考,数据文件头信息看到只有ts# 6中的四个文件(就是被删除的表空间文件)scn过大,其他文件scn都处于正常状态(处于干净状态),到这里很明显,数据库闪回成功,但是被drop tablespace的数据文件未被闪回,因此该故障可以通过bbed修改四个文件头信息和其他文件相同即可使得数据库恢复正常
温馨提示:数据库操作需要慎重,备份重于一切

数据库恢复历史再次刷新到Oracle 7.3.2版本—redo异常恢复

有网友在QQ上找我,说Oracle 7.3的数据库,因为redo异常咨询我是否可以恢复
qq咨询


检查数据库得到以下信息

SVRMGR> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle7 Workgroup Server Release 7.3.2.2.1 - Production Release
PL/SQL Release 2.3.2.2.0 - Production
CORE Version 3.5.2.0.0 - Production
TNS for 32-bit Windows: Version 2.3.2.1.0 - Production
NLSRTL Version 3.2.2.0.0 - Production
已选择 5 行

数据文件信息
qq
qq2


redo信息
qq3


跳过redo进行恢复,在resetlogs过程中报rbs表空间坏块,然后通过dul工具获得回滚段名称,然后使用隐含参数屏蔽掉

License high water mark = 2
Starting up ORACLE RDBMS Version: 7.3.2.2.1.
System parameters with non-default values:
  processes                = 800
  shared_pool_size         = 540000000
  control_files            = D:\ORANT\DATABASE\ctl1orcl.ora, D:\ORANT\DATABASE\ctl2orcl.ora
  compatible               = 7.3.0.0.0
  log_buffer               = 327680
  log_checkpoint_interval  = 1000000
  db_files                 = 40
  db_file_simultaneous_writes= 1280
  max_rollback_segments    = 12800
  _offline_rollback_segments= RB13, RB14, RB15, RB16, RB20
  _corrupted_rollback_segments= RB13, RB14, RB15, RB16, RB20
  sequence_cache_entries   = 100
  sequence_cache_hash_buckets= 100
  remote_login_passwordfile= SHARED
  mts_servers              = 0
  mts_max_servers          = 0
  mts_max_dispatchers      = 0
  audit_trail              = NONE
  sort_area_retained_size  = 65536
  sort_direct_writes       = AUTO
  db_name                  = oracle
  open_cursors             = 800
  text_enable              = TRUE
  snapshot_refresh_processes= 1
  background_dump_dest     = %RDBMS73%\trace
  user_dump_dest           = %RDBMS73%\trace
Mon Jun 16 16:46:57 2014
PMON started
Mon Jun 16 16:46:57 2014
DBWR started
Mon Jun 16 16:46:57 2014
LGWR started
Mon Jun 16 16:46:57 2014
RECO started
Mon Jun 16 16:46:57 2014
SNP0 started
Mon Jun 16 16:46:57 2014
alter database  mount exclusive
Mon Jun 16 16:46:58 2014
Successful mount of redo thread 1.
Mon Jun 16 16:46:58 2014
Completed: alter database  mount exclusive
Mon Jun 16 16:48:15 2014
alter database open
Mon Jun 16 16:48:16 2014
Beginning crash recovery of 1 threads
Crash recovery completed successfully
Mon Jun 16 16:48:17 2014
Thread 1 advanced to log sequence 9
  Current log# 1 seq# 9 mem# 0: D:\ORANT\DATABASE\LOG2ORCL.ORA
Thread 1 opened at log sequence 9
  Current log# 1 seq# 9 mem# 0: D:\ORANT\DATABASE\LOG2ORCL.ORA
Successful open of redo thread 1.
Mon Jun 16 16:48:18 2014
SMON: enabling cache recovery
Mon Jun 16 16:48:19 2014
Completed: alter database open
Mon Jun 16 16:48:20 2014
SMON: enabling tx recovery
SMON: about to recover undo segment 14
SMON: mark undo segment 14 as needs recovery
SMON: about to recover undo segment 15
SMON: mark undo segment 15 as needs recovery
SMON: about to recover undo segment 16
SMON: mark undo segment 16 as needs recovery
SMON: about to recover undo segment 17
SMON: mark undo segment 17 as needs recovery
SMON: about to recover undo segment 18
SMON: mark undo segment 18 as needs recovery
Mon Jun 16 16:48:20 2014
Errors in file D:\ORANT\RDBMS73\trace\orclSMON.TRC:
ORA-00600: internal error code, arguments: [4306], [21], [2], [], [], [], [], []

数据库在启动过程中出现ORA-00600[4306],导致smon异常。该错误是因为在数据库open过程中smon会清理临时段从而出现该错误,通过设置event跳过,数据库算整体打开,不过在恢复过程中还遇到了

Mon Jun 16 17:53:10 2014
Errors in file D:\ORANT\RDBMS73\trace\orclDBWR.TRC:
ORA-00600: internal error code, arguments: [3600], [3], [14], [], [], [], [], []
Mon Jun 16 18:05:12 2014
Errors in file D:\ORANT\RDBMS73\trace\ORA06880.TRC:
ORA-01578: ORACLE数据块有错(文件号12, 块号46644)
ORA-01110: 文件'12'没有联机
ORA-00600: 内部错误码, 变元: [4194], [18], [5], [], [], []
ORA-00600: 内部错误码, 变元: [4194], [18], [5], [], [], []

ORA-00600[3600]是因为在offline 回滚段所在表空间锁出现的问题
ORA-00600[4194]是因为回滚段所在的表空间数据文件出现坏块所导致

Oracle安全警示录:加错裸设备导致redo异常

最近一个朋友数据库异常了,咨询我,通过分析日志发现对方人员根本不懂aix中的裸设备和Oracle数据库然后就直接使用OEM创建新表空间,导致了数据库crash而且不能正常启动

Thread 1 advanced to log sequence 4395
  Current log# 1 seq# 4395 mem# 0: /dev/rorcl_redo01
Thu Jun 12 19:28:38 2014
/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo04' SIZE 2000M EXTENT MANAGEMENT
LOCAL SEGMENT SPACE MANAGEMENT  AUTO
ORA-1119 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo04'
SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO ...
Thu Jun 12 19:36:23 2014
/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo03' SIZE 2000M EXTENT MANAGEMENT
LOCAL SEGMENT SPACE MANAGEMENT  AUTO
Thu Jun 12 19:43:56 2014
ORA-604 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/orcl_redo03'
SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO ...
Thu Jun 12 19:48:11 2014
/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo03' SIZE 2000M EXTENT
MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO
Thu Jun 12 19:48:11 2014
ORA-1537 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo03'
 SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO ...
Thu Jun 12 19:48:20 2014
/* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo04' SIZE 2000M EXTENT
MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO
ORA-1537 signalled during: /* OracleOEM */ CREATE SMALLFILE TABLESPACE "XIFENFEI" LOGGING DATAFILE '/dev/rorcl_redo04'
SIZE 2000M EXTENT MANAGEMENT LOCAL SEGMENT SPACE MANAGEMENT  AUTO ...
Fri Jun 13 00:50:37 2014
Trace dumping is performing id=[cdmp_20140613005032]
Fri Jun 13 00:50:40 2014
Reconfiguration started (old inc 4, new inc 6)
List of nodes:
 0
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE
…………
Fri Jun 13 00:50:40 2014
Beginning instance recovery of 1 threads
Reconfiguration complete
Fri Jun 13 00:50:41 2014
 parallel recovery started with 7 processes
Fri Jun 13 00:50:43 2014
Started redo scan
Fri Jun 13 00:50:43 2014
Errors in file /oracle/admin/orcl/bdump/orcl1_smon_213438.trc:
ORA-00316: log 3 of thread 2, type 0 in header is not log file
ORA-00312: online log 3 thread 2: '/dev/rorcl_redo03'
Fri Jun 13 00:50:43 2014
Errors in file /oracle/admin/orcl/bdump/orcl1_smon_213438.trc:
ORA-00316: log 3 of thread 2, type 0 in header is not log file
ORA-00312: online log 3 thread 2: '/dev/rorcl_redo03'
SMON: terminating instance due to error 316
Fri Jun 13 00:50:43 2014
Errors in file /oracle/admin/orcl/bdump/orcl1_lgwr_335980.trc:
ORA-00316: log  of thread , type  in header is not log file
Instance terminated by SMON, pid = 213438

从这里可以看出来,在使用OEM创建表空间的过程中犯了两个错误
1. 未分清楚aix的块设备和字符设备的命名方式
2. 对于2节点正在使用的current redo作为不适用设备当作未使用设备来创建新表空间
由于创建表空间的使用了错误的文件和错误的设备,导致2节点的当前redo(/dev/rorcl_redo03)被损坏(因为先读redo header,所以数据库中优先反馈出来的是ORA-00316: log of thread , type in header is not log file).从而导致数据库2节点先crash,然后节点1进行实例恢复,但是由于2节点的current redo已经损坏,导致实例恢复无法完成,从而两个节点都crash.因为是rac的一个节点的当前redo损坏,数据库无法正常.
如果有备份该数据库可以使用备份还原进行恢复,如果没有备份只能使用强制拉库的方法抢救数据.希望不要发生一个大的数据丢失悲剧
介绍这个案例希望给大家以警示:对数据库的裸设备操作请谨慎,不清楚切不可乱操作,否则后果严重

记录一次ORA-00600[kdxlin:psno out of range]/ORA-00600[3020]/ORA-00600[4000]/ORA-00600[4193]的数据库恢复

尝试recover database,遭遇ORA-00600[kdxlin:psno out of range]/ORA-00600[3020]/ORA-00354错误

Media Recovery Log
Recovery of Online Redo Log: Thread 1 Group 1 Seq 5645 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\REDO01.LOG
Mon Jun 09 15:36:10 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p001_9604.trc:
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []
Mon Jun 09 15:36:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p002_9592.trc:
ORA-00600: internal error code, arguments: [3020], [3], [23337], [12606249], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 23337)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\SYSAUX01.DBF'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Mon Jun 09 15:36:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p001_9604.trc:
ORA-10562: Error occurred while applying redo to data block (file# 3, block# 20142)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\SYSAUX01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 47841
ORA-00600: internal error code, arguments: [kdxlin:psno out of range], [], [], [], [], [], [], []
Mon Jun 09 15:36:13 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p002_9592.trc:
ORA-00600: internal error code, arguments: [3020], [3], [23337], [12606249], [], [], [], []
ORA-10567: Redo is inconsistent with data block (file# 3, block# 23337)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\SYSAUX01.DBF'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
Errors with log
Mon Jun 09 15:36:14 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p000_9600.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 2357 change 25400286 time 06/06/2014 04:00:41
ORA-00334: archived log: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\REDO02.LOG'
Mon Jun 09 15:36:14 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p000_9600.trc:
ORA-00600: internal error code, arguments: [kddummy_blkchk], [1], [1490], [6401], [], [], [], []
Mon Jun 09 15:36:16 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_p000_9600.trc:
ORA-10562: Error occurred while applying redo to data block (file# 1, block# 1490)
ORA-10564: tablespace SYSTEM
ORA-01110: data file 1: 'D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\SYSTEM01.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 203
ORA-00600: internal error code, arguments: [kddummy_blkchk], [1], [1490], [6401], [], [], [], []
Media Recovery failed with error 12801
ORA-283 signalled during: ALTER DATABASE RECOVER  database  ...

因为数据库允许少量丢失数据,且redo文件发生损坏,直接使用隐含参数屏蔽redo前滚,尝试强制拉库,报ORA-00704,ORA-00600[4000]错误

Mon Jun 09 15:57:51 2014
SMON: enabling cache recovery
Mon Jun 09 15:57:51 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\udump\gtgs_ora_8664.trc:
ORA-00600: 内部错误代码, 参数: [4000], [1], [], [], [], [], [], []
Mon Jun 09 15:57:52 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\udump\gtgs_ora_8664.trc:
ORA-00704: 引导程序进程失败
ORA-00704: 引导程序进程失败
ORA-00600: 内部错误代码, 参数: [4000], [1], [], [], [], [], [], []
Mon Jun 09 15:57:52 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Mon Jun 09 15:57:52 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_pmon_9760.trc:
ORA-00704: bootstrap process failure
Mon Jun 09 15:57:52 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_reco_5244.trc:
ORA-00704: bootstrap process failure
Mon Jun 09 15:57:52 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_smon_7096.trc:
ORA-00704: bootstrap process failure
Mon Jun 09 15:57:53 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_ckpt_7924.trc:
ORA-00704: bootstrap process failure
Mon Jun 09 15:57:53 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_lgwr_708.trc:
ORA-00704: bootstrap process failure
Mon Jun 09 15:57:53 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_dbw0_7400.trc:
ORA-00704: bootstrap process failure
Mon Jun 09 15:57:53 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_mman_9836.trc:
ORA-00704: bootstrap process failure
Instance terminated by USER, pid = 8664
ORA-1092 signalled during: alter database open resetlogs...

对数据库启动过程做10046,然后使用bbed修改scn绕过该错误,然后继续尝试打开数据库,报ORA-00604/ORA-00607/ORA-00600[4193]错误

Mon Jun 09 16:01:09 2014
SMON: enabling cache recovery
Mon Jun 09 16:01:10 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\udump\gtgs_ora_7548.trc:
ORA-00600: 内部错误代码, 参数: [4193], [57], [51], [], [], [], [], []
Mon Jun 09 16:01:10 2014
Doing block recovery for file 1 block 397
Block recovery range from rba 2.3.0 to scn 0.1073741830
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\REDO02.LOG
Block recovery stopped at EOT rba 2.5.16
Block recovery completed at rba 2.5.16, scn 0.1073741830
Doing block recovery for file 1 block 9
Block recovery range from rba 2.3.0 to scn 0.1073741829
Recovery of Online Redo Log: Thread 1 Group 2 Seq 2 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\PRODUCT\10.1.0\ORADATA\GTGS\REDO02.LOG
Block recovery completed at rba 2.5.16, scn 0.1073741830
Mon Jun 09 16:01:11 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\udump\gtgs_ora_7548.trc:
ORA-00604: 递归 SQL 级别 1 出现错误
ORA-00607: 当更改数据块时出现内部错误
ORA-00600: 内部错误代码, 参数: [4193], [57], [51], [], [], [], [], []
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Mon Jun 09 16:01:11 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_reco_9176.trc:
ORA-00604: error occurred at recursive SQL level
Mon Jun 09 16:01:11 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_smon_7932.trc:
ORA-00604: error occurred at recursive SQL level
Mon Jun 09 16:01:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_ckpt_7428.trc:
ORA-00604: error occurred at recursive SQL level
Mon Jun 09 16:01:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_lgwr_6936.trc:
ORA-00604: error occurred at recursive SQL level
Mon Jun 09 16:01:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_dbw0_404.trc:
ORA-00604: error occurred at recursive SQL level
Mon Jun 09 16:01:12 2014
Errors in file d:\oracle\product\10.1.0\admin\gtgs\bdump\gtgs_mman_7968.trc:
ORA-00604: error occurred at recursive SQL level
Instance terminated by USER, pid = 7548
ORA-1092 signalled during: ALTER DATABASE OPEN...

该错误的原因是因为数据库在启动的过程中,会事先利用上次数据库运行过程中system undo segment header指向的block,而该block异常,所以出现该错误,使用bbed/dul之类的工具清除掉undo seg header 指向block指针,然后数据库启动会重新分配一个block,从而实现数据库正常启动.