kfed恢复误删除磁盘组

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:kfed恢复误删除磁盘组

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在某些情况下,可能因为误操作,不小先drop diskgroup,这个时候千万别紧张,出现此类故障,可以通过kfed进行完美恢复(数据0丢失).如果进一步损坏了相关asm disk,那后续恢复就很麻烦了,可能需要使用dul扫描磁盘来进行抢救性恢复,而且可能导致数据丢失.
创建测试磁盘组xifenfei

[grid@xifenfei ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Thu Apr 30 15:12:08 2015
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Automatic Storage Management option
SQL>  select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
                               /dev/asm-disk3                 CANDIDATE
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER
SQL> create diskgroup xifenfei external redundancy disk '/dev/asm-disk3';
Diskgroup created.
SQL> select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
XIFENFEI_0000                  /dev/asm-disk3                 MEMBER
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER

使用/dev/asm-disk3这个磁盘创建磁盘组xifenfei

创建表,存储在xifenfei磁盘组中

[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Thu Apr 30 15:14:55 2015
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
SQL> create tablespace xifenfei datafile '+xifenfei' size 100M;
Tablespace created.
SQL> select name from v$datafile;
NAME
--------------------------------------------------------------------------------
+DATA/xifenfei/datafile/system.256.878224279
+DATA/xifenfei/datafile/sysaux.257.878224279
+DATA/xifenfei/datafile/undotbs1.258.878224279
+DATA/xifenfei/datafile/users.259.878224279
+XIFENFEI/xifenfei/datafile/xifenfei.256.878397315
SQL> create table t_xifenfei tablespace xifenfei
  2  as select * from dba_objects;
Table created.
SQL> select count(*) from t_xifenfei;
  COUNT(*)
----------
     86259

通过在磁盘组中创建表空间,从而实现表xifenfei存放在测试磁盘组中

尝试删除磁盘组xifenfei

SQL> drop diskgroup xifenfei;
drop diskgroup xifenfei
*
ERROR at line 1:
ORA-15039: diskgroup not dropped
ORA-15053: diskgroup "XIFENFEI" contains existing files
SQL> drop diskgroup xifenfei  including contents;
drop diskgroup xifenfei  including contents
*
ERROR at line 1:
ORA-15039: diskgroup not dropped
ORA-15027: active use of diskgroup "XIFENFEI" precludes its dismount
[grid@xifenfei ~]$ asmcmd
ASMCMD> lsof
DB_Name   Instance_Name  Path
xifenfei  xifenfei       +data/xifenfei/controlfile/current.260.878224379
xifenfei  xifenfei       +data/xifenfei/datafile/sysaux.257.878224279
xifenfei  xifenfei       +data/xifenfei/datafile/system.256.878224279
xifenfei  xifenfei       +data/xifenfei/datafile/undotbs1.258.878224279
xifenfei  xifenfei       +data/xifenfei/datafile/users.259.878224279
xifenfei  xifenfei       +data/xifenfei/onlinelog/group_1.261.878224381
xifenfei  xifenfei       +data/xifenfei/onlinelog/group_2.262.878224383
xifenfei  xifenfei       +data/xifenfei/onlinelog/group_3.263.878224385
xifenfei  xifenfei       +data/xifenfei/tempfile/temp.264.878224395
xifenfei  xifenfei       +xifenfei/xifenfei/datafile/xifenfei.256.878397315

由于xifenfei磁盘组被实例使用,因此磁盘组无法删除,报ORA-15027错误
由于xifenfei磁盘组中有文件,因此磁盘组无法删除,报ORA-15053错误
如果这两个阻止你误删除磁盘组的警告依然不能救你,那我也不好多说啥了,只能向我一样继续往下

关闭数据库实例,删除磁盘组

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> drop diskgroup xifenfei;
drop diskgroup xifenfei
*
ERROR at line 1:
ORA-15039: diskgroup not dropped
ORA-15053: diskgroup "XIFENFEI" contains existing files
SQL> drop diskgroup xifenfei  including contents;
Diskgroup dropped.
SQL>  select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
                               /dev/asm-disk3                 FORMER
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER
SQL> alter diskgroup xifenfei mount;
alter diskgroup xifenfei mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "XIFENFEI" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup
"XIFENFEI"

磁盘组被drop之后,无法正常mount,mount之时报ORA-15063凑无

kfed恢复删除磁盘组

[grid@xifenfei ~]$ kfed read /dev/asm-disk3 >/tmp/disk3-0-0
[grid@xifenfei ~]$ kfed  read /dev/asm-disk3  blkn=1 >/tmp/disk3-0-1
[grid@xifenfei ~]$ kfed  read /dev/asm-disk3  aun=1 >/tmp/disk3-1-0
通过vi修改这些/tmp/disk3-*中的部分值
[grid@xifenfei ~]$ kfed merge /dev/asm-disk3 text=/tmp/disk3-0-0
[grid@xifenfei ~]$ kfed merge /dev/asm-disk3  blkn=1 text=/tmp/disk3-0-1
[grid@xifenfei ~]$ kfed merge /dev/asm-disk3 aun=1 text=/tmp/disk3-1-0

查询修复后的asm disk

SQL> col path for a30
SQL> set lines 150
SQL> select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
                               /dev/asm-disk3                 MEMBER
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER

尝试mount xifenfei 磁盘组

SQL> alter diskgroup xifenfei mount;
Diskgroup altered.
SQL> select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
XIFENFEI_0000                  /dev/asm-disk3                 MEMBER
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER

测试恢复后磁盘组

SQL> startup
ORACLE instance started.
Total System Global Area  952020992 bytes
Fixed Size                  2258960 bytes
Variable Size             306186224 bytes
Database Buffers          637534208 bytes
Redo Buffers                6041600 bytes
Database mounted.
Database opened.
SQL>  select count(*) from t_xifenfei;
  COUNT(*)
----------
     86259

这里证明,当磁盘组被误删除后,立即停止进一步损坏,可以通过kfed进行完美恢复
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

分享I_OBJ4 ORA-8102故障恢复案例

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:分享I_OBJ4 ORA-8102故障恢复案例

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在测试环境中对于OBJ$中i_obj4中出现ORA-8102进行了重新并恢复测试,认为自己已经比较清楚的掌握了I_OBJ4的ORA-8102问题处理,可是实际的一个案例,还是比较比实验中复杂,这里贴出来主要操作供大家参考,再次证明数据库恢复的场景不可大意,客户的故障只有你想不到的,没有遇不到的
通过bbed修改obj$中dataobj$重现I_OBJ4索引报ORA-08102错误
使用bbed 修复I_OBJ4 index 报ORA-8102
数据库创建表提示ORA-8102错误

SQL> startup
ORACLE instance started.
Total System Global Area 2.6991E+10 bytes
Fixed Size		    2213976 bytes
Variable Size		 1.9327E+10 bytes
Database Buffers	 7516192768 bytes
Redo Buffers		  145174528 bytes
Database mounted.
Database opened.
SQL> create table t1 as select * from dual;
create table t1 as select * from dual
                                 *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)

分析ORA-08102错误

SQL> select object_name,object_type from dba_objects where object_id=39;
OBJECT_NAME                    OBJECT_TYPE
------------------------------ -------------------
I_OBJ4                         INDEX
SQL> create table t1 as select * from dual;
create table t1 as select * from dual
                                 *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)
SQL> select /*+ index(t i_obj4) */ DATAOBJ#,type#,owner# from obj$  t
minus
select /*+ full(t1) */ DATAOBJ#,type#,owner# from obj$  t1;
  2    3
  DATAOBJ#	TYPE#	  OWNER#
---------- ---------- ----------
     97109	    0	       0
SQL>  select /*+ full(t1) */ DATAOBJ#,type#,owner# from obj$  t1
  minus
  select /*+ index(t i_obj4) */ DATAOBJ#,type#,owner# from obj$  t
  ;
  2    3    4
  DATAOBJ#	TYPE#	  OWNER#
---------- ---------- ----------
     97094	    0	       0
SQL> SET LINES 122
COL INDEX_OWNER FOR A20
COL INDEX_NAME FOR A30
COL TABLE_OWNER FOR A20
COL COLUMN_NAME FOR A25
SELECT TABLE_OWNER,INDEX_NAME,COLUMN_NAME,COLUMN_POSITION
  FROM Dba_Ind_Columns
 WHERE table_name = upper('&TABLE_NAME') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION
and index_name='I_OBJ4';
SQL> SQL> SQL> SQL> SQL>   2    3
Enter value for table_name: OBJ$
old   3:  WHERE table_name = upper('&TABLE_NAME') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION
new   3:  WHERE table_name = upper('OBJ$') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION
TABLE_OWNER	     INDEX_NAME 		    COLUMN_NAME 	      COLUMN_POSITION
-------------------- ------------------------------ ------------------------- ---------------
SYS		     I_OBJ4			    DATAOBJ#				    1
SYS		     I_OBJ4			    TYPE#				    2
SYS		     I_OBJ4			    OWNER#				    3
SQL> SELECT DATAOBJ# FROM OBJ$ WHERE OBJ#=97109;
no rows selected
SQL> SELECT DATAOBJ# FROM OBJ$ WHERE OBJ#=97094;
  DATAOBJ#
----------
     97094
SQL> select /*+ index(t i_obj4) */ rowid,DATAOBJ#,type#,owner# from obj$  t
minus
select /*+ full(t1) */ rowid,DATAOBJ#,type#,owner# from obj$  t1;
  2    3
ROWID		     DATAOBJ#	   TYPE#     OWNER#
------------------ ---------- ---------- ----------
AAAAASAABAAAADxAAb	97109	       0	  0
SQL>  select /*+ full(t1) */ rowid,DATAOBJ#,type#,owner# from obj$  t1
  minus
  select /*+ index(t i_obj4) */ rowid,DATAOBJ#,type#,owner# from obj$  t
  ;
  2    3    4
ROWID		     DATAOBJ#	   TYPE#     OWNER#
------------------ ---------- ---------- ----------
AAAAASAABAAAADxAAb	97094	       0	  0
SQL> select name,obj#,dataobj# from obj$ where rowid='AAAAASAABAAAADxAAb';
NAME				     OBJ#   DATAOBJ#
------------------------------ ---------- ----------
_NEXT_OBJECT				1      97094

到此也比较清楚,rowid为AAAAASAABAAAADxAAb的dataobj#记录在obj$表中为97094而在I_OBJ4中记录为97109,因此两者不一致,从而出现ORA-8102错误

尝试bbed解决ORA-8102问题
尝试修改obj$和i_obj4中的dataobj#记录一致,这里修改obj$中的对应记录

SQL> select dbms_rowid.rowid_relative_fno(rowid) file#,dbms_rowid.rowid_block_number(rowid) block#,
dbms_rowid.rowid_row_number(rowid) row#
from obj$ where rowid='AAAAASAABAAAADxAAb'  2    3
  4  /
     FILE#     BLOCK#	    ROW#
---------- ---------- ----------
	 1	  241	      27
SQL> select dump(97109,16) from dual;
DUMP(97109,16)
----------------------
Typ=2 Len=4: c3,a,48,a
SQL> select dump(97094,16) from dual;
DUMP(97094,16)
-----------------------
Typ=2 Len=4: c3,a,47,5f
-bash-4.1$ bbed blocksize=8192 mode=edit filename=/u01/app/oracle/oradata/oa/system01.dbf
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Sat Mar 14 19:30:18 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> show all
	FILE#          	0
	BLOCK#         	1
	OFFSET         	0
	DBA            	0x00000000 (0 0,1)
	FILENAME       	/u01/app/oracle/oradata/oa/system01.dbf
	BIFILE         	bifile.bbd
	LISTFILE
	BLOCKSIZE      	8192
	MODE           	Edit
	EDIT           	Unrecoverable
	IBASE          	Dec
	OBASE          	Dec
	WIDTH          	80
	COUNT          	512
	LOGFILE        	log.bbd
	SPOOL          	No
BBED> set block 241
	BLOCK#         	241
BBED> map
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 241                                   Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 48 bytes                     @20
 struct kdbh, 14 bytes                      @68
 struct kdbt[1], 4 bytes                    @82
 sb2 kdbr[105]                              @86
 ub1 freespace[87]                          @296
 ub1 rowdata[7805]                          @383
 ub4 tailchk                                @8188
BBED> p *kdbr[27]
rowdata[0]
----------
ub1 rowdata[0]                              @383      0x2c
BBED> x /rnnncnnncc
rowdata[0]                                  @383
----------
flag@383:  0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@384:  0x00
cols@385:    18
col    0[2] @386: 1
col    1[4] @389: 97094
col    2[1] @394: 0
col   3[12] @396: _NEXT_OBJECT
col    4[2] @409: 1
col    5[0] @412: *NULL*
col    6[1] @413: 0
col    7[7] @415: xm....4
col    8[7] @423: xs....6
col    9[7] @431: xm....4
col   10[1] @439: .
col   11[0] @441: *NULL*
col   12[0] @442: *NULL*
col   13[1] @443: .
col   14[0] @445: *NULL*
col   15[1] @446: .
col   16[4] @448: ..8$
col   17[1] @453: .
BBED> set count 32
	COUNT          	32
BBED> set offset 389
	OFFSET         	389
BBED> d
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 241              Offsets:  389 to  420           Dba:0x00000000
------------------------------------------------------------------------
 04c30a47 5f01800c 5f4e4558 545f4f42 4a454354 02c102ff 01800778 6d080f01
 <32 bytes per line>
BBED> set offset +3
	OFFSET         	392
BBED> d
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 241              Offsets:  392 to  423           Dba:0x00000000
------------------------------------------------------------------------
 475f0180 0c5f4e45 58545f4f 424a4543 5402c102 ff018007 786d080f 01113407
 <32 bytes per line>
BBED> m /x 480a
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 241              Offsets:  392 to  423           Dba:0x00000000
------------------------------------------------------------------------
 480a0180 0c5f4e45 58545f4f 424a4543 5402c102 ff018007 786d080f 01113407
 <32 bytes per line>
BBED>  p *kdbr[27]
rowdata[0]
----------
ub1 rowdata[0]                              @383      0x2c
BBED> x /rnnncnnncc
rowdata[0]                                  @383
----------
flag@383:  0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@384:  0x00
cols@385:    18
col    0[2] @386: 1
col    1[4] @389: 97109
col    2[1] @394: 0
col   3[12] @396: _NEXT_OBJECT
col    4[2] @409: 1
col    5[0] @412: *NULL*
col    6[1] @413: 0
col    7[7] @415: xm....4
col    8[7] @423: xs....6
col    9[7] @431: xm....4
col   10[1] @439: .
col   11[0] @441: *NULL*
col   12[0] @442: *NULL*
col   13[1] @443: .
col   14[0] @445: *NULL*
col   15[1] @446: .
col   16[4] @448: ..8$
col   17[1] @453: .
BBED> sum apply
Check value for File 0, Block 241:
current = 0x913d, required = 0x913d

验证bbed修改后效果

SQL> startup
ORACLE instance started.
Total System Global Area 2.6991E+10 bytes
Fixed Size		    2213976 bytes
Variable Size		 1.9327E+10 bytes
Database Buffers	 7516192768 bytes
Redo Buffers		  145174528 bytes
Database mounted.
Database opened.
SQL> select /*+ index(t i_obj4) */ rowid,DATAOBJ#,type#,owner# from obj$  t
minus
select /*+ full(t1) */ rowid,DATAOBJ#,type#,owner# from obj$  t1;
  2    3
no rows selected
SQL>  select /*+ full(t1) */ rowid,DATAOBJ#,type#,owner# from obj$  t1
  minus
  select /*+ index(t i_obj4) */ rowid,DATAOBJ#,type#,owner# from obj$  t
  ;
  2    3    4
no rows selected
SQL> ANALYZE TABLE sys.obj$ VALIDATE STRUCTURE CASCADE;
ANALYZE TABLE sys.obj$ VALIDATE STRUCTURE CASCADE
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file
SQL> create table t_xifenfei as select * from dual;
create table t_xifenfei as select * from dual
                                         *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)

这里比较悲剧,我们看到i_obj4和obj$中对应记录已经一致(条数和数据值),但是依然不能执行创建表操作,依旧报ORA-8102错误.

进一步分析错误原因

SQL> ALTER SESSION SET EVENTS '802 trace name errorstack level 3';
Session altered.
SQL> create table t as select * from dual;
create table t as select * from dual
                                *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)
SQL> select value from v$diag_info where name='Default Trace File';
VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6163.trc
oer 8102.2 - obj# 39, rdba: 0x00416e92(afn 1, blk# 93842)
kdk key 8102.2:
  ncol: 4, len: 16
  key: (16):  04 c3 0a 48 0a 01 80 01 80 06 00 40 00 f1 00 1b
--这里可以看出来,提示ORA-8102错误依旧在I_OBJ4,97109记录上
SQL> select max(dataobj#) from obj$;
MAX(DATAOBJ#)
-------------
	96815
SQL> select name,obj#,dataobj# from obj$ where rowid='AAAAASAABAAAADxAAb';
NAME				     OBJ#   DATAOBJ#
------------------------------ ---------- ----------
_NEXT_OBJECT				1      97109
--这里很奇怪,通过rowid查询我们已经的出来在obj$中有dataobj#为97109,而通过max(dataobj#)只有96815
分析max(dataobj#)执行计划
SQL> SET AUTOT TRACE
SQL> select max(dataobj#) from obj$;
Execution Plan
----------------------------------------------------------
Plan hash value: 721075849
------------------------------------------------------------------------------------
| Id  | Operation		   | Name   | Rows  | Bytes | Cost (%CPU)| Time    |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	   |	    |	  1 |	  2 |	  2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE 	   |	    |	  1 |	  2 |		 |    |
|   2 |   INDEX FULL SCAN (MIN/MAX)| I_OBJ4 |	  1 |	  2 |	  2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
	  2  consistent gets
	  0  physical reads
	  0  redo size
	533  bytes sent via SQL*Net to client
	520  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed
--这里知晓,由于max(dataobj#)使用了INDEX FULL SCAN (MIN/MAX)执行计划,从而的出来最大值为96815,
--而我们从ORA-8102错误中可以看到index中有dataobj#为97109,证明index中的链表可能出问题
--为什么怀疑是链表有问题呢?因为该index的ffs正常

尝试把ORA-8102报错标记坏块尝试
并且通过event和隐含参数屏蔽index坏块

-bash-4.1$ bbed blocksize=8192 mode=edit filename=/u01/app/oracle/oradata/oa/system01.dbf
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Sat Mar 14 20:30:58 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set block 93842
	BLOCK#         	93842
BBED> set offset 8188
	OFFSET         	8188
BBED> d
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 93842            Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 010675ad
 <32 bytes per line>
BBED> m /x 02
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 93842            Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 020675ad
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 93842:
current = 0x9186, required = 0x9186
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/app/oracle/oradata/oa/system01.dbf
BLOCK = 93842
Block 93842 is corrupt
Corrupt block relative dba: 0x00416e92 (file 0, block 93842)
Fractured block found during verification
Data in bad block:
 type: 6 format: 2 rdba: 0x00416e92
 last change scn: 0x0000.c007ad75 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xad750602
 check value in block header: 0x9186
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
-bash-4.1$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on Sat Mar 14 20:33:19 2015
Copyright (c) 1982, 2009, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 2.6991E+10 bytes
Fixed Size		    2213976 bytes
Variable Size		 1.9327E+10 bytes
Database Buffers	 7516192768 bytes
Redo Buffers		  145174528 bytes
Database mounted.
Database opened.
SQL> create table t as select * from dual;
create table t as select * from dual
                                *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 93842)
ORA-01110: data file 1: '/u01/app/oracle/oradata/oa/system01.dbf'

通过这一步测试证明,在该ora-8102的错误中,通过坏块是无法解决绕过去该错误,只是把错误从ORA-8102转变为了ORA-1578

修复好制造坏块block

BBED> m /x 01
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 93842            Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 010675ad
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 93842:
current = 0x9185, required = 0x9185
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/app/oracle/oradata/oa/system01.dbf
BLOCK = 93842
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 1
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED
SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 2.6991E+10 bytes
Fixed Size		    2213976 bytes
Variable Size		 1.9327E+10 bytes
Database Buffers	 7516192768 bytes
Redo Buffers		  145174528 bytes
Database mounted.
Database opened.
SQL> create table t1 as select * from dual;
create table t1 as select * from dual
                                 *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)

至此我们大体出来信息:
1. ORA-8102的是I_OBJ4中的_NEXT_OBJECT记录异常和该index链表异常
2. 通过bbed修改I_OBJ4和obj$中相应记录无法解决该问题,因为还有链表异常
3. 通过标记为坏块也无法绕过该问题
由于后续如果继续修复修复i_obj4可能工作量过大,而且可以也比较急,通过人工直接删除I_OBJ4数据字典记录,然后结合bootstrap$核心index(I_OBJ1,I_USER1,I_FILE#_BLOCK#,I_IND1,I_TS#,I_CDEF1等)异常恢复—ORA-00701错误解决处理实现完美恢复

ORA-01157: cannot identify/lock data file导致表空间无法online

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-01157: cannot identify/lock data file导致表空间无法online

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

朋友和我聊到一个案例,一个表空间本来有一个数据文件(已经有大量数据进入),然后他加入7个数据文件,这个时候他发现他加多了文件,想删除这些数据文件,可是又不怎么懂数据库,直接offline 表空间,然后rm掉相关数据文件,最后表空间无法online.我对其场景进行了模拟恢复测试,使用bbed和dul对其进行恢复

模拟场景
表空间有一个数据文件,里面有数据,然后加入一个新数据文件,进入offline tbs,然后rm数据文件
offline表空间后,该表空间存在的数据文件也会offline

SQL> create tablespace xifenfei datafile
  2  '/u01/oracle/ora9i/oradata/xifenfei01.dbf' size 10M autoextend on next 16M;
Tablespace created.
SQL> create table chf.t_xifenfei01 tablespace xifenfei
  2  as select * from dba_objects;
Table created.
SQL> create table chf.t_xifenfei02 tablespace xifenfei
  2  as select * from dba_tables;
Table created.
SQL> SELECT COUNT(*) FROM CHF.T_XIFENFEI01;
  COUNT(*)
----------
     30758
SQL> C/01/02
  1* SELECT COUNT(*) FROM CHF.T_XIFENFEI02
SQL> /
  COUNT(*)
----------
       865
SQL> select tablespace_name,segment_name from dba_segments
  2  where segment_name like 'T_XIFENFEI%';
TABLESPACE_NAME                SEGMENT_NAME
------------------------------ -------------------------
XIFENFEI                       T_XIFENFEI01
XIFENFEI                       T_XIFENFEI02
SQL> ALTER tablespace xifenfei add datafile
  2  '/u01/oracle/ora9i/oradata/xifenfei02.dbf' size 16m;
Tablespace altered.
SQL> alter tablespace xifenfei offline;
Tablespace altered.
SQL> select file#,status$ from file$;
     FILE#    STATUS$
---------- ----------
         1          2
         2          2
         3          2
         4          2
         5          2
         6          2
         7          2
         8          2
         9          2
        10          2
        11          2
        12          2
12 rows selected.
SQL> set pages 1000
SQL> /
     FILE#    STATUS$
---------- ----------
         1          2
         2          2
         3          2
         4          2
         5          2
         6          2
         7          2
         8          2
         9          2
        10          2
        11          2
        12          2
12 rows selected.
SQL> select file#,status from v$datafile;
     FILE# STATUS
---------- -------
         1 SYSTEM
         2 ONLINE
         3 ONLINE
         4 ONLINE
         5 ONLINE
         6 ONLINE
         7 ONLINE
         8 ONLINE
         9 ONLINE
        10 ONLINE
        11 OFFLINE
        12 OFFLINE
12 rows selected.
SQL> select file#,status from v$datafile_header;
     FILE# STATUS
---------- -------
         1 ONLINE
         2 ONLINE
         3 ONLINE
         4 ONLINE
         5 ONLINE
         6 ONLINE
         7 ONLINE
         8 ONLINE
         9 ONLINE
        10 ONLINE
        11 OFFLINE
        12 OFFLINE
12 rows selected.
SQL> select tablespace_name,status from dba_tablespaces;
TABLESPACE_NAME                STATUS
------------------------------ ---------
SYSTEM                         ONLINE
UNDOTBS1                       ONLINE
TEMP                           ONLINE
CWMLITE                        ONLINE
DRSYS                          ONLINE
EXAMPLE                        ONLINE
INDX                           ONLINE
ODM                            ONLINE
TOOLS                          ONLINE
USERS                          ONLINE
XDB                            ONLINE
XIFENFEI                       OFFLINE
12 rows selected.
SQL> select ts#,name,online$ from ts$;
       TS# NAME                              ONLINE$
---------- ------------------------------ ----------
         0 SYSTEM                                  1
         1 UNDOTBS1                                1
         2 TEMP                                    1
         3 CWMLITE                                 1
         4 DRSYS                                   1
         5 EXAMPLE                                 1
         6 INDX                                    1
         7 ODM                                     1
         8 TOOLS                                   1
         9 USERS                                   1
        10 XDB                                     1
        11 UNDOTBS2                                3
        12 XIFENFEI                                2
13 rows selected.
SQL> col name for a50
SQL> select file#,name from v$datafile;
     FILE# NAME
---------- --------------------------------------------------
         1 /u01/oracle/ora9i/oradata/ora9i/system01.dbf
         2 /u01/oracle/ora9i/oradata/ora9i/undotbs01.dbf
         3 /u01/oracle/ora9i/oradata/ora9i/cwmlite01.dbf
         4 /u01/oracle/ora9i/oradata/ora9i/drsys01.dbf
         5 /u01/oracle/ora9i/oradata/ora9i/example01.dbf
         6 /u01/oracle/ora9i/oradata/ora9i/indx01.dbf
         7 /u01/oracle/ora9i/oradata/ora9i/odm01.dbf
         8 /u01/oracle/ora9i/oradata/ora9i/tools01.dbf
         9 /u01/oracle/ora9i/oradata/ora9i/users01.dbf
        10 /u01/oracle/ora9i/oradata/ora9i/xdb01.dbf
        11 /u01/oracle/ora9i/oradata/xifenfei01.dbf
        12 /u01/oracle/ora9i/oradata/xifenfei02.dbf
12 rows selected.
SQL> !rm /u01/oracle/ora9i/oradata/xifenfei02.dbf
SQL> !ls -l /u01/oracle/ora9i/oradata/xifenfei*
-rw-r-----  1 ora9i oinstall 10493952 Dec 25 20:19 /u01/oracle/ora9i/oradata/xifenfei01.dbf
SQL> alter tablespace xifenfei online;
alter tablespace xifenfei online
*
ERROR at line 1:
ORA-01157: cannot identify/lock data file 12 - see DBWR trace file
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'

这里已经构造出来,由于由于数据文件丢失导致表空间online报ORA-01157: cannot identify/lock data file的错误

online存在数据文件
存在的数据文件online之后,表空间依然offline

SQL> alter database datafile 11 online;
Database altered.
SQL> select file#,status from v$datafile_header;
     FILE# STATUS
---------- -------
         1 ONLINE
         2 ONLINE
         3 ONLINE
         4 ONLINE
         5 ONLINE
         6 ONLINE
         7 ONLINE
         8 ONLINE
         9 ONLINE
        10 ONLINE
        11 ONLINE
        12 OFFLINE
12 rows selected.
SQL> select file#,status from v$datafile;
     FILE# STATUS
---------- -------
         1 SYSTEM
         2 ONLINE
         3 ONLINE
         4 ONLINE
         5 ONLINE
         6 ONLINE
         7 ONLINE
         8 ONLINE
         9 ONLINE
        10 ONLINE
        11 ONLINE
        12 OFFLINE
12 rows selected.
SQL> select tablespace_name,status from dba_tablespaces;
TABLESPACE_NAME                STATUS
------------------------------ ---------
SYSTEM                         ONLINE
UNDOTBS1                       ONLINE
TEMP                           ONLINE
CWMLITE                        ONLINE
DRSYS                          ONLINE
EXAMPLE                        ONLINE
INDX                           ONLINE
ODM                            ONLINE
TOOLS                          ONLINE
USERS                          ONLINE
XDB                            ONLINE
XIFENFEI                       OFFLINE
12 rows selected.
SQL> select ts#,name,online$ from ts$;
       TS# NAME                                                  ONLINE$
---------- -------------------------------------------------- ----------
         0 SYSTEM                                                      1
         1 UNDOTBS1                                                    1
         2 TEMP                                                        1
         3 CWMLITE                                                     1
         4 DRSYS                                                       1
         5 EXAMPLE                                                     1
         6 INDX                                                        1
         7 ODM                                                         1
         8 TOOLS                                                       1
         9 USERS                                                       1
        10 XDB                                                         1
        11 UNDOTBS2                                                    3
        12 XIFENFEI                                                    2
13 rows selected.
SQL> create table chf.t_1 tablespace xifenfei
  2  as select * from dual;
as select * from dual
                 *
ERROR at line 2:
ORA-01542: tablespace 'XIFENFEI' is offline, cannot allocate space in it

由于表空间无法online,因此对该表空间分配报ORA-01542错误

使用bbed构造数据文件欺骗数据库

[ora9i@xifenfei tmp]$ dd if=/dev/zero of=/u01/oracle/ora9i/oradata/xifenfei02.dbf
> bs=8192 count=2049
2049+0 records in
2049+0 records out
BBED> copy file 11 block 1 to file 12 block 1
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:    0 to  511           Dba:0x03000001
------------------------------------------------------------------------
 0b020000 0100c002 00000000 00000104 1deb0000 00002009 00000008 329bf29e
 4f524139 49000000 11010000 00050000 00200000 0b000300 00000000 00000000
 <32 bytes per line>
BBED> set count 32
        COUNT           32
BBED> m /x 0c
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:  280 to  311           Dba:0x03000001
------------------------------------------------------------------------
 0c000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> m /x 0c
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:   52 to   83           Dba:0x03000001
------------------------------------------------------------------------
 0c000300 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> sum apply
Check value for File 12, Block 1:
current = 0xad21, required = 0xad21
SQL> alter tablespace xifenfei online;
alter tablespace xifenfei online
*
ERROR at line 1:
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'
ORA-01251: Unknown File Header Version read for file number 12
SQL> !oerr ora 01251
01251, 00000, "Unknown File Header Version read for file number %s"
// *Cause:  Read of the file header returned a record but its version cannot
//         be identified.  Either the header has been corrupted, or the file
//         is not a valid database file.
// *Action: Have the operating system make the correct file available to
//         the database, or recover the file.
BBED> m /x 01000003
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:    4 to  36           Dba:0x03000001
------------------------------------------------------------------------
 01000003 00000000 00000104 21ad0000 00002009 00000008 329bf29e 4f524139
 <32 bytes per line>
BBED> sum apply
Check value for File 12, Block 1:
current = 0xace1, required = 0xace1
SQL> alter tablespace xifenfei online;
alter tablespace xifenfei online
*
ERROR at line 1:
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'
ORA-01203: wrong incarnation of this file - wrong creation SCN
BBED> m /x 613B0300
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:  100 to  131           Dba:0x03000001
------------------------------------------------------------------------
 613b0300 00000000 f35bb133 f39bae33 b46c0200 00000000 00000000 00000000
 <32 bytes per line>
BBED> sum apply
Check value for File 12, Block 1:
current = 0xa0c9, required = 0xa0c9
--提示数据文件scn比控制文件新(由于正常数据文件online过)
SQL> alter tablespace xifenfei online;
alter tablespace xifenfei online
*
ERROR at line 1:
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'
ORA-01207: file is more recent than controlfile - old controlfile
SQL> set lines 160
SQL> col name for a50
SQL> select ts#,file#,BYTES/1024/1024/1024 status,enabled,
  2  to_char(checkpoint_change#,'999999999999999') "SCN",
  3  to_char(last_change#,'999999999999999')"STOP_SCN",
  4  name from v$datafile;
  TS#  FILE#     STATUS ENABLED SCN      STOP_SCN         NAME
----- --- ---------- ---------- -------- ------------- ---------------------------------------------
    0   1  .37109375 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/system01.dbf
    1   2   .1953125 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/undotbs01.dbf
    3   3  .01953125 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/cwmlite01.dbf
    4   4  .01953125 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/drsys01.dbf
    5   5 .145874023 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/example01.dbf
    6   6 .024414063 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/indx01.dbf
    7   7  .01953125 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/odm01.dbf
    8   8 .009765625 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/tools01.dbf
    9   9 .024414063 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/users01.dbf
   10  10 .043945313 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/xdb01.dbf
   12  11 .009765625 DISABLED     228353        228353 /u01/oracle/ora9i/oradata/xifenfei01.dbf
   12  12          0 DISABLED     213917        213917 /u01/oracle/ora9i/oradata/xifenfei02.dbf
12 rows selected.
SQL> set pages 100
SQL> set linesize 150
SQL> select ts#,file#,TABLESPACE_NAME,status,
  2  to_char(CREATION_TIME,'yyyy-mm-dd hh24:mi:ss') CREATE_TIME,
  3  to_char(checkpoint_change#,'9999999999999999') "SCN",
  4  to_char(RESETLOGS_CHANGE#,'9999999999999999') "RESETLOGS SCN",FUZZY
from v$datafile_header;  5
TS# FILE# TABLESPACE_NAME  STATUS  CREATE_TIME         SCN       RESETLOGS SCN     FUZ
--- ----- ---------------- ------- ------------------- --------- ----------------- ---
  0     1 SYSTEM           ONLINE  2004-03-10 12:17:49    198231            158900 YES
  1     2 UNDOTBS1         ONLINE  2004-03-10 20:30:20    198231            158900 YES
  3     3 CWMLITE          ONLINE  2004-03-10 12:18:10    198231            158900 YES
  4     4 DRSYS            ONLINE  2004-03-10 12:18:11    198231            158900 YES
  5     5 EXAMPLE          ONLINE  2004-03-10 12:18:12    198231            158900 YES
  6     6 INDX             ONLINE  2004-03-10 12:18:16    198231            158900 YES
  7     7 ODM              ONLINE  2004-03-10 12:18:17    198231            158900 YES
  8     8 TOOLS            ONLINE  2004-03-10 12:18:19    198231            158900 YES
  9     9 USERS            ONLINE  2004-03-10 12:18:19    198231            158900 YES
 10    10 XDB              ONLINE  2004-03-10 12:18:20    198231            158900 YES
 12    11 XIFENFEI         OFFLINE 2014-12-25 17:57:07    228353            158900 NO
  0    12                  OFFLINE                             0                 0
12 rows selected.
BBED> m /x 9D430300
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:  140 to  171           Dba:0x03000001
------------------------------------------------------------------------
 9d430300 00000000 f670b133 01000000 02000000 fc4e0000 10000000 02000000
 <32 bytes per line>
BBED> SUM APPLY
Check value for File 12, Block 1:
current = 0x9982, required = 0x9982
SQL> ALTER TABLESPACE XIFENFEI ONLINE;
Tablespace altered.

在bbed的恢复过程中出现以下常见错误:
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: ‘/u01/oracle/ora9i/oradata/xifenfei02.dbf’
ORA-01251: Unknown File Header Version read for file number 12
这个主要是块号和块地址不一匹配,导致数据库无法识别

ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: ‘/u01/oracle/ora9i/oradata/xifenfei02.dbf’
ORA-01203: wrong incarnation of this file – wrong creation SCN
这个由于数据文件头的创建scn和控制文件以及file$中的记录不一致

ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: ‘/u01/oracle/ora9i/oradata/xifenfei02.dbf’
ORA-01207: file is more recent than controlfile – old controlfile
这个由于正常数据文件online过,或者你构造文件头使用的scn大于需要构造文件在控制文件中记录的scn值

后续处理

SQL> SELECT COUNT(*) FROM CHF.T_XIFENFEI01;
  COUNT(*)
----------
     30758
SQL> C/01/02
  1* SELECT COUNT(*) FROM CHF.T_XIFENFEI02
SQL> /
  COUNT(*)
----------
       865
SQL> ALTER DATABASE DATAFILE 12 OFFLINE DROP;
Database altered.

证明通过bbed,模拟数据文件,使用完美恢复数据内容

dul恢复

SQL> ALTER TABLESPACE XIFENFEI OFFLINE;
Tablespace altered.
[ora9i@xifenfei dul]$ ./dul
Data UnLoader: 10.2.0.5.33 - Internal Only - on Thu Dec 25 21:18:54 2014
with 64-bit io functions
Copyright (c) 1994 2014 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
Within one week you will need a more recent DUL version for this os
Found db_id = 2666699570
Found db_name = ORA9I
DUL> bootstrap;
Probing file = 1, block = 417
  database version 9 bootstrap$ at file 1, block 377
. unloading table                BOOTSTRAP$
DUL: Warning: block number is non zero but marked deferred trying to process it anyhow
      57 rows unloaded
DUL: Warning: Dictionary cache DC_BOOTSTRAP is empty
Reading BOOTSTRAP.dat 57 entries loaded
Parsing Bootstrap$ contents
Generating dict.ddl for version 9
 OBJ$: segobjno 18, file 1 block 121
 TAB$: segobjno 2, tabno 1, file 1  block 25
 COL$: segobjno 2, tabno 5, file 1  block 25
 USER$: segobjno 10, tabno 1, file 1  block 89
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$   31490 rows unloaded
. unloading table                      TAB$     909 rows unloaded
. unloading table                      COL$   35582 rows unloaded
. unloading table                     USER$      63 rows unloaded
Reading USER.dat 63 entries loaded
Reading OBJ.dat 31490 entries loaded and sorted 31490 entries
Reading TAB.dat 909 entries loaded
Reading COL.dat 35582 entries loaded and sorted 35582 entries
Reading BOOTSTRAP.dat 57 entries loaded
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 9
 OBJ$: segobjno 18, file 1 block 121
 TAB$: segobjno 2, tabno 1, file 1  block 25
 COL$: segobjno 2, tabno 5, file 1  block 25
 USER$: segobjno 10, tabno 1, file 1  block 89
 TABPART$: segobjno 230, file 1 block 1657
 INDPART$: segobjno 234, file 1 block 1689
 TABCOMPART$: segobjno 249, file 1 block 1809
 INDCOMPART$: segobjno 253, file 1 block 1841
 TABSUBPART$: segobjno 240, file 1 block 1737
 INDSUBPART$: segobjno 245, file 1 block 1777
 IND$: segobjno 2, tabno 3, file 1  block 25
 ICOL$: segobjno 2, tabno 4, file 1  block 25
 LOB$: segobjno 2, tabno 6, file 1  block 25
 COLTYPE$: segobjno 2, tabno 7, file 1  block 25
 TYPE$: segobjno 296, tabno 1, file 1  block 2129
 COLLECTION$: segobjno 296, tabno 2, file 1  block 2129
 ATTRIBUTE$: segobjno 296, tabno 3, file 1  block 2129
 LOBFRAG$: segobjno 258, file 1 block 1881
 LOBCOMPPART$: segobjno 261, file 1 block 1905
 UNDO$: segobjno 15, file 1 block 105
 TS$: segobjno 6, tabno 2, file 1  block 57
 PROPS$: segobjno 101, file 1 block 769
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$
DUL: Warning: Recreating file "OBJ.ctl"
   31490 rows unloaded
. unloading table                      TAB$
DUL: Warning: Recreating file "TAB.ctl"
     909 rows unloaded
. unloading table                      COL$
DUL: Warning: Recreating file "COL.ctl"
   35582 rows unloaded
. unloading table                     USER$
DUL: Warning: Recreating file "USER.ctl"
      63 rows unloaded
. unloading table                  TABPART$      55 rows unloaded
. unloading table                  INDPART$     128 rows unloaded
. unloading table               TABCOMPART$       0 rows unloaded
. unloading table               INDCOMPART$       0 rows unloaded
. unloading table               TABSUBPART$       0 rows unloaded
. unloading table               INDSUBPART$       0 rows unloaded
. unloading table                      IND$    1396 rows unloaded
. unloading table                     ICOL$    1842 rows unloaded
. unloading table                      LOB$     427 rows unloaded
. unloading table                  COLTYPE$    1160 rows unloaded
. unloading table                     TYPE$     956 rows unloaded
. unloading table               COLLECTION$     250 rows unloaded
. unloading table                ATTRIBUTE$    3623 rows unloaded
. unloading table                  LOBFRAG$       0 rows unloaded
. unloading table              LOBCOMPPART$       0 rows unloaded
. unloading table                     UNDO$      21 rows unloaded
. unloading table                       TS$      13 rows unloaded
. unloading table                    PROPS$      25 rows unloaded
Reading USER.dat 63 entries loaded
Reading OBJ.dat 31490 entries loaded and sorted 31490 entries
Reading TAB.dat 909 entries loaded
Reading COL.dat 35582 entries loaded and sorted 35582 entries
Reading TABPART.dat 55 entries loaded and sorted 55 entries
Reading TABCOMPART.dat 0 entries loaded and sorted 0 entries
Reading TABSUBPART.dat 0 entries loaded and sorted 0 entries
Reading INDPART.dat 128 entries loaded and sorted 128 entries
Reading INDCOMPART.dat 0 entries loaded and sorted 0 entries
Reading INDSUBPART.dat 0 entries loaded and sorted 0 entries
Reading IND.dat 1396 entries loaded
Reading LOB.dat 427 entries loaded
Reading ICOL.dat 1842 entries loaded
Reading COLTYPE.dat 1160 entries loaded
Reading TYPE.dat 956 entries loaded
Reading ATTRIBUTE.dat 3623 entries loaded
Reading COLLECTION.dat 250 entries loaded
Reading BOOTSTRAP.dat 57 entries loaded
Reading LOBFRAG.dat 0 entries loaded and sorted 0 entries
Reading LOBCOMPPART.dat 0 entries loaded and sorted 0 entries
Reading UNDO.dat 21 entries loaded
Reading TS.dat 13 entries loaded
Reading PROPS.dat 25 entries loaded
Database character set is ZHS16GBK
Database national character set is AL16UTF16
DUL> unload table chf.t_xifenfei01;
. unloading table              T_XIFENFEI01   30758 rows unloaded
DUL> unload table chf.t_xifenfei02;
. unloading table              T_XIFENFEI02     865 rows unloaded
DUL>

这里证明,在表空间不能online的情况下,dul处理类似故障很轻松(但是需要重建表空间,bbed处理的从原理上可以不重建表空间),当然如果涉及的对象很多,而丢失的文件很少,可能bbed也不失为一种很不错的选择.
另外补充:如果表空间丢失的文件包含数据,也可以通过类似bbed和dul方法处理:bbed 让表空间online之后需要考虑跳过丢失文件;dul处理完全相同

asm disk误设置pvid导致asm diskgroup无法mount恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:asm disk误设置pvid导致asm diskgroup无法mount恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友找到我说他们把以前存储到AIX直连的存储切换为含光纤交换机的存储网络后,RAC无法启动,让我给予支持.通过分析是由于换链路之后开始磁盘顺序不对,维护人员对其asm disk 设置了pvid,导致asm 磁盘组无法正常mount,从而使得含votedisk的dg的asm disk无法正常访问,从而RAC的cssd进程无法启动,同样数据文件的磁盘组也无法mount,通过kfed修复成功,实现数据0丢失.
平台版本信息(2节点RAC)

$ sqlplus -v
SQL*Plus: Release 11.2.0.4.0 Production
$ uname -a
AIX db2 1 7 00F9733E4C00

GI日志报错信息

2014-12-20 16:44:08.769:
[ohasd(6946818)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2014-12-20 16:44:11.775:
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-20 16:44:26.791:
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
、Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-20 16:44:41.812:
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log

从这里可以看出来是由于RAC启动过程中无法获得votedisk使得其无法正常启动,通过分析日志找出来votedisk相关磁盘

2014-12-15 17:36:15.424:
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk4; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-15 17:36:15.433:
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk5; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-15 17:36:15.445:
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk6; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log

从这里可以知道rhdisk4,5,6为votedisk对应磁盘,使用kfed查看磁盘头信息

$kfed read /dev/rhdisk4
kfbh.endian:                        201 ; 0x000: 0xc9
kfbh.hard:                          194 ; 0x001: 0xc2
kfbh.type:                          212 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                        193 ; 0x003: 0xc1
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
1102BEE00 C9C2D4C1 00000000 00000000 00000000  [................]
1102BEE10 00000000 00000000 00000000 00000000  [................]
        Repeat 6 times
1102BEE80 00F9733D 67553E0A 00000000 00000000  [..s=gU>.........]
1102BEE90 00000000 00000000 00000000 00000000  [................]
  Repeat 246 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][212]
$kfed read /dev/rhdisk4 blkn=1
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            2 ; 0x002: KFBTYP_FREESPC
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       1 ; 0x004: blk=1
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3883664132 ; 0x00c: 0xe77c0304
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdfsb.aunum:                         0 ; 0x000: 0x00000000
kfdfsb.max:                         254 ; 0x004: 0x00fe
kfdfsb.cnt:                          23 ; 0x006: 0x0017
kfdfsb.bound:                         0 ; 0x008: 0x0000
kfdfsb.flag:                          1 ; 0x00a: B=1
kfdfsb.ub1spare:                      0 ; 0x00b: 0x00
kfdfsb.spare[0]:                      0 ; 0x00c: 0x00000000
kfdfsb.spare[1]:                      0 ; 0x010: 0x00000000
kfdfsb.spare[2]:                      0 ; 0x014: 0x00000000
kfdfse[0].fse:                      119 ; 0x018: FREE=0x7 FRAG=0x7
kfdfse[1].fse:                       16 ; 0x019: FREE=0x0 FRAG=0x1
…………
$kfed read /dev/rhdisk4 blkn=510
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     254 ; 0x004: blk=254
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3460116983 ; 0x00c: 0xce3d31f7
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:         ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]:            0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]:            0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        2 ; 0x026: KFDGTP_NORMAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:                CRS_0000 ; 0x028: length=8
kfdhdb.grpname:                     CRS ; 0x048: length=3
kfdhdb.fgname:                 CRS_0000 ; 0x068: length=8
…………

由上述分析可以基本上确定是asm disk header 被破坏,进一步分析破坏原因

[db2/dev#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          00f9733d675541e5                    None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None
[db2/dev#]ls -l rhdisk*
crw-------    2 root     system       24,  1 Oct 18 11:45 rhdisk0
crw-------    1 root     system       24,  3 Oct 18 13:27 rhdisk1
crw-------    1 root     system       24,  5 Dec 20 20:02 rhdisk10
crw-------    1 root     system       24,  2 Oct 18 13:32 rhdisk2
crw-------    1 root     system       24,  0 Oct 18 13:32 rhdisk3
crw-rw----    1 grid     asmadmin     24,  8 Dec 20 20:02 rhdisk4
crw-rw----    1 grid     asmadmin     24,  9 Dec 20 20:02 rhdisk5
crw-rw----    1 grid     asmadmin     24, 10 Dec 20 20:02 rhdisk6
crw-rw----    1 grid     asmadmin     24,  4 Dec 20 20:02 rhdisk7
crw-rw----    1 grid     asmadmin     24,  6 Dec 20 20:02 rhdisk8
crw-rw----    1 grid     asmadmin     24,  7 Dec 20 20:02 rhdisk9

从这里基本上可以看出来,是由于磁盘头被重写了pvid,导致asm disk header 被破坏.进一步分析asm log,确定哪些磁盘被用作asm disk

SQL> CREATE DISKGROUP CRS NORMAL REDUNDANCY  DISK '/dev/rhdisk4',
'/dev/rhdisk5',
'/dev/rhdisk6' ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (1,0) to disk (/dev/rhdisk4)
NOTE: Assigning number (1,1) to disk (/dev/rhdisk5)
NOTE: Assigning number (1,2) to disk (/dev/rhdisk6)
NOTE: initializing header on grp 1 disk CRS_0000
NOTE: initializing header on grp 1 disk CRS_0001
NOTE: initializing header on grp 1 disk CRS_0002
SQL> CREATE DISKGROUP DATA EXTERNAL REDUNDANCY  DISK
'/dev/rhdisk9' SIZE 614400M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (2,0) to disk (/dev/rhdisk9)
NOTE: initializing header on grp 2 disk DATA_0000
SQL> CREATE DISKGROUP FBA EXTERNAL REDUNDANCY  DISK
'/dev/rhdisk8' SIZE 204800M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (3,0) to disk (/dev/rhdisk8)
NOTE: initializing header on grp 3 disk FBA_0000
SQL> CREATE DISKGROUP ARCH EXTERNAL REDUNDANCY  DISK
'/dev/rhdisk7' SIZE 102400M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (4,0) to disk (/dev/rhdisk7)
NOTE: initializing header on grp 4 disk ARCH_0000

这里可以确定asm disk为rhdisk[4-9],通过kfed分析全部和rhdisk4一样的问题,也符合lspv查询出来的结果,使用kfed repair修复asm disk header后

SQL> alter diskgroup data mount;
Diskgroup altered.
SQL> alter diskgroup fba mount;
Diskgroup altered.
SQL> alter diskgroup arch mount;
Diskgroup altered.
SQL> alter diskgroup crs mount;
Diskgroup altered.
SQL> select group_number,disk_number,path from v$asm_disk;
GROUP_NUMBER DISK_NUMBER PATH
------------ ----------- --------------------------------------------------
           2           0 /dev/rhdisk4
           2           1 /dev/rhdisk5
           2           2 /dev/rhdisk6
           1           0 /dev/rhdisk7
           4           0 /dev/rhdisk8
           3           0 /dev/rhdisk9
6 rows selected.
SQL> select group_number,name from v$asm_diskgroup;
GROUP_NUMBER NAME
------------ ------------------------------------------------------------
           1 ARCH
           2 CRS
           3 DATA
           4 FBA

这里证明通过kfed对磁盘头的修复,asm磁盘组已经全部mount成功,GI状态也恢复正常

[db2/#]crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.CRS.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.DATA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.FBA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.asm
               ONLINE  ONLINE       db1                      Started
               ONLINE  ONLINE       db2                      Started
ora.gsd
               OFFLINE OFFLINE      db1
               OFFLINE OFFLINE      db2
ora.net1.network
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.ons
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.registry.acfs
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db1
ora.cvu
      1        ONLINE  ONLINE       db1
ora.db1.vip
      1        ONLINE  ONLINE       db1
ora.db2.vip
      1        ONLINE  ONLINE       db2
ora.nkora.db
      1        ONLINE  ONLINE       db1                      Open
      2        ONLINE  ONLINE       db2                      Open
ora.oc4j
      1        ONLINE  ONLINE       db1
ora.scan1.vip
      1        ONLINE  ONLINE       db1

这里忽略了一个问题,在修复磁盘头之前没有清除pvid,导致磁盘头修复后,pvid依然存储在odm中

[db2/dev#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          00f9733d675541e5                    None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None

通过分析发现fba磁盘组中无任何记录,使用该磁盘组进行直接清除pvid测试

$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Sun Dec 21 03:13:31 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter diskgroup fba dismount;
Diskgroup altered.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
$ exit
You have mail in /usr/spool/mail/root
[db2/#]chdev -l hdisk8 -a pv=clear
hdisk8 changed
[db2/#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          none                                None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None
[db2/#]su - grid
$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Sun Dec 21 03:15:19 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter diskgroup fba mount;
Diskgroup altered.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

通过测试直接清除pvid asm 磁盘头依然工作正常,关闭GI,使用chdev清除hdisk[4-9]所有pvid,启动GI一切正常

[db1/#]crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.CRS.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.DATA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.FBA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.asm
               ONLINE  ONLINE       db1                      Started
               ONLINE  ONLINE       db2                      Started
ora.gsd
               OFFLINE OFFLINE      db1
               OFFLINE OFFLINE      db2
ora.net1.network
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.ons
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.registry.acfs
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db1
ora.cvu
      1        ONLINE  ONLINE       db1
ora.db1.vip
      1        ONLINE  ONLINE       db1
ora.db2.vip
      1        ONLINE  ONLINE       db2
ora.nkora.db
      1        ONLINE  ONLINE       db1                      Open
      2        ONLINE  ONLINE       db2                      Open
ora.oc4j
      1        ONLINE  ONLINE       db1
ora.scan1.vip
      1        ONLINE  ONLINE       db1
[db1/#]lspv
hdisk0          00f9733df7c7a9db                    rootvg          active
hdisk1          00f9733d21dad8fe                    rootvg          active
hdisk2          00f9733d21dbd08b                    appvg           active
hdisk3          00f9733d21dbd2ab                    appvg           active
hdisk4          none                                None
hdisk5          none                                None
hdisk6          none                                None
hdisk7          none                                None
hdisk8          none                                None
hdisk9          none                                None
hdisk10         none                                None

至此设置pvid导致asm disk header损坏的asm 恢复正常,实现数据0丢失。
温馨提示:aix asm disk磁盘中不能设置pvid,否则将会导致asm disk header 损坏,无法正常mount
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

SYSTEM表空间坏块恢复—C_TS#对象坏块恢复(file 1 block 60)

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:SYSTEM表空间坏块恢复—C_TS#对象坏块恢复(file 1 block 60)

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一朋友给我电话,说他们客户公司数据库故障,被另外一家公司恢复了一天不能正常恢复,请求我协助解决.接手一看数据库已经被破坏的不像样子了,根据alert日志信息大概分析了故障原因和上家公司处理情况。后面接手后通过bbed修复block数据库恢复过程,在本次恢复中出现大量ORA-600错误,主要包括ORA-00600 400,ORA-00600 2662,ORA-00600 2663,ORA-00600 krhpfh_03-1209,ORA-00600 3600,ORA-00600 ktsitbs_info1,ORA-00600 4137,ORA-00600 4511,ORA-00600 4198,ORA-00600 6807等
故障原因redo文件丢失

Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
LGWR: terminating instance due to error 313
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_pmon_1394.trc:
ORA-00313: open failed for members of log group  of thread
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_ckpt_1406.trc:
ORA-00313: open failed for members of log group  of thread
Instance terminated by LGWR, pid = 1404

尝试clear redo文件方式恢复

Thu Nov 20 13:04:16 2014
alter database clear logfile group 9
Thu Nov 20 13:04:16 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:45 2014
alter database clear logfile group 9
Thu Nov 20 13:04:46 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:59 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:04:59 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...
Thu Nov 20 13:05:00 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:05:00 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...

不完全恢复resetlogs尝试打开数据库

ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile  ...
Thu Nov 20 13:49:01 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER CANCEL
Thu Nov 20 13:49:03 2014
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL
Thu Nov 20 13:49:33 2014
alter database open resetlogs
Thu Nov 20 13:49:34 2014
ORA-1113 signalled during: alter database open resetlogs...

使用隐含参数

_allow_resetlogs_corruption= TRUE

进行不完全恢复,尝试open数据库报ORA-600 4000错误

Thu Nov 20 14:35:02 2014
ALTER DATABASE   MOUNT
Thu Nov 20 14:35:07 2014
Setting recovery target incarnation to 2
Thu Nov 20 14:35:07 2014
Successful mount of redo thread 1, with mount id 4039504598
Thu Nov 20 14:35:07 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 14:40:33 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 14:40:33 2014
Media Recovery Start
Thu Nov 20 14:40:33 2014
Media Recovery failed with error 1610
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 14:41:23 2014
ALTER DATABASE RECOVER  database using backup controlfile until cancel
Thu Nov 20 14:43:08 2014
alter database open resetlogs
Thu Nov 20 14:43:08 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31293973571
Resetting resetlogs activation ID 3855216310 (0xe5c9eeb6)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Online log /data2/oradata/redo0902.log: Thread 1 Group 9 was previously cleared
Thu Nov 20 14:43:14 2014
Setting recovery target incarnation to 3
Thu Nov 20 14:43:14 2014
Assigning activation ID 4039504598 (0xf0c5f2d6)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 14:43:14 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 14:43:14 2014
SMON: enabling cache recovery
Thu Nov 20 14:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1844
ORA-1092 signalled during: alter database open resetlogs...

尝试隐含屏蔽回滚段

_corrupted_rollback_segments= _SYSSMU1$, _SYSSMU2$,…………

错误依旧ORA-600 4000

Thu Nov 20 15:09:21 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 15:09:21 2014
SMON: enabling cache recovery
Thu Nov 20 15:09:21 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 624
ORA-1092 signalled during: alter database open

多次重启,resetlogs后,数据库出现ORA-600 2662错误

Successful open of redo thread 1
Thu Nov 20 17:13:24 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 17:13:24 2014
SMON: enabling cache recovery
Thu Nov 20 17:13:24 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 7967
ORA-1092 signalled during: ALTER DATABASE OPEN...
Thu Nov 20 17:18:23 2014
USER: terminating instance due to error 1092
Instance terminated by USER, pid = 7967

offline undo相关文件,尝试打开数据库

Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 17:52:31 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 17:52:31 2014
Media Recovery Start
 parallel recovery started with 15 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 17:53:42 2014
ALTER DATABASE RECOVER CANCEL
Thu Nov 20 17:53:44 2014
ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL ...
Thu Nov 20 17:56:34 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:56:35 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:57:01 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:02 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:26 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:27 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:43 2014
alter database open resetlogs
Thu Nov 20 17:57:43 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:58:58 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:58:58 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:59:15 2014
alter database open resetlogs
Thu Nov 20 17:59:15 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:59:35 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:35 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:50 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 17:59:50 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 18:00:07 2014
alter database open resetlogs
Thu Nov 20 18:00:07 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31294173628
Resetting resetlogs activation ID 4039492628 (0xf0c5c414)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Thu Nov 20 18:00:14 2014
Setting recovery target incarnation to 8
Thu Nov 20 18:00:14 2014
Assigning activation ID 4039504142 (0xf0c5f10e)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 18:00:15 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 18:00:15 2014
SMON: enabling cache recovery
Thu Nov 20 18:00:15 2014
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
File #2 is offline, but is part of an online tablespace.
data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
File #100 is offline, but is part of an online tablespace.
data file 100: '/data2/oradata/undotbs02.dbf'
Thu Nov 20 18:00:28 2014
File #185 is offline, but is part of an online tablespace.
data file 185: '/data2/oradata/undotbs03.dbf'
Dictionary check complete
Thu Nov 20 18:00:35 2014
SMON: enabling tx recovery
Thu Nov 20 18:00:36 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_28472.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_28450.trc:
ORA-00604: error occurred at recursive SQL level
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_28446.trc:
ORA-00604: error occurred at recursive SQL level
Instance terminated by USER, pid = 28472
ORA-1092 signalled during: alter database open resetlogs...

不知道做了什么操作出现file 1 block 60坏块,很可能bbed修改错误导致

Thu Nov 20 19:18:15 2014
SMON: enabling cache recovery
Thu Nov 20 19:18:16 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Successfully onlined Undo Tablespace 1.
Thu Nov 20 19:18:16 2014
SMON: enabling tx recovery
Thu Nov 20 19:18:17 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 19:18:17 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 13232
ORA-1092 signalled during: alter database open...

尝试不完全恢复,并resetlogs操作

ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 19:33:41 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:33:41 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:33:42 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:34:06 2014
alter database open resetlogs
Thu Nov 20 19:34:06 2014
ORA-1139 signalled during: alter database open resetlogs...
Thu Nov 20 19:34:17 2014
alter database open
Thu Nov 20 19:34:17 2014
ORA-1190 signalled during: alter database open...
Thu Nov 20 19:35:57 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 19:35:57 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:35:58 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:35:59 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:37:19 2014
alter database open resetlogs
Thu Nov 20 19:37:19 2014
ORA-1139 signalled during: alter database open resetlogs...

继续打开报 ORA-600 3600错误

Thu Nov 20 19:43:14 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 19:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
DBW0: terminating instance due to error 471
Instance terminated by DBW0, pid = 20856

<strong>中间多次重启和resetlogs,还出现ORA-600 2663错误</strong>

Fri Nov 21 12:35:12 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Nov 21 12:35:12 2014
SMON: enabling cache recovery
Fri Nov 21 12:35:13 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mman_15572.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw1_15576.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Instance terminated by USER, pid = 15596
ORA-1092 signalled during: ALTER DATABASE OPEN..

继续尝试打开数据库出现ORA-600 ktsitbs_info1错误

SMON: enabling cache recovery
Fri Nov 21 13:54:25 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Fri Nov 21 13:54:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Fri Nov 21 13:54:27 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 21111
ORA-1092 signalled during: alter database open...

以上是客户数据库故障原因和问题大概的处理过程,下面是我接手后的处理过程


dbv 检查system01.dbf文件,得到结果

HNDX-DB% dbv file=/opt/oracle/oradata/xifenfei/system01.dbf
DBVERIFY: Release 10.2.0.1.0 - Production on Fri Nov 21 16:22:37 2014
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /opt/oracle/oradata/xifenfei/system01.dbf
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv:
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
DBVERIFY - Verification complete
Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 2
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 2
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)

这里知道数据库有两个坏块,而且根据对于bootstrap$的经验,可以大概确定60坏块很可能是C_TS#,第一反应type异常,498可能是seq$

对数据库启动过程做10046,得到trace文件

PARSING IN CURSOR #1 len=275 dep=2 uid=0 oct=3 lid=0 tim=27978051403575 hv=3408408745 ad='7df93cd0'
select name,online$,contents$,undofile#,undoblock#,blocksize,dflmaxext,dflinit,dflincr,dflextpct,dflminext,
dflminlen, owner#,scnwrp,scnbas, NVL(pitrscnwrp, 0), NVL(pitrscnbas, 0), dflogging, bitmapped, inc#, flags,
plugged, NVL(spare1,0), NVL(spare2,0) from ts$ where ts#=:1
END OF STMT
PARSE #1:c=0,e=92,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051403569
BINDS #1:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=ffffffff7dbac9a8  bln=22  avl=02  flg=05
  value=2
EXEC #1:c=0,e=310,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051404296
WAIT #1: nam='db file sequential read' ela= 42 file#=1 block#=60 blocks=1 obj#=-1 tim=27978051404449
Hex dump of (file 1, block 60)
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
FETCH #1:c=10000,e=4072,p=1,cr=2,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051408438
STAT #1 id=1 cnt=0 pid=0 pos=1 obj=16 op='TABLE ACCESS CLUSTER TS$ (cr=2 pr=1 pw=0 time=4075 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=7 op='INDEX UNIQUE SCAN I_TS# (cr=1 pr=0 pw=0 time=13 us)'
*** 2014-11-22 14:44:43.235
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Current SQL statement for this session:
select max(maxconcurrency) from sys.wrh$_undostat  where instance_number = :1 and dbid = :2
and snap_id in   (select snap_id from dba_hist_snapshot where end_interval_time >
(select max(end_interval_time)-7 from dba_hist_snapshot))

这里显示了数据库启动报ORA-00600[ktsitbs_info1],[2],明显的表示了b中的2是表示表空间号,由于ts$坏块,无法读取ts$中表空间信息,从而出现数据字典不一致,从而出现该错误。所以恢复该库的关键是修复file 1 block 60.

bbed尝试修复file 1 block 60

HNDX-DB% bbed password=blockedit mode=edit
BBED: Release 2.0.0.0.0 - Limited Production on Sat Nov 22 15:16:26 2014
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename '/opt/oracle/oradata/xifenfei/system01.dbf'
        FILENAME        /opt/oracle/oradata/xifenfei/system01.dbf
BBED> set block 8192
        BLOCK#          8192
BBED> set block 60
        BLOCK#          60
BBED> set count 64
        COUNT           64
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (128)
BBED> set block 61
        BLOCK#          61
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7959]                        @122
 ub1 rowdata[107]                           @8081
 ub4 tailchk                                @8188
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x0040003d
   ub4 bas_kcbh                             @8        0x0000235b
   ub2 wrp_kcbh                             @12       0x0000
   ub1 seq_kcbh                             @14       0x01
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x7a85
   ub2 spare3_kcbh                          @18       0x0000
BBED> set block 60
        BLOCK#          60
BBED> d
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 80000000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000
 <32 bytes per line>
BBED> d block 61
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003d 0000235b 00000104 7a850000 01000000 00000006 00001837
 00001738 00020200 00000000 0007002e 00000002 00800075 00012300 80000000
 <32 bytes per line>
BBED> set block 60
        BLOCK#          60
BBED> m /x 06a2
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000
 <32 bytes per line>
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7598]                        @122
 ub1 rowdata[468]                           @7720
 ub4 tailchk                                @8188
BBED> sum apply
Check value for File 0, Block 60:
current = 0xe908, required = 0xe908
BBED> verify
DBVERIFY - Verification starting
FILE = /opt/oracle/oradata/xifenfei/system01.dbf
BLOCK = 60
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
BBED>

尝试启动数据库

Sat Nov 22 15:51:33 2014
alter database open
Sat Nov 22 15:51:34 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 15:51:34 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 15:51:34 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 15:51:34 2014
Database Characterset is ZHS16CGB231280
Hex dump of (file 1, block 498) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_2818.trc
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
Reread of rdba: 0x004001f2 (file 1, block 498) found same corrupted data
Sat Nov 22 15:51:35 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_2803.trc:
ORA-00600: internal error code, arguments: [4000], [12], [], [], [], [], [], []
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=18, OS id=3000
Sat Nov 22 15:51:36 2014
Completed: alter database open
Sat Nov 22 15:51:36 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_3010.trc:
ORA-00600: internal error code, arguments: [6807], [AUDSES$], [144], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [6807], [WRI$_ALERT_SEQUENCE], [8783], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Non-fatal internal error happenned while SMON was doing non-existent object cleanup.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Sat Nov 22 15:51:38 2014
ORA-600 encountered when generating server alert SMG-3000
Sat Nov 22 15:51:38 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [ktcpoptx_0], [0x772705E60], [], [], [], [], [], []

只要出现ORA-600 4000和ORA-600 6807错误,其中ORA-600 6807错误比较明显是由于seq$坏块,导致AUDSES$ seq异常导致。ORA-600 4000应该是回滚段异常,继续分析回滚段

SQL> select name,ts#,status$ from undo$;
NAME                                  TS#    STATUS$
------------------------------ ---------- ----------
SYSTEM                                  0          2
_SYSSMU1$                               1          2
_SYSSMU2$                               1          2
_SYSSMU3$                               1          2
…………
_SYSSMU168$                             1          2
_SYSSMU169$                             1          2

这里很异常,system回滚段在数据库open之后,按照常理不可能处于STATUS$=2(OFFLINE)状态。而且其他回滚段全部为OFFLINE状态也属于异常情况.而且尝试drop undo报ORA-01561,另外在dba_rollback_segs中无SYSTEM(查询结果忘记保存)

SQL> drop tablespace undotbs1 including contents;
drop tablespace undotbs1 including contents
*
ERROR at line 1:
ORA-01561: failed to remove all objects in the tablespace specified

通过这一系列很怀疑是由于bbed 修改了undo$等相关基表信息导致现在system中的undo信息混乱.信息反馈给客户后,客户想起来昨天给他们恢复的公司在bbed操作前备份了system01.dbf.突然感觉救星来了.实在怕不懂bbed的人折腾bbed

dbv检测备份文件

DBVERIFY - Verification starting : FILE = /data3/backup/system01.dbf_bak
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv:
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Block Checking: DBA = 4194802, Block Type = KTB-managed data block
data header at 0x1002ef05c
kdbchk: row locked by non-existent transaction
        table=0   slot=4
        lockid=1   ktbbhitc=2
Page 498 failed with check code 6101
DBVERIFY - Verification complete
Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 1
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)

好家伙只有一个物理坏块和一个逻辑坏块,而对于物理坏块block 60已经知道如何修复,逻辑坏块可以尝试设置隐含参数跳过去,bbed修改相关block(同上步骤)

再次启动数据库

dd if=/opt/oracle/oradata/xifenfei/system01.dbf bs=8192 count=2 of=/tmp/system01.2
dd if=/tmp/system01.2 of=/data3/backup/system01.dbf_bak bs=8192 count=2 conv=notrunc
Sat Nov 22 17:52:50 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Sat Nov 22 17:53:38 2014
alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:53:39 2014
Completed: alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:55:43 2014
alter database open
Sat Nov 22 17:55:48 2014
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=18, OS id=15858
Sat Nov 22 17:56:10 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=17, OS id=15879
Sat Nov 22 17:56:19 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 17:56:19 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 17:56:19 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 17:56:20 2014
ARC1: STARTING ARCH PROCESSES
Sat Nov 22 17:56:20 2014
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Sat Nov 22 17:56:22 2014
Database Characterset is ZHS16CGB231280
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Nov 22 17:56:33 2014
ARC2: Archival started
ARC1: STARTING ARCH PROCESSES COMPLETE
ARC1: Becoming the heartbeat ARCH
ARC2 started with pid=23, OS id=15928
QMNC started with pid=25, OS id=15996
Sat Nov 22 17:57:11 2014
Completed: alter database open
Sat Nov 22 17:57:18 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16010.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:57:26 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16012.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:58:17 2014
Starting background process EMN0
Sat Nov 22 18:00:03 2014
Shutting down instance: further logons disabled
EMN0 started with pid=71, OS id=16421
Sat Nov 22 18:00:12 2014
SMON: Restarting fast_start parallel rollback
Sat Nov 22 18:00:23 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_p000_15951.trc:
ORA-00600: internal error code, arguments: [4198], [9], [], [], [], [], [], []
Sat Nov 22 18:00:24 2014
Stopping background process CJQ0
Sat Nov 22 18:00:24 2014
Stopping background process QMNC
Sat Nov 22 18:00:27 2014
Doing block recovery for file 2 block 41
Block recovery from logseq 7, block 180883 to scn 214748389244
Sat Nov 22 18:00:27 2014
Recovery of Online Redo Log: Thread 1 Group 8 Seq 7 Reading mem 0
  Mem# 0 errs 0: /data2/oradata/redo0802.log
Block recovery stopped at EOT rba 7.180988.16
Block recovery completed at rba 7.180988.16, scn 50.24441
Sat Nov 22 18:00:32 2014
Stopping background process MMNL
Sat Nov 22 18:00:38 2014
Stopping background process MMON
Sat Nov 22 18:00:41 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sat Nov 22 18:00:42 2014
ORACLE Instance xifenfei (pid = 9) - Error 600 encountered while recovering transaction (3, 4).
Sat Nov 22 18:00:42 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []

这里都是很常规的错误,查询undo$也已经正常,重建新undo表空间删除老undo,然后alert日志中无其他报错,数据库恢复至此完成,建议客户导出导入重建数据库

Oracle 12C的第一次异常恢复—文件头坏块

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:Oracle 12C的第一次异常恢复—文件头坏块

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到第一个使用Oracle 12C作为生产库的恢复救援.有两个业务数据文件报文件头损坏,其他数据文件全部是9月份的一次备份,在当前的条件下,希望我们能够帮他们恢复出来业务文件中的数据
数据库版本信息

SQL> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0
PL/SQL Release 12.1.0.1.0 - Production                                                    0
CORE    12.1.0.1.0      Production                                                        0
TNS for Linux: Version 12.1.0.1.0 - Production                                            0
NLSRTL Version 12.1.0.1.0 - Production                                                    0

数据库故障
具体脚本请参考:数据库恢复检查脚本(Oracle Database Recovery Check)
控制文件信息
1
控制文件中关于数据文件信息
2
数据文件头信息
3
alert日志报错

Reading datafile '/app/oracle/oradata/freetouch/sales.dbf' for corruption at rdba: 0x00000001 (file 4, block 1)
Reread (file 4, block 1) found same corrupt data (no logical check)
Hex dump of (file 5, block 1) in trace file /app/oracle/diag/rdbms/valuenet/valuenet/trace/valuenet_ora_12384.trc
Corrupt block relative dba: 0x00000001 (file 5, block 1)
Fractured block found during kcvxfh v8
Data in bad block:
 type: 0 format: 2 rdba: 0x00000001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000001
 check value in block header: 0xa701
 computed block checksum: 0x0
Reading datafile '/app/oracle/oradata/freetouch/drp_200200' for corruption at rdba: 0x00000001 (file 5, block 1)
Reread (file 5, block 1) found same corrupt data (no logical check)
Hex dump of (file 4, block 1) in trace file /app/oracle/diag/rdbms/valuenet/valuenet/trace/valuenet_ora_12384.trc
Corrupt block relative dba: 0x00000001 (file 4, block 1)
Fractured block found during kcvxfh v8
Data in bad block:
 type: 0 format: 2 rdba: 0x00000001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000001
 check value in block header: 0xa701
 computed block checksum: 0x0

odu无法识别异常文件

[oracle@db odu]$ ./odu
Oracle Data Unloader trial version 4.1.3
Copyright (c) 2008,2009,2010,2011 XiongJun. All rights reserved.
loading default config.......
byte_order little
block_size  8192
db_timezone -7
Invalid db timezone:-7
client_timezone 8
Invalid client timezone:8
asmfile_extract_path /home/oracle/hongye/odu/data
data_path  /home/oracle/hongye/odu/data
lob_path  /home/oracle/hongye/odu/data
charset_name ZHS16GBK
ncharset_name AL16UTF16
output_form  dmp
error at line 10.
lob_storage infile
clob_byte_order big
trace_level 1
delimiter |
unload_deleted yes
file_header_offset 0
is_tru64 no
record_row_addr no
convert_clob_charset yes
load config file 'config.txt' successful
loading default asm disk file ......
can not open file 'asmdisk.txt', error message:No such file or directory.
loading default control file ......
unknown file format '/app/oracle/oradata/freetouch/sales.dbf'
unknown file format '/app/oracle/oradata/freetouch/drp_200200'
 ts#   fn  rfn bsize   blocks bf offset filename
---- ---- ---- ----- -------- -- ------ --------------------------------------------
   1    1    1  8192   194560 N       0 /app/oracle/oradata/freetouch/system01.dbf
   6    2   10  8192    45840 N       0 /app/oracle/oradata/freetouch/example01.dbf
   1    3    3  8192   907520 N       0 /app/oracle/oradata/freetouch/sysaux01.dbf
   4 1024   10  8192        0 N       0 /app/oracle/oradata/freetouch/sales.dbf
   5 1024    9  8192        0 N       0 /app/oracle/oradata/freetouch/drp_200200
   4    6    6  8192   128320 N       0 /app/oracle/oradata/freetouch/users01.dbf
   7    7    7  8192   780288 N       0 /app/oracle/oradata/freetouch/undotbs03.dbf
  11    8    8  8192    25600 N       0 /app/oracle/oradata/freetouch/indx01.dbf
load control file 'control.txt' successful
loading dictionary data......done
loading scanned data......done

dul无法识别异常文件

[oracle@db dul]$ ./dul
Data UnLoader: 10.2.0.5.32 - Internal Only - on Sun Nov  2 23:34:42 2014
with 64-bit io functions
Copyright (c) 1994 2014 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
DUL: Warning: ulimit process stack size is only 33554432
Found db_id = 270587870
Found db_name = VALUENET
DUL: Warning: Cannot verify file number for /app/oracle/oradata/freetouch/sales.dbf
DUL: Warning: First four bytes(76 162 0 0) of block 2 are not the start of a proper data block header
DUL: Warning: Block corruption or configuration error
DUL: Warning: Check db_block_size and/or osd_file_leader_size and/or file offset
DUL: Error: File Number can only be zero for Single Tablespace Datafiles
DUL: Warning: Cannot verify file number for /app/oracle/oradata/freetouch/drp_200200
DUL: Warning: First four bytes(76 162 0 0) of block 2 are not the start of a proper data block header
DUL: Warning: Block corruption or configuration error
DUL: Warning: Check db_block_size and/or osd_file_leader_size and/or file offset
DUL: Error: File Number can only be zero for Single Tablespace Datafiles
DUL> show datafiles;
ts# rf# start   blocks offs open  err file name
  0   1     0   194561    0    1    0 /app/oracle/oradata/freetouch/system01.dbf
  1   3     0   907521    0    1    0 /app/oracle/oradata/freetouch/sysaux01.dbf
  4   6     0   128321    0    1    0 /app/oracle/oradata/freetouch/users01.dbf
  7   7     0   780289    0    1    0 /app/oracle/oradata/freetouch/undotbs03.dbf
 11   8     0    25601    0    1    0 /app/oracle/oradata/freetouch/indx01.dbf
  6  10     0    45841    0    1    0 /app/oracle/oradata/freetouch/example01.dbf

该异常文件使用dul/odu均无法正常识别.证明文件头确实已经损坏

dbv 检测

[oracle@db trace]$ dbv file=/app/oracle/oradata/freetouch/drp_200200
DBVERIFY: Release 12.1.0.1.0 - Production on Sun Nov 2 14:08:34 2014
Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /app/oracle/oradata/freetouch/drp_200200
DBVERIFY - Verification complete
Total Pages Examined         : 194560
Total Pages Processed (Data) : 114596
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 26198
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 37787
Total Pages Processed (Seg)  : 1
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 15979
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 129603862 (0.129603862)
[oracle@db ~]$ dbv file=/app/oracle/oradata/freetouch/sales.dbf
DBVERIFY: Release 12.1.0.1.0 - Production on Sun Nov 2 23:12:05 2014
Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /app/oracle/oradata/freetouch/sales.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 655360
Total Pages Processed (Data) : 294938
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 233404
Total Pages Failing   (Index): 0
Total Pages Processed (Lob)  : 38
Total Pages Failing   (Lob)  : 0
Total Pages Processed (Other): 23252
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 103728
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 134665298 (0.134665298)

dbv检测结果无坏块,但是v$datafile_header和alert日志中报坏块,初步判断是由于该文件是bigfile,dbv未检测到文件头坏块,实际该该数据文件头损坏,其他block正常.所幸的是该库有9月份的rman备份(中间归档丢失),因此使用rman还原出来9月份的数据文件,然后使用dd拷贝两个 block(block 0和block 1)到异常文件.

[root@db freetouch]# dd if=/app1/oracle/oradata/freetouch/sales.dbf bs=8192 count=2 of=/tmp/odu/sales.2
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 0.0176368 s, 929 kB/s
[root@db freetouch]# dd if=/tmp/odu/sales.2 of=/app/oracle/oradata/freetouch/sales.dbf bs=8192 count=2 conv=notrunc
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 6.4281e-05 s, 255 MB/s
[root@db freetouch]# dd if=/app1/oracle/oradata/freetouch/drp_200200 bs=8192 count=2 of=/tmp/odu/drp_200200.2
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 0.0185934 s, 881 kB/s
[root@db freetouch]# dd if=/tmp/odu/drp_200200.2 of=/app/oracle/oradata/freetouch/drp_200200 bs=8192 count=2 conv=notrunc
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 6.4419e-05 s, 254 MB/s

尝试恢复数据库

SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done
SQL> recover database using backup controlfile;
ORA-00283: recovery session canceled due to errors
ORA-19909: datafile 4 belongs to an orphan incarnation
ORA-01110: data file 4: '/app/oracle/oradata/freetouch/sales.dbf'

使用bbed修改相关文件头,然后继续恢复
具体见:bbed解决ORA-01190类似方法处理

SQL> recover database using backup controlfile;
ORA-00279: change 129603904 generated at 11/02/2014 19:19:54 needed for thread
1
ORA-00289: suggestion :
/app/oracle/recovery_area/VALUENET/archivelog/2014_11_02/o1_mf_1_1_%u_.arc
ORA-00280: change 129603904 for thread 1 is in sequence #1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel;
ORA-00308: cannot open archived log 'cancel;'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [129603911], [0],
[129603913], [29360256], [], [], [], [], [], []
Process ID: 19881
Session ID: 1 Serial number: 3

出现ORA-600[2662]错误,因为scn相差比较小,重启数据库机器,出现ORA-600[4194]错误

SQL> startup pfile='/tmp/pfile.txt' mount
ORACLE instance started.
Total System Global Area 2505338880 bytes
Fixed Size                  2291472 bytes
Variable Size             973080816 bytes
Database Buffers         1526726656 bytes
Redo Buffers                3239936 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/app/oracle/oradata/freetouch/system01.dbf'
SQL> recover database ;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database or pluggable database must be opened in read/write
mode.

重建控制文件后继续恢复

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [4194], [46], [19], [], [], [], [],
[], [], [], [], []
Process ID: 20351
Session ID: 1 Serial number: 3

设置undo_management=MANUAL然后继续恢复

[oracle@db tmp]$ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Sun Nov 2 19:29:45 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup pfile='/tmp/pfile.txt'
ORACLE instance started.
Total System Global Area 2505338880 bytes
Fixed Size                  2291472 bytes
Variable Size             973080816 bytes
Database Buffers         1526726656 bytes
Redo Buffers                3239936 bytes
Database mounted.
Database opened.

这次的恢复也证明Oracle 12C确实有着越来越多的用户在使用.

ORA-27069: skgfdisp: 尝试在文件范围外执行 I/O

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-27069: skgfdisp: 尝试在文件范围外执行 I/O

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到网友技术支持请求,win 2003 ntfs格式文件系统,Oracle 8.1.7版本,主机重启后,数据库无法正常启动,offline datafile 15,数据库open成功,但是datafile 无法正常online,报错为:ORA-27069: skgfdisp: attempt to do I/O beyond the range of the file,请求协助处理

SQL> recover datafile 'D:\ORACLE\ORADATA\ORCL\ZSF_DATA.DBF';
ORA-00283: 恢复会话因错误而取消
ORA-01115: 从文件 15 读取块时出现 IO 错误 (块 # 1030071)
ORA-01110: 数据文件 15: 'D:\ORACLE\ORADATA\ORCL\ZSF_DATA.DBF'
ORA-27069: skgfdisp: 尝试在文件范围外执行 I/O
OSD-04026: 无效的参数经过. (OS 1030071)

使用bbed,成功online datafile 15

Tue Oct 28 16:30:35 2014
ALTER DATABASE RECOVER  datafile 15
Tue Oct 28 16:30:35 2014
Media Recovery Datafile: 15
Media Recovery Start
Media Recovery Log
Recovery of Online Redo Log: Thread 1 Group 1 Seq 245110 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO03.LOG
Media Recovery failed with error 1115
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 15  ...
Tue Oct 28 16:32:53 2014
Shutting down instance (abort)
License high water mark = 6
Instance terminated by USER, pid = 1548
Starting up ORACLE RDBMS Version: 8.1.7.0.0.
System parameters with non-default values:
  processes                = 600
  shared_pool_size         = 52428800
  large_pool_size          = 20971520
  java_pool_size           = 20971520
  control_files            = D:\oracle\oradata\ORCL\control01.ctl, D:\oracle\oradata\ORCL\control02.ctl
  db_block_buffers         = 19200
  db_block_size            = 8192
  compatible               = 8.1.0
  log_buffer               = 32768
  log_checkpoint_interval  = 10000
  log_checkpoint_timeout   = 1800
  db_files                 = 1024
  db_file_multiblock_read_count= 8
  max_enabled_roles        = 30
  remote_login_passwordfile= EXCLUSIVE
  global_names             = TRUE
  distributed_transactions = 500
  instance_name            = ORCL
  service_names            = ORCL
  mts_dispatchers          = (PROTOCOL=TCP)(PRE=oracle.aurora.server.SGiopServer)
  open_links               = 4
  sort_area_size           = 65536
  sort_area_retained_size  = 65536
  db_name                  = ORCL
  open_cursors             = 500
  ifile                    = D:\oracle\admin\ORCL\pfile\init.ora
  os_authent_prefix        =
  job_queue_processes      = 4
  job_queue_interval       = 10
  parallel_max_servers     = 5
  background_dump_dest     = D:\oracle\admin\ORCL\bdump
  user_dump_dest           = D:\oracle\admin\ORCL\udump
  max_dump_file_size       = 10240
  oracle_trace_collection_name=
PMON started with pid=2
DBW0 started with pid=3
LGWR started with pid=4
CKPT started with pid=5
SMON started with pid=6
RECO started with pid=7
SNP0 started with pid=8
SNP1 started with pid=9
SNP2 started with pid=10
SNP3 started with pid=11
Tue Oct 28 16:33:01 2014
starting up 1 shared server(s) ...
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Tue Oct 28 16:33:02 2014
ALTER DATABASE   MOUNT
Tue Oct 28 16:33:06 2014
Successful mount of redo thread 1, with mount id 1389958722.
Tue Oct 28 16:33:06 2014
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Tue Oct 28 16:33:49 2014
ALTER DATABASE RECOVER  database until cancel
Tue Oct 28 16:33:49 2014
Media Recovery Start
Media Recovery Log
kcrrga: Warning.  Log sequence in archive filename wrapped
to fix length as indicated by %S in LOG_ARCHIVE_FORMAT.
Old log archive with same name might be overwritten.
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Tue Oct 28 16:34:03 2014
ALTER DATABASE RECOVER    LOGFILE 'D:\ORACLE\ORADATA\ORCL\REDO02.LOG'
Tue Oct 28 16:34:03 2014
Media Recovery Log D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Incomplete recovery applied all redo ever generated.
Recovery completed through change %s139866389
Media Recovery Complete
Completed: ALTER DATABASE RECOVER    LOGFILE 'D:\ORACLE\ORADA
Tue Oct 28 16:34:29 2014
alter database datafile 15 online
Tue Oct 28 16:34:29 2014
Completed: alter database datafile 15 online
Tue Oct 28 16:34:36 2014
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 139866389
Tue Oct 28 16:34:38 2014
Thread 1 opened at log sequence 1
  Current log# 2 seq# 1 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Successful open of redo thread 1.
Tue Oct 28 16:34:38 2014
SMON: enabling cache recovery
Tue Oct 28 16:34:38 2014
Dictionary check beginning
Dictionary check complete
Tue Oct 28 16:34:39 2014
SMON: enabling tx recovery
Tue Oct 28 16:34:44 2014
Completed: alter database open resetlogs

数据库datafile 15 online成功后,客户操作业务继续发生ORA-600[ktsxs_add2]错误

Tue Oct 28 17:07:42 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []
Tue Oct 28 17:07:53 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []
Tue Oct 28 17:08:03 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []
Tue Oct 28 17:08:16 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []
Tue Oct 28 17:08:23 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02308.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []
Tue Oct 28 17:08:31 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []
Tue Oct 28 17:08:38 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02308.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []

通过分析相关日志发现是insert插入表报错,很好理解,该库的datafile 15已经超过了系统的限制,现在继续插入数据,因此报错,查询可能异常对象

SQL> col segment_name for a20
SQL> SELECT distinct OWNER, SEGMENT_NAME, SEGMENT_TYPE, A.PARTITION_NAME
  2    FROM DBA_EXTENTS A
  3   WHERE FILE_ID = 15
  4     AND 1030071 <= BLOCK_ID;
OWNER                          SEGMENT_NAME         SEGMENT_TYPE
------------------------------ -------------------- ------------------
PARTITION_NAME
------------------------------
ZSF                            DETAIL               TABLE
ZSF                            DETAIL1              INDEX
ZSF                            DETAIL2              INDEX
OWNER                          SEGMENT_NAME         SEGMENT_TYPE
------------------------------ -------------------- ------------------
PARTITION_NAME
------------------------------
ZSF                            DETAIL3              INDEX
ZSF                            DETAIL4              INDEX
ZSF                            FK_RECI_ORD          INDEX
OWNER                          SEGMENT_NAME         SEGMENT_TYPE
------------------------------ -------------------- ------------------
PARTITION_NAME
------------------------------
ZSF                            PREPAY1              INDEX
ZSF                            RECEDETAIL1          INDEX

创建新表空间

Create tablespace zsf_new datafile  'D:\ORACLE\ORADATA\ORCL\ZSF_DATA_new01.dbf' size 4096m;
alter tablespace zsf_new add datafile 'D:\ORACLE\ORADATA\ORCL\ZSF_DATA_new02.dbf'
size 128m autoextend on next 128M maxsize 4096m;

迁移异常对象到新表空间

alter table ZSF.DETAIL move tablespace ZSF_new;
alter index ZSF.DETAIL1 rebuild tablespace ZSF_new;
alter index ZSF.DETAIL2 rebuild tablespace ZSF_new;
alter index ZSF.DETAIL3 rebuild tablespace ZSF_new;
alter index ZSF.DETAIL4 rebuild tablespace ZSF_new;
alter index ZSF.FK_RECI_ORD rebuild tablespace ZSF_new;
alter index ZSF.PREPAY1 rebuild tablespace ZSF_new;
alter index ZSF.RECEDETAIL1 rebuild tablespace ZSF_new;

然后对于datafile 15所在表空间增加新文件,因为已经迁移了异常对象,然后resize datafile 15小于8G,关闭自扩展,至此该数据库恢复完成

bootstrap$核心index(I_OBJ1,I_USER1,I_FILE#_BLOCK#,I_IND1,I_TS#,I_CDEF1等)异常恢复—ORA-00701错误解决

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:bootstrap$核心index(I_OBJ1,I_USER1,I_FILE#_BLOCK#,I_IND1,I_TS#,I_CDEF1等)异常恢复—ORA-00701错误解决

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在Oracle使用的过程中,经常遭遇一些核心index出现异常,导致数据库无法正常使用,特别是在数据库open的情况下,因为出现这些bootstrap$中的部分index是无法通过设置event38003进行重建,从而导致数据库无法正常使用,最常见异常index 有:I_ICOL1, I_TS1, I_CDEF1, I_CDEF2, I_CDEF3, I_CDEF4, I_PROXY_DATA$, I_IND1, I_TS#, I_UNDO1, I_UNDO2, I_COBJ#, I_USER1, I_USER2, I_CON1, I_CON2, I_FILE1, I_FILE2, I_FILE#_BLOCK#, I_USER#, I_OBJ#, I_PROXY_ROLE_DATA$_1, I_PROXY_ROLE_DATA$_2, I_CCOL1, I_CCOL2, I_TAB1, I_COL1, I_COL2, I_COL3, I_OBJ1, I_OBJ2, I_OBJ3, I_OBJ4, I_OBJ5
重建的index异常报错有:ORA-00701,ORA-01410,ORA-08102,ORA-08103,ORA-01578,ORA-01499,ORA-00600等各类错误
因为处理细节复杂,无法一一描述出来.这里假设Oracle数据库一个非常核心的表OBJ$中的I_OBJ1异常,现在准备重建它,大概步骤如下:
1. 假设异常index信息(I_OBJ1)

SQL> alter index sys.i_obj1 rebuild;
alter index sys.i_obj1 rebuild
*
ERROR at line 1:
ORA-00701: object necessary for warmstarting database cannot be altered
SQL> SELECT SQL_TEXT FROM BOOTSTRAP$ WHERE SQL_TEXT LIKE '%obj%';
SQL_TEXT
--------------------------------------------------------------------------------
CREATE UNIQUE INDEX I_OBJ1 ON OBJ$(OBJ#,OWNER#,TYPE#) PCTFREE 10 INITRANS 2 MAXT
RANS 255 STORAGE (  INITIAL 64K NEXT 1024K MINEXTENTS 1 MAXEXTENTS 2147483645 PC
TINCREASE 0 OBJNO 36 EXTENTS (FILE 1 BLOCK 336))

2. 创建获取异常index所属表

SQL>  select table_name from dba_indexes where owner='SYS' AND INDEX_NAME='I_OBJ1';
TABLE_NAME
--------------------------------------------------------------------------------
OBJ$

3. 获取obj$ ddl 语句

SQL> set long 10000
SQL> set pages 1000
SQL>  SELECT DBMS_METADATA.GET_DDL('TABLE','OBJ$','SYS') FROM DUAL;
DBMS_METADATA.GET_DDL('TABLE','OBJ$','SYS')
--------------------------------------------------------------------------------
  CREATE TABLE "SYS"."OBJ$"
   (    "OBJ#" NUMBER NOT NULL ENABLE,
        "DATAOBJ#" NUMBER,
        "OWNER#" NUMBER NOT NULL ENABLE,
        "NAME" VARCHAR2(128) NOT NULL ENABLE,
        "NAMESPACE" NUMBER NOT NULL ENABLE,
        "SUBNAME" VARCHAR2(128),
        "TYPE#" NUMBER NOT NULL ENABLE,
        "CTIME" DATE NOT NULL ENABLE,
        "MTIME" DATE NOT NULL ENABLE,
        "STIME" DATE NOT NULL ENABLE,
        "STATUS" NUMBER NOT NULL ENABLE,
        "REMOTEOWNER" VARCHAR2(128),
        "LINKNAME" VARCHAR2(128),
        "FLAGS" NUMBER,
        "OID$" RAW(16),
        "SPARE1" NUMBER,
        "SPARE2" NUMBER,
        "SPARE3" NUMBER,
        "SPARE4" VARCHAR2(1000),
        "SPARE5" VARCHAR2(1000),
        "SPARE6" DATE,
        "SIGNATURE" RAW(16),
        "SPARE7" NUMBER,
        "SPARE8" NUMBER,
        "SPARE9" NUMBER
   ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 NOCOMPRESS LOGGING
  STORAGE(INITIAL 16384 NEXT 106496 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"

4. 获取obj$相关index 语句

SQL>  SELECT DBMS_METADATA.GET_DDL('INDEX',INDEX_NAME,'SYS') FROM dba_indexes where owner='SYS' AND TABLE_NAME='OBJ$';
DBMS_METADATA.GET_DDL('INDEX',INDEX_NAME,'SYS')
--------------------------------------------------------------------------------
  CREATE UNIQUE INDEX "SYS"."I_OBJ5" ON "SYS"."OBJ$" ("SPARE3", "NAME", "NAMESPA
CE", "TYPE#", "OWNER#", "REMOTEOWNER", "LINKNAME", "SUBNAME", "OBJ#")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"
  CREATE INDEX "SYS"."I_OBJ4" ON "SYS"."OBJ$" ("DATAOBJ#", "TYPE#", "OWNER#")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"
  CREATE INDEX "SYS"."I_OBJ3" ON "SYS"."OBJ$" ("OID$")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"
  CREATE UNIQUE INDEX "SYS"."I_OBJ2" ON "SYS"."OBJ$" ("OWNER#", "NAME", "NAMESPA
CE", "REMOTEOWNER", "LINKNAME", "SUBNAME", "TYPE#", "SPARE3", "OBJ#")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 16384 NEXT 106496 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"
  CREATE UNIQUE INDEX "SYS"."I_OBJ1" ON "SYS"."OBJ$" ("OBJ#", "OWNER#", "TYPE#")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"

5. 创建obj$img表和相关index

SQL>   CREATE TABLE "SYS"."OBJ$IMG"
  2     (    "OBJ#" NUMBER NOT NULL ENABLE,
  3          "DATAOBJ#" NUMBER,
  4          "OWNER#" NUMBER NOT NULL ENABLE,
  5          "NAME" VARCHAR2(128) NOT NULL ENABLE,
  6          "NAMESPACE" NUMBER NOT NULL ENABLE,
  7          "SUBNAME" VARCHAR2(128),
  8          "TYPE#" NUMBER NOT NULL ENABLE,
  9          "CTIME" DATE NOT NULL ENABLE,
 10          "MTIME" DATE NOT NULL ENABLE,
 11          "STIME" DATE NOT NULL ENABLE,
 12          "STATUS" NUMBER NOT NULL ENABLE,
 13          "REMOTEOWNER" VARCHAR2(128),
 14          "LINKNAME" VARCHAR2(128),
 15          "FLAGS" NUMBER,
 16          "OID$" RAW(16),
 17          "SPARE1" NUMBER,
 18          "SPARE2" NUMBER,
 19          "SPARE3" NUMBER,
 20          "SPARE4" VARCHAR2(1000),
 21          "SPARE5" VARCHAR2(1000),
 22          "SPARE6" DATE,
 23          "SIGNATURE" RAW(16),
 24          "SPARE7" NUMBER,
 25          "SPARE8" NUMBER,
 26          "SPARE9" NUMBER
 27     ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 28   NOCOMPRESS LOGGING
 29    STORAGE(INITIAL 16384 NEXT 106496 MINEXTENTS 1 MAXEXTENTS 2147483645
 30    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
 31    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
 32    TABLESPACE "SYSTEM"
 33  ;
Table created.
SQL>
SQL>   CREATE UNIQUE INDEX "SYS"."I_OBJ_5IMG" ON "SYS"."OBJ$IMG" ("SPARE3", "NAME", "NAMESPACE", "TYPE#", "OWNER#", "REMOTEOWNER", "LINKNAME", "SUBNAME", "OBJ#")
  2    PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  3    STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  4    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  5    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  6    TABLESPACE "SYSTEM";
Index created.
SQL>   CREATE INDEX "SYS"."I_OBJ4IMG" ON "SYS"."OBJ$IMG" ("DATAOBJ#", "TYPE#", "OWNER#")
  2    PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  3    STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  4    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  5    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  6    TABLESPACE "SYSTEM";
Index created.
SQL>   CREATE INDEX "SYS"."I_OBJ3IMG" ON "SYS"."OBJ$IMG" ("OID$")
  2    PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  3    STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  4    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  5    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  6    TABLESPACE "SYSTEM";
Index created.
SQL>   CREATE UNIQUE INDEX "SYS"."I_OBJ2IMG" ON "SYS"."OBJ$IMG" ("OWNER#", "NAME", "NAMESPACE", "REMOTEOWNER", "LINKNAME", "SUBNAME", "TYPE#", "SPARE3", "OBJ#")
  2    PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  3    STORAGE(INITIAL 16384 NEXT 106496 MINEXTENTS 1 MAXEXTENTS 2147483645
  4    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  5    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  6    TABLESPACE "SYSTEM";
Index created.
SQL>   CREATE UNIQUE INDEX "SYS"."I_OBJ1IMG" ON "SYS"."OBJ$IMG" ("OBJ#", "OWNER#", "TYPE#")
  2    PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  3    STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  4    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  5    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  6    TABLESPACE "SYSTEM"
  7  ;
Index created.

6. 获取bootstrap$语句

SQL> SELECT DBMS_METADATA.GET_DDL('TABLE','BOOTSTRAP$','SYS') FROM DUAL;
DBMS_METADATA.GET_DDL('TABLE','BOOTSTRAP$','SYS')
--------------------------------------------------------------------------------
  CREATE TABLE "SYS"."BOOTSTRAP$"
   (    "LINE#" NUMBER NOT NULL ENABLE,
        "OBJ#" NUMBER NOT NULL ENABLE,
        "SQL_TEXT" VARCHAR2(4000) NOT NULL ENABLE
   ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 NOCOMPRESS LOGGING
  STORAGE(INITIAL 57344 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"

7. 创建bootstrap$img表

SQL>   CREATE TABLE "SYS"."BOOTSTRAP$IMG"
  2     (    "LINE#" NUMBER NOT NULL ENABLE,
  3          "OBJ#" NUMBER NOT NULL ENABLE,
  4          "SQL_TEXT" VARCHAR2(4000) NOT NULL ENABLE
  5     ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
  6   NOCOMPRESS LOGGING
  7    STORAGE(INITIAL 57344 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  8    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  9    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
 10    TABLESPACE "SYSTEM"
 11  /
Table created.

8.插入数据

insert into obj$img select * from obj$;
insert into bootstrap$img select * from bootstrap$;
commit;

9. 删除bootstrap$img对象名

delete from bootstrap$img where obj# in
 (select obj# from obj$
  where name in ('OBJ$',  'I_OBJ1',  'I_OBJ2', 'I_OBJ3', 'I_OBJ4', 'I_OBJ5',
                 'BOOTSTRAP$'));
commit;

10. 插入新创建对象

insert into bootstrap$img select * from bootstrap$tmpstr;
commit;

11. 关闭数据库

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

12. bbed修改相关block值

[oracle@oel6 ~]$ bbed
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Thu Aug 7 04:54:50 2014
Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename '/u01/app/oracle/oradata/XIFENFEI/system01.dbf'
        FILENAME        /u01/app/oracle/oradata/XIFENFEI/system01.dbf
BBED> set blocksize 8192
        BLOCKSIZE       8192
BBED> set block 1
        BLOCK#          1
BBED> set mode edit
        MODE            Edit
BBED> set count 32
        COUNT           32
BBED> m /x e81d
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/app/oracle/oradata/XIFENFEI/system01.dbf (0)
 Block: 1                Offsets:   196 to  227           Dba:0x00000000
------------------------------------------------------------------------
 e81d4000 12000000 00000000 41bad632 15bad632 01000000 00000000 00000000
 <32 bytes per line>
BBED> m /x 5200 offset +2
 File: /u01/app/oracle/oradata/XIFENFEI/system01.dbf (0)
 Block: 1                Offsets:   198 to  229           Dba:0x00000000
------------------------------------------------------------------------
 52001200 00000000 000041ba d63215ba d6320100 00000000 00000000 00000000
 <32 bytes per line>
BBED> d offset 96
 File: /u01/app/oracle/oradata/XIFENFEI/system01.dbf (0)
 Block: 1                Offsets:   196 to  227           Dba:0x00000000
------------------------------------------------------------------------
 e81d5200 12000000 00000000 41bad632 15bad632 01000000 00000000 00000000
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 1:
current = 0x5fbf, required = 0x5fbf
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/app/oracle/oradata/XIFENFEI/system01.dbf
BLOCK = 1
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED
BBED> exit

13. 启动数据库

SQL> startup
ORACLE instance started.
Total System Global Area  663945216 bytes
Fixed Size                  2291808 bytes
Variable Size             369100704 bytes
Database Buffers          289406976 bytes
Redo Buffers                3145728 bytes
Database mounted.
Database opened.

14. 验证I_OBJ1 index重建情况

SQL> SELECT OBJECT_ID,DATA_OBJECT_ID,CREATED FROM DBA_OBJECTS WHERE OBJECT_NAME='I_OBJ1';
 OBJECT_ID DATA_OBJECT_ID CREATED
---------- -------------- ---------
     77120          77120 06-AUG-14
SQL> !date
Thu Aug  6 05:29:25 CST 2014
SQL> SELECT HEADER_FILE,HEADER_BLOCK FROM DBA_SEGMENTS WHERE SEGMENT_NAME='I_OBJ1';
HEADER_FILE HEADER_BLOCK
----------- ------------
          1        77296
SQL> SELECT SQL_TEXT FROM BOOTSTRAP$ WHERE SQL_TEXT LIKE '%obj%';
SQL_TEXT
-----------------------------------------------------------------------------------------------------------------------------------------------
create unique index i_obj1 on obj$(obj#, owner#, type#) PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE (  INITIAL 65536 NEXT 1048576 MINEXTENTS 1 M
AXEXTENTS 2147483645 PCTINCREASE 0 OBJNO 77120 EXTENTS (FILE 1 BLOCK 77296))
create unique index i_obj2 on obj$(owner#, name, namespace,remoteowner, linkname, subname, type#, spare3, obj#) PCTFREE 10 INITRANS 2 MAXTRANS
255 STORAGE (  INITIAL 16384 NEXT 106496 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 OBJNO 77119 EXTENTS (FILE 1 BLOCK 77288))
create index i_obj3 on obj$(oid$) PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE (  INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 P
CTINCREASE 0 OBJNO 77118 EXTENTS (FILE 1 BLOCK 77280))
create index i_obj4 on obj$(dataobj#, type#, owner#) PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE (  INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXE
XTENTS 2147483645 PCTINCREASE 0 OBJNO 77117 EXTENTS (FILE 1 BLOCK 77272))
create unique index i_obj5 on obj$(spare3, name, namespace, type#, owner#, remoteowner, linkname, subname, obj#) PCTFREE 10 INITRANS 2 MAXTRANS
 255 STORAGE (  INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 OBJNO 77116 EXTENTS (FILE 1 BLOCK 77264))

不同版本对应的基表和index结构不一样(本篇blog使用的是12c做试验),但是本方法支持数据库从Oracle 7到Oracle 12c对于bootstrap$中核心index处理.
因为重建bootstrap$中的核心index是一个复杂而且风险非常大的事情,在你无法确定风险或者无法正常完成此类操作之时,如有需要请联系我
Phone:17813235971    Q Q:107644445    E-Mail:dba@xifenfei.com

某集团ebs数据库redo undo丢失导致悲剧

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:某集团ebs数据库redo undo丢失导致悲剧

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

某集团的ebs系统因磁盘空间不足把redo和undo存放到raid 0之上,而且该库无任何备份。最终悲剧发生了,raid 0异常导致redo undo全部丢失,数据库无法正常启动(我接手之时数据库已经resetlogs过,但是未成功)

Sun Jul 27 11:31:27 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sun Jul 27 11:31:27 2014
Database Characterset is ZHS16GBK
Sun Jul 27 11:31:27 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_smon_454754.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 42 cannot be read at this time
ORA-01110: data file 42: '/prod/oracle/PROD/logdata/undo/undo1.dbf'
Sun Jul 27 11:31:27 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_smon_454754.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 42 cannot be read at this time
ORA-01110: data file 42: '/prod/oracle/PROD/logdata/undo/undo1.dbf'
Sun Jul 27 11:31:27 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_smon_454754.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 42 cannot be read at this time
ORA-01110: data file 42: '/prod/oracle/PROD/logdata/undo/undo1.dbf'
Sun Jul 27 11:31:27 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_663670.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 41 cannot be read at this time
ORA-01110: data file 41: '/prod/oracle/PROD/logdata/undo/undo2.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 663670
ORA-1092 signalled during: ALTER DATABASE OPEN...

查询相关文件状态发现,undo表空间文件丢失,被offline处理
df_status
因为以前alert日志被清理,通过这里大概猜测是offline丢失的undo文件,然后resetlogs了数据库,现在处理方式为
使用_corrupted_rollback_segments屏蔽回滚段,然后尝试启动数据库

Tue Jul 29 11:40:39 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Tue Jul 29 11:40:39 2014
Database Characterset is ZHS16GBK
Tue Jul 29 11:40:39 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_smon_569378.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number  with name "" too small
Tue Jul 29 11:40:39 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_smon_569378.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number  with name "" too small
Tue Jul 29 11:40:39 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_smon_569378.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number  with name "" too small
Tue Jul 29 11:40:39 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_585786.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-01555: snapshot too old: rollback segment number  with name "" too small
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 585786
ORA-1092 signalled during: alter database open...

该错误是由于数据库启动需要找到对应的回滚段,但是由于undo异常导致该回滚段无法找到,因此出现该错误,解决方法是通过修改数据scn,让其不找回滚段,从而屏蔽该错误.数据库启动后,删除undo重新创建新undo

Tue Jul 29 15:59:22 2014
drop tablespace undo2 including contents and datafiles
Tue Jul 29 15:59:23 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_782490.trc:
ORA-01122: database file 41 failed verification check
ORA-01110: data file 41: '/prod/oracle/PROD/logdata/undo/undo2.dbf'
ORA-01565: error in identifying file '/prod/oracle/PROD/logdata/undo/undo2.dbf'
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: No such file or directory
Additional information: 3
Tue Jul 29 15:59:23 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_782490.trc:
ORA-01259: unable to delete datafile /prod/oracle/PROD/logdata/undo/undo2.dbf
Tue Jul 29 15:59:23 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_782490.trc:
ORA-01122: database file 42 failed verification check
ORA-01110: data file 42: '/prod/oracle/PROD/logdata/undo/undo1.dbf'
ORA-01565: error in identifying file '/prod/oracle/PROD/logdata/undo/undo1.dbf'
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: No such file or directory
Additional information: 3
ORA-01259: unable to delete datafile /prod/oracle/PROD/logdata/undo/undo2.dbf
Tue Jul 29 15:59:23 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_782490.trc:
ORA-01259: unable to delete datafile /prod/oracle/PROD/logdata/undo/undo1.dbf
Tue Jul 29 15:59:23 2014
Completed: drop tablespace undo2 including contents and datafiles
Tue Jul 29 15:59:56 2014
create undo tablespace undotbs1 datafile '/prod/oracle/PROD/logdata/undo_new01.dbf' size 100M autoextend on next 128M maxsize 30G
Tue Jul 29 15:59:57 2014
Completed: create undo tablespace undotbs1 datafile '/prod/oracle/PROD/logdata/undo_new01.dbf' size 100M autoextend on next 128M maxsize 30G
Tue Jul 29 16:00:03 2014
alter tablespace undotbs1 add datafile '/prod/oracle/PROD/logdata/undo_new02.dbf' size 100M autoextend on next 128M maxsize 30G
Completed: alter tablespace undotbs1 add datafile '/prod/oracle/PROD/logdata/undo_new02.dbf' size 100M autoextend on next 128M maxsize 30G

业务运行过程中,数据库报大量ORA-600 4097,ORA-600 kdsgrp1,ORA-600 kcfrbd_3错误

Tue Jul 29 16:07:03 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_950484.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Tue Jul 29 16:07:06 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_950484.trc:
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
Tue Jul 29 16:10:06 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_917702.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Tue Jul 29 16:10:07 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_917702.trc:
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
Tue Jul 29 16:12:45 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/bdump/prod_m000_880692.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Tue Jul 29 16:21:23 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_1040638.trc:
ORA-00600: 内部错误代码, 参数: [kcfrbd_3], [41], [231381], [1], [12800], [12800], [], []
Tue Jul 29 16:21:37 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_1040638.trc:
ORA-00600: 内部错误代码, 参数: [kcfrbd_3], [41], [231381], [1], [12800], [12800], [], []
Tue Jul 29 16:21:56 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_1040638.trc:
ORA-00600: 内部错误代码, 参数: [kcfrbd_3], [41], [231381], [1], [12800], [12800], [], []
Tue Jul 29 16:22:18 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_1040638.trc:
ORA-00600: 内部错误代码, 参数: [kcfrbd_3], [41], [231381], [1], [12800], [12800], [], []
Tue Jul 29 16:22:28 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_1105950.trc:
ORA-00600: 内部错误代码, 参数: [4097], [], [], [], [], [], [], []
Tue Jul 29 16:22:33 2014
Errors in file /prod/oracle/PROD/db/tech_st/10.2.0/admin/PROD_erpserver/udump/prod_ora_1159232.trc:
ORA-00600: 内部错误代码, 参数: [kcfrbd_3], [42], [61235], [1], [12800], [12800], [], []

出现该错误有几个原因和解决方法:
ORA-600 kdsgrp1 是因为相关坏块引起(tab,index,memory,cr block等),结合日志分析对象异常原因,根据具体情况确定对象然后选择合适处理方案(具体参考NOTE:1332252.1)
ORA-600 4097 由于数据库异常关闭然后open,创建回滚段,可能触发bug导致该问题(虽然说在当前版本修复,但是实际处理我确实按照NOTE:1030620.6解决)
ORA-600 kcfrbd_3 有事务的block被访问之后,根据回滚槽信息定位到相关回滚段,而正好新建的回滚段信息又和以前的名字编号一致,从而反馈出来是数据文件大小不够,从而出现该错误(具体参考NOTE:601798.1)
最终该数据库虽然恢复了,抢救了大量数据,但是对于ebs系统来说,丢失redo和undo数据的损失还是巨大的.再次温馨提示:数据库的redo,undo也很重要,数据库的备份更加重要

ORACLE 8.1.7 数据库ORA-600 4194故障恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORACLE 8.1.7 数据库ORA-600 4194故障恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一个817数据库报ORA-600 4194 无法正常启动

Fri Jul 25 10:49:47 2014
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Fri Jul 25 10:49:58 2014
ALTER DATABASE RECOVER  database
Fri Jul 25 10:49:58 2014
Media Recovery Start
Media Recovery Log
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3320 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Media Recovery Complete
Completed: ALTER DATABASE RECOVER  database
Fri Jul 25 10:50:09 2014
alter database open
Beginning crash recovery of 1 threads
Fri Jul 25 10:50:09 2014
Thread recovery: start rolling forward thread 1
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3320 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Fri Jul 25 10:50:09 2014
Thread recovery: finish rolling forward thread 1
Thread recovery: 0 data blocks read, 0 data blocks written, 3 redo blocks read
Crash recovery completed successfully
Fri Jul 25 10:50:09 2014
Thread 1 advanced to log sequence 3321
Thread 1 opened at log sequence 3321
  Current log# 3 seq# 3321 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1.
Fri Jul 25 10:50:09 2014
SMON: enabling cache recovery
Fri Jul 25 10:50:10 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA03216.TRC:
ORA-00600: ??????????: [4194], [12], [37], [], [], [], [], []
Fri Jul 25 10:50:10 2014
Recovery of Online Redo Log: Thread 1 Group 3 Seq 3321 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Fri Jul 25 10:50:10 2014
SMON: disabling cache recovery
Fri Jul 25 10:50:10 2014
ORA-600 signalled during: alter database open

ORA-600 4194这个错误在数据库异常恢复中非常常见,因为库不是很重要,因此就是直接屏蔽掉故障回滚段,然后强制拉库,该库的恢复过程中,也直接使用隐含参数屏蔽回滚段
_corrupted_rollback_segments= RBS0, RBS1, RBS2, RBS3, RBS4, RBS5, RBS6, RBS_HDSYS,数据库依然无法open,进一步分析trace文件

Fri Jul 25 11:26:07 2014
ORACLE V8.1.7.0.0 - Production vsnsta=0
vsnsql=e vsnxtr=3
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Oracle8i Release 8.1.7.0.0 - Production
JServer Release 8.1.7.0.0 - Production
Windows 2000 Version 5.2 Service Pack 2, CPU type 586
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 14
Windows thread id: 3648, image: ORACLE.EXE
*** SESSION ID:(11.1) 2014-07-25 11:26:07.843
*** 2014-07-25 11:26:07.843
ksedmp: internal or fatal error
ORA-00600: ??????????: [4194], [12], [37], [], [], [], [], []
Current SQL statement for this session:
update undo$ set name=:2,file#=:3,block#=:4,status$=:5,user#=:6,undosqn=:7,xactsqn=:8,
scnbas=:9,scnwrp=:10,inst#=:11,ts#=:12 where us#=:1
----- Call Stack Trace -----

这里很明显看出来,数据库是在open过程中,update undo$表遭遇到ORA-600 4194,因为该过程需要使用系统回滚段,但是由于其所对应的undo和redo信息不一致,所以无法正常启动数据库.继续读trace文件

  Extent Control Header
  -----------------------------------------------------------------
  Extent Header:: spare1: 0      space2: 0      #extents: 5      #blocks: 49
                  last map  0x00000000  #maps: 0      offset: 4128
      Highwater::  0x00400006  ext#: 0      blk#: 3      ext size: 9
  #blocks in seg. hdr's freelists: 0
  #blocks below: 0
  mapblk  0x00000000  offset: 0
                   Unlocked
     Map Header:: next  0x00000000  #extents: 5    obj#: 0      flag: 0x40000000
  Extent Map
  -----------------------------------------------------------------
   0x00400003  length: 9
   0x0040000c  length: 10
   0x0040008f  length: 10
   0x00400099  length: 10
   0x004000a3  length: 10
  TRN CTL:: seq: 0x003c chd: 0x004e ctl: 0x0050 inc: 0x00000000 nfb: 0x0000
            mgc: 0x8002 xts: 0x0068 flg: 0x0001 opt: 2147483646 (0x7ffffffe)
            uba: 0x00400006.003c.25 scn: 0x0000.009a4009
Version: 0x01
  FREE BLOCK POOL::
    uba: 0x00000000.003c.24 ext: 0x0  spc: 0x196
    uba: 0x00000000.001f.14 ext: 0x1  spc: 0x16f6
    uba: 0x00000000.0018.02 ext: 0x4  spc: 0x1f1a
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
    uba: 0x00000000.0000.00 ext: 0x0  spc: 0x0
  TRN TBL::

通过这里可以看出来,数据库在启动的时候,使用system undo的block为为0x00400006,使用bbed清除掉该uba记录,让数据库启动的时候重新分配system undo block给数据库执行update undo$使用,数据库open成功

BBED> m /x 0x00000000
 File: D:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF (0)
 Block: 2                Offsets: 4188 to 4192           Dba:0x00000000
------------------------------------------------------------------------
 00000000 3c002400 00009601 00000000 1f001400 0100f616 00000000 18000200
BBED> m /x 0x0000
 File: D:\ORACLE\ORADATA\ORCL\SYSTEM01.DBF (0)
 Block: 2                Offsets: 4028 to 4032           Dba:0x00000000
------------------------------------------------------------------------
 00000000 00000000 3c005000 02800100 68000000 feffff7f 06004000 3c002400
Sat Jul 26 12:09:21 2014
Thread recovery: start rolling forward thread 1
Recovery of Online Redo Log: Thread 1 Group 2 Seq 3326 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Sat Jul 26 12:09:21 2014
Thread recovery: finish rolling forward thread 1
Thread recovery: 0 data blocks read, 0 data blocks written, 3 redo blocks read
Crash recovery completed successfully
Sat Jul 26 12:09:22 2014
Thread 1 advanced to log sequence 3327
Thread 1 opened at log sequence 3327
  Current log# 3 seq# 3327 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO01.LOG
Successful open of redo thread 1.
Sat Jul 26 12:09:22 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Jul 26 12:09:39 2014
Completed: alter database open