在win 64位平台上运行bbed(支持ORACLE 10g 11g 12c)

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:在win 64位平台上运行bbed(支持ORACLE 10g 11g 12c)

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

很多朋友反馈在win 64位操作系统之上无法使用bbed(包括9i,10g,11g,12c数据库版本),以前写过一篇文章,完美实现了在win平台的各个版本的数据库版本之上实现使用bbed(在win中运行bbed程序),可惜很遗憾没有注明平台信息,留下了不少疑问,今天在自己的电脑上再次实现此功能,用来证明win 64位的平台之上也可以运行bbed程序(数据库版本包括10g,11g,12c,在10g之前x86架构中无win 64位版本数据库,因此我也无能为力).

操作系统版本64位
本机测试为win 7 64位操作系统
win-64


win-64-2


数据库版本64位
本机测试数据库版本为12.1.0.2 64位版本(因为12c都支持,那对于10g/11g更是不在话下)
db-64


bbed运行情况
这里的bbed只是运行起来,并未加载数据文件,因此这里看到的FILENAME为空,但是不妨碍证明bbed可以在win平台,64位的数据库中运行
bbed-win64


整体证明win 64位平台,64位数据库运行bbed
一图抵上千言万语,让我们使用一幅完整截图来说明,bbed是可以运行在win 64位平台的64位版本的数据库之上(而且这里使用了目前最新的12.1.0.2版本)
oracle-64-bbed-10g-11g-12c

ORA-01052: required destination LOG_ARCHIVE_DUPLEX_DEST is not specified 恢复思路

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-01052: required destination LOG_ARCHIVE_DUPLEX_DEST is not specified 恢复思路

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

今天一网友找到我,说数据库恢复在推scn的过程中遇到了ORA-01052: required destination LOG_ARCHIVE_DUPLEX_DEST is not specified错误无法解决,让我给予支持.这不禁让我想起,现在由于数据库bug和dblink原因导致了很多数据库scn很大,距离天花板非常近,从而使得数据库恢复过程中无法直接简单的推scn,这里正好结合该例子,简单说明下ORA-01052故障的处理.类似文档以前也写过:ORA-01052发生原因的类似文章

由于坏块导致数据库进行实例恢复无法进行

Beginning crash recovery of 1 threads
Started redo scan
Completed redo scan
 read 1901 KB redo, 276 data blocks need recovery
Started redo application at
 Thread 1: logseq 1004, block 172771
Recovery of Online Redo Log: Thread 1 Group 2 Seq 1004 Reading mem 0
  Mem# 0: F:\APP\ADMINISTRATOR\ORADATA\XFF\REDO02.LOG
Fri May 29 10:59:56 2015
RECOVERY OF THREAD 1 STUCK AT BLOCK 439938 OF FILE 19
Fri May 29 11:00:00 2015
Exception [type: ACCESS_VIOLATION, UNABLE_TO_READ] [ADDR:0x2048] [PC:0x6215134, __intel_new_memcpy()+260]
Fri May 29 11:00:12 2015
Trace dumping is performing id=[cdmp_20150529110012]
Fri May 29 11:00:12 2015
Slave exiting with ORA-1172 exception
Errors in file f:\app\administrator\diag\rdbms\XFF\XFF\trace\XFF_p007_1612.trc:
ORA-01172: 线程 1 的恢复停止在块 439938 (在文件 19 中)
ORA-01151: 如果需要, 请使用介质恢复以恢复块和还原备份
Fri May 29 11:00:27 2015
ORA-01578: ORACLE 数据块损坏 (文件号 19, 块号 450245)
ORA-01110: 数据文件 19: 'F:\APP\ADMINISTRATOR\ORADATA\XFF\PSTORE_02.DBF'
ORA-10564: tablespace PSTORE
ORA-01110: 数据文件 19: 'F:\APP\ADMINISTRATOR\ORADATA\XFF\PSTORE_02.DBF'
ORA-10561: block type 'TRANSACTION MANAGED INDEX BLOCK', data object# 91642
ORA-00607: 当更改数据块时出现内部错误
ORA-00602: 内部编程异常错误
ORA-07445: 出现异常错误: 核心转储 [_intel_new_memcpy()+260] [ACCESS_VIOLATION] [ADDR:0x2048]
[PC:0x6215134] [UNABLE_TO_READ] []
Fri May 29 11:00:27 2015
Aborting crash recovery due to slave death, attempting serial crash recovery
RECOVERY OF THREAD 1 STUCK AT BLOCK 439938 OF FILE 19
Fri May 29 11:00:45 2015
Trace dumping is performing id=[cdmp_20150529110045]
Aborting crash recovery due to error 1172
ORA-1172 signalled during: alter database open...

设置_allow_resetlogs_corruption并resetlogs尝试打开数据库

Assigning activation ID 4272042346 (0xfea2316a)
Thread 1 opened at log sequence 1
  Current log# 1 seq# 1 mem# 0: F:\APP\ADMINISTRATOR\ORADATA\XFF\REDO01.LOG
Successful open of redo thread 1
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri May 29 11:30:47 2015
SMON: enabling cache recovery
Fri May 29 11:30:47 2015
Errors in file f:\app\administrator\diag\rdbms\XFF\XFF\trace\XFF_ora_3004.trc  (incident=181236):
ORA-00600: ??????, ??: [2662], [3360], [2233437186], [3360], [2235447064], [4194545], [], [], [], [], [], []
Incident details in: f:\app\administrator\diag\rdbms\XFF\XFF\incident\incdir_181236\XFF_ora_3004_i181236.trc
Errors in file f:\app\administrator\diag\rdbms\XFF\XFF\trace\XFF_ora_3004.trc:
ORA-00704: ????????
ORA-00704: ????????
ORA-00600: ??????, ??: [2662], [3360], [2233437186], [3360], [2235447064], [4194545], [], [], [], [], [], []
Errors in file f:\app\administrator\diag\rdbms\XFF\XFF\trace\XFF_ora_3004.trc:
ORA-00704: ????????
ORA-00704: ????????
ORA-00600: ??????, ??: [2662], [3360], [2233437186], [3360], [2235447064], [4194545], [], [], [], [], [], []
Error 704 happened during db open, shutting down database
USER (ospid: 3004): terminating the instance due to error 704
Instance terminated by USER, pid = 3004
ORA-1092 signalled during: alter database open resetlogs...
opiodr aborting process unknown ospid (3004) as a result of ORA-1092

这里可以看到数据库通过设置_allow_resetlogs_corruption参数,进行不完全恢复,跳过数据库启动的实例恢复,然后强制拉库,然后遭遇大家熟悉的ORA-600[2662]错误,使得恢复失败,根据经验,通过推scn来绕过该错误

使用_minimum_giga_scn尝试推SCN

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
--------------------------------
*._minimum_giga_scn=13443
--------------------------------
SQL> startup pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area  236000356 bytes
Fixed Size                   451684 bytes
Variable Size             201326592 bytes
Database Buffers           33554432 bytes
Redo Buffers                 667648 bytes
Database mounted.
ORA-01052: required destination LOG_ARCHIVE_DUPLEX_DEST is not specified

通过运行Oracle Database Recovery Check检查发现数据库的scn已经非常大,距离天花板较近
_minimum_giga_scn
这里最大允许的推进的scn为13442.7,但是常规的最小的推scn的方法最小值为1024*1024*1024的倍数,因此这里遇到麻烦.
这里数据库遭遇了ORA-01052错误,导致推scn不成功,数据库无法正常启动.出现这类情况,由于scn可以增加的空间非常小,因此可以使用使用oradebug修改数据库scn或者直接修改控制文件scn的方式来精确控制推scn的值(可以实现任何值的scn增加,只要不超过天花板),也可以通过方法修改数据库scn距离天花板的距离,从而实现大幅度使用_minimum_giga_scn来推scn.另外还有一种解决方法:由于ORA-01052是由于scn过大导致(超过了数据库现在的天花板scn),因此出现了ORA-01052.所以另外一种变通的方法,就是通过调整数据库的天花板scn,从而使得_minimum_giga_scn可以继续推scn.在本次恢复中使用最为简单的增加天花板scn的方式来恢复(不过该方法恢复之后需要重建库,其实已经使用了隐含参数屏蔽redo恢复,本身就建议重建库保证数据字典一致性)

kfed恢复误删除磁盘组

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:kfed恢复误删除磁盘组

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在某些情况下,可能因为误操作,不小先drop diskgroup,这个时候千万别紧张,出现此类故障,可以通过kfed进行完美恢复(数据0丢失).如果进一步损坏了相关asm disk,那后续恢复就很麻烦了,可能需要使用dul扫描磁盘来进行抢救性恢复,而且可能导致数据丢失.
创建测试磁盘组xifenfei

[grid@xifenfei ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Thu Apr 30 15:12:08 2015
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Automatic Storage Management option
SQL>  select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
                               /dev/asm-disk3                 CANDIDATE
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER
SQL> create diskgroup xifenfei external redundancy disk '/dev/asm-disk3';
Diskgroup created.
SQL> select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
XIFENFEI_0000                  /dev/asm-disk3                 MEMBER
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER

使用/dev/asm-disk3这个磁盘创建磁盘组xifenfei

创建表,存储在xifenfei磁盘组中

[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Thu Apr 30 15:14:55 2015
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
SQL> create tablespace xifenfei datafile '+xifenfei' size 100M;
Tablespace created.
SQL> select name from v$datafile;
NAME
--------------------------------------------------------------------------------
+DATA/xifenfei/datafile/system.256.878224279
+DATA/xifenfei/datafile/sysaux.257.878224279
+DATA/xifenfei/datafile/undotbs1.258.878224279
+DATA/xifenfei/datafile/users.259.878224279
+XIFENFEI/xifenfei/datafile/xifenfei.256.878397315
SQL> create table t_xifenfei tablespace xifenfei
  2  as select * from dba_objects;
Table created.
SQL> select count(*) from t_xifenfei;
  COUNT(*)
----------
     86259

通过在磁盘组中创建表空间,从而实现表xifenfei存放在测试磁盘组中

尝试删除磁盘组xifenfei

SQL> drop diskgroup xifenfei;
drop diskgroup xifenfei
*
ERROR at line 1:
ORA-15039: diskgroup not dropped
ORA-15053: diskgroup "XIFENFEI" contains existing files
SQL> drop diskgroup xifenfei  including contents;
drop diskgroup xifenfei  including contents
*
ERROR at line 1:
ORA-15039: diskgroup not dropped
ORA-15027: active use of diskgroup "XIFENFEI" precludes its dismount
[grid@xifenfei ~]$ asmcmd
ASMCMD> lsof
DB_Name   Instance_Name  Path
xifenfei  xifenfei       +data/xifenfei/controlfile/current.260.878224379
xifenfei  xifenfei       +data/xifenfei/datafile/sysaux.257.878224279
xifenfei  xifenfei       +data/xifenfei/datafile/system.256.878224279
xifenfei  xifenfei       +data/xifenfei/datafile/undotbs1.258.878224279
xifenfei  xifenfei       +data/xifenfei/datafile/users.259.878224279
xifenfei  xifenfei       +data/xifenfei/onlinelog/group_1.261.878224381
xifenfei  xifenfei       +data/xifenfei/onlinelog/group_2.262.878224383
xifenfei  xifenfei       +data/xifenfei/onlinelog/group_3.263.878224385
xifenfei  xifenfei       +data/xifenfei/tempfile/temp.264.878224395
xifenfei  xifenfei       +xifenfei/xifenfei/datafile/xifenfei.256.878397315

由于xifenfei磁盘组被实例使用,因此磁盘组无法删除,报ORA-15027错误
由于xifenfei磁盘组中有文件,因此磁盘组无法删除,报ORA-15053错误
如果这两个阻止你误删除磁盘组的警告依然不能救你,那我也不好多说啥了,只能向我一样继续往下

关闭数据库实例,删除磁盘组

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> drop diskgroup xifenfei;
drop diskgroup xifenfei
*
ERROR at line 1:
ORA-15039: diskgroup not dropped
ORA-15053: diskgroup "XIFENFEI" contains existing files
SQL> drop diskgroup xifenfei  including contents;
Diskgroup dropped.
SQL>  select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
                               /dev/asm-disk3                 FORMER
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER
SQL> alter diskgroup xifenfei mount;
alter diskgroup xifenfei mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "XIFENFEI" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup
"XIFENFEI"

磁盘组被drop之后,无法正常mount,mount之时报ORA-15063凑无

kfed恢复删除磁盘组

[grid@xifenfei ~]$ kfed read /dev/asm-disk3 >/tmp/disk3-0-0
[grid@xifenfei ~]$ kfed  read /dev/asm-disk3  blkn=1 >/tmp/disk3-0-1
[grid@xifenfei ~]$ kfed  read /dev/asm-disk3  aun=1 >/tmp/disk3-1-0
通过vi修改这些/tmp/disk3-*中的部分值
[grid@xifenfei ~]$ kfed merge /dev/asm-disk3 text=/tmp/disk3-0-0
[grid@xifenfei ~]$ kfed merge /dev/asm-disk3  blkn=1 text=/tmp/disk3-0-1
[grid@xifenfei ~]$ kfed merge /dev/asm-disk3 aun=1 text=/tmp/disk3-1-0

查询修复后的asm disk

SQL> col path for a30
SQL> set lines 150
SQL> select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
                               /dev/asm-disk3                 MEMBER
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER

尝试mount xifenfei 磁盘组

SQL> alter diskgroup xifenfei mount;
Diskgroup altered.
SQL> select name,path,header_status from v$asm_disk;
NAME                           PATH                           HEADER_STATU
------------------------------ ------------------------------ ------------
XIFENFEI_0000                  /dev/asm-disk3                 MEMBER
DATA_0000                      /dev/asm-disk1                 MEMBER
DATA_0001                      /dev/asm-disk2                 MEMBER

测试恢复后磁盘组

SQL> startup
ORACLE instance started.
Total System Global Area  952020992 bytes
Fixed Size                  2258960 bytes
Variable Size             306186224 bytes
Database Buffers          637534208 bytes
Redo Buffers                6041600 bytes
Database mounted.
Database opened.
SQL>  select count(*) from t_xifenfei;
  COUNT(*)
----------
     86259

这里证明,当磁盘组被误删除后,立即停止进一步损坏,可以通过kfed进行完美恢复
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

分享I_OBJ4 ORA-8102故障恢复案例

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:分享I_OBJ4 ORA-8102故障恢复案例

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在测试环境中对于OBJ$中i_obj4中出现ORA-8102进行了重新并恢复测试,认为自己已经比较清楚的掌握了I_OBJ4的ORA-8102问题处理,可是实际的一个案例,还是比较比实验中复杂,这里贴出来主要操作供大家参考,再次证明数据库恢复的场景不可大意,客户的故障只有你想不到的,没有遇不到的
通过bbed修改obj$中dataobj$重现I_OBJ4索引报ORA-08102错误
使用bbed 修复I_OBJ4 index 报ORA-8102
数据库创建表提示ORA-8102错误

SQL> startup
ORACLE instance started.
Total System Global Area 2.6991E+10 bytes
Fixed Size		    2213976 bytes
Variable Size		 1.9327E+10 bytes
Database Buffers	 7516192768 bytes
Redo Buffers		  145174528 bytes
Database mounted.
Database opened.
SQL> create table t1 as select * from dual;
create table t1 as select * from dual
                                 *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)

分析ORA-08102错误

SQL> select object_name,object_type from dba_objects where object_id=39;
OBJECT_NAME                    OBJECT_TYPE
------------------------------ -------------------
I_OBJ4                         INDEX
SQL> create table t1 as select * from dual;
create table t1 as select * from dual
                                 *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)
SQL> select /*+ index(t i_obj4) */ DATAOBJ#,type#,owner# from obj$  t
minus
select /*+ full(t1) */ DATAOBJ#,type#,owner# from obj$  t1;
  2    3
  DATAOBJ#	TYPE#	  OWNER#
---------- ---------- ----------
     97109	    0	       0
SQL>  select /*+ full(t1) */ DATAOBJ#,type#,owner# from obj$  t1
  minus
  select /*+ index(t i_obj4) */ DATAOBJ#,type#,owner# from obj$  t
  ;
  2    3    4
  DATAOBJ#	TYPE#	  OWNER#
---------- ---------- ----------
     97094	    0	       0
SQL> SET LINES 122
COL INDEX_OWNER FOR A20
COL INDEX_NAME FOR A30
COL TABLE_OWNER FOR A20
COL COLUMN_NAME FOR A25
SELECT TABLE_OWNER,INDEX_NAME,COLUMN_NAME,COLUMN_POSITION
  FROM Dba_Ind_Columns
 WHERE table_name = upper('&TABLE_NAME') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION
and index_name='I_OBJ4';
SQL> SQL> SQL> SQL> SQL>   2    3
Enter value for table_name: OBJ$
old   3:  WHERE table_name = upper('&TABLE_NAME') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION
new   3:  WHERE table_name = upper('OBJ$') order by TABLE_OWNER,INDEX_OWNER,INDEX_NAME,COLUMN_POSITION
TABLE_OWNER	     INDEX_NAME 		    COLUMN_NAME 	      COLUMN_POSITION
-------------------- ------------------------------ ------------------------- ---------------
SYS		     I_OBJ4			    DATAOBJ#				    1
SYS		     I_OBJ4			    TYPE#				    2
SYS		     I_OBJ4			    OWNER#				    3
SQL> SELECT DATAOBJ# FROM OBJ$ WHERE OBJ#=97109;
no rows selected
SQL> SELECT DATAOBJ# FROM OBJ$ WHERE OBJ#=97094;
  DATAOBJ#
----------
     97094
SQL> select /*+ index(t i_obj4) */ rowid,DATAOBJ#,type#,owner# from obj$  t
minus
select /*+ full(t1) */ rowid,DATAOBJ#,type#,owner# from obj$  t1;
  2    3
ROWID		     DATAOBJ#	   TYPE#     OWNER#
------------------ ---------- ---------- ----------
AAAAASAABAAAADxAAb	97109	       0	  0
SQL>  select /*+ full(t1) */ rowid,DATAOBJ#,type#,owner# from obj$  t1
  minus
  select /*+ index(t i_obj4) */ rowid,DATAOBJ#,type#,owner# from obj$  t
  ;
  2    3    4
ROWID		     DATAOBJ#	   TYPE#     OWNER#
------------------ ---------- ---------- ----------
AAAAASAABAAAADxAAb	97094	       0	  0
SQL> select name,obj#,dataobj# from obj$ where rowid='AAAAASAABAAAADxAAb';
NAME				     OBJ#   DATAOBJ#
------------------------------ ---------- ----------
_NEXT_OBJECT				1      97094

到此也比较清楚,rowid为AAAAASAABAAAADxAAb的dataobj#记录在obj$表中为97094而在I_OBJ4中记录为97109,因此两者不一致,从而出现ORA-8102错误

尝试bbed解决ORA-8102问题
尝试修改obj$和i_obj4中的dataobj#记录一致,这里修改obj$中的对应记录

SQL> select dbms_rowid.rowid_relative_fno(rowid) file#,dbms_rowid.rowid_block_number(rowid) block#,
dbms_rowid.rowid_row_number(rowid) row#
from obj$ where rowid='AAAAASAABAAAADxAAb'  2    3
  4  /
     FILE#     BLOCK#	    ROW#
---------- ---------- ----------
	 1	  241	      27
SQL> select dump(97109,16) from dual;
DUMP(97109,16)
----------------------
Typ=2 Len=4: c3,a,48,a
SQL> select dump(97094,16) from dual;
DUMP(97094,16)
-----------------------
Typ=2 Len=4: c3,a,47,5f
-bash-4.1$ bbed blocksize=8192 mode=edit filename=/u01/app/oracle/oradata/oa/system01.dbf
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Sat Mar 14 19:30:18 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> show all
	FILE#          	0
	BLOCK#         	1
	OFFSET         	0
	DBA            	0x00000000 (0 0,1)
	FILENAME       	/u01/app/oracle/oradata/oa/system01.dbf
	BIFILE         	bifile.bbd
	LISTFILE
	BLOCKSIZE      	8192
	MODE           	Edit
	EDIT           	Unrecoverable
	IBASE          	Dec
	OBASE          	Dec
	WIDTH          	80
	COUNT          	512
	LOGFILE        	log.bbd
	SPOOL          	No
BBED> set block 241
	BLOCK#         	241
BBED> map
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 241                                   Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 48 bytes                     @20
 struct kdbh, 14 bytes                      @68
 struct kdbt[1], 4 bytes                    @82
 sb2 kdbr[105]                              @86
 ub1 freespace[87]                          @296
 ub1 rowdata[7805]                          @383
 ub4 tailchk                                @8188
BBED> p *kdbr[27]
rowdata[0]
----------
ub1 rowdata[0]                              @383      0x2c
BBED> x /rnnncnnncc
rowdata[0]                                  @383
----------
flag@383:  0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@384:  0x00
cols@385:    18
col    0[2] @386: 1
col    1[4] @389: 97094
col    2[1] @394: 0
col   3[12] @396: _NEXT_OBJECT
col    4[2] @409: 1
col    5[0] @412: *NULL*
col    6[1] @413: 0
col    7[7] @415: xm....4
col    8[7] @423: xs....6
col    9[7] @431: xm....4
col   10[1] @439: .
col   11[0] @441: *NULL*
col   12[0] @442: *NULL*
col   13[1] @443: .
col   14[0] @445: *NULL*
col   15[1] @446: .
col   16[4] @448: ..8$
col   17[1] @453: .
BBED> set count 32
	COUNT          	32
BBED> set offset 389
	OFFSET         	389
BBED> d
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 241              Offsets:  389 to  420           Dba:0x00000000
------------------------------------------------------------------------
 04c30a47 5f01800c 5f4e4558 545f4f42 4a454354 02c102ff 01800778 6d080f01
 <32 bytes per line>
BBED> set offset +3
	OFFSET         	392
BBED> d
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 241              Offsets:  392 to  423           Dba:0x00000000
------------------------------------------------------------------------
 475f0180 0c5f4e45 58545f4f 424a4543 5402c102 ff018007 786d080f 01113407
 <32 bytes per line>
BBED> m /x 480a
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 241              Offsets:  392 to  423           Dba:0x00000000
------------------------------------------------------------------------
 480a0180 0c5f4e45 58545f4f 424a4543 5402c102 ff018007 786d080f 01113407
 <32 bytes per line>
BBED>  p *kdbr[27]
rowdata[0]
----------
ub1 rowdata[0]                              @383      0x2c
BBED> x /rnnncnnncc
rowdata[0]                                  @383
----------
flag@383:  0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@384:  0x00
cols@385:    18
col    0[2] @386: 1
col    1[4] @389: 97109
col    2[1] @394: 0
col   3[12] @396: _NEXT_OBJECT
col    4[2] @409: 1
col    5[0] @412: *NULL*
col    6[1] @413: 0
col    7[7] @415: xm....4
col    8[7] @423: xs....6
col    9[7] @431: xm....4
col   10[1] @439: .
col   11[0] @441: *NULL*
col   12[0] @442: *NULL*
col   13[1] @443: .
col   14[0] @445: *NULL*
col   15[1] @446: .
col   16[4] @448: ..8$
col   17[1] @453: .
BBED> sum apply
Check value for File 0, Block 241:
current = 0x913d, required = 0x913d

验证bbed修改后效果

SQL> startup
ORACLE instance started.
Total System Global Area 2.6991E+10 bytes
Fixed Size		    2213976 bytes
Variable Size		 1.9327E+10 bytes
Database Buffers	 7516192768 bytes
Redo Buffers		  145174528 bytes
Database mounted.
Database opened.
SQL> select /*+ index(t i_obj4) */ rowid,DATAOBJ#,type#,owner# from obj$  t
minus
select /*+ full(t1) */ rowid,DATAOBJ#,type#,owner# from obj$  t1;
  2    3
no rows selected
SQL>  select /*+ full(t1) */ rowid,DATAOBJ#,type#,owner# from obj$  t1
  minus
  select /*+ index(t i_obj4) */ rowid,DATAOBJ#,type#,owner# from obj$  t
  ;
  2    3    4
no rows selected
SQL> ANALYZE TABLE sys.obj$ VALIDATE STRUCTURE CASCADE;
ANALYZE TABLE sys.obj$ VALIDATE STRUCTURE CASCADE
*
ERROR at line 1:
ORA-01499: table/index cross reference failure - see trace file
SQL> create table t_xifenfei as select * from dual;
create table t_xifenfei as select * from dual
                                         *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)

这里比较悲剧,我们看到i_obj4和obj$中对应记录已经一致(条数和数据值),但是依然不能执行创建表操作,依旧报ORA-8102错误.

进一步分析错误原因

SQL> ALTER SESSION SET EVENTS '802 trace name errorstack level 3';
Session altered.
SQL> create table t as select * from dual;
create table t as select * from dual
                                *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)
SQL> select value from v$diag_info where name='Default Trace File';
VALUE
--------------------------------------------------------------------------------
/u01/app/oracle/diag/rdbms/oa/oa/trace/oa_ora_6163.trc
oer 8102.2 - obj# 39, rdba: 0x00416e92(afn 1, blk# 93842)
kdk key 8102.2:
  ncol: 4, len: 16
  key: (16):  04 c3 0a 48 0a 01 80 01 80 06 00 40 00 f1 00 1b
--这里可以看出来,提示ORA-8102错误依旧在I_OBJ4,97109记录上
SQL> select max(dataobj#) from obj$;
MAX(DATAOBJ#)
-------------
	96815
SQL> select name,obj#,dataobj# from obj$ where rowid='AAAAASAABAAAADxAAb';
NAME				     OBJ#   DATAOBJ#
------------------------------ ---------- ----------
_NEXT_OBJECT				1      97109
--这里很奇怪,通过rowid查询我们已经的出来在obj$中有dataobj#为97109,而通过max(dataobj#)只有96815
分析max(dataobj#)执行计划
SQL> SET AUTOT TRACE
SQL> select max(dataobj#) from obj$;
Execution Plan
----------------------------------------------------------
Plan hash value: 721075849
------------------------------------------------------------------------------------
| Id  | Operation		   | Name   | Rows  | Bytes | Cost (%CPU)| Time    |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	   |	    |	  1 |	  2 |	  2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE 	   |	    |	  1 |	  2 |		 |    |
|   2 |   INDEX FULL SCAN (MIN/MAX)| I_OBJ4 |	  1 |	  2 |	  2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
	  0  recursive calls
	  0  db block gets
	  2  consistent gets
	  0  physical reads
	  0  redo size
	533  bytes sent via SQL*Net to client
	520  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  1  rows processed
--这里知晓,由于max(dataobj#)使用了INDEX FULL SCAN (MIN/MAX)执行计划,从而的出来最大值为96815,
--而我们从ORA-8102错误中可以看到index中有dataobj#为97109,证明index中的链表可能出问题
--为什么怀疑是链表有问题呢?因为该index的ffs正常

尝试把ORA-8102报错标记坏块尝试
并且通过event和隐含参数屏蔽index坏块

-bash-4.1$ bbed blocksize=8192 mode=edit filename=/u01/app/oracle/oradata/oa/system01.dbf
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Sat Mar 14 20:30:58 2015
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set block 93842
	BLOCK#         	93842
BBED> set offset 8188
	OFFSET         	8188
BBED> d
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 93842            Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 010675ad
 <32 bytes per line>
BBED> m /x 02
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 93842            Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 020675ad
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 93842:
current = 0x9186, required = 0x9186
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/app/oracle/oradata/oa/system01.dbf
BLOCK = 93842
Block 93842 is corrupt
Corrupt block relative dba: 0x00416e92 (file 0, block 93842)
Fractured block found during verification
Data in bad block:
 type: 6 format: 2 rdba: 0x00416e92
 last change scn: 0x0000.c007ad75 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xad750602
 check value in block header: 0x9186
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
-bash-4.1$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on Sat Mar 14 20:33:19 2015
Copyright (c) 1982, 2009, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 2.6991E+10 bytes
Fixed Size		    2213976 bytes
Variable Size		 1.9327E+10 bytes
Database Buffers	 7516192768 bytes
Redo Buffers		  145174528 bytes
Database mounted.
Database opened.
SQL> create table t as select * from dual;
create table t as select * from dual
                                *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-01578: ORACLE data block corrupted (file # 1, block # 93842)
ORA-01110: data file 1: '/u01/app/oracle/oradata/oa/system01.dbf'

通过这一步测试证明,在该ora-8102的错误中,通过坏块是无法解决绕过去该错误,只是把错误从ORA-8102转变为了ORA-1578

修复好制造坏块block

BBED> m /x 01
 File: /u01/app/oracle/oradata/oa/system01.dbf (0)
 Block: 93842            Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 010675ad
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 93842:
current = 0x9185, required = 0x9185
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/app/oracle/oradata/oa/system01.dbf
BLOCK = 93842
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 1
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED
SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 2.6991E+10 bytes
Fixed Size		    2213976 bytes
Variable Size		 1.9327E+10 bytes
Database Buffers	 7516192768 bytes
Redo Buffers		  145174528 bytes
Database mounted.
Database opened.
SQL> create table t1 as select * from dual;
create table t1 as select * from dual
                                 *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-08102: index key not found, obj# 39, file 1, block 93842 (2)

至此我们大体出来信息:
1. ORA-8102的是I_OBJ4中的_NEXT_OBJECT记录异常和该index链表异常
2. 通过bbed修改I_OBJ4和obj$中相应记录无法解决该问题,因为还有链表异常
3. 通过标记为坏块也无法绕过该问题
由于后续如果继续修复修复i_obj4可能工作量过大,而且可以也比较急,通过人工直接删除I_OBJ4数据字典记录,然后结合bootstrap$核心index(I_OBJ1,I_USER1,I_FILE#_BLOCK#,I_IND1,I_TS#,I_CDEF1等)异常恢复—ORA-00701错误解决处理实现完美恢复

ORA-01157: cannot identify/lock data file导致表空间无法online

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-01157: cannot identify/lock data file导致表空间无法online

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

朋友和我聊到一个案例,一个表空间本来有一个数据文件(已经有大量数据进入),然后他加入7个数据文件,这个时候他发现他加多了文件,想删除这些数据文件,可是又不怎么懂数据库,直接offline 表空间,然后rm掉相关数据文件,最后表空间无法online.我对其场景进行了模拟恢复测试,使用bbed和dul对其进行恢复

模拟场景
表空间有一个数据文件,里面有数据,然后加入一个新数据文件,进入offline tbs,然后rm数据文件
offline表空间后,该表空间存在的数据文件也会offline

SQL> create tablespace xifenfei datafile
  2  '/u01/oracle/ora9i/oradata/xifenfei01.dbf' size 10M autoextend on next 16M;
Tablespace created.
SQL> create table chf.t_xifenfei01 tablespace xifenfei
  2  as select * from dba_objects;
Table created.
SQL> create table chf.t_xifenfei02 tablespace xifenfei
  2  as select * from dba_tables;
Table created.
SQL> SELECT COUNT(*) FROM CHF.T_XIFENFEI01;
  COUNT(*)
----------
     30758
SQL> C/01/02
  1* SELECT COUNT(*) FROM CHF.T_XIFENFEI02
SQL> /
  COUNT(*)
----------
       865
SQL> select tablespace_name,segment_name from dba_segments
  2  where segment_name like 'T_XIFENFEI%';
TABLESPACE_NAME                SEGMENT_NAME
------------------------------ -------------------------
XIFENFEI                       T_XIFENFEI01
XIFENFEI                       T_XIFENFEI02
SQL> ALTER tablespace xifenfei add datafile
  2  '/u01/oracle/ora9i/oradata/xifenfei02.dbf' size 16m;
Tablespace altered.
SQL> alter tablespace xifenfei offline;
Tablespace altered.
SQL> select file#,status$ from file$;
     FILE#    STATUS$
---------- ----------
         1          2
         2          2
         3          2
         4          2
         5          2
         6          2
         7          2
         8          2
         9          2
        10          2
        11          2
        12          2
12 rows selected.
SQL> set pages 1000
SQL> /
     FILE#    STATUS$
---------- ----------
         1          2
         2          2
         3          2
         4          2
         5          2
         6          2
         7          2
         8          2
         9          2
        10          2
        11          2
        12          2
12 rows selected.
SQL> select file#,status from v$datafile;
     FILE# STATUS
---------- -------
         1 SYSTEM
         2 ONLINE
         3 ONLINE
         4 ONLINE
         5 ONLINE
         6 ONLINE
         7 ONLINE
         8 ONLINE
         9 ONLINE
        10 ONLINE
        11 OFFLINE
        12 OFFLINE
12 rows selected.
SQL> select file#,status from v$datafile_header;
     FILE# STATUS
---------- -------
         1 ONLINE
         2 ONLINE
         3 ONLINE
         4 ONLINE
         5 ONLINE
         6 ONLINE
         7 ONLINE
         8 ONLINE
         9 ONLINE
        10 ONLINE
        11 OFFLINE
        12 OFFLINE
12 rows selected.
SQL> select tablespace_name,status from dba_tablespaces;
TABLESPACE_NAME                STATUS
------------------------------ ---------
SYSTEM                         ONLINE
UNDOTBS1                       ONLINE
TEMP                           ONLINE
CWMLITE                        ONLINE
DRSYS                          ONLINE
EXAMPLE                        ONLINE
INDX                           ONLINE
ODM                            ONLINE
TOOLS                          ONLINE
USERS                          ONLINE
XDB                            ONLINE
XIFENFEI                       OFFLINE
12 rows selected.
SQL> select ts#,name,online$ from ts$;
       TS# NAME                              ONLINE$
---------- ------------------------------ ----------
         0 SYSTEM                                  1
         1 UNDOTBS1                                1
         2 TEMP                                    1
         3 CWMLITE                                 1
         4 DRSYS                                   1
         5 EXAMPLE                                 1
         6 INDX                                    1
         7 ODM                                     1
         8 TOOLS                                   1
         9 USERS                                   1
        10 XDB                                     1
        11 UNDOTBS2                                3
        12 XIFENFEI                                2
13 rows selected.
SQL> col name for a50
SQL> select file#,name from v$datafile;
     FILE# NAME
---------- --------------------------------------------------
         1 /u01/oracle/ora9i/oradata/ora9i/system01.dbf
         2 /u01/oracle/ora9i/oradata/ora9i/undotbs01.dbf
         3 /u01/oracle/ora9i/oradata/ora9i/cwmlite01.dbf
         4 /u01/oracle/ora9i/oradata/ora9i/drsys01.dbf
         5 /u01/oracle/ora9i/oradata/ora9i/example01.dbf
         6 /u01/oracle/ora9i/oradata/ora9i/indx01.dbf
         7 /u01/oracle/ora9i/oradata/ora9i/odm01.dbf
         8 /u01/oracle/ora9i/oradata/ora9i/tools01.dbf
         9 /u01/oracle/ora9i/oradata/ora9i/users01.dbf
        10 /u01/oracle/ora9i/oradata/ora9i/xdb01.dbf
        11 /u01/oracle/ora9i/oradata/xifenfei01.dbf
        12 /u01/oracle/ora9i/oradata/xifenfei02.dbf
12 rows selected.
SQL> !rm /u01/oracle/ora9i/oradata/xifenfei02.dbf
SQL> !ls -l /u01/oracle/ora9i/oradata/xifenfei*
-rw-r-----  1 ora9i oinstall 10493952 Dec 25 20:19 /u01/oracle/ora9i/oradata/xifenfei01.dbf
SQL> alter tablespace xifenfei online;
alter tablespace xifenfei online
*
ERROR at line 1:
ORA-01157: cannot identify/lock data file 12 - see DBWR trace file
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'

这里已经构造出来,由于由于数据文件丢失导致表空间online报ORA-01157: cannot identify/lock data file的错误

online存在数据文件
存在的数据文件online之后,表空间依然offline

SQL> alter database datafile 11 online;
Database altered.
SQL> select file#,status from v$datafile_header;
     FILE# STATUS
---------- -------
         1 ONLINE
         2 ONLINE
         3 ONLINE
         4 ONLINE
         5 ONLINE
         6 ONLINE
         7 ONLINE
         8 ONLINE
         9 ONLINE
        10 ONLINE
        11 ONLINE
        12 OFFLINE
12 rows selected.
SQL> select file#,status from v$datafile;
     FILE# STATUS
---------- -------
         1 SYSTEM
         2 ONLINE
         3 ONLINE
         4 ONLINE
         5 ONLINE
         6 ONLINE
         7 ONLINE
         8 ONLINE
         9 ONLINE
        10 ONLINE
        11 ONLINE
        12 OFFLINE
12 rows selected.
SQL> select tablespace_name,status from dba_tablespaces;
TABLESPACE_NAME                STATUS
------------------------------ ---------
SYSTEM                         ONLINE
UNDOTBS1                       ONLINE
TEMP                           ONLINE
CWMLITE                        ONLINE
DRSYS                          ONLINE
EXAMPLE                        ONLINE
INDX                           ONLINE
ODM                            ONLINE
TOOLS                          ONLINE
USERS                          ONLINE
XDB                            ONLINE
XIFENFEI                       OFFLINE
12 rows selected.
SQL> select ts#,name,online$ from ts$;
       TS# NAME                                                  ONLINE$
---------- -------------------------------------------------- ----------
         0 SYSTEM                                                      1
         1 UNDOTBS1                                                    1
         2 TEMP                                                        1
         3 CWMLITE                                                     1
         4 DRSYS                                                       1
         5 EXAMPLE                                                     1
         6 INDX                                                        1
         7 ODM                                                         1
         8 TOOLS                                                       1
         9 USERS                                                       1
        10 XDB                                                         1
        11 UNDOTBS2                                                    3
        12 XIFENFEI                                                    2
13 rows selected.
SQL> create table chf.t_1 tablespace xifenfei
  2  as select * from dual;
as select * from dual
                 *
ERROR at line 2:
ORA-01542: tablespace 'XIFENFEI' is offline, cannot allocate space in it

由于表空间无法online,因此对该表空间分配报ORA-01542错误

使用bbed构造数据文件欺骗数据库

[ora9i@xifenfei tmp]$ dd if=/dev/zero of=/u01/oracle/ora9i/oradata/xifenfei02.dbf
> bs=8192 count=2049
2049+0 records in
2049+0 records out
BBED> copy file 11 block 1 to file 12 block 1
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:    0 to  511           Dba:0x03000001
------------------------------------------------------------------------
 0b020000 0100c002 00000000 00000104 1deb0000 00002009 00000008 329bf29e
 4f524139 49000000 11010000 00050000 00200000 0b000300 00000000 00000000
 <32 bytes per line>
BBED> set count 32
        COUNT           32
BBED> m /x 0c
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:  280 to  311           Dba:0x03000001
------------------------------------------------------------------------
 0c000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> m /x 0c
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:   52 to   83           Dba:0x03000001
------------------------------------------------------------------------
 0c000300 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 <32 bytes per line>
BBED> sum apply
Check value for File 12, Block 1:
current = 0xad21, required = 0xad21
SQL> alter tablespace xifenfei online;
alter tablespace xifenfei online
*
ERROR at line 1:
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'
ORA-01251: Unknown File Header Version read for file number 12
SQL> !oerr ora 01251
01251, 00000, "Unknown File Header Version read for file number %s"
// *Cause:  Read of the file header returned a record but its version cannot
//         be identified.  Either the header has been corrupted, or the file
//         is not a valid database file.
// *Action: Have the operating system make the correct file available to
//         the database, or recover the file.
BBED> m /x 01000003
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:    4 to  36           Dba:0x03000001
------------------------------------------------------------------------
 01000003 00000000 00000104 21ad0000 00002009 00000008 329bf29e 4f524139
 <32 bytes per line>
BBED> sum apply
Check value for File 12, Block 1:
current = 0xace1, required = 0xace1
SQL> alter tablespace xifenfei online;
alter tablespace xifenfei online
*
ERROR at line 1:
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'
ORA-01203: wrong incarnation of this file - wrong creation SCN
BBED> m /x 613B0300
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:  100 to  131           Dba:0x03000001
------------------------------------------------------------------------
 613b0300 00000000 f35bb133 f39bae33 b46c0200 00000000 00000000 00000000
 <32 bytes per line>
BBED> sum apply
Check value for File 12, Block 1:
current = 0xa0c9, required = 0xa0c9
--提示数据文件scn比控制文件新(由于正常数据文件online过)
SQL> alter tablespace xifenfei online;
alter tablespace xifenfei online
*
ERROR at line 1:
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: '/u01/oracle/ora9i/oradata/xifenfei02.dbf'
ORA-01207: file is more recent than controlfile - old controlfile
SQL> set lines 160
SQL> col name for a50
SQL> select ts#,file#,BYTES/1024/1024/1024 status,enabled,
  2  to_char(checkpoint_change#,'999999999999999') "SCN",
  3  to_char(last_change#,'999999999999999')"STOP_SCN",
  4  name from v$datafile;
  TS#  FILE#     STATUS ENABLED SCN      STOP_SCN         NAME
----- --- ---------- ---------- -------- ------------- ---------------------------------------------
    0   1  .37109375 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/system01.dbf
    1   2   .1953125 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/undotbs01.dbf
    3   3  .01953125 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/cwmlite01.dbf
    4   4  .01953125 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/drsys01.dbf
    5   5 .145874023 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/example01.dbf
    6   6 .024414063 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/indx01.dbf
    7   7  .01953125 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/odm01.dbf
    8   8 .009765625 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/tools01.dbf
    9   9 .024414063 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/users01.dbf
   10  10 .043945313 READ WRITE   198231               /u01/oracle/ora9i/oradata/ora9i/xdb01.dbf
   12  11 .009765625 DISABLED     228353        228353 /u01/oracle/ora9i/oradata/xifenfei01.dbf
   12  12          0 DISABLED     213917        213917 /u01/oracle/ora9i/oradata/xifenfei02.dbf
12 rows selected.
SQL> set pages 100
SQL> set linesize 150
SQL> select ts#,file#,TABLESPACE_NAME,status,
  2  to_char(CREATION_TIME,'yyyy-mm-dd hh24:mi:ss') CREATE_TIME,
  3  to_char(checkpoint_change#,'9999999999999999') "SCN",
  4  to_char(RESETLOGS_CHANGE#,'9999999999999999') "RESETLOGS SCN",FUZZY
from v$datafile_header;  5
TS# FILE# TABLESPACE_NAME  STATUS  CREATE_TIME         SCN       RESETLOGS SCN     FUZ
--- ----- ---------------- ------- ------------------- --------- ----------------- ---
  0     1 SYSTEM           ONLINE  2004-03-10 12:17:49    198231            158900 YES
  1     2 UNDOTBS1         ONLINE  2004-03-10 20:30:20    198231            158900 YES
  3     3 CWMLITE          ONLINE  2004-03-10 12:18:10    198231            158900 YES
  4     4 DRSYS            ONLINE  2004-03-10 12:18:11    198231            158900 YES
  5     5 EXAMPLE          ONLINE  2004-03-10 12:18:12    198231            158900 YES
  6     6 INDX             ONLINE  2004-03-10 12:18:16    198231            158900 YES
  7     7 ODM              ONLINE  2004-03-10 12:18:17    198231            158900 YES
  8     8 TOOLS            ONLINE  2004-03-10 12:18:19    198231            158900 YES
  9     9 USERS            ONLINE  2004-03-10 12:18:19    198231            158900 YES
 10    10 XDB              ONLINE  2004-03-10 12:18:20    198231            158900 YES
 12    11 XIFENFEI         OFFLINE 2014-12-25 17:57:07    228353            158900 NO
  0    12                  OFFLINE                             0                 0
12 rows selected.
BBED> m /x 9D430300
 File: /u01/oracle/ora9i/oradata/xifenfei02.dbf (12)
 Block: 1                Offsets:  140 to  171           Dba:0x03000001
------------------------------------------------------------------------
 9d430300 00000000 f670b133 01000000 02000000 fc4e0000 10000000 02000000
 <32 bytes per line>
BBED> SUM APPLY
Check value for File 12, Block 1:
current = 0x9982, required = 0x9982
SQL> ALTER TABLESPACE XIFENFEI ONLINE;
Tablespace altered.

在bbed的恢复过程中出现以下常见错误:
ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: ‘/u01/oracle/ora9i/oradata/xifenfei02.dbf’
ORA-01251: Unknown File Header Version read for file number 12
这个主要是块号和块地址不一匹配,导致数据库无法识别

ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: ‘/u01/oracle/ora9i/oradata/xifenfei02.dbf’
ORA-01203: wrong incarnation of this file – wrong creation SCN
这个由于数据文件头的创建scn和控制文件以及file$中的记录不一致

ORA-01122: database file 12 failed verification check
ORA-01110: data file 12: ‘/u01/oracle/ora9i/oradata/xifenfei02.dbf’
ORA-01207: file is more recent than controlfile – old controlfile
这个由于正常数据文件online过,或者你构造文件头使用的scn大于需要构造文件在控制文件中记录的scn值

后续处理

SQL> SELECT COUNT(*) FROM CHF.T_XIFENFEI01;
  COUNT(*)
----------
     30758
SQL> C/01/02
  1* SELECT COUNT(*) FROM CHF.T_XIFENFEI02
SQL> /
  COUNT(*)
----------
       865
SQL> ALTER DATABASE DATAFILE 12 OFFLINE DROP;
Database altered.

证明通过bbed,模拟数据文件,使用完美恢复数据内容

dul恢复

SQL> ALTER TABLESPACE XIFENFEI OFFLINE;
Tablespace altered.
[ora9i@xifenfei dul]$ ./dul
Data UnLoader: 10.2.0.5.33 - Internal Only - on Thu Dec 25 21:18:54 2014
with 64-bit io functions
Copyright (c) 1994 2014 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
Within one week you will need a more recent DUL version for this os
Found db_id = 2666699570
Found db_name = ORA9I
DUL> bootstrap;
Probing file = 1, block = 417
  database version 9 bootstrap$ at file 1, block 377
. unloading table                BOOTSTRAP$
DUL: Warning: block number is non zero but marked deferred trying to process it anyhow
      57 rows unloaded
DUL: Warning: Dictionary cache DC_BOOTSTRAP is empty
Reading BOOTSTRAP.dat 57 entries loaded
Parsing Bootstrap$ contents
Generating dict.ddl for version 9
 OBJ$: segobjno 18, file 1 block 121
 TAB$: segobjno 2, tabno 1, file 1  block 25
 COL$: segobjno 2, tabno 5, file 1  block 25
 USER$: segobjno 10, tabno 1, file 1  block 89
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$   31490 rows unloaded
. unloading table                      TAB$     909 rows unloaded
. unloading table                      COL$   35582 rows unloaded
. unloading table                     USER$      63 rows unloaded
Reading USER.dat 63 entries loaded
Reading OBJ.dat 31490 entries loaded and sorted 31490 entries
Reading TAB.dat 909 entries loaded
Reading COL.dat 35582 entries loaded and sorted 35582 entries
Reading BOOTSTRAP.dat 57 entries loaded
DUL: Warning: Recreating file "dict.ddl"
Generating dict.ddl for version 9
 OBJ$: segobjno 18, file 1 block 121
 TAB$: segobjno 2, tabno 1, file 1  block 25
 COL$: segobjno 2, tabno 5, file 1  block 25
 USER$: segobjno 10, tabno 1, file 1  block 89
 TABPART$: segobjno 230, file 1 block 1657
 INDPART$: segobjno 234, file 1 block 1689
 TABCOMPART$: segobjno 249, file 1 block 1809
 INDCOMPART$: segobjno 253, file 1 block 1841
 TABSUBPART$: segobjno 240, file 1 block 1737
 INDSUBPART$: segobjno 245, file 1 block 1777
 IND$: segobjno 2, tabno 3, file 1  block 25
 ICOL$: segobjno 2, tabno 4, file 1  block 25
 LOB$: segobjno 2, tabno 6, file 1  block 25
 COLTYPE$: segobjno 2, tabno 7, file 1  block 25
 TYPE$: segobjno 296, tabno 1, file 1  block 2129
 COLLECTION$: segobjno 296, tabno 2, file 1  block 2129
 ATTRIBUTE$: segobjno 296, tabno 3, file 1  block 2129
 LOBFRAG$: segobjno 258, file 1 block 1881
 LOBCOMPPART$: segobjno 261, file 1 block 1905
 UNDO$: segobjno 15, file 1 block 105
 TS$: segobjno 6, tabno 2, file 1  block 57
 PROPS$: segobjno 101, file 1 block 769
Running generated file "@dict.ddl" to unload the dictionary tables
. unloading table                      OBJ$
DUL: Warning: Recreating file "OBJ.ctl"
   31490 rows unloaded
. unloading table                      TAB$
DUL: Warning: Recreating file "TAB.ctl"
     909 rows unloaded
. unloading table                      COL$
DUL: Warning: Recreating file "COL.ctl"
   35582 rows unloaded
. unloading table                     USER$
DUL: Warning: Recreating file "USER.ctl"
      63 rows unloaded
. unloading table                  TABPART$      55 rows unloaded
. unloading table                  INDPART$     128 rows unloaded
. unloading table               TABCOMPART$       0 rows unloaded
. unloading table               INDCOMPART$       0 rows unloaded
. unloading table               TABSUBPART$       0 rows unloaded
. unloading table               INDSUBPART$       0 rows unloaded
. unloading table                      IND$    1396 rows unloaded
. unloading table                     ICOL$    1842 rows unloaded
. unloading table                      LOB$     427 rows unloaded
. unloading table                  COLTYPE$    1160 rows unloaded
. unloading table                     TYPE$     956 rows unloaded
. unloading table               COLLECTION$     250 rows unloaded
. unloading table                ATTRIBUTE$    3623 rows unloaded
. unloading table                  LOBFRAG$       0 rows unloaded
. unloading table              LOBCOMPPART$       0 rows unloaded
. unloading table                     UNDO$      21 rows unloaded
. unloading table                       TS$      13 rows unloaded
. unloading table                    PROPS$      25 rows unloaded
Reading USER.dat 63 entries loaded
Reading OBJ.dat 31490 entries loaded and sorted 31490 entries
Reading TAB.dat 909 entries loaded
Reading COL.dat 35582 entries loaded and sorted 35582 entries
Reading TABPART.dat 55 entries loaded and sorted 55 entries
Reading TABCOMPART.dat 0 entries loaded and sorted 0 entries
Reading TABSUBPART.dat 0 entries loaded and sorted 0 entries
Reading INDPART.dat 128 entries loaded and sorted 128 entries
Reading INDCOMPART.dat 0 entries loaded and sorted 0 entries
Reading INDSUBPART.dat 0 entries loaded and sorted 0 entries
Reading IND.dat 1396 entries loaded
Reading LOB.dat 427 entries loaded
Reading ICOL.dat 1842 entries loaded
Reading COLTYPE.dat 1160 entries loaded
Reading TYPE.dat 956 entries loaded
Reading ATTRIBUTE.dat 3623 entries loaded
Reading COLLECTION.dat 250 entries loaded
Reading BOOTSTRAP.dat 57 entries loaded
Reading LOBFRAG.dat 0 entries loaded and sorted 0 entries
Reading LOBCOMPPART.dat 0 entries loaded and sorted 0 entries
Reading UNDO.dat 21 entries loaded
Reading TS.dat 13 entries loaded
Reading PROPS.dat 25 entries loaded
Database character set is ZHS16GBK
Database national character set is AL16UTF16
DUL> unload table chf.t_xifenfei01;
. unloading table              T_XIFENFEI01   30758 rows unloaded
DUL> unload table chf.t_xifenfei02;
. unloading table              T_XIFENFEI02     865 rows unloaded
DUL>

这里证明,在表空间不能online的情况下,dul处理类似故障很轻松(但是需要重建表空间,bbed处理的从原理上可以不重建表空间),当然如果涉及的对象很多,而丢失的文件很少,可能bbed也不失为一种很不错的选择.
另外补充:如果表空间丢失的文件包含数据,也可以通过类似bbed和dul方法处理:bbed 让表空间online之后需要考虑跳过丢失文件;dul处理完全相同

asm disk误设置pvid导致asm diskgroup无法mount恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:asm disk误设置pvid导致asm diskgroup无法mount恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

有朋友找到我说他们把以前存储到AIX直连的存储切换为含光纤交换机的存储网络后,RAC无法启动,让我给予支持.通过分析是由于换链路之后开始磁盘顺序不对,维护人员对其asm disk 设置了pvid,导致asm 磁盘组无法正常mount,从而使得含votedisk的dg的asm disk无法正常访问,从而RAC的cssd进程无法启动,同样数据文件的磁盘组也无法mount,通过kfed修复成功,实现数据0丢失.
平台版本信息(2节点RAC)

$ sqlplus -v
SQL*Plus: Release 11.2.0.4.0 Production
$ uname -a
AIX db2 1 7 00F9733E4C00

GI日志报错信息

2014-12-20 16:44:08.769:
[ohasd(6946818)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2014-12-20 16:44:11.775:
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-20 16:44:26.791:
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
、Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-20 16:44:41.812:
[cssd(9502756)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds;
Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log

从这里可以看出来是由于RAC启动过程中无法获得votedisk使得其无法正常启动,通过分析日志找出来votedisk相关磁盘

2014-12-15 17:36:15.424:
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk4; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-15 17:36:15.433:
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk5; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log
2014-12-15 17:36:15.445:
[cssd(10027070)]CRS-1605:CSSD voting file is online: /dev/rhdisk6; details in /u01/app/11.2.0/grid/log/db1/cssd/ocssd.log

从这里可以知道rhdisk4,5,6为votedisk对应磁盘,使用kfed查看磁盘头信息

$kfed read /dev/rhdisk4
kfbh.endian:                        201 ; 0x000: 0xc9
kfbh.hard:                          194 ; 0x001: 0xc2
kfbh.type:                          212 ; 0x002: *** Unknown Enum ***
kfbh.datfmt:                        193 ; 0x003: 0xc1
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
1102BEE00 C9C2D4C1 00000000 00000000 00000000  [................]
1102BEE10 00000000 00000000 00000000 00000000  [................]
        Repeat 6 times
1102BEE80 00F9733D 67553E0A 00000000 00000000  [..s=gU>.........]
1102BEE90 00000000 00000000 00000000 00000000  [................]
  Repeat 246 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][212]
$kfed read /dev/rhdisk4 blkn=1
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            2 ; 0x002: KFBTYP_FREESPC
kfbh.datfmt:                          2 ; 0x003: 0x02
kfbh.block.blk:                       1 ; 0x004: blk=1
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3883664132 ; 0x00c: 0xe77c0304
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdfsb.aunum:                         0 ; 0x000: 0x00000000
kfdfsb.max:                         254 ; 0x004: 0x00fe
kfdfsb.cnt:                          23 ; 0x006: 0x0017
kfdfsb.bound:                         0 ; 0x008: 0x0000
kfdfsb.flag:                          1 ; 0x00a: B=1
kfdfsb.ub1spare:                      0 ; 0x00b: 0x00
kfdfsb.spare[0]:                      0 ; 0x00c: 0x00000000
kfdfsb.spare[1]:                      0 ; 0x010: 0x00000000
kfdfsb.spare[2]:                      0 ; 0x014: 0x00000000
kfdfse[0].fse:                      119 ; 0x018: FREE=0x7 FRAG=0x7
kfdfse[1].fse:                       16 ; 0x019: FREE=0x0 FRAG=0x1
…………
$kfed read /dev/rhdisk4 blkn=510
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                     254 ; 0x004: blk=254
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  3460116983 ; 0x00c: 0xce3d31f7
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:         ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]:            0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]:            0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]:            0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        2 ; 0x026: KFDGTP_NORMAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:                CRS_0000 ; 0x028: length=8
kfdhdb.grpname:                     CRS ; 0x048: length=3
kfdhdb.fgname:                 CRS_0000 ; 0x068: length=8
…………

由上述分析可以基本上确定是asm disk header 被破坏,进一步分析破坏原因

[db2/dev#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          00f9733d675541e5                    None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None
[db2/dev#]ls -l rhdisk*
crw-------    2 root     system       24,  1 Oct 18 11:45 rhdisk0
crw-------    1 root     system       24,  3 Oct 18 13:27 rhdisk1
crw-------    1 root     system       24,  5 Dec 20 20:02 rhdisk10
crw-------    1 root     system       24,  2 Oct 18 13:32 rhdisk2
crw-------    1 root     system       24,  0 Oct 18 13:32 rhdisk3
crw-rw----    1 grid     asmadmin     24,  8 Dec 20 20:02 rhdisk4
crw-rw----    1 grid     asmadmin     24,  9 Dec 20 20:02 rhdisk5
crw-rw----    1 grid     asmadmin     24, 10 Dec 20 20:02 rhdisk6
crw-rw----    1 grid     asmadmin     24,  4 Dec 20 20:02 rhdisk7
crw-rw----    1 grid     asmadmin     24,  6 Dec 20 20:02 rhdisk8
crw-rw----    1 grid     asmadmin     24,  7 Dec 20 20:02 rhdisk9

从这里基本上可以看出来,是由于磁盘头被重写了pvid,导致asm disk header 被破坏.进一步分析asm log,确定哪些磁盘被用作asm disk

SQL> CREATE DISKGROUP CRS NORMAL REDUNDANCY  DISK '/dev/rhdisk4',
'/dev/rhdisk5',
'/dev/rhdisk6' ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (1,0) to disk (/dev/rhdisk4)
NOTE: Assigning number (1,1) to disk (/dev/rhdisk5)
NOTE: Assigning number (1,2) to disk (/dev/rhdisk6)
NOTE: initializing header on grp 1 disk CRS_0000
NOTE: initializing header on grp 1 disk CRS_0001
NOTE: initializing header on grp 1 disk CRS_0002
SQL> CREATE DISKGROUP DATA EXTERNAL REDUNDANCY  DISK
'/dev/rhdisk9' SIZE 614400M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (2,0) to disk (/dev/rhdisk9)
NOTE: initializing header on grp 2 disk DATA_0000
SQL> CREATE DISKGROUP FBA EXTERNAL REDUNDANCY  DISK
'/dev/rhdisk8' SIZE 204800M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (3,0) to disk (/dev/rhdisk8)
NOTE: initializing header on grp 3 disk FBA_0000
SQL> CREATE DISKGROUP ARCH EXTERNAL REDUNDANCY  DISK
'/dev/rhdisk7' SIZE 102400M  ATTRIBUTE 'compatible.asm'='11.2.0.0.0','au_size'='1M' /* ASMCA */
NOTE: Assigning number (4,0) to disk (/dev/rhdisk7)
NOTE: initializing header on grp 4 disk ARCH_0000

这里可以确定asm disk为rhdisk[4-9],通过kfed分析全部和rhdisk4一样的问题,也符合lspv查询出来的结果,使用kfed repair修复asm disk header后

SQL> alter diskgroup data mount;
Diskgroup altered.
SQL> alter diskgroup fba mount;
Diskgroup altered.
SQL> alter diskgroup arch mount;
Diskgroup altered.
SQL> alter diskgroup crs mount;
Diskgroup altered.
SQL> select group_number,disk_number,path from v$asm_disk;
GROUP_NUMBER DISK_NUMBER PATH
------------ ----------- --------------------------------------------------
           2           0 /dev/rhdisk4
           2           1 /dev/rhdisk5
           2           2 /dev/rhdisk6
           1           0 /dev/rhdisk7
           4           0 /dev/rhdisk8
           3           0 /dev/rhdisk9
6 rows selected.
SQL> select group_number,name from v$asm_diskgroup;
GROUP_NUMBER NAME
------------ ------------------------------------------------------------
           1 ARCH
           2 CRS
           3 DATA
           4 FBA

这里证明通过kfed对磁盘头的修复,asm磁盘组已经全部mount成功,GI状态也恢复正常

[db2/#]crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.CRS.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.DATA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.FBA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.asm
               ONLINE  ONLINE       db1                      Started
               ONLINE  ONLINE       db2                      Started
ora.gsd
               OFFLINE OFFLINE      db1
               OFFLINE OFFLINE      db2
ora.net1.network
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.ons
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.registry.acfs
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db1
ora.cvu
      1        ONLINE  ONLINE       db1
ora.db1.vip
      1        ONLINE  ONLINE       db1
ora.db2.vip
      1        ONLINE  ONLINE       db2
ora.nkora.db
      1        ONLINE  ONLINE       db1                      Open
      2        ONLINE  ONLINE       db2                      Open
ora.oc4j
      1        ONLINE  ONLINE       db1
ora.scan1.vip
      1        ONLINE  ONLINE       db1

这里忽略了一个问题,在修复磁盘头之前没有清除pvid,导致磁盘头修复后,pvid依然存储在odm中

[db2/dev#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          00f9733d675541e5                    None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None

通过分析发现fba磁盘组中无任何记录,使用该磁盘组进行直接清除pvid测试

$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Sun Dec 21 03:13:31 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter diskgroup fba dismount;
Diskgroup altered.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
$ exit
You have mail in /usr/spool/mail/root
[db2/#]chdev -l hdisk8 -a pv=clear
hdisk8 changed
[db2/#]lspv
hdisk0          00f9733ef7cf27e9                    rootvg          active
hdisk1          00f9733e21b953e6                    rootvg          active
hdisk2          00f9733e21b97a83                    appvg           active
hdisk3          00f9733e21b98434                    appvg           active
hdisk4          00f9733d67553e0a                    None
hdisk5          00f9733d67553f31                    None
hdisk6          00f9733d67554011                    None
hdisk7          00f9733d67554165                    None
hdisk8          none                                None
hdisk9          00f9733d675542e4                    None
hdisk10         none                                None
[db2/#]su - grid
$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.4.0 Production on Sun Dec 21 03:15:19 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter diskgroup fba mount;
Diskgroup altered.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

通过测试直接清除pvid asm 磁盘头依然工作正常,关闭GI,使用chdev清除hdisk[4-9]所有pvid,启动GI一切正常

[db1/#]crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.CRS.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.DATA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.FBA.dg
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.LISTENER.lsnr
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.asm
               ONLINE  ONLINE       db1                      Started
               ONLINE  ONLINE       db2                      Started
ora.gsd
               OFFLINE OFFLINE      db1
               OFFLINE OFFLINE      db2
ora.net1.network
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.ons
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
ora.registry.acfs
               ONLINE  ONLINE       db1
               ONLINE  ONLINE       db2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       db1
ora.cvu
      1        ONLINE  ONLINE       db1
ora.db1.vip
      1        ONLINE  ONLINE       db1
ora.db2.vip
      1        ONLINE  ONLINE       db2
ora.nkora.db
      1        ONLINE  ONLINE       db1                      Open
      2        ONLINE  ONLINE       db2                      Open
ora.oc4j
      1        ONLINE  ONLINE       db1
ora.scan1.vip
      1        ONLINE  ONLINE       db1
[db1/#]lspv
hdisk0          00f9733df7c7a9db                    rootvg          active
hdisk1          00f9733d21dad8fe                    rootvg          active
hdisk2          00f9733d21dbd08b                    appvg           active
hdisk3          00f9733d21dbd2ab                    appvg           active
hdisk4          none                                None
hdisk5          none                                None
hdisk6          none                                None
hdisk7          none                                None
hdisk8          none                                None
hdisk9          none                                None
hdisk10         none                                None

至此设置pvid导致asm disk header损坏的asm 恢复正常,实现数据0丢失。
温馨提示:aix asm disk磁盘中不能设置pvid,否则将会导致asm disk header 损坏,无法正常mount
如果您遇到此类情况,无法解决请联系我们,提供专业ORACLE数据库恢复技术支持
Phone:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

SYSTEM表空间坏块恢复—C_TS#对象坏块恢复(file 1 block 60)

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:SYSTEM表空间坏块恢复—C_TS#对象坏块恢复(file 1 block 60)

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一朋友给我电话,说他们客户公司数据库故障,被另外一家公司恢复了一天不能正常恢复,请求我协助解决.接手一看数据库已经被破坏的不像样子了,根据alert日志信息大概分析了故障原因和上家公司处理情况。后面接手后通过bbed修复block数据库恢复过程,在本次恢复中出现大量ORA-600错误,主要包括ORA-00600 400,ORA-00600 2662,ORA-00600 2663,ORA-00600 krhpfh_03-1209,ORA-00600 3600,ORA-00600 ktsitbs_info1,ORA-00600 4137,ORA-00600 4511,ORA-00600 4198,ORA-00600 6807等
故障原因redo文件丢失

Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_1404.trc:
ORA-00313: open failed for members of log group 7 of thread 1
ORA-00312: online log 9 thread 1: '/data2/oradata/redo0902.log'
ORA-27037: unable to obtain file status
SVR4 Error: 2: No such file or directory
Additional information: 3
Thu Nov 20 11:28:39 2014
LGWR: terminating instance due to error 313
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_pmon_1394.trc:
ORA-00313: open failed for members of log group  of thread
Thu Nov 20 11:28:39 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_ckpt_1406.trc:
ORA-00313: open failed for members of log group  of thread
Instance terminated by LGWR, pid = 1404

尝试clear redo文件方式恢复

Thu Nov 20 13:04:16 2014
alter database clear logfile group 9
Thu Nov 20 13:04:16 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:45 2014
alter database clear logfile group 9
Thu Nov 20 13:04:46 2014
ORA-1624 signalled during: alter database clear logfile group 9...
Thu Nov 20 13:04:59 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:04:59 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...
Thu Nov 20 13:05:00 2014
alter database clear unarchived logfile group 9
Thu Nov 20 13:05:00 2014
ORA-1624 signalled during: alter database clear unarchived logfile group 9...

不完全恢复resetlogs尝试打开数据库

ORA-279 signalled during: ALTER DATABASE RECOVER  database using backup controlfile  ...
Thu Nov 20 13:49:01 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Thu Nov 20 13:49:02 2014
Media Recovery Log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
Errors with log /opt/oracle/flash_recovery_area/xifenfei/archivelog/2014_11_20/o1_mf_1_285999_%u_.arc
ORA-308 signalled during: ALTER DATABASE RECOVER    CONTINUE DEFAULT  ...
Thu Nov 20 13:49:02 2014
ALTER DATABASE RECOVER CANCEL
Thu Nov 20 13:49:03 2014
Media Recovery Canceled
Completed: ALTER DATABASE RECOVER CANCEL
Thu Nov 20 13:49:33 2014
alter database open resetlogs
Thu Nov 20 13:49:34 2014
ORA-1113 signalled during: alter database open resetlogs...

使用隐含参数

_allow_resetlogs_corruption= TRUE

进行不完全恢复,尝试open数据库报ORA-600 4000错误

Thu Nov 20 14:35:02 2014
ALTER DATABASE   MOUNT
Thu Nov 20 14:35:07 2014
Setting recovery target incarnation to 2
Thu Nov 20 14:35:07 2014
Successful mount of redo thread 1, with mount id 4039504598
Thu Nov 20 14:35:07 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 14:40:33 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 14:40:33 2014
Media Recovery Start
Thu Nov 20 14:40:33 2014
Media Recovery failed with error 1610
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 14:41:23 2014
ALTER DATABASE RECOVER  database using backup controlfile until cancel
Thu Nov 20 14:43:08 2014
alter database open resetlogs
Thu Nov 20 14:43:08 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31293973571
Resetting resetlogs activation ID 3855216310 (0xe5c9eeb6)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Online log /data2/oradata/redo0902.log: Thread 1 Group 9 was previously cleared
Thu Nov 20 14:43:14 2014
Setting recovery target incarnation to 3
Thu Nov 20 14:43:14 2014
Assigning activation ID 4039504598 (0xf0c5f2d6)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 14:43:14 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 14:43:14 2014
SMON: enabling cache recovery
Thu Nov 20 14:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_1844.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 14:43:16 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 1844
ORA-1092 signalled during: alter database open resetlogs...

尝试隐含屏蔽回滚段

_corrupted_rollback_segments= _SYSSMU1$, _SYSSMU2$,…………

错误依旧ORA-600 4000

Thu Nov 20 15:09:21 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 15:09:21 2014
SMON: enabling cache recovery
Thu Nov 20 15:09:21 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_624.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [17], [], [], [], [], [], []
Thu Nov 20 15:09:23 2014
Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 624
ORA-1092 signalled during: alter database open

多次重启,resetlogs后,数据库出现ORA-600 2662错误

Successful open of redo thread 1
Thu Nov 20 17:13:24 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 17:13:24 2014
SMON: enabling cache recovery
Thu Nov 20 17:13:24 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_7967.trc:
ORA-00600: internal error code, arguments: [2662], [7], [1229382552], [7], [1229560642], [8388633], [], []
Thu Nov 20 17:13:25 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 7967
ORA-1092 signalled during: ALTER DATABASE OPEN...
Thu Nov 20 17:18:23 2014
USER: terminating instance due to error 1092
Instance terminated by USER, pid = 7967

offline undo相关文件,尝试打开数据库

Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Thu Nov 20 17:52:31 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 17:52:31 2014
Media Recovery Start
 parallel recovery started with 15 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 17:53:42 2014
ALTER DATABASE RECOVER CANCEL
Thu Nov 20 17:53:44 2014
ORA-1547 signalled during: ALTER DATABASE RECOVER CANCEL ...
Thu Nov 20 17:56:34 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:56:35 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline
Thu Nov 20 17:57:01 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:02 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline
Thu Nov 20 17:57:26 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:27 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline
Thu Nov 20 17:57:43 2014
alter database open resetlogs
Thu Nov 20 17:57:43 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:58:58 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:58:58 2014
Completed: alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 17:59:15 2014
alter database open resetlogs
Thu Nov 20 17:59:15 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
ORA-1245 signalled during: alter database open resetlogs...
Thu Nov 20 17:59:35 2014
alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:35 2014
Completed: alter database datafile '/data2/oradata/undotbs02.dbf' offline drop
Thu Nov 20 17:59:50 2014
alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 17:59:50 2014
Completed: alter database datafile '/data2/oradata/undotbs03.dbf' offline drop
Thu Nov 20 18:00:07 2014
alter database open resetlogs
Thu Nov 20 18:00:07 2014
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 31294173628
Resetting resetlogs activation ID 4039492628 (0xf0c5c414)
Online log /data2/oradata/redo0802.log: Thread 1 Group 8 was previously cleared
Thu Nov 20 18:00:14 2014
Setting recovery target incarnation to 8
Thu Nov 20 18:00:14 2014
Assigning activation ID 4039504142 (0xf0c5f10e)
Thread 1 opened at log sequence 1
  Current log# 9 seq# 1 mem# 0: /data2/oradata/redo0902.log
Successful open of redo thread 1
Thu Nov 20 18:00:15 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Thu Nov 20 18:00:15 2014
SMON: enabling cache recovery
Thu Nov 20 18:00:15 2014
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
File #2 is offline, but is part of an online tablespace.
data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
File #100 is offline, but is part of an online tablespace.
data file 100: '/data2/oradata/undotbs02.dbf'
Thu Nov 20 18:00:28 2014
File #185 is offline, but is part of an online tablespace.
data file 185: '/data2/oradata/undotbs03.dbf'
Dictionary check complete
Thu Nov 20 18:00:35 2014
SMON: enabling tx recovery
Thu Nov 20 18:00:36 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_28472.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_lgwr_28450.trc:
ORA-00604: error occurred at recursive SQL level
Thu Nov 20 18:00:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_28446.trc:
ORA-00604: error occurred at recursive SQL level
Instance terminated by USER, pid = 28472
ORA-1092 signalled during: alter database open resetlogs...

不知道做了什么操作出现file 1 block 60坏块,很可能bbed修改错误导致

Thu Nov 20 19:18:15 2014
SMON: enabling cache recovery
Thu Nov 20 19:18:16 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Successfully onlined Undo Tablespace 1.
Thu Nov 20 19:18:16 2014
SMON: enabling tx recovery
Thu Nov 20 19:18:17 2014
Database Characterset is ZHS16CGB231280
Thu Nov 20 19:18:17 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_13232.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 185 cannot be read at this time
ORA-01110: data file 185: '/data2/oradata/undotbs03.dbf'
Error 604 happened during db open, shutting down database
USER: terminating instance due to error 604
Instance terminated by USER, pid = 13232
ORA-1092 signalled during: alter database open...

尝试不完全恢复,并resetlogs操作

ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 19:33:41 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:33:41 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:33:42 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:34:06 2014
alter database open resetlogs
Thu Nov 20 19:34:06 2014
ORA-1139 signalled during: alter database open resetlogs...
Thu Nov 20 19:34:17 2014
alter database open
Thu Nov 20 19:34:17 2014
ORA-1190 signalled during: alter database open...
Thu Nov 20 19:35:57 2014
ALTER DATABASE RECOVER  database until cancel
Thu Nov 20 19:35:57 2014
Media Recovery Start
Datafile 2 is on orphaned branch
          File status = 4
        Abs fuzzy SCN = 0
 Hot backup fuzzy SCN = 0
Thu Nov 20 19:35:58 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_20878.trc:
ORA-00600: internal error code, arguments: [krhpfh_03-1209], [2], [864151207], [864153315], [1229402557], [7], [0], [0]
ORA-01110: data file 2: '/opt/oracle/oradata/xifenfei/undotbs01.dbf'
Thu Nov 20 19:35:59 2014
Media Recovery failed with error 600
ORA-283 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Thu Nov 20 19:37:19 2014
alter database open resetlogs
Thu Nov 20 19:37:19 2014
ORA-1139 signalled during: alter database open resetlogs...

继续打开报 ORA-600 3600错误

Thu Nov 20 19:43:14 2014
alter database datafile '/opt/oracle/oradata/xifenfei/undotbs01.dbf' offline drop
Thu Nov 20 19:43:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw0_20856.trc:
ORA-00600: internal error code, arguments: [3600], [2], [14], [], [], [], [], []
Thu Nov 20 19:43:15 2014
DBW0: terminating instance due to error 471
Instance terminated by DBW0, pid = 20856

<strong>中间多次重启和resetlogs,还出现ORA-600 2663错误</strong>

Fri Nov 21 12:35:12 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Fri Nov 21 12:35:12 2014
SMON: enabling cache recovery
Fri Nov 21 12:35:13 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_15596.trc:
ORA-00600: internal error code, arguments: [2663], [7], [1229543007], [7], [1229560642], [], [], []
Fri Nov 21 12:35:14 2014
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mman_15572.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Fri Nov 21 12:35:14 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_dbw1_15576.trc:
ORA-00600: internal error code, arguments: [], [], [], [], [], [], [], []
Instance terminated by USER, pid = 15596
ORA-1092 signalled during: ALTER DATABASE OPEN..

继续尝试打开数据库出现ORA-600 ktsitbs_info1错误

SMON: enabling cache recovery
Fri Nov 21 13:54:25 2014
Hex dump of (file 1, block 60) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
Fri Nov 21 13:54:25 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Fri Nov 21 13:54:27 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_21111.trc:
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Error 600 happened during db open, shutting down database
USER: terminating instance due to error 600
Instance terminated by USER, pid = 21111
ORA-1092 signalled during: alter database open...

以上是客户数据库故障原因和问题大概的处理过程,下面是我接手后的处理过程


dbv 检查system01.dbf文件,得到结果

HNDX-DB% dbv file=/opt/oracle/oradata/xifenfei/system01.dbf
DBVERIFY: Release 10.2.0.1.0 - Production on Fri Nov 21 16:22:37 2014
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /opt/oracle/oradata/xifenfei/system01.dbf
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv:
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
DBVERIFY - Verification complete
Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 2
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 2
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)

这里知道数据库有两个坏块,而且根据对于bootstrap$的经验,可以大概确定60坏块很可能是C_TS#,第一反应type异常,498可能是seq$

对数据库启动过程做10046,得到trace文件

PARSING IN CURSOR #1 len=275 dep=2 uid=0 oct=3 lid=0 tim=27978051403575 hv=3408408745 ad='7df93cd0'
select name,online$,contents$,undofile#,undoblock#,blocksize,dflmaxext,dflinit,dflincr,dflextpct,dflminext,
dflminlen, owner#,scnwrp,scnbas, NVL(pitrscnwrp, 0), NVL(pitrscnbas, 0), dflogging, bitmapped, inc#, flags,
plugged, NVL(spare1,0), NVL(spare2,0) from ts$ where ts#=:1
END OF STMT
PARSE #1:c=0,e=92,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051403569
BINDS #1:
kkscoacd
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=08 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=ffffffff7dbac9a8  bln=22  avl=02  flg=05
  value=2
EXEC #1:c=0,e=310,p=0,cr=0,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051404296
WAIT #1: nam='db file sequential read' ela= 42 file#=1 block#=60 blocks=1 obj#=-1 tim=27978051404449
Hex dump of (file 1, block 60)
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during buffer read
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Reread of rdba: 0x0040003c (file 1, block 60) found same corrupted data
FETCH #1:c=10000,e=4072,p=1,cr=2,cu=0,mis=0,r=0,dep=2,og=4,tim=27978051408438
STAT #1 id=1 cnt=0 pid=0 pos=1 obj=16 op='TABLE ACCESS CLUSTER TS$ (cr=2 pr=1 pw=0 time=4075 us)'
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=7 op='INDEX UNIQUE SCAN I_TS# (cr=1 pr=0 pw=0 time=13 us)'
*** 2014-11-22 14:44:43.235
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [ktsitbs_info1], [2], [], [], [], [], [], []
Current SQL statement for this session:
select max(maxconcurrency) from sys.wrh$_undostat  where instance_number = :1 and dbid = :2
and snap_id in   (select snap_id from dba_hist_snapshot where end_interval_time >
(select max(end_interval_time)-7 from dba_hist_snapshot))

这里显示了数据库启动报ORA-00600[ktsitbs_info1],[2],明显的表示了b中的2是表示表空间号,由于ts$坏块,无法读取ts$中表空间信息,从而出现数据字典不一致,从而出现该错误。所以恢复该库的关键是修复file 1 block 60.

bbed尝试修复file 1 block 60

HNDX-DB% bbed password=blockedit mode=edit
BBED: Release 2.0.0.0.0 - Limited Production on Sat Nov 22 15:16:26 2014
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename '/opt/oracle/oradata/xifenfei/system01.dbf'
        FILENAME        /opt/oracle/oradata/xifenfei/system01.dbf
BBED> set block 8192
        BLOCK#          8192
BBED> set block 60
        BLOCK#          60
BBED> set count 64
        COUNT           64
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (128)
BBED> set block 61
        BLOCK#          61
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7959]                        @122
 ub1 rowdata[107]                           @8081
 ub4 tailchk                                @8188
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x0040003d
   ub4 bas_kcbh                             @8        0x0000235b
   ub2 wrp_kcbh                             @12       0x0000
   ub1 seq_kcbh                             @14       0x01
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x7a85
   ub2 spare3_kcbh                          @18       0x0000
BBED> set block 60
        BLOCK#          60
BBED> d
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 80000000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000
 <32 bytes per line>
BBED> d block 61
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 61               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003d 0000235b 00000104 7a850000 01000000 00000006 00001837
 00001738 00020200 00000000 0007002e 00000002 00800075 00012300 80000000
 <32 bytes per line>
BBED> set block 60
        BLOCK#          60
BBED> m /x 06a2
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60               Offsets:    0 to   63           Dba:0x00000000
------------------------------------------------------------------------
 06a20000 0040003c ebe04bc9 00050204 6faa0000 01000000 00000006 29b3a204
 00040ca0 00020200 00000000 000a0000 00000002 0080009b 00000100 80000000
 <32 bytes per line>
BBED> map
 File: /opt/oracle/oradata/xifenfei/system01.dbf (0)
 Block: 60                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 72 bytes                     @20
 struct kdbh, 14 bytes                      @92
 struct kdbt[3], 12 bytes                   @106
 sb2 kdbr[2]                                @118
 ub1 freespace[7598]                        @122
 ub1 rowdata[468]                           @7720
 ub4 tailchk                                @8188
BBED> sum apply
Check value for File 0, Block 60:
current = 0xe908, required = 0xe908
BBED> verify
DBVERIFY - Verification starting
FILE = /opt/oracle/oradata/xifenfei/system01.dbf
BLOCK = 60
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
BBED>

尝试启动数据库

Sat Nov 22 15:51:33 2014
alter database open
Sat Nov 22 15:51:34 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 15:51:34 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 15:51:34 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 15:51:34 2014
Database Characterset is ZHS16CGB231280
Hex dump of (file 1, block 498) in trace file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_2818.trc
Corrupt block relative dba: 0x004001f2 (file 1, block 498)
Bad check value found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x004001f2
 last change scn: 0x0007.49499ca1 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x9ca10601
 check value in block header: 0xe458
 computed block checksum: 0x9720
Reread of rdba: 0x004001f2 (file 1, block 498) found same corrupted data
Sat Nov 22 15:51:35 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_2803.trc:
ORA-00600: internal error code, arguments: [4000], [12], [], [], [], [], [], []
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=18, OS id=3000
Sat Nov 22 15:51:36 2014
Completed: alter database open
Sat Nov 22 15:51:36 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_3010.trc:
ORA-00600: internal error code, arguments: [6807], [AUDSES$], [144], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [6807], [WRI$_ALERT_SEQUENCE], [8783], [], [], [], [], []
Sat Nov 22 15:51:37 2014
Non-fatal internal error happenned while SMON was doing non-existent object cleanup.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Sat Nov 22 15:51:38 2014
ORA-600 encountered when generating server alert SMG-3000
Sat Nov 22 15:51:38 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_mmon_2809.trc:
ORA-00600: internal error code, arguments: [ktcpoptx_0], [0x772705E60], [], [], [], [], [], []

只要出现ORA-600 4000和ORA-600 6807错误,其中ORA-600 6807错误比较明显是由于seq$坏块,导致AUDSES$ seq异常导致。ORA-600 4000应该是回滚段异常,继续分析回滚段

SQL> select name,ts#,status$ from undo$;
NAME                                  TS#    STATUS$
------------------------------ ---------- ----------
SYSTEM                                  0          2
_SYSSMU1$                               1          2
_SYSSMU2$                               1          2
_SYSSMU3$                               1          2
…………
_SYSSMU168$                             1          2
_SYSSMU169$                             1          2

这里很异常,system回滚段在数据库open之后,按照常理不可能处于STATUS$=2(OFFLINE)状态。而且其他回滚段全部为OFFLINE状态也属于异常情况.而且尝试drop undo报ORA-01561,另外在dba_rollback_segs中无SYSTEM(查询结果忘记保存)

SQL> drop tablespace undotbs1 including contents;
drop tablespace undotbs1 including contents
*
ERROR at line 1:
ORA-01561: failed to remove all objects in the tablespace specified

通过这一系列很怀疑是由于bbed 修改了undo$等相关基表信息导致现在system中的undo信息混乱.信息反馈给客户后,客户想起来昨天给他们恢复的公司在bbed操作前备份了system01.dbf.突然感觉救星来了.实在怕不懂bbed的人折腾bbed

dbv检测备份文件

DBVERIFY - Verification starting : FILE = /data3/backup/system01.dbf_bak
Page 60 is marked corrupt
Corrupt block relative dba: 0x0040003c (file 1, block 60)
Bad header found during dbv:
Data in bad block:
 type: 128 format: 0 rdba: 0x0040003c
 last change scn: 0x0005.ebe04bc9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bc90602
 check value in block header: 0x6faa
 computed block checksum: 0x0
Block Checking: DBA = 4194802, Block Type = KTB-managed data block
data header at 0x1002ef05c
kdbchk: row locked by non-existent transaction
        table=0   slot=4
        lockid=1   ktbbhitc=2
Page 498 failed with check code 6101
DBVERIFY - Verification complete
Total Pages Examined         : 786432
Total Pages Processed (Data) : 201131
Total Pages Failing   (Data) : 1
Total Pages Processed (Index): 221394
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 60265
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 303641
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Highest block SCN            : 1229823477 (7.1229823477)

好家伙只有一个物理坏块和一个逻辑坏块,而对于物理坏块block 60已经知道如何修复,逻辑坏块可以尝试设置隐含参数跳过去,bbed修改相关block(同上步骤)

再次启动数据库

dd if=/opt/oracle/oradata/xifenfei/system01.dbf bs=8192 count=2 of=/tmp/system01.2
dd if=/tmp/system01.2 of=/data3/backup/system01.dbf_bak bs=8192 count=2 conv=notrunc
Sat Nov 22 17:52:50 2014
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Sat Nov 22 17:53:38 2014
alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:53:39 2014
Completed: alter database rename file '/opt/oracle/oradata/xifenfei/system01.dbf' to '/data3/backup/system01.dbf_bak'
Sat Nov 22 17:55:43 2014
alter database open
Sat Nov 22 17:55:48 2014
LGWR: STARTING ARCH PROCESSES
ARC0 started with pid=18, OS id=15858
Sat Nov 22 17:56:10 2014
ARC0: Archival started
ARC1: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=17, OS id=15879
Sat Nov 22 17:56:19 2014
Thread 1 opened at log sequence 7
  Current log# 8 seq# 7 mem# 0: /data2/oradata/redo0802.log
Successful open of redo thread 1
Sat Nov 22 17:56:19 2014
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Sat Nov 22 17:56:19 2014
SMON: enabling cache recovery
SMON: enabling tx recovery
Sat Nov 22 17:56:20 2014
ARC1: STARTING ARCH PROCESSES
Sat Nov 22 17:56:20 2014
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Sat Nov 22 17:56:22 2014
Database Characterset is ZHS16CGB231280
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
Sat Nov 22 17:56:33 2014
ARC2: Archival started
ARC1: STARTING ARCH PROCESSES COMPLETE
ARC1: Becoming the heartbeat ARCH
ARC2 started with pid=23, OS id=15928
QMNC started with pid=25, OS id=15996
Sat Nov 22 17:57:11 2014
Completed: alter database open
Sat Nov 22 17:57:18 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16010.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:57:26 2014
Errors in file /opt/oracle/admin/xifenfei/udump/xifenfei_ora_16012.trc:
ORA-00600: internal error code, arguments: [4511], [], [], [], [], [], [], []
Sat Nov 22 17:58:17 2014
Starting background process EMN0
Sat Nov 22 18:00:03 2014
Shutting down instance: further logons disabled
EMN0 started with pid=71, OS id=16421
Sat Nov 22 18:00:12 2014
SMON: Restarting fast_start parallel rollback
Sat Nov 22 18:00:23 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_p000_15951.trc:
ORA-00600: internal error code, arguments: [4198], [9], [], [], [], [], [], []
Sat Nov 22 18:00:24 2014
Stopping background process CJQ0
Sat Nov 22 18:00:24 2014
Stopping background process QMNC
Sat Nov 22 18:00:27 2014
Doing block recovery for file 2 block 41
Block recovery from logseq 7, block 180883 to scn 214748389244
Sat Nov 22 18:00:27 2014
Recovery of Online Redo Log: Thread 1 Group 8 Seq 7 Reading mem 0
  Mem# 0 errs 0: /data2/oradata/redo0802.log
Block recovery stopped at EOT rba 7.180988.16
Block recovery completed at rba 7.180988.16, scn 50.24441
Sat Nov 22 18:00:32 2014
Stopping background process MMNL
Sat Nov 22 18:00:38 2014
Stopping background process MMON
Sat Nov 22 18:00:41 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []
Sat Nov 22 18:00:42 2014
ORACLE Instance xifenfei (pid = 9) - Error 600 encountered while recovering transaction (3, 4).
Sat Nov 22 18:00:42 2014
Errors in file /opt/oracle/admin/xifenfei/bdump/xifenfei_smon_15395.trc:
ORA-00600: internal error code, arguments: [4137], [], [], [], [], [], [], []

这里都是很常规的错误,查询undo$也已经正常,重建新undo表空间删除老undo,然后alert日志中无其他报错,数据库恢复至此完成,建议客户导出导入重建数据库

Oracle 12C的第一次异常恢复—文件头坏块

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:Oracle 12C的第一次异常恢复—文件头坏块

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到第一个使用Oracle 12C作为生产库的恢复救援.有两个业务数据文件报文件头损坏,其他数据文件全部是9月份的一次备份,在当前的条件下,希望我们能够帮他们恢复出来业务文件中的数据
数据库版本信息

SQL> select * from v$version;
BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production              0
PL/SQL Release 12.1.0.1.0 - Production                                                    0
CORE    12.1.0.1.0      Production                                                        0
TNS for Linux: Version 12.1.0.1.0 - Production                                            0
NLSRTL Version 12.1.0.1.0 - Production                                                    0

数据库故障
具体脚本请参考:数据库恢复检查脚本(Oracle Database Recovery Check)
控制文件信息
1
控制文件中关于数据文件信息
2
数据文件头信息
3
alert日志报错

Reading datafile '/app/oracle/oradata/freetouch/sales.dbf' for corruption at rdba: 0x00000001 (file 4, block 1)
Reread (file 4, block 1) found same corrupt data (no logical check)
Hex dump of (file 5, block 1) in trace file /app/oracle/diag/rdbms/valuenet/valuenet/trace/valuenet_ora_12384.trc
Corrupt block relative dba: 0x00000001 (file 5, block 1)
Fractured block found during kcvxfh v8
Data in bad block:
 type: 0 format: 2 rdba: 0x00000001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000001
 check value in block header: 0xa701
 computed block checksum: 0x0
Reading datafile '/app/oracle/oradata/freetouch/drp_200200' for corruption at rdba: 0x00000001 (file 5, block 1)
Reread (file 5, block 1) found same corrupt data (no logical check)
Hex dump of (file 4, block 1) in trace file /app/oracle/diag/rdbms/valuenet/valuenet/trace/valuenet_ora_12384.trc
Corrupt block relative dba: 0x00000001 (file 4, block 1)
Fractured block found during kcvxfh v8
Data in bad block:
 type: 0 format: 2 rdba: 0x00000001
 last change scn: 0x0000.00000000 seq: 0x1 flg: 0x05
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x00000001
 check value in block header: 0xa701
 computed block checksum: 0x0

odu无法识别异常文件

[oracle@db odu]$ ./odu
Oracle Data Unloader trial version 4.1.3
Copyright (c) 2008,2009,2010,2011 XiongJun. All rights reserved.
loading default config.......
byte_order little
block_size  8192
db_timezone -7
Invalid db timezone:-7
client_timezone 8
Invalid client timezone:8
asmfile_extract_path /home/oracle/hongye/odu/data
data_path  /home/oracle/hongye/odu/data
lob_path  /home/oracle/hongye/odu/data
charset_name ZHS16GBK
ncharset_name AL16UTF16
output_form  dmp
error at line 10.
lob_storage infile
clob_byte_order big
trace_level 1
delimiter |
unload_deleted yes
file_header_offset 0
is_tru64 no
record_row_addr no
convert_clob_charset yes
load config file 'config.txt' successful
loading default asm disk file ......
can not open file 'asmdisk.txt', error message:No such file or directory.
loading default control file ......
unknown file format '/app/oracle/oradata/freetouch/sales.dbf'
unknown file format '/app/oracle/oradata/freetouch/drp_200200'
 ts#   fn  rfn bsize   blocks bf offset filename
---- ---- ---- ----- -------- -- ------ --------------------------------------------
   1    1    1  8192   194560 N       0 /app/oracle/oradata/freetouch/system01.dbf
   6    2   10  8192    45840 N       0 /app/oracle/oradata/freetouch/example01.dbf
   1    3    3  8192   907520 N       0 /app/oracle/oradata/freetouch/sysaux01.dbf
   4 1024   10  8192        0 N       0 /app/oracle/oradata/freetouch/sales.dbf
   5 1024    9  8192        0 N       0 /app/oracle/oradata/freetouch/drp_200200
   4    6    6  8192   128320 N       0 /app/oracle/oradata/freetouch/users01.dbf
   7    7    7  8192   780288 N       0 /app/oracle/oradata/freetouch/undotbs03.dbf
  11    8    8  8192    25600 N       0 /app/oracle/oradata/freetouch/indx01.dbf
load control file 'control.txt' successful
loading dictionary data......done
loading scanned data......done

dul无法识别异常文件

[oracle@db dul]$ ./dul
Data UnLoader: 10.2.0.5.32 - Internal Only - on Sun Nov  2 23:34:42 2014
with 64-bit io functions
Copyright (c) 1994 2014 Bernard van Duijnen All rights reserved.
 Strictly Oracle Internal Use Only
DUL: Warning: ulimit process stack size is only 33554432
Found db_id = 270587870
Found db_name = VALUENET
DUL: Warning: Cannot verify file number for /app/oracle/oradata/freetouch/sales.dbf
DUL: Warning: First four bytes(76 162 0 0) of block 2 are not the start of a proper data block header
DUL: Warning: Block corruption or configuration error
DUL: Warning: Check db_block_size and/or osd_file_leader_size and/or file offset
DUL: Error: File Number can only be zero for Single Tablespace Datafiles
DUL: Warning: Cannot verify file number for /app/oracle/oradata/freetouch/drp_200200
DUL: Warning: First four bytes(76 162 0 0) of block 2 are not the start of a proper data block header
DUL: Warning: Block corruption or configuration error
DUL: Warning: Check db_block_size and/or osd_file_leader_size and/or file offset
DUL: Error: File Number can only be zero for Single Tablespace Datafiles
DUL> show datafiles;
ts# rf# start   blocks offs open  err file name
  0   1     0   194561    0    1    0 /app/oracle/oradata/freetouch/system01.dbf
  1   3     0   907521    0    1    0 /app/oracle/oradata/freetouch/sysaux01.dbf
  4   6     0   128321    0    1    0 /app/oracle/oradata/freetouch/users01.dbf
  7   7     0   780289    0    1    0 /app/oracle/oradata/freetouch/undotbs03.dbf
 11   8     0    25601    0    1    0 /app/oracle/oradata/freetouch/indx01.dbf
  6  10     0    45841    0    1    0 /app/oracle/oradata/freetouch/example01.dbf

该异常文件使用dul/odu均无法正常识别.证明文件头确实已经损坏

dbv 检测

[oracle@db trace]$ dbv file=/app/oracle/oradata/freetouch/drp_200200
DBVERIFY: Release 12.1.0.1.0 - Production on Sun Nov 2 14:08:34 2014
Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /app/oracle/oradata/freetouch/drp_200200
DBVERIFY - Verification complete
Total Pages Examined         : 194560
Total Pages Processed (Data) : 114596
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 26198
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 37787
Total Pages Processed (Seg)  : 1
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 15979
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 129603862 (0.129603862)
[oracle@db ~]$ dbv file=/app/oracle/oradata/freetouch/sales.dbf
DBVERIFY: Release 12.1.0.1.0 - Production on Sun Nov 2 23:12:05 2014
Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /app/oracle/oradata/freetouch/sales.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 655360
Total Pages Processed (Data) : 294938
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 233404
Total Pages Failing   (Index): 0
Total Pages Processed (Lob)  : 38
Total Pages Failing   (Lob)  : 0
Total Pages Processed (Other): 23252
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 103728
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 134665298 (0.134665298)

dbv检测结果无坏块,但是v$datafile_header和alert日志中报坏块,初步判断是由于该文件是bigfile,dbv未检测到文件头坏块,实际该该数据文件头损坏,其他block正常.所幸的是该库有9月份的rman备份(中间归档丢失),因此使用rman还原出来9月份的数据文件,然后使用dd拷贝两个 block(block 0和block 1)到异常文件.

[root@db freetouch]# dd if=/app1/oracle/oradata/freetouch/sales.dbf bs=8192 count=2 of=/tmp/odu/sales.2
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 0.0176368 s, 929 kB/s
[root@db freetouch]# dd if=/tmp/odu/sales.2 of=/app/oracle/oradata/freetouch/sales.dbf bs=8192 count=2 conv=notrunc
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 6.4281e-05 s, 255 MB/s
[root@db freetouch]# dd if=/app1/oracle/oradata/freetouch/drp_200200 bs=8192 count=2 of=/tmp/odu/drp_200200.2
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 0.0185934 s, 881 kB/s
[root@db freetouch]# dd if=/tmp/odu/drp_200200.2 of=/app/oracle/oradata/freetouch/drp_200200 bs=8192 count=2 conv=notrunc
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 6.4419e-05 s, 254 MB/s

尝试恢复数据库

SQL> recover database;
ORA-00283: recovery session canceled due to errors
ORA-01610: recovery using the BACKUP CONTROLFILE option must be done
SQL> recover database using backup controlfile;
ORA-00283: recovery session canceled due to errors
ORA-19909: datafile 4 belongs to an orphan incarnation
ORA-01110: data file 4: '/app/oracle/oradata/freetouch/sales.dbf'

使用bbed修改相关文件头,然后继续恢复
具体见:bbed解决ORA-01190类似方法处理

SQL> recover database using backup controlfile;
ORA-00279: change 129603904 generated at 11/02/2014 19:19:54 needed for thread
1
ORA-00289: suggestion :
/app/oracle/recovery_area/VALUENET/archivelog/2014_11_02/o1_mf_1_1_%u_.arc
ORA-00280: change 129603904 for thread 1 is in sequence #1
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel;
ORA-00308: cannot open archived log 'cancel;'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [2662], [0], [129603911], [0],
[129603913], [29360256], [], [], [], [], [], []
Process ID: 19881
Session ID: 1 Serial number: 3

出现ORA-600[2662]错误,因为scn相差比较小,重启数据库机器,出现ORA-600[4194]错误

SQL> startup pfile='/tmp/pfile.txt' mount
ORACLE instance started.
Total System Global Area 2505338880 bytes
Fixed Size                  2291472 bytes
Variable Size             973080816 bytes
Database Buffers         1526726656 bytes
Redo Buffers                3239936 bytes
Database mounted.
SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/app/oracle/oradata/freetouch/system01.dbf'
SQL> recover database ;
ORA-00283: recovery session canceled due to errors
ORA-16433: The database or pluggable database must be opened in read/write
mode.

重建控制文件后继续恢复

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00600: internal error code, arguments: [4194], [46], [19], [], [], [], [],
[], [], [], [], []
Process ID: 20351
Session ID: 1 Serial number: 3

设置undo_management=MANUAL然后继续恢复

[oracle@db tmp]$ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.1.0 Production on Sun Nov 2 19:29:45 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to an idle instance.
SQL> startup pfile='/tmp/pfile.txt'
ORACLE instance started.
Total System Global Area 2505338880 bytes
Fixed Size                  2291472 bytes
Variable Size             973080816 bytes
Database Buffers         1526726656 bytes
Redo Buffers                3239936 bytes
Database mounted.
Database opened.

这次的恢复也证明Oracle 12C确实有着越来越多的用户在使用.

ORA-27069: skgfdisp: 尝试在文件范围外执行 I/O

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-27069: skgfdisp: 尝试在文件范围外执行 I/O

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

接到网友技术支持请求,win 2003 ntfs格式文件系统,Oracle 8.1.7版本,主机重启后,数据库无法正常启动,offline datafile 15,数据库open成功,但是datafile 无法正常online,报错为:ORA-27069: skgfdisp: attempt to do I/O beyond the range of the file,请求协助处理

SQL> recover datafile 'D:\ORACLE\ORADATA\ORCL\ZSF_DATA.DBF';
ORA-00283: 恢复会话因错误而取消
ORA-01115: 从文件 15 读取块时出现 IO 错误 (块 # 1030071)
ORA-01110: 数据文件 15: 'D:\ORACLE\ORADATA\ORCL\ZSF_DATA.DBF'
ORA-27069: skgfdisp: 尝试在文件范围外执行 I/O
OSD-04026: 无效的参数经过. (OS 1030071)

使用bbed,成功online datafile 15

Tue Oct 28 16:30:35 2014
ALTER DATABASE RECOVER  datafile 15
Tue Oct 28 16:30:35 2014
Media Recovery Datafile: 15
Media Recovery Start
Media Recovery Log
Recovery of Online Redo Log: Thread 1 Group 1 Seq 245110 Reading mem 0
  Mem# 0 errs 0: D:\ORACLE\ORADATA\ORCL\REDO03.LOG
Media Recovery failed with error 1115
ORA-283 signalled during: ALTER DATABASE RECOVER  datafile 15  ...
Tue Oct 28 16:32:53 2014
Shutting down instance (abort)
License high water mark = 6
Instance terminated by USER, pid = 1548
Starting up ORACLE RDBMS Version: 8.1.7.0.0.
System parameters with non-default values:
  processes                = 600
  shared_pool_size         = 52428800
  large_pool_size          = 20971520
  java_pool_size           = 20971520
  control_files            = D:\oracle\oradata\ORCL\control01.ctl, D:\oracle\oradata\ORCL\control02.ctl
  db_block_buffers         = 19200
  db_block_size            = 8192
  compatible               = 8.1.0
  log_buffer               = 32768
  log_checkpoint_interval  = 10000
  log_checkpoint_timeout   = 1800
  db_files                 = 1024
  db_file_multiblock_read_count= 8
  max_enabled_roles        = 30
  remote_login_passwordfile= EXCLUSIVE
  global_names             = TRUE
  distributed_transactions = 500
  instance_name            = ORCL
  service_names            = ORCL
  mts_dispatchers          = (PROTOCOL=TCP)(PRE=oracle.aurora.server.SGiopServer)
  open_links               = 4
  sort_area_size           = 65536
  sort_area_retained_size  = 65536
  db_name                  = ORCL
  open_cursors             = 500
  ifile                    = D:\oracle\admin\ORCL\pfile\init.ora
  os_authent_prefix        =
  job_queue_processes      = 4
  job_queue_interval       = 10
  parallel_max_servers     = 5
  background_dump_dest     = D:\oracle\admin\ORCL\bdump
  user_dump_dest           = D:\oracle\admin\ORCL\udump
  max_dump_file_size       = 10240
  oracle_trace_collection_name=
PMON started with pid=2
DBW0 started with pid=3
LGWR started with pid=4
CKPT started with pid=5
SMON started with pid=6
RECO started with pid=7
SNP0 started with pid=8
SNP1 started with pid=9
SNP2 started with pid=10
SNP3 started with pid=11
Tue Oct 28 16:33:01 2014
starting up 1 shared server(s) ...
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Tue Oct 28 16:33:02 2014
ALTER DATABASE   MOUNT
Tue Oct 28 16:33:06 2014
Successful mount of redo thread 1, with mount id 1389958722.
Tue Oct 28 16:33:06 2014
Database mounted in Exclusive Mode.
Completed: ALTER DATABASE   MOUNT
Tue Oct 28 16:33:49 2014
ALTER DATABASE RECOVER  database until cancel
Tue Oct 28 16:33:49 2014
Media Recovery Start
Media Recovery Log
kcrrga: Warning.  Log sequence in archive filename wrapped
to fix length as indicated by %S in LOG_ARCHIVE_FORMAT.
Old log archive with same name might be overwritten.
ORA-279 signalled during: ALTER DATABASE RECOVER  database until cancel  ...
Tue Oct 28 16:34:03 2014
ALTER DATABASE RECOVER    LOGFILE 'D:\ORACLE\ORADATA\ORCL\REDO02.LOG'
Tue Oct 28 16:34:03 2014
Media Recovery Log D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Incomplete recovery applied all redo ever generated.
Recovery completed through change %s139866389
Media Recovery Complete
Completed: ALTER DATABASE RECOVER    LOGFILE 'D:\ORACLE\ORADA
Tue Oct 28 16:34:29 2014
alter database datafile 15 online
Tue Oct 28 16:34:29 2014
Completed: alter database datafile 15 online
Tue Oct 28 16:34:36 2014
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 139866389
Tue Oct 28 16:34:38 2014
Thread 1 opened at log sequence 1
  Current log# 2 seq# 1 mem# 0: D:\ORACLE\ORADATA\ORCL\REDO02.LOG
Successful open of redo thread 1.
Tue Oct 28 16:34:38 2014
SMON: enabling cache recovery
Tue Oct 28 16:34:38 2014
Dictionary check beginning
Dictionary check complete
Tue Oct 28 16:34:39 2014
SMON: enabling tx recovery
Tue Oct 28 16:34:44 2014
Completed: alter database open resetlogs

数据库datafile 15 online成功后,客户操作业务继续发生ORA-600[ktsxs_add2]错误

Tue Oct 28 17:07:42 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []
Tue Oct 28 17:07:53 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []
Tue Oct 28 17:08:03 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []
Tue Oct 28 17:08:16 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []
Tue Oct 28 17:08:23 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02308.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []
Tue Oct 28 17:08:31 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02340.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []
Tue Oct 28 17:08:38 2014
Errors in file D:\oracle\admin\ORCL\udump\ORA02308.TRC:
ORA-00600: 内部错误代码,自变量: [ktsxs_add2], [14], [15], [42534], [5732], [5733], [], []

通过分析相关日志发现是insert插入表报错,很好理解,该库的datafile 15已经超过了系统的限制,现在继续插入数据,因此报错,查询可能异常对象

SQL> col segment_name for a20
SQL> SELECT distinct OWNER, SEGMENT_NAME, SEGMENT_TYPE, A.PARTITION_NAME
  2    FROM DBA_EXTENTS A
  3   WHERE FILE_ID = 15
  4     AND 1030071 <= BLOCK_ID;
OWNER                          SEGMENT_NAME         SEGMENT_TYPE
------------------------------ -------------------- ------------------
PARTITION_NAME
------------------------------
ZSF                            DETAIL               TABLE
ZSF                            DETAIL1              INDEX
ZSF                            DETAIL2              INDEX
OWNER                          SEGMENT_NAME         SEGMENT_TYPE
------------------------------ -------------------- ------------------
PARTITION_NAME
------------------------------
ZSF                            DETAIL3              INDEX
ZSF                            DETAIL4              INDEX
ZSF                            FK_RECI_ORD          INDEX
OWNER                          SEGMENT_NAME         SEGMENT_TYPE
------------------------------ -------------------- ------------------
PARTITION_NAME
------------------------------
ZSF                            PREPAY1              INDEX
ZSF                            RECEDETAIL1          INDEX

创建新表空间

Create tablespace zsf_new datafile  'D:\ORACLE\ORADATA\ORCL\ZSF_DATA_new01.dbf' size 4096m;
alter tablespace zsf_new add datafile 'D:\ORACLE\ORADATA\ORCL\ZSF_DATA_new02.dbf'
size 128m autoextend on next 128M maxsize 4096m;

迁移异常对象到新表空间

alter table ZSF.DETAIL move tablespace ZSF_new;
alter index ZSF.DETAIL1 rebuild tablespace ZSF_new;
alter index ZSF.DETAIL2 rebuild tablespace ZSF_new;
alter index ZSF.DETAIL3 rebuild tablespace ZSF_new;
alter index ZSF.DETAIL4 rebuild tablespace ZSF_new;
alter index ZSF.FK_RECI_ORD rebuild tablespace ZSF_new;
alter index ZSF.PREPAY1 rebuild tablespace ZSF_new;
alter index ZSF.RECEDETAIL1 rebuild tablespace ZSF_new;

然后对于datafile 15所在表空间增加新文件,因为已经迁移了异常对象,然后resize datafile 15小于8G,关闭自扩展,至此该数据库恢复完成

bootstrap$核心index(I_OBJ1,I_USER1,I_FILE#_BLOCK#,I_IND1,I_TS#,I_CDEF1等)异常恢复—ORA-00701错误解决

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:bootstrap$核心index(I_OBJ1,I_USER1,I_FILE#_BLOCK#,I_IND1,I_TS#,I_CDEF1等)异常恢复—ORA-00701错误解决

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在Oracle使用的过程中,经常遭遇一些核心index出现异常,导致数据库无法正常使用,特别是在数据库open的情况下,因为出现这些bootstrap$中的部分index是无法通过设置event38003进行重建,从而导致数据库无法正常使用,最常见异常index 有:I_ICOL1, I_TS1, I_CDEF1, I_CDEF2, I_CDEF3, I_CDEF4, I_PROXY_DATA$, I_IND1, I_TS#, I_UNDO1, I_UNDO2, I_COBJ#, I_USER1, I_USER2, I_CON1, I_CON2, I_FILE1, I_FILE2, I_FILE#_BLOCK#, I_USER#, I_OBJ#, I_PROXY_ROLE_DATA$_1, I_PROXY_ROLE_DATA$_2, I_CCOL1, I_CCOL2, I_TAB1, I_COL1, I_COL2, I_COL3, I_OBJ1, I_OBJ2, I_OBJ3, I_OBJ4, I_OBJ5
重建的index异常报错有:ORA-00701,ORA-01410,ORA-08102,ORA-08103,ORA-01578,ORA-01499,ORA-00600等各类错误
因为处理细节复杂,无法一一描述出来.这里假设Oracle数据库一个非常核心的表OBJ$中的I_OBJ1异常,现在准备重建它,大概步骤如下:
1. 假设异常index信息(I_OBJ1)

SQL> alter index sys.i_obj1 rebuild;
alter index sys.i_obj1 rebuild
*
ERROR at line 1:
ORA-00701: object necessary for warmstarting database cannot be altered
SQL> SELECT SQL_TEXT FROM BOOTSTRAP$ WHERE SQL_TEXT LIKE '%obj%';
SQL_TEXT
--------------------------------------------------------------------------------
CREATE UNIQUE INDEX I_OBJ1 ON OBJ$(OBJ#,OWNER#,TYPE#) PCTFREE 10 INITRANS 2 MAXT
RANS 255 STORAGE (  INITIAL 64K NEXT 1024K MINEXTENTS 1 MAXEXTENTS 2147483645 PC
TINCREASE 0 OBJNO 36 EXTENTS (FILE 1 BLOCK 336))

2. 创建获取异常index所属表

SQL>  select table_name from dba_indexes where owner='SYS' AND INDEX_NAME='I_OBJ1';
TABLE_NAME
--------------------------------------------------------------------------------
OBJ$

3. 获取obj$ ddl 语句

SQL> set long 10000
SQL> set pages 1000
SQL>  SELECT DBMS_METADATA.GET_DDL('TABLE','OBJ$','SYS') FROM DUAL;
DBMS_METADATA.GET_DDL('TABLE','OBJ$','SYS')
--------------------------------------------------------------------------------
  CREATE TABLE "SYS"."OBJ$"
   (    "OBJ#" NUMBER NOT NULL ENABLE,
        "DATAOBJ#" NUMBER,
        "OWNER#" NUMBER NOT NULL ENABLE,
        "NAME" VARCHAR2(128) NOT NULL ENABLE,
        "NAMESPACE" NUMBER NOT NULL ENABLE,
        "SUBNAME" VARCHAR2(128),
        "TYPE#" NUMBER NOT NULL ENABLE,
        "CTIME" DATE NOT NULL ENABLE,
        "MTIME" DATE NOT NULL ENABLE,
        "STIME" DATE NOT NULL ENABLE,
        "STATUS" NUMBER NOT NULL ENABLE,
        "REMOTEOWNER" VARCHAR2(128),
        "LINKNAME" VARCHAR2(128),
        "FLAGS" NUMBER,
        "OID$" RAW(16),
        "SPARE1" NUMBER,
        "SPARE2" NUMBER,
        "SPARE3" NUMBER,
        "SPARE4" VARCHAR2(1000),
        "SPARE5" VARCHAR2(1000),
        "SPARE6" DATE,
        "SIGNATURE" RAW(16),
        "SPARE7" NUMBER,
        "SPARE8" NUMBER,
        "SPARE9" NUMBER
   ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 NOCOMPRESS LOGGING
  STORAGE(INITIAL 16384 NEXT 106496 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"

4. 获取obj$相关index 语句

SQL>  SELECT DBMS_METADATA.GET_DDL('INDEX',INDEX_NAME,'SYS') FROM dba_indexes where owner='SYS' AND TABLE_NAME='OBJ$';
DBMS_METADATA.GET_DDL('INDEX',INDEX_NAME,'SYS')
--------------------------------------------------------------------------------
  CREATE UNIQUE INDEX "SYS"."I_OBJ5" ON "SYS"."OBJ$" ("SPARE3", "NAME", "NAMESPA
CE", "TYPE#", "OWNER#", "REMOTEOWNER", "LINKNAME", "SUBNAME", "OBJ#")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"
  CREATE INDEX "SYS"."I_OBJ4" ON "SYS"."OBJ$" ("DATAOBJ#", "TYPE#", "OWNER#")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"
  CREATE INDEX "SYS"."I_OBJ3" ON "SYS"."OBJ$" ("OID$")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"
  CREATE UNIQUE INDEX "SYS"."I_OBJ2" ON "SYS"."OBJ$" ("OWNER#", "NAME", "NAMESPA
CE", "REMOTEOWNER", "LINKNAME", "SUBNAME", "TYPE#", "SPARE3", "OBJ#")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 16384 NEXT 106496 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"
  CREATE UNIQUE INDEX "SYS"."I_OBJ1" ON "SYS"."OBJ$" ("OBJ#", "OWNER#", "TYPE#")
  PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"

5. 创建obj$img表和相关index

SQL>   CREATE TABLE "SYS"."OBJ$IMG"
  2     (    "OBJ#" NUMBER NOT NULL ENABLE,
  3          "DATAOBJ#" NUMBER,
  4          "OWNER#" NUMBER NOT NULL ENABLE,
  5          "NAME" VARCHAR2(128) NOT NULL ENABLE,
  6          "NAMESPACE" NUMBER NOT NULL ENABLE,
  7          "SUBNAME" VARCHAR2(128),
  8          "TYPE#" NUMBER NOT NULL ENABLE,
  9          "CTIME" DATE NOT NULL ENABLE,
 10          "MTIME" DATE NOT NULL ENABLE,
 11          "STIME" DATE NOT NULL ENABLE,
 12          "STATUS" NUMBER NOT NULL ENABLE,
 13          "REMOTEOWNER" VARCHAR2(128),
 14          "LINKNAME" VARCHAR2(128),
 15          "FLAGS" NUMBER,
 16          "OID$" RAW(16),
 17          "SPARE1" NUMBER,
 18          "SPARE2" NUMBER,
 19          "SPARE3" NUMBER,
 20          "SPARE4" VARCHAR2(1000),
 21          "SPARE5" VARCHAR2(1000),
 22          "SPARE6" DATE,
 23          "SIGNATURE" RAW(16),
 24          "SPARE7" NUMBER,
 25          "SPARE8" NUMBER,
 26          "SPARE9" NUMBER
 27     ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 28   NOCOMPRESS LOGGING
 29    STORAGE(INITIAL 16384 NEXT 106496 MINEXTENTS 1 MAXEXTENTS 2147483645
 30    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
 31    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
 32    TABLESPACE "SYSTEM"
 33  ;
Table created.
SQL>
SQL>   CREATE UNIQUE INDEX "SYS"."I_OBJ_5IMG" ON "SYS"."OBJ$IMG" ("SPARE3", "NAME", "NAMESPACE", "TYPE#", "OWNER#", "REMOTEOWNER", "LINKNAME", "SUBNAME", "OBJ#")
  2    PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  3    STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  4    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  5    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  6    TABLESPACE "SYSTEM";
Index created.
SQL>   CREATE INDEX "SYS"."I_OBJ4IMG" ON "SYS"."OBJ$IMG" ("DATAOBJ#", "TYPE#", "OWNER#")
  2    PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  3    STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  4    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  5    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  6    TABLESPACE "SYSTEM";
Index created.
SQL>   CREATE INDEX "SYS"."I_OBJ3IMG" ON "SYS"."OBJ$IMG" ("OID$")
  2    PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  3    STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  4    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  5    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  6    TABLESPACE "SYSTEM";
Index created.
SQL>   CREATE UNIQUE INDEX "SYS"."I_OBJ2IMG" ON "SYS"."OBJ$IMG" ("OWNER#", "NAME", "NAMESPACE", "REMOTEOWNER", "LINKNAME", "SUBNAME", "TYPE#", "SPARE3", "OBJ#")
  2    PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  3    STORAGE(INITIAL 16384 NEXT 106496 MINEXTENTS 1 MAXEXTENTS 2147483645
  4    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  5    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  6    TABLESPACE "SYSTEM";
Index created.
SQL>   CREATE UNIQUE INDEX "SYS"."I_OBJ1IMG" ON "SYS"."OBJ$IMG" ("OBJ#", "OWNER#", "TYPE#")
  2    PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS
  3    STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  4    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  5    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  6    TABLESPACE "SYSTEM"
  7  ;
Index created.

6. 获取bootstrap$语句

SQL> SELECT DBMS_METADATA.GET_DDL('TABLE','BOOTSTRAP$','SYS') FROM DUAL;
DBMS_METADATA.GET_DDL('TABLE','BOOTSTRAP$','SYS')
--------------------------------------------------------------------------------
  CREATE TABLE "SYS"."BOOTSTRAP$"
   (    "LINE#" NUMBER NOT NULL ENABLE,
        "OBJ#" NUMBER NOT NULL ENABLE,
        "SQL_TEXT" VARCHAR2(4000) NOT NULL ENABLE
   ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 NOCOMPRESS LOGGING
  STORAGE(INITIAL 57344 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "SYSTEM"

7. 创建bootstrap$img表

SQL>   CREATE TABLE "SYS"."BOOTSTRAP$IMG"
  2     (    "LINE#" NUMBER NOT NULL ENABLE,
  3          "OBJ#" NUMBER NOT NULL ENABLE,
  4          "SQL_TEXT" VARCHAR2(4000) NOT NULL ENABLE
  5     ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
  6   NOCOMPRESS LOGGING
  7    STORAGE(INITIAL 57344 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  8    PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  9    BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
 10    TABLESPACE "SYSTEM"
 11  /
Table created.

8.插入数据

insert into obj$img select * from obj$;
insert into bootstrap$img select * from bootstrap$;
commit;

9. 删除bootstrap$img对象名

delete from bootstrap$img where obj# in
 (select obj# from obj$
  where name in ('OBJ$',  'I_OBJ1',  'I_OBJ2', 'I_OBJ3', 'I_OBJ4', 'I_OBJ5',
                 'BOOTSTRAP$'));
commit;

10. 插入新创建对象

insert into bootstrap$img select * from bootstrap$tmpstr;
commit;

11. 关闭数据库

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

12. bbed修改相关block值

[oracle@oel6 ~]$ bbed
Password:
BBED: Release 2.0.0.0.0 - Limited Production on Thu Aug 7 04:54:50 2014
Copyright (c) 1982, 2013, Oracle and/or its affiliates.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename '/u01/app/oracle/oradata/XIFENFEI/system01.dbf'
        FILENAME        /u01/app/oracle/oradata/XIFENFEI/system01.dbf
BBED> set blocksize 8192
        BLOCKSIZE       8192
BBED> set block 1
        BLOCK#          1
BBED> set mode edit
        MODE            Edit
BBED> set count 32
        COUNT           32
BBED> m /x e81d
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/app/oracle/oradata/XIFENFEI/system01.dbf (0)
 Block: 1                Offsets:   196 to  227           Dba:0x00000000
------------------------------------------------------------------------
 e81d4000 12000000 00000000 41bad632 15bad632 01000000 00000000 00000000
 <32 bytes per line>
BBED> m /x 5200 offset +2
 File: /u01/app/oracle/oradata/XIFENFEI/system01.dbf (0)
 Block: 1                Offsets:   198 to  229           Dba:0x00000000
------------------------------------------------------------------------
 52001200 00000000 000041ba d63215ba d6320100 00000000 00000000 00000000
 <32 bytes per line>
BBED> d offset 96
 File: /u01/app/oracle/oradata/XIFENFEI/system01.dbf (0)
 Block: 1                Offsets:   196 to  227           Dba:0x00000000
------------------------------------------------------------------------
 e81d5200 12000000 00000000 41bad632 15bad632 01000000 00000000 00000000
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 1:
current = 0x5fbf, required = 0x5fbf
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/app/oracle/oradata/XIFENFEI/system01.dbf
BLOCK = 1
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED
BBED> exit

13. 启动数据库

SQL> startup
ORACLE instance started.
Total System Global Area  663945216 bytes
Fixed Size                  2291808 bytes
Variable Size             369100704 bytes
Database Buffers          289406976 bytes
Redo Buffers                3145728 bytes
Database mounted.
Database opened.

14. 验证I_OBJ1 index重建情况

SQL> SELECT OBJECT_ID,DATA_OBJECT_ID,CREATED FROM DBA_OBJECTS WHERE OBJECT_NAME='I_OBJ1';
 OBJECT_ID DATA_OBJECT_ID CREATED
---------- -------------- ---------
     77120          77120 06-AUG-14
SQL> !date
Thu Aug  6 05:29:25 CST 2014
SQL> SELECT HEADER_FILE,HEADER_BLOCK FROM DBA_SEGMENTS WHERE SEGMENT_NAME='I_OBJ1';
HEADER_FILE HEADER_BLOCK
----------- ------------
          1        77296
SQL> SELECT SQL_TEXT FROM BOOTSTRAP$ WHERE SQL_TEXT LIKE '%obj%';
SQL_TEXT
-----------------------------------------------------------------------------------------------------------------------------------------------
create unique index i_obj1 on obj$(obj#, owner#, type#) PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE (  INITIAL 65536 NEXT 1048576 MINEXTENTS 1 M
AXEXTENTS 2147483645 PCTINCREASE 0 OBJNO 77120 EXTENTS (FILE 1 BLOCK 77296))
create unique index i_obj2 on obj$(owner#, name, namespace,remoteowner, linkname, subname, type#, spare3, obj#) PCTFREE 10 INITRANS 2 MAXTRANS
255 STORAGE (  INITIAL 16384 NEXT 106496 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 OBJNO 77119 EXTENTS (FILE 1 BLOCK 77288))
create index i_obj3 on obj$(oid$) PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE (  INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 P
CTINCREASE 0 OBJNO 77118 EXTENTS (FILE 1 BLOCK 77280))
create index i_obj4 on obj$(dataobj#, type#, owner#) PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE (  INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXE
XTENTS 2147483645 PCTINCREASE 0 OBJNO 77117 EXTENTS (FILE 1 BLOCK 77272))
create unique index i_obj5 on obj$(spare3, name, namespace, type#, owner#, remoteowner, linkname, subname, obj#) PCTFREE 10 INITRANS 2 MAXTRANS
 255 STORAGE (  INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 OBJNO 77116 EXTENTS (FILE 1 BLOCK 77264))

不同版本对应的基表和index结构不一样(本篇blog使用的是12c做试验),但是本方法支持数据库从Oracle 7到Oracle 12c对于bootstrap$中核心index处理.
因为重建bootstrap$中的核心index是一个复杂而且风险非常大的事情,在你无法确定风险或者无法正常完成此类操作之时,如有需要请联系我
Phone:17813235971    Q Q:107644445    E-Mail:dba@xifenfei.com