rman制造坏块,bbed修复坏块

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:rman制造坏块,bbed修复坏块

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在rman中有隐藏的命令clear,可以标记数据块为corrupt,从而实现数据库坏块试验,本篇blog通过dbv,bbed,table select来验证坏块的出现和修复
创建测试表

SQL>  create table t_xifenfei as
  2   select object_id,object_name from dba_objects where rownum<10;
Table created.
SQL>  select rowid,
  2   dbms_rowid.rowid_relative_fno(rowid) rel_fno,
  3   dbms_rowid.rowid_block_number(rowid) block
  4   from chf.t_xifenfei;
ROWID                 REL_FNO      BLOCK
------------------ ---------- ----------
AAAStAAAEAAAACrAAA          4        171
AAAStAAAEAAAACrAAB          4        171
AAAStAAAEAAAACrAAC          4        171
AAAStAAAEAAAACrAAD          4        171
AAAStAAAEAAAACrAAE          4        171
AAAStAAAEAAAACrAAF          4        171
AAAStAAAEAAAACrAAG          4        171
AAAStAAAEAAAACrAAH          4        171
AAAStAAAEAAAACrAAI          4        171
9 rows selected.
SQL> alter system checkpoint;
System altered.

dbv验证坏块

[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY: Release 11.2.0.3.0 - Production on Sun Jan 20 09:12:16 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 7680
Total Pages Processed (Data) : 3776
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 167
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 744
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2993
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1006717955 (2956.1006717955)

这里创建了t_xifenfei表,数据存储在file 4 block 171中,现在该block一切正常,本试验就是要通过rman来使得该block corrupt,然后通过bbed来修复

bbed查看kcbh

BBED> set block 171
        BLOCK#          171
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x010000ab
   ub4 bas_kcbh                             @8        0x3c014bfe
   ub2 wrp_kcbh                             @12       0x0b8c
   ub1 seq_kcbh                             @14       0x02
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x6e0c  <--重点关注
   ub2 spare3_kcbh                          @18       0x0000

rman标记坏块

RMAN> BLOCKRECOVER DATAFILE 4 block 171 clear;
Starting recover at 20-JAN-13
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=18 device type=DISK
Finished recover at 20-JAN-13

dbv再次检查坏块

[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY: Release 11.2.0.3.0 - Production on Sun Jan 20 09:53:16 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/users01.dbf
Page 171 is marked corrupt
Corrupt block relative dba: 0x010000ab (file 4, block 171)
Bad check value found during dbv:
Data in bad block:
 type: 6 format: 2 rdba: 0x010000ab
 last change scn: 0x0b8c.3c014bfe seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bfe0602
 check value in block header: 0x6e0c
 computed block checksum: 0xb5bc
DBVERIFY - Verification complete
Total Pages Examined         : 7680
Total Pages Processed (Data) : 3775
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 167
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 744
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2993
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 1
Total Pages Encrypted        : 0
Highest block SCN            : 1006717955 (2956.1006717955)

尝试查询数据

SQL> select count(*) from t_xifenfei;
select count(*) from t_xifenfei
                     *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 171)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'

bbed验证坏块

BBED> set block 171
        BLOCK#          171
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x010000ab
   ub4 bas_kcbh                             @8        0x3c014bfe
   ub2 wrp_kcbh                             @12       0x0b8c
   ub1 seq_kcbh                             @14       0x02
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x6e0c  <--注意该值未变化
   ub2 spare3_kcbh                          @18       0x0000
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 171
Block 171 is corrupt
Corrupt block relative dba: 0x010000ab (file 0, block 171)
Bad check value found during verification
Data in bad block:
 type: 6 format: 2 rdba: 0x010000ab
 last change scn: 0x0b8c.3c014bfe seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bfe0602
 check value in block header: 0x6e0c
 computed block checksum: 0xb5bc
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED

这里通过dbv,bbed,select table都证明rman能够标记block为Corrupt.
这里需要分析:block已经被标记,那证明该块肯定有修改,也就是说chkval_kcbh一定要变化,但是这里没有变化,证明该处异常

bbed修复rman生成坏块

BBED> sum apply
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
Check value for File 0, Block 171:
current = 0xdbb0, required = 0xdbb0
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 171
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED

bbed测试坏块已经修复

dbv验证坏块

[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY: Release 11.2.0.3.0 - Production on Sun Jan 20 10:01:46 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 7680
Total Pages Processed (Data) : 3776
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 167
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 744
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2993
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1006717955 (2956.1006717955)

dbv测试坏块也被修复

查询表验证

SQL> select count(*) from t_xifenfei;
  COUNT(*)
----------
         9

数据也可以正常查询,证明rman的坏块问题通过bbed解决

总结说明
1.我们可以通过rman的clear命令来标记坏块(BLOCKRECOVER DATAFILE file# BLOCK block1#, block2#, block3#… CLEAR 😉
2.我们可以通过bbed的sum apply命令来修复该类型坏块

ORA-600 kghstack_free2异常恢复

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-600 kghstack_free2异常恢复

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

今天接触一案例ORA-00600[kghstack_free2],恢复起来很简单,顺便记录下,供其他人遇到类似情况参考.

Wed Feb 20 15:05:04 2013
SMON: enabling cache recovery
Wed Feb 20 15:05:05 2013
Successfully onlined Undo Tablespace 1.
Wed Feb 20 15:05:05 2013
SMON: enabling tx recovery
Wed Feb 20 15:05:06 2013
Errors in file f:\oracle\product\10.2.0\admin\cksoltp\bdump\cksoltp_smon_3556.trc:
ORA-00600: internal error code, arguments: [kghstack_free2], [], [], [], [], [], [], []
Wed Feb 20 15:05:06 2013
Database Characterset is AL32UTF8
replication_dependency_tracking turned off (no async multimaster replication found)
Wed Feb 20 15:05:08 2013
Errors in file f:\oracle\product\10.2.0\admin\cksoltp\bdump\cksoltp_smon_3556.trc:
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [_kghstack_err+101]
[PC:0x603CC77F] [ADDR:0x59004E04] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kghstack_free2], [], [], [], [], [], [], []
Starting background process QMNC
QMNC started with pid=18, OS id=3504
Wed Feb 20 15:05:13 2013
Errors in file f:\oracle\product\10.2.0\admin\cksoltp\bdump\cksoltp_pmon_4064.trc:
ORA-00474: SMON process terminated with error

这个错误很明显是因为smon在回滚事务的时候,遇到异常从而出现ORA-00600[kghstack_free2]错误,使得数据库不能被正常open,通过分析trace文件发现回滚段36中有事务未提交,需要回滚,针对这样的数据库恢复方法很简单:1.屏蔽事务回滚,2.屏蔽回滚段强制offline等
1) EVENT=”10513 trace name context forever”
2)_offline_rollback_segments= _SYSSMU36$
数据库启动后,通过分析trace文件,找出来异常对象,然后重建该对象,除掉event/_offline_rollback_segments即可

通过rowid获取segment header坏块数据

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:通过rowid获取segment header坏块数据

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在上篇(table中各种坏块对select/dml操作影响)中说到如果segment header异常了,不能通过ctas来获得相关数据,在群的讨论中,EZIO说到可以通过rowid方式来获得相关数据,通过测试证明,确实可以通过该方法获得数据,以后遇到此类错误,大家也不必惊慌.
创建测试表

SQL> create table t_xifenfei
  2  as
  3  select * from dba_objects where object_id is not null;
SQL> alter table t_xifenfei
  2  add constraint PK_t_xifenfei primary key (object_id)
  3  ;
Table altered.
SQL> alter system checkpoint;
System altered.
SQL> select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74762
SQL> select header_file,header_block from
  2  DBA_SEGments where segment_name='T_XIFENFEI' AND OWNER='CHF';
HEADER_FILE HEADER_BLOCK
----------- ------------
          4          170

dump block

alter system dump datafile 4 block 170;
Dump of buffer cache at level 4 for tsn=4 rdba=16777386
Block dump from disk:
buffer tsn: 4 rdba: 0x010000aa (4/170)
scn: 0x0b8c.3c0092e4 seq: 0x01 flg: 0x04 tail: 0x91e42301
frmt: 0x02 chkval: 0xa531 type: 0x23=PAGETABLE SEGMENT HEADER

通过header_block和dump block确定block 170即为PAGETABLE SEGMENT HEADER

bbed制造SEGMENT HEADER坏块

BBED> set block 170
        BLOCK#          170
BBED> set offset 8188
        OFFSET          8188
BBED> d
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 170              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0123e492
 <32 bytes per line>
BBED> m /x 0123e491
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 170              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0123e491
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 170:
current = 0xa531, required = 0xa531
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 170
Block 170 is corrupt
Corrupt block relative dba: 0x010000aa (file 0, block 170)
Fractured block found during verification
Data in bad block:
 type: 35 format: 2 rdba: 0x010000aa
 last change scn: 0x0b8c.3c0092e4 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x91e42301
 check value in block header: 0xa531
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED

测试segment header坏块后select操作

SQL> select * from chf.t_xifenfei;
select * from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 170)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
SQL> select /*+index(t PK_T_XIFENFEI)*/  count(rowid) from chf.t_xifenfei t;
COUNT(ROWID)
------------
       74762

基于rowid获取segment header 坏块对象数据

SQL> create table chf.bad_rows (table_name varchar2(60),
   2 row_id rowid, oracle_error_code number);
Table created.
SQL> DECLARE
  2   TYPE RowIDTab IS TABLE OF ROWID INDEX BY BINARY_INTEGER;
  3   CURSOR c1 IS  select /*+index(t PK_T_XIFENFEI)*/ rowid from chf.t_xifenfei t;
  4   r RowIDTab;
  5   rows  NATURAL := 20000;
  6   bad_rows number := 0 ;
  7   errors number;
  8   error_code number;
  9   myrowid rowid;
 10  BEGIN
 11   OPEN c1;
 12   LOOP
 13     FETCH  c1 BULK COLLECT INTO r LIMIT rows;
 14     EXIT WHEN r.count=0;
 15     BEGIN
 16      FORALL i IN r.FIRST..r.LAST SAVE EXCEPTIONS
 17       insert into chf.t_xifenfei_new
 18       select /*+ ROWID(A) */ *
 19       from chf.t_xifenfei A where rowid = r(i);
 20     EXCEPTION
 21     when OTHERS then
 22      BEGIN
 23       errors := SQL%BULK_EXCEPTIONS.COUNT;
 24       FOR err1 IN 1..errors LOOP
 25           error_code := SQL%BULK_EXCEPTIONS(err1).ERROR_CODE;
 26           myrowid := r(SQL%BULK_EXCEPTIONS(err1).ERROR_INDEX);
 27           bad_rows := bad_rows + 1;
 28           insert into chf.bad_rows values('chf.t_xifenfei',myrowid, error_code);
 29       END LOOP;
 30       END;
 31     END;
 32    commit;
 33   END LOOP;
 34   commit;
 35   CLOSE c1;
 36   dbms_output.put_line('Total Bad Rows: '||bad_rows);
 37  END;
 38  /
Total Bad Rows: 0
PL/SQL procedure successfully completed.
SQL> select count(*) from chf.t_xifenfei_new;
  COUNT(*)
----------
     74762

通过上面pl/sql,基于rowid成功获得segment header 异常对象中的所有数据记录.如果没有主键的表出现该问题,可以参考:使用plsql抢救数据

table中各种类型block坏块是否能被跳过

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:table中各种类型block坏块是否能被跳过

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

在table遇到的各种坏块中,大部分情况,我们都可以通过设置event 10231或者dbms_repair来跳过坏块,抢救其他数据;但是在部分情况下,我们设置了他们依然不能跳过坏块,数据库依然报ORA-01578,本文测试了table中各种类型的block,证明在哪些blog出现异常之后不能被跳过.
如果是事务数据块出现坏块可以通过dbms_repair.skip_corrupt_blocks来标记坏块(也可以使用event 10231)来实现;对于BITMAP BLOCK如果出现坏块不会对于读取数据无影响(至于其他操作,请见后续blog);SEGMENT HEADER如果出现坏块,无法通过跳过坏块来实现获取其他数据(正常block)
创建测试表

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> create table t_xifenfei
  2  tablespace users
  3  as
  4  select * from dba_objects;
Table created.
SQL>  select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74663

查询相关block信息

SQL> select SEGMENT_NAME,HEADER_FILE,HEADER_BLOCK,blocks,extents from DBA_SEGMENTS
  2  WHERE OWNER='CHF' AND SEGMENT_NAME='T_XIFENFEI';
SEGMENT_NAME    HEADER_FILE HEADER_BLOCK     BLOCKS    EXTENTS
--------------- ----------- ------------ ---------- ----------
T_XIFENFEI                4          378       1152         24
SQL>   select
  2    dbms_rowid.rowid_relative_fno(rowid)rel_fno,
  3    max(dbms_rowid.rowid_block_number(rowid)) max_block,
  4    min(dbms_rowid.rowid_block_number(rowid)) min_block
  5    from chf.t_xifenfei
  6    group by dbms_rowid.rowid_relative_fno(rowid);
   REL_FNO  MAX_BLOCK  MIN_BLOCK
---------- ---------- ----------
         4       1728        379
SQL> select EXTENT_ID,FILE_ID,BLOCK_ID,blocks from dba_extents where owner='CHF'
   2  AND SEGMENT_NAME='T_XIFENFEI';
 EXTENT_ID    FILE_ID   BLOCK_ID     BLOCKS
---------- ---------- ---------- ----------
         0          4        376          8
         1          4        640          8
         2          4        648          8
         3          4        656          8
         4          4        664          8
         5          4        672          8
         6          4        680          8
         7          4        688          8
         8          4        696          8
         9          4        704          8
        10          4        712          8
        11          4        720          8
        12          4        728          8
        13          4        736          8
        14          4        744          8
        15          4        752          8
        16          4        768        128
        17          4        896        128
        18          4       1024        128
        19          4       1152        128
        20          4       1280        128
        21          4       1408        128
        22          4       1536        128
        23          4       1664        128

通过这里可以知道:真正的存储数据是从block 379开始,至于block 376、377、378是什么,使用dump block分析

验证block类型

SQL> alter system dump datafile 4 block 376;
System altered.
SQL> alter system dump datafile 4 block 377;
System altered.
SQL> alter system dump datafile 4 block 378;
System altered.
SQL> alter system dump datafile 4 block 379;
System altered.
--该block是另外extent的开始,所以也尝试分析是否有特殊之处
SQL> alter system dump datafile 4 block 640;
System altered.
--dump 文件header信息
Start dump data blocks tsn: 4 file#:4 minblk 376 maxblk 376
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777592
Block dump from disk:
buffer tsn: 4 rdba: 0x01000178 (4/376)
scn: 0x0b8c.3bfc6517 seq: 0x04 flg: 0x04 tail: 0x65172004
frmt: 0x02 chkval: 0xc8b3 type: 0x20=FIRST LEVEL BITMAP BLOCK
Start dump data blocks tsn: 4 file#:4 minblk 377 maxblk 377
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777593
Block dump from disk:
buffer tsn: 4 rdba: 0x01000179 (4/377)
scn: 0x0b8c.3bfc6517 seq: 0x18 flg: 0x04 tail: 0x65172118
frmt: 0x02 chkval: 0x9e8c type: 0x21=SECOND LEVEL BITMAP BLOCK
Start dump data blocks tsn: 4 file#:4 minblk 378 maxblk 378
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777594
BH (0x2a7f7f0c) file#: 4 rdba: 0x0100017a (4/378) class: 4 ba: 0x2a742000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 1 pwc: 0,0
  dbwrid: 0 obj: 76372 objn: 76372 tsn: 4 afn: 4 hint: f
  hash: [0x3150a748,0x3150a748] lru: [0x2a7f8094,0x2a7f7ee4]
  lru-flags: hot_buffer
  ckptq: [NULL] fileq: [NULL] objq: [0x2f72dc34,0x2f72dc34] objaq: [0x2f72dc2c,0x2f72dc2c]
  st: XCURRENT md: NULL fpin: 'ktewh25: kteinicnt' tch: 1
  flags:
  LRBA: [0x0.0.0] LSCN: [0x0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
Block dump from disk:
buffer tsn: 4 rdba: 0x0100017a (4/378)
scn: 0x0b8c.3bfc651b seq: 0x01 flg: 0x04 tail: 0x651b2301
frmt: 0x02 chkval: 0xb2ae type: 0x23=PAGETABLE SEGMENT HEADER
Start dump data blocks tsn: 4 file#:4 minblk 379 maxblk 379
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777595
Block dump from disk:
buffer tsn: 4 rdba: 0x0100017b (4/379)
scn: 0x0b8c.3bfc6494 seq: 0x01 flg: 0x04 tail: 0x64940601
frmt: 0x02 chkval: 0x0567 type: 0x06=trans data
Start dump data blocks tsn: 4 file#:4 minblk 640 maxblk 640
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=16777856
Block dump from disk:
buffer tsn: 4 rdba: 0x01000280 (4/640)
scn: 0x0b8c.3bfc6496 seq: 0x01 flg: 0x04 tail: 0x64960601
frmt: 0x02 chkval: 0x0efe type: 0x06=trans data

这里可以知道:
1.block 376、377为BITMAP BLOCK
2.block 378为SEGMENT HEADER(和dba_segments视图中一致)
3.除extent 0中有特殊(含BITMAP BLOCK和SEGMENT HEADER)block,其他extent只包含事务数据

测试block 640

--block 640包含条数
SQL> select   count(rowid)
  2    from chf.t_xifenfei
  3     where dbms_rowid.rowid_block_number(rowid)=640
  4  and dbms_rowid.rowid_relative_fno(rowid)=4;
COUNT(ROWID)
------------
          79
--bbed修改tailchk
BBED> set filename '/u01/oracle/oradata/ora11g/users01.dbf'
        FILENAME        /u01/oracle/oradata/ora11g/users01.dbf
BBED> set block 640
        BLOCK#          640
BBED> set mode edit
        MODE            Edit
BBED> p tailchk
ub4 tailchk                                 @8188     0x64960601
BBED> m /x 64960602
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 640              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 64960602
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 640:
current = 0xf80b, required = 0xf80b
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 640
Block 640 is corrupt
Corrupt block relative dba: 0x01000280 (file 0, block 640)
Fractured block found during verification
Data in bad block:
 type: 6 format: 2 rdba: 0x01000280
 last change scn: 0x0b8c.3bfc6496 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x02069664
 check value in block header: 0xf80b
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
--查询坏块
SQL> select count(*) from chf.t_xifenfei;
select count(*) from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 640)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
--跳过坏块
SQL> exec dbms_repair.skip_corrupt_blocks('CHF','T_XIFENFEI');
PL/SQL procedure successfully completed.
SQL>  select skip_corrupt from dba_tables where table_name='T_XIFENFEI' AND OWNER='CHF';
SKIP_COR
--------
ENABLED
SQL>  select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74584
--修复坏块
BBED> m /x 01069664
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 640              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 01069664
 <32 bytes per line>
BBED> p tailchk
ub4 tailchk                                 @8188     0x64960601
BBED> sum apply
Check value for File 0, Block 640:
current = 0x0efe, required = 0x0efe
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 640
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED
--除掉标记表坏块
SQL> BEGIN
  2  DBMS_REPAIR.SKIP_CORRUPT_BLOCKS (
  3  SCHEMA_NAME => 'CHF',
  4  OBJECT_NAME => 'T_XIFENFEI',
  5  OBJECT_TYPE => dbms_repair.table_object,
  6  FLAGS => dbms_repair.NOSKIP_FLAG);
  7  END;
  8  /
PL/SQL procedure successfully completed.
SQL> select skip_corrupt from dba_tables where table_name='T_XIFENFEI' AND OWNER='CHF';
SKIP_COR
--------
DISABLED
--查询表记录正常
SQL>  select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74663

在后续的操作中,也是按照类似步骤操作,考虑到篇幅有限,部分过程不再贴出来

测试block 379

SQL>  select count(*) from chf.t_xifenfei;
 select count(*) from chf.t_xifenfei
                          *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 379)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
SQL> exec dbms_repair.skip_corrupt_blocks('CHF','T_XIFENFEI');
PL/SQL procedure successfully completed.
SQL> select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74575

测试block 378

BBED> set block 378
        BLOCK#          378
--segment header 不支持bbed查看结构
BBED> p tailchk
BBED-00400: invalid blocktype (35)
BBED> map
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 378                                   Dba:0x00000000
------------------------------------------------------------
BBED-00400: invalid blocktype (35)
BBED> set offset 8188
        OFFSET          8188
BBED> d
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 378              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 01231b65
 <32 bytes per line>
BBED> m /x 651b2302
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 378              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 651b2302
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 378:
current = 0xedf2, required = 0xedf2
--验证坏块
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 378
Block 378 is corrupt
Corrupt block relative dba: 0x0100017a (file 0, block 378)
Fractured block found during verification
Data in bad block:
 type: 35 format: 2 rdba: 0x0100017a
 last change scn: 0x0b8c.3bfc651b seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x02231b65
 check value in block header: 0xedf2
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
SQL>  select count(*) from chf.t_xifenfei;
 select count(*) from chf.t_xifenfei
                          *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 378)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
--标记跳过坏块
SQL> exec dbms_repair.skip_corrupt_blocks('CHF','T_XIFENFEI');
PL/SQL procedure successfully completed.
--查询依然失败
SQL> select count(*) from chf.t_xifenfei;
select count(*) from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 378)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'

测试block 377

BBED> m /x 18211766
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 377              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 18211766
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 377:
current = 0x9d8c, required = 0x9d8c
--bbed验证为坏块
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 377
Block 377 is corrupt
Corrupt block relative dba: 0x01000179 (file 0, block 377)
Fractured block found during verification
Data in bad block:
 type: 33 format: 2 rdba: 0x01000179
 last change scn: 0x0b8c.3bfc6517 seq: 0x18 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x66172118
 check value in block header: 0x9d8c
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
--dbv验证为坏块
[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY: Release 11.2.0.3.0 - Production on Fri Jan 18 03:32:18 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/users01.dbf
Page 377 is influx - most likely media corrupt
Corrupt block relative dba: 0x01000179 (file 4, block 377)
Fractured block found during dbv:
Data in bad block:
 type: 33 format: 2 rdba: 0x01000179
 last change scn: 0x0b8c.3bfc6517 seq: 0x18 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x66172118
 check value in block header: 0x9d8c
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Pages Examined         : 2560
Total Pages Processed (Data) : 1434
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 10
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 213
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 902
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 1
Total Pages Encrypted        : 0
Highest block SCN            : 1006486374 (2956.1006486374)
--查询表记录
SQL> select skip_corrupt from dba_tables where table_name='T_XIFENFEI' AND OWNER='CHF';
SKIP_COR
--------
DISABLED
SQL> select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74663

测试block 376

BBED> m /x 04201766
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 376              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 04201766
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 376:
current = 0xcbb3, required = 0xcbb3
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 376
Block 376 is corrupt
Corrupt block relative dba: 0x01000178 (file 0, block 376)
Fractured block found during verification
Data in bad block:
 type: 32 format: 2 rdba: 0x01000178
 last change scn: 0x0b8c.3bfc6517 seq: 0x4 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x66172004
 check value in block header: 0xcbb3
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
SQL>  select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74663

通过测试证明,如果是事务数据块出现坏块可以通过dbms_repair.skip_corrupt_blocks来标记坏块(也可以使用event 10231)来实现;对于BITMAP BLOCK如果出现坏块不会对于读取数据无影响(至于其他操作,请见table中各种坏块对select/dml操作影响);SEGMENT HEADER如果出现坏块,无法通过跳过坏块来实现获取其他数据(正常block)

recover遇到坏块处理本质探讨

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:recover遇到坏块处理本质探讨

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

如果在还原出来的数据文件中有坏块,而归档日志和联机日志是正常的,那么在应用日志恢复过程中,会出现什么情况,这里通过一个简单的测试给予其中一种情况的说明
创建测试表

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL> create table t_xifenfei(object_id,object_name) tablespace xifenfei
  2  as
  3  select object_id,object_name from dba_objects
  4  where rownum<11;
Table created.
SQL> col object_name for a30
SQL> select   object_id,object_name,
  2   dbms_rowid.rowid_relative_fno(rowid)rel_fno,
  3  dbms_rowid.rowid_block_number(rowid)blockno
  4   from chf.t_xifenfei;
 OBJECT_ID OBJECT_NAME                       REL_FNO    BLOCKNO
---------- ------------------------------ ---------- ----------
        20 ICOL$                                   5         12
        44 I_USER1                                 5         12
        28 CON$                                    5         12
        15 UNDO$                                   5         12
        29 C_COBJ#                                 5         12
         3 I_OBJ#                                  5         12
        25 PROXY_ROLE_DATA$                        5         12
        39 I_IND1                                  5         12
        51 I_CDEF2                                 5         12
        26 I_PROXY_ROLE_DATA$_1                    5         12
10 rows selected.
SQL> select name from v$datafile where file#=5;
NAME
--------------------------------------------------------------
/u01/oracle/oradata/XFF/xifenfei01.dbf
SQL> update t_xifenfei set object_name='WWW.XIFENFEI.COM';
10 rows updated.
SQL> commit;
Commit complete.
SQL> create table t_xifenfei_new(object_id,object_name) tablespace xifenfei
  2  as
  3  select object_id,object_name from dba_objects
  4  where rownum<11;
Table created.
SQL> conn / as sysdba
Connected.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

备份数据文件

[oracle@xifenfei XFF]$ cp xifenfei01.dbf  ../tmp/
[oracle@xifenfei XFF]$ ll ../tmp/xifenfei01.dbf
-rw-r----- 1 oracle oinstall 10493952 Sep 28 19:05 ../tmp/xifenfei01.dbf
[oracle@xifenfei XFF]$ date
Fri Sep 28 19:05:42 CST 2012

bbed破坏备份文件

[oracle@xifenfei XFF]$ bbed password=blockedit
BBED: Release 2.0.0.0.0 - Limited Production on Fri Sep 28 19:05:59 2012
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
************* !!! For Oracle Internal Use only !!! ***************
BBED> set filename '/u01/oracle/oradata/tmp/xifenfei01.dbf'
        FILENAME        /u01/oracle/oradata/tmp/xifenfei01.dbf
BBED> set block 12
        BLOCK#          12
BBED> set mode edit
        MODE            Edit
BBED> map
 File: /u01/oracle/oradata/tmp/xifenfei01.dbf (0)
 Block: 12                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 96 bytes                     @20
 struct kdbh, 14 bytes                      @124
 struct kdbt[1], 4 bytes                    @138
 sb2 kdbr[10]                               @142
 ub1 freespace[7666]                        @162
 ub1 rowdata[360]                           @7828
 ub4 tailchk                                @8188
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x0140000c
   ub4 bas_kcbh                             @8        0x0004d7b0
   ub2 wrp_kcbh                             @12       0x000a
   ub1 seq_kcbh                             @14       0x01
   ub1 flg_kcbh                             @15       0x06 (KCBHFDLC, KCBHFCKV)
   ub2 chkval_kcbh                          @16       0xe573
   ub2 spare3_kcbh                          @18       0x0000
BBED> d offset 8188
 File: /u01/oracle/oradata/tmp/xifenfei01.dbf (0)
 Block: 12               Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0106b0d7
 <32 bytes per line>
BBED> m /x 11
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/tmp/xifenfei01.dbf (0)
 Block: 12               Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 1106b0d7
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 12:
current = 0xe563, required = 0xe563
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/tmp/xifenfei01.dbf
BLOCK = 12
Block 12 is corrupt
Corrupt block relative dba: 0x0140000c (file 0, block 12)
Fractured block found during verification
Data in bad block:
 type: 6 format: 2 rdba: 0x0140000c
 last change scn: 0x000a.0004d7b0 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xd7b00611
 check value in block header: 0xe563
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2

修改数据库记录

SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             109054428 bytes
Database Buffers          201326592 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> conn chf/xifenfei
Connected.
SQL> update t_xifenfei set object_name='惜分飞';
10 rows updated.
SQL> update t_xifenfei_new set object_name='惜分飞';
10 rows updated.
SQL> commit;
Commit complete.
SQL> alter system switch logfile;
System altered.
SQL> alter system switch logfile;
System altered.
SQL> alter system switch logfile;
System altered.
SQL> conn / as sysdba
Connected.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

利用备份数据文件恢复数据库

[oracle@xifenfei XFF]$ cp xifenfei01.dbf xifenfei01.dbf_bak
[oracle@xifenfei XFF]$ cp ../tmp/xifenfei01.dbf  xifenfei01.dbf
[oracle@xifenfei XFF]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Fri Sep 28 19:13:59 2012
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             109054428 bytes
Database Buffers          201326592 bytes
Redo Buffers                7118848 bytes
Database mounted.
ORA-01113: file 5 needs media recovery
ORA-01110: data file 5: '/u01/oracle/oradata/XFF/xifenfei01.dbf'
--提示数据需要恢复
SQL> recover datafile 5;
ORA-00279: change 42949990720 generated at 09/28/2012 19:04:10 needed for
thread 1
ORA-00289: suggestion : /u01/oracle/oradata/XFF/archivelog/1_24_792679299.dbf
ORA-00280: change 42949990720 for thread 1 is in sequence #24
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
auto
Log applied.
Media recovery complete.
SQL> alter database open;
Database altered.
--利用被破坏的数据文件+归档日志恢复数据库正常
SQL> col object_name for a30
SQL> select   object_id,object_name from t_xifenfei;
select   object_id,object_name from t_xifenfei
                                    *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 5, block # 12)
ORA-01110: data file 5: '/u01/oracle/oradata/XFF/xifenfei01.dbf'
--提示被破坏的数据块,查询不能完成
--证明坏块之外的数据块还是被正常应用日志
SQL>  select   object_id,object_name from t_xifenfei_new;
 OBJECT_ID OBJECT_NAME
---------- ------------------------------
        20 惜分飞
        44 惜分飞
        28 惜分飞
        15 惜分飞
        29 惜分飞
         3 惜分飞
        25 惜分飞
        39 惜分飞
        51 惜分飞
        26 惜分飞
10 rows selected.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

dbv检查坏块

[oracle@xifenfei XFF]$ dbv file=xifenfei01.dbf
DBVERIFY: Release 10.2.0.4.0 - Production on Fri Sep 28 19:14:52 2012
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = xifenfei01.dbf
DBV-00200: Block, DBA 20971532, already marked corrupt
--这里可以看出来,该数据块已经被标志为坏块
DBVERIFY - Verification complete
Total Pages Examined         : 1280
Total Pages Processed (Data) : 2
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 14
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 1264
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Highest block SCN            : 318700 (10.318700)

查看恢复过程alert日志

Fri Sep 28 19:14:06 2012
Database mounted in Exclusive Mode
Completed: ALTER DATABASE   MOUNT
Fri Sep 28 19:14:06 2012
ALTER DATABASE OPEN
ORA-1113 signalled during: ALTER DATABASE OPEN...
Fri Sep 28 19:14:11 2012
ALTER DATABASE RECOVER  datafile 5
Media Recovery Start
 parallel recovery started with 2 processes
ORA-279 signalled during: ALTER DATABASE RECOVER  datafile 5  ...
Fri Sep 28 19:14:16 2012
ALTER DATABASE RECOVER    CONTINUE DEFAULT
Fri Sep 28 19:14:16 2012
--恢复数据库的时候,发现坏块
Media Recovery Log /u01/oracle/oradata/XFF/archivelog/1_24_792679299.dbf
Fri Sep 28 19:14:16 2012
Hex dump of (file 5, block 12) in trace file /u01/oracle/admin/XFF/bdump/xff_p001_23011.trc
Corrupt block relative dba: 0x0140000c (file 5, block 12)
Fractured block found during media recovery
Data in bad block:
 type: 6 format: 2 rdba: 0x0140000c
 last change scn: 0x000a.0004d7b0 seq: 0x1 flg: 0x06
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xd7b00611
 check value in block header: 0xe563
 computed block checksum: 0x0
Reread of rdba: 0x0140000c (file 5, block 12) found same corrupted data
--继续恢复
Fri Sep 28 19:14:16 2012
Recovery of Online Redo Log: Thread 1 Group 1 Seq 25 Reading mem 0
  Mem# 0: /u01/oracle/oradata/XFF/redo01.log
Fri Sep 28 19:14:16 2012
Recovery of Online Redo Log: Thread 1 Group 2 Seq 26 Reading mem 0
  Mem# 0: /u01/oracle/oradata/XFF/redo02.log
Fri Sep 28 19:14:16 2012
Recovery of Online Redo Log: Thread 1 Group 3 Seq 27 Reading mem 0
  Mem# 0: /u01/oracle/oradata/XFF/redo03.log
Fri Sep 28 19:14:16 2012
Media Recovery Complete (XFF)
Completed: ALTER DATABASE RECOVER    CONTINUE DEFAULT
Fri Sep 28 19:14:31 2012
alter database open

bbed查看修改相关信息

BBED> set filename '/u01/oracle/oradata/XFF/xifenfei01.dbf'
        FILENAME        /u01/oracle/oradata/XFF/xifenfei01.dbf
BBED> set block 12
        BLOCK#          12
BBED> map
 File: /u01/oracle/oradata/XFF/xifenfei01.dbf (0)
 Block: 12                                    Dba:0x00000000
------------------------------------------------------------
 KTB Data Block (Table/Cluster)
 struct kcbh, 20 bytes                      @0
 struct ktbbh, 96 bytes                     @20
 struct kdbh, 14 bytes                      @124
 struct kdbt[1], 4 bytes                    @138
 sb2 kdbr[10]                               @142
 ub1 freespace[7666]                        @162
 ub1 rowdata[360]                           @7828
 ub4 tailchk                                @8188
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x0140000c
   ub4 bas_kcbh                             @8        0x00000000
   ub2 wrp_kcbh                             @12       0x0000
   ub1 seq_kcbh                             @14       0xff   <--因为被标记为坏块,所以为ff
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0xe77d
   ub2 spare3_kcbh                          @18       0x0000
--查看数据块中记录
BBED> p *kdbr[5]
rowdata[69]
-----------
ub1 rowdata[69]                             @7897     0x2c
BBED> x /rnc
rowdata[69]                                 @7897
-----------
flag@7897: 0x2c (KDRHFL, KDRHFF, KDRHFH)
lock@7898: 0x02
cols@7899:    2
col    0[2] @7900: 3
col   1[16] @7903: WWW.XIFENFEI.COM  <--确实没有被恢复,而是直接被跳过
BBED> set mode edit
        MODE            Edit
BBED> m /x 01 offset 14
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/XFF/xifenfei01.dbf (0)
 Block: 12               Offsets:   14 to  525           Dba:0x00000000
------------------------------------------------------------------------
 01047de7 00000100 00000dcc 000098d7 …………
 <32 bytes per line>
BBED> d offset 8188
 File: /u01/oracle/oradata/XFF/xifenfei01.dbf (0)
 Block: 12               Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 ff060000    <--这么说明:数据块被标志为坏块的时候,同时会修改tailchk值
 <32 bytes per line>
BBED> m /x 01 offset 8188
 File: /u01/oracle/oradata/XFF/xifenfei01.dbf (0)
 Block: 12               Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 01060000
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 12:
current = 0xe77d, required = 0xe77d
--验证块已经标记为正常块
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/XFF/xifenfei01.dbf
BLOCK = 12
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0

启动数据库测试

SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             109054428 bytes
Database Buffers          201326592 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> conn chf/xifenfei
Connected.
SQL> col object_name for a30
SQL> select   object_id,object_name from t_xifenfei;
 OBJECT_ID OBJECT_NAME
---------- ------------------------------
        20 WWW.XIFENFEI.COM
        44 WWW.XIFENFEI.COM
        28 WWW.XIFENFEI.COM
        15 WWW.XIFENFEI.COM
        29 WWW.XIFENFEI.COM
         3 WWW.XIFENFEI.COM
        25 WWW.XIFENFEI.COM
        39 WWW.XIFENFEI.COM
        51 WWW.XIFENFEI.COM
        26 WWW.XIFENFEI.COM
10 rows selected.
--通过修改数据块的seq_kcbh和tailchk,让这个块恢复正常,但是记录依然丢失,
--因为应用日志恢复之时标记为坏块跳过该块的日志应用

通过实验证明:
1.如果只有数据块异常,应用日志恢复,不一定会出现ORA-600[3020],而是直接把该块标记为坏块,继续应用日志
2.标记坏块其实就是修改seq_kcbh为ff,同时也修改tailchk值
3.经验值:如果在数据库应用日志恢复的时候,如果出现ORA-600[3020]错误,可以使用allow 2 corruption来跳过坏块处理,其实也是修改seq_kcbh为ff,然后让数据库跳过该块的恢复.

使用plsql抢救数据

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:使用plsql抢救数据

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

“在oracle出现ORA-8103/ORA-1578/ORA-376″等情况下抢救数据的争论没有停止过,很多人想到的是使用bbed,dul等工具来抢救,其实在很多时候我们使用pl/sql也可以完美的抢救数据.在这里我们通过模拟ORA-8103错误,然后使用plsql来找回数据.这中处理方法相对于bbed风险小,但是缺点是如果数据量大处理时间可能比较长,可能比dul有的一比,但是dul的工具不是任何人都有的.所以整体来说,在大部分情况下,这种方法处理某个数据块错误,抢救某个对象数据,还是很好的方法.
1.有非空列index情况

--创建测试表
SQL> create table xifenfei
  2  as
  3  select * from dba_objects;
Table created.
--修改某个项为非空值
SQL> alter table xifenfei modify object_id not null;
Table altered.
--创建一个唯一index
SQL> create unique index ind_xifenfei  on xifenfei(object_id);
Index created.
--表总记录
SQL> select count(*) from xifenfei;
  COUNT(*)
----------
     50088
--extent的分布情况
SQL> set pages 100
SQL>  select file_id,block_id,block_id+blocks-1
  2    from dba_extents
  3   where segment_name ='XIFENFEI' AND owner='CHF';
   FILE_ID   BLOCK_ID BLOCK_ID+BLOCKS-1
---------- ---------- -----------------
         9       1545              1552
         9       1553              1560
         9       1561              1568
         9       1569              1576
         9       1577              1584
         9       1585              1592
         9       1593              1600
         9       1601              1608
         9       1609              1616
         9       1617              1624
         9       1625              1632
         9       1633              1640
         9       1641              1648
         9       1649              1656
         9       1657              1664
         9       1665              1672
         9       1673              1800
         9       1801              1928
         9       1929              2056
         9       2057              2184
         9       2185              2312
21 rows selected.
--2200数据块包含记录
SQL> select   count(*)
  2  from chf.xifenfei where dbms_rowid.rowid_block_number(rowid)=2200;
  COUNT(*)
----------
        69
--关闭数据库
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
--破坏数据块
[oracle@xifenfei ~]$ dd if=/dev/zero of=/u01/oracle/oradata/XFF/users03.dbf  bs=8192  count=1 seek=2200 conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000151554 seconds, 54.1 MB/s
--启动数据库
SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             104860124 bytes
Database Buffers          205520896 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
--查询结果
SQL>  select /*+ full(xifenfei) */ count(*) from chf.xifenfei;
 select /*+ full(xifenfei) */ count(*) from chf.xifenfei
                                                *
ERROR at line 1:
ORA-08103: object no longer exists
SQL> create table chf.xifenfei_new
  2  as
  3  select * from chf.xifenfei;
select * from chf.xifenfei
                  *
ERROR at line 3:
ORA-08103: object no longer exists
--创建备份表
SQL> create table chf.xifenfei_new
  2  as
  3  select * from chf.xifenfei where 1=0;
Table created.
--创建坏块相关rowid记录表
SQL> create table chf.bad_rows (row_id rowid, oracle_error_code number);
Table created.
--执行plsql脚本
DECLARE
 TYPE RowIDTab IS TABLE OF ROWID INDEX BY BINARY_INTEGER;
 CURSOR c1 IS  select /*+ index(xifenfei ind_xifenfei) */ rowid
 from chf.xifenfei
 where object_id is NOT NULL;
 r RowIDTab;
 rows  NATURAL := 20000;
 bad_rows number := 0 ;
 errors number;
 error_code number;
 myrowid rowid;
BEGIN
 OPEN c1;
 LOOP
   FETCH  c1 BULK COLLECT INTO r LIMIT rows;
   EXIT WHEN r.count=0;
   BEGIN
    FORALL i IN r.FIRST..r.LAST SAVE EXCEPTIONS
     insert into chf.xifenfei_new
     select /*+ ROWID(A) */ *
     from chf.xifenfei A where rowid = r(i);
   EXCEPTION
   when OTHERS then
    BEGIN
     errors := SQL%BULK_EXCEPTIONS.COUNT;
     FOR err1 IN 1..errors LOOP
       error_code := SQL%BULK_EXCEPTIONS(err1).ERROR_CODE;
       if error_code in (1410, 8103) then
         myrowid := r(SQL%BULK_EXCEPTIONS(err1).ERROR_INDEX);
         bad_rows := bad_rows + 1;
         insert into chf.bad_rows values(myrowid, error_code);
       else
         raise;
       end if;
     END LOOP;
     END;
   END;
  commit;
 END LOOP;
 commit;
 CLOSE c1;
 dbms_output.put_line('Total Bad Rows: '||bad_rows);
END;
/
--查询错误记录
SQL> select count(*) from chf.bad_rows ;
  COUNT(*)
----------
        69
SQL> select * from chf.bad_rows where rownum<10;
ROW_ID             ORACLE_ERROR_CODE
------------------ -----------------
AAAMugAAJAAAAiYAAA              8103
AAAMugAAJAAAAiYAAB              8103
AAAMugAAJAAAAiYAAC              8103
AAAMugAAJAAAAiYAAD              8103
AAAMugAAJAAAAiYAAE              8103
AAAMugAAJAAAAiYAAF              8103
AAAMugAAJAAAAiYAAG              8103
AAAMugAAJAAAAiYAAH              8103
AAAMugAAJAAAAiYAAI              8103
9 rows selected.
--查询备份表记录
SQL> select count(*) from chf.xifenfei_new;
  COUNT(*)
----------
     50019
50088-50019=69和被破坏块中记录一致,证明所有好块中记录全部被找回来

2.无非空列index情况

--创建表
SQL> CONN CHF/XIFENFEI
Connected.
SQL> create table t_xifenfei
  2  as
  3  select * from dba_objects;
Table created.
--表中记录总数
SQL> select count(*) from t_xifenfei;
  COUNT(*)
----------
     50086
--extent分布
SQL> SET PAGES 100
SQL>  select file_id,block_id,block_id+blocks-1
  2    from dba_extents
  3   where segment_name ='T_XIFENFEI' AND owner='CHF';
   FILE_ID   BLOCK_ID BLOCK_ID+BLOCKS-1
---------- ---------- -----------------
         9          9                16
         9         17                24
         9         25                32
         9         33                40
         9         41                48
         9         49                56
         9         57                64
         9         65                72
         9         73                80
         9         81                88
         9         89                96
         9         97               104
         9        105               112
         9        113               120
         9        121               128
         9        129               136
         9        137               264
         9        265               392
         9        393               520
         9        521               648
         9        649               776
21 rows selected.
--700数据块中记录数
SQL> select   count(*)
  2  from chf.t_xifenfei where dbms_rowid.rowid_block_number(rowid)=700;
  COUNT(*)
----------
        73
--关闭数据库
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
--破坏block 700的数据块
[oracle@xifenfei ~]$ dd if=/dev/zero of=/u01/oracle/oradata/XFF/users03.dbf  bs=8192  count=1 seek=700 conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000156576 seconds, 52.3 MB/s
--启动数据库
SQL> startup
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             104860124 bytes
Database Buffers          205520896 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
--查询报错
SQL> select count(*) from chf.t_xifenfei;
select count(*) from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-08103: object no longer exists
--创建备份表
SQL> CREATE TABLE T_XIFENFEI_NEW
  2  AS
  3  SELECT * FROM T_XIFENFEI WHERE 1=0;
--找回记录
set serveroutput on
set concat off
DECLARE
 nrows number;
 rid rowid;
 dobj number;
 ROWSPERBLOCK number;
BEGIN
 ROWSPERBLOCK:=1000;  --估算最大的一个块中记录条数
 nrows:=0;
 select data_object_id  into dobj
 from dba_objects
 where owner = 'CHF'
 and object_name = 'T_XIFENFEI'
-- and subobject_name = '<table partition>'  Add this condition if table is partitioned
 ;
 for i in (select relative_fno, block_id, block_id+blocks-1 totblocks
           from dba_extents
           where owner = 'CHF'
             and segment_name = 'T_XIFENFEI'
-- and partition_name = '<table partition>' Add this condition if table is partitioned
-- and file_id != <OFFLINED DATAFILE> This condition is only used if a datafile needs to be skipped due to ORA-376 (A)
          order by extent_id)
 loop
   for br in i.block_id..i.totblocks loop
    for j in 1..ROWSPERBLOCK loop
    begin
      rid := dbms_rowid.ROWID_CREATE(1,dobj,i.relative_fno, br , j-1);
      insert into CHF.T_XIFENFEI_NEW
      select /*+ ROWID(A) */ *
      from CHF.T_XIFENFEI A
      where rowid = rid;
      if sql%rowcount = 1 then nrows:=nrows+1; end if;
      if (mod(nrows,10000)=0) then commit; end if;
    exception when others then null;
    end;
    end loop;
  end loop;
 end loop;
 COMMIT;
 dbms_output.put_line('Total rows: '||to_char(nrows));
END;
/
--找回记录数
SQL> SELECT COUNT(*) FROM CHF.T_XIFENFEI_NEW;
  COUNT(*)
----------
     50013
50086-50013=73  证明非坏块中的数据都被完全寻找回来

参考:
ORA-8103 Troubleshooting, Diagnostic and Solution [ID 268302.1]
Extract rows from a CORRUPT table creating ROWID from DBA_EXTENTS [ID 422547.1]

TXChecker初试

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:TXChecker初试

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

当我们对数据库进行异常恢复,很多时候要选择屏蔽回滚段,但是我们有没有办法来评估屏蔽回滚段到底会对我们的数据库的数据产生多大影响呢?其实我们可以通过TXChecker工具来评估数据库undo异常时候受到影响的数据对象有哪些,从而进一步确定是否真的需要对其undo下手处理.
模拟数据异常事务为提交情况

[oracle@xifenfei ~]$ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Thu Aug 30 20:32:27 2012
Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup;
ORACLE instance started.
Total System Global Area  318767104 bytes
Fixed Size                  1267236 bytes
Variable Size             100665820 bytes
Database Buffers          209715200 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> create table t_xifenfei tablespace system
  2  as
  3  select * from dba_objects;
Table created.
SQL> create table chf.t_xifenfei tablespace users
  2  as
  3  select * from dba_objects;
Table created.
SQL> delete from t_xifenfei where rownum<10;
9 rows deleted.
SQL> delete from chf.t_xifenfei where rownum<100;
99 rows deleted.
SQL> alter system checkpoint;
System altered.
SQL> shutdow abort;
ORACLE instance shut down.

在system和users表空间分别模拟了sys和chf的t_xifenfei表含事务未提交情况

TXChecker用法说明

[oracle@xifenfei txchecker]$ ./TXChecker
TXChecker - v1.4 by Center Of Expertise (COE), Oracle Corporation (build 10/16/07)
Usage is: TXChecker [options]
Options:
  -a                     When scanning datafiles (with -d/-f/-l/-t options) report objects using any of the undo
                         segments (not just those with errors) (OPTIONAL)
  -b                     For objects found, print the datablock addresses. See readme for further details (OPTIONAL)
  -c<controlfile_name>   Fully qualified controlfile name to read (MANDATORY)
  -d                     Scan database for active TXs (use when undo not available) (OPTIONAL)
  -f<filename>           Scan the named datafile for active TXs (OPTIONAL)
  -g                     Indicates you want to find all blocks taking part in transactions with
                         a USN > than the USN supplied in -x<XID> parameter (same constraints as -w) (OPTIONAL)
  -l<listfile>           Scan all the datafiles listed in the listfile for active TXs (OPTIONAL)
  -m<minutes>            Number of minutes used to consider a TX as active (1-120) (DEFAULTS TO 60 MINUTES)
  -p                     Show the names and last known status of the UNDO segments (OPTIONAL)
  -s                     Skip read-only or offline normal datafiles (OPTIONAL)
  -t<tablespace>         Scan all the datafiles for this tablespace (OPTIONAL)
  -u                     Report ITL entries active if marked with an upper bound ('U' flag) fast commit SCN
                         instead if active transactions (OPTIONAL)
  -w<wrap#>              Wrap# for XID in ITL entry to report blocks where wrap# > this one (OPTIONAL)
                         Must use -x with this option. See the readme for details
  -x<XID>                XID for transaction wanting to search for (OPTIONAL)
                         Use format rrrr.ssss.wwwwwwww using Hexadecimal numbers
                         See the readme for full instructions on using -x, -w and -g options
NOTE: Options -d/-f/-l/-t are exclusive, and only one should be specified.
[/sql]
<strong>TXChecker初始</strong>

[oracle@xifenfei txchecker]$ ./TXChecker -c/u01/oracle/oradata/XFF/control01.ctl -d -a
TXChecker - v1.4 by Center Of Expertise (COE), Oracle Corporation (build 10/16/07)
Database Name: XFF   Version: 10.2.0
*** Database last checkpointed at 08/30/2012 20:34:43 (SCN: 0xa.0x1e142)
*** Using 60 minutes to find most active transactions (-m60)
*** Undo Segments:
USN:   0  Name: SYSTEM          TBS#:  0 File:   1  Block:      9  Instance:  0  SMU: N  Status:  3 - Online
                                SCN: 0000.00000000   XactSQN: 0x00000000   UndoSQN: 0x00000000
USN:   1  Name: _SYSSMU1$       TBS#:  1 File:   2  Block:      9  Instance:  0  SMU: Y  Status:  3 - Online
                                SCN: 000a.000191c3   XactSQN: 0x000000d2   UndoSQN: 0x0000013c
…………省略
USN:   9  Name: _SYSSMU9$       TBS#:  1 File:   2  Block:    137  Instance:  0  SMU: Y  Status:  3 - Online
                                SCN: 000a.000191d8   XactSQN: 0x0000011a   UndoSQN: 0x000000dc
USN:  10  Name: _SYSSMU10$      TBS#:  1 File:   2  Block:    153  Instance:  0  SMU: Y  Status:  3 - Online
                                SCN: 000a.000191d7   XactSQN: 0x000000c1   UndoSQN: 0x000000d9
…………省略
USN:  20  Name: _SYSSMU20$      TBS#:  5 File:   5  Block:    153  Instance:  0  SMU: Y  Status:  1 - Invalid / Dropped
                                SCN: 0000.00070ec6   XactSQN: 0x00000002   UndoSQN: 0x00000001
*** Active Transactions:
USN: 6  Name: _SYSSMU6$  File: 2  Block: 89  Instance: 0  Status: 3 - Online
* Active TX at slot 6  #undo blocks: 3  Last bk: 2.90
Obj#:  51938  Name: CHF.T_XIFENFEI                                    Type: TABLE               Undo recs: 99
              Used undo segment IDs: 6
Obj#:  51937  Name: SYS.T_XIFENFEI                                    Type: TABLE               Undo recs: 9
              Used undo segment IDs: 6
         File (validate_objects.sql) being created for object validattion already exists
         and will be overwritten!!!
         Do you want to continue overwriting [Y/N]?y
*** Undo segments (headers) that encountered errors preventing Active TX scan:
USN:  11  Name: _SYSSMU11$      File:   5  Block:       9  Instance:  0  Error: 27 - Undo segment was dropped
…………省略
USN:  20  Name: _SYSSMU20$      File:   5  Block:     153  Instance:  0  Error: 27 - Undo segment was dropped
WARNING: Analyzing the full database for active transactions will take some time!
         Are you sure you want to analyze the full database [Y/N]?
         Are you sure you want to analyze the full database [Y/N]?y
*** Scanning database for datablocks that may require undo (PLEASE WAIT...):
*** Asterisk ('*') denotes blocks being updated since 08/08/2012 03:30:32 (SCN: 0x0.0x79607)
Scanning datafile:  3 - /u01/oracle/oradata/XFF/sysaux01.dbf (SYSAUX)                           - Active TX blocks: 147 *
Scanning datafile:  1 - /u01/oracle/oradata/XFF/system01.dbf (SYSTEM)                           - Active TX blocks: 11597 *
--undo表空间跳过
Undo datafile (/u01/oracle/oradata/XFF/undotbs01.dbf) - SKIPPING
Scanning datafile:  4 - /u01/oracle/oradata/XFF/users01.dbf (USERS)                             - Active TX blocks: 2 *
--因为数据文件丢失控制文件中offline的跳过(其实只要数据文件丢失就会跳过)
Cannot access datafile (/u01/oracle/oradata/XFF/xifenfei01.dbf) (error 2 - No such file or directory) - SKIPPING
Cannot access datafile (/u01/oracle/oradata/XFF/xifenfei02.dbf) (error 2 - No such file or directory) - SKIPPING
Scanning datafile:  7 - /u01/oracle/oradata/XFF/xifenfei03.dbf (XIFENFEI2)                      - Active TX blocks: 0
*** Objects that may require undo data:
*** Asterisk ('*') denotes blocks being updated since 08/08/2012 03:30:32 (SCN: 0x0.0x79607)
DataObj#:  51938  Name: CHF.T_XIFENFEI                                    Type: TABLE               Datablocks: 2 *
Used undo segment IDs: 6
DataObj#:  46434  Name: MDSYS.SYS_IL0000046432C00006$$                    Type: INDEX               Datablocks: 4
Used undo segment IDs: 2
DataObj#:    124  Name: SYS.I_ACCESS1                                     Type: INDEX               Datablocks: 27
Used undo segment IDs: 1 2 3 4 5 6 7 8 9 10
…………省略
DataObj#:   8824  Name: SYS.SYS_IL0000008822C00008$$                      Type: INDEX               Datablocks: 1 *
Used undo segment IDs: 4
DataObj#:  51937  Name: SYS.T_XIFENFEI                                    Type: TABLE               Datablocks: 1 *
Used undo segment IDs: 6
DataObj#:  51557  Name: SYS.UTL_RECOMP_COMPILED                           Type: TABLE               Datablocks: 1
Used undo segment IDs: 8
…………省略
DataObj#:  42131  Name: XDB.XDB$ELEMENT_PROPNUMBER                        Type: INDEX               Datablocks: 1
Used undo segment IDs: 2
*** Use validate_objects.sql script file to validate the structure of possibly corrupt objects
    if the undo required is not available.
Undo Segment Usage Summary
**************************
*** Undo segments identified in use by active transaction datablocks AFTER 08/08/2012 03:30:32 (SCN: 0x0.0x79607):
USN:   1  Name: _SYSSMU1$
USN:   2  Name: _SYSSMU2$
USN:   3  Name: _SYSSMU3$
USN:   4  Name: _SYSSMU4$
USN:   5  Name: _SYSSMU5$
USN:   6  Name: _SYSSMU6$
USN:   9  Name: _SYSSMU9$
*** Undo segments identified in use by active transacation datablocks BEFORE 08/08/2012 03:30:32 (SCN: 0x0.0x79607):
USN:   1  Name: _SYSSMU1$
USN:   2  Name: _SYSSMU2$
USN:   3  Name: _SYSSMU3$
USN:   4  Name: _SYSSMU4$
USN:   5  Name: _SYSSMU5$
USN:   7  Name: _SYSSMU7$
USN:   8  Name: _SYSSMU8$
USN:   9  Name: _SYSSMU9$
USN:  10  Name: _SYSSMU10$
NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!NOTE!
*************************************************************************************
Upload the logfile (TXChecker_083012_2042_XFF.log) to Oracle Support Services for analysis.
Do NOT attempt to force the database open until the logfile has been analyzed.

验证对象脚本

[oracle@xifenfei txchecker]$ more validate_objects.sql
rem validate_objects.sql - checks strcuture of objects needing unavailable undo data
rem                        Created by findtxns program, Oracle Corporation
set echo on
ANALYZE TABLE CHF.T_XIFENFEI VALIDATE STRUCTURE CASCADE;
ANALYZE INDEX MDSYS.SYS_IL0000046432C00006$$ VALIDATE STRUCTURE;
ANALYZE INDEX SYS.I_ACCESS1 VALIDATE STRUCTURE;
…………省略
ANALYZE INDEX XDB.SYS_C003167 VALIDATE STRUCTURE;
ANALYZE INDEX XDB.XDB$ELEMENT_PROPNAME VALIDATE STRUCTURE;
ANALYZE INDEX XDB.XDB$ELEMENT_PROPNUMBER VALIDATE STRUCTURE;

ORA-607/ORA-600[4194]不一定是重大灾难

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:ORA-607/ORA-600[4194]不一定是重大灾难

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

以前解决过ORA-607/ORA-600[4194]和模拟过ORA-607/ORA-600[4194]错误,所以固定思维任务ORA-607/ORA-600[4194]可能就是重大灾难,通过这个案例来说明ORA-607/ORA-600[4194]可能也就是一个常规的不能再常规的错误:有一网友数据库因意外关闭电源导致启动过程出现ORA-00607/ORA-00600[4194]/ORA-00600[4097]的错误,使得数据库启动失败.

SMON: enabling tx recovery
Fri Aug 31 23:14:08 2012
Database Characterset is ZHS16GBK
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=19, OS id=15619
Fri Aug 31 23:14:10 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Fri Aug 31 23:14:12 2012
Non-fatal internal error happenned while SMON was doing logging scn->time mapping.
SMON encountered 1 out of maximum 100 non-fatal internal errors.
Fri Aug 31 23:14:12 2012
Completed: alter database open
Fri Aug 31 23:14:14 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-01595: error freeing extent (2) of rollback segment (4))
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], []
Fri Aug 31 23:29:41 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-00607: Internal error occurred while making a change to a data block
ORA-00600: internal error code, arguments: [4194], [17], [10], [], [], [], [], []
Fri Aug 31 23:29:43 2012
Errors in file /u01/oradata/orcl/bdump/orcl_smon_15589.trc:
ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], []
Fri Aug 31 23:29:44 2012
Errors in file /u01/oradata/orcl/bdump/orcl_pmon_15577.trc:
ORA-00474: SMON process terminated with error
Fri Aug 31 23:29:44 2012
PMON: terminating instance due to error 474
Instance terminated by PMON, pid = 15577

通过alert日志可以定位到SMON_SCN_TIME表或者其回滚操作可能异常,结合alert和trace分析,发现这次错误的操作主要sql语句为:

ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4194], [48], [34], [], [], [], [], []
Current SQL statement for this session:
UPDATE SYS.COL_USAGE$
   SET EQUALITY_PREDS    = EQUALITY_PREDS +
                           DECODE(BITAND(:FLAG, 1), 0, 0, 1),
       EQUIJOIN_PREDS    = EQUIJOIN_PREDS +
                           DECODE(BITAND(:FLAG, 2), 0, 0, 1),
       NONEQUIJOIN_PREDS = NONEQUIJOIN_PREDS +
                           DECODE(BITAND(:FLAG, 4), 0, 0, 1),
       RANGE_PREDS       = RANGE_PREDS + DECODE(BITAND(:FLAG, 8), 0, 0, 1),
       LIKE_PREDS        = LIKE_PREDS + DECODE(BITAND(:FLAG, 16), 0, 0, 1),
       NULL_PREDS        = NULL_PREDS + DECODE(BITAND(:FLAG, 32), 0, 0, 1),
       TIMESTAMP         = :TIME
 WHERE OBJ# = :OBJN
   AND INTCOL# = :COLN
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4194], [17], [10], [], [], [], [], []
Current SQL statement for this session:
UPDATE SYS.MON_MODS$
   SET INSERTS       = INSERTS + :INS,
       UPDATES       = UPDATES + :UPD,
       DELETES       = DELETES + :DEL,
       FLAGS        =
       (DECODE(BITAND(FLAGS, :FLAG), :FLAG, FLAGS, FLAGS + :FLAG)),
       DROP_SEGMENTS = DROP_SEGMENTS + :DROPSEG,
       TIMESTAMP     = :TIME
 WHERE OBJ# = :OBJN
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [4097], [], [], [], [], [], [], []
Current SQL statement for this session:
INSERT INTO SMON_SCN_TIME
  (THREAD,
   TIME_MP,
   TIME_DP,
   SCN,
   SCN_WRP,
   SCN_BAS,
   NUM_MAPPINGS,
   TIM_SCN_MAP)
VALUES
  (0, :1, :2, :3, :4, :5, :6, :7)

这里主要涉及到对oracle的三张表的操作
COL_USAGE$:主要是在收集统计信息的时候作为是否需要收集列直方图信息参考
MON_MODS$:Oracle主要利用该表来记录那些表的数据发生改变,方便收集统计信息
SMON_SCN_TIME:记录SCN和TIME的对应关系
通过这里的分析可以确定这三张表中的数据对于数据库来说不是致命的基表信息,在数据库运行过程中可以清理掉这些信息,最多就是因为数据库性能的下降或者SCN和TIME互转功能不完善.

解决思路
完整的undo异常处理顺序
1.从alert中可以看出来数据库是在open之后由于SMON回滚到上述几条sql异常导致数据库down,所以可以尝试使用system回滚段启动数据库,看看是否可以屏蔽相关问题
2.如果方法1不可行,那使用event屏蔽smon对回滚段的相关操作,使得数据库正常启动
3.如果由于存在特殊事务,event无法屏蔽,尝试使用隐含参数处理该问题
4.如果隐含参数尚无法解决给问题考虑使用bbed
5.如果bbed不能解决,那只能选择dul或者其类似工具处理
这个案例中我们明确的看到是因为上面的三条sql回滚异常出现问题导致,对于这样的问题,经过测试使用方法1和2都能够顺利解决问题(open库之后需要重建undo,删除有问题undo表空间,修改参数[可能包括event],切换undo表空间).因为遇到几次ORA-607/ORA-600[4194]是因为system rollback损坏导致,所以这次开始也认为是一次比较复杂的恢复,最后证明这次是一种非常常规的恢复.对于ORACLE的数据库恢复有经验可能会比较快的定位问题,但是如果按照固定的思路去想可能会让自己走进死胡同.

处理smon清理临时段导致数据库异常案例

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:处理smon清理临时段导致数据库异常案例

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一个朋友的数据库在经过自己的千辛万苦终于open成功,但是几分钟就down掉,使得他想导出数据重建库的目标不能实现.让我帮忙处理
alert日志报ORA-00600[kafspa:columnBuffer1]

Wed Aug  8 10:55:31 2012
Completed: ALTER DATABASE OPEN
Wed Aug  8 10:55:41 2012
Errors in file /oracle/ora10/admin/ora10g/udump/ora10g_ora_12160.trc:
ORA-00600: internal error code, arguments: [kafspa:columnBuffer1], [28], [22], [], [], [], [], []
Wed Aug  8 10:55:47 2012
Non-fatal internal error happenned while SMON was doing temporary segment drop.
SMON encountered 2 out of maximum 100 non-fatal internal errors.
Wed Aug  8 10:55:47 2012
Errors in file /oracle/ora10/admin/ora10g/bdump/ora10g_smon_8941.trc:
ORA-00600: internal error code, arguments: [kafspa:columnBuffer1], [28], [22], [], [], [], [], []
Wed Aug  8 10:55:58 2012
Non-fatal internal error happenned while SMON was doing temporary segment drop.
SMON encountered 3 out of maximum 100 non-fatal internal errors.
Wed Aug  8 10:55:59 2012
Errors in file /oracle/ora10/admin/ora10g/bdump/ora10g_smon_8941.trc:
ORA-00600: internal error code, arguments: [kafspa:columnBuffer1], [28], [22], [], [], [], [], []
Wed Aug  8 10:56:10 2012
Non-fatal internal error happenned while SMON was doing temporary segment drop.
SMON encountered 4 out of maximum 100 non-fatal internal errors.
Wed Aug  8 10:56:11 2012
Errors in file /oracle/ora10/admin/ora10g/bdump/ora10g_smon_8941.trc:
ORA-00600: internal error code, arguments: [kafspa:columnBuffer1], [28], [22], [], [], [], [], []
Wed Aug  8 10:56:22 2012
Non-fatal internal error happenned while SMON was doing temporary segment drop.
SMON encountered 5 out of maximum 100 non-fatal internal errors.
Wed Aug  8 10:56:32 2012
Errors in file /oracle/ora10/admin/ora10g/bdump/ora10g_smon_8941.trc:
ORA-00600: internal error code, arguments: [kafspa:columnBuffer1], [28], [22], [], [], [], [], []
Wed Aug  8 10:56:43 2012
Non-fatal internal error happenned while SMON was doing temporary segment drop.
SMON encountered 6 out of maximum 100 non-fatal internal errors.
Wed Aug  8 10:56:53 2012
Errors in file /oracle/ora10/admin/ora10g/bdump/ora10g_smon_8941.trc:
ORA-00600: internal error code, arguments: [kafspa:columnBuffer1], [28], [22], [], [], [], [], []
Wed Aug  8 10:57:04 2012
Non-fatal internal error happenned while SMON was doing temporary segment drop.
SMON encountered 7 out of maximum 100 non-fatal internal errors.
Wed Aug  8 10:57:14 2012
Errors in file /oracle/ora10/admin/ora10g/bdump/ora10g_smon_8941.trc:
ORA-00600: internal error code, arguments: [kafspa:columnBuffer1], [28], [22], [], [], [], [], []
Wed Aug  8 10:57:25 2012
Non-fatal internal error happenned while SMON was doing temporary segment drop.
SMON encountered 8 out of maximum 100 non-fatal internal errors.
Wed Aug  8 10:57:35 2012
Errors in file /oracle/ora10/admin/ora10g/bdump/ora10g_smon_8941.trc:
ORA-00600: internal error code, arguments: [kafspa:columnBuffer1], [28], [22], [], [], [], [], []
Wed Aug  8 10:57:38 2012
Errors in file /oracle/ora10/admin/ora10g/bdump/ora10g_pmon_8856.trc:
ORA-00474: SMON process terminated with error
Wed Aug  8 10:57:38 2012
PMON: terminating instance due to error 474
Instance terminated by PMON, pid = 8856

这里出现ORA-00600[kafspa:columnBuffer1],一个未知的错误,但是根据相关的提示,可以大概猜出来是什么原因导致数据库异常

Non-fatal internal error happenned while SMON was doing temporary segment drop.

出现这个错误,使得我们想到一个smon的功能,清理临时段.该数据库down掉很可能和smon清理临时段的过程发生失败有关系

SMON encountered 8 out of maximum 100 non-fatal internal errors.

这个错误提示是因为smon内部最多允许发生100次错误,记录错误发生了8次,当然这次数据库down掉是smon还没有达到100次就直接abort掉

SQL> col name for a32
SQL> col value for a24
SQL> col description for a70
SQL> set linesize 150
SQL> select a.ksppinm name,b.ksppstvl value,a.ksppdesc description
  2    from x$ksppi a,x$ksppcv b
  3   where a.inst_id = USERENV ('Instance')
  4     and b.inst_id = USERENV ('Instance')
  5     and a.indx = b.indx
  6     and upper(a.ksppinm) LIKE upper('%&param%')
  7  order by name
  8
SQL> /
Enter value for param: smon_internal_errlimit
old   6:    and upper(a.ksppinm) LIKE upper('%&param%')
new   6:    and upper(a.ksppinm) LIKE upper('%smon_internal_errlimit%')
NAME                             VALUE                    DESCRIPTION
-------------------------------- ------------------------ -----------------------------------
_smon_internal_errlimit          100                      limit of SMON internal errors

分析trace文件

Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /oracle/ora10/product
System name:	Linux
Node name:	DBN-HLD-155-2
Release:	2.6.18-92.el5PAE
Version:	#1 SMP Tue Apr 29 13:31:02 EDT 2008
Machine:	i686
Instance name: ora10g
Redo thread mounted by this instance: 1
Oracle process number: 8
Unix process pid: 8941, image: oracle@DBN-HLD-155-2 (SMON)
*** SERVICE NAME:() 2012-08-08 10:55:20.208
*** SESSION ID:(274.1) 2012-08-08 10:55:20.208
*** 2012-08-08 10:55:20.208
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [25027], [0], [0], [], [], [], [], []
Current SQL statement for this session:
select file#, block#, ts# from seg$ where type# = 3

果然是smon在查询type#=3的时候发现异常,出现ORA-00600[25027]错误.通过对seg$相关视图分析,可以知道type#=3表示临时段,也就是说数据库smon在查询哪些segment是临时段的时候发生意外,而结合alert日志,完整的错误应该就是:数据库启动后,smon进程为了清理临时段,需要通过select file#, block#, ts# from seg$ where type# = 3查询临时段,但是在查询的时候发生错误.而这个错误累积几次导致数据库异常关闭.

解决问题

--修改两个参数
event='10061 trace name context forever, level 10'
_smon_internal_errlimit=1000000
--启动数据库
SQL> startup pfile='/tmp/pfile'
ORACLE instance started.
Total System Global Area 1577058304 bytes
Fixed Size                  1267716 bytes
Variable Size             385878012 bytes
Database Buffers         1174405120 bytes
Redo Buffers               15507456 bytes
Database mounted.
Database opened.

因为屏蔽了smon回收临时段,数据库未出现开始时错误,观察数据库几个小时,运行正常,到此基本上解决了此次异常,通过exp可以顺利导出数据然后导入到新库中.

导致该异常sql分析

SQL> select file#, block#, ts# from seg$ where type# = 3;
select file#, block#, ts# from seg$ where type# = 3
                               *
ERROR at line 1:
ORA-00600: internal error code, arguments: [25027], [0], [0], [], [], [], [],
[]
SQL> set autot trace exp
SQL> select file#, block#, ts# from seg$ where type# = 3;
Execution Plan
----------------------------------------------------------
Plan hash value: 1605285479
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    10 |   150 |   389   (1)| 00:00:05 |
|*  1 |  TABLE ACCESS FULL| SEG$ |    10 |   150 |   389   (1)| 00:00:05 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("TYPE#"=3)
SQL> select count(rowid) from seg$;
Execution Plan
----------------------------------------------------------
Plan hash value: 763549841
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    12 |   389   (1)| 00:00:05 |
|   1 |  SORT AGGREGATE    |      |     1 |    12 |            |          |
|   2 |   TABLE ACCESS FULL| SEG$ | 45682 |   535K|   389   (1)| 00:00:05 |
---------------------------------------------------------------------------
SQL> select /*+ full(t) */ count(*) from seg$;
Execution Plan
----------------------------------------------------------
Plan hash value: 763549841
-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    29   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| SEG$ |  3223 |    29   (0)| 00:00:01 |
-------------------------------------------------------------------
SQL> select ts# from seg$;
Execution Plan
----------------------------------------------------------
Plan hash value: 1605285479
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  3223 |  9669 |    29   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| SEG$ |  3223 |  9669 |    29   (0)| 00:00:01 |
--------------------------------------------------------------------------
SQL> set autot off
SQL> select count(rowid) from seg$;
SQL> /
COUNT(ROWID)
------------
       45727
SQL> select /*+ full(t) */ count(*) from seg$;
  COUNT(*)
----------
     45727
SQL> select  ts# from seg$;
……
7
7
ORA-00600: internal error code, arguments: [25027], [0], [0], [], [], [], [],
[]
SQL> !dbv file='/oracle/ora10/oradata/ora10g/system01.dbf'
DBVERIFY: Release 10.2.0.4.0 - Production on Thu Aug 9 14:05:09 2012
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /oracle/ora10/oradata/ora10g/system01.dbf
Block Checking: DBA = 4225673, Block Type = KTB-managed data block
data header at 0xb6d8225c
kdbchk: bad row offset slot 6 offs 3030 fseo 3752 dtl 8168 bhs 72
Page 31369 failed with check code 6135
Block Checking: DBA = 4236289, Block Type = KTB-managed data block
data header at 0xb6d7225c
kdbchk: tosp bad (-13399)
Page 41985 failed with check code 6127
DBVERIFY - Verification complete
Total Pages Examined         : 192000
Total Pages Processed (Data) : 47588
Total Pages Failing   (Data) : 2
Total Pages Processed (Index): 40929
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 1784
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 101699
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 1775671440 (0.1775671440)
SQL> ANALYZE TABLE sys.SEG$ VALIDATE STRUCTURE CASCADE;
ANALYZE TABLE sys.SEG$ VALIDATE STRUCTURE CASCADE
*
ERROR at line 1:
ORA-01498: block check failure - see trace file

通过这里的分析大概可以确定是由于两块KTB-managed data block数据块异常,导致直接对seg$进行TABLE ACCESS FULL操作的时候发生异常.因为这个库已经破坏了数据一致性,先导出来数据,至于出现该错误的原因,后续继续关注分析

创建控制文件遭遇ORA-00600[3753]故障解决

联系:手机/微信(+86 17813235971) QQ(107644445)

标题:创建控制文件遭遇ORA-00600[3753]故障解决

作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

一位网友的数据库正常关闭,然后控制文件意外丢失,需要通过trace中的信息重建控制文件,但是在重建的过程中,出现ORA-00600[3753]错误,远程帮忙处理,记录处理过程如下
1.启动数据库至nomount状态,然后尝试noresetlogs模式重建控制文件

SQL>@XFF_NORESETLOGS_CTL.sql
CREATE CONTROLFILE REUSE DATABASE "ORA10G" NORESETLOGS  ARCHIVELOG
*
第 1 行出现错误:
ORA-01503: CREATE CONTROLFILE 失败
ORA-00600: 内部错误代码, 参数: [3753], [3], [2], [], [], [], [], []

2.检查alert日志

Tue Aug 07 20:40:47 2012
WARNING: Default Temporary Tablespace not specified in CREATE DATABASE command
Default Temporary Tablespace will be necessary for a locally managed database in future release
Tue Aug 07 20:40:48 2012
Errors in file d:\oracle\product\10.2.0\db_1\admin\ora10g\udump\ora10g_ora_11596.trc:
ORA-00600: 内部错误代码, 参数: [3753], [3], [2], [], [], [], [], []
Tue Aug 07 20:40:53 2012
Errors in file d:\oracle\product\10.2.0\db_1\admin\ora10g\udump\ora10g_ora_11596.trc:
ORA-00600: 内部错误代码, 参数: [3753], [3], [2], [], [], [], [], []
ORA-1503 signalled during: CREATE CONTROLFILE REUSE DATABASE "ORA10G" NORESETLOGS  ARCHIVELOG

3.分析trace文件

Tue Aug 07 20:40:48 2012
ORACLE V10.2.0.1.0 - Production vsnsta=0
vsnsql=14 vsnxtr=3
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
Windows NT Version V6.1 Service Pack 1
CPU                 : 2 - type 586, 2 Physical Cores
Process Affinity    : 0x00000000
Memory (Avail/Total): Ph:166M/1901M, Ph+PgF:619M/5536M, VA:812M/2047M
Instance name: ora10g
Redo thread mounted by this instance: 0 <none>
Oracle process number: 16
Windows thread id: 11596, image: ORACLE.EXE (SHAD)
*** SERVICE NAME:() 2012-08-07 20:40:48.413
*** SESSION ID:(158.7) 2012-08-07 20:40:48.413
*** 2012-08-07 20:40:48.413
ksedmp: internal or fatal error
ORA-00600: 内部错误代码, 参数: [3753], [3], [2], [], [], [], [], []
Current SQL statement for this session:
CREATE CONTROLFILE REUSE DATABASE "ORA10G" NORESETLOGS  ARCHIVELOG
…………
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
_ksedst+38           CALLrel  _ksedst1+0           0 1
_ksedmp+898          CALLrel  _ksedst+0            0
_ksfdmp+14           CALLrel  _ksedmp+0            3
603A816A             CALLreg  00000000             87CF110 3
603A83FF             CALLrel  603A80D8             87CF110 8191090 EA9 2 8CCC438
_ksesic2+59          CALLrel  _kgesiv+0            87CF110 8191090 EA9 2 8CCC438
                                                   EA9 2 8CCC438
__VInfreq__kctbce+1  CALLrel  _ksesic2+0           EA9 0 3 0 0 2 0
63
_kcfccfl+356         CALLrel  _kctbce+0            543414C 81DB8A8
_cdbdrv+1037         CALLrel  _kcfccfl+0           543414C 1 8CCD060 8CCD04C
                                                   19000 3
_opiexe+11999        CALLrel  _cdbdrv+0            1
_opiosq0+6088        CALLrel  _opiexe+0            4 0 8CCD894
_kpooprx+232         CALLrel  _opiosq0+0           3 E 8CCD9AC A4
_kpoal8+775          CALLrel  _kpooprx+0           8CCF6CC 8196414 A16 1 0 A4
_opiodr+1099         CALLreg  00000000             5E 17 8CCF6C8
60FEFF8D             CALLreg  00000000             5E 17 8CCF6C8 0
_opitsk+1017         CALL???  00000000
_opiino+1087         CALLrel  _opitsk+0            0 0
_opiodr+1099         CALLreg  00000000             3C 4 8CCFC60
_opidrv+819          CALLrel  _opiodr+0            3C 4 8CCFC60 0
_sou2o+45            CALLrel  _opidrv+0            3C 4 8CCFC60
_opimai_real+112     CALLrel  _sou2o+0             8CCFC54 3C 4 8CCFC60
_opimai+92           CALLrel  _opimai_real+0       2 8CCFC8C
_OracleThreadStart@  CALLrel  _opimai+0
4+708
__pRawDllMain+10931  CALLptr  00000000
2903
__pRawDllMain+12925  CALLreg  00000000
4809
__pRawDllMain+12925  CALLrel  __pRawDllMain+12925
4761                          4772
--------------------- Binary Stack Dump ---------------------
    ----------------------------------------
    SO: 4FB3DF5C, type: 4, owner: 4FA4CBFC, flag: INIT/-/-/0x00
    (session) sid: 158 trans: 4EBB8954, creator: 4FA4CBFC, flag: (41) USR/- BSY/-/-/-/-/-
              DID: 0000-0010-0000000A, short-term DID: 0000-0000-00000000
              txn branch: 00000000
              oct: 0, prv: 0, sql: 00000000, psql: 4F707298, user: 0/SYS
    O/S info: user: superv06-PC\superv06, term: SUPERV06-PC, ospid: 7788:11636, machine: WORKGROUP\SUPERV06-PC
              program: sqlplus.exe
    application name: sqlplus.exe, hash value=0
    last wait for 'log file sequential read' blocking sess=0x00000000 seq=31
    wait_time=159 seconds since wait started=0
                log#=0, block#=1, blocks=1
    Dumping Session Wait History
     for 'log file sequential read' count=1 wait_time=159
                log#=0, block#=1, blocks=1
     for 'log file sequential read' count=1 wait_time=502
                log#=0, block#=1, blocks=1
     for 'log file sequential read' count=1 wait_time=163
                log#=0, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=18840
                file#=ffffffff, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=254
                file#=ffffffff, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=7654
                file#=ffffffff, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=150
                file#=ffffffff, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=102
                file#=ffffffff, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=123
                file#=ffffffff, block#=1, blocks=1
     for 'db file sequential read' count=1 wait_time=14010
                file#=ffffffff, block#=1, blocks=1

通过这里我们发现创建控制文件的进程在读取redo log的时候出现了等待比较多而且时间比较长,而对于ORA-00600[3753]错误互联网上没有任何更多的信息.通过对于创建控制文件时候因为使用noresetlogs的分析:这种模式下需要读取redo log,所以导致等待较多,从而出现ORA-00600[3753]错误使得创建控制文件失败.因为本库是shutdown immediate关闭,所以我们完全可以通过resetlogs模式来创建控制文件,从而避免读取redo log.

4.创建resetlogs控制文件

SQL>@XFF_RESETLOGS_CTL.sql
Control file created.

5.然后不完全恢复使用resetlogs open数据库

这次的处理我也没有什么经验可以借鉴,MOS和互联网上没有该错误的任何信息,解决这个问题关键凭的是自己对于noresetlogs和resetlogs的理解.对于数据库原理的理解,对解决一些陌生问题帮助很大;在学习ORACLE过程中注重对原理的理解和消化