非归档异常数据库rman备份

最近在数据库恢复中遇到一个案例:xx单位1.5T oracle 10.2.0.4(redhat 4.5),因为异常关闭操作系统,导致数据库不能启动,需要帮忙恢复。该数据库为非归档模式,使用裸设备,一个裸设备文件大小(35G),数据库文件大小4-30G都有,现在客户要求我们不能对现有环境进行任何操作,需要克隆一份数据库出来,然后在克隆的库上进行数据库恢复操作.数据库环境的克隆最好的方法就是使用rman来完成,但是该数据库为非归档模式,无法直接使用rman进行备份操作.最后采取dd的方式处理(需要注意dd文件大小为block_size*(v$datafile.blocks+1)+v$datafile.offset).因为不能使用rman的一条命令处理,心里一直不舒服,在家里实验,终于还是确定可以通过重建控制文件的方法来欺骗rman是归档模式,来实现rman完成类似工作.
数据库非非归档模式

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> archive log list;
Database log mode              No Archive Mode
Automatic archival             Disabled
Archive destination            /u01/oracle/oradata/ora11g/archivelog
Oldest online log sequence     7
Current log sequence           9

非归档模式尝试rman 备份

RMAN> backup database format '/u01/oracle/oradata/orall1g_%U';
Starting backup at 22-JAN-13
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=10 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 01/22/2013 16:10:49
ORA-19602: cannot backup or copy active file in NOARCHIVELOG mode
continuing other job steps, job failed will not be re-run
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 22-JAN-13
channel ORA_DISK_1: finished piece 1 at 22-JAN-13
piece handle=/u01/oracle/oradata/orall1g_13o02k8a_1_1 tag=TAG20130122T161048 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================

rman的backup or copy命令不能在非归档模式下执行

尝试修改数据库为归档模式

SQL> alter database archivelog;
alter database archivelog
*
ERROR at line 1:
ORA-00265: instance recovery required, cannot set ARCHIVELOG mode

数据库非正常关闭,不能修改归档模式

重建控制文件

SQL> alter database backup controlfile to trace as '/tmp/ctl';
Database altered.
SQL> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
--备份当前控制文件(保留控制文件现场)
SQL> STARTUP NOMOUNT
ORACLE instance started.
Total System Global Area  313860096 bytes
Fixed Size                  1344652 bytes
Variable Size             285215604 bytes
Database Buffers           20971520 bytes
Redo Buffers                6328320 bytes
SQL> CREATE CONTROLFILE REUSE DATABASE "ORA11G" NORESETLOGS  ARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 292
  7  LOGFILE
  8    GROUP 1 '/u01/oracle/oradata/ora11g/redo01.log'  SIZE 50M BLOCKSIZE 512,
  9    GROUP 2 '/u01/oracle/oradata/ora11g/redo02.log'  SIZE 50M BLOCKSIZE 512,
 10    GROUP 3 '/u01/oracle/oradata/ora11g/redo03.log'  SIZE 50M BLOCKSIZE 512
 11  DATAFILE
 12    '/u01/oracle/oradata/ora11g/system01.dbf',
 13    '/u01/oracle/oradata/ora11g/sysaux01.dbf',
 14    '/u01/oracle/oradata/ora11g/users01.dbf',
 15    '/u01/oracle/oradata/ora11g/dbfs01.dbf',
 16    '/u01/oracle/oradata/ora11g/tts_xifenfei02.dbf',
 17    '/u01/oracle/oradata/ora11g/tts_xifenfei01.dbf',
 18    '/u01/oracle/oradata/ora11g/system02.dbf',
 19    '/u01/oracle/oradata/ora11g/czum01.dbf',
 20    '/u01/oracle/oradata/ora11g/undotbs02.dbf'
 21  CHARACTER SET ZHS16GBK
 22  ;
Control file created.

数据库已经变为归档模式

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Disabled
Archive destination            /u01/oracle/oradata/ora11g/archivelog
Oldest online log sequence     7
Next log sequence to archive   7
Current log sequence           9

归档模式尝试rman备份

RMAN> backup datafile 1 format '/u01/oracle/oradata/system01_%U';
Starting backup at 22-JAN-13
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00001 name=/u01/oracle/oradata/ora11g/system01.dbf
channel ORA_DISK_1: starting piece 1 at 22-JAN-13
channel ORA_DISK_1: finished piece 1 at 22-JAN-13
piece handle=/u01/oracle/oradata/system01_02o02kl7_1_1 tag=TAG20130122T161742 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:01:15
Finished backup at 22-JAN-13
RMAN-06497: WARNING: control file is not current, control file AUTOBACKUP skipped

因为现在的控制文件是新创建的,不能算是数据库的当前控制文件,所以未被rman自动备份(很好理解,重建控制文件后,我们做恢复都要使用using backup controlfile命令)

总结说明
1.数据库为非归档模式,不能使用rman的backup和copy命令来备份
2.因为数据库为非正常关闭不能直接修改为归档模式
3.通过重建控制文件修改数据库(注意备份)为归档模式实现rman正常备份
4.当rman备份好之后,使用原先控制文件替换现在控制文件

bbed模拟提交事务二之屏蔽smon回滚事务

在上一篇修改datablock itl(bbed模拟提交事务一之修改itl)的基础之上,本篇实现修改undo segment header中的相关事务槽信息,从而屏蔽数据库在重启或者进程异常的时候,smon的回滚操作,从而比较完美的实现了手工提交数据库事务
update table and uncommit(session 1)

SQL> select distinct
  2  dbms_rowid.rowid_relative_fno(rowid) rel_fno,
  3  dbms_rowid.rowid_block_number(rowid) block_no
  4  from chf.t_xifenfei;
   REL_FNO   BLOCK_NO
---------- ----------
         4         28
SQL> select * from chf.t_xifenfei;
 OBJECT_ID OBJECT_NAME
---------- --------------------
        20 xifenfei.com
        44 xifenfei.com
        28 xifenfei.com
        15 xifenfei.com
        29 xifenfei.com
         3 xifenfei.com
        25 xifenfei.com
        39 xifenfei.com
        51 xifenfei.com
        26 xifenfei.com
        17 xifenfei.com
        13 xifenfei.com
         9 xifenfei.com
        41 xifenfei.com
        48 I_CON1
        38 I_OBJ3
         7 I_TS#
        53 I_CDEF4
        19 IND$
19 rows selected.
SQL> update chf.t_xifenfei set object_name='orasos.com' where rownum<10;
9 rows updated.
SQL> select * from chf.t_xifenfei;
 OBJECT_ID OBJECT_NAME
---------- --------------------
        20 orasos.com
        44 orasos.com
        28 orasos.com
        15 orasos.com
        29 orasos.com
         3 orasos.com
        25 orasos.com
        39 orasos.com
        51 orasos.com
        26 xifenfei.com
        17 xifenfei.com
        13 xifenfei.com
         9 xifenfei.com
        41 xifenfei.com
        48 I_CON1
        38 I_OBJ3
         7 I_TS#
        53 I_CDEF4
        19 IND$
19 rows selected.

dump undo header(session 2)

SQL> alter system flush buffer_cache;
System altered.
SQL> alter system checkpoint;
System altered.
SQL> SELECT XIDUSN,XIDSLOT,XIDSQN,UBAFIL,UBABLK,UBASQN,UBAREC FROM v$transaction;
    XIDUSN    XIDSLOT     XIDSQN     UBAFIL     UBABLK     UBASQN     UBAREC
---------- ---------- ---------- ---------- ---------- ---------- ----------
         1         13        367          2       1126        362         32
SQL>  alter system dump undo header "_SYSSMU1$";
System altered.
index  state cflags  wrap#    uel         scn            dba     parent-xid          nub       stmt_num       cmt
------------------------------------------------------------------------------------------------------------------
   …………
0x0c    9    0x00  0x016e  0x0029  0x0b2c.c02d1f6a  0x00800464  0x0000.000.00000000  0x00000001   0x00000000  1358813163
0x0d   10    0x80  0x016f  0x0002  0x0b2c.c02d7b15  0x00800466  0x0000.000.00000000  0x00000001   0x00000000  0
0x0e    9    0x00  0x016f  0x000f  0x0b2c.c02d2ae2  0x00800466  0x0000.000.00000000  0x00000001   0x00000000  1358820065
   …………

通过结合dump undo header 中的TRN TBL的state为10的为active事务,然后结合scn/dba等信息,
来确定是哪条记录是需要我们修改.然后通过find命令快速定位到0x0d这条记录,然后进行修改

通过结合bbed的dump命令得出16进制数据分析得出如下结论

--index 0x0c
6e01        0000 64048000      6a1f2dc0 2c0b0000      09       00          2900
0000000000000000            00000000          01000000     ebd7fd50(1358813163注意存储顺序)
--index 0x0d
6f01  wrap# 0000 66048000 dba  157b2dc0 2c0b0000 scn  0a state 80  cflags  0200 uel
0000000000000000 parent-xid 00000000 stmt_num 01000000 nub 00000000 cmt
--index 0x0e
6f01        0000 66048000      e22a2dc0 2c0b0000      09       00          0f00
0000000000000000            00000000          01000000     e1f2fd50(1358820065)

bbed modify undo segment header(session 2)

BBED> f /x 0a80
 File: /u01/oracle/oradata/XFF/undotbs01.dbf (0)
 Block: 9                Offsets: 6736 to 6751           Dba:0x00000000
------------------------------------------------------------------------
 0a800200 00000000 00000000 00000000
 <32 bytes per line>
BBED> m /x 0900
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/XFF/undotbs01.dbf (0)
 Block: 9                Offsets: 6736 to 6751           Dba:0x00000000
------------------------------------------------------------------------
 09000200 00000000 00000000 00000000
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 9:
current = 0xecdd, required = 0xecdd

bbed modify data block itl(session 2)

struct ktbbhitl[0], 24 bytes             @44
      struct ktbitxid, 8 bytes              @44
         ub2 kxidusn                        @44       0x0001
         ub2 kxidslt                        @46       0x000d
         ub4 kxidsqn                        @48       0x0000016f
      struct ktbituba, 8 bytes              @52
         ub4 kubadba                        @52       0x00800466
         ub2 kubaseq                        @56       0x016a
         ub1 kubarec                        @58       0x20
      ub2 ktbitflg                          @60       0x0009 (NONE)
      union _ktbitun, 2 bytes               @62
         b2 _ktbitfsc                       @62       18
         ub2 _ktbitwrp                      @62       0x0012
      ub4 ktbitbas                          @64       0x00000000
BBED> m /x 0080 offset 60
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/XFF/users01.dbf (0)
 Block: 28               Offsets:   60 to  571           Dba:0x00000000
------------------------------------------------------------------------
 00801200 00000000 09002a00 36020000
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 28:
current = 0xccf1, required = 0xccf1

restart db and select table(session 3)

SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area  306184192 bytes
Fixed Size                  1267164 bytes
Variable Size             109054500 bytes
Database Buffers          188743680 bytes
Redo Buffers                7118848 bytes
Database mounted.
Database opened.
SQL> col object_name for a20
SQL> set pages 100
SQL> select * from chf.t_xifenfei;
 OBJECT_ID OBJECT_NAME
---------- --------------------
        20 orasos.com
        44 orasos.com
        28 orasos.com
        15 orasos.com
        29 orasos.com
         3 orasos.com
        25 orasos.com
        39 orasos.com
        51 orasos.com
        26 xifenfei.com
        17 xifenfei.com
        13 xifenfei.com
         9 xifenfei.com
        41 xifenfei.com
        48 I_CON1
        38 I_OBJ3
         7 I_TS#
        53 I_CDEF4
        19 IND$
19 rows selected.

到此证明,通过修改undo segment header中的state和cflags实现数据库启动不回滚未提交事务;通过修改datablock itl实现数据库在访问未提交数据块时候不访问undo。从而整体上较完美的实现了手工提交一个事务(数据库提交一个事务涉及的方方面面较为复杂,这里只是通过修改最核心的两部分来大致模拟提交事务)

bbed模拟提交事务一之修改itl

我们都知道,根据oracle 事务的一致性,当我们在session 1中进行dml操作,如果未提交在其他会话中是无法看到修改后的值(只能看到修改前的值).这里通过bbed模拟部分提交事务从而实现在其他会话中查看到另外会话未提交事务(本质已经部分模拟提交,还有undo segment header中信息未清理,下篇补充)
create table(session 1)

SQL> create table chf.t_xifenfei
  2  as
  3  select object_id,object_name from dba_objects where rownum<20;
Table created.
SQL> select rowid,
  2  dbms_rowid.rowid_relative_fno(rowid) rel_fno,
  3  dbms_rowid.rowid_block_number(rowid) block_no
  4  from chf.t_xifenfei;
ROWID                 REL_FNO   BLOCK_NO
------------------ ---------- ----------
AAANL3AAEAAAAAcAAA          4         28
AAANL3AAEAAAAAcAAB          4         28
AAANL3AAEAAAAAcAAC          4         28
AAANL3AAEAAAAAcAAD          4         28
AAANL3AAEAAAAAcAAE          4         28
AAANL3AAEAAAAAcAAF          4         28
AAANL3AAEAAAAAcAAG          4         28
AAANL3AAEAAAAAcAAH          4         28
AAANL3AAEAAAAAcAAI          4         28
AAANL3AAEAAAAAcAAJ          4         28
AAANL3AAEAAAAAcAAK          4         28
AAANL3AAEAAAAAcAAL          4         28
AAANL3AAEAAAAAcAAM          4         28
AAANL3AAEAAAAAcAAN          4         28
AAANL3AAEAAAAAcAAO          4         28
AAANL3AAEAAAAAcAAP          4         28
AAANL3AAEAAAAAcAAQ          4         28
AAANL3AAEAAAAAcAAR          4         28
AAANL3AAEAAAAAcAAS          4         28
19 rows selected.
SQL>  select * from chf.t_xifenfei;
 OBJECT_ID OBJECT_NAME
---------- --------------------
        20 ICOL$
        44 I_USER1
        28 CON$
        15 UNDO$
        29 C_COBJ#
         3 I_OBJ#
        25 PROXY_ROLE_DATA$
        39 I_IND1
        51 I_CDEF2
        26 I_PROXY_ROLE_DATA$_1
        17 FILE$
        13 UET$
         9 I_FILE#_BLOCK#
        41 I_FILE1
        48 I_CON1
        38 I_OBJ3
         7 I_TS#
        53 I_CDEF4
        19 IND$
19 rows selected.

dump block(session 2)

SQL> alter system checkpoint;
System altered.
SQL> alter system dump datafile 4 block 28;
System altered.
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0xffff.000.00000000  0x00000000.0000.00  C---    0  scn 0x0b2c.c02d1987
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x03   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000

update record(session 1)

SQL> update chf.t_xifenfei set object_name ='www.xifenfei.com' where rownum<15;
14 rows updated.
SQL> col object_name for a20
SQL> select * from chf.t_xifenfei;
 OBJECT_ID OBJECT_NAME
---------- --------------------
        20 www.xifenfei.com
        44 www.xifenfei.com
        28 www.xifenfei.com
        15 www.xifenfei.com
        29 www.xifenfei.com
         3 www.xifenfei.com
        25 www.xifenfei.com
        39 www.xifenfei.com
        51 www.xifenfei.com
        26 www.xifenfei.com
        17 www.xifenfei.com
        13 www.xifenfei.com
         9 www.xifenfei.com
        41 www.xifenfei.com
        48 I_CON1
        38 I_OBJ3
         7 I_TS#
        53 I_CDEF4
        19 IND$
19 rows selected.

dump block(session 2)

SQL> alter system checkpoint;
System altered.
--注意flush buffer_cache(不然后面bbed修改会被不能通过select显示,而且会被覆盖)
SQL> alter system flush buffer_cache;
System altered.
SQL>  alter system flush shared_pool;
System altered.
SQL> alter system dump datafile 4 block 28;
System altered.
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0xffff.000.00000000  0x00000000.0000.00  C---    0  scn 0x0b2c.c02d1987
0x02   0x000a.00a.0000017e  0x0081ffc7.01a2.22  ----   14  fsc 0x0000.00000000  <--注意Lck 14
0x03   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000

bbed commit Transaction(session 3)

BBED> p ktbbh
struct ktbbh, 96 bytes                      @20
   ub1 ktbbhtyp                             @20       0x01 (KDDBTDATA)
   union ktbbhsid, 4 bytes                  @24
      ub4 ktbbhsg1                          @24       0x0000d2f7
      ub4 ktbbhod1                          @24       0x0000d2f7
   struct ktbbhcsc, 8 bytes                 @28
      ub4 kscnbas                           @28       0xc02d1aec
      ub2 kscnwrp                           @32       0x0b2c
   b2 ktbbhict                              @36       3
   ub1 ktbbhflg                             @38       0x32 (NONE)
   ub1 ktbbhfsl                             @39       0x00
   ub4 ktbbhfnx                             @40       0x01000019
   struct ktbbhitl[0], 24 bytes             @44
      struct ktbitxid, 8 bytes              @44
         ub2 kxidusn                        @44       0xffff
         ub2 kxidslt                        @46       0x0000
         ub4 kxidsqn                        @48       0x00000000
      struct ktbituba, 8 bytes              @52
         ub4 kubadba                        @52       0x00000000
         ub2 kubaseq                        @56       0x0000
         ub1 kubarec                        @58       0x00
      ub2 ktbitflg                          @60       0x8000 (KTBFCOM)
      union _ktbitun, 2 bytes               @62
         b2 _ktbitfsc                       @62       2860
         ub2 _ktbitwrp                      @62       0x0b2c
      ub4 ktbitbas                          @64       0xc02d1987
   struct ktbbhitl[1], 24 bytes             @68
      struct ktbitxid, 8 bytes              @68
         ub2 kxidusn                        @68       0x000a
         ub2 kxidslt                        @70       0x000a
         ub4 kxidsqn                        @72       0x0000017e
      struct ktbituba, 8 bytes              @76
         ub4 kubadba                        @76       0x0081ffc7
         ub2 kubaseq                        @80       0x01a2
         ub1 kubarec                        @82       0x22
      ub2 ktbitflg                          @84       0x000e (NONE) <--修改要改为8000
      union _ktbitun, 2 bytes               @86
         b2 _ktbitfsc                       @86       0
         ub2 _ktbitwrp                      @86       0x0000
      ub4 ktbitbas                          @88       0x00000000
   struct ktbbhitl[2], 24 bytes             @92
      struct ktbitxid, 8 bytes              @92
         ub2 kxidusn                        @92       0x0000
         ub2 kxidslt                        @94       0x0000
         ub4 kxidsqn                        @96       0x00000000
      struct ktbituba, 8 bytes              @100
         ub4 kubadba                        @100      0x00000000
         ub2 kubaseq                        @104      0x0000
         ub1 kubarec                        @106      0x00
      ub2 ktbitflg                          @108      0x0000 (NONE)
      union _ktbitun, 2 bytes               @110
         b2 _ktbitfsc                       @110      0
         ub2 _ktbitwrp                      @110      0x0000
      ub4 ktbitbas                          @112      0x00000000
BBED> m /x 0080 offset 84
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/XFF/users01.dbf (0)
 Block: 28               Offsets:   84 to  115           Dba:0x00000000
------------------------------------------------------------------------
 00800000 00000000 00000000 00000000
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 28:
current = 0x03dc, required = 0x03dc
BBED> p ktbbh
struct ktbbh, 96 bytes                      @20
  …………
   struct ktbbhitl[1], 24 bytes             @68
      struct ktbitxid, 8 bytes              @68
         ub2 kxidusn                        @68       0x000a
         ub2 kxidslt                        @70       0x000a
         ub4 kxidsqn                        @72       0x0000017e
      struct ktbituba, 8 bytes              @76
         ub4 kubadba                        @76       0x0081ffc7
         ub2 kubaseq                        @80       0x01a2
         ub1 kubarec                        @82       0x22
      ub2 ktbitflg                          @84       0x8000 (KTBFCOM) <--修改值
      union _ktbitun, 2 bytes               @86
         b2 _ktbitfsc                       @86       0
         ub2 _ktbitwrp                      @86       0x0000
      ub4 ktbitbas                          @88       0x00000000
  …………

dump block(session 2)

SQL> alter system dump datafile 4 block 28;
System altered.
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0xffff.000.00000000  0x00000000.0000.00  C---    0  scn 0x0b2c.c02d1987
0x02   0x000a.00a.0000017e  0x0081ffc7.01a2.22  C---    0  scn 0x0000.00000000
0x03   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000

select data(session 4)

SQL> select object_id,object_name from chf.t_xifenfei;
 OBJECT_ID OBJECT_NAME
---------- --------------------
        20 www.xifenfei.com
        44 www.xifenfei.com
        28 www.xifenfei.com
        15 www.xifenfei.com
        29 www.xifenfei.com
         3 www.xifenfei.com
        25 www.xifenfei.com
        39 www.xifenfei.com
        51 www.xifenfei.com
        26 www.xifenfei.com
        17 www.xifenfei.com
        13 www.xifenfei.com
         9 www.xifenfei.com
        41 www.xifenfei.com
        48 I_CON1
        38 I_OBJ3
         7 I_TS#
        53 I_CDEF4
        19 IND$
19 rows selected.

这里可以看到,已经模拟出来在其他session中可以访问数据库为commit的记录(在该block级别已经模拟了commit)

undo异常事务回滚规则分析

undo事务具体是如何回滚,这里提供了大概的异常undo事务回滚的一个过程(更加准确的说,这个过程是在以下几种情况中发生的过程:1.数据库非正常关闭后启动,2.事务未提交会话终止),数据库先扫描所有回滚段,然后发现有事务未提交回滚段,然后根据这个回滚段定位到undo block,然后定位到data block,当一个undo block回滚完成之后,利用undo的链表规则完成下一个undo block的回滚操作,依次类此,从而实现数据库的回滚操作;回滚的过程是先回滚后操作的块(先进后出原则)
创建测试表

SQL> create table chf.t_xifenfei(a varchar2(4000));
Table created.
SQL> insert into chf.t_xifenfei values (lpad('www.xifenfei.com',4000,'a'));
1 row created.
SQL> insert into chf.t_xifenfei values (lpad('www.xifenfei.com',4000,'b'));
1 row created.
SQL> commit;
Commit complete.
SQL> update chf.t_xifenfei set a=lpad('www.xifenfei.com',4000,'F');
2 rows updated.
SQL> select
  2  dbms_rowid.rowid_relative_fno(rowid) rel_fno,
  3  dbms_rowid.rowid_block_number(rowid) block_no
  4  from chf.t_xifenfei;
   REL_FNO   BLOCK_NO
---------- ----------
         9        421
         9        422

确保表中有两条记录,存储在两个block中

查询测试表相关信息

SQL> select object_id,data_object_id from dba_objects where owner='CHF' AND OBJECT_NAME='T_XIFENFEI';
 OBJECT_ID DATA_OBJECT_ID
---------- --------------
     53917          53917
SQL> SELECT TO_CHAR(53917,'XXXXX') FROM DUAL;
TO_CHA
------
  D29D
SQL> SELECT XIDUSN,XIDSLOT,XIDSQN,UBAFIL,UBABLK,UBASQN,UBAREC FROM v$transaction;
    XIDUSN    XIDSLOT     XIDSQN     UBAFIL     UBABLK     UBASQN     UBAREC
---------- ---------- ---------- ---------- ---------- ---------- ----------
         2         28        513          2       3627        253          1
SQL> select to_char(28,'xxx') from dual;
TO_C
----
  1c
SQL> select to_char(513,'xxx') from dual;
TO_C
----
 201
------------------------
--xid=0002.01c.00000201
------------------------
SQL> variable dba varchar2(30)
SQL> exec :dba := dbms_utility.make_data_block_address(2,3627);
PL/SQL procedure successfully completed.
SQL> print dba
DBA
--------------------------------
8392235
SQL> select to_char(253,'xxx') from dual;
TO_C
----
  fd
---------------------
--uda=800e2a.00fd.01
---------------------

通过这些查询可以得知:
1)chf.t_xifenfei的object_id/data_object_id的值为53917/D29D
2)xid=0002.01c.00000201(Xid=usn.slot.wrap)
3)uda=800e2a.00fd.01(Uba=undo block dba.undo record number.undo block sequence number)

dump rollback header

SQL> select * from v$rollname where usn=2;
       USN NAME
---------- ------------------------------
         2 _SYSSMU2$
SQL>  alter system dump undo header "_SYSSMU2$";
System altered.
 index  state cflags  wrap#    uel         scn            dba            parent-xid    nub     stmt_num    cmt
  ------------------------------------------------------------------------------------------------
0x00    9    0x00  0x0201  0x0012  0x0b2c.c02c9b85  0x00800e27  0x0000.000.00000000  0x00000001   0x00000000  1358780575
   ………………
0x1b    9    0x00  0x0201  0x0028  0x0b2c.c02c9bf4  0x00800e27  0x0000.000.00000000  0x00000001   0x00000000  1358780575
0x1c   10    0x80  0x0201  0x0002  0x0b2c.c02ca2a8  0x00800e2b  0x0000.000.00000000  0x00000002   0x00000000  0
0x1d    9    0x00  0x0200  0x0004  0x0b2c.c02c9a8b  0x00800e26  0x0000.000.00000000  0x00000001   0x00000000  1358780444
   ………………
0x2f    9    0x00  0x0201  0x001a  0x0b2c.c02ca1da  0x00800e29  0x0000.000.00000000  0x00000001   0x00000000  1358784176
------------------------------------------------------------------------------
==> Transaction Table
==>  state      0 = IDLE
                1 = Collecting       2 = Prepared            3 = Committed
                4 = Forced Abort     5 = Forced Commit       6 = Forced Mixed
                7 = try again later
                9 = No TX (Committed)    10= 'a' = Active local TX
==>  cflags     1 = TX has started storing collecting information
                2 = TX has forced the collecting information
                4 = Prepared TX needs distributed recovery
                10= Rollback failed on this TX - mark SMON for recover
                20= TX has rolled back its updates
==>  wrap#      is incremented TX slot reuse.
==>  uel        ?
==>  scn        SCN for the TX prepare / commit
==>  dba        is DBA of HEAD of the REDO - Ie: The MOST RECENT CHANG
------------------------------------------------------------------------------
--通过上面的提示可以知道index为0x1c为未提交事务
SQL> select to_number('800e2b','xxxxxxx') from dual;
TO_NUMBER('800E2B','XXXXXXX')
-----------------------------
                      8392235
SQL> select dbms_utility.data_block_address_block(8392235) "block",
  2  dbms_utility.data_block_address_file(8392235)  "file" from dual;
     block       file
---------- ----------
      3627          2
--v$transaction表中查询出来的UBAFIL与UBABLK一致

dump undo block

SQL> alter system dump datafile 2 block 3627;
System altered.
UNDO BLK:
xid: 0x0002.01c.00000201  seq: 0xfd  cnt: 0x1   irb: 0x1   icl: 0x0   flg: 0x0000
 Rec Offset      Rec Offset      Rec Offset      Rec Offset      Rec Offset
---------------------------------------------------------------------------
0x01 0x0fdc
*-----------------------------
* Rec #0x1  slt: 0x1c  objn: 53917(0x0000d29d)  objd: 53917  tblspc: 9(0x00000009)
*       Layer:  11 (Row)   opc: 1   rci 0x00  <--- 表明指向下一个block
Undo type:  Regular undo   Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
rdba: 0x00800e2a   <--- 为下一个undo block rdba
*-----------------------------
KDO undo record:
KTB Redo
op: 0x03  ver: 0x01
op: Z
Array Update of 1 rows:
tabn: 0 slot: 0(0x0) flag: 0x2c lock: 0 ckix: 71
ncol: 1 nnew: 1 size: 0
KDO Op code:  21 row dependencies Disabled
  xtype: XAxtype KDO_KDOM2 flags: 0x00000080  bdba: 0x024001a6  hdba: 0x024001a3
itli: 2  ispac: 0  maxfr: 4858
--bdba表示undo对应的data block,这里对应的是datafile 9 block 422
--hdba表示chf.t_xifenfei的extent 0的 first block(first mapbit)
SQL> select to_number('800e2a','xxxxxxx') from dual;
TO_NUMBER('800E2A','XXXXXXX')
-----------------------------
                      8392234
SQL> select dbms_utility.data_block_address_block(8392234) "block",
  2  dbms_utility.data_block_address_file(8392234)  "file" from dual;
     block       file
---------- ----------
      3626          2
SQL> alter system dump datafile 2 block 3626;
System altered.
UNDO BLK:
xid: 0x0002.01c.00000201  seq: 0xfd  cnt: 0x1   irb: 0x1   icl: 0x0   flg: 0x0000
 Rec Offset      Rec Offset      Rec Offset      Rec Offset      Rec Offset
---------------------------------------------------------------------------
0x01 0x0fac
*-----------------------------
* Rec #0x1  slt: 0x1c  objn: 53917(0x0000d29d)  objd: 53917  tblspc: 9(0x00000009)
*       Layer:  11 (Row)   opc: 1   rci 0x00   <--- 表明指向下一个block
Undo type:  Regular undo    Begin trans    Last buffer split:  No
Temp Object:  No
Tablespace Undo:  No
rdba: 0x00000000  <--- 下一个block为 0x00000000,表示事务到此为止
*-----------------------------
uba: 0x00800e29.00fd.24 ctl max scn: 0x0b2c.c02c99cc prv tx scn: 0x0b2c.c02c9a27
txn start scn: scn: 0x0b2c.c02ca0b3 logon user: 0
 prev brb: 8392230 prev bcl: 0
KDO undo record:
KTB Redo
op: 0x03  ver: 0x01
op: Z
Array Update of 1 rows:
tabn: 0 slot: 0(0x0) flag: 0x2c lock: 0 ckix: 71
ncol: 1 nnew: 1 size: 0
KDO Op code:  21 row dependencies Disabled
  xtype: XAxtype KDO_KDOM2 flags: 0x00000080  bdba: 0x024001a5  hdba: 0x024001a3
itli: 2  ispac: 0  maxfr: 4858
--bdba表示undo对应的data block,这里对应的是datafile 9 block 421

XID: this is the transaction ID, which matches the value made up previously from the query on V$TRANSACTION.
CNT: this is the number of undo records in this block.
IRB: this is the index if the first record to be considered in case of a rollback.
OBJN: this is the object number being altered by the undo record. This matches the value previously queried from DBA_OBJECTS.
RCI: this is the next undo record in the chain to be examined. When rolling back transactions, these undo records are applied one by one on the data blocks to undo the changes.
1. undo链表的意思为:从IRB表示undo 回滚的开始,RCI表示未IRB对应的下一条记录,依次类推。当RCI为0,而且rdba不为0的时候,表示undo 记录关联到下一个block;当RCI为0,而且rdba也为0表示undo结束
2. undo事务回滚是倒序的,这里可以看到事务先读取block 422,再读取block 421

dump data block

SQL> alter system dump datafile 9 block 421;
System altered.
Block header dump:  0x024001a5
 Object id on Block? Y
 seg/obj: 0xd29d  csc: 0xb2c.c02ca2bc  itc: 2  flg: E  typ: 1 - DATA
     brn: 0  bdba: 0x24001a1 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0009.01d.00000232  0x0083dbe0.02be.22  C---    0  scn 0x0b2c.c02ca2a8
0x02   0x0002.01c.00000201  0x00800e2a.00fd.01  ----    1  fsc 0x0000.00000000
SQL> alter system dump datafile 9 block 422;
System altered.
Block header dump:  0x024001a6
 Object id on Block? Y
 seg/obj: 0xd29d  csc: 0xb2c.c02ca2bd  itc: 2  flg: E  typ: 1 - DATA
     brn: 0  bdba: 0x24001a1 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0009.01d.00000232  0x0083dbe0.02be.23  C---    0  scn 0x0b2c.c02ca2a8
0x02   0x0002.01c.00000201  0x00800e2b.00fd.01  ----    1  fsc 0x0000.00000000

这里可以看到Xid,Uba和上面计算出来的一致,看出来事务未提交标记和上述一致.

关于blockrecover 解决坏块相关测试与总结

悲剧的客户因为IBM p系列小机更换电源导致主机直接掉电,起来后发现数据库出现不少坏块,而且还有部分坏块中含有回滚事务,导致alert日志一直报smon回滚遇到坏块错误,该数据库版本是9.2.0.8 RAC,根据客户的备份情况,为了减少对业务的影响,决定使用blockrecover对其处理.这里通过10g数据库大概模拟出现含事务坏块的情况以及处理过程,重现了我们在处理的时候不确定的一些知识.
创建测试表

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL> conn /as sysdba
Connected.
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/XFF/archivelog
Oldest online log sequence     231
Next log sequence to archive   233
Current log sequence           233
SQL> conn chf/xifenfei
Connected.
SQL> create table t_xifenfei
  2  as
  3  select * from dba_objects where rownum<10;
Table created.
SQL> select rowid,
  2  dbms_rowid.rowid_relative_fno(rowid) rel_fno,
  3  dbms_rowid.rowid_block_number(rowid) block
  4  from chf.t_xifenfei;
ROWID                 REL_FNO      BLOCK
------------------ ---------- ----------
AAANIqAAEAAAAAcAAA          4         28
AAANIqAAEAAAAAcAAB          4         28
AAANIqAAEAAAAAcAAC          4         28
AAANIqAAEAAAAAcAAD          4         28
AAANIqAAEAAAAAcAAE          4         28
AAANIqAAEAAAAAcAAF          4         28
AAANIqAAEAAAAAcAAG          4         28
AAANIqAAEAAAAAcAAH          4         28
AAANIqAAEAAAAAcAAI          4         28
9 rows selected.

当前的seq是233(也就是说我在233归档上创建了t_xienfei表)

dbv检查block

[oracle@xifenfei ~]$ dbv file='/u01/oracle/oradata/XFF/users01.dbf'
DBVERIFY: Release 10.2.0.4.0 - Production on Sun Jan 20 22:16:16 2013
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/XFF/users01.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 1280
Total Pages Processed (Data) : 904
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 38
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 44
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 294
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 3224018224 (2860.3224018224)

证明无任何坏块

切换归档

SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/XFF/archivelog
Oldest online log sequence     233
Next log sequence to archive   235
Current log sequence           235

现在已经切换seq到235

rman备份我们需要测试block(file 4 block 28)对应的数据文件

RMAN>  backup datafile 4 format '/u01/oracle/oradata/xff_4.rman';

具体见:rman制造坏块,bbed修复坏块

模拟数据库进行其他操作

SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/XFF/archivelog
Oldest online log sequence     235
Next log sequence to archive   237
Current log sequence           237
SQL> conn chf/xifenfei
Connected.
SQL> create table t_xff
  2  as
  3  select * from dba_objects;
Table created.
SQL> alter system switch logfile;
System altered.

主要是为了模拟对其他block操作,对于block 28的恢复影响

对block 28进行操作

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/XFF/archivelog
Oldest online log sequence     237
Next log sequence to archive   239
Current log sequence           239
SQL> update chf.t_xifenfei set object_name='www.xifenfei.com';
9 rows updated.
SQL> commit;
Commit complete.
SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> /
System altered.

在seq为239的时候对block 28进行了一次update操作

模拟其他业务操作

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/XFF/archivelog
Oldest online log sequence     240
Next log sequence to archive   242
Current log sequence           242
SQL> delete from chf.t_xff ;
50491 rows deleted.
SQL> commit;
Commit complete.
SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/XFF/archivelog
Oldest online log sequence     242
Next log sequence to archive   244
Current log sequence           244
SQL>  alter system switch logfile;
System altered.
SQL> /
System altered.

这里可以知道在seq为246的时候做了备份归档操作

备份归档操作

RMAN> backup archivelog all format '/u01/oracle/oradata/xff_arch_%U' delete input;

模拟继续操作

SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/XFF/archivelog
Oldest online log sequence     248
Next log sequence to archive   250
Current log sequence           250
SQL> update chf.t_xifenfei set object_name='www.orasos.com' where rownum<5;
4 rows updated.

这里可以发现,在seq为250的时候我们再次对block 28进行了操作

使用rman制造坏块

RMAN> BLOCKRECOVER DATAFILE 4 block 28 clear;

dbv检查坏块

[oracle@xifenfei ~]$ dbv file='/u01/oracle/oradata/XFF/users01.dbf'
DBVERIFY: Release 10.2.0.4.0 - Production on Sun Jan 20 23:01:24 2013
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/XFF/users01.dbf
Page 28 is influx - most likely media corrupt
Corrupt block relative dba: 0x0100001c (file 4, block 28)
Fractured block found during dbv:
Data in bad block:
 type: 6 format: 2 rdba: 0x0100001c
 last change scn: 0x0b2c.c02ab081 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xb0c4a6ea
 check value in block header: 0x393f
 computed block checksum: 0xc917
DBVERIFY - Verification complete
Total Pages Examined         : 1280
Total Pages Processed (Data) : 903
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 38
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 47
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 291
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 1
Highest block SCN            : 3224022228 (2860.3224022228)

强制kill数据库

[oracle@xifenfei ~]$ ps -ef|grep pmon
oracle    9744  9638  0 23:03 pts/1    00:00:00 grep pmon
oracle   32156     1  0 14:17 ?        00:00:10 ora_pmon_XFF
[oracle@xifenfei ~]$ kill -9 32156
[oracle@xifenfei ~]$ ps -ef|grep pmon
oracle    9751  9638  0 23:03 pts/1    00:00:00 grep pmon

为了模拟含事务的block出现坏块

启动数据库

SQL> startup mount
ORACLE instance started.
Total System Global Area  306184192 bytes
Fixed Size                  1267164 bytes
Variable Size             109054500 bytes
Database Buffers          188743680 bytes
Redo Buffers                7118848 bytes
Database mounted.
SQL> ALTER DATABASE OPEN;
Database altered.

数据库启动正常

查询坏块

SQL> select count(*) from chf.t_xifenfei;
select count(*) from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 28)
ORA-01110: data file 4: '/u01/oracle/oradata/XFF/users01.dbf'

alert日志

Sun Jan 20 23:04:37 2013
SMON: enabling tx recovery
Sun Jan 20 23:04:37 2013
Database Characterset is ZHS16GBK
Sun Jan 20 23:04:37 2013
Hex dump of (file 4, block 28) in trace file /u01/oracle/admin/XFF/bdump/xff_smon_9775.trc
Corrupt block relative dba: 0x0100001c (file 4, block 28)
Fractured block found during buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x0100001c
 last change scn: 0x0b2c.c02ab081 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0xb0c4a6ea
 check value in block header: 0x393f
 computed block checksum: 0xc917
Reread of rdba: 0x0100001c (file 4, block 28) found same corrupted data
ORACLE Instance XFF (pid = 8) - Error 1578 encountered while recovering transaction (9, 37) on object 53802.
Sun Jan 20 23:04:38 2013
Errors in file /u01/oracle/admin/XFF/bdump/xff_smon_9775.trc:
ORA-01578: ORACLE data block corrupted (file # 4, block # 28)
ORA-01110: data file 4: '/u01/oracle/oradata/XFF/users01.dbf'

通过试验步骤和alert日志可以发现因为block有事务,但是被标记为了坏块,所以smon无法回滚该事务,从而出现alert中类似提示

继续切换归档

SQL> alter system switch logfile;
System altered.
SQL> /
System altered.
SQL> /
System altered.
SQL> /
System altered.
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oracle/oradata/XFF/archivelog
Oldest online log sequence     252
Next log sequence to archive   254
Current log sequence           254

移走归档

[oracle@xifenfei archivelog]$ ls -l
total 2224
-rw-r----- 1 oracle oinstall  360960 Jan 20 22:59 1_247_792679299.dbf
-rw-r----- 1 oracle oinstall    1024 Jan 20 22:59 1_248_792679299.dbf
-rw-r----- 1 oracle oinstall 1630208 Jan 20 23:04 1_249_792679299.dbf
-rw-r----- 1 oracle oinstall  249344 Jan 20 23:09 1_250_792679299.dbf
-rw-r----- 1 oracle oinstall    1024 Jan 20 23:09 1_251_792679299.dbf
-rw-r----- 1 oracle oinstall    4608 Jan 20 23:09 1_252_792679299.dbf
-rw-r----- 1 oracle oinstall    1024 Jan 20 23:09 1_253_792679299.dbf
[oracle@xifenfei archivelog]$ mkdir bak
[oracle@xifenfei archivelog]$ mv *.dbf bak
[oracle@xifenfei archivelog]$ ll
total 4
drwxr-xr-x 2 oracle oinstall 4096 Jan 20 23:11 bak

为了重现,当我们使用blockrecover恢复的时候,如果缺少归档会怎么样

尝试blockrecover

RMAN> BLOCKRECOVER DATAFILE 4 block 28;
Starting blockrecover at 20-JAN-13
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=135 devtype=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: sid=126 devtype=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: sid=125 devtype=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: sid=124 devtype=DISK
channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00004
channel ORA_DISK_1: reading from backup piece /u01/oracle/oradata/xff_4.rman
channel ORA_DISK_1: restored block(s) from backup piece 1
piece handle=/u01/oracle/oradata/xff_4.rman tag=TAG20130120T222333
channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01
starting media recovery
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of blockrecover command at 01/20/2013 23:11:41
RMAN-06053: unable to perform media recovery because of missing log
RMAN-06025: no backup of log thread 1 seq 253 lowscn 12286830510311 found to restore
RMAN-06025: no backup of log thread 1 seq 252 lowscn 12286830510307 found to restore
RMAN-06025: no backup of log thread 1 seq 251 lowscn 12286830510305 found to restore
RMAN-06025: no backup of log thread 1 seq 250 lowscn 12286830509979 found to restore
RMAN-06025: no backup of log thread 1 seq 249 lowscn 12286830489543 found to restore
RMAN-06025: no backup of log thread 1 seq 248 lowscn 12286830489541 found to restore
RMAN-06025: no backup of log thread 1 seq 247 lowscn 12286830489279 found to restore

alert日志记录

Sun Jan 20 23:11:38 2013
alter database recover datafile list clear
Sun Jan 20 23:11:38 2013
Completed: alter database recover datafile list clear
Sun Jan 20 23:11:38 2013
Starting block media recovery
Sun Jan 20 23:11:39 2013
Media Recovery Log /u01/oracle/oradata/XFF/archivelog/1_235_792679299.dbf
Sun Jan 20 23:11:41 2013
alter database recover cancel
Sun Jan 20 23:11:41 2013
Media Recovery Canceled
Completed: alter database recover cancel

blockrecover恢复途中或者异常终止,dbv检测

[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/XFF/users01.dbf
DBVERIFY: Release 10.2.0.4.0 - Production on Sun Jan 20 23:18:29 2013
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/XFF/users01.dbf
Page 28 is marked corrupt
Corrupt block relative dba: 0x0100001c (file 4, block 28)
Bad check value found during dbv:
Data in bad block:
 type: 6 format: 2 rdba: 0x0100001c
 last change scn: 0x0b2c.c02b0248 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x02480601
 check value in block header: 0x13fc
 computed block checksum: 0x663b
DBVERIFY - Verification complete
Total Pages Examined         : 1280
Total Pages Processed (Data) : 903
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 38
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 47
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 291
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 0
Highest block SCN            : 3224022228 (2860.3224022228)

在blockrecover处理过程,或者处理失败的后,block依然是坏块,如果遇到这类情况,没有事务可以直接指定seq/scn/time的方法来恢复,如果有事务,需要指定恢复时间点过该事务的时间点,让smon能够正常回滚,从而使得smon进程正常工作

还原归档后继续测试

[oracle@xifenfei archivelog]$ mv bak/* ./
RMAN> BLOCKRECOVER DATAFILE 4 block 28;
starting media recovery
archive log thread 1 sequence 247 is already on disk as file /u01/oracle/oradata/XFF/archivelog/1_247_792679299.dbf
…………
media recovery complete, elapsed time: 00:00:01
Finished blockrecover at 20-JAN-13
[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/XFF/users01.dbf
DBVERIFY: Release 10.2.0.4.0 - Production on Sun Jan 20 23:15:43 2013
Copyright (c) 1982, 2007, Oracle.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/XFF/users01.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 1280
Total Pages Processed (Data) : 904
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 38
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 47
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 291
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Highest block SCN            : 3224023169 (2860.3224023169)

>
blockrecover处理alert日志

Sun Jan 20 23:15:01 2013
Media Recovery Log /u01/oracle/oradata/XFF/archivelog/1_250_792679299.dbf
Sun Jan 20 23:15:01 2013
Media Recovery Log /u01/oracle/oradata/XFF/archivelog/1_251_792679299.dbf
Sun Jan 20 23:15:01 2013
Recovery of Online Redo Log: Thread 1 Group 3 Seq 252 Reading mem 0
  Mem# 0: /u01/oracle/oradata/XFF/redo03.log
Sun Jan 20 23:15:01 2013
Recovery of Online Redo Log: Thread 1 Group 1 Seq 253 Reading mem 0
  Mem# 0: /u01/oracle/oradata/XFF/redo01.log
Sun Jan 20 23:15:01 2013
Recovery of Online Redo Log: Thread 1 Group 2 Seq 254 Reading mem 0
  Mem# 0: /u01/oracle/oradata/XFF/redo02.log
Sun Jan 20 23:15:02 2013
Completed block media recovery

补充说明
1.在9i中使用blockrecover会在Archive destination中生产block的备份文件,类似469_519791_3063_2442393528.bkd(file 469 block 519791),需要注意Archive destination目录结尾需要”/”,不然可能出现直接写入和该目录并列的Archive destination+blockrecover产生文件;10g中不生成该文件;如果该block在9i中未备份,也不会在生产相关文件,而是利用归档恢复.
2.blockrecover需要还原从该block从备份之后的所有归档(如果被备份起来需要还原出来),如果缺少归档可能导致恢复失败,包括基于scn/seq/time的等
3.blockrecover在执行过程中或者执行异常终止,该block依然是坏块,不会对其他block产生影响.在使用blockrecover出现异常终止后的block修复,在后续blog中提供解决方法
4.对于不含事务的坏块,如果数据允许丢失可以通过设置event跳过坏块(特殊block除外),然后重建对象;对于含event的block 坏块,建议使用blockrecover处理或者直接恢复数据文件,如果没有备份,考试使用event跳过事务回滚,然后屏蔽坏块处理

创建控制文件遭遇ORA-600 kccscf_1

数据库版本10.2.0.1创建控制文件出现ORA-00600[kccscf_1]错误

SQL> CREATE CONTROLFILE REUSE DATABASE "ITPLUSDW" RESETLOGS  ARCHIVELOG
  2      MAXLOGFILES 16
  3      MAXLOGMEMBERS 3
  4      MAXDATAFILES 100
  5      MAXINSTANCES 8
  6      MAXLOGHISTORY 292000
  7  LOGFILE
  8    GROUP 1 '/backup/oracle/oradata/redo01.log'  SIZE 50M,
  9    GROUP 2 '/backup/oracle/oradata/redo02.log'  SIZE 50M,
 10    GROUP 3 '/backup/oracle/oradata/redo03.log'  SIZE 50M
 11  -- STANDBY LOGFILE
 12  DATAFILE
 13    '/backup/oracle/oradata/system01.dbf',
 14    '/backup/oracle/oradata/undotbs01.dbf',
 15    '/backup/oracle/oradata/sysaux01.dbf',
 16    '/backup/oracle/oradata/users01.dbf',
 17    '/backup/oracle/oradata/itplus_data',
 18    '/backup/oracle/oradata/invertories_data',
 19    '/backup/oracle/oradata/inventories_data01',
 20    '/backup/oracle/oradata/itplus_discover',
 21    '/backup/oracle/oradata/inventories_data02',
 22    '/backup/oracle/oradata/inventories_data03',
 23    '/backup/oracle/oradata/inventories_data05',
 24    '/backup/oracle/oradata/inventories_data04'
 25  CHARACTER SET ZHS16CGB231280
 26  ;
CREATE CONTROLFILE REUSE DATABASE "ITPLUSDW" RESETLOGS  ARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-00600: internal error code, arguments: [kccscf_1], [9], [292000], [65535],[], [], [], []

alert日志错误

Default Temporary Tablespace will be necessary for a locally managed database in future release
Tue Feb 26 15:49:36 2013
Errors in file /backup/oracle/admin/udump/itplustest_ora_25001.trc:
ORA-00600: internal error code, arguments: [kccscf_1], [9], [292000], [65535], [], [], [], []
Tue Feb 26 15:49:36 2013
Errors in file /backup/oracle/admin/udump/itplustest_ora_25001.trc:
ORA-00600: internal error code, arguments: [kccscf_1], [9], [292000], [65535], [], [], [], []
ORA-1503 signalled during: CREATE CONTROLFILE REUSE DATABASE "ITPLUSDW" NORESETLOGS  ARCHIVELOG

查询mos发现Create Control File fails with ORA-600 [kccscf_1] [ID 387210.1],中有描述,该bug在10.2.0.1到11.1.0.6版本中,如果MAXLOGHISTORY超过了65535均有可能出现,在10.2.0.4和11.1.0.6中修复,部分版本有Patch 4877360修复该问题。在不能升级或者没有patch的情况下,解决该问题的思路是设置MAXLOGHISTORY为65535或者更小值,重建控制文件

rman制造坏块,bbed修复坏块

在rman中有隐藏的命令clear,可以标记数据块为corrupt,从而实现数据库坏块试验,本篇blog通过dbv,bbed,table select来验证坏块的出现和修复
创建测试表

SQL>  create table t_xifenfei as
  2   select object_id,object_name from dba_objects where rownum<10;
Table created.
SQL>  select rowid,
  2   dbms_rowid.rowid_relative_fno(rowid) rel_fno,
  3   dbms_rowid.rowid_block_number(rowid) block
  4   from chf.t_xifenfei;
ROWID                 REL_FNO      BLOCK
------------------ ---------- ----------
AAAStAAAEAAAACrAAA          4        171
AAAStAAAEAAAACrAAB          4        171
AAAStAAAEAAAACrAAC          4        171
AAAStAAAEAAAACrAAD          4        171
AAAStAAAEAAAACrAAE          4        171
AAAStAAAEAAAACrAAF          4        171
AAAStAAAEAAAACrAAG          4        171
AAAStAAAEAAAACrAAH          4        171
AAAStAAAEAAAACrAAI          4        171
9 rows selected.
SQL> alter system checkpoint;
System altered.

dbv验证坏块

[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY: Release 11.2.0.3.0 - Production on Sun Jan 20 09:12:16 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 7680
Total Pages Processed (Data) : 3776
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 167
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 744
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2993
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1006717955 (2956.1006717955)

这里创建了t_xifenfei表,数据存储在file 4 block 171中,现在该block一切正常,本试验就是要通过rman来使得该block corrupt,然后通过bbed来修复

bbed查看kcbh

BBED> set block 171
        BLOCK#          171
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x010000ab
   ub4 bas_kcbh                             @8        0x3c014bfe
   ub2 wrp_kcbh                             @12       0x0b8c
   ub1 seq_kcbh                             @14       0x02
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x6e0c  <--重点关注
   ub2 spare3_kcbh                          @18       0x0000

rman标记坏块

RMAN> BLOCKRECOVER DATAFILE 4 block 171 clear;
Starting recover at 20-JAN-13
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=18 device type=DISK
Finished recover at 20-JAN-13

dbv再次检查坏块

[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY: Release 11.2.0.3.0 - Production on Sun Jan 20 09:53:16 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/users01.dbf
Page 171 is marked corrupt
Corrupt block relative dba: 0x010000ab (file 4, block 171)
Bad check value found during dbv:
Data in bad block:
 type: 6 format: 2 rdba: 0x010000ab
 last change scn: 0x0b8c.3c014bfe seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bfe0602
 check value in block header: 0x6e0c
 computed block checksum: 0xb5bc
DBVERIFY - Verification complete
Total Pages Examined         : 7680
Total Pages Processed (Data) : 3775
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 167
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 744
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2993
Total Pages Marked Corrupt   : 1
Total Pages Influx           : 1
Total Pages Encrypted        : 0
Highest block SCN            : 1006717955 (2956.1006717955)

尝试查询数据

SQL> select count(*) from t_xifenfei;
select count(*) from t_xifenfei
                     *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 171)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'

bbed验证坏块

BBED> set block 171
        BLOCK#          171
BBED> p kcbh
struct kcbh, 20 bytes                       @0
   ub1 type_kcbh                            @0        0x06
   ub1 frmt_kcbh                            @1        0xa2
   ub1 spare1_kcbh                          @2        0x00
   ub1 spare2_kcbh                          @3        0x00
   ub4 rdba_kcbh                            @4        0x010000ab
   ub4 bas_kcbh                             @8        0x3c014bfe
   ub2 wrp_kcbh                             @12       0x0b8c
   ub1 seq_kcbh                             @14       0x02
   ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
   ub2 chkval_kcbh                          @16       0x6e0c  <--注意该值未变化
   ub2 spare3_kcbh                          @18       0x0000
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 171
Block 171 is corrupt
Corrupt block relative dba: 0x010000ab (file 0, block 171)
Bad check value found during verification
Data in bad block:
 type: 6 format: 2 rdba: 0x010000ab
 last change scn: 0x0b8c.3c014bfe seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4bfe0602
 check value in block header: 0x6e0c
 computed block checksum: 0xb5bc
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED

这里通过dbv,bbed,select table都证明rman能够标记block为Corrupt.
这里需要分析:block已经被标记,那证明该块肯定有修改,也就是说chkval_kcbh一定要变化,但是这里没有变化,证明该处异常

bbed修复rman生成坏块

BBED> sum apply
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
Check value for File 0, Block 171:
current = 0xdbb0, required = 0xdbb0
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 171
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 1
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 0
Total Blocks Influx           : 0
Message 531 not found;  product=RDBMS; facility=BBED

bbed测试坏块已经修复

dbv验证坏块

[oracle@xifenfei ~]$ dbv file=/u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY: Release 11.2.0.3.0 - Production on Sun Jan 20 10:01:46 2013
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
DBVERIFY - Verification starting : FILE = /u01/oracle/oradata/ora11g/users01.dbf
DBVERIFY - Verification complete
Total Pages Examined         : 7680
Total Pages Processed (Data) : 3776
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 167
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 744
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 2993
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 1006717955 (2956.1006717955)

dbv测试坏块也被修复

查询表验证

SQL> select count(*) from t_xifenfei;
  COUNT(*)
----------
         9

数据也可以正常查询,证明rman的坏块问题通过bbed解决

总结说明
1.我们可以通过rman的clear命令来标记坏块(BLOCKRECOVER DATAFILE file# BLOCK block1#, block2#, block3#… CLEAR 😉
2.我们可以通过bbed的sum apply命令来修复该类型坏块

ORA-600 kghstack_free2异常恢复

今天接触一案例ORA-00600[kghstack_free2],恢复起来很简单,顺便记录下,供其他人遇到类似情况参考.

Wed Feb 20 15:05:04 2013
SMON: enabling cache recovery
Wed Feb 20 15:05:05 2013
Successfully onlined Undo Tablespace 1.
Wed Feb 20 15:05:05 2013
SMON: enabling tx recovery
Wed Feb 20 15:05:06 2013
Errors in file f:\oracle\product\10.2.0\admin\cksoltp\bdump\cksoltp_smon_3556.trc:
ORA-00600: internal error code, arguments: [kghstack_free2], [], [], [], [], [], [], []
Wed Feb 20 15:05:06 2013
Database Characterset is AL32UTF8
replication_dependency_tracking turned off (no async multimaster replication found)
Wed Feb 20 15:05:08 2013
Errors in file f:\oracle\product\10.2.0\admin\cksoltp\bdump\cksoltp_smon_3556.trc:
ORA-07445: exception encountered: core dump [ACCESS_VIOLATION] [_kghstack_err+101]
[PC:0x603CC77F] [ADDR:0x59004E04] [UNABLE_TO_READ] []
ORA-00600: internal error code, arguments: [kghstack_free2], [], [], [], [], [], [], []
Starting background process QMNC
QMNC started with pid=18, OS id=3504
Wed Feb 20 15:05:13 2013
Errors in file f:\oracle\product\10.2.0\admin\cksoltp\bdump\cksoltp_pmon_4064.trc:
ORA-00474: SMON process terminated with error

这个错误很明显是因为smon在回滚事务的时候,遇到异常从而出现ORA-00600[kghstack_free2]错误,使得数据库不能被正常open,通过分析trace文件发现回滚段36中有事务未提交,需要回滚,针对这样的数据库恢复方法很简单:1.屏蔽事务回滚,2.屏蔽回滚段强制offline等
1) EVENT=”10513 trace name context forever”
2)_offline_rollback_segments= _SYSSMU36$
数据库启动后,通过分析trace文件,找出来异常对象,然后重建该对象,除掉event/_offline_rollback_segments即可

通过rowid获取segment header坏块数据

在上篇(table中各种坏块对select/dml操作影响)中说到如果segment header异常了,不能通过ctas来获得相关数据,在群的讨论中,EZIO说到可以通过rowid方式来获得相关数据,通过测试证明,确实可以通过该方法获得数据,以后遇到此类错误,大家也不必惊慌.
创建测试表

SQL> create table t_xifenfei
  2  as
  3  select * from dba_objects where object_id is not null;
SQL> alter table t_xifenfei
  2  add constraint PK_t_xifenfei primary key (object_id)
  3  ;
Table altered.
SQL> alter system checkpoint;
System altered.
SQL> select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
     74762
SQL> select header_file,header_block from
  2  DBA_SEGments where segment_name='T_XIFENFEI' AND OWNER='CHF';
HEADER_FILE HEADER_BLOCK
----------- ------------
          4          170

dump block

alter system dump datafile 4 block 170;
Dump of buffer cache at level 4 for tsn=4 rdba=16777386
Block dump from disk:
buffer tsn: 4 rdba: 0x010000aa (4/170)
scn: 0x0b8c.3c0092e4 seq: 0x01 flg: 0x04 tail: 0x91e42301
frmt: 0x02 chkval: 0xa531 type: 0x23=PAGETABLE SEGMENT HEADER

通过header_block和dump block确定block 170即为PAGETABLE SEGMENT HEADER

bbed制造SEGMENT HEADER坏块

BBED> set block 170
        BLOCK#          170
BBED> set offset 8188
        OFFSET          8188
BBED> d
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 170              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0123e492
 <32 bytes per line>
BBED> m /x 0123e491
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 170              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0123e491
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 170:
current = 0xa531, required = 0xa531
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 170
Block 170 is corrupt
Corrupt block relative dba: 0x010000aa (file 0, block 170)
Fractured block found during verification
Data in bad block:
 type: 35 format: 2 rdba: 0x010000aa
 last change scn: 0x0b8c.3c0092e4 seq: 0x1 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x91e42301
 check value in block header: 0xa531
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED

测试segment header坏块后select操作

SQL> select * from chf.t_xifenfei;
select * from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 170)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
SQL> select /*+index(t PK_T_XIFENFEI)*/  count(rowid) from chf.t_xifenfei t;
COUNT(ROWID)
------------
       74762

基于rowid获取segment header 坏块对象数据

SQL> create table chf.bad_rows (table_name varchar2(60),
   2 row_id rowid, oracle_error_code number);
Table created.
SQL> DECLARE
  2   TYPE RowIDTab IS TABLE OF ROWID INDEX BY BINARY_INTEGER;
  3   CURSOR c1 IS  select /*+index(t PK_T_XIFENFEI)*/ rowid from chf.t_xifenfei t;
  4   r RowIDTab;
  5   rows  NATURAL := 20000;
  6   bad_rows number := 0 ;
  7   errors number;
  8   error_code number;
  9   myrowid rowid;
 10  BEGIN
 11   OPEN c1;
 12   LOOP
 13     FETCH  c1 BULK COLLECT INTO r LIMIT rows;
 14     EXIT WHEN r.count=0;
 15     BEGIN
 16      FORALL i IN r.FIRST..r.LAST SAVE EXCEPTIONS
 17       insert into chf.t_xifenfei_new
 18       select /*+ ROWID(A) */ *
 19       from chf.t_xifenfei A where rowid = r(i);
 20     EXCEPTION
 21     when OTHERS then
 22      BEGIN
 23       errors := SQL%BULK_EXCEPTIONS.COUNT;
 24       FOR err1 IN 1..errors LOOP
 25           error_code := SQL%BULK_EXCEPTIONS(err1).ERROR_CODE;
 26           myrowid := r(SQL%BULK_EXCEPTIONS(err1).ERROR_INDEX);
 27           bad_rows := bad_rows + 1;
 28           insert into chf.bad_rows values('chf.t_xifenfei',myrowid, error_code);
 29       END LOOP;
 30       END;
 31     END;
 32    commit;
 33   END LOOP;
 34   commit;
 35   CLOSE c1;
 36   dbms_output.put_line('Total Bad Rows: '||bad_rows);
 37  END;
 38  /
Total Bad Rows: 0
PL/SQL procedure successfully completed.
SQL> select count(*) from chf.t_xifenfei_new;
  COUNT(*)
----------
     74762

通过上面pl/sql,基于rowid成功获得segment header 异常对象中的所有数据记录.如果没有主键的表出现该问题,可以参考:使用plsql抢救数据

table中各种坏块对select/dml操作影响

在春节前写过table中各种类型block坏块是否能被跳过,本来准备节前写完它的姊妹篇关于table中各种blog如果出现坏块,对select/dml操作影响,因为回家一些事情给耽误了,今天补上该文章,这篇文章主要基于试验测试为主,没有从相关block原理上进行分析,如果有时间,后续文章从原理上来分析为什么这些select/dml操作不能执行
创建测试表

SQL> create table t_xifenfei as
  2  select * from dba_objects where rownum<10;
Table created.
SQL> select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
         9
SQL> select
  2  dbms_rowid.rowid_relative_fno(rowid)rel_fno,
  3       max(dbms_rowid.rowid_block_number(rowid)) max_block,
  4       min(dbms_rowid.rowid_block_number(rowid)) min_block
  5       from chf.t_xifenfei
  6       group by dbms_rowid.rowid_relative_fno(rowid);
   REL_FNO  MAX_BLOCK  MIN_BLOCK
---------- ---------- ----------
         4        171        171
SQL> select EXTENT_ID,FILE_ID,BLOCK_ID,blocks from dba_extents where owner='CHF'
  2   AND SEGMENT_NAME='T_XIFENFEI';
 EXTENT_ID    FILE_ID   BLOCK_ID     BLOCKS
---------- ---------- ---------- ----------
         0          4        168          8
SQL> select SEGMENT_NAME,HEADER_FILE,HEADER_BLOCK,blocks,extents from DBA_SEGMENTS
  2  WHERE OWNER='CHF' AND SEGMENT_NAME='T_XIFENFEI';
SEGMENT_NAME                   HEADER_FILE HEADER_BLOCK     BLOCKS    EXTENTS
------------------------------ ----------- ------------ ---------- ----------
T_XIFENFEI                               4          170          8          1

通过alter system dump datafile 4 block n得出相关block数据块类型
168为FIRST LEVEL BITMAP BLOCK
169为SECOND LEVEL BITMAP BLOCK
170为PAGETABLE SEGMENT HEADER
171为trans data

处理block 168

--制造坏块
BBED> set block 168
        BLOCK#          168
BBED> set offset 8188
        OFFSET          8188
BBED> d
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 168              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0320d14f
 <32 bytes per line>
BBED> m /x 0320d14e
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 168              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0320d14e
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 168:
current = 0xf60b, required = 0xf60b
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 168
Block 168 is corrupt
Corrupt block relative dba: 0x010000a8 (file 0, block 168)
Fractured block found during verification
Data in bad block:
 type: 32 format: 2 rdba: 0x010000a8
 last change scn: 0x0b8c.3bff4fd1 seq: 0x3 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4ed12003
 check value in block header: 0xf60b
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
--select操作
SQL> select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
         9
--dml操作
SQL> delete from chf.t_xifenfei where rownum<3;
2 rows deleted.
----注意update操作
SQL> update chf.t_xifenfei set object_name='www.xifenfei.com';
7 rows updated.
SQL> insert into chf.t_xifenfei select * from dba_objects where rownum=1;
insert into chf.t_xifenfei select * from dba_objects where rownum=1
                *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 168)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'

这里证明对于FIRST LEVEL BITMAP BLOCK,在delete,select操作正常,insert操作异常,update操作待定(update操作不一定能够立马展示效果)

处理block 169

--标记坏块
BBED> set block 169
        BLOCK#          169
BBED> set offset 8188
        OFFSET          8188
BBED> d
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 169              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0221ce4f
 <32 bytes per line>
BBED> m /x 0221ce4e
Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 169              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0221ce4e
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 169:
current = 0x9d2f, required = 0x9d2f
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 169
Block 169 is corrupt
Corrupt block relative dba: 0x010000a9 (file 0, block 169)
Fractured block found during verification
Data in bad block:
 type: 33 format: 2 rdba: 0x010000a9
 last change scn: 0x0b8c.3bff4fce seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x4ece2102
 check value in block header: 0x9d2f
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
--select操作
SQL> select count(*) from chf.t_xifenfei;
  COUNT(*)
----------
         9
--dml操作
SQL> delete from chf.t_xifenfei where rownum<2;
1 row deleted.
----注意update操作
SQL> update chf.t_xifenfei set object_name='www.xifenfei.com';
9 rows updated.
SQL> alter table t_xifenfei modify EDITION_NAME varchar2(4000);
Table altered.
SQL> update t_xifenfei set EDITION_NAME=lpad('www.xifenfei.com', 4000, '0');
update t_xifenfei set EDITION_NAME=lpad('www.xifenfei.com', 4000, '0')
       *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 169)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
SQL> insert into chf.t_xifenfei
  2  select * from dba_objects where rownum<2;
insert into chf.t_xifenfei
                *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 169)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'

SECOND LEVEL BITMAP BLOCK在delete,select操作正常,insert操作异常,update操作分情况(如果更新的列字符串交短,可能不报错,如果更新的字符串较长可能报错)

处理block 170

--标记坏块
BBED> SET BLOCK 170
        BLOCK#          170
BBED> set offset 8188
        OFFSET          8188
BBED> d
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 170              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0223b91b
 <32 bytes per line>
BBED> m /x  0223b91a
 File: /u01/oracle/oradata/ora11g/users01.dbf (0)
 Block: 170              Offsets: 8188 to 8191           Dba:0x00000000
------------------------------------------------------------------------
 0223b91a
 <32 bytes per line>
BBED> sum apply
Check value for File 0, Block 170:
current = 0xb7d4, required = 0xb7d4
BBED> verify
DBVERIFY - Verification starting
FILE = /u01/oracle/oradata/ora11g/users01.dbf
BLOCK = 170
Block 170 is corrupt
Corrupt block relative dba: 0x010000aa (file 0, block 170)
Fractured block found during verification
Data in bad block:
 type: 35 format: 2 rdba: 0x010000aa
 last change scn: 0x0b8c.3c001bb9 seq: 0x2 flg: 0x04
 spare1: 0x0 spare2: 0x0 spare3: 0x0
 consistency value in tail: 0x1ab92302
 check value in block header: 0xb7d4
 computed block checksum: 0x0
DBVERIFY - Verification complete
Total Blocks Examined         : 1
Total Blocks Processed (Data) : 0
Total Blocks Failing   (Data) : 0
Total Blocks Processed (Index): 0
Total Blocks Failing   (Index): 0
Total Blocks Empty            : 0
Total Blocks Marked Corrupt   : 1
Total Blocks Influx           : 2
Message 531 not found;  product=RDBMS; facility=BBED
--select操作
SQL> select count(*) from chf.t_xifenfei;
select count(*) from chf.t_xifenfei
                         *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 170)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
--dml操作
SQL> update chf.t_xifenfei where object_name='www.xifenfei.com';
update chf.t_xifenfei where object_name='www.xifenfei.com'
                      *
ERROR at line 1:
ORA-00971: missing SET keyword
SQL> update chf.t_xifenfei set object_name='www.xifenfei.com';
update chf.t_xifenfei set object_name='www.xifenfei.com'
           *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 170)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'
SQL> delete from chf.t_xifenfei where rownum<2;
delete from chf.t_xifenfei where rownum<2
                *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 4, block # 170)
ORA-01110: data file 4: '/u01/oracle/oradata/ora11g/users01.dbf'

PAGETABLE SEGMENT HEADER异常的时候,数据库包括select,update,delete,insert操作都不能进行

结果汇总
1.BITMAP BLOCK异常的时候,select/delete操作可以正常进行,insert操作异常,update操作可能异常也可能正常
2.SEGMENT HEADER异常的时候,数据库包括select,update,delete,insert操作都不能进行
3.对于这些特殊的block出现坏块,如果有rman备份,从10g开始可以通过rman blockrecover来修复
4.如果没有rman备份,可以BITMAP BLOCK可以类似ctas重建,SEGMENT HEADER可以通过dul scan extent抽取数据
5.对于trans data太过于常见,而且event就可以跳过,在以前的文章中说过,不再讲述